Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annotations for Data Path Components #1169

Closed
rachitnigam opened this issue Sep 8, 2022 · 7 comments · Fixed by #1456
Closed

Annotations for Data Path Components #1169

rachitnigam opened this issue Sep 8, 2022 · 7 comments · Fixed by #1456
Assignees
Labels
C: Calyx Extension or change to the Calyx IL Status: Available Can be worked upon

Comments

@rachitnigam
Copy link
Contributor

rachitnigam commented Sep 8, 2022

Another shot to answer the evergreen undefinedness of undriven signals in Calyx (#922). In its most reduced operational form, it asks when is it safe to provide 'x as the default value for a port instead of '0.

The core of this proposal are two new attributes: @data and @control. The @data attribute marks a cell as a data path component, and the @control attribute marks a cell as a control path component. These attributes are mutually exclusive, and a cell can only have one of them.

comb primitive std_add[WIDTH](@data a: WIDTH, @data b: WIDTH) -> (out: WIDTH);
primitive std_mult_pipe[WIDTH](@control go: 1, @data a: WIDTH, @data b: WIDTH) -> (out: WIDTH, @control done: 1);
cells {
  @data add0 = std_add(32);
  @control incr = std_add(32);
  @data mult = std_mult_pipe(32); // mult.go must always be defined
}

An instance marked with @control requires all its ports to have '0 as the default value. An instance marked with @data additionally needs to mark some of its ports with the @data attribute to state that these can accept 'x as the default value. This is because even for data path components, certain ports like the go port might always need a non-'x value.

This proposal follows a safe-by-default approach to the problem: it is always safe to ignore the @data attributes and provide all ports with a default value of '0. The @control attribute is just a safety mechanism that can be used to catch errors early.

Conditions for @control Instances

Following the things enumerated in #922, here is a list of some constraints the forces an instance to be a control instance:

  1. If it is used in the guard of an assignment
  2. If it is used as the done condition of a group
  3. If it is used as the conditional port for if or while
  4. If it is used as the input to a @go port such as write_en
  5. If it is used as an input for another @control instance

Note that if any port of an instance is used in these positions, the instance must be marked as a @control instance. Number (5) is the challenging one here since it's a recursive condition. However, I don't see any other way to address the problem of propagation of undefinedness from data path to the control path. The interesting question is whether this approach will leave any instances as @data instances.

We can design a new pass that can compute which cells are @control using the criteria above. If these conditions are necessary and sufficient, all the cells not marked with @control are @data cells.

Semantics

Everything up till now gives a very operational view of how to address the undefinedness problem. However, it doesn't give us a clear view on what the exact semantic properties are. As far as I can tell, the semantics needs to be defined in terms of these attributes. There is no a priori way to define which signals are data path and which are control path; we can only call programs undefined given an assignment of attributes to instances.

@rachitnigam
Copy link
Contributor Author

The Filament issue above points out an optimization that this can enable in the backend:


It occurs to me that this optimization should be done by Calyx and not Filament itself because this reasoning applies to all data ports. Specifically, rewriting:

in = guard ? data : 'x

into

in = data

Should always be safe

@sampsyo
Copy link
Contributor

sampsyo commented Sep 13, 2022

Finally catching up on this very nice idea, @rachitnigam, after your helpful explanation today!

I think the note about allowed optimizations above is especially important. It's a very cool upside that we can show that certain optimizations are allowed on always-never-undefined values that are not possible on maybe-undefined values!!

Just to jot down some high-level thoughts that came out of today's brief discussion:

  • One way to "phase" this work would be to hammer things out in a fully dynamic context first, then work out the static checking. That would look like this:
    1. In the first phase, define the dynamic semantics of the undefined value (here denoted undef) and the dynamic events that should be errors. For example, for error conditions: a guard expression evaluating to undef is not allowed, and it is also an error for a @go port ever to be assigned to undef. For dynamic semantics, undef aggressively taints guard expressions (e.g., (undef < 42) == undef), and it propagates aggressively through parameters like std_add. Hopefully, we extend the interpreter to implement these semantics precisely.
    2. In the second phase, add @data/@control annotations as a way of soundly avoiding the dynamic error conditions. The theorem is basically that a well-typed (with respect to these annotations) Calyx program cannot ever encounter an undef-related error at run time.
  • For the static checker, I think it would be nice to view this as an information flow type system—i.e., all types are qualified (so, like, @data 32 or @control 1). The design and goal guarantees would be pretty similar to nullability annotations, as in Java's nullability checker.
  • As a corollary, a fun way to approach the issue above with annotated cells would be to make cells qualifier-polymorphic. The problem here is that we want to support cells that are sometimes used as data and sometimes used as control. For example, you might want the out port of std_add(32) to sometimes be @data 32 and sometimes be @control 32, depending on the adder. Qualifier polymorphism would mean that you would pass in a qualifier parameter, in exactly the same way that you pass in a width. So you could declare the primitive to have all its ports have type @Q WIDTH, and Q would be a type-level variable. Then if you instantiate std_add(32, @data) or std_add(32, @control) to fill in that Q.
  • A big question in my mind is whether we want non-undefness to be required or enforced by the compiler.
    • Required: The program itself needs to guarantee that the value is never undef. The compiler doesn't need to insert any zero fallbacks. This has somewhat more predictable behavior, I'd say, since there aren't any mysterious zeroes that appear implicitly; all values that happen at run time come from the program text.
    • Enforced: The compiler inserts zero-fallback assignments for all @control ports. This is the current behavior (but we currently do this to all ports). It would be lower effort, especially on the frontends' parts, because this is pretty much what they're expecting anyway. But it's admittedly funkier in terms of semantics, because you don't get direct control over where the zeroes get injected.

@rachitnigam
Copy link
Contributor Author

A big question in my mind is whether we want non-undefness to be required or enforced by the compiler.

One important consequence of answering this question is whether the optimization can be represented in Calyx or not. If the compiler enforces this property then the optimization occurs in backend while the Calyx code is being compiled in Verilog. On the other hand, if programs are required to maintain it, then we can represent the optimization within Calyx.

A middle ground solution is following the philosophy outlined in #518 and explicitly instantiating all the muxing in Calyx program before we even reach the backend.

@sampsyo
Copy link
Contributor

sampsyo commented Sep 14, 2022

Yeah, good point. I think it could be fun to think a bit about what the IR would look like in the "required" version, and what the relevant lowering/optimization passes would look like.

@rachitnigam
Copy link
Contributor Author

We need to say something stronger here:

If it is used as an input for another @control instance

In general, a lot of the computations are going to end up transitively related to assignments to a group's done signal. For example, here is an image of a part of the writes-to graph for lpm.futil:
Screenshot 2023-03-08 at 10 23 12 PM

The thing to notice is that most group[done] writes come from the done condition of a register (or something similar). Concretely, for assignment like this:

r.in = lt.out
r.write_en = 1'd1
group[done] = r.done;

We should still be able to mark lt.out as a @data cell because the r.done signal is going to contain a non-undef value regardless of what lt.out contains.

In contrast, for something like this:

r.in = lt.out
r.write_en = c.out ? 1'd1
group[done] = r.done;

We would need to mark c.out as a @control cell because whether or not r.done triggers depends upon c.out's value and because of this, it cannot be undef

@sampsyo
Copy link
Contributor

sampsyo commented Mar 10, 2023

Hmm; interesting example. You observed:

We would need to mark c.out as a @control cell because whether or not r.done triggers depends upon c.out's value

Correct me if I'm wrong, but in the statically-checked version of this system, we would need to require anything used in a guard to be @control, right? That is, it would be a compile-time error for a guard to have a @data type annotation. (Is that right?)

@rachitnigam
Copy link
Contributor Author

rachitnigam commented Mar 13, 2023

Yup! Guards are not allowed to "glitch" and hold 'x values ever. Maybe there is some deeper principle at play here that puts guards in the same category as @control ports

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: Calyx Extension or change to the Calyx IL Status: Available Can be worked upon
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants