Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Papercut] Check read and write-together specs for combinational groups #1215

Closed
janpaulpl opened this issue Oct 20, 2022 · 9 comments · Fixed by #1259
Closed

[Papercut] Check read and write-together specs for combinational groups #1215

janpaulpl opened this issue Oct 20, 2022 · 9 comments · Fixed by #1259
Assignees
Labels
C: Calyx Extension or change to the Calyx IL Type: Paper cut Spurious or confusing errors

Comments

@janpaulpl
Copy link
Contributor

When implementing a single source shortest path graph analytics algorithm (Graphicionado) in Calyx, I stumbled onto a bug in my implementation where I didn't iteratively update the values for VProperty (the shortest path cycle of each vertex from the initial active vertex). The output was the default value for all vertices: 99 (excluding the initial active vertex which is 0).

However, when running the program with the verilator front end instead of the interpreter, the output replaces all instances of 99 with 0. This output holds for whatever default values in VProperty.

<========== RUNNING WITH VERILATOR FRONT END ==========>

$ fud e --to dat -v -s verilog.data sssp/data/graph.futil.data sssp/src/graphicionado.futil 
✔ futil.run_futil
✔ transform
✔ verilog.mktmp
✔ verilog.json_to_dat
✔ verilog.compile_with_verilator
✔ verilog.simulate
✔ verilog.output_json
✔ verilog.cleanup
{
  "cycles": 732,
  "memories": {
    "EdgeIDTable": [0, 2, 3, 4],
    "TotalEdgeCount": [4],
    "TotalVertexCount": [4],
    "VProperty": [0, 0, 0, 0],
    "edges": [[0, 1, 1], [0, 2, 1], [1, 3, 1], [2, 3, 1]]
  }
}

<========== RUNNING WITH CALYX INTERPRETER ==========> 

$ fud e --to interpreter-out -s verilog.data sssp/data/graph.futil.data sssp/src/graphicionado.futil 
{
  "main": {
    "EdgeIDTable": [0, 2, 3, 4],
    "TotalEdgeCount": [4],
    "TotalVertexCount": [4],
    "VProperty": [0, 99, 99, 99],
    "edges": [[0, 1, 1], [0, 2, 1], [1, 3, 1], [2, 3, 1]]
  }
}

The buggy code can be found in the calyx-graphicionado private repo, and you can replicate the issue by running the following commands from root:

$ fud e --to interpreter-out -s verilog.data sssp/data/graph.futil.data sssp/src/graphicionado.futil

$ fud e --to dat -v -s verilog.data sssp/data/graph.futil.data sssp/src/graphicionado.futil 

I'm running Verilator 5.001 devel rev v4.228-150-g68927d4fd.

The correct code can be found under tests/correctness/graphicionado and doesn't produce an output difference.

@janpaulpl janpaulpl added the C: Interpreter / Cider Calyx Interpreter & Debugger label Oct 20, 2022
@rachitnigam rachitnigam added the Type: Bug Bug in the implementation label Oct 20, 2022
@rachitnigam
Copy link
Contributor

Looked at this a little bit and tried to reduce the file. A particular problem seems to be this line:
https://github.com/cucapra/calyx-graphicionado/blob/main/sssp/src/graphicionado.futil#L537

There no address provided for the read from the memory TotalEdgeCount. The compiler warns about this after lowering comb group:

% cargo run -- ../calyx-graphicionado/sssp/src/graphicionado.futil -p remove-comb-groups -p validate
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/futil ../calyx-graphicionado/sssp/src/graphicionado.futil -p remove-comb-groups -p validate`
Error: [Papercut] Required signal not driven inside the group.
When read the port `TotalEdgeCount.read_data', the ports [TotalEdgeCount.addr0] must be written to.
The primitive type `std_mem_d1' requires this invariant.

The interpreter output difference is likely because of the difference caused by this. We should make the papercut pass warn about such uses of memories in comb group as well.

@sampsyo
Copy link
Contributor

sampsyo commented Oct 21, 2022

Super interesting. I wonder if this (missing input address for a memory read) is something that would be easy for the interpreter to catch? Probably not, because we'd just see it as a zero instead of a missing value. But someday soon we will have a proper treatment of undefinedness…

@cgyurgyik
Copy link
Collaborator

cgyurgyik commented Oct 21, 2022

Super interesting. I wonder if this (missing input address for a memory read) is something that would be easy for the interpreter to catch? Probably not, because we'd just see it as a zero instead of a missing value. But someday soon we will have a proper treatment of undefinedness…

I imagine we could add another feature to the interpreter representation of primitives that ensures X signals are driven together, similar to @{write_together,read_together}.

@rachitnigam
Copy link
Contributor

Yeah, that kind of validation would be useful. Maybe the interpreter should have a bunch of hooks that allow us to add these kinds of analyses in a modular fashion (we already have something for @bound).

@EclecticGriffin
Copy link
Collaborator

EclecticGriffin commented Oct 25, 2022

Hmm okay, did a tiny bit of digging and reproduced the error locally, though for some reason my local version of verilator Verilator 4.228 2022-10-01 rev UNKNOWN.REV has a syntax error, copied below.

[fud] ERROR: `verilator -cc --trace /var/folders/4s/d76r5g0j6pg3xgq7hnhkng200000gn/T/tmpllyaxslg --exe /Users/griffin/research/capra/calyx/fud/sim/testbench.cpp --build --top-module main --Mdir /var/folders/4s/d76r5g0j6pg3xgq7hnhkng200000gn/T/tmpi1m4ckje>&2' failed:
=====STDERR=====
%Error: /var/folders/4s/d76r5g0j6pg3xgq7hnhkng200000gn/T/tmpllyaxslg:3890:22: syntax error, unexpected TYPE-IDENTIFIER, expecting IDENTIFIER or randomize
 3890 |     processing_phase process (
      |                      ^~~~~~~
%Error: Exiting due to 1 error(s)

=====STDOUT=====

Anyway, afaik this is a case where the verilator version is only "working" because the undriven value is zero and because the memory address is not being driven when the conditional is checked (per the error @rachitnigam identified). Since the memory only contains a single slot, this accidentally behaves as expected. The interpreter doesn't update the value of the memory because it is not given new input, so it retains an undriven output (in this case 0) which means the comparison always fails and the values are consequently never updated.

Whether we consider this an interpreter error is down more to our opinions on undefined values and I don't know that it really should agree with verilator here given that the program is unambiguously buggy. Though in the future it would definitely be nice to highlight this error more directly

@sampsyo
Copy link
Contributor

sampsyo commented Oct 25, 2022

Weird about the Verilog syntax error!!! 😬

But yes, this seems (to me) to be a pretty clear instance of our broader conversation about undefinedness (e.g., #922 and #1013). I think the status-quo idea to beat would consist of these axioms:

  • The IL semantics should consider an undriven address port to be an error.
  • The Verilog backend would be allowed to (but not required to) just supply a zero in this case.
  • The interpreter should somehow trigger a fail-stop error when this happens.

How to satisfy all of those, however, may be kinda subtle.

@janpaulpl
Copy link
Contributor Author

janpaulpl commented Oct 26, 2022

I was also getting the weird syntax error

%Error: /tmp/tmp0mtqibwd:4025:22: syntax error, unexpected TYPE-IDENTIFIER, expecting IDENTIFIER or randomize
 4025 |     processing_phase process (
      |                      ^~~~~~~
%Error: Exiting due to 1 error(s)

=====STDOUT=====

When the call for processing_phase was defined as process in main

    /*********** Other Structure **************/
    curIter = std_reg(32); // current iteration
    add = std_add(32);
    lt = std_lt(32);
    p = process_edge();
-   process = processing_phase();
+   pp = processing_phase();
    app = apply_phase();

This is the same reason why all other structures have short naming conventions...

Specifically, at line 571 does the code invoke process.

I fixed this by just replacing process with pp. In fact, it works with proces as well. Because it works with the interpreter, process is probably already something in verilator and it can't be used in that specific context...

@sampsyo
Copy link
Contributor

sampsyo commented Oct 26, 2022

Ah, so process is a reserved word in Verilog!

That would do it. I know this is silly, but we could consider some form of mangling in our Verilog backend to avoid problems like this. (Possibly using a hard-coded list of reserved words, which get a __ appended or something.)

@rachitnigam
Copy link
Contributor

Ah, we should add it to the list of restricted identifiers in the well-formedness check

@janpaulpl janpaulpl self-assigned this Nov 1, 2022
@rachitnigam rachitnigam changed the title Interpreter output doesn't equal verilator front-end output [Papercut] Check read and write-together specs for combinational groups Nov 8, 2022
@rachitnigam rachitnigam added C: Calyx Extension or change to the Calyx IL Type: Paper cut Spurious or confusing errors and removed C: Interpreter / Cider Calyx Interpreter & Debugger Type: Bug Bug in the implementation labels Nov 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: Calyx Extension or change to the Calyx IL Type: Paper cut Spurious or confusing errors
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants