Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coq still misinterprets Verilog as Coq #5041

Closed
secworks opened this issue Oct 10, 2020 · 9 comments · Fixed by #5075
Closed

Coq still misinterprets Verilog as Coq #5041

secworks opened this issue Oct 10, 2020 · 9 comments · Fixed by #5075
Labels
Good First Issue This is a great opportunity to start contributing to Linguist Misidentified Language

Comments

@secworks
Copy link

secworks commented Oct 10, 2020

Coq sometimes misinterpret Verilog HW description language files as being Coq prover language files.
There has been a number of issues opened related to this. But all of them has been closed without a fix.
https://github.com/github/linguist/issues?q=is%3Aissue+verilog+coq+is%3Aclosed

And the issue still persists and is (probably increasing) due to the trend towards open hardware design.
As an example of files being misinterpreted here is a repo:
https://github.com/fusesoc/tiny-cores

These files in the repo above are clearly Verilog, but considered by Linguist to be Coq files:
shift_reg/piso.v
shift_reg/sipo.v

Expected language:

Verilog
https://en.wikipedia.org/wiki/Verilog

Detected language:

Coq
https://en.wikipedia.org/wiki/Coq

@lildude
Copy link
Member

lildude commented Oct 12, 2020

As mentioned in at #4750 (comment): this will be because the heuristic isn't precise enough and would need tweaking by someone that knows the languages involved.

#4751 was an attempt at improving things, but it went stale /cc @0x00a1e9

If you've got the bandwidth and knowledge of either/both of these languages, we'd be happy for a new PR or for a request to re-open and finish that other PR.

@lildude lildude added Good First Issue This is a great opportunity to start contributing to Linguist Misidentified Language labels Oct 12, 2020
@secworks
Copy link
Author

I do know Verilog. But someone else must provide knowledge for Coq.
That said, it should be really easy if linguist can parse words.

Verilog files will normally contain a "module"-"endmodule" pair.

Browsing the Coq language, these words are not part of the language.
On the other hand, Coq contains a "::=" operator that would be a syntax error in Verilog.
The Coq operators "<+" and "<-" would also be syntax errors in Verilog, SystemVerilog.

Coq language ref I used:
https://coq.inria.fr/refman/language/core/basic.html

SystemVerilog LRM:
http://www.ece.uah.edu/~gaede/cpe526/SystemVerilog_3.1a.pdf

Verilog-2001 HDL Cheat sheet:
https://sutherland-hdl.com/pdfs/verilog_2001_ref_guide.pdf

@lf-
Copy link
Contributor

lf- commented Nov 1, 2020

@lf-
Copy link
Contributor

lf- commented Nov 1, 2020

#4751 looks like it went stale because of a lack of explanation of why the sample given looked so different. In brief, this is because the new sample is (program-like, cannot be synthesized into hardware) testbench code that uses tasks as opposed to the existing samples which are mostly if not entirely synthesizable Verilog. These subsets look very different because of the typical sequential execution model of testbench code as compared to synthesizable hardware description that has everything happen at the same time.

If this is the only issue with #4751, I'd support merging it in its current state as it looks like it appropriately removes the imprecise Coq comment heuristic that's almost certainly causing this problem.

@Alhadis
Copy link
Collaborator

Alhadis commented Nov 1, 2020

@lf- You definitely sound like you know your stuff; so you're saying (* this *) is a valid comment in Verilog as well? (Or, well, any legal Verilog syntax, for that matter?)

@lf-
Copy link
Contributor

lf- commented Nov 1, 2020

@Alhadis yes, (* ... *) is valid Verilog as a synthesis attribute but not as a comment. Similar in function to [[attributes]] in C++ in that it's a directive to the compiler related to the next item.

I believe that this syntax was introduced to supersede the previous and rather cursed practice of putting synthesis attributes in comments.

@Alhadis
Copy link
Collaborator

Alhadis commented Nov 1, 2020

In that case, the changes proposed by #4751 were most definitely valid. Unfortunately, the submitter deleted their fork, so I can't reopen the PR.

@lf- If you want to submit a new PR reintroducing those changes, rest assured you have my blessing. 👍 I feel more confident letting you take the reins here, as you're clearly the expert.

@lf-
Copy link
Contributor

lf- commented Nov 2, 2020

I have 🦇👻reanimated👻🦇 and rebased #4751 onto master in #5075 as requested :)

@Alhadis
Copy link
Collaborator

Alhadis commented Nov 2, 2020

Thanks! 👍 Approved for merge (though only a GitHub staff member can do that, so we'll have to wait until @lildude gets online)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Good First Issue This is a great opportunity to start contributing to Linguist Misidentified Language
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants