
Suggest policy of reviewing failures before accepting new tests #1871

Open
wsnyder opened this issue Nov 27, 2021 · 3 comments

Comments

wsnyder (Member) commented Nov 27, 2021

#1870 points out some newly added tests that are not consistent/complete designs.

I'd like to suggest a new policy: before any new tests are accepted, the failures against every tool are manually reviewed, and the tests are accepted only if all failures represent real tool issues rather than some problem with the test itself (e.g. missing modules/packages).

Alternatively, before being accepted, the tests could be whitelisted or otherwise filtered to run against only those tools that don't need complete designs.
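
For illustration, a rough sketch of what such a per-test filter could look like (the `ALLOWED_TOOLS` table, the paths, and the tool names are all hypothetical, not part of the actual harness):

```python
# Hypothetical allowlist: map tests with incomplete designs to the tools
# that can still process them (e.g. parsers that don't need every module
# to be resolvable).
ALLOWED_TOOLS = {
    "tests/third_party/incomplete_design.sv": {"parser_a", "parser_b"},
}

def should_run(test_path, tool, all_tools):
    """Run the test on every tool unless it is explicitly restricted."""
    return tool in ALLOWED_TOOLS.get(test_path, all_tools)

# Only the listed parsers would be graded on the incomplete design.
tools = {"parser_a", "parser_b", "simulator_c"}
runnable = [t for t in sorted(tools)
            if should_run("tests/third_party/incomplete_design.sv", t, tools)]
print(runnable)  # ['parser_a', 'parser_b']
```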

Thanks for considering.

tgorochowik (Member) commented:

Thanks for the report, I agree! We'll try to come up with some ideas on how to address this.

tgorochowik (Member) commented:

We are working on a way to separate the main test suite (all those "unit tests", etc.) from the more complicated tests whose setup is not always fair to all the tools. These are going to get their own table at the bottom and will be excluded from the main statistics. Later, once we see that the extra tests work equally well for a number of tools, they can be moved back to the main report.

This should make the problem less important and will allow us to keep the tests that may seem broken for some tools but are still useful for various reasons.
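
A rough sketch of how such a split could work in the report generation (the `secondary` tag, the record layout, and the tool names are illustrative assumptions, not the actual sv-tests code):

```python
# Each record: (test name, tags, per-tool pass/fail results).
results = [
    ("unit/assign_basic", {"parsing"}, {"tool_a": True, "tool_b": True}),
    ("third_party/big_design", {"secondary"}, {"tool_a": False, "tool_b": True}),
]

# Secondary tests get their own table and stay out of the main statistics.
main_rows = [r for r in results if "secondary" not in r[1]]
secondary_rows = [r for r in results if "secondary" in r[1]]

def pass_rate(rows, tool):
    outcomes = [r[2][tool] for r in rows if tool in r[2]]
    return sum(outcomes) / len(outcomes) if outcomes else None

print("main table:", [r[0] for r in main_rows])
print("secondary table:", [r[0] for r in secondary_rows])
print("tool_a main pass rate:", pass_rate(main_rows, "tool_a"))  # 1.0
```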

wsnyder (Member, Author) commented Dec 13, 2021

I appreciate the work going into this project, so please take these as constructive suggestions.

What it currently feels like to me (perhaps wrongly) is that if a new test is deemed interesting for the parser, it gets imported, and the other tools that run here are ignored. I suggest the goal of sv-tests should be more inclusive.

I can see the usefulness of splitting the tests. Splitting will be an improvement in that it will no longer harm the "grade" of tools to which those tests do not apply. However, I don't think it addresses my larger point: it will still take time away from the tool teams, who will still (hopefully) be looking at the false failures among the "secondary tests".

I suggest it is a tool team's job, for a given tool, to:

  • understand all failures against the tool.
  • try to drive the tool's failures down.
  • try to reduce any special cases, in the harness or otherwise, that make the tool harder to use/test.

I suggest it is a test importer's job, before importing a test to master, to:

  • ensure the test applies to as many tools as possible, and check that each tool has been given the best reasonable chance against it.
  • ask "is this new test's failure something I can fix by editing the test/import?" If the answer is yes, that should be done before merging (e.g. if a test can't find a submodule, and that submodule is part of the third-party code, that's something that needs fixing).
  • ask "is this new test's failure because the test doesn't apply to that tool type (e.g. non-synthesizable)?" If the answer is yes, then a :type: marker or something similar should restrict that test to run only on appropriate tools (see the sketch after this list).
  • if blacklisting any tests, notify the third-party upstream so they have the option of improving their tests.
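
As a sketch of the kind of marker meant above (the metadata key, the tool capability table, and the skipping logic are all illustrative assumptions, written in the style of sv-tests' meta-comment headers rather than copied from them):

```python
import re

# Hypothetical test header, styled after an sv-tests meta-comment block.
TEST_HEADER = """
/*
:name: non_synthesizable_example
:type: simulation
*/
"""

# Which test types each tool claims to handle (illustrative only).
TOOL_CAPABILITIES = {
    "synth_tool": {"parsing", "elaboration"},
    "sim_tool": {"parsing", "elaboration", "simulation"},
}

def test_types(header):
    """Extract the declared test types from the meta-comment header."""
    match = re.search(r"^:type:\s*(.+)$", header, re.MULTILINE)
    return set(match.group(1).split()) if match else set()

def applicable(header, tool):
    """Grade a tool only on test types it claims to support."""
    return test_types(header) <= TOOL_CAPABILITIES.get(tool, set())

print(applicable(TEST_HEADER, "synth_tool"))  # False: skip, not a tool failure
print(applicable(TEST_HEADER, "sim_tool"))    # True: a failure here is real work
```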

Again, my goal is that when a new test is added and it fails on a tool, that means a tool designer has work to do on the tool, not time to spend even looking at a test that doesn't apply.
