Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coverage: Source of covered / uncovered data #788

Open
Blebowski opened this issue Nov 2, 2023 · 3 comments
Open

Coverage: Source of covered / uncovered data #788

Blebowski opened this issue Nov 2, 2023 · 3 comments
Labels

Comments

@Blebowski
Copy link
Sponsor Contributor

When doing code coverage, there is no way how to find out in the generated report the original coverage
database that lead to the coverage item being covered or uncovered. This would be a nice feature.

This is related to adding --name option that would uniquely distinguish the elaborations of otherwise
equal design. We can either keep within covdb the path to the covdb itself, or keep the --name to distinguish
individual "runs".

Then, upon merging of coverage, the original path of all covdb that were merged would be placed into the covdb.
There would be a list of paths in the beginning of the covdb that would indicate all the DBs that were used to
get this final "merged" one.

During merging, we always know the covdb that is being currently "merged" into the "old" covdb. Thus, we could
keep per-tag mask of covdbs that lead to the tag being covered / uncovered. Only drawback is, that the number of
databases is not limited, so, the "per-tag" mask of covdbs would have variable length depending on the number of
covdbs. Or, we can support only e.g. up to 64 covdbs, but I would like this to work for unlimited number of covdbs.

@nickg nickg added the coverage label Nov 2, 2023
@Blebowski
Copy link
Sponsor Contributor Author

Blebowski commented Jan 14, 2024

Hi,

I am trying to rethink how to implement this, and I come no an interesting limitation. Since branch, expression and toggle coverage encode multiple "bins" in single coverage item, if you merge such items together, it is very difficult to track covdb
where each of the bins was covered. It would essentially lead to tracking each of the masks separately, which ends up
with up to 4 variable length arrays. Simply, it is starting to seem messy...

It seems that the design decision to encode multiple "bins" into a single cover_item_t indeed makes the things more complicated than it is necessary. I am thinking to refactor this, and keep single cover_item_t for a single bin. The num attribute would then give how many follow-up cover_item_t correspond to the same "original coverage item".

Thus, each branch with if/else and each toggle would be split into two coverage items (0/1, true/false). Each binary expression
would be split into up-to 4 coverage items (01,10,11,00). Anyway this is already done for "case" branch coverage where number
of bins is not limited, thus it is not possible to encode it into single item. Similarly, for FSM coverage, each state is encoded in its own cover_item_t, and multiple cover_item_t are aggregated into a single table during report generation.

With this change, the run-time data for each "bin" / cover_item_t would become a counter (the same as for statements), so the code for merging and converting the cover_item_t to cover_chain_t would simplify.

Also, hierarchy path of each cover_item_t would need to be extended to distinguish different bins. I would suffix the hierarchy path with the bin name exactly as it is placed into the exclude file. Handling of exclude files will maybe simplified.

A clear drawback is the memory usage (and therefore size of covdb). If e.g. running with --include-mems, NEORV32 contains about half-milion cover items that are just toggles on the memories. Rest of the NEORV32 is about 50 K of cover items. Since each toggle would split into two, the covdb size (and also size of run-time allocated coverage data) would essentially double...
Similarly, if there is a lot of e.g. XOR expressions, each XOR would split into 4 separate items.

That being said, the user interface may change slightly, especially in the coverage exclude file, and therefore be backwards incompatible.

Since this is a big change, I am curious about your opinion. @nickg, would you go for it, or do you think the increased memory usage is not worth possibly simpler implementation?

@nickg
Copy link
Owner

nickg commented Jan 16, 2024

A clear drawback is the memory usage (and therefore size of covdb). If e.g. running with --include-mems, NEORV32 contains about half-milion cover items that are just toggles on the memories. Rest of the NEORV32 is about 50 K of cover items. Since each toggle would split into two, the covdb size (and also size of run-time allocated coverage data) would essentially double... Similarly, if there is a lot of e.g. XOR expressions, each XOR would split into 4 separate items.

I think we really need to reconsider whether include-mems for toggle coverage is useful. At the time we didn't see any other tools supporting this and it doesn't make sense to avoid doing useful simplifications and improvements because of one very niche feature.

That being said, the user interface may change slightly, especially in the coverage exclude file, and therefore be backwards incompatible.

I don't think this matters too much as there are not many users. It's better to iterate the design a bit and arrive at a good solution.

@Blebowski
Copy link
Sponsor Contributor Author

I think we really need to reconsider whether include-mems for toggle coverage is useful. At the time we didn't see any other tools supporting this and it doesn't make sense to avoid doing useful simplifications and improvements because of one very niche feature.

I still think it is useful. VCS, Aldec and NCSIM do have this feature. OTOH, you are right it is a niche feature, and I would simplify the implementation even at the expense of increased memory usage in this one case. The implementation in might end up being even faster. I would definitely compare performance before and after.

I will go ahead with it then. My time options are very limited, so it again will take some time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants