Support finite-state-machine coverage #763

sean-anderson-seco · 2023-09-18T16:36:34Z

I am working on a project with coverage requirements, and one of those requirements is for "FSM coverage." This is rather nebulously defined, but from what I can tell it refers to the following set of features:

Coverage for the state variable, to see if each state is tested. This is sort of like a specialized form of expression coverage.
Coverage for state transitions, to see if each transition is tested. This requires a list of valid state transitions, as we typically don't care if we don't test A -> B if such a transition is invalid. There are two ways to approach this. First, the valid transitions can be determined from the VHDL, in which case this is a subset of branch coverage (although things like when STATE7 downto STATE0 can be covered with only two transitions). Second, the transitions can be specified "out-of-band" by some other method (such as a DSL in comments or PSL). In the latter case, this is a sort of specialized toggle coverage.
Coverage for n state transitions in a row. This is really more of an extended feature, unnecessary for basic coverage.
Assertions against invalid states (if using a std_logic_vector instead of an enum), and invalid state transitions. This is not really coverage, but is often considered part of FSM coverage, since it can be easily implemented once valid states and transitions are known.

There are several approaches for implementing this feature:

Extend expression coverage to enums. This might not always be relevant, since some enums might be reused in different places with uninteresting enumerations handled by default cases. However I think exclude files can be used to ignore such cases. This would not handle non-enum state variables, but it could be possible to add some kind of annotation/comment to cover them. It could also be possible to automatically detect state variables and extract the states, such as what is done in synthesis (ex1, ex2), but that is probably unnecessary for an initial attempt.
Add a DSL used to explicitly specify state transitions (e.g. in a comment). This is the approach used by Aldec [1]. Their approach always requires specifying states explicitly, but I think this is only really necessary for non-enum state variables.
Implement the cover and covergroup directives in PSL, which can be used to implement FSM coverage (ex1, ex2). This depends on PSL support #455, but is a much more flexible approach to adding new coverage types, since arbitrary expressions can covered.

I'm not going to need this feature for a while, and I don't plan on working on it in the near future. So this should be considered more of a wishlist idea. However, I've been pretty happy with the existing coverage, and I would love to avoid buying a commercial simulator just to get one specific type of coverage.

[1] Behind a login, but you can see some examples here and here.

The text was updated successfully, but these errors were encountered:

Blebowski · 2023-09-18T18:39:05Z

Hi @sean-anderson-seco,

thanks for hints :) Basically everything you said is correct. I was thinking about implementing this when I was doing first coverage implementation. I did some "research" about FSM extraction, and I ended up leaving the idea since it was quite complex.

Coverage for the state variable, to see if each state is tested. This is sort of like a specialized form of expression coverage.

This one should be pretty easy to implement. Simply by tracking if all possible enum values of "signal" / "variable" were visited.
There are few issues, like:

Not all enums in code are FSMs -> A simple hint / DSL in the meta-comment above the state register definition would solve this.
Not all signals of the FSM enum type are the state registers. There are many FSM coding styles, e.g. single process, two/three process, where the next_state decoder is combo process and the state register is separate clocked process. To avoid false positives, tracking only state register would be good. Again, here a simple meta-comment would do the job.

It could even work with reversed logic. Consider all enums as FSMs unless there is a meta-comment that marks the signal as "not an FSM". Or some option like --fsm-track-all-enums that would turn on this behavior. I know that in higher level FPGA designs, people use enums quite a lot for set of valid config values transmitted e.g. over a certain bus. That is not really an FSM. Making it simple to use is IMHO important. I can imagine that there could be a lot of false-positives here, and excluding each one line-by-line can be time consuming.

As for the coverage type, I would split it from expression coverage to separate FSM coverage (or FSM state coverage).

Implementation-wise, there might be a limit on number of states that can be supported. I am not sure whether it is not the case now with enum itself.

Coverage for state transitions, to see if each transition is tested. This requires a list of valid state transitions, as we typically don't care if we don't test A -> B if such a transition is invalid. There are two ways to approach this. First, the valid transitions can be determined from the VHDL, in which case this is a subset of branch coverage (although things like when STATE7 downto STATE0 can be covered with only two transitions). Second, the transitions can be specified "out-of-band" by some other method (such as a DSL in comments or PSL). In the latter case, this is a sort of specialized toggle coverage.

This is more tricky. You can google so many articles on the FSM extraction topic. AFAIK, simulator/synthesizer tool vendors do have it, but it never works 100 % (or close to it) for different FSM coding styles. To do it correctly, one needs to synthesize the FSM as you have shown. I also think the Aldec approach here is tricky. The meta-code in the comment that defines the valid transitions is quite complex, there is whole "grammar" involved with it. Once the FSM changes, you need to update the code
too, so suddenly there are two things to keep in sync -> Verification becomes a night-mare.

I have seen a company, where design and test-bench repository were split, and designer did not have write access to TB repository and vice-versa for the verification guy. This was some sort of process to guarantee that "the code is tested by someone else than by its designer". Altough it may be crazy, I would not want to be a verification engineer who can't write
meta-comments into the design to identify all valid transitions. And this information often arises only once the coverage
is being debugged. That is typically done by verification guys...

Coverage for n state transitions in a row. This is really more of an extended feature, unnecessary for basic coverage.

Yup agreed, I think we can start dealing with this once we have the first two types working well. This is basically a cyclomatic coverage that tracks how is FSM "traversed" from "IDLE" state to its final state.

Assertions against invalid states (if using a std_logic_vector instead of an enum), and invalid state transitions. This is not really coverage, but is often considered part of FSM coverage, since it can be easily implemented once valid states and transitions are known.

This is also quite tricky. I know some designers implement VHDL FSMs via std_logic_vector, and define set of constants that are names of the states. This is good if you want to explicitly implement an FSM that has redundancy in its state vector (for safety or security). Other than that, there are tool specific attributes (I think enum_encoding) where you can specify for synthesizer how to encode the FSM. In VHDL this is simply problematic. System Verilog enums are much better suited for this purpose.

sean-anderson-seco · 2023-09-18T19:50:36Z

Coverage for the state variable, to see if each state is tested. This is sort of like a specialized form of expression coverage.

This one should be pretty easy to implement. Simply by tracking if all possible enum values of "signal" / "variable" were visited. There are few issues, like:

Not all enums in code are FSMs -> A simple hint / DSL in the meta-comment above the state register definition would solve this.

One way could be to extend the exclude file format to something like

exclude work.top.my_variable_to_ignore BIN_STATE

although maybe there should be a version of the above which acts on the enum type itself, in the case where the enum is never a state variable.

Not all signals of the FSM enum type are the state registers. There are many FSM coding styles, e.g. single process, two/three process, where the next_state decoder is combo process and the state register is separate clocked process. To avoid false positives, tracking only state register would be good. Again, here a simple meta-comment would do the job.

It could even work with reversed logic. Consider all enums as FSMs unless there is a meta-comment that marks the signal as "not an FSM". Or some option like --fsm-track-all-enums that would turn on this behavior. I know that in higher level FPGA designs, people use enums quite a lot for set of valid config values transmitted e.g. over a certain bus. That is not really an FSM. Making it simple to use is IMHO important. I can imagine that there could be a lot of false-positives here, and excluding each one line-by-line can be time consuming.

I think the easiest way for this is to only include signals assigned in clocked processes by default. Although this gets close to detecting state machines. However, I think we could get pretty far with just a simple heuristic like looking for rising_edge and its respellings. Maybe the exclude file could be extended with include to handle the case where autodetection does not work? I think as long as we have reasonably-useful defaults and the ability to override them, we will be fine.

As for the coverage type, I would split it from expression coverage to separate FSM coverage (or FSM state coverage).

Yes, I agree.

Implementation-wise, there might be a limit on number of states that can be supported. I am not sure whether it is not the case now with enum itself.

I think the limit right now is 2**32. But a lower heuristic cutoff like 256 might be more appropriate.

Coverage for state transitions, to see if each transition is tested. This requires a list of valid state transitions, as we typically don't care if we don't test A -> B if such a transition is invalid. There are two ways to approach this. First, the valid transitions can be determined from the VHDL, in which case this is a subset of branch coverage (although things like when STATE7 downto STATE0 can be covered with only two transitions). Second, the transitions can be specified "out-of-band" by some other method (such as a DSL in comments or PSL). In the latter case, this is a sort of specialized toggle coverage.

This is more tricky. You can google so many articles on the FSM extraction topic. AFAIK, simulator/synthesizer tool vendors do have it, but it never works 100 % (or close to it) for different FSM coding styles. To do it correctly, one needs to synthesize the FSM as you have shown. I also think the Aldec approach here is tricky. The meta-code in the comment that defines the valid transitions is quite complex, there is whole "grammar" involved with it.

This is why I think it might be better to just support PSL for this area, as PSL already has a grammar which can express these relationships. I believe something like

-- Aldec enum fsm_enc CURRENT = state
-- Aldec enum fsm_enc STATES = A, B
-- Aldec enum fsm_enc TRANS = A -> B, B -> A

could be expressed in PSL as

cover { state = A; state = B };
cover { state = B; state = A };

and of course you could also do state coverage manually with something like

cover { state = A };
cover { state = B };

Once the FSM changes, you need to update the code too, so suddenly there are two things to keep in sync -> Verification becomes a night-mare.

Yes, I am not really a fan of this, since most other forms of coverage are "automatic." But it's unavoidable if you view coverage as a test for the testbench and the module. If you don't care about that, then branch coverage is good enough.

I have seen a company, where design and test-bench repository were split, and designer did not have write access to TB repository and vice-versa for the verification guy. This was some sort of process to guarantee that "the code is tested by someone else than by its designer". Altough it may be crazy, I would not want to be a verification engineer who can't write meta-comments into the design to identify all valid transitions. And this information often arises only once the coverage is being debugged. That is typically done by verification guys...

Well, as I understand it, things like vunits are designed to allow this kind of split verification method. Which is another point in the PSL direction.

Assertions against invalid states (if using a std_logic_vector instead of an enum), and invalid state transitions. This is not really coverage, but is often considered part of FSM coverage, since it can be easily implemented once valid states and transitions are known.

This is also quite tricky. I know some designers implement VHDL FSMs via std_logic_vector, and define set of constants that are names of the states. This is good if you want to explicitly implement an FSM that has redundancy in its state vector (for safety or security). Other than that, there are tool specific attributes (I think enum_encoding) where you can specify for synthesizer how to encode the FSM. In VHDL this is simply problematic. System Verilog enums are much better suited for this purpose.

IMO the encoding used is really up to your synthesis tool. Most synthesis tools support recoding state machines (so your
binary state machine could be recoded as one-hot), or optimizing (eliminating) default cases. But they also typically support forcing a particular encoding style (although I have never seen any which use enum_encoding) and preventing removal of the default case. The only major advantage I can see is better X-propagation, although you can sort of emulate it by doing something like

type state_t is (Uninitialized, ...);
-- ...
case state is
-- ...
when Uninitialized =>
    my_output <= (others => 'X');
end case;

but this is of course a bit error prone, and will probably mess with your synthesizer. (One other feature I would love to see would be better (non-standard) X propagation options, but that's a wish for another issue).

Back to the point, I think initially we just shouldn't support std_logic_vector . It might be easier to defer that style altogether, suggesting to use explicit PSL cover statements.

On the other hand, adding assertions against invalid state transitions would probably be much more verbose than if they were calculated automatically.

sean-anderson-seco · 2023-09-19T22:54:54Z

OK, to illustrate the PSL point a bit further,

type state_t is (
	IDLE,
	START,
	DATA,
	STOP,
	ERR
);
signal state: state_t;

type trans_t is array (state_t, state_t) of boolean;
function gen_transitions return trans_t is
	variable valid: trans_t;
begin
	for s in state_t loop
		valid(s, s) := true;
	end loop;
	valid(IDLE,  START) := true;
	valid(START, IDLE)  := true;
	valid(START, DATA)  := true;
	valid(DATA,  STOP)  := true;
	valid(STOP,  IDLE)  := true;
	valid(STOP,  ERR)   := true;
	valid(ERR,   IDLE)  := true;
	return valid;
end;
constant valid: trans_t := gen_transitions;

-- ...

for_from: for state_from in state_t generate begin
	for_to: for state_to in state_t generate begin
		check: if valid(state_from, state_to) generate
			cover { state = state_from; state = state_to };
		else generate
			assert never { state = state_from; state = state_to };
		end generate;
	end generate;
end generate;

And now you have cover/asserts for all the valid/invalid transitions. This technique can be used with GHDL using --psl-report, although it lacks the usual coverage niceties (such as automatic merging of reports, although perhaps that could be done with jq...).

It's more verbose than the Aldec DSL, but I think the generate loops can be abstracted away to be reused when necessary.

Blebowski · 2023-09-20T08:58:53Z

Yup, thanks, understood. I like the idea of emulating FSM transition tracking by PSL sequences. One thing that would need to be taken care of, is then filtering of the dummy added PSL assertions so that they don't show up in the functional coverage reports. Also, the PSL support is very early state and I don't know what are @nickg plans in this regard. If you follow the discussion in the PSL issue, I think it is not a priority now.

As for the FSM state tracking, I think a PSL cover point would not even be needed. I think it would suffice to emit code for an additional "process" during lower phase. There would be a run-time data (similar to coverage counters), whose each bit would keep track of a state visited (lets.call it RT_STATE_VECT). The emited code would correspond to something like:

process (my_state)
begin
    case (my_state) is
    when STATE_A => RT_STATE_VECT(0) := 1
    when STATE_B => RT_STATE_VECT(1) := 1
    when STATE_C => RT_STATE_VECT(2) := 2
    ...
end process;

The slightly tricky part might be in long enums that have more states than number of bits in the RT_STATE_VECT. But that is easy to handle simply by allocating multiple consecutive words for single enum. E.g. 33-rd state would then be represented by bit 0 of second word (assuming RT_STATE_VECT has 32 bits).

I think the addressing logic (selecting the word and also the bit) may even not need a case in the emited code. Since enum values are numbered from 0 in the generated code, pseudo-code will look something like this (assuming RT_STATE_VECT is an array of uint32_t):

RT_STATE_VECT[ENUM_VAL / 32](ENUM_VAL % 32) = 1

I dare to say that this should be doable with the current API of VCODE and its simpler than emmiting additional PSL cover point.

@nickg what do you think?

nickg · 2023-09-20T20:42:32Z

Also, the PSL support is very early state and I don't know what are @nickg plans in this regard. If you follow the discussion in the PSL issue, I think it is not a priority now.

It supports some basic FL operators but not much else. Recently I've been focussing on adding VHDL-2019 support but that is mostly(?) done now.

As for the FSM state tracking, I think a PSL cover point would not even be needed. I think it would suffice to emit code for an additional "process" during lower phase. There would be a run-time data (similar to coverage counters), whose each bit would keep track of a state visited (lets.call it RT_STATE_VECT). The emited code would correspond to something like:

[...]

@nickg what do you think?

I think you could do that with the existing coverage counters which are just an array of 32-bit integers that you treat as a bit mask.

Blebowski · 2023-09-30T06:59:21Z

Thinking about this again, I think emitting extra process as described above is too complex. Simpler way to achieve the same is to use callback on the FSM state signal. Then the implementation will be very similar to toggle coverage that is already implemented.

Blebowski · 2023-10-08T14:50:28Z

I'am trying to give a shot at implementing this. To handle enums with arbitrary number of literals, I need to either emit multiple cover tags to track single FSM, or rework the data of cover_tag_t to be flexible size array. Thus I would need to first rework the
tags from array to a linked list as noted in: #564

nickg · 2023-10-08T16:41:21Z

I would rather keep it as a flat array. I think you can just allocate N/32 sequential tags and then refer to them by the index of the first one.

Blebowski · 2023-10-08T17:43:34Z

I would rather keep it as a flat array.

If we want to solve the #564, we need to be able to insert a tag into the middle of array once a tag that does not exist in first .covdb is encountered during merging of second .covdb. How to do that without re-allocating and copying rest of the array from index where item was inserted ? Don't you think that would be overkill ?

The current way of accessing an element of tag array is either accessing last element, or accessing i-th element when iterating all elements. Both of these as well as inserting new element can be done effectively with singly linked list. Drawback is heap allocations for each tag, current array always allocates in "batch".

I think you can just allocate N/32 sequential tags and then refer to them by the index of the first one.

Yup, that is what I tried to do in: Blebowski@3cce18d.
The code started to get messy when I got to generating coverage report. If you want to process all values of a single enum into a single table, you consume multiple tags at once during loop in cover_report_scope. So you need start skipping the tags during the loop. It just seems messy...

Wouldn't it be better to put data in cover_tag_t at the end of the struct, make it an C99 style array, and keep it single element for all existing cover-tag types and N/32 elements for FSM state tags? When converted to linked-list, flexible size data member could be used for each element of cover_tag_t.

The size of the allocated array in the run-time code would differ from number of tags, so "tag-index" in the array would no longer be offset into the run-time data, but I don't think that would mind too much...

Blebowski · 2023-10-08T18:10:55Z

OK, I see now that you changed the tag array not to be flat accross the whole design, but only across the current cover scope in:
1ad3f89#diff-78cc7c5d81397a35561b50dc95f435b72092b6912067aa78e4fcb2ae4178cf77

Was the intention to solve the problem of above? WIth this new implementation, I think it could be sufficient to simply append the "new" tag to the most common scope.

nickg · 2023-10-08T18:58:54Z

Was the intention to solve the problem of above? WIth this new implementation, I think it could be sufficient to simply append the "new" tag to the most common scope.

I did that because code gets generated on-demand now rather than all at once when you call lower_unit, so the old scheme of pushing/popping scopes doesn't work.

The name cover_tag_t is quite confusing now. We should rename it to something else e.g. cover_bin_t as they are really representing coverage bins and reserve "tag" for indices into the array of counters.

Blebowski · 2023-10-08T19:07:29Z

OK, I see. Anyway, it helps also with the merging/inserting. If the tag from second .covdb does not exist in first one, we simply locate a most suitable scope (that has most of its path prefix matching), and append the tag to array of this scope. In the report, items within single scope might not be strictly ordered by "lines", but I don't think that minds.

The name cover_tag_t is quite confusing now. We should rename it to something else e.g. cover_bin_t as they are really representing coverage bins and reserve "tag" for indices into the array of counters.

OK. I would choose cover_item_t. Current cover_tag_t already has bins (e.g. 0->1 , 1->0 toggles), so that would be confusing. Then it would be:

cover_item_t
-> tag (index into run-time array)
-> bins (0->1 toggle, etc...)

Blebowski · 2023-10-22T12:04:48Z

@sean-anderson-seco I have implemented the basic FSM coverage, @nickg just merged it.

It only tracks FSM state coverage, and NVC considers signals of all user defined enums as FSMs.
I have added some ways how to further specify what is, and what is not an FSM.
Feel free to give it a try.

As for other types of FSM coverage, it requires much more elaborate implementation of FSM detection
and state transition detection. It needs translation of the RTL to some kind of "logic graph" and tracing its signals.
Essentially, its synthesis of the code. A nice paper about this:

Automatic Controller Detection for Large Scale RTL Designs, Wei Song, Jim Garside,
School of Computer Science, the University of Manchester

@nickg , what do you think about it?

For me, it seems a lot of complexity, and I don't really dare to try to do that now. I do think the feature
is usefull (especially detecting all valid transitions and transition coverage), but maybe its too early to
do that in NVC. WBU getting back to this in some time later (e.g. once there is stable Verilog support
and coverage is extended on Verilog too). ?

sean-anderson-seco · 2023-10-23T16:27:17Z

@sean-anderson-seco I have implemented the basic FSM coverage, @nickg just merged it.

Looks good :)

Blebowski · 2023-11-12T15:05:36Z

Hello @sean-anderson-seco , do you think it is OK to close this issue, or would you like to keep it open as a TODO for the transition coverage?

nickg added the coverage label Sep 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support finite-state-machine coverage #763

Support finite-state-machine coverage #763

sean-anderson-seco commented Sep 18, 2023

Blebowski commented Sep 18, 2023

sean-anderson-seco commented Sep 18, 2023 •

edited

sean-anderson-seco commented Sep 19, 2023

Blebowski commented Sep 20, 2023 •

edited

nickg commented Sep 20, 2023

Blebowski commented Sep 30, 2023

Blebowski commented Oct 8, 2023

nickg commented Oct 8, 2023

Blebowski commented Oct 8, 2023

Blebowski commented Oct 8, 2023

nickg commented Oct 8, 2023

Blebowski commented Oct 8, 2023 •

edited

Blebowski commented Oct 22, 2023

sean-anderson-seco commented Oct 23, 2023

Blebowski commented Nov 12, 2023

Support finite-state-machine coverage #763

Support finite-state-machine coverage #763

Comments

sean-anderson-seco commented Sep 18, 2023

Blebowski commented Sep 18, 2023

sean-anderson-seco commented Sep 18, 2023 • edited

sean-anderson-seco commented Sep 19, 2023

Blebowski commented Sep 20, 2023 • edited

nickg commented Sep 20, 2023

Blebowski commented Sep 30, 2023

Blebowski commented Oct 8, 2023

nickg commented Oct 8, 2023

Blebowski commented Oct 8, 2023

Blebowski commented Oct 8, 2023

nickg commented Oct 8, 2023

Blebowski commented Oct 8, 2023 • edited

Blebowski commented Oct 22, 2023

sean-anderson-seco commented Oct 23, 2023

Blebowski commented Nov 12, 2023

sean-anderson-seco commented Sep 18, 2023 •

edited

Blebowski commented Sep 20, 2023 •

edited

Blebowski commented Oct 8, 2023 •

edited