-
Notifications
You must be signed in to change notification settings - Fork 9
Add more labels to the quickcheck-lockstep StateMachine tests #431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jorisdral
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice to get a better sense of our test distribution! I only have minor suggestions
| updDupTableActionLog stats = case action of | ||
| Lookups ks tableVar | ||
| | not (null ks) -> updateLastActionLog tableVar | ||
| RangeLookup _ tableVar -> updateLastActionLog tableVar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could check in this case if the range argument to the action does not represent an empty range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like this was done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doh! I did it on the wrong one! (updNumActionsPerTable rather than updDupTableActionLog)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Done. 🤦
c85bd32 to
2ef54a4
Compare
|
@jorisdral everything should be addressed now. Follow-up PR nearly ready too... |
2ef54a4 to
b04fddd
Compare
jorisdral
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though there is one unresolved comment
| updDupTableActionLog stats = case action of | ||
| Lookups ks tableVar | ||
| | not (null ks) -> updateLastActionLog tableVar | ||
| RangeLookup _ tableVar -> updateLastActionLog tableVar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like this was done
b04fddd to
c970ab4
Compare
Instead of using Dynamic. This will allow inspecting the Table at an unknown type, for use with polymrphic operations like size that don't care about the concrete table type.
In the QLS tests.
Initial results
Number of tables (100 in total):
33% NumTables "2 <= n < 4"
23% NumTables "4 <= n < 8"
17% NumTables "1 <= n < 2"
16% NumTables "8 <= n < 16"
9% NumTables "n == 0"
2% NumTables "16 <= n < 32"
Number of actions per table (410 in total):
27.8% NumTableActions "n == 0"
22.0% NumTableActions "2 <= n < 4"
19.3% NumTableActions "4 <= n < 8"
16.1% NumTableActions "1 <= n < 2"
12.4% NumTableActions "8 <= n < 16"
2.4% NumTableActions "16 <= n < 32"
As we can see, 9% of runs have no tables at all! This seems a bit
surprising, and not terribly useful. Although such runs will be quick,
in CI we typically only run 100 tests.
And also in many runs there a lot of tables (nearly 20% have 8 or more).
This is probably too many to get interesting interleavings of operations
between related (duplicated) tables. So we may want to reduce the number
of tables, and make more related tables so we can get better coverage of
interleaved operations on related tables.
And as we can also see, over a quarter of tables have no actions at all!
And generaly the number of actions per table is quite low. This would
probably be better with fewer tables.
Initial results:
Table sizes (440 in total):
41.1% TableSize "n == 0"
21.1% TableSize "32 <= n < 64"
13.4% TableSize "16 <= n < 32"
8.9% TableSize "8 <= n < 16"
7.5% TableSize "64 <= n < 128"
2.7% TableSize "4 <= n < 8"
2.3% TableSize "128 <= n < 256"
2.0% TableSize "2 <= n < 4"
0.7% TableSize "1 <= n < 2"
0.2% TableSize "256 <= n < 512"
Again, it's not great that 40% of tables are empty. We'll try and
improve that. On the othe hand, the long tail is quite long, going out
over 100 elements.
Both success, failing and total.
Initial results:
Actions total (100 in total):
55% NumActions "10 <= n < 100"
36% NumActions "1 <= n < 10"
7% NumActions "n == 0"
2% NumActions "100 <= n < 1000"
Actions that succeeded total (100 in total):
52% NumActions "10 <= n < 100"
38% NumActions "1 <= n < 10"
9% NumActions "n == 0"
1% NumActions "100 <= n < 1000"
Actions that failed total (100 in total):
68% NumActions "1 <= n < 10"
22% NumActions "n == 0"
10% NumActions "10 <= n < 100"
As we can see, there's a surprising number of runs with no actions at
all (7%) or no successful actions (9%).
While it's important to get coverage of failing actions, we don't need
lots of coverage since the reasons are always the same ones. There are
not a lot of complex interactions in these cases. So we're wasting run
actions on them.
Initial results:
Actions that failed (409 in total):
35.0% ActionFail "Open" "MErr ErrSnapshotDoesNotExist"
10.3% ActionFail "DeleteSnapshot" "MErr ErrSnapshotDoesNotExist"
7.1% ActionFail "Inserts" "MErr ErrTableHandleClosed"
6.8% ActionFail "Deletes" "MErr ErrTableHandleClosed"
6.8% ActionFail "Lookups" "MErr ErrTableHandleClosed"
6.1% ActionFail "RetrieveBlobs" "MErr ErrBlobRefInvalidated"
6.1% ActionFail "Updates" "MErr ErrTableHandleClosed"
5.4% ActionFail "ReadCursor" "MErr ErrCursorClosed"
3.9% ActionFail "Snapshot" "MErr ErrSnapshotExists"
3.4% ActionFail "RangeLookup" "MErr ErrTableHandleClosed"
3.2% ActionFail "Open" "MErr ErrSnapshotWrongType"
2.0% ActionFail "Duplicate" "MErr ErrTableHandleClosed"
2.0% ActionFail "NewCursor" "MErr ErrTableHandleClosed"
2.0% ActionFail "Snapshot" "MErr ErrTableHandleClosed"
As we can see, each action has exactly one reason for failing. This
shows clearly that we don't need a lot of coverage for these failures.
The only non-trivial one is the blob refs becomming invalidated.
Otherwise, there's no complex logic or state for any of these failures.
This adds a measure of how much coverage we are getting of uses of
duplicate tables, to help us make sure we are getting good test coverage
of the multiple writable table handles feature.
We count the number of interleaved actions on different but related
tables. In particular we count multiple actions on the same table only
once: we are only interested in actions that interleave between
different but related tables. Tables are defined to be related if they
derive (via 'duplicate') from the same ultimate parent table.
Initial results:
Interleaved actions on table duplicates (270 in total):
88.1% DupTableActionLog "1 <= n < 2"
6.3% DupTableActionLog "2 <= n < 4"
3.3% DupTableActionLog "4 <= n < 8"
1.9% DupTableActionLog "8 <= n < 16"
0.4% DupTableActionLog "16 <= n < 32"
Obviously this isn't great yet: 88% of runs have no interleaved actions
at all.
c970ab4 to
28c4890
Compare
Occurrences of `ModelFindVariables` and `ModelLookup` in the `InLockstep` class are replaced by the newly exposed `ModelVarContext`. A `ModelFindVariables` can be recovered from a `ModelVarContext` using the new `findVars` functions. A `ModelLookup` can be recovered from a `ModelVarContext` using the new `lookupVars` function. Since these functions can be recovered from `ModelVarContext`, existing tests are guaranteed to be adaptable to the new `InLockstep` API. Motivation: previously in the `InLockstep` class, member functions would be passed a `ModelFindVariables` or a `ModelLookup`, but never both. In practice this turned out to be too restrictive, because one might want access to both in the same function, see IntersectMBO/lsm-tree#431. For example, `arbitraryWithVars` was previously only passed a `ModelFindVariables`, but without a `ModelLookup` one can not filter these variables based on the (modelled) outcome of the corresponding actions. The use case for #431 in particular was to filter variables that reference stateful handles (e.g., file handles) for handles that are open. Because of this filtering, one can skew the action distribution to generate more actions on open handles. It is often more interesting to test actions on open handles than it is to test actions on closed handles, which will presumably always return the same error.
Occurrences of `ModelFindVariables` and `ModelLookup` in the `InLockstep` class are replaced by the newly exposed `ModelVarContext`. A `ModelFindVariables` can be recovered from a `ModelVarContext` using the new `findVars` functions. A `ModelLookup` can be recovered from a `ModelVarContext` using the new `lookupVars` function. Since these functions can be recovered from `ModelVarContext`, existing tests are guaranteed to be adaptable to the new `InLockstep` API. Motivation: previously in the `InLockstep` class, member functions would be passed a `ModelFindVariables` or a `ModelLookup`, but never both. In practice this turned out to be too restrictive, because one might want access to both in the same function, see IntersectMBO/lsm-tree#431. For example, `arbitraryWithVars` was previously only passed a `ModelFindVariables`, but without a `ModelLookup` one can not filter these variables based on the (modelled) outcome of the corresponding actions. The use case for #431 in particular was to filter variables that reference stateful handles (e.g., file handles) for handles that are open. Because of this filtering, one can skew the action distribution to generate more actions on open handles. It is often more interesting to test actions on open handles than it is to test actions on closed handles, which will presumably always return the same error.
Occurrences of `ModelFindVariables` and `ModelLookup` in the `InLockstep` class are replaced by the newly exposed `ModelVarContext`. A `ModelFindVariables` can be recovered from a `ModelVarContext` using the new `findVars` functions. A `ModelLookup` can be recovered from a `ModelVarContext` using the new `lookupVars` function. Since these functions can be recovered from `ModelVarContext`, existing tests are guaranteed to be adaptable to the new `InLockstep` API. Motivation: previously in the `InLockstep` class, member functions would be passed a `ModelFindVariables` or a `ModelLookup`, but never both. In practice this turned out to be too restrictive, because one might want access to both in the same function, see IntersectMBO/lsm-tree#431. For example, `arbitraryWithVars` was previously only passed a `ModelFindVariables`, but without a `ModelLookup` one can not filter these variables based on the (modelled) outcome of the corresponding actions. The use case for #431 in particular was to filter variables that reference stateful handles (e.g., file handles) for handles that are open. Because of this filtering, one can skew the action distribution to generate more actions on open handles. It is often more interesting to test actions on open handles than it is to test actions on closed handles, which will presumably always return the same error.
Add labels for:
Initial results for each in each patch.
It's clear from several of these measures that the generation is suboptimal. One idea to improve it is to make the generation more sophisticated by (usually) only generating actions on live tables (or cursors, or existant snapshots). This is however difficult-to-impossible to do with the current
quickcheck-locksteppackage which does not provide alookUpfunction to the action generationarbitraryWithVars. So there's no way to filter vars by their liveness (from looking at the model). It may make sense to extendquickcheck-lockstepto support this.