Skip to content

Conversation

@dcoutts
Copy link
Collaborator

@dcoutts dcoutts commented Oct 14, 2024

Add labels for:

  • number of tables
  • number of actions per table
  • logical size of tables
  • per-run number of actions (failing and succeeding)
  • label failing actions with the exception
  • label interleaving depth on duplicate tables

Initial results for each in each patch.

It's clear from several of these measures that the generation is suboptimal. One idea to improve it is to make the generation more sophisticated by (usually) only generating actions on live tables (or cursors, or existant snapshots). This is however difficult-to-impossible to do with the current quickcheck-lockstep package which does not provide a lookUp function to the action generation arbitraryWithVars. So there's no way to filter vars by their liveness (from looking at the model). It may make sense to extend quickcheck-lockstep to support this.

@dcoutts dcoutts marked this pull request as ready for review October 15, 2024 09:59
@jorisdral jorisdral changed the title Add more label to the quickcheck-lockstep StateMachine tests Add more labels to the quickcheck-lockstep StateMachine tests Oct 15, 2024
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to get a better sense of our test distribution! I only have minor suggestions

updDupTableActionLog stats = case action of
Lookups ks tableVar
| not (null ks) -> updateLastActionLog tableVar
RangeLookup _ tableVar -> updateLastActionLog tableVar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could check in this case if the range argument to the action does not represent an empty range

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like this was done

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! I did it on the wrong one! (updNumActionsPerTable rather than updDupTableActionLog)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Done. 🤦

@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation branch 3 times, most recently from c85bd32 to 2ef54a4 Compare October 16, 2024 16:39
@dcoutts dcoutts requested a review from jorisdral October 16, 2024 16:39
@dcoutts
Copy link
Collaborator Author

dcoutts commented Oct 16, 2024

@jorisdral everything should be addressed now. Follow-up PR nearly ready too...

@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation branch from 2ef54a4 to b04fddd Compare October 16, 2024 16:45
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though there is one unresolved comment

updDupTableActionLog stats = case action of
Lookups ks tableVar
| not (null ks) -> updateLastActionLog tableVar
RangeLookup _ tableVar -> updateLastActionLog tableVar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like this was done

@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation branch from b04fddd to c970ab4 Compare October 21, 2024 12:18
Instead of using Dynamic. This will allow inspecting the Table at an
unknown type, for use with polymrphic operations like size that don't
care about the concrete table type.
In the QLS tests.

Initial results
      Number of tables (100 in total):
      33% NumTables "2 <= n < 4"
      23% NumTables "4 <= n < 8"
      17% NumTables "1 <= n < 2"
      16% NumTables "8 <= n < 16"
       9% NumTables "n == 0"
       2% NumTables "16 <= n < 32"

      Number of actions per table (410 in total):
      27.8% NumTableActions "n == 0"
      22.0% NumTableActions "2 <= n < 4"
      19.3% NumTableActions "4 <= n < 8"
      16.1% NumTableActions "1 <= n < 2"
      12.4% NumTableActions "8 <= n < 16"
       2.4% NumTableActions "16 <= n < 32"

As we can see, 9% of runs have no tables at all! This seems a bit
surprising, and not terribly useful. Although such runs will be quick,
in CI we typically only run 100 tests.

And also in many runs there a lot of tables (nearly 20% have 8 or more).
This is probably too many to get interesting interleavings of operations
between related (duplicated) tables. So we may want to reduce the number
of tables, and make more related tables so we can get better coverage of
interleaved operations on related tables.

And as we can also see, over a quarter of tables have no actions at all!
And generaly the number of actions per table is quite low. This would
probably be better with fewer tables.
Initial results:
      Table sizes (440 in total):
      41.1% TableSize "n == 0"
      21.1% TableSize "32 <= n < 64"
      13.4% TableSize "16 <= n < 32"
       8.9% TableSize "8 <= n < 16"
       7.5% TableSize "64 <= n < 128"
       2.7% TableSize "4 <= n < 8"
       2.3% TableSize "128 <= n < 256"
       2.0% TableSize "2 <= n < 4"
       0.7% TableSize "1 <= n < 2"
       0.2% TableSize "256 <= n < 512"

Again, it's not great that 40% of tables are empty. We'll try and
improve that. On the othe hand, the long tail is quite long, going out
over 100 elements.
Both success, failing and total.

Initial results:
      Actions total (100 in total):
      55% NumActions "10 <= n < 100"
      36% NumActions "1 <= n < 10"
       7% NumActions "n == 0"
       2% NumActions "100 <= n < 1000"

      Actions that succeeded total (100 in total):
      52% NumActions "10 <= n < 100"
      38% NumActions "1 <= n < 10"
       9% NumActions "n == 0"
       1% NumActions "100 <= n < 1000"

      Actions that failed total (100 in total):
      68% NumActions "1 <= n < 10"
      22% NumActions "n == 0"
      10% NumActions "10 <= n < 100"

As we can see, there's a surprising number of runs with no actions at
all (7%) or no successful actions (9%).

While it's important to get coverage of failing actions, we don't need
lots of coverage since the reasons are always the same ones. There are
not a lot of complex interactions in these cases. So we're wasting run
actions on them.
Initial results:
      Actions that failed (409 in total):
      35.0% ActionFail "Open" "MErr ErrSnapshotDoesNotExist"
      10.3% ActionFail "DeleteSnapshot" "MErr ErrSnapshotDoesNotExist"
       7.1% ActionFail "Inserts" "MErr ErrTableHandleClosed"
       6.8% ActionFail "Deletes" "MErr ErrTableHandleClosed"
       6.8% ActionFail "Lookups" "MErr ErrTableHandleClosed"
       6.1% ActionFail "RetrieveBlobs" "MErr ErrBlobRefInvalidated"
       6.1% ActionFail "Updates" "MErr ErrTableHandleClosed"
       5.4% ActionFail "ReadCursor" "MErr ErrCursorClosed"
       3.9% ActionFail "Snapshot" "MErr ErrSnapshotExists"
       3.4% ActionFail "RangeLookup" "MErr ErrTableHandleClosed"
       3.2% ActionFail "Open" "MErr ErrSnapshotWrongType"
       2.0% ActionFail "Duplicate" "MErr ErrTableHandleClosed"
       2.0% ActionFail "NewCursor" "MErr ErrTableHandleClosed"
       2.0% ActionFail "Snapshot" "MErr ErrTableHandleClosed"

As we can see, each action has exactly one reason for failing. This
shows clearly that we don't need a lot of coverage for these failures.
The only non-trivial one is the blob refs becomming invalidated.
Otherwise, there's no complex logic or state for any of these failures.
This adds a measure of how much coverage we are getting of uses of
duplicate tables, to help us make sure we are getting good test coverage
of the multiple writable table handles feature.

We count the number of interleaved actions on different but related
tables. In particular we count multiple actions on the same table only
once: we are only interested in actions that interleave between
different but related tables. Tables are defined to be related if they
derive (via 'duplicate') from the same ultimate parent table.

Initial results:
      Interleaved actions on table duplicates (270 in total):
      88.1% DupTableActionLog "1 <= n < 2"
       6.3% DupTableActionLog "2 <= n < 4"
       3.3% DupTableActionLog "4 <= n < 8"
       1.9% DupTableActionLog "8 <= n < 16"
       0.4% DupTableActionLog "16 <= n < 32"

Obviously this isn't great yet: 88% of runs have no interleaved actions
at all.
@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation branch from c970ab4 to 28c4890 Compare October 21, 2024 12:20
@dcoutts dcoutts enabled auto-merge October 21, 2024 12:21
@dcoutts dcoutts added this pull request to the merge queue Oct 21, 2024
Merged via the queue into main with commit 3f5bda8 Oct 21, 2024
24 checks passed
@dcoutts dcoutts deleted the dcoutts/lockstep-io-sim-tweak-generation branch October 21, 2024 13:10
jorisdral added a commit to well-typed/quickcheck-lockstep that referenced this pull request Nov 4, 2024
Occurrences of `ModelFindVariables` and `ModelLookup` in the `InLockstep` class
are replaced by the newly exposed `ModelVarContext`. A `ModelFindVariables` can
be recovered from a `ModelVarContext` using the new `findVars` functions. A
`ModelLookup` can be recovered from a `ModelVarContext` using the new
`lookupVars` function. Since these functions can be recovered from
`ModelVarContext`, existing tests are guaranteed to be adaptable to the new
`InLockstep` API.

Motivation: previously in the `InLockstep` class, member functions would be
passed a `ModelFindVariables` or a `ModelLookup`, but never both. In practice
this turned out to be too restrictive, because one might want access to both in
the same function, see IntersectMBO/lsm-tree#431. For
example, `arbitraryWithVars` was previously only passed a `ModelFindVariables`,
but without a `ModelLookup` one can not filter these variables based on the
(modelled) outcome of the corresponding actions. The use case for #431 in
particular was to filter variables that reference stateful handles (e.g., file
handles) for handles that are open. Because of this filtering, one can skew the
action distribution to generate more actions on open handles. It is often more
interesting to test actions on open handles than it is to test actions on closed
handles, which will presumably always return the same error.
jorisdral added a commit to well-typed/quickcheck-lockstep that referenced this pull request Nov 4, 2024
Occurrences of `ModelFindVariables` and `ModelLookup` in the `InLockstep` class
are replaced by the newly exposed `ModelVarContext`. A `ModelFindVariables` can
be recovered from a `ModelVarContext` using the new `findVars` functions. A
`ModelLookup` can be recovered from a `ModelVarContext` using the new
`lookupVars` function. Since these functions can be recovered from
`ModelVarContext`, existing tests are guaranteed to be adaptable to the new
`InLockstep` API.

Motivation: previously in the `InLockstep` class, member functions would be
passed a `ModelFindVariables` or a `ModelLookup`, but never both. In practice
this turned out to be too restrictive, because one might want access to both in
the same function, see IntersectMBO/lsm-tree#431. For
example, `arbitraryWithVars` was previously only passed a `ModelFindVariables`,
but without a `ModelLookup` one can not filter these variables based on the
(modelled) outcome of the corresponding actions. The use case for #431 in
particular was to filter variables that reference stateful handles (e.g., file
handles) for handles that are open. Because of this filtering, one can skew the
action distribution to generate more actions on open handles. It is often more
interesting to test actions on open handles than it is to test actions on closed
handles, which will presumably always return the same error.
jorisdral added a commit to well-typed/quickcheck-lockstep that referenced this pull request Dec 3, 2024
Occurrences of `ModelFindVariables` and `ModelLookup` in the `InLockstep` class
are replaced by the newly exposed `ModelVarContext`. A `ModelFindVariables` can
be recovered from a `ModelVarContext` using the new `findVars` functions. A
`ModelLookup` can be recovered from a `ModelVarContext` using the new
`lookupVars` function. Since these functions can be recovered from
`ModelVarContext`, existing tests are guaranteed to be adaptable to the new
`InLockstep` API.

Motivation: previously in the `InLockstep` class, member functions would be
passed a `ModelFindVariables` or a `ModelLookup`, but never both. In practice
this turned out to be too restrictive, because one might want access to both in
the same function, see IntersectMBO/lsm-tree#431. For
example, `arbitraryWithVars` was previously only passed a `ModelFindVariables`,
but without a `ModelLookup` one can not filter these variables based on the
(modelled) outcome of the corresponding actions. The use case for #431 in
particular was to filter variables that reference stateful handles (e.g., file
handles) for handles that are open. Because of this filtering, one can skew the
action distribution to generate more actions on open handles. It is often more
interesting to test actions on open handles than it is to test actions on closed
handles, which will presumably always return the same error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants