Skip to content

Conversation

@dcoutts
Copy link
Collaborator

@dcoutts dcoutts commented Oct 22, 2024

This is an attempt to get the QLS tests to get more coverage of the duplicate table feature, and more generally to have more interesting coverage.

In particular, we move several of the boring special cases out of the QLS test and into unit tests. This is for things like operating on closed table handles, or opening non-existent snapshots. These things of course must be covered, but they don't have interesting interactions so they don't need to be covered in complex situations. This lets the limited length of action sequences to be used for more interesting things.

See quickcheck-dynamic#85 for further details on why action sequences are so short.

We also change how actions are generated to have fewer tables, and thus more actions per table. We limit the number of tables at once to 5.

Initial results (from 28c4890):

      Interleaved actions on table duplicates (270 in total):
      88.1% DupTableActionLog "1 <= n < 2"
       6.3% DupTableActionLog "2 <= n < 4"
       3.3% DupTableActionLog "4 <= n < 8"
       1.9% DupTableActionLog "8 <= n < 16"
       0.4% DupTableActionLog "16 <= n < 32"

and now we get:

      Interleaved actions on table duplicates (125 in total):
      55.2% DupTableActionLog "1 <= n < 2"
      12.8% DupTableActionLog "4 <= n < 8"
      11.2% DupTableActionLog "2 <= n < 4"
      11.2% DupTableActionLog "8 <= n < 16"
       7.2% DupTableActionLog "16 <= n < 32"
       2.4% DupTableActionLog "32 <= n < 64"

So this is considerably improved, with 45% of runs now having non-trivial interleavings rather than 12%. And where previously we had only 2.3% of original tables with interleavings of length 8 or more, we now have over 20%.

One important caveat is that the change in the key and value types mean the tests run longer. Summary of running times from my system:

    propLockstep_ModelIOImpl:           OK (11.15s)
    propLockstep_RealImpl_RealFS_IO:    OK (24.24s)
    propLockstep_RealImpl_MockFS_IO:    OK (77.06s)
    propLockstep_RealImpl_MockFS_IOSim: OK (111.07s)

The purpose of the change is to allow the generation of actions to look
up information from the model. Instead of action generation only being
given a function to find model variables, it now gets a context from
which one can both find model variables, and look those variables up.

This will allow filtering variables based on information from the model,
such as whether tables are closed.

This first patch just updates to the API changes without making any
changes in functionalty.
In preparation for adding more unit tests.
@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation-2 branch from 8669753 to b95de4c Compare October 22, 2024 15:10
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Tests are timing out because CI currently only allows tests to run for around 5 minutes. These lines will have to be tweaked:

# https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-an-environment-variable
- name: Set test timeout (Unix)
if: ${{ startsWith(matrix.os, 'ubuntu') || startsWith(matrix.os, 'macOS') }}
run: |
echo "TASTY_TIMEOUT=5m" >> "$GITHUB_ENV"
# https://github.com/actions/runner/issues/2281#issuecomment-1326748709
- name: Set test timeout (Windows)
if: ${{ startsWith(matrix.os, 'windows') }}
run: |
echo "TASTY_TIMEOUT=5m" >> "$env:GITHUB_ENV"

To make it rather more convenient to use for literal constant names.
This is exposed in the API, not just for tests, but I think this is a
helpful API addition.
Previously the QLS test spent half of its search space on making sure we
get coverage for using operations at two different table types. While it
is important to get this coverage, we do not need to divide our search
space in half to achieve it. That search space can be better spent on
longer sequences of actions on fewer tables that can have sharing and
thus interesting interactions.

So we move the {Key,Value,Blob}{1,2} types into the unit test module,
and do a simple unit test where we create, insert, snapshot and restore
tables of two different types.

And we also simplify the QLS test to use a single set of key,value,blob
types.
For snapshots we cover the cases of: DeleteMissingSnapshot,
OpenMissingSnapshot, SnapshotTwice and OpenSnapshotWrongLabel. (These
are all tags from the QLS test.)

For tables and cursors we cover use after close and idempotent close.
The intention here is to cover the "boring" cases of using tables that
are not open using unit tests, so that more of the QLS search space can
be spent on interesting things that may take longer action sequences to
find.

So having added specific unit tests for snapshots, tables and cursors we
now adjust the generation of actions so that we only ever generate
references to resources that are currently open (or equivalent for
snapshots). This involves consulting the model when we generate actions
and filtering out variables that do not correspond to available
resources.

We leave blobrefs unfiltered since they have some interesting logic to
them. But these cover only a small portion of the search space.

In particular we also limit the number open tables and open cursors to 5
at once. By having fewer tables, we have more operations per table, and
thus more opportunity for interesting interleavings on duplicate tables.
Fewer:
 * new tables
 * opening of snapshots
 * new cursors
 * table closes
 * fewer snapshots

More:
 * table duplicates

Note also that tables and cursors are limited to 5 each at any one time.
This is intended to allow us to increase the QC size for the QLS tests
to get longer action sequences, without also getting individually larger
actions (which makes things too slow).
@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation-2 branch from 15dd804 to 2b3e9db Compare October 23, 2024 14:09
@dcoutts
Copy link
Collaborator Author

dcoutts commented Oct 23, 2024

In the first version of this PR the running times on my machine were:

    propLockstep_ModelIOImpl:           OK (17.80s)
    propLockstep_RealImpl_RealFS_IO:    OK (86.42s)
    propLockstep_RealImpl_MockFS_IO:    OK (251.20s)
    propLockstep_RealImpl_MockFS_IOSim: OK (284.86s)
Test.Database.LSMTree.Normal.StateMachine.DL
    prop_example:                       OK (455.09s)

But with a couple changes they're down to this:

    propLockstep_ModelIOImpl:           OK (11.15s)
    propLockstep_RealImpl_RealFS_IO:    OK (24.24s)
    propLockstep_RealImpl_MockFS_IO:    OK (77.06s)
    propLockstep_RealImpl_MockFS_IOSim: OK (111.07s)

Those changes are:

  • Use SerialisedBlob rather than LargeRawBytes for the Blob type.
  • Disable the StateMachine.DL.prop_example test

In particular, the latter was not justifying its long running time with any crucial coverage.

@dcoutts
Copy link
Collaborator Author

dcoutts commented Oct 23, 2024

Oh and so hopefully this will mean we do not need to adjust the CI testsuite timeouts. 🤞

@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation-2 branch from 2b3e9db to 0a130e3 Compare October 23, 2024 14:30
Use pre-existing test types created for this purpose with reasonable
looking Arbitrary instances.

These are definately larger. They may be too large for this kind of
test. The test running times are longer.

Also have to adjust the DL example to match the key type. In particular
the key type is now designed to create some collisions, so the size of a
sequence of keys is not the same as the size of the map of the same
keys.
It's utility does not justify its running time.
To better reflect its content.
@dcoutts dcoutts force-pushed the dcoutts/lockstep-io-sim-tweak-generation-2 branch from 0a130e3 to 9c68700 Compare October 23, 2024 14:32
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! There is just the hlint failure to fix, but then we can merge

@dcoutts dcoutts enabled auto-merge October 24, 2024 12:41
@dcoutts dcoutts added this pull request to the merge queue Oct 24, 2024
Merged via the queue into main with commit 2e76991 Oct 24, 2024
24 checks passed
@dcoutts dcoutts deleted the dcoutts/lockstep-io-sim-tweak-generation-2 branch October 24, 2024 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants