Fuzz lsm/tree.zig #204
Conversation
(I just noticed that this is draft PR, sorry 🤦.)
```zig
else
    // 2. We want to generate enough ids that the cache can't hold them all.
    Environment.cache_entries_max;
```
Maybe we should vary this mean between fuzz runs? That way we can test a wider variety of workloads, all the way from "everything fits in cache" to "hardly anything fits in cache".
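One way to sketch that idea. The names `random` and `Environment.cache_entries_max` come from the diff context; the shift-based scaling and the exponent range are hypothetical illustrations, not the PR's code:

```zig
// Scale the keyspace per fuzz run: negative exponents mean "everything
// fits in cache", large positive ones mean "hardly anything fits".
const scale_log2 = random.intRangeAtMost(i32, -2, 4);
const ids_max = if (scale_log2 >= 0)
    Environment.cache_entries_max << @intCast(u5, scale_log2)
else
    Environment.cache_entries_max >> @intCast(u5, -scale_log2);
```

Logging the chosen scale alongside the fuzz seed would make a failing run's cache pressure easy to reproduce.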
```zig
FuzzOp{ .compact = {} }
else
    // Otherwise pick a random FuzzOp.
    switch (fuzz.random_enum(random, std.meta.Tag(FuzzOp), fuzz_op_distribution)) {
```
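For context, the distribution passed to `fuzz.random_enum` above might be defined like this. This is a hedged sketch: the variant names and the assumption that the distribution is a struct of one `f64` weight per tag are illustrative, not taken from the PR:

```zig
// Hypothetical per-variant sampling weights (values illustrative only);
// the real FuzzOp tags and distribution type may differ.
const fuzz_op_distribution = std.enums.EnumFieldStruct(std.meta.Tag(FuzzOp), f64, null){
    .put = 8,
    .remove = 2,
    .get = 4,
    .compact = 1,
};
```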
I'm not sure of the best way to do this, but I think we will find more interesting errors if we vary the tree workload (as generated by the FuzzOps) in other interesting ways. Some examples:
- Insert a bunch of keys (multiple commits' worth), then remove all/most of those keys. (Right now this would crash compaction, though that will be fixed by King's iterator PR.)
- For key selection: it isn't merged yet, but the VOPR workload tests various interesting id patterns (uniform random, zigzag, ascending, descending).
  - (Though the VOPR workload has some extra requirements that this test doesn't; it hides some data in the id to use later for verification.)
- One common pattern in our secondary indexes: composite keys that are the tuple (small-cardinality value, strictly increasing timestamp). This pattern applies to e.g. Account.ledger, Account.code.
- Probably others I haven't thought of!
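The id patterns mentioned above could be sketched roughly as follows. This is a hypothetical illustration (the enum, function name, and parameters are mine, not the VOPR workload's actual code); each fuzz run would pick one pattern and generate keys with it:

```zig
const IdPattern = enum { uniform, ascending, descending, zigzag };

// Generate the i-th id out of id_count, according to the chosen pattern.
fn next_id(pattern: IdPattern, random: std.rand.Random, i: u64, id_count: u64) u64 {
    return switch (pattern) {
        .uniform => random.uintLessThan(u64, id_count),
        .ascending => i,
        .descending => id_count - 1 - i,
        // Alternate between the low and high ends of the keyspace,
        // closing toward the middle.
        .zigzag => if (i % 2 == 0) i / 2 else id_count - 1 - i / 2,
    };
}
```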
It's just a draft because I want to merge segmented arrays first and then use the fuzz utils in here. For the varying workloads, there are a lot of things we can try, but I think it makes sense to wait until the low-hanging crashes are fixed. Can't tell if new workloads are improving coverage if everything always crashes in the same place.
5e47e67 to 33bc0f1
@sentientwaffle This is ready now. All that changed since your last review is rebasing on main and using test/fuzz. I'll do more workload variations in later PRs. Going to look into storage size first, since the space amplification in fuzz tests seems really high.
I don't think we should remove test.zig until the forest fuzzer covers checkpoints.
03e35e8 to 02a1cca
@sentientwaffle Ok, test.zig is back.
Other than that one note, LGTM. I do think we will want to expand the tree fuzzer as time goes on, but this is fine for now.
So far this is only hitting the same compaction iterator bugs as lsm_forest_fuzz.
TODO: