Implement snapshots for union tables #608

dcoutts · 2025-03-04T16:47:56Z

This includes quite a bit of preparatory refactoring, but the main feature is including merging trees (resulting from table unions) into the snapshots, and corresponding tests.

dcoutts · 2025-03-04T16:49:38Z

@recursion-ninja in particular, this means you now have a SnapMergingRunState, toSnapMergingRunState and fromSnapMergingRunState that you could reuse in two places in the merging tree snapshot types and code.

mheinzel · 2025-03-04T18:15:30Z

src/Database/LSMTree/Internal/Snapshot.hs

+      -- This will set the correct nominal credits, but it will not do any more
+      -- merging work because fromSnapMergingRunState already supplies all the
+      -- merging credits already.
+      supplyCreditsIncomingRun conf ln ir nominalCredits


Just to make sure I understand it correctly:

it will not do any more merging work is based on the invariant that the snapshotted merging run's physical/merge credits were at least as high as dictated by the incoming run's nominal credit. If that assumption wasn't true for some reason, we would fix that here by doing some merging work.

Is that correct?

That is correct.

recursion-ninja · 2025-03-05T16:38:39Z

@recursion-ninja in particular, this means you now have a SnapMergingRunState, toSnapMergingRunState and fromSnapMergingRunState that you could reuse in two places in the merging tree snapshot types and code.

These functions and data-type definitions should work well for de-duplicating the code. I am having minor trouble with the LevelNo in the type signature of toSnapMergingRunState since the MergingTree doesn't have any level number, but I think this is pretty easily fixable.

dcoutts · 2025-03-05T16:47:32Z

I am having minor trouble with the LevelNo in the type signature of toSnapMergingRunState since the MergingTree doesn't have any level number, but I think this is pretty easily fixable.

Yeah, I was talking to @mheinzel about that yesterday.

I think it's ok to fake it for now and we can think about tidying it up. It's just a proxy for an order of magnitude size.

dcoutts · 2025-03-05T16:49:56Z

I've rebased and accepted the changes to the golden files for snapshot format. So hopefully this will now pass CI.

dcoutts · 2025-03-06T11:50:20Z

These functions and data-type definitions should work well for de-duplicating the code. I am having minor trouble with the LevelNo in the type signature of toSnapMergingRunState since the MergingTree doesn't have any level number, but I think this is pretty easily fixable.

@recursion-ninja I have now updated this again to avoid the need for any LevelNo arg in the snapshot restore for merging runs. The merging run snapshot now stores the RunParams rather than deriving the run params from the level no during snapshot restore.

This required a bit of additional refactoring, so there's an extra patch in here now to introduce the RunParams type and use it comprehensively.

src/Database/LSMTree/Internal/Snapshot/Codec.hs

mheinzel

Nice, that goes in the right direction. Makes sense to store credits this way and good that the level number is not needed for the merging trees.

mheinzel · 2025-03-07T08:38:12Z

src/Database/LSMTree/Internal/Snapshot.hs

-    SnapCompletedMerge !r
-  | SnapOngoingMerge !(V.Vector r) !t
+    SnapCompletedMerge !NumRuns !MergeDebt !r
+  | SnapOngoingMerge   !RunParams !MergeCredits !(V.Vector r) !t


Long-term it might make sense to re-construct as much as possible of the run params from the config when opening a snapshot, so users can change (part of) the config, re-open the DB and have the restored tables use the new config. But in the meantime I think it's totally sensible to just take the params from the snapshot. 👍

Perhaps, but I think if possible a nicer design is to load a snapshot and then when the table is online, change the parameters. That's more general since it allows changing online too.

mheinzel · 2025-03-07T08:40:29Z

src/Database/LSMTree/Internal/Snapshot.hs

+    bracketOnError
+      (do uniq <- incrUniqCounter uc
+          let runPaths = runPath dir (uniqueToRunNumber uniq)
+          MR.new hfs hbio resolve runParams mergeType runPaths rs)


Long-term it might make sense to re-construct as much as possible of the run params from the config when opening a snapshot, so users can change (part of) the config, re-open the DB and have the restored tables use the new config. But in the meantime I think it's totally sensible to just take the params from the snapshot. 👍

dcoutts · 2025-03-08T00:17:48Z

I think this is in a pretty good state, however I still suspect the ref counting for the snapshots for merging trees is wrong.

There isn't yet a test that demonstrates that. I'm trying to (partially) enable the StateMachine tests to show it.

I think a simple unit test for a table with unions and then saving and restoring a snapshot would demonstrate the ref counting errors. But I'd also like to make sure the state machine tests would give coverage of it too.

dcoutts · 2025-03-10T09:28:43Z

In fact the state machine tests do find a nice simple reproducer, so that's ok:

      *** Failed! Falsified (after 1 test and 54 shrinks):
      do var1 <- action $ Action Nothing (New (PrettyProxy @((Key,Value,Blob))) (TableConfig {confMergePolicy = MergePolicyLazyLevelling, confSizeRatio = Four, confWriteBufferAlloc = AllocNumEntries (NumEntries 1), confBloomFilterAlloc = AllocFixed 10, confFencePointerIndex = CompactIndex, confDiskCachePolicy = DiskCacheNone, confMergeSchedule = Incremental}))
         action $ Action Nothing (Updates [(Key (KeyForIndexCompact {getKeyForIndexCompact = [0,0,0,0,0,0,0,0]}),Delete)] (unsafeMkGVar var1 (OpFromRight `OpComp` OpId)))
         var6 <- action $ Action Nothing (Union (unsafeMkGVar var1 (OpFromRight `OpComp` OpId)) (unsafeMkGVar var1 (OpFromRight `OpComp` OpId)))
         action $ Action Nothing (CreateSnapshot Nothing (SnapshotLabel "Key Value Blob") "snap1" (unsafeMkGVar var6 (OpFromRight `OpComp` OpId)))
         action $ Action Nothing (OpenSnapshot (PrettyProxy @((Key,Value,Blob))) (SnapshotLabel "Key Value Blob") "snap1")
         pure ()
      Expected 1 open handles, but found 5

So this is clearly a ref counting mistake. I'll try and fix that now...

dcoutts · 2025-03-10T11:19:39Z

Ok, pushed a patch to fix all the referencing counting bugs I can find from the state machine tests.

dcoutts · 2025-03-10T13:00:55Z

Ok, the only thing remaining here is to restore the ability to override the config with TableConfigOverride to openSnapshot.

We'll do this selectively so that it's overridden for runs and merges in levels but left as was for tree merges (where the policy is currently ill defined).

dcoutts · 2025-03-10T16:43:09Z

Ok, I think the best thing to do is to merge this PR with its current feature set. This includes making the TableConfigOverride param to openSnapshot effectively useless. And then in a follow-up PR we introduce a specific feature to change the table config for open/online tables.

Something like:

changeConfig :: TableConfigOverride -> Table m h -> m ()

These params are (almost) always used together, so this helps to reduce verbosity and the number of separate args that have to be passed around. We also slightly simplify when args are provided. Now all these params are provided to the RunBuilder, and none extra are needed when the RunBuilder is turned into the Run. Previously, two were passed to the RunBuilder and one more passed to the Run (though actually finalising the RunBuidler also needed the third param anyway). However, the real motivation for all this is to improve restoring snapshots of merging runs. At the moment, deserialising a merging run requires us to supply all three of these params from the surrounding context. This is ok in the context of levels, but becomes awkward for merging runs embedded in a merging tree. The obvious solution is to serialise this set of parameters with the merging run, and then they're directly available to use when reconstructing the merge during snapshot restore. Changes to serialisation are not in this patch, but the next.

The goal is to make the representation of the snapshot of the merging run closer to being sufficient information to reconstruct the merging run upon snapshot restore. Previously we only embed the merging run within an incoming run and we use information from that context to reconstruct the merging run. In future we want to reuse the representation of snapshots of merging runs within merging trees, and those do not contain the same context information as an incoming run does. So to enable reuse, it makes sense to make the merging run snapshot representation more standalone. In particular: * SnapIncomingRun no longer contains NumRuns or MergeDebt, and these get moved down to the SnapMergingRunState. * SnapIncomingRun now stores the NominalDebt * SnapMergingRunState also gains the RunParams and MergeCredits

Again, the goal is to make the representation more self-contained so that less information needs to be supplied from the context during snapshot restore. And again the intnded use case is merging trees, and runs within them. In the case of a run it is the disk caching policy, and the index type. So overall for a run we store the run number (to find the disk files), the index type (so we know how to restore the index) and the caching policy. Arguably the index type should not need to be known in advance and the index file format should simply say which kind of index it is.

It is just a simple `traverse` to lift it to Levels. This makes it simpler to reuse with merging trees of runs, which can also just use `traverse`.

Put hfs hbio first since that's more global and what we use elsewhere. Roughly in order of lest frequently varying to most frequently varying, or equivalently most global to most local.

The union level contains an optional MergingTree. This is now included in the snapshot representation. Update codecs and tests. This changes the snapshot file format. The golden test files are updated in the next patch. Co-authored-by: Duncan Coutts <duncan@well-typed.com> Co-authored-by: Joris Dral <joris@well-typed.com>

following the change of snapshot representation types.

It should duplicate its input reference for consistency, since it retains it.

Also tweak the MergingTree constructor utilities to make this more straightforward.

Use utils for maybe, list and vector. And use a single-item encoding for Maybe, rather than it being a pair of things. The pair style means one has to remember to count it as two when setting the tuple lengths.

Also update the credit docs for nominal credits.

The shrink tests were timing out.

Feedback addressed. Can do any further review post-merge.

jorisdral

LGTM

Since #608, using a `TableConfigOverride` when opening a snapshot has no effect. This commit restores the feature that we can change the disk cache policy on snapshot open. Internally, instead of threading through a modified config to open the snapshot, we modify the whole `SnapshotMetaData` (replacing the disk cache policy everywhere) and open the snapshot from that modified metadata. A first idea we had was to modify a table config while its table is live, but there are some challenges. Changing a table config when the table is offline seem to be more straightforward, so that's the approach we took (for now). An explanation of the challenges can be found in the `*.Config.Override` module.

mheinzel reviewed Mar 4, 2025

View reviewed changes

dcoutts force-pushed the dcoutts/snapshot-refactoring branch from 9afc1fa to 55b7225 Compare March 5, 2025 16:49

dcoutts marked this pull request as ready for review March 5, 2025 16:50

dcoutts requested review from jorisdral, recursion-ninja and wenkokke as code owners March 5, 2025 16:50

dcoutts force-pushed the dcoutts/snapshot-refactoring branch 4 times, most recently from b7ad924 to 9b8cf74 Compare March 6, 2025 11:48

dcoutts force-pushed the dcoutts/snapshot-refactoring branch from 9b8cf74 to 8c3cab1 Compare March 6, 2025 14:02

dcoutts commented Mar 6, 2025

View reviewed changes

src/Database/LSMTree/Internal/Snapshot/Codec.hs Outdated Show resolved Hide resolved

dcoutts commented Mar 6, 2025

View reviewed changes

src/Database/LSMTree/Internal/Snapshot/Codec.hs Outdated Show resolved Hide resolved

dcoutts mentioned this pull request Mar 7, 2025

Adding snapshotting functionality for MergingTree #602

Merged

2 tasks

mheinzel approved these changes Mar 7, 2025

View reviewed changes

dcoutts force-pushed the dcoutts/snapshot-refactoring branch 2 times, most recently from 1655f32 to 1527fcf Compare March 8, 2025 00:14

dcoutts force-pushed the dcoutts/snapshot-refactoring branch 2 times, most recently from 90e0481 to 5261940 Compare March 10, 2025 12:59

mheinzel mentioned this pull request Mar 10, 2025

Enable unions in lockstep tests #614

Merged

dcoutts and others added 14 commits March 13, 2025 16:26

Make snapshotRun/openRun singluar on Run rather than Levels

e61a430

It is just a simple `traverse` to lift it to Levels. This makes it simpler to reuse with merging trees of runs, which can also just use `traverse`.

Make function argument order more consistent in snapshot functions

8a39751

Put hfs hbio first since that's more global and what we use elsewhere. Roughly in order of lest frequently varying to most frequently varying, or equivalently most global to most local.

Update golden files for snapshot disk representations

53aa7e3

following the change of snapshot representation types.

Use more consistent reference handling for newIncomingMergingRun

5203408

It should duplicate its input reference for consistency, since it retains it.

Fix the reference handling for fromSnapMergingTree

5cbb822

Also tweak the MergingTree constructor utilities to make this more straightforward.

Make the snapshot codec CBOR encoding more regular

45212f0

Use utils for maybe, list and vector. And use a single-item encoding for Maybe, rather than it being a pair of things. The pair style means one has to remember to count it as two when setting the tuple lengths.

Add specialise pragmas to everything in MergingTree

1e032d2

More consistent naming of snapshot merging and incoming run types

153e65b

Split IncomingRun out of MergeSchedule

0204324

Also update the credit docs for nominal credits.

Reduce the number of shrinks involving RunParams

2080353

The shrink tests were timing out.

dcoutts force-pushed the dcoutts/snapshot-refactoring branch from 6ef8b50 to 2080353 Compare March 13, 2025 16:27

dcoutts enabled auto-merge March 13, 2025 16:27

dcoutts changed the title ~~Refactoring of snapshot merging run representation~~ Implement snapshots for union tables Mar 13, 2025

dcoutts added this pull request to the merge queue Mar 13, 2025

Merged via the queue into main with commit cff7a39 Mar 13, 2025
27 checks passed

dcoutts deleted the dcoutts/snapshot-refactoring branch March 13, 2025 17:34

jorisdral reviewed Mar 17, 2025

View reviewed changes

jorisdral mentioned this pull request Mar 17, 2025

Miscellaneous changes for IncomingRun #630

Merged

jorisdral mentioned this pull request Mar 24, 2025

Allow overriding the disk cache policy (feature restored) #641

Merged

Implement snapshots for union tables #608

Implement snapshots for union tables #608

Uh oh!

Conversation

dcoutts commented Mar 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dcoutts commented Mar 4, 2025

Uh oh!

mheinzel Mar 4, 2025

Choose a reason for hiding this comment

Uh oh!

dcoutts Mar 4, 2025

Choose a reason for hiding this comment

Uh oh!

recursion-ninja commented Mar 5, 2025

Uh oh!

dcoutts commented Mar 5, 2025

Uh oh!

dcoutts commented Mar 5, 2025

Uh oh!

dcoutts commented Mar 6, 2025

Uh oh!

Uh oh!

Uh oh!

mheinzel left a comment

Choose a reason for hiding this comment

Uh oh!

mheinzel Mar 7, 2025

Choose a reason for hiding this comment

Uh oh!

dcoutts Mar 7, 2025

Choose a reason for hiding this comment

Uh oh!

mheinzel Mar 7, 2025

Choose a reason for hiding this comment

Uh oh!

dcoutts commented Mar 8, 2025

Uh oh!

dcoutts commented Mar 10, 2025

Uh oh!

dcoutts commented Mar 10, 2025

Uh oh!

dcoutts commented Mar 10, 2025

Uh oh!

dcoutts commented Mar 10, 2025

Uh oh!

Uh oh!

jorisdral left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

dcoutts commented Mar 4, 2025 •

edited

Loading