Skip to content

Conversation

@dcoutts
Copy link
Collaborator

@dcoutts dcoutts commented Mar 4, 2025

This includes quite a bit of preparatory refactoring, but the main feature is including merging trees (resulting from table unions) into the snapshots, and corresponding tests.

@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 4, 2025

@recursion-ninja in particular, this means you now have a SnapMergingRunState, toSnapMergingRunState and fromSnapMergingRunState that you could reuse in two places in the merging tree snapshot types and code.

Comment on lines 493 to 688
-- This will set the correct nominal credits, but it will not do any more
-- merging work because fromSnapMergingRunState already supplies all the
-- merging credits already.
supplyCreditsIncomingRun conf ln ir nominalCredits
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand it correctly:

it will not do any more merging work is based on the invariant that the snapshotted merging run's physical/merge credits were at least as high as dictated by the incoming run's nominal credit. If that assumption wasn't true for some reason, we would fix that here by doing some merging work.

Is that correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct.

@recursion-ninja
Copy link
Collaborator

@recursion-ninja in particular, this means you now have a SnapMergingRunState, toSnapMergingRunState and fromSnapMergingRunState that you could reuse in two places in the merging tree snapshot types and code.

These functions and data-type definitions should work well for de-duplicating the code. I am having minor trouble with the LevelNo in the type signature of toSnapMergingRunState since the MergingTree doesn't have any level number, but I think this is pretty easily fixable.

@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 5, 2025

I am having minor trouble with the LevelNo in the type signature of toSnapMergingRunState since the MergingTree doesn't have any level number, but I think this is pretty easily fixable.

Yeah, I was talking to @mheinzel about that yesterday.

I think it's ok to fake it for now and we can think about tidying it up. It's just a proxy for an order of magnitude size.

@dcoutts dcoutts force-pushed the dcoutts/snapshot-refactoring branch from 9afc1fa to 55b7225 Compare March 5, 2025 16:49
@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 5, 2025

I've rebased and accepted the changes to the golden files for snapshot format. So hopefully this will now pass CI.

@dcoutts dcoutts marked this pull request as ready for review March 5, 2025 16:50
@dcoutts dcoutts force-pushed the dcoutts/snapshot-refactoring branch 4 times, most recently from b7ad924 to 9b8cf74 Compare March 6, 2025 11:48
@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 6, 2025

These functions and data-type definitions should work well for de-duplicating the code. I am having minor trouble with the LevelNo in the type signature of toSnapMergingRunState since the MergingTree doesn't have any level number, but I think this is pretty easily fixable.

@recursion-ninja I have now updated this again to avoid the need for any LevelNo arg in the snapshot restore for merging runs. The merging run snapshot now stores the RunParams rather than deriving the run params from the level no during snapshot restore.

This required a bit of additional refactoring, so there's an extra patch in here now to introduce the RunParams type and use it comprehensively.

@dcoutts dcoutts force-pushed the dcoutts/snapshot-refactoring branch from 9b8cf74 to 8c3cab1 Compare March 6, 2025 14:02
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, that goes in the right direction. Makes sense to store credits this way and good that the level number is not needed for the merging trees.

SnapCompletedMerge !r
| SnapOngoingMerge !(V.Vector r) !t
SnapCompletedMerge !NumRuns !MergeDebt !r
| SnapOngoingMerge !RunParams !MergeCredits !(V.Vector r) !t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long-term it might make sense to re-construct as much as possible of the run params from the config when opening a snapshot, so users can change (part of) the config, re-open the DB and have the restored tables use the new config. But in the meantime I think it's totally sensible to just take the params from the snapshot. 👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but I think if possible a nicer design is to load a snapshot and then when the table is online, change the parameters. That's more general since it allows changing online too.

bracketOnError
(do uniq <- incrUniqCounter uc
let runPaths = runPath dir (uniqueToRunNumber uniq)
MR.new hfs hbio resolve runParams mergeType runPaths rs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long-term it might make sense to re-construct as much as possible of the run params from the config when opening a snapshot, so users can change (part of) the config, re-open the DB and have the restored tables use the new config. But in the meantime I think it's totally sensible to just take the params from the snapshot. 👍

@dcoutts dcoutts force-pushed the dcoutts/snapshot-refactoring branch 2 times, most recently from 1655f32 to 1527fcf Compare March 8, 2025 00:14
@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 8, 2025

I think this is in a pretty good state, however I still suspect the ref counting for the snapshots for merging trees is wrong.

There isn't yet a test that demonstrates that. I'm trying to (partially) enable the StateMachine tests to show it.

I think a simple unit test for a table with unions and then saving and restoring a snapshot would demonstrate the ref counting errors. But I'd also like to make sure the state machine tests would give coverage of it too.

@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 10, 2025

In fact the state machine tests do find a nice simple reproducer, so that's ok:

      *** Failed! Falsified (after 1 test and 54 shrinks):
      do var1 <- action $ Action Nothing (New (PrettyProxy @((Key,Value,Blob))) (TableConfig {confMergePolicy = MergePolicyLazyLevelling, confSizeRatio = Four, confWriteBufferAlloc = AllocNumEntries (NumEntries 1), confBloomFilterAlloc = AllocFixed 10, confFencePointerIndex = CompactIndex, confDiskCachePolicy = DiskCacheNone, confMergeSchedule = Incremental}))
         action $ Action Nothing (Updates [(Key (KeyForIndexCompact {getKeyForIndexCompact = [0,0,0,0,0,0,0,0]}),Delete)] (unsafeMkGVar var1 (OpFromRight `OpComp` OpId)))
         var6 <- action $ Action Nothing (Union (unsafeMkGVar var1 (OpFromRight `OpComp` OpId)) (unsafeMkGVar var1 (OpFromRight `OpComp` OpId)))
         action $ Action Nothing (CreateSnapshot Nothing (SnapshotLabel "Key Value Blob") "snap1" (unsafeMkGVar var6 (OpFromRight `OpComp` OpId)))
         action $ Action Nothing (OpenSnapshot (PrettyProxy @((Key,Value,Blob))) (SnapshotLabel "Key Value Blob") "snap1")
         pure ()
      Expected 1 open handles, but found 5

So this is clearly a ref counting mistake. I'll try and fix that now...

@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 10, 2025

Ok, pushed a patch to fix all the referencing counting bugs I can find from the state machine tests.

@dcoutts dcoutts force-pushed the dcoutts/snapshot-refactoring branch 2 times, most recently from 90e0481 to 5261940 Compare March 10, 2025 12:59
@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 10, 2025

Ok, the only thing remaining here is to restore the ability to override the config with TableConfigOverride to openSnapshot.

We'll do this selectively so that it's overridden for runs and merges in levels but left as was for tree merges (where the policy is currently ill defined).

@dcoutts
Copy link
Collaborator Author

dcoutts commented Mar 10, 2025

Ok, I think the best thing to do is to merge this PR with its current feature set. This includes making the TableConfigOverride param to openSnapshot effectively useless. And then in a follow-up PR we introduce a specific feature to change the table config for open/online tables.

Something like:

changeConfig :: TableConfigOverride -> Table m h -> m ()

dcoutts and others added 14 commits March 13, 2025 16:26
These params are (almost) always used together, so this helps to reduce
verbosity and the number of separate args that have to be passed around.

We also slightly simplify when args are provided. Now all these params
are provided to the RunBuilder, and none extra are needed when the
RunBuilder is turned into the Run. Previously, two were passed to the
RunBuilder and one more passed to the Run (though actually finalising
the RunBuidler also needed the third param anyway).

However, the real motivation for all this is to improve restoring
snapshots of merging runs. At the moment, deserialising a merging run
requires us to supply all three of these params from the surrounding
context. This is ok in the context of levels, but becomes awkward for
merging runs embedded in a merging tree.

The obvious solution is to serialise this set of parameters with the
merging run, and then they're directly available to use when
reconstructing the merge during snapshot restore.

Changes to serialisation are not in this patch, but the next.
The goal is to make the representation of the snapshot of the merging
run closer to being sufficient information to reconstruct the merging
run upon snapshot restore.

Previously we only embed the merging run within an incoming run and we
use information from that context to reconstruct the merging run.

In future we want to reuse the representation of snapshots of merging
runs within merging trees, and those do not contain the same context
information as an incoming run does. So to enable reuse, it makes sense
to make the merging run snapshot representation more standalone.

In particular:
 * SnapIncomingRun no longer contains NumRuns or MergeDebt,
   and these get moved down to the SnapMergingRunState.
 * SnapIncomingRun now stores the NominalDebt
 * SnapMergingRunState also gains the RunParams and MergeCredits
Again, the goal is to make the representation more self-contained so
that less information needs to be supplied from the context during
snapshot restore. And again the intnded use case is merging trees, and
runs within them.

In the case of a run it is the disk caching policy, and the index type.
So overall for a run we store the run number (to find the disk files),
the index type (so we know how to restore the index) and the caching
policy.

Arguably the index type should not need to be known in advance and the
index file format should simply say which kind of index it is.
It is just a simple `traverse` to lift it to Levels. This makes it
simpler to reuse with merging trees of runs, which can also just use
`traverse`.
Put hfs hbio first since that's more global and what we use elsewhere.
Roughly in order of lest frequently varying to most frequently varying,
or equivalently most global to most local.
The union level contains an optional MergingTree. This is now included
in the snapshot representation. Update codecs and tests.

This changes the snapshot file format. The golden test files are updated
in the next patch.

Co-authored-by: Duncan Coutts <duncan@well-typed.com>
Co-authored-by: Joris Dral <joris@well-typed.com>
following the change of snapshot representation types.
It should duplicate its input reference for consistency, since it
retains it.
Also tweak the MergingTree constructor utilities to make this more
straightforward.
Use utils for maybe, list and vector. And use a single-item encoding for
Maybe, rather than it being a pair of things. The pair style means one
has to remember to count it as two when setting the tuple lengths.
Also update the credit docs for nominal credits.
The shrink tests were timing out.
@dcoutts dcoutts force-pushed the dcoutts/snapshot-refactoring branch from 6ef8b50 to 2080353 Compare March 13, 2025 16:27
@dcoutts dcoutts dismissed jorisdral’s stale review March 13, 2025 16:27

Feedback addressed. Can do any further review post-merge.

@dcoutts dcoutts enabled auto-merge March 13, 2025 16:27
@dcoutts dcoutts changed the title Refactoring of snapshot merging run representation Implement snapshots for union tables Mar 13, 2025
@dcoutts dcoutts added this pull request to the merge queue Mar 13, 2025
Merged via the queue into main with commit cff7a39 Mar 13, 2025
27 checks passed
@dcoutts dcoutts deleted the dcoutts/snapshot-refactoring branch March 13, 2025 17:34
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

jorisdral added a commit that referenced this pull request Mar 24, 2025
Since #608, using a `TableConfigOverride` when opening a snapshot has no effect.
This commit restores the feature that we can change the disk cache policy on
snapshot open.

Internally, instead of threading through a modified config to open the snapshot,
we modify the whole `SnapshotMetaData` (replacing the disk cache policy
everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but
there are some challenges. Changing a table config when the table is offline
seem to be more straightforward, so that's the approach we took (for now). An
explanation of the challenges can be found in the `*.Config.Override` module.
jorisdral added a commit that referenced this pull request Mar 27, 2025
Since #608, using a `TableConfigOverride` when opening a snapshot has no effect.
This commit restores the feature that we can change the disk cache policy on
snapshot open.

Internally, instead of threading through a modified config to open the snapshot,
we modify the whole `SnapshotMetaData` (replacing the disk cache policy
everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but
there are some challenges. Changing a table config when the table is offline
seem to be more straightforward, so that's the approach we took (for now). An
explanation of the challenges can be found in the `*.Config.Override` module.
jorisdral added a commit that referenced this pull request Mar 31, 2025
Since #608, using a `TableConfigOverride` when opening a snapshot has no effect.
This commit restores the feature that we can change the disk cache policy on
snapshot open.

Internally, instead of threading through a modified config to open the snapshot,
we modify the whole `SnapshotMetaData` (replacing the disk cache policy
everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but
there are some challenges. Changing a table config when the table is offline
seem to be more straightforward, so that's the approach we took (for now). An
explanation of the challenges can be found in the `*.Config.Override` module.
jorisdral added a commit that referenced this pull request Apr 3, 2025
Since #608, using a `TableConfigOverride` when opening a snapshot has no effect.
This commit restores the feature that we can change the disk cache policy on
snapshot open.

Internally, instead of threading through a modified config to open the snapshot,
we modify the whole `SnapshotMetaData` (replacing the disk cache policy
everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but
there are some challenges. Changing a table config when the table is offline
seem to be more straightforward, so that's the approach we took (for now). An
explanation of the challenges can be found in the `*.Config.Override` module.
recursion-ninja pushed a commit that referenced this pull request Apr 28, 2025
Since #608, using a `TableConfigOverride` when opening a snapshot has no effect.
This commit restores the feature that we can change the disk cache policy on
snapshot open.

Internally, instead of threading through a modified config to open the snapshot,
we modify the whole `SnapshotMetaData` (replacing the disk cache policy
everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but
there are some challenges. Changing a table config when the table is offline
seem to be more straightforward, so that's the approach we took (for now). An
explanation of the challenges can be found in the `*.Config.Override` module.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants