Skip to content

Conversation

@jorisdral
Copy link
Collaborator

@jorisdral jorisdral commented Mar 22, 2025

Since #608, using a TableConfigOverride when opening a snapshot has no effect. This commit restores the feature that we can change the disk cache policy on snapshot open.

Internally, instead of threading through a modified config to open the snapshot, we modify the whole SnapshotMetaData (replacing the disk cache policy everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but there are some challenges. Changing a table config when the table is offline seems to be more straightforward, so that's the approach we took (for now). An explanation of the challenges can be found in the *.Config.Override module.

@jorisdral jorisdral self-assigned this Mar 22, 2025
@jorisdral jorisdral marked this pull request as draft March 22, 2025 22:59
@jorisdral jorisdral force-pushed the jdral/config-override branch 2 times, most recently from 15d6cf1 to 76ebb9f Compare March 24, 2025 10:55
@jorisdral jorisdral marked this pull request as ready for review March 24, 2025 10:56
@jorisdral jorisdral force-pushed the jdral/config-override branch from 76ebb9f to bc2933b Compare March 27, 2025 10:19
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just a few comments about style and documentation.

, snapMergingTree =
let rdc = diskCachePolicyForLevel dcp UnionLevel
in override rdc snapMergingTree
, ..
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we add another field to the metadata, this makes it easy to miss that we might have to apply the override there as well. But can use .. in the pattern match, but I think it would be better to explicitly supply all the fields here.

Copy link
Collaborator Author

@jorisdral jorisdral Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right about it being easy to miss new fields. An approach I'd normally take is to pattern match on all fields and their types, so that any change to a type would become a compiler error here. I can try it out, and ping you for your thoughts later. The downside is that the code becomes more verbose (not a problem IMO if it means more future proofness)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some changes in 0d30694, let me know what you think

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a little excessive. Quite boilerplate-y and not sure it needs to fail when re-ordering fields or changing the type of a field (while keeping the same name). I think it would be enough to rely on field labels, catching renamed/added/removed fields. It's okay, though. It's clear what the code does, so the boilerplate is not an issue.

Copy link
Collaborator Author

@jorisdral jorisdral Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's boilerplate-y -- if it turns out to be more of nuisance than useful we can always change it.

The thing about matching on record field selectors (like \Foo{bar}) is that it will still happily compile if we add new fields. Or am I misunderstanding your suggestion? BTW, something like safe-wild-cards would help with ensuring we have to address all of the fields, I've used it in the past in fs-sim to ensure I wasn't forgetting any of the Errors fields

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion was to construct the record using labels, but not .., so we list out all fields. This fails to compile if we rename or add a field.

  override dcp SnapshotMetaData{..} = SnapshotMetaData {
        snapMetaLabel
      , snapMetaTableType
        snapMetaConfig = override dcp snapMetaConfig
      , snapWriteBuffer
      , snapMetaLevels = override dcp snapMetaLevels
      , snapMergingTree =
          let rdc = diskCachePolicyForLevel dcp UnionLevel
          in  override rdc snapMergingTree
      }

But I think what you did is reasonable as well.

@jorisdral jorisdral force-pushed the jdral/config-override branch from bc2933b to 0d30694 Compare March 31, 2025 12:29
Since #608, using a `TableConfigOverride` when opening a snapshot has no effect.
This commit restores the feature that we can change the disk cache policy on
snapshot open.

Internally, instead of threading through a modified config to open the snapshot,
we modify the whole `SnapshotMetaData` (replacing the disk cache policy
everywhere) and open the snapshot from that modified metadata.

A first idea we had was to modify a table config while its table is live, but
there are some challenges. Changing a table config when the table is offline
seem to be more straightforward, so that's the approach we took (for now). An
explanation of the challenges can be found in the `*.Config.Override` module.
@jorisdral jorisdral force-pushed the jdral/config-override branch from 0d30694 to 4d1e2b6 Compare April 3, 2025 09:32
@jorisdral jorisdral enabled auto-merge April 3, 2025 09:32
@jorisdral jorisdral added this pull request to the merge queue Apr 3, 2025
Merged via the queue into main with commit ad1292b Apr 3, 2025
30 checks passed
@jorisdral jorisdral deleted the jdral/config-override branch April 3, 2025 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants