Introduce a minimum partition version #2778

Merged
dominiklohmann merged 5 commits into master from story/sc-36344/bump-minimum-partition-version
Dec 14, 2022

Conversation

dominiklohmann
Member

This introduces a minimum partition version for VAST. It enables future cleanup of code that is only needed for databases in which all partitions have a partition version lower than the minimum. This PR also sets the minimum partition version to 1.
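
As a rough illustration of the mechanism, the version gate can be sketched as a single comparison performed when a partition is loaded. This is a minimal sketch and not VAST's actual code: the namespace, the names `min_partition_version`, `current_partition_version`, `load_result`, and `check_partition_version`, and the value chosen for the current version are all assumptions made for the example. Only the minimum of 1 comes from this PR.

```cpp
#include <cstdint>

// Hypothetical names throughout; this does not mirror VAST's real API.
namespace sketch {

// Oldest partition version that is still readable (this PR sets it to 1).
constexpr std::uint64_t min_partition_version = 1;

// Newest partition version that is written (value assumed for illustration).
constexpr std::uint64_t current_partition_version = 2;

enum class load_result { ok, too_old, too_new };

// Decides whether a partition with the given version may be loaded.
constexpr load_result check_partition_version(std::uint64_t version) {
  if (version < min_partition_version)
    return load_result::too_old; // predates the minimum, needs a rebuild
  if (version > current_partition_version)
    return load_result::too_new; // written by a newer release
  return load_result::ok;
}

} // namespace sketch
```

With a gate like this in place, any code path that only exists to handle partitions below the minimum becomes unreachable and can be deleted, which is the kind of cleanup the follow-up commits below describe.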

@dominiklohmann dominiklohmann added maintenance Tasks for keeping up the infrastructure blocked Blocked by an (external) issue labels Dec 9, 2022
@dominiklohmann dominiklohmann requested a review from lava December 9, 2022 15:05
@dominiklohmann dominiklohmann force-pushed the story/sc-36344/bump-minimum-partition-version branch from bf968a9 to 0c2a7a1 Compare December 9, 2022 15:10
Member

@lava lava left a comment

Useful mechanism to have in general

@dominiklohmann dominiklohmann removed the blocked Blocked by an (external) issue label Dec 9, 2022
@dominiklohmann dominiklohmann force-pushed the story/sc-36344/bump-minimum-partition-version branch 2 times, most recently from a0fd670 to b022435 Compare December 14, 2022 16:03
This is superseded by the per-partition version and has been effectively unused
since before VAST v1.0. Now that the minimum partition version is one that was
set in VAST v2.2, we can remove it.
@dominiklohmann dominiklohmann force-pushed the story/sc-36344/bump-minimum-partition-version branch from b022435 to 075a274 Compare December 14, 2022 17:53
@dominiklohmann dominiklohmann merged commit cfc56de into master Dec 14, 2022
@dominiklohmann dominiklohmann deleted the story/sc-36344/bump-minimum-partition-version branch December 14, 2022 19:07
dominiklohmann added a commit that referenced this pull request Dec 14, 2022
Following the introduction of a minimum partition version of 1 with
#2778, we can now—finally—remove the archive and everything that
belongs to it.
dominiklohmann added a commit that referenced this pull request Dec 14, 2022
Following the introduction of a minimum partition version of 1 with
#2778, we're no longer able to encounter any table slices
that are MessagePack-encoded or use an older version of the mapping of
our data model onto Apache Arrow.

There's a future refactoring that is now unblocked, which is a total
replacement of the legacy baggage that is the `table_slice` API with an
easier-to-use wrapper around the same underlying data that makes more
assumptions about the underlying column-major memory layout.

This is based on #2797 to avoid merge conflicts.
dominiklohmann added a commit that referenced this pull request Dec 21, 2022
Following the introduction of a minimum partition version of 1 with
#2778, the augmented partition synopsis is now irrelevant as
the schema is always contained in the partition synopsis already. This
has no user-facing implications; it's purely an internal cleanup.
dominiklohmann added a commit that referenced this pull request Dec 21, 2022
Following the introduction of a minimum partition version of 1 with
#2778, all partitions are now guaranteed to be homogeneous.
The rebuilder as such does not need to take care of heterogeneous
partitions any longer.
dominiklohmann added a commit that referenced this pull request Dec 21, 2022
Following the introduction of a minimum partition version of 1 with
#2778, all partitions are now guaranteed to have an id range
specified in their partition synopsis. Rewriting old partition synopses
on startup is no longer necessary.
Labels
maintenance Tasks for keeping up the infrastructure