Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

irmin-pack: fix GC when target commit is in lower #2242

Merged
merged 4 commits into from
May 3, 2023

Conversation

metanivek
Copy link
Member

@metanivek metanivek commented May 1, 2023

After an existing store migrates to the lower layer, we want a GC that targets a commit in the lower layer to still run and create a sparse file (prefix + mapping) in the upper so that reads for new data do not need to go to the lower.

Before this PR, this kind of GC almost worked. The issue is that it would fail when attempting to swap the volume control file since a new temporary one was not created. This PR adds a test for this scenario and enables a previously written test that also failed with this bug.

(@Firobe tagged you for review since you wrote the original test if you wanted to double-check my changes that enable it)

Update1: I just added a commit that now checks that a requested commit for GC is strictly newer than the previous GC commit. I reused an existing error (Gc_disallowed) by adding a string to keep from adding another error to our already long list of errors. It also finds an internal use for the latest_target_gc_offset field. 🎉

Update2: There was another issue hiding if you called GC on a commit older than the latest commit in the lower. Thanks @art-w for spotting! This is now fixed. The root issue was that previous code assumed the offset of the GC commit should always be the new start offset of the suffix. But, we don't necessarily want that when calling GC on commits in the lower after a migration.

@metanivek metanivek requested review from art-w and Firobe May 1, 2023 17:43
@codecov-commenter
Copy link

codecov-commenter commented May 1, 2023

Codecov Report

Merging #2242 (08722c8) into main (4c38a4b) will increase coverage by 0.01%.
The diff coverage is 80.00%.

❗ Current head 08722c8 differs from pull request most recent head 2bec6bf. Consider uploading reports for the commit 2bec6bf to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##             main    #2242      +/-   ##
==========================================
+ Coverage   67.93%   67.94%   +0.01%     
==========================================
  Files         136      136              
  Lines       16603    16622      +19     
==========================================
+ Hits        11279    11294      +15     
- Misses       5324     5328       +4     
Impacted Files Coverage Δ
src/irmin-pack/unix/errors.ml 28.88% <ø> (ø)
src/irmin-pack/unix/gc_worker.ml 3.24% <0.00%> (ø)
src/irmin-pack/unix/io_errors.ml 58.33% <ø> (ø)
src/irmin-pack/unix/gc.ml 72.64% <75.00%> (-0.77%) ⬇️
src/irmin-pack/unix/lower.ml 61.98% <100.00%> (+0.91%) ⬆️
src/irmin-pack/unix/store.ml 65.78% <100.00%> (+0.34%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

We want to create a prefix/mapping in the upper in this scenario to
support the migration of existing stores so that they can continue to
call GC for data in the lower while getting the benefit of the sparse
file in the upper.
Since any commit can be the target of a GC when a store's behavior is
archive, we need to guard against calling GC on commits prior to the
most recent successful GC. When store behavior is delete, this is
automatically enforced since older commits are inaccessible.
Copy link
Contributor

@art-w art-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested successfully, and it LGTM!

@metanivek metanivek merged commit 1f046dd into mirage:main May 3, 2023
5 checks passed
@metanivek metanivek deleted the gc_in_lower branch May 3, 2023 15:21
metanivek added a commit to metanivek/opam-repository that referenced this pull request May 3, 2023
…min-pack, irmin-pack-tools, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.7.1)

CHANGES:

### Fixed

- **irmin-pack**
  - Fix issue when migrating v2 stores to use lower layer (@metanivek, mirage/irmin#2241)
  - Fix issue when calling GC for a commit in the lower after migration
    (@metanivek, mirage/irmin#2242)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants