Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple loose indexes #36

Merged
merged 5 commits into from
Nov 10, 2021
Merged

Decouple loose indexes #36

merged 5 commits into from
Nov 10, 2021

Conversation

peterwaller-arm
Copy link
Contributor

@peterwaller-arm peterwaller-arm commented Nov 8, 2021

Depends on #29, so beware full diff contains that diff.

This builds on #29 (remove repository index).

  • It removes the notion of a PackId::Loose. I intend to go a bit further here and get rid of PackId as an enum but that can be done later in a no-functional-changes PR.
  • It removes --pack|-P from the command line interface, simplifying the user interface a bit.
    • There is a new API, repo.find_snapshot(maybe_canonical_snapshot) -> SnapshotId. This API returns an error if the given string does not uniquely identify a snapshot within a single index.
    • Users can specify a pack with the syntax pack:snapshot, or just snapshot, which feeds through this API.
    • Use of : as a packid:snapshot_name separator enables / to be used for namespacing packs by using directories on the filesystem, and for / to appear in snapshot in the future too, if we wanted.
  • Store is reimplemented. It now no longer has to do work which scales with the number of snapshots which makes it quite a bit cheaper (it was taking 10s of seconds per snapshot when you have lots of snapshots, but now it's a flat 100ms).
  • Stores are now independent so we can execute multiple of them in parallel sharing the object store safely.
  • Stores save the snapshot to an index called packs/loose/<snapshot_name>.idx. So a loose snapshot is uniquely identified with loose/snapshot_name:snapshot_name, for the time being, if snapshot_name exists in other packs, or just with the string snapshot_name otherwise.
  • Packing is generalized to take multiple indexes as input. If the indexes are not specified, all loose indexes are taken.
    • TODO: Check that I remembered to sort indexes by name.
  • Introduce a 'loose' bit to the PackIndex #34 For now loose packs are identified as index files lacking a pack. I intend to introduce a bit to the beginning of an index file identifying it as loose, so we can have 'indices' without having the packs.
  • After this PR, loose indices are not removed during pack creation, nor are loose objects cleaned. Implement 'elfshaker gc' to clear unneeded loose objects and packs #17 is therefore a little more important. And probably we want to delete the loose indices by default.

There is one slightly confusing nomenclature issue with this PR, in that we refer to loose packs as packs, when they aren't yet packed. So the loose snapshots live in a 'loose/name.pack.idx'.

Resolves #15.

@peterwaller-arm peterwaller-arm changed the title Decouple loose indices Decouple loose indexes Nov 8, 2021
@peterwaller-arm peterwaller-arm force-pushed the decouple-loose-indices branch 3 times, most recently from 98786dc to a6b5981 Compare November 8, 2021 13:03
@peterwaller-arm
Copy link
Contributor Author

I'm going to pull out the fs2 / locks changes into a separate PR. Other than that, this is ready to go.

@peterwaller-arm peterwaller-arm force-pushed the decouple-loose-indices branch 2 times, most recently from 72b0116 to 6d38cf6 Compare November 8, 2021 13:45
@peterwaller-arm
Copy link
Contributor Author

OK, I'm now done with this. Left atomic writes alone here, got rid of the fs2 dependency, which can be thought about separately as part of #40.

@peterwaller-arm
Copy link
Contributor Author

Assuming it's green, I'm done fiddling here. The last action broke the tests again.

@peterwaller-arm peterwaller-arm force-pushed the decouple-loose-indices branch 2 times, most recently from cbdb0fd to eb17bf1 Compare November 8, 2021 19:03
Copy link
Member

@veselink1 veselink1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! There will be no slowdown now when loose snapshots pile up.
A couple of comments, some might be non-trivial to resolve, sorry 🤔.

src/bin/elfshaker/list.rs Show resolved Hide resolved
src/bin/elfshaker/list.rs Outdated Show resolved Hide resolved
src/bin/elfshaker/pack.rs Outdated Show resolved Hide resolved
src/bin/elfshaker/pack.rs Show resolved Hide resolved
src/bin/elfshaker/pack.rs Outdated Show resolved Hide resolved
src/bin/elfshaker/pack.rs Outdated Show resolved Hide resolved
src/bin/elfshaker/store.rs Show resolved Hide resolved
src/bin/elfshaker/list.rs Show resolved Hide resolved
Signed-off-by: Peter Waller <peter.waller@arm.com>
Signed-off-by: Peter Waller <peter.waller@arm.com>
Signed-off-by: Peter Waller <peter.waller@arm.com>
Avoid crashing with divide by zero on an empty pack.

An empty pack isn't very useful, except for testing.

Signed-off-by: Peter Waller <peter.waller@arm.com>
Copy link
Member

@veselink1 veselink1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, everything's fixed and we're now sorting by mtime, which preserves the order of the snapshots, thanks!

Typo in check.sh causes the tests to fail.

test-scripts/check.sh Outdated Show resolved Hide resolved
@peterwaller-arm peterwaller-arm force-pushed the decouple-loose-indices branch 2 times, most recently from c435aed to 4c224ae Compare November 9, 2021 21:20
If no snapshots are specified, pack them in the order they were created.

Signed-off-by: Peter Waller <peter.waller@arm.com>
@peterwaller-arm
Copy link
Contributor Author

Frustratingly elfsshaker store twice in quick succession is resulting in identical nanosecond-granularity timestamps in the mtime field, causing the test to non-deterministically fail. So I chose an EPOCH=$(date) and manually set distinct timestamps on the packs.

@peterwaller-arm peterwaller-arm merged commit b9c9cf0 into main Nov 10, 2021
@peterwaller-arm peterwaller-arm deleted the decouple-loose-indices branch November 10, 2021 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Decouple loose snapshot indexes
2 participants