pkg/nixpath/chunker: add #81

Closed
wants to merge 1 commit into from
Collaborator

@flokli flokli commented Jun 19, 2022

This provides two different implementations to chunk data.

I'm not entirely sure if this should go into pkg/nixpath/chunker, or in another place.

@flokli flokli requested a review from adisbladis June 19, 2022 14:33
Comment on lines +16 to +18
chunkerOpts.NormalSize = 64 * 2024
chunkerOpts.MinSize = chunkerOpts.NormalSize / 4
chunkerOpts.MaxSize = chunkerOpts.NormalSize * 4
Collaborator Author

I just reused the values I initially used in nix-casync. They should probably be refined once we can ingest some data.

@@ -6,6 +6,7 @@ require (
github.com/alecthomas/kong v0.5.0
github.com/dgraph-io/badger/v3 v3.2103.2
github.com/google/go-cmp v0.5.5
github.com/poolpOrg/go-fastcdc v0.0.0-20211130135149-aa8a1e8a10db
Member

Will you be able to re-use that and stay casync-compatible, or is that a new system altogether?

Collaborator Author

It's a new chunking mechanism, with a much smaller interface. You pass in a reader over some data and use an iterator interface to get chunks as the data is read.

nix-casync used desync, which provided a lot of functionality that we didn't use (.caidx). Also, the way it was designed required us to first write the data to be chunked to a (temporary) file.

The chunking method used to chunk up data shouldn't matter when it comes to substitution. However, using the same chunking method with similar parameters should yield more block reuse.

Collaborator Author

flokli commented Jun 20, 2022

I marked this as a draft. I'm not entirely sure it belongs in pkg/nixpath, and if so, how to structure other things in the package.

Also, right now, go-nix mostly contains (re-)implementations of some common concepts found in Nix, and things like chunking mechanisms are not part of that. This might change once we have come up with a new remote store protocol (already in the works).

Collaborator Author

flokli commented Jul 13, 2022

Closing in favor of #86.

@flokli flokli closed this Jul 13, 2022