-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement new index type that also includes mutltihash code #217
Conversation
@willscott This index implementation uses a nested map of I'd love to get your feedback whenever you get a chance. 🍻 |
Unless we can think of cases where we wouldn't be able to get away with the simpler case, lets see if we can get away with it? |
@masih @willscott one of these days when I carve some time to build what I was planning to build before joining PL, I am definitely going to use multihash-level truncated full-sha3-512 ( can go into the weeds why if interested ). So you won't be able to index my content e.g. ( whether this is a showstopper or not: 🤷 ) |
To be precise: |
that means the car will have to be read through block by block to calculate the list of full multi-hashes to provide for indexing, rather than being able to crib it off the index of the car. Eventually, it sounds like we should be able to store the computed announcement for indexer nodes if we do have to do the extra work of computation to generate the announcement. that'll probably happen after MVP (although @masih already wrote the walking code once 😅 ) |
a73976a
to
22949f0
Compare
@rvagg anything we should be aware of in putting together a PR for the additional index multicodec format over in multicodecs? |
nope, should be pretty straightforward, just add a |
thanks @rvagg, going to open a PR for it just now if we are happy with |
Define a new codec for CARv2 `MultihashIndexSorted`. See: - ipld/go-car#217 - ipld/go-car#214
Define a new codec for CARv2 `MultihashIndexSorted`. See: - ipld/go-car#217 - ipld/go-car#214
c28358b
to
ac2b95a
Compare
Define a new codec for CARv2 `MultihashIndexSorted`. See: - ipld/go-car#217 - ipld/go-car#214
ac2b95a
to
276d94a
Compare
Implement a new CARv2 index that contains enough information to reconstruct the multihashes of the data payload, since `CarIndexSorted` only includes multihash digests. The new index builds on top of the existing `IndexSorted` by adding an additional layer of grouping the multi-width indices by their multihash code. Note, this index intentionally ignores any given record with `multihash.IDENTITY` CID hash. Add a test that asserts offsets for the same CID across sorted index and new multihash sorted index are consistent. Add tests that assert marshal unmarshalling of the new index type is as expected, and it does not load records with `multihash.IDENTITY` digest. Relates to: - multiformats/multicodec#227 Fixes: - #214
276d94a
to
ce2e998
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-review
// This implementation decodes multihashes twice: once here to group by code, and once in the | ||
// internals of multiWidthIndex to group by digest length. The code can be optimized by | ||
// combining the grouping logic into one step where the multihash of a CID is only decoded once. | ||
// The optimization would need refactoring of the IndexSorted compaction logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postponing the optimisation here to reduce the size of this PR
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Implement a new CARv2 index that contains enough information to
reconstruct the multihashes of the data payload, since
CarIndexSorted
only includes multihash digests. The new index builds on top of the
existing
IndexSorted
by adding an additional layer of grouping themulti-width indices by their multihash code.
Note, this index intentionally ignores
any given record with
multihash.IDENTITY
CID hash.Add a test that asserts offsets for the same CID across sorted index and
new multihash sorted index are consistent.
Add tests that assert marshal unmarshalling of the new index type is as
expected, and it does not load records with
multihash.IDENTITY
digest.Relates to:
MultihashIndexSorted
codec multiformats/multicodec#227Fixes #214