Amend equivalence claims to support variant addresses by blob digest #60

Closed
Gozala opened this issue Apr 24, 2024 · 8 comments · Fixed by #61

@Gozala

Gozala commented Apr 24, 2024

Context

Roundabout uses equivalence claims to resolve a CAR CID from a piece CID.

With the addition of the blob interface we no longer assume CARs and need to publish equivalence claims that use a multihash digest instead. The goal is that when Roundabout queries content claims by piece CID, it gets a response from which it can derive the location, so it can read the content without having to try two alternative read paths:

/${car.cid}/${car.cid}.car
/${base58btc(blob.multihash)}/${base58btc(blob.multihash)}.blob

What

  1. Make it possible to produce equivalence claims without having to wrap the multihash in a CID
  2. Make it possible to query equivalence claims and derive the exact path
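
To make the two read paths concrete, here is a minimal TypeScript sketch of deriving the path from either address form. The `Address` type and the `storagePath` function are hypothetical names, not part of any existing API. The base58btc encoder is hand-rolled so the sketch has no dependencies, and it emits no multibase prefix, which is an assumption since the path templates above do not say either way.

```typescript
// Hypothetical address forms: a CAR identified by its CID string, or a blob
// identified by its raw multihash bytes.
type Address =
  | { kind: "car"; cid: string }
  | { kind: "blob"; multihash: Uint8Array };

const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Plain base58btc (Bitcoin alphabet), without a multibase prefix (assumption).
function base58btc(bytes: Uint8Array): string {
  const digits: number[] = []; // base-58 digits, least significant first
  for (let b = 0; b < bytes.length; b++) {
    let carry = bytes[b];
    for (let i = 0; i < digits.length; i++) {
      carry += digits[i] * 256;
      digits[i] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  let out = "";
  // Each leading zero byte is encoded as a leading "1".
  for (let z = 0; z < bytes.length && bytes[z] === 0; z++) out += "1";
  for (let i = digits.length - 1; i >= 0; i--) out += ALPHABET.charAt(digits[i]);
  return out;
}

// Derive the read path from the address form, following the two templates above.
function storagePath(address: Address): string {
  if (address.kind === "car") {
    return `/${address.cid}/${address.cid}.car`;
  }
  const key = base58btc(address.multihash);
  return `/${key}/${key}.blob`;
}
```

With something like this, a reader that knows which address form a claim carries can go straight to one path instead of probing both.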
@Gozala
Author

Gozala commented Apr 24, 2024

After reflecting more on this, it feels like it would make more sense to use a RAW CID as opposed to a multihash. The rationale is that a Piece CID refers to the same data model (a byte array) as a RAW CID, which makes more sense than claiming that a CID is equivalent to some multihash. A multihash refers to a byte array, while a CID may reference arbitrary data structures.

Equivalency claims across multihashes also make sense, e.g. if a claim is made that some blake3 multihash is equivalent to some sha256 multihash, but equivalence between a multihash and a CID seems kind of strange.
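
For reference, going between a multihash and a RAW CID is cheap: in the binary CIDv1 layout a RAW CID is the version byte `0x01`, the raw codec byte `0x55`, and then the multihash itself. The sketch below illustrates that wrapping and unwrapping in TypeScript. The function names are hypothetical, and it works on bytes only (no multibase string encoding).

```typescript
const CID_VERSION_1 = 0x01; // CIDv1 version, a single-byte varint
const RAW_CODEC = 0x55;     // multicodec code for "raw", a single-byte varint

// Wrap a multihash as binary CIDv1 with the raw codec.
function rawCidFromMultihash(multihash: Uint8Array): Uint8Array {
  const cid = new Uint8Array(2 + multihash.length);
  cid[0] = CID_VERSION_1;
  cid[1] = RAW_CODEC;
  cid.set(multihash, 2);
  return cid;
}

// Recover the multihash from a binary CIDv1 that uses the raw codec.
function multihashFromRawCid(cid: Uint8Array): Uint8Array {
  if (cid.length < 2 || cid[0] !== CID_VERSION_1 || cid[1] !== RAW_CODEC) {
    throw new Error("not a CIDv1 with the raw codec");
  }
  return cid.subarray(2);
}
```

Since the multihash can always be recovered from a RAW CID, a claim expressed over RAW CIDs would still let a reader derive a multihash-based path.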

@vasco-santos
Contributor

vasco-santos commented Apr 25, 2024

> After reflecting more on this, it feels like it would make more sense to use a RAW CID as opposed to a multihash. The rationale is that a Piece CID refers to the same data model (a byte array) as a RAW CID, which makes more sense than claiming that a CID is equivalent to some multihash. A multihash refers to a byte array, while a CID may reference arbitrary data structures.
>
> Equivalency claims across multihashes also make sense, e.g. if a claim is made that some blake3 multihash is equivalent to some sha256 multihash, but equivalence between a multihash and a CID seems kind of strange.

I agree with this. Doing RAW seems like a great option. If RAW comes right from filecoin/offer from the client in the new flow, I assume everything would simply work, right? Because then the content would be propagated everywhere, including for the equivalency claim.

With that, from what I understand we essentially cut the need to do any work on:

and really only Roundabout needs to be able to assume RAW CIDs w3s-project/w3infra#357 (I think it expects only CAR CIDs)

@alanshaw
Member

> e.g. if a claim is made that some blake3 multihash is equivalent to some sha256 multihash, but equivalence between a multihash and a CID seems kind of strange.

yeah but we're not saying the hashes are equivalent, we're saying some aspect of the content they are addressing is equivalent. It doesn't matter if it's a CID vs multihash vs content integrity string thing.

@Gozala
Author

Gozala commented May 1, 2024

> yeah but we're not saying the hashes are equivalent, we're saying some aspect of the content they are addressing is equivalent. It doesn't matter if it's a CID vs multihash vs content integrity string thing.

I think calling it equivalent is a stretch. The word I would reach for is that they are related, but then I'd also like to capture the "relation", as in "related how?". Case in point, I could say:

  • bafy...foo is related to bgr...bza because the file content on the left hashes to bgr...bza
  • bafy...foo is related to bgr...root because the root node hash corresponds to bgr...root in, say, sha-512
  • bafy...foo is related to bag...car because the DAG blocks are contained by bag...car

I can keep going, but the point I'm trying to make is that the kind of relation is a crucial detail. Unless it is captured explicitly, something will be assumed implicitly, and that will break down as soon as what is assumed implicitly is extended.
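
One way to picture "capturing the relation explicitly" is a claim shape that carries the relation kind as data. The TypeScript sketch below is purely illustrative: the `RelatedClaim` type, the three relation names, and the `relatedBy` helper are hypothetical and are not the content-claims schema. The identifiers are the elided placeholders from the examples above.

```typescript
// Hypothetical relation kinds, one per example in the list above.
type RelationKind = "content-digest" | "root-block-digest" | "contained-in";

// Hypothetical claim shape that states how subject and object are related.
interface RelatedClaim {
  subject: string;
  relation: RelationKind;
  object: string;
}

const examples: RelatedClaim[] = [
  { subject: "bafy...foo", relation: "content-digest", object: "bgr...bza" },
  { subject: "bafy...foo", relation: "root-block-digest", object: "bgr...root" },
  { subject: "bafy...foo", relation: "contained-in", object: "bag...car" },
];

// Return the objects related to `subject` by one specific kind of relation.
function relatedBy(
  claims: RelatedClaim[],
  subject: string,
  relation: RelationKind
): string[] {
  const out: string[] = [];
  for (let i = 0; i < claims.length; i++) {
    const claim = claims[i];
    if (claim.subject === subject && claim.relation === relation) {
      out.push(claim.object);
    }
  }
  return out;
}
```

A consumer that only cares about where the DAG blocks live would ask for `"contained-in"` and never confuse that answer with a digest of the file content.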

@alanshaw
Member

alanshaw commented May 4, 2024

They address the same bytes of data was the assumption I was working under.

I know piece CID is a stretch on that.

So then a dagpb CID is equivalent to a raw CID is equivalent to a v0 CID is equivalent to a multihash (which is what a v0 CID is anyway).

@Gozala
Author

Gozala commented May 6, 2024

> So then a dagpb CID is equivalent to a raw CID is equivalent to a v0 CID is equivalent to a multihash (which is what a v0 CID is anyway).

I'm not sure I follow this. A DAG-PB CID, v0 or v1, refers to the root block of the DAG. If I refer to the same block by RAW CID, it is clear that I'm asking for the bytes of the encoded root block; if I refer to it by DAG-PB CID, I'm referring to a UnixFS file or directory, not the root block. The point I'm trying to make is that in one case I refer to a block and in the other I refer to a DAG. In the case of a multihash, I would again expect a reference to a block, not a DAG.

That is also the problem I'm trying to flag. If we say this CID is equivalent to this multihash, it sounds like we're comparing a block to a DAG, which I find misleading. If we put CAR CIDs in the mix it gets even more ambiguous: are we talking about files, DAGs or bytes?

@alanshaw
Member

alanshaw commented May 7, 2024

OK, that makes sense. Thanks for explaining your rationale.

@alanshaw
Member

We decided to use raw CIDs instead.

alanshaw added a commit that referenced this issue May 29, 2024
* Claims can now be published by multihash, the `content` property
should be a `Link` OR `{ digest: Uint8Array }` when sending an
invocation
* Read API `GET /claims/:cid` is deprecated
* Added `GET /claims/cid/:cid` for reading claims by CID
* Added `GET /claims/multihash/:multihash` for reading claims by
(base58btc encoded) multihash
* ~~Removed "Relation claim" - this is not published by any w3up infra
and has significant overlap with upcoming
[w3-index](https://github.com/w3s-project/specs/blob/main/w3-index.md)
claim~~

resolves #60

BREAKING CHANGE: Client read interface and client claim types now use
multihashes. Relation claim has been removed in favour of upcoming
dag-index claim.
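
As a rough illustration of the read routes listed in the commit message, the TypeScript sketch below picks the route from the kind of key a caller holds. The `ClaimKey` type and the `claimsReadPath` function are hypothetical names, and the multihash is assumed to be base58btc encoded already, as the route description requires.

```typescript
// Hypothetical key type: either a CID string or an already base58btc-encoded
// multihash string.
type ClaimKey = { cid: string } | { multihash: string };

// Build the read path for a claim lookup. The older `GET /claims/:cid` route
// is deprecated per the commit message, so it is not produced here.
function claimsReadPath(key: ClaimKey): string {
  if ("cid" in key) {
    return `/claims/cid/${key.cid}`;
  }
  return `/claims/multihash/${key.multihash}`;
}
```

The returned path would be appended to the base URL of whichever claims service is in use, which is left out here because the thread does not name one.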