Generated CIDv0 differs from the one generated by IPFS #77
Comments
In general, you can't expect this to work.
IPFS is a filesystem built on top of IPLD (i.e., it creates IPLD nodes to encode blocks, indirect blocks, inodes, etc.). IPFS chunks files into blocks (~256KiB by default) and builds a merkle-tree on top of these chunks (using IPLD). The CID of a file corresponds (approximately) to the hash of the root node of this merkle-tree.
In your case, I assume your file is less than 256KiB. That means it fits in a single chunk.
In the CIDv0 case, IPFS takes your file content, wraps it in a protobuf data structure, and then hashes it. IPFS has to do this because CIDv0 only supports one IPLD "codec" (DagPB). Every IPLD object with a V0 CID uses this same DagPB format.
CIDv1, on the other hand, supports many IPLD codecs (the specific codec used is recorded in the CID itself). In this case, because your file fits in a single block, IPFS is using the Raw codec (raw binary). That's why |
(Closing for tracking, please feel free to continue discussing/asking questions) |
Hi @Stebalien and @AminArria, I'm running into the same issue - I'm trying to generate a v0 CID that matches the one that an IPFS node would generate, but without running an IPFS node. Is there some way of doing this? |
I believe you can use |
@Stebalien You're right, I was hoping that this would work to replicate that behaviour that the
But it seems like |
IPFS data is first chunked. Then a merkledag is built on top of the chunks. The final CID is that of the root of this tree.
`prefix.Sum(data)` assumes that `data` is a single node in the tree and will return the CID of that node. It won't do any IPFS chunking.
To do that, you'll need to extract code from go-ipfs's add function.
Also note: you really shouldn't do this. Given different options (in `ipfs add`), different chunking algorithms, different hash functions, format changes, etc., the resulting CID may be different. While a given CID always points to the same file, calling `ipfs add` on the same file isn't guaranteed to produce the same CID (unless you use the exact same options, etc.).
|
@Stebalien Thanks so much for all your help and for the quick responses! And thanks for the heads up on how the CID can change depending on the IPFS config. I'll be sure to take that into account as I build my solution. |
Hi @Stebalien, I have a follow-up question. I was watching https://www.youtube.com/watch?v=Z5zNPwMDYGg to learn more about how adding data to IPFS works (great video, by the way). I have a very specific use-case: generate CIDs locally that match the ones produced by the
What I tried doing was wrapping the data in a UnixFS file wrapper before calculating the CID, but I'm still not getting a matching CID. I've verified that the Merkle DAG should be just a single node by using https://dag.ipfs.io/, so no chunking/node balancing should be needed. It seems I'm still missing something - do you know what? Here is a short code snippet showing exactly what I'm doing:
|
These APIs are really bad, I'm so sorry. The (current) file format wraps a protobuf within a protobuf. You've just created the inner protobuf, but you still need to create the outer one. You need to call
Are you willing to change the defaults? If you are, you can use
NOTE: "fits into one chunk" means <= 1MiB (ish). IPFS will refuse to transfer larger chunks over bitswap as we don't want to download too much data without verifying it. |
@Stebalien Yep, that did the trick!
No need to apologize! Thanks for all your hard work on this awesome (and free) project!
For now my requirement is to support the default IPFS settings, but this is really good to know. I'll keep this in mind in case my requirements change (and/or I need to support more configurations). Thanks again so much for your help! |
FYI, there's a chance this will become the default in the near future (TM). But that's been the case for a while now.
|
@DRK3, by any chance were you able to extend the solution to programmatically compute the IPFS CID for multiple chunks? (My files will be around 4-5 MB.) |
@Vikram710 It's been a while, but from what I recall I didn't need multiple chunks, so it may be possible, but I haven't attempted it. |
Hi, I'm receiving some files and want to verify the CID sent, so I'm doing (summarized):
The CID I'm receiving is the one generated by running `ipfs add -n path/to/file`, but it doesn't match the one generated by `go-cid`. Am I doing something wrong?
PS: This works fine for CIDv1