Problem
Currently, when POST /works is requested, the API dispatches a message to RMQ that is picked up by Storage, which in turn calls the code in node/src/Storage/ClaimController.ts, lines 41 to 58 at 3e62476.
This means that if the upload to IPFS fails for any reason, the caller of POST /works won't find out about it at all. It also means that if the file sent to IPFS is corrupted before reaching IPFS, a valid IPFS hash will be generated but the file will be broken, which is what may be happening in #53.
Solution Proposal 1
A way to validate that the file added to IPFS is correct would be to generate the hash of the file the same way IPFS does and match our hash against the one returned by IPFS.
IPFS uses the multihash protocol. By default, it currently uses SHA-256 encoded in base58, which is why IPFS hashes start with "Qm". Calculating the IPFS hash without using IPFS altogether could be quite complex: the multihash format only provides a wrapper around the hash algorithm, so switching from SHA-256 to BLAKE2b changes only the prefix. HOWEVER, the hash is also affected by the chunking algorithm, DAG format and CID version, so you can have completely different hashes even if the format is marked the same.
This functionality should already be available in js-ipfs. Maybe we can use the JS implementation of IPFS to calculate the hash without actually running a node? (Need to confirm, but I believe const ipfs = new IPFS() would have the unwanted side effect of eating a whole lot of RAM and CPU.) See https://github.com/ipfs/js-ipfs/blob/bddc5b4a967258306bed8bd37a9b4fd308b98fc8/test/http-api/files.js#L31-L48
UPDATE
Using js-ipfs to add -n requires creating an instance of IPFS, which in turn runs the IPFS node. I tried reading the source code of js-ipfs to understand how it works internally and made some small modifications to make it run without starting the IPFS node, but was unsuccessful.
I thought about requesting this feature, but I gather this behavior is intentional: the hash generated by ipfs add depends not only on the contents of the file but also on some node-specific values that could change from node to node.
> the hash is also affected by the chunking algorithm, DAG format and CID version, so you can have completely different hashes even if the format is marked the same

Multihash just hashes the file and prints out its hash in multihash format. ipfs add creates a Merkle DAG and wraps the file in the needed format, which is why the hash is different.
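The distinction can be sketched in TypeScript (Node.js). Hashing the raw bytes and wrapping the digest in the multihash format yields a 46-character base58 string starting with "Qm", just like a CIDv0 — but it will generally not match what ipfs add returns, because ipfs add hashes the Merkle DAG built from the (possibly chunked) file, not the raw bytes. This is an illustrative sketch, not the project's code; the base58 alphabet is the standard Bitcoin one used by IPFS.

```typescript
import { createHash } from 'crypto'

// Standard base58 (Bitcoin) alphabet, as used by IPFS CIDv0 strings.
const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

function base58Encode(bytes: Buffer): string {
  let n = BigInt('0x' + bytes.toString('hex'))
  let out = ''
  while (n > 0n) {
    out = ALPHABET[Number(n % 58n)] + out
    n /= 58n
  }
  // Leading zero bytes are encoded as leading '1' characters.
  for (const b of bytes) {
    if (b !== 0) break
    out = '1' + out
  }
  return out
}

// Raw multihash of the content: <0x12 = sha2-256><0x20 = 32-byte length><digest>.
// NOTE: this is NOT the hash `ipfs add` returns for the same bytes, because
// `ipfs add` hashes a DAG node wrapping the chunked content.
function rawMultihash(content: Buffer): string {
  const digest = createHash('sha256').update(content).digest()
  return base58Encode(Buffer.concat([Buffer.from([0x12, 0x20]), digest]))
}
```

Any sha2-256 multihash encodes to a 46-character base58 string beginning with "Qm", which is exactly why CIDv0 hashes share that prefix.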
Solution Proposal 2
Plan B would be to ipfs cat right after ipfs add and verify that the content matches. This operation is much more expensive, essentially doubling the memory and network costs, and would need to be asynchronous and have a retry mechanism like the one we use for downloads, since we can't rely on IPFS being 100% available.
I'd rather wait before we go this way. It should be relatively straightforward to implement, but the costs are too high.
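A minimal sketch of what Plan B could look like, assuming a cat function that fetches content from IPFS by hash. The function names and the backoff policy here are hypothetical, not the Po.et node's actual download retry code:

```typescript
// Retry an async operation with exponential backoff, since IPFS
// may be temporarily unavailable. (Hypothetical policy: 3 attempts.)
async function withRetry<T>(
  operation: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation()
    } catch (error) {
      lastError = error
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt))
    }
  }
  throw lastError
}

// Verify an upload by reading the content back and comparing byte-for-byte.
// This duplicates the memory and network costs, which is the drawback noted above.
async function verifyUpload(
  cat: (hash: string) => Promise<Buffer>, // hypothetical IPFS client call
  hash: string,
  original: Buffer
): Promise<boolean> {
  const fetched = await withRetry(() => cat(hash))
  return fetched.equals(original)
}
```

Note that the cost of verifyUpload scales with the size of the file, which is the core objection to this proposal.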
Solution Proposal 3
In my mind, we should be able to have a pure function that calculates the IPFS hash for a given content, as add -n would, taking the DAG format, CID version and chunking algorithm into account but without instantiating the node.
If these values can change from IPFS node to IPFS node but are constant over a run of the node, we could have the Po.et Node request them from the IPFS node once on startup.
Compared to Solution 2, we'd be making a constant-size network request once per startup, instead of a variable-size (potentially extremely large) network request to IPFS for each add.
I've uploaded some basic tests to ipfs-hash-tests.
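The startup handshake could be sketched as below. The interface and the endpoint that reports the node's hashing parameters are assumptions for illustration, not an actual IPFS API:

```typescript
// Node-specific values that affect the hash `ipfs add` produces.
// Field names and example values are hypothetical.
interface HashingParams {
  cidVersion: number // e.g. 0
  chunker: string    // e.g. 'size-262144'
  dagFormat: string  // e.g. 'dag-pb'
}

// Fetches the parameters once and caches them for the lifetime of the
// Po.et node, so each `add` can be verified with a pure local computation
// instead of a variable-size `ipfs cat` round-trip.
class HashingParamsCache {
  private cached?: Promise<HashingParams>

  constructor(private readonly fetchFromIpfsNode: () => Promise<HashingParams>) {}

  get(): Promise<HashingParams> {
    // Cache the promise itself so concurrent callers share one fetch.
    if (!this.cached) this.cached = this.fetchFromIpfsNode()
    return this.cached
  }
}
```

Caching the promise rather than the resolved value means two concurrent get() calls at startup still trigger only one request to the IPFS node.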