ipfs get & cat: link resolution not working for HAMT-sharded directories #8072

Closed
dokterbob opened this issue Apr 13, 2021 · 7 comments
Labels
kind/bug: A bug in existing code (including security flaws)
need/analysis: Needs further analysis before proceeding
topic/sharding: Topic about Sharding (HAMT etc)

Comments

@dokterbob
Contributor

dokterbob commented Apr 13, 2021

Version information:

go-ipfs version: 0.8.0-ce693d7
Repo version: 11
System version: amd64/darwin
Golang version: go1.14.15

Description:

On several occasions I have found that links which appear in a directory listing fail to resolve when I request them (on the gateway this shows up as a 404). Example:

$ ipfs ls /ipfs/QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr/
[...]
bafkreid5idh3bq5gj2wck4ghg55sof37zmd42bhn63rdzcrxvors7vgszq 13838 Roman_Election.jpg.webp
[...]
$ ipfs get /ipfs/QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr/Roman_Election.jpg.webp
Error: no link named "Roman_Election.jpg.webp" under QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr

The same result appears on the public gateway: https://gateway.ipfs.io/ipfs/QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr/Roman_Election.jpg.webp

@dokterbob dokterbob added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Apr 13, 2021
@dokterbob
Contributor Author

Potentially related: #3911

@aschmahmann aschmahmann added need/analysis Needs further analysis before proceeding and removed need/triage Needs initial labeling and prioritization labels Apr 19, 2021
@aschmahmann
Contributor

This may be resolved by #7976, but needs investigation to check.

@lidel lidel added the topic/sharding Topic about Sharding (HAMT etc) label Apr 19, 2021
@aschmahmann aschmahmann self-assigned this Apr 19, 2021
@lidel lidel changed the title from "Link resolution not working for some hashes" to "ipfs get: link resolution not working for HAMT-sharded directories" Apr 19, 2021
@aschmahmann aschmahmann moved this from Backlog to Weekly Candidates in Maintenance Priorities - Go May 10, 2021
@aschmahmann aschmahmann moved this from Weekly Candidates to Backlog in Maintenance Priorities - Go May 10, 2021
@Stebalien
Member

So, sharded directories should definitely work here. We have tests, people use them all the time on the gateways, I've fixed these bugs before, etc.

It looks like these blocks are intermediate nodes in the sharded directory but I haven't confirmed that. The real bug here is that we don't verify the HAMT structure when listing, we just blindly walk the DAG.
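
To illustrate the difference between blindly walking the DAG and resolving a name through the HAMT, here is a minimal toy model. It is not the go-ipfs implementation: the `Shard` class, the fanout of 16, the use of SHA-256, and the `file-N` names are all assumptions made for the sketch. The point it shows is that a node does not know its own depth, so a lookup that starts at an intermediate node and assumes depth 0 reads the wrong hash digit, while a plain walk still lists every entry.

```python
import hashlib


def digits(name):
    """Hex digits of the name's hash; digit i picks the bucket at depth i."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


class Shard:
    """One toy HAMT node. It does not record how deep in the tree it sits."""

    def __init__(self):
        self.children = {}  # hex digit -> Shard, or a (name, value) leaf


def put(node, name, value, depth=0):
    digit = digits(name)[depth]
    child = node.children.get(digit)
    if child is None:
        node.children[digit] = (name, value)
    elif isinstance(child, Shard):
        put(child, name, value, depth + 1)
    elif child[0] == name:
        node.children[digit] = (name, value)
    else:
        # Two different names share this bucket: push both one level down.
        sub = Shard()
        node.children[digit] = sub
        put(sub, child[0], child[1], depth + 1)
        put(sub, name, value, depth + 1)


def list_names(node):
    """Blind walk: yields every entry below `node`, whatever its depth."""
    for child in node.children.values():
        if isinstance(child, Shard):
            yield from list_names(child)
        else:
            yield child[0]


def resolve(node, name, depth=0):
    """Hash-guided lookup that assumes `node` sits at `depth` (0 for a root)."""
    child = node.children.get(digits(name)[depth])
    if isinstance(child, Shard):
        return resolve(child, name, depth + 1)
    if child is not None and child[0] == name:
        return child[1]
    raise KeyError(f'no link named "{name}"')


def can_resolve(node, name):
    try:
        resolve(node, name)
        return True
    except KeyError:
        return False


# Build a toy sharded directory and pick one of its intermediate nodes.
root = Shard()
for i in range(200):
    put(root, f"file-{i}", i)

intermediate = max(
    (c for c in root.children.values() if isinstance(c, Shard)),
    key=lambda shard: len(shard.children),
)
listed = list(list_names(intermediate))
broken = [n for n in listed if not can_resolve(intermediate, n)]
print(f"{len(listed)} names listed under the intermediate node, "
      f"{len(broken)} of them do not resolve when it is treated as a root")
```

In this model every name resolves from the true root, but names listed under the intermediate node generally fail to resolve when that node is used as the starting point, which mirrors the ls/get mismatch reported above.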

@lidel lidel changed the title from "ipfs get: link resolution not working for HAMT-sharded directories" to "ipfs get & cat: link resolution not working for HAMT-sharded directories" May 25, 2021
@lidel
Member

lidel commented May 25, 2021

Indeed, /wiki/ is sharded, and the command below works fine (0.9.0-rc1):

$ ipfs get /ipfs/QmT5NvUtoM5nWFfrQdVrFtvGfKFmG7AHE8P34isapyhCxX/wiki/Anasayfa.html
Saving file(s) to Anasayfa.html

The CID listed here (QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr) is intermediate:
https://explore.ipld.io/#/explore/QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr

A real HAMT root, such as /wiki/, looks like this:

https://explore.ipld.io/#/explore/QmRNXpMRzsTHdRrKvwmWisgaojGKLPqHxzQfrXdfNkettC

Maintenance Priorities - Go automation moved this from Backlog to Done May 25, 2021
@dokterbob
Contributor Author

dokterbob commented Jun 16, 2021

> It looks like these blocks are intermediate nodes in the sharded directory but I haven't confirmed that. The real bug here is that we don't verify the HAMT structure when listing, we just blindly walk the DAG.

It's quite possible these are intermediate nodes, but that does not seem to me a sound reason for the API to break.

If I can list the links from one resource to another, I should be able to follow those links, shouldn't I?
This is a core assumption in the ipfs-search.com crawler: the ls command lists links, and the get command can then fetch them. Yet here the get command does not recognise a link that ls listed. It seems to me that one of the two commands needs to change, or else this specific exception to the API (resources cannot be fetched through intermediate HAMT nodes) should be documented, ideally together with guidance on how to identify HAMT nodes so that applications can properly account for it.
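
As a rough idea of what "identifying HAMT nodes" could look like on the application side, here is a minimal sketch that inspects the UnixFS metadata of a node. It assumes the application already holds the raw bytes of the dag-pb node's Data field (how those bytes are fetched is left out), and it relies on the UnixFS protobuf layout as I understand it: field 1 is the node type, with value 5 meaning HAMTShard, and field 6 is the fanout. The function names are made up for this example.

```python
HAMT_SHARD = 5  # UnixFS DataType value for a HAMT shard (assumed from the UnixFS schema)


def read_varint(buf, pos):
    """Decode one protobuf varint starting at `pos`; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def unixfs_fields(data):
    """Return {field_number: last value seen} for a UnixFS Data message.

    Only the two wire types this sketch needs are handled:
    0 (varint) and 2 (length-delimited bytes).
    """
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(data, pos)
        elif wire_type == 2:
            length, pos = read_varint(data, pos)
            value = bytes(data[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[number] = value
    return fields


def is_hamt_shard(data):
    """True if the UnixFS Type field (field 1) says this node is a HAMT shard."""
    return unixfs_fields(data).get(1) == HAMT_SHARD


# Hand-built example message: Type=5, a 1-byte Data field, hashType=0x22, fanout=256.
example = bytes([0x08, 0x05, 0x12, 0x01, 0xFF, 0x28, 0x22, 0x30, 0x80, 0x02])
print(is_hamt_shard(example), unixfs_fields(example).get(6))
```

Note that this only tells an application that a node belongs to a sharded directory. It is my assumption, not something established in this thread, that root and intermediate shards carry the same type, so this check alone would not tell them apart.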

I would really appreciate it if you would reconsider this and either reopen the issue or create a follow-up ticket for implementing the suggested API. Thanks!

@Stebalien
Member

You're absolutely correct. We should have failed at the ipfs ls point: #8196.

@dokterbob
Contributor Author

Thanks for picking up on this @Stebalien!

4 participants