Adding large directory (10k items) results in an error #1691

Closed
hsanjuan opened this issue Jun 14, 2022 · 4 comments
Labels
effort/days Estimated to take multiple days, but less than a week kind/bug A bug in existing code (including security flaws) P0 Critical: Tackled by core team ASAP

Comments

@hsanjuan
Collaborator

A user reported that adding a directory with 10k items to a cluster does not work well.

@hsanjuan hsanjuan added kind/bug A bug in existing code (including security flaws) P0 Critical: Tackled by core team ASAP effort/days Estimated to take multiple days, but less than a week labels Jun 14, 2022
@hsanjuan hsanjuan self-assigned this Jun 14, 2022
@hsanjuan hsanjuan added this to the Release v1.0.2 milestone Jun 14, 2022
@hsanjuan
Collaborator Author

Indeed, adding a directory with many files hits a "dagservice: block not found" error.

@hsanjuan
Collaborator Author

So the latest versions of go-unixfs that we use have a DynamicDirectory type which switches itself from BasicDirectory to HAMTDirectory (and back) depending on how many children have been added. When that switch happens, it reads all the children from the BasicDirectory to add them to the HAMTDirectory as new links.

I believe we can patch go-unixfs to not re-read children and just re-use links in the BasicDirectory to add them to the HAMT directory.

@hsanjuan
Collaborator Author

Basic testing suggests the problem is solved with ipfs/go-unixfs#120.

@hsanjuan hsanjuan changed the title Adding large directory (10k items) results in different CID as IPFS Adding large directory (10k items) results an error Jun 15, 2022
@hsanjuan hsanjuan changed the title Adding large directory (10k items) results an error Adding large directory (10k items) results in an error Jun 15, 2022
@hsanjuan
Collaborator Author

The problem from the cluster side is that our DAGService is write-only and we cannot really read blocks that have been written previously, since those blocks may have been written to the IPFS daemon of a completely different cluster peer. The assumption for adding is that we just write blocks in places, with no need to re-read them.

hsanjuan added a commit that referenced this issue Jun 15, 2022
hsanjuan added a commit that referenced this issue Jun 16, 2022
hsanjuan added a commit that referenced this issue Jun 16, 2022
…ries

Fix #1691: adding fails on large directories
hsanjuan added a commit that referenced this issue Jun 16, 2022
Fixes #1691 by updating to the latest go-unixfs and adding a test.

The test is verified to fail on the previous go-unixfs version.