0.4.5: repo does not scale beyond ~ 8000 files #3621
Comments
Thank you, this is great. We should:
My bet is that it might be the pin sharding we are currently using. It is enabled at 8Ki items in the pin set, and causes at least 256 objects to be created with each add (a change of the sharding seed; IDK why it was designed this way). This shouldn't be a 0.4.5 blocker.
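To make that mechanism concrete, here is a minimal Go sketch of a sharded pin set. This is an illustration only, under assumed names and constants (`bucketFor`, `maxItems`, `fanout` are hypothetical), not the actual pin/set.go code:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	maxItems = 8192 // threshold at which the flat pin list gets sharded
	fanout   = 256  // bucket objects per level
)

// bucketFor picks one of the 256 buckets for a pin by mixing a seed into
// the hash. A different seed moves essentially every pin to a different
// bucket.
func bucketFor(seed uint32, pin string) uint32 {
	h := fnv.New32a()
	fmt.Fprintf(h, "%d:%s", seed, pin)
	return h.Sum32() % fanout
}

func main() {
	// With a fresh random seed on every write, the two calls below disagree
	// for roughly 255/256 of all pins, so each flush re-creates all 256
	// bucket objects instead of only the bucket that actually changed.
	fmt.Println(bucketFor(1, "QmExample"), bucketFor(2, "QmExample"))
}
```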
Even without the 8192 limit, the overhead before a gc seems rather high (roughly 2200%). @Kubuxu anyway... I think you are right: this is not a blocker for 0.4.5, but people should be made aware that adding a lot without garbage collecting in between is not possible.
Oh... that must be the pinset stupidity. I made a note to fix that last quarter and didn't. Thank you for bringing it up and making it a priority.
In the meantime, depending on the dataset, adding without pinning should not trigger this bug. @mguentner do you mind rerunning the tests that created that graph with --pin=false?
For reference, ipfs 0.4.4 with the same dataset (pinned):
I think a quick backwards-compatible change we can make to fix this is to fix the random seed thing in the pinsets. That should prevent this exponential explosion in the short term. Long term, we need to find a better data structure to manage the pins. @mguentner could you get me a script that generates these graphs? That would be super helpful.
@mguentner nvm, found your script linked above. sorry about that |
Yeah, making the random seed in pin/set.go return 42 causes the numbers to look much better: https://ipfs.io/ipfs/QmaiQCNM7bURXHtmfe9BXFsZfjgXj7NsSobobc9u2iRYQJ
I see no good reason why we should continue using a random number there... I'm actually not certain what use it had in the first place (I think it was trying to make sure the set didn't get weirdly unbalanced).
Ah, the random number was used to ensure that when placing items in nested buckets, they didn't all end up in the same spot. Say 1000 entries hash into bucket slot 50; we then recursively add those 1000 entries into their own 'sub-set'. Without the random number, they would all hash into bucket slot 50 of the sub-bucket, and we would likely get some uber-recursion stack-overflow sort of thing. What's weird is that the test I ran didn't appear to trigger that scenario... I need to investigate the conditions that lead to that issue.
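The hazard is easy to see in a sketch that reuses the hypothetical `bucketFor` and `maxItems` from the block above: with one constant seed at every level, all keys that collide in a top-level slot collide again in the sub-set, so the split never makes progress.

```go
// insert shards an oversized set into buckets and recurses. With the SAME
// seed at every level, every item in a given bucket re-hashes to a single
// slot of its sub-set: len(sub) == len(items) below, the recursion never
// shrinks, and the stack eventually overflows.
func insert(items []string, seed uint32, depth int) {
	if len(items) <= maxItems {
		return // small enough to store as a flat leaf
	}
	buckets := make(map[uint32][]string)
	for _, it := range items {
		b := bucketFor(seed, it)
		buckets[b] = append(buckets[b], it)
	}
	for _, sub := range buckets {
		insert(sub, seed, depth+1)
	}
}
```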
(Haven't found the problem yet, just logging my thoughts here.) One easy solution is to just use the subtree depth as the seed instead of a random number. It's deterministic AND solves the bucket collision issue.
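A sketch of that fix, again on top of the hypothetical helpers above: seeding with the subtree depth is deterministic (re-writing the set no longer touches every bucket) while still differing per level (colliding items spread out again one level down).

```go
// insertFixed is identical to insert except that the seed is derived from
// the depth, so each level of the tree uses a different, deterministic hash.
func insertFixed(items []string, depth int) {
	if len(items) <= maxItems {
		return
	}
	buckets := make(map[uint32][]string)
	for _, it := range items {
		b := bucketFor(uint32(depth), it) // seed = depth, not rand
		buckets[b] = append(buckets[b], it)
	}
	for _, sub := range buckets {
		insertFixed(sub, depth+1) // next level re-distributes collisions
	}
}
```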
Make that comment about the random number a comment in the code, for the next random developer wondering ;)
Figured out the recursion issue that the random number is meant to prevent, and confirmed that using the depth of the tree instead of a random number solves the problem as well as preventing the exponential object creation issue. Will push some code soon.
As for the graph / test case: this could be made redundant by a repo that constantly pins large real datasets (e.g. the ongoing data.gov effort or an existing cdnjs mirror) as a natural test case. Performance regressions could then be detected by how long it takes to version-update these datasets.
(Drawing a parallel with nix and the nixpkgs hydra builds: in this scope, ipfs is to data as nix is to software.)
Version information:
go-ipfs version: 0.4.5-dev-
Repo version: 5
System version: amd64/linux
Golang version: go1.7.4
Type:
Bug
Priority:
P0
Description:
When adding a lot of files locally (or using the daemon without --enable-gc), the repo grows even faster in size after a certain limit. The limit seems to be around 8000 files; my gut tells me it's 8192 💫
Unless you have a lot of space with good IO, this makes it very hard to import files without garbage collecting in between (say every 1k files).
This is probably a blocker for 0.4.5.
Here is a fancy graph that displays the problem:
After running ipfs repo gc, the repo size shrinks from 16 GiB to 78 MiB.
If you want to reproduce it, go and fetch QmcsrSRuBmxNxcEXjMZ1pmyRgnutCGwfAhhnRfaNn9P94F @ ipfs.io
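A rough reproduction recipe pieced together from the description above (a sketch only: the output directory name is made up, and it assumes the stock ipfs get/add/repo commands):

```
ipfs get QmcsrSRuBmxNxcEXjMZ1pmyRgnutCGwfAhhnRfaNn9P94F -o dataset
ipfs add -r dataset   # repo size balloons once the pin set passes ~8192 items
ipfs repo stat        # note RepoSize
ipfs repo gc          # shrinks the repo from ~16 GiB back to ~78 MiB
ipfs repo stat
```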