0.4.5: repo does not scale beyond ~ 8000 files #3621

Closed
mguentner opened this issue Jan 22, 2017 · 15 comments · Fixed by #3640
Labels: kind/bug (A bug in existing code, including security flaws) · topic/perf (Performance) · topic/technical debt
Milestone: ipfs 0.4.6

mguentner commented Jan 22, 2017

Version information:

go-ipfs version: 0.4.5-dev-
Repo version: 5
System version: amd64/linux
Golang version: go1.7.4

Type:

Bug

Priority:

P0

Description:

When adding a lot of files locally (or using the daemon without --enable-gc), the repo starts growing in size much faster once a certain number of files is reached. The threshold seems to be around 8000; my gut tells me it's 8192 💫. Unless you have a lot of disk space with good IO, this makes it very hard to import files without garbage collecting in between (say, every 1k).
This is probably a blocker for 0.4.5.

Here is a fancy graph that displays the problem:
[graph: repo size while adding 12000 files]

After running ipfs repo gc the repo size shrinks from 16 GiB to 78 MiB

If you want to reproduce it, fetch QmcsrSRuBmxNxcEXjMZ1pmyRgnutCGwfAhhnRfaNn9P94F from ipfs.io
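
The measurement itself is simple: add the files one by one and sample the repo's on-disk size as you go. Here is a minimal sketch of that idea (not my actual script; the testdata path, the sampling interval, and the assumption that the repo lives at the default IPFS_PATH are placeholders):

```go
// repogrowth.go: add files with `ipfs add --quiet` and record the repo's
// on-disk size every 100 adds, printing CSV rows that can be plotted.
// Sketch only -- not the exact script used for the graphs in this issue.
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// repoSize walks the IPFS repo directory and sums all file sizes.
func repoSize(repoPath string) (int64, error) {
	var total int64
	err := filepath.Walk(repoPath, func(_ string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() {
			total += info.Size()
		}
		return nil
	})
	return total, err
}

func main() {
	repo := os.Getenv("IPFS_PATH")
	if repo == "" {
		repo = filepath.Join(os.Getenv("HOME"), ".ipfs") // default repo location (assumption)
	}
	files, err := filepath.Glob("testdata/*") // hypothetical directory of ~12000 small files
	if err != nil {
		panic(err)
	}
	for i, f := range files {
		// `ipfs add --quiet` prints only the resulting hash.
		if out, err := exec.Command("ipfs", "add", "--quiet", f).CombinedOutput(); err != nil {
			panic(fmt.Errorf("ipfs add %s: %v: %s", f, err, out))
		}
		if i%100 == 0 { // sample the repo size every 100 adds
			size, err := repoSize(repo)
			if err != nil {
				panic(err)
			}
			fmt.Printf("%d,%d\n", i, size) // CSV: files added, repo bytes
		}
	}
}
```

Each CSV row (files added, repo bytes) can then be plotted to produce a graph like the one above.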

jbenet (Member) commented Jan 22, 2017

Thank you, this is great. We should:

  • add this as a test case to go-ipfs (i.e. verify smooth scaling of repo size)
  • have a way to get graphs/reports like this

Kubuxu (Member) commented Jan 22, 2017

My bet is that it's the pin sharding we are currently using. It kicks in at 8Ki items in the pin set and causes at least 256 objects to be created with each add (the sharding seed changes on every write; IDK why it was designed this way).

This shouldn't be a 0.4.5 blocker.
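
Roughly, the situation looks like this (a simplified model of the behaviour, not the real pin/set.go code; the fanout constant and names are made up for illustration):

```go
// Simplified model of a sharded pin set with a fanout of 256 buckets.
// Illustrative only -- not the actual go-ipfs pinset implementation.
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

const defaultFanout = 256

// bucketFor mixes the seed into the hash of the key, so a different seed
// reshuffles which bucket every pin lands in.
func bucketFor(seed uint32, key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte{byte(seed), byte(seed >> 8), byte(seed >> 16), byte(seed >> 24)})
	h.Write([]byte(key))
	return h.Sum32() % defaultFanout
}

func main() {
	pins := make([]string, 9000) // past the 8Ki threshold, so sharding is active
	for i := range pins {
		pins[i] = fmt.Sprintf("pin-%d", i)
	}

	// If the seed is re-randomized every time the pin set is written, almost
	// every pin moves to a different bucket, so all 256 bucket objects change
	// and have to be written out again -- even though only one pin was added.
	seedA, seedB := rand.Uint32(), rand.Uint32()
	moved := 0
	for _, p := range pins {
		if bucketFor(seedA, p) != bucketFor(seedB, p) {
			moved++
		}
	}
	fmt.Printf("%d of %d pins changed bucket between two writes\n", moved, len(pins))
}
```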

mguentner (Author) commented Jan 22, 2017

Even without the 8192 limit, the overhead before a gc seems rather high (roughly 2200%).

@Kubuxu anyway... I think you are right. This is not a blocker for 0.4.5, but people should be made aware that adding a lot of files without garbage collecting in between is not feasible.

whyrusleeping (Member) commented:

Oh... that must be the pinset stupidity. I made a note to fix that last quarter and didn't. Thank you for bringing it up and making it a priority.

whyrusleeping added this to the ipfs 0.4.6 milestone on Jan 24, 2017
whyrusleeping added the kind/bug, topic/perf, and topic/technical debt labels on Jan 24, 2017
whyrusleeping (Member) commented:

In the meantime, depending on the dataset, adding without pinning should not trigger this bug. @mguentner, do you mind rerunning the tests that created that graph with --pin=false?

mguentner (Author) commented:

For reference, ipfs 0.4.4 with the same dataset (pinned):
[graph]

And now ipfs 0.4.5 without pinning, as requested by @whyrusleeping (mind the scale):
[graph]

whyrusleeping (Member) commented:

I think a quick, backwards-compatible change we can make to fix this is to fix the random seed in the pinsets. That should prevent the exponential object explosion in the short term. Long term, we need to find a better data structure to manage the pins.

@mguentner could you get me a script that generates these graphs? That would be super helpful.

whyrusleeping (Member) commented:

@mguentner nvm, found your script linked above. Sorry about that.

whyrusleeping (Member) commented:

Yeah, making the random seed in pin/set.go return 42 makes the numbers look much better: https://ipfs.io/ipfs/QmaiQCNM7bURXHtmfe9BXFsZfjgXj7NsSobobc9u2iRYQJ

I see no good reason why we should continue using a random number there... I'm actually not certain what purpose it had in the first place (I think it was trying to make sure the set didn't get weirdly unbalanced).

whyrusleeping (Member) commented:

Ah, the random number was used to ensure that when placing items in nested buckets, they didn't all end up in the same spot. Say 1000 entries hash into bucket slot 50; we then recursively add those 1000 entries into their own 'sub-set'. Without the random number, they would all hash into bucket slot 50 of the sub-bucket as well, and we would likely get some uber-recursion-stack-overflow sort of thing. What's weird is that the test I ran didn't appear to trigger that scenario... Need to investigate the conditions that lead to that issue.
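
To make the hazard concrete, here is a simplified sketch (not the real pinset code; the fanout and key names are made up): with no per-level variation in the hash, keys that collide in one bucket collide again in the sub-set, so the recursive split never makes progress.

```go
// Sketch of the recursion hazard: same hash function at every level.
// Illustrative only -- not the actual go-ipfs pinset code.
package main

import (
	"fmt"
	"hash/fnv"
)

const fanout = 256

// bucket has no per-level variation: a given key always lands in the same slot.
func bucket(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % fanout
}

func main() {
	// Gather 100 keys that all hash into the same top-level bucket slot.
	target := bucket("entry-0")
	var colliding []string
	for i := 0; len(colliding) < 100; i++ {
		k := fmt.Sprintf("entry-%d", i)
		if bucket(k) == target {
			colliding = append(colliding, k)
		}
	}

	// Re-bucketing them for the sub-set with the same function puts all of
	// them back into a single slot -- the "1000 entries in slot 50" case.
	slots := map[uint32]int{}
	for _, k := range colliding {
		slots[bucket(k)]++
	}
	fmt.Printf("%d colliding keys spread across %d sub-bucket(s)\n", len(colliding), len(slots))
	// Output: 100 colliding keys spread across 1 sub-bucket(s)
}
```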

whyrusleeping (Member) commented:

(haven't found the problem yet, just logging my thoughts here)

One easy solution is to just use the subtree depth as the seed instead of a random number. It's deterministic AND solves the bucket collision issue.
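
Something along these lines (illustrative names only, not the actual patch): mixing the recursion depth into the hash keeps placement deterministic while still spreading same-bucket collisions apart at the next level.

```go
// Sketch of the depth-as-seed idea. Illustrative only; the real change
// lives in pin/set.go.
package pinset

import "hash/fnv"

const fanout = 256

// bucketFor mixes the recursion depth into the hash. Placement is fully
// deterministic (no random seed to persist), and keys that collide in one
// bucket at depth d get spread across different buckets at depth d+1
// because the hash input differs.
func bucketFor(key string, depth uint32) uint32 {
	h := fnv.New32a()
	// Prefix the key with the depth as four little-endian bytes.
	h.Write([]byte{byte(depth), byte(depth >> 8), byte(depth >> 16), byte(depth >> 24)})
	h.Write([]byte(key))
	return h.Sum32() % fanout
}
```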

ghost commented Jan 28, 2017

Make that comment about the random number a comment in the code, for the next random developer wondering ;)

whyrusleeping (Member) commented:

Figured out the recursion issue that the random number is meant to prevent, and confirmed that using the depth of the tree instead of a random number solves the problem while also preventing the exponential object-creation issue. Will push some code soon.

rht (Contributor) commented Jan 28, 2017

As for the graph / test case: this would be redundant with having a repo that constantly pins large real datasets (e.g. the ongoing data.gov or the existing cdnjs) as a natural test case. Performance regressions could then be detected by how long it takes to version-update these datasets.

rht (Contributor) commented Jan 28, 2017

(drawing the parallel with nix and the nixpkgs-hydra build -- in this scope, ipfs is to data as nix is to software)
