IPFS Repo GC Is Unrealistic At Non-Trivial Scale: rm -rf + resync is faster
#7213
Comments
This is an experience report, not a bug report.
Expanding on that, we are very well aware that go-ipfs's garbage collection system does not scale to large repos. However, debating it to death in yet another bug report isn't going to bring us closer to fixing the issue.
I disagree that this is an "experience report". Taking days to run GC is a bug, but to each their own.
I agree we need to fix this issue. However, we have a limited amount of developer bandwidth split across a massive project. I closed this issue because we have many open issues on the same topic and yet another "this sucks" issue isn't going to get us any closer to fixing it:
If you do manage to fix this issue, I'd be happy to accept a patch.
Why couldn't we have a graph-style index that keeps track of all the IPFS object refs/descendants, rather than having to query the entire datastore for GC?
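To make the question above concrete, here is a minimal sketch of what such an index could look like, written as an in-memory reference counter. Everything in it (`CID` as a plain string, `RefIndex`, `AddBlock`, `Pin`, `Unpin`) is hypothetical and invented for illustration; it is not go-ipfs code and not a proposed API. It also assumes blocks are registered before they are pinned and that the graph is acyclic, as content-addressed DAGs are.

```go
package main

// CID is a stand-in for a content identifier (hypothetical; a real node
// would use a proper CID type).
type CID string

// RefIndex is a hypothetical reference-count index. For every live block it
// stores how many live parents (pins or other live blocks) point at it.
type RefIndex struct {
	children map[CID][]CID // block -> its direct links
	refs     map[CID]int   // block -> number of live incoming references
}

// NewRefIndex returns an empty index.
func NewRefIndex() *RefIndex {
	return &RefIndex{children: map[CID][]CID{}, refs: map[CID]int{}}
}

// AddBlock records a block and its direct links. It must be called before
// the block (or any ancestor of it) is pinned.
func (ix *RefIndex) AddBlock(c CID, links []CID) {
	ix.children[c] = links
}

// Pin adds one reference to root and, if root just became live,
// propagates a reference to each of its children.
func (ix *RefIndex) Pin(root CID) {
	ix.incr(root)
}

func (ix *RefIndex) incr(c CID) {
	ix.refs[c]++
	if ix.refs[c] == 1 { // first reference: the children become live too
		for _, ch := range ix.children[c] {
			ix.incr(ch)
		}
	}
}

// Unpin drops one reference from root and returns the blocks that became
// unreferenced as a result. Only the affected subgraph is visited.
func (ix *RefIndex) Unpin(root CID) []CID {
	var freed []CID
	ix.decr(root, &freed)
	return freed
}

func (ix *RefIndex) decr(c CID, freed *[]CID) {
	if ix.refs[c] == 0 {
		return // not live, nothing to release
	}
	ix.refs[c]--
	if ix.refs[c] == 0 {
		delete(ix.refs, c)
		*freed = append(*freed, c)
		for _, ch := range ix.children[c] {
			ix.decr(ch, freed)
		}
	}
}
```

The trade-off in a design like this is that every pin and unpin has to update the index, and the index itself must be persisted and kept consistent with the datastore. In exchange, the work done on unpin is proportional to the size of the unpinned graph rather than to the size of the whole repo.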
Version information:
Description:
At any non-trivial scale, once an IPFS node holds several hundred thousand pins, let alone a million or more, running garbage collection on a go-ipfs node is basically an impossible task. During a GC run on a node with ~750k pins, not one pin was removed in 30 minutes, and the on-disk size of the go-ipfs repository did not shrink by a single byte.
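As a rough illustration of why the cost grows with repo size, here is a toy mark-and-sweep pass over an in-memory store. It assumes GC works by first walking everything reachable from the pins and then scanning every block in the store. The `Store` type and `MarkAndSweep` function are made up for this sketch and are not the go-ipfs implementation.

```go
package main

// CID is a stand-in for a content identifier (hypothetical).
type CID string

// Store is a toy in-memory blockstore mapping each block to its direct links.
type Store map[CID][]CID

// MarkAndSweep deletes every block not reachable from pins. It returns the
// removed blocks and the number of block visits it performed, to show that
// the work depends on the total store size, not on the amount of garbage.
func MarkAndSweep(store Store, pins []CID) (removed []CID, visited int) {
	// Mark phase: walk the full graph under every pin.
	live := make(map[CID]bool)
	var mark func(c CID)
	mark = func(c CID) {
		if live[c] {
			return
		}
		live[c] = true
		visited++
		for _, ch := range store[c] {
			mark(ch)
		}
	}
	for _, p := range pins {
		mark(p)
	}

	// Sweep phase: every block in the store is examined, live or not.
	for c := range store {
		visited++
		if !live[c] {
			removed = append(removed, c)
		}
	}
	for _, c := range removed {
		delete(store, c)
	}
	return removed, visited
}
```

In this sketch a store with millions of live blocks and a handful of garbage blocks still pays for a full traversal plus a full scan before anything is freed.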
In practice, running rm -rf /path/to/datastore and resyncing the node is astronomically faster than a full garbage collection. This approach is not only faster, it also doesn't block your IPFS node to the extent that garbage collection does. I can't think of an alternative that doesn't involve waiting hours, if not days, for a full garbage collection on an IPFS node. This is a pretty big concern, and it makes go-ipfs infeasible outside of hobby and test environments.