Running out of disk space #1258
Thank you for reporting the bug! We will have a look into it shortly.
Tangential suggestion: it should be pretty easy to estimate the maximum possible disk usage from the db-capacity setting.
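For what it's worth, a back-of-the-envelope version of that estimate, assuming 4 KiB Swarm chunks and ignoring leveldb index overhead (both assumptions on my part):

```
# db-capacity (in chunks) times the 4 KiB chunk size gives a rough floor
# for localstore disk usage; real usage is higher due to index overhead.
DB_CAPACITY=5000000   # the default discussed in this thread
echo "$(( DB_CAPACITY * 4096 / 1024 / 1024 / 1024 )) GiB"   # prints: 19 GiB
```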
I can reproduce this problem. It looks like disk space accounting does not include uploaded files. When I restart swarm, a ton of disk space is immediately freed up.
Hm, bee isn't releasing all of the disk space even after a restart.
Please give as much information as you can about what you did before this problem surfaced.
Yes, I tried 10 million. Once I realized that disk space management wasn't working, I reduced it back to 5 million.
On one node, I probably uploaded faster than syncing could keep up. For example, maybe I uploaded 30GB of data to the node very quickly and then waited for it to sync.
If you can provide some guidance about how to avoid triggering the issue, that would also help. I gather that I shouldn't mess with the db-capacity setting. Also, I should not upload too fast? I was trying to find where the limits were, to help with testing, but I am content to stay within expected user behavior too. I'm curious to hear from @alsakhaev as well.
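Since db-capacity keeps coming up: for reference, here is how the setting can be passed explicitly. The flag name matches what this thread calls it, but check `bee start --help` on your version before relying on it.

```
# Start bee with an explicit localstore capacity of 5 million chunks
# (the default at the time of this thread).
bee start --db-capacity 5000000
```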
+1: started a node on a Raspberry Pi with a 32GB SD card; it ran out of disk space after 10 hours.
+1: I have set up docker-based nodes, and all of their localstores have easily surpassed the db-capacity limit; they now use between 30GB and 40GB.
+1: Running multiple bees in Kubernetes containers. Each bee exhausts its disk space allocation (doubling the db-capacity has no effect besides consuming more space, which is then exceeded as well).
Thanks all for the comments and reports. We are releasing soon and have included several improvements that aim to address this issue. We would greatly appreciate it if you could try it out and report back here.
I can confirm that the problem is still present when running 0.5.3.
This issue can be reliably reproduced with a Raspberry Pi.
I am running 0.5.3 with the default db-capacity. I can see that bee is garbage collecting while consuming more space at the same time. Once garbage collection falls behind and disk usage reaches 100%, nothing works anymore. The log keeps reporting "no space left" and garbage collection also stops working.
The bug has a severe impact on the entire network because people are just purging the localstore.
It should be resolved with the latest release. However, the problem is multi-tiered, so shipping a database migration to fix a problem that is already exacerbated on some nodes was not trivial.
Any plans to publish guidance on this? In particular, how to detect whether the issue exists on a node, so that we don't just start nuking everything and dropping retrievability of chunks already stored in the swarm.
OK, I'm out of ideas now. Sorry, but the node is going to store what it thinks it needs to store. Nuking your DB periodically will just chew up your bandwidth and strain your neighborhood, which has to push it all back to you, not to mention risk actually dropping the only copy of some chunks that your node was supposed to store. Unless there's still a lurking issue somewhere that hasn't been uncovered and isn't visible with the metrics we currently have available.
One more final thought after having gone back and re-read everything: if you are still uploading through this node, are you asking it to pin the content on upload? Pinned content is not garbage-collected and is also not counted against the db-capacity configuration. But it is all dropped on a db nuke, as far as I can tell.
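A quick way to check for that, assuming the bee API is on its default port 1633 (the endpoint path may differ between versions):

```
# List all pinned references; an empty list means pinned content
# is not what is eating the disk.
curl -s http://localhost:1633/pins
```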
OK, I'm happy to wait until my disk fills up. Maybe the devs will figure out a solution by then.
I am not. I haven't tried to upload for months.
@jpritikin I am adding a new debug endpoint for this.
Also, @jpritikin, please don't use that build to run bee normally; use the current stable version for normal operation.
Here is the output:
w00t. And this takes how many gigs? Can we have a size breakdown as well?
Here you go.
Can you also provide the output of your topology endpoint?
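Something along these lines should do it, assuming the debug API is enabled on its default port 1635:

```
# Dump the node's kademlia topology to a file for sharing.
curl -s http://localhost:1635/topology > topo.txt
```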
Here is the topology output: topo.txt
Thanks. Would you be able to post the …?
Thanks @jpritikin, this was very helpful. I have a possible direction on the problem. Since you can reproduce it, could you try the following please? Many thanks in advance!
Okay, I'm running this code. |
Disk usage is up to 21.5GiB. |
@jpritikin so if I understand correctly, everything is OK running with this fix?
No, I commented 4 hours ago because that's how long it took for the disk to fill up to that point. The test is beginning now, not ending. Case in point: the disk usage is now up to 26GiB. So I would say that the fix has failed to cure the problem. 😿
The limit on the number of chunks stored is not a hard limit, for reasons that would be difficult to explain here. Let's start by evaluating the size of the sharky subdirectory over time; please update us here on your findings. The leveldb part (the localstore root) does not get GC'd as often, and when it does, it runs on its own conditions (not a full GC). Let's start with the sharky dir stats and then see how to proceed.
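A minimal sketch for tracking that over time, assuming the default data dir of ~/.bee (adjust the path to your data-dir setting):

```
# Append a timestamped size sample for the sharky dir every 10 minutes.
while true; do
  printf '%s %s\n' "$(date -Is)" \
    "$(du -sh ~/.bee/localstore/sharky | cut -f1)" >> sharky-size.log
  sleep 600
done
```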
Ah, well, maybe there is nothing wrong? Is this the output you need?
I am using btrfs for the bee data directory.
Looks like I was confused by how btrfs reports disk usage.
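On btrfs, apparent size and actual allocation can diverge (compression, copy-on-write), so it helps to compare the two views; the path here assumes the default data dir:

```
# Apparent size, as regular tools report it:
du -sh ~/.bee/localstore
# Space btrfs has actually allocated (requires btrfs-progs; may need root):
sudo btrfs filesystem du -s ~/.bee/localstore
```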
Let's leave it open for now and continue tracking the issue.
Should I upgrade to v1.6 or is there still value in running the custom build? |
Yes, you can, as we ported the fix into the release 👍
Hm, something is still broken? I upgraded to 1.6.2 today and noticed that the disk was filling up again.
Summary
We are having a number of problems running our bee node, which serves the swarm downloader (our hackathon project).
Steps to reproduce
wget https://github.com/ethersphere/bee/releases/download/v0.5.0/bee_0.5.0_amd64.deb
sudo dpkg -i bee_0.5.0_amd64.deb
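To confirm the install took, something like the following should work (the service name is an assumption based on the package defaults):

```
bee version                 # confirm the installed binary version
sudo systemctl status bee   # the .deb should have installed a systemd unit
```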
Expected behavior
I expect to see 5GB of free space :)
Actual behavior
The node keeps consuming disk space until the disk is full.
Config: /etc/bee/bee.yaml
Uncommented lines from config file: