Filer hangs or restarts on deleting large buckets #1715
Could you please try running 2.18 with this option?
I have updated my setup to 2.18. I'm running s3 with filer.
Should I add this option there? OK, I've added it as
I'll check if it helps.
How is it going?
Could you test the speed of removing the bucket? E.g.,
Hi Chris. Unfortunately, it seems like it didn't help. It has failed several times already. I'll try to collect some logs on the next failure.
So here's what just happened. I tried to delete a bucket at 00:25. After a while the filer became unresponsive from both its API and
My setup is a "master" (virtual) machine with 8 GB of RAM running
The logs don't seem to be very informative, though. Master:
Filer (not posting the whole stack trace because it's just complaints about me killing it with signal 9):
Volume:
A typical bucket is something like the following, ~3.5M files
Please show the stack trace that was truncated.
Here it is.
The latest git tip has added a new filer store; see 629c996
Oh, cool. Will it be included in the next release?
Try the latest version here: https://github.com/chrislusf/seaweedfs/releases/tag/dev
Thanks, I'll give it a shot and write back.
So, I have migrated to
Can you please share a screenshot of the
It looks almost the same as the previous version. BTW, I have updated only the master and filer; should I upgrade the volume server as well?
There should be some bug. Need to debug. Deleting a bucket should just be removing the volume collection and the LevelDB folder, e.g. the
Yeah, I guess so, but on running those commands the whole shell hangs and usually spits some error after a while, or I just have to kill it manually.
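For context, the two manual steps mentioned above (dropping the volume collection, then removing the LevelDB folder) might look roughly like the following sketch. The exact weed shell flags, the master address, the bucket name, and the store path are all assumptions and depend on the version and configuration:

```shell
# 1) Drop the volume collection for the bucket via weed shell
#    (master address and bucket name are placeholders)
echo "collection.delete -collection day-2021-01-01" | weed shell -master localhost:9333

# 2) Remove the bucket's LevelDB folder on the filer host
#    (the store path is a placeholder; it depends on the filer configuration)
rm -rf /path/to/filer/store/day-2021-01-01
```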
Added a fix: f17aa1d. Just need to update the filer.
Things are definitely better with that fix. The shell is still hanging for a while, but at least it's not that crucial; the filer doesn't eat memory, stays at the bare minimum, and definitely isn't freezing. The delay seems to be caused by removing collections now. I can reproduce it by manually deleting them through
I did not expect that deleting volumes could be that slow. How many volumes are there, and how large are they? And was the master started with
I have 2
Some logs from
You might notice the 4-minute gap between operations. Also, I've logged to
Why is this repeated twice in the logs? Can I see the full logs?
Sure, here it is: https://gist.github.com/divanikus/cfd311591400c0b700de1315f0922bce My cleanup script kicked in somewhere around 15:58, I guess. It simply deletes buckets one by one.
How did you delete the bucket? Please share your script.
The 4-minute gap is just the heartbeat timing difference between the two volume servers.
Not the full code, but I guess it would be enough:

```ruby
require 'json'

# send_http and time_difference are helper methods defined elsewhere in the script
resp = send_http("http://localhost:8888/buckets", "Get", {'Accept' => 'application/json'}, {limit: 1000})
JSON.parse(resp.body)["Entries"].each do |bucket|
  # only delete buckets older than the retention threshold
  if time_difference(bucket["Collection"])
    begin
      resp = send_http("http://localhost:8888/buckets/#{bucket["Collection"]}", "Delete", {}, {recursive: 'true'})
      if resp.code == "204"
        puts "bucket #{bucket["Collection"]} deleted!"
      else
        puts "bucket #{bucket["Collection"]} delete failed!"
      end
    rescue
      puts "Http request error!"
    end
  end
end
```
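For reference, the raw HTTP call this script makes is a DELETE against the filer with `recursive=true`; the same operation can be reproduced with curl. The host, port, and bucket name below are examples, not values from the thread:

```shell
# Delete a bucket directory via the filer HTTP API; recursive=true
# removes all entries underneath it. Host/port and bucket name are placeholders.
curl -X DELETE "http://localhost:8888/buckets/day-2021-01-01?recursive=true"
```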
Seems to be working fine now.
Describe the bug
I'm running a single-master, single-filer setup with an s3 gateway. The filer's store is leveldb2.
I store lots of small files (up to 1 MB) in separate per-day buckets. Files are stored in nested directories (not one dir for all). It works pretty well, unless I try to drop a bucket, either through the API or `bucket.delete`. The filer might just hang, with other components spitting `rpc error: code = Unavailable desc = transport is closing`, or simply restart. Deleting collections is breezily fast, though. Am I missing something, so I can painlessly delete a whole bucket in one operation? Or should I move to another filer store?

System Setup
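For context, the `bucket.delete` path above goes through the interactive weed shell; a minimal sketch of the invocation, assuming a master on the default port and an example bucket name (both are placeholders, not values from this report):

```shell
# Open weed shell against the master (localhost:9333 is assumed)
# and delete a bucket; "day-2021-01-01" is a placeholder bucket name.
echo "bucket.delete -name day-2021-01-01" | weed shell -master localhost:9333
```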