Trash Does Not Clean Up #702
My current configuration: In
In
It's still growing, currently:
Now it seems to clean up the trash: Two hours later:
One more hour later:
Three more hours later:
But the questions above still remain: How can I immediately clear the trash?
I think your trash is perhaps too large for ls. Try using find instead. find is faster, as it does not attempt to stat each file as it goes. If you just want to purge the trash, find can also remove as it goes with -delete. 28 million files in one folder is quite a lot, so it can easily overwhelm things like ls.
Do you mean:
Or which command do you mean I should execute? OK, it is running (since ½h), but it does not seem to be a success:
`find /tmp/trash -print -delete`
Good call with the -print: it is probably working, but some verbosity is helpful to see what it is doing. It could take a very long time... You should quote your '{}' if you do it the original way; guestisp's way is better. Do you see a change in the CGI? Fewer trash files, maybe?
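For reference, the two variants would look roughly like this (the -exec form is only a guess at the "original way"; the -delete form is the one given above):

```bash
# -exec variant ("original way"): spawn rm on batches of matches; quote
# the braces so the shell leaves them alone.
find /tmp/trash -type f -exec rm -f '{}' +

# guestisp's variant: let find print and delete entries itself,
# no external rm processes needed.
find /tmp/trash -print -delete
```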
With the above command, every deleted file is also printed. Anyway, rsync is even faster: just sync an empty directory:
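A minimal sketch of that rsync trick, assuming the same /tmp/trash path as in the find command above:

```bash
# Sync an empty directory over the trash; --delete then removes
# everything that exists only on the destination side.
mkdir -p /tmp/empty
rsync -a --delete /tmp/empty/ /tmp/trash/
```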
The SSH connection aborts before it completes. Anyway, at least trash space is still decreasing:
@guestisp, @4Dolio, unfortunately it does not do anything at all. The meta filesystem is somehow defective!
But it is still more or less the same as before (it was 328 TiB / 28715421 just before I started).
Try this:
And does your mfsexports have a valid entry for that client for metadata?
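On the export question, a quick way to check would be something like this (the config path is an assumption about the default packaging):

```bash
# Print export lines whose path field is '.', i.e. metadata exports,
# and compare them against the client's IP.
awk '$2 == "."' /etc/mfs/mfsexports.cfg
```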
@4Dolio,
You are working on a massive set, so it cannot hurt.
Can you tab-complete /mnt/trash/und<Tab> or /mnt/res<Tab>?
Absolutely nothing, not with Tab, not with … So there is some massive bug in the meta filesystem!
Shrugs... there should be a /mnt/reserved, and I feel like it should still exist and be visible. It seems like you might not be connected properly, like an export problem. Sometimes a normal dataset mount will act oddly if, say, no chunkservers happen to be present, for example. Just grasping for clues.
If you set trash time to zero on a non-empty file, lock it, and delete it, I think it will show up in reserved until you let go of the lock (maybe from a different client?). .oO( I once managed to delete all data except locked reserved files, because LIO iSCSI was still active and held the lock... I manually rolled back the text changelog to before the deletion, rebuilt, restarted, saved the reserved chunks and was back online. I was lucky, don't try this in production ;) ) Maybe reserved no longer exists? I just rolled back my 3.10/3.12 to 2.6, so I cannot check. Maybe mfsmeta is bugged out? It is normally magical and awesome. You should just be able to find $magic delete to purge all or some of the trash. Sigh. I would say no more snapshots until it does trashtime 0 plus purges in near real time...
Maybe try to mount the mfsmeta from a different client?
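That would look roughly like the following (master hostname and mount point are placeholders):

```bash
# Mount the meta filesystem (-m) on another client and peek at the trash.
sudo mkdir -p /mnt/mfsmeta
sudo mfsmount -m /mnt/mfsmeta -o mfsmaster=lizardfs-master
ls /mnt/mfsmeta/trash | head
```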
Yes, @4Dolio. And yes, I already mounted directly on the master host and on other hosts too. It's the same everywhere. The normal filesystem works. All data seems to be there. Just meta is strange.
Could it be that deleting after the timeout is an expensive and slow operation? Could it be that I snapshot and delete faster (all data once per hour) than the data can be cleaned up? Could this be the reason for my crash 2 days ago, see #700? For now I removed the hourly snapshot and reduced to daily snapshots.
The existence of metadata.mfs.tmp indicates the master was dumping that file, but it did not finish the dump and rename it to metadata.mfs. And the lock indicates it did not exit cleanly and clear the lock. We don't know why it died, though, nor what unusual state that resulted in. I have never used the rremove you mentioned using, so I don't know what that does; it's too new for my experience. If you didn't modify trash times first, then you may still have 24 hours of retention after deletion, so the trash accumulates 24 * (object count per snapshot). Perhaps your trash object count is correct? Maybe you need to settrashtime 0 before the rremove so the files get purged more quickly and do not stack up and overload mfsmeta into your broken state. Maybe the broken state is only related to the crash? The only thing I can suggest is to wait for it to clear out the trash... maybe a dev can jump in with other things to try...
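A hedged sketch of that settrashtime-before-rremove idea (the snapshot path is made up for illustration):

```bash
# Set trash time to 0 recursively first, so the files removed next go
# straight to deletion instead of sitting in trash for 24 h, then remove.
lizardfs settrashtime -r 0 /mnt/mfs/snapshots/2018-03-01
lizardfs rremove /mnt/mfs/snapshots/2018-03-01
```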
Well, I had to hard-reset the computer, since it was no longer responsive.
Yesterday I decreased the trash time from 24h to 1h. But there are still files in the trash that I removed on Thursday. I can't see those files, but I know they are there, because they have missing chunks, I can still get the list of missing chunks, and they are still listed there. The missing chunks are from February, when my hard disk ran full. Yeah, still waiting…
Do you have free space available?
Now, plenty:
I added two more 10 TB disks… :)
Your trash object count is lower at least... 18 million... It would be interesting to know if and when mfsmeta begins to work again...
I'll keep you up to date. All my services have been down since Thursday. I hope I'll get them back running today. It's a Docker swarm running on top of LizardFS, where the nodes are both LizardFS chunk-/master-/logger-servers and swarm nodes at the same time. That used to work well for some months, but currently Docker on the swarm master is no longer responsive, so I migrated to another node. The migration is still in progress. You can find my configuration here.
Are you running LizardFS replication over Powerline? Seriously? Powerline has huge and unstable latency...
→ I updated my blog post.
Now, trash space begins to stabilize:
Now that I don't create hourly backups any more, and only delete one backup per day instead of one every hour, it's back to normal operation:
So: deletion of snapshots is much too slow!
Did your mfsmeta start working? True. And unfortunate. Long, long ago, continuous snapshotting was floated; it can't be done yet, nor is snapshotting really used as rapidly as hourly.
Yes!
OK, good. I wonder at what point it becomes broken?
Another way of deleting trash files is to mount the MFSMETA filesystem and trash/purge files manually. First of all, allow that server to mount LizardFS metadata by editing the exports file with the server's IPv4 address.

On the LizardFS master, in mfsexports.cfg:
172.26.3.1/32 . rw

On server X:
sudo mkdir -p /mnt/mfsmeta
sudo mfsmount -m /mnt/mfsmeta/ -o mfsmaster=lizardfs-master.org

You can trash/purge the deleted files from here: /mnt/mfsmeta/trash
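For example, purging everything currently in the trash from that mount could look like this (a sketch; it permanently deletes the entries):

```bash
# Trash entries appear as regular files under the meta mount's trash
# directory; deleting them there purges them for good.
find /mnt/mfsmeta/trash -type f -print -delete
```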
Final conclusion: LizardFS is too slow at cleaning up for hourly snapshots (if the oldest snapshot is removed each hour). Daily snapshots (incl. cleanup) are not a problem. If you need a backup script for your cron job, try mine:
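As a rough, hypothetical sketch of such a cron job (paths, retention window, and the makesnapshot/settrashtime usage are assumptions, not the author's actual script):

```bash
#!/bin/bash
# Hypothetical daily snapshot + cleanup job for a LizardFS mount.
SRC=/mnt/mfs/data            # data to back up (assumed path)
SNAPDIR=/mnt/mfs/snapshots   # where snapshots live (assumed path)
KEEP_DAYS=7                  # retention window

# Create today's snapshot (a cheap lazy copy done on the master).
lizardfs makesnapshot "$SRC" "$SNAPDIR/$(date +%F)"

# Remove snapshots older than the retention window. Setting trash time
# to 0 first keeps the removals from piling up 24 h worth of trash.
find "$SNAPDIR" -mindepth 1 -maxdepth 1 -type d -mtime +"$KEEP_DAYS" |
while read -r old; do
    lizardfs settrashtime -r 0 "$old"
    lizardfs rremove "$old"
done
```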
In the last update of my backup scripts, I accidentally re-enabled hourly snapshots, so it creates and deletes one snapshot per hour. Now my master server has been down for two days. What I see, as additional information worth mentioning: even though the server has 40 GB of RAM and normally needs ~23 GB, the memory gets exhausted and it is swapping. So memory seems to be the limit. I suppose this is not because of the snapshots, but due to the delete operations? I just ordered another 32 GB of RAM and I'll upgrade the server this evening to 72 GB. Let's see how much this will help.
I have a strange situation.

I create hourly, daily, weekly and monthly snapshots. After each snapshot, old snapshots are removed using `lizardfs rremove`. Since I have a lot of files and chunks, there are a lot of removals each hour. By default, trash-bin time is 24h; I reduced it yesterday to 1h. But the trash space seems to be constantly growing. Current status:

So many trash files, but where are they? I cannot see a single file in trash:

Questions: How can I immediately clear the trash? What about the `restore` folder?

Edit: `ls -lA` takes ~9 minutes before it shows 0 files.
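Since `ls -lA` stats every entry, a bare `find` piped to `wc` is usually a much quicker way just to count them (the trash path is assumed to be the mfsmeta mount mentioned earlier):

```bash
# Count trash entries without stat-ing each one.
find /mnt/mfsmeta/trash -mindepth 1 | wc -l
```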