file replicas of 2GeV protodune-sp files aren't actually gone #513
Comments
Have run rucio erase on one of the files; the expired_at field shows now. Will see if said file goes away tomorrow as scheduled.
[dunepro@dunesl7gpvm01 protodune-sp]$ rucio erase protodune-sp:PDSPProd4a_protoDUNE_sp_reco_stage1_p2GeV_35ms_sce_datadriven_66503427_0_20231231T235952Z.root
Have done another rucio erase of one of the files. The file shows expired_at; we'll see if it kicks in tomorrow as scheduled.
Martin Barisits confirms this: "Yes, I think erase doesn't work right now on files in all cases. Historically this was meant for datasets, never for files. The use case of having it work on files came up as well, but it does not seem to work properly. There is an issue about it." The issue was filed by Brandon White 2 years ago on behalf of Icarus.
What we observe is that, once the expired_at time is reached, the expired_at field goes away but the file doesn't.
Dimitrios says that the undertaker won't erase replicas; the reaper does that. Looking at the reaper, we see that it is able to delete large numbers of files off of DUNE_US_FNAL_DISK_STAGE, mostly the justin-logs, so nothing is wrong with the reaper. Possibly there are locks on these files; I have asked whether there is a way we can tell short of accessing the DB.
If the undertaker can't delete the DID, there is no reason any of the rules or replica locks would be modified. We need the undertaker to handle the removal of these DIDs properly as per rucio/rucio#5154.
So we worked around this by taking the list of retired files that we had kept, organizing them into datasets again, and making a short-lived rule on those datasets. The reaper then got the files when the rule expired.
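As a rough illustration of the regrouping step in this workaround, here is a minimal Python sketch that splits a saved list of file DIDs into batches, one batch per temporary dataset. The function name `group_into_datasets`, the dataset name prefix `retired_cleanup`, and the batch size are all hypothetical and not taken from what was actually run. Creating the datasets, attaching the files, and adding the short-lived rule in Rucio are deliberately left out.

```python
from typing import Dict, List


def group_into_datasets(
    file_dids: List[str],
    scope: str = "protodune-sp",
    prefix: str = "retired_cleanup",  # hypothetical dataset name prefix
    batch_size: int = 1000,           # hypothetical batch size
) -> Dict[str, List[str]]:
    """Split a flat list of file DIDs into named batches.

    Each key is a candidate dataset DID ("scope:name") and each value is
    the list of file DIDs that would be attached to that dataset.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    datasets: Dict[str, List[str]] = {}
    for start in range(0, len(file_dids), batch_size):
        # Number the batches 0000, 0001, ... so names sort in order.
        name = f"{scope}:{prefix}_{start // batch_size:04d}"
        datasets[name] = file_dids[start:start + batch_size]
    return datasets
```

Each resulting batch would then become one dataset carrying a rule with a short lifetime, so that the replicas become eligible for the reaper when the rule expires.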
It is now apparent that the others haven't yet been deleted because they were never part of any rule and there was no tombstone set. So we are first setting tombstones for all the files that we know are part of datasets, and then we will scan the tree to get the rest. I believe these are replicas that wouldn't be findable even by running rucio list-datasets-rse (which shows any dataset that's ever been on the RSE) and getting the replicas from all those datasets, but we'll try that too.
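The "scan the tree to get the rest" step amounts to a set difference between what is on disk and what the datasets account for. A minimal sketch, assuming we already have a list of paths from a directory scan and a list of file DIDs collected from the known datasets; the helper name `find_unlisted_files` and the example paths are hypothetical, and matching on the file name alone is an assumption about how paths map to DIDs:

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def find_unlisted_files(
    paths_on_disk: Iterable[str],
    known_dids: Iterable[str],
    scope: str = "protodune-sp",
) -> List[str]:
    """Return paths whose file name is not among the known DIDs of `scope`.

    Assumes each DID has the form "scope:name" and that the last path
    component on disk equals the DID name.
    """
    wanted_prefix = scope + ":"
    known_names = {
        did.split(":", 1)[1]
        for did in known_dids
        if did.startswith(wanted_prefix)
    }
    return sorted(
        path for path in paths_on_disk
        if PurePosixPath(path).name not in known_names
    )


# Hypothetical example: b.root is on disk but not in any known dataset.
leftover = find_unlisted_files(
    ["/example/staging/protodune-sp/a.root",
     "/example/staging/protodune-sp/b.root"],
    ["protodune-sp:a.root"],
)
```

The paths returned this way are the candidates that would still need a tombstone set, since no dataset listing would reveal them.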
These are cleaned now.
Even though a rucio erase was done individually on every single file in this list and on the datasets above them, there is no expired_at field in the individual file metadata, and the files are still out there and still have replicas.
The total size of the protodune-sp directory under /dune/persistent/staging/ hasn't really changed. It is still 33 TB, so we did not free up any space at all.