chimera: Delete unreferenced tag inodes
Chimera fails to delete a tag inode when the last tag linking to that inode is removed. In the past this didn't cause much trouble, as it was rare to delete an entire directory tree in a way that would leave orphaned tag inodes. With the introduction of upload directories, every upload creates a temporary directory with its own tags. When the upload completes, those tags are deleted. Currently they leave behind the inodes. At NDGF we have approx. 300 million such unreferenced inodes. Since everybody using SRM in dCache 2.10 or newer will suffer from this, we need a backportable solution.

The 'correct' solution would be to maintain the existing inlink field of t_tags_inodes. This field is currently not maintained (it is always 1). A fix revolving around this field would have to update the existing inodes with the correct value. We know from past experience that updating every row of a 300 million row table is very slow and not appreciated in a patch level release. Also, a 'proper' fix would get rid of the current unreferenced inodes, yet this too takes a long time (we still haven't managed to do this at NDGF, as it is slow and has a very negative impact on production throughput).

Thus this patch settles for preventing further unreferenced inodes from being left behind, while cleaning up the existing inodes is left for a feature release.

The present patch modifies the tag deletion code in Chimera to delete the tag inodes of any removed tags *iff* those inodes are not referenced by any other tag. The patch adds an index on t_tags(itagid) to make this lookup faster. This index is also needed to make deletion in t_tags_inodes fast, as such deletes will do a referential integrity validation on t_tags(itagid).

I tried to batch the requests to the DB as much as possible, but it should be obvious that this change will make deleting directories more expensive.
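The deletion rule described above can be sketched outside Chimera as follows. This is only an illustration in Python with an in-memory SQLite database, not the code in this patch: the table names t_tags and t_tags_inodes and the column itagid come from the text above, while the idir column, the index name, the sample rows and all function names are assumptions made up for the example, and the real schema has more columns.

```python
import sqlite3


def make_demo_db():
    """Build a much simplified, hypothetical version of the two tag tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE t_tags_inodes (itagid TEXT PRIMARY KEY);
        CREATE TABLE t_tags (
            idir     INTEGER NOT NULL,   -- hypothetical directory id column
            itagname TEXT    NOT NULL,
            itagid   TEXT    NOT NULL REFERENCES t_tags_inodes(itagid),
            PRIMARY KEY (idir, itagname)
        );
        -- Index on t_tags(itagid): serves the 'still referenced?' lookup
        -- and the referential integrity check on inode deletes.
        CREATE INDEX i_tags_itagid ON t_tags(itagid);
    """)
    conn.executemany("INSERT INTO t_tags_inodes VALUES (?)", [("A",), ("B",)])
    conn.executemany("INSERT INTO t_tags VALUES (?, ?, ?)", [
        (1, "OSMTemplate", "A"),   # parent directory
        (2, "OSMTemplate", "A"),   # upload directory shares inode A
        (2, "WriteToken", "B"),    # upload directory's own tag
    ])
    conn.commit()
    return conn


def remove_dir_tags(conn, dir_id):
    """Delete all tags of one directory, then any inode no tag references."""
    cur = conn.cursor()
    # Remember which inodes the doomed tags point at.
    tag_ids = [row[0] for row in cur.execute(
        "SELECT itagid FROM t_tags WHERE idir = ?", (dir_id,))]
    cur.execute("DELETE FROM t_tags WHERE idir = ?", (dir_id,))
    # Batched delete: an inode goes away only if no other tag still uses it.
    cur.executemany(
        "DELETE FROM t_tags_inodes WHERE itagid = ? AND NOT EXISTS "
        "(SELECT 1 FROM t_tags WHERE t_tags.itagid = ?)",
        [(t, t) for t in tag_ids])
    conn.commit()


def inode_ids(conn):
    """Return the ids of the tag inodes that are left, sorted."""
    return [r[0] for r in conn.execute(
        "SELECT itagid FROM t_tags_inodes ORDER BY itagid")]


if __name__ == "__main__":
    db = make_demo_db()
    remove_dir_tags(db, 2)
    print(inode_ids(db))
```

In this sketch, removing the tags of the upload directory drops the inode that only it referenced and keeps the one still shared with the parent directory. The existence check and the delete are separate steps against the current state of t_tags, which is where the race discussed below comes from.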
Transaction isolation level should be increased to REPEATABLE_READ to ensure correctness, but after discussion between Gerd and Tigran it was concluded that the negative consequences of repeatable read are bigger than the benign risk of losing the race inherent in the code. Even if one loses the race, the effect is merely to orphan a tag inode. This has to be contrasted with the current situation, in which any tag deletion results in an orphaned tag inode.

Creating the index on update is obviously something that may take a little while for a large database (we are talking maybe an hour, not days). It is however significantly less than cleaning t_tags_inodes would take, and it is essential for t_tags_inodes delete performance no matter how we implement the deletion.

Target: trunk
Require-notes: yes
Require-book: no
Request: 2.12
Request: 2.11
Request: 2.10
Acked-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Patch: https://rb.dcache.org/r/8183/
(cherry picked from commit c95fe70)
(cherry picked from commit 26e62a8)