chimera: Delete unreferences tag inodes #1508

Merged
merged 1 commit into from May 5, 2015

gbehrmann
Collaborator

Chimera fails to delete tag inodes when removing the last tag linking to the
inode. In the past this didn't cause much trouble, as it was rare to delete an
entire directory tree that would result in orphaned tag inodes. With the
introduction of upload directories, every upload creates a temporary directory
with its own tags. When the upload completes, those tags are deleted.
Currently they leave behind the inodes. At NDGF we have approx. 300 million
such unreferenced inodes.

Since everybody using SRM in dCache 2.10 or newer will suffer from this, we
need a backportable solution. The 'correct' solution would be to maintain the
existing inlink field of t_tags_inodes. This field is currently not maintained
(it is always 1). A fix revolving around this field would have to update the
existing inodes with the correct value. We know from past experience that
updating every row of a 300 million row table is very slow and not appreciated
in a patch level release.

Also, a 'proper' fix would get rid of the current unreferenced inodes, yet this
too takes a long time (we still haven't managed to do this at NDGF, as it is
slow and has a very negative impact on production throughput).

Thus this patch settles for preventing further unreferenced inodes from being
left behind, while cleaning up the existing inodes is left for a feature
release.

The present patch modifies the tag deletion code in Chimera to delete the tag
inodes of any removed tags iff those inodes are not referenced by any other
tag. The patch adds an index on t_tags(itagid) to make this lookup faster. This
index is also needed to make deletion in t_tags_inodes fast as such deletes
will do a referential integrity validation on t_tags(itagid). I tried to batch
the requests to the DB as much as possible, but it should be obvious that this
change will make deleting directories more expensive.
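
As a rough illustration of the deletion logic described above, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. It is not the Chimera code: only the table names t_tags and t_tags_inodes and the column itagid come from this description. The columns dir_id, tag_name and payload, the index name i_tags_itagid, and the function names are assumptions made for the example.

```python
import sqlite3

# Simplified stand-in schema. Only t_tags, t_tags_inodes and itagid are taken
# from the patch description; every other name here is hypothetical.
SCHEMA = """
CREATE TABLE t_tags_inodes (itagid TEXT PRIMARY KEY, payload BLOB);
CREATE TABLE t_tags (
    dir_id   TEXT NOT NULL,
    tag_name TEXT NOT NULL,
    itagid   TEXT NOT NULL REFERENCES t_tags_inodes(itagid),
    PRIMARY KEY (dir_id, tag_name)
);
CREATE INDEX i_tags_itagid ON t_tags(itagid);
"""


def remove_directory_tags(conn, dir_id):
    """Delete all tags of one directory, then any tag inode left unreferenced."""
    with conn:  # commits on success, rolls back on error
        # Remember which tag inodes this directory pointed at.
        candidates = [row[0] for row in conn.execute(
            "SELECT itagid FROM t_tags WHERE dir_id = ?", (dir_id,))]
        conn.execute("DELETE FROM t_tags WHERE dir_id = ?", (dir_id,))
        # Batched delete: an inode goes away only if no other tag references it.
        conn.executemany(
            "DELETE FROM t_tags_inodes WHERE itagid = ? AND NOT EXISTS "
            "(SELECT 1 FROM t_tags WHERE t_tags.itagid = t_tags_inodes.itagid)",
            [(tag_id,) for tag_id in candidates])


def demo():
    """Remove an upload directory whose tags share one inode with its parent."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO t_tags_inodes (itagid) VALUES (?)",
                     [("A",), ("B",)])
    conn.executemany("INSERT INTO t_tags VALUES (?, ?, ?)", [
        ("parent", "tag1", "A"),   # inode A is shared with the parent directory
        ("upload", "tag1", "A"),
        ("upload", "tag2", "B"),   # inode B is referenced only by the upload dir
    ])
    remove_directory_tags(conn, "upload")
    return sorted(row[0] for row in
                  conn.execute("SELECT itagid FROM t_tags_inodes"))
```

Calling demo() removes the hypothetical upload directory's tags; the shared inode survives because the parent directory still references it, while the inode used only by the upload directory is deleted.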

Transaction isolation level should be increased to REPEATABLE_READ to ensure
correctness, but after discussion between Gerd and Tigran it was concluded
that the negative consequences of repeatable read outweigh the benign
risk of losing the race inherent in the code. Even if one loses the race,
the effect is merely to orphan a tag inode. This has to be contrasted with
the current situation in which any tag deletion results in an orphaned tag
inode.
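
The description does not spell out the race, so the following is only one plausible interleaving, simulated in plain Python with no database involved. It assumes two transactions that each remove one of the last two tags pointing at the same inode and that each see only committed data; all names are hypothetical.

```python
def inode_to_delete(visible_tags, removed_key):
    """Return the inode to delete, or None if another visible tag still uses it."""
    inode = visible_tags[removed_key]
    still_referenced = any(
        other_inode == inode
        for key, other_inode in visible_tags.items()
        if key != removed_key
    )
    return None if still_referenced else inode


def simulate(interleaved):
    """Run two tag deletions against shared state; return (tags, inodes) left."""
    key1, key2 = ("dir1", "tag"), ("dir2", "tag")
    tags = {key1: "X", key2: "X"}   # both tags reference inode X
    inodes = {"X"}

    def commit(removed_key, doomed_inode):
        tags.pop(removed_key)
        if doomed_inode is not None:
            inodes.discard(doomed_inode)

    if interleaved:
        # Both transactions run their reference check before either commits,
        # so each still sees the other's tag and keeps the inode.
        doomed1 = inode_to_delete(dict(tags), key1)
        doomed2 = inode_to_delete(dict(tags), key2)
        commit(key1, doomed1)
        commit(key2, doomed2)
    else:
        # Serial execution: the second transaction sees the first one's commit.
        commit(key1, inode_to_delete(dict(tags), key1))
        commit(key2, inode_to_delete(dict(tags), key2))
    return tags, inodes
```

In the interleaved run no tags remain but inode X is left behind, which is the orphaned tag inode mentioned above; in the serial run the second deletion removes it.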

Creating the index on update is obviously something that may take a little
while for a large database (we are talking maybe an hour, not days). This is
however significantly less time than cleaning t_tags_inodes would take, and
the index is essential for t_tags_inodes delete performance no matter how we
implement the deletion.
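
To illustrate why the index matters for the reference check, here is a small sketch that compares query plans before and after creating an index on t_tags(itagid). It uses SQLite through Python's sqlite3 module purely as a stand-in, so the plans and timings of a production database will differ. The columns dir_id and tag_name, the index name i_tags_itagid, and the function names are assumptions.

```python
import sqlite3


def reference_check_plan(conn):
    """Return the query-plan text for the 'is this inode still referenced?' lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT 1 FROM t_tags WHERE itagid = ?", ("x",)
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


def plans_before_and_after_index():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in table; only t_tags and itagid come from the description.
    conn.execute(
        "CREATE TABLE t_tags (dir_id TEXT, tag_name TEXT, itagid TEXT, "
        "PRIMARY KEY (dir_id, tag_name))"
    )
    before = reference_check_plan(conn)
    conn.execute("CREATE INDEX i_tags_itagid ON t_tags(itagid)")
    after = reference_check_plan(conn)
    return before, after
```

Printing the two plans returned by plans_before_and_after_index() shows whether the lookup by itagid uses the new index; without it, the lookup has no index on itagid to use.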

Target: trunk
Require-notes: yes
Require-book: no
Request: 2.12
Request: 2.11
Request: 2.10
Acked-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Patch: https://rb.dcache.org/r/8183/
(cherry picked from commit c95fe70)
(cherry picked from commit 26e62a8)

paulmillar added a commit that referenced this pull request May 5, 2015
chimera: Delete unreferences tag inodes
@paulmillar paulmillar merged commit 1b40bbf into dCache:2.11 May 5, 2015
@gbehrmann gbehrmann deleted the fix/2.11/rb8183 branch May 27, 2015 07:02