dcache-resilience: ignore broken cached files
Motivation:

master@c367e9fad850e4ff83560cf11edf38fc7ded313f
https://rb.dcache.org/r/11643/

changed the way resilience handles file "removal": instead of setting
the repository entry to 'removed', it now simply caches the replica.

This change, however, did not take into account the handling of broken
files. Since its inception, resilience has tried to handle all broken
sticky replicas but the last by removal and recopy. However, when
removal was changed to simple removal of the sticky bit, the logic
governing broken files was no longer valid. On encountering a broken
file, resilience would indiscriminately attempt to remove it, whether
it was cached or not; this now leads to an infinite loop, with the file
operation continuously iterating without doing any further work.

This situation can potentially hang the pool scans (if there are as
many broken files as there are scan threads), and even the file
operation queue.

Modification:

The current patch repairs this situation while maintaining the intended
semantics: upon discovery of a broken sticky replica, it is cached, and
if another replica is required, it is made. Broken replicas that are
already cached are ignored.

NOTE: the consensus seems to be that resilience should no longer try
to handle broken replicas at all. This will be addressed by a
subsequent patch.

Result:

Operations can no longer stall when encountering broken replicas.

Target: master
Request: 6.2
Request: 6.1
Request: 6.0
Request: 5.2
Requires-notes: yes
Requires-book: no
Patch: https://rb.dcache.org/r/12513/
Acked-by: Tigran
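
Illustration (not part of the patch): the decision described under
"Modification" can be sketched roughly as below. This is a minimal
Python sketch for brevity, not dCache's actual Java code. The names
Replica, Action and plan_for_broken are hypothetical, and the handling
of the "last sticky replica" case is an assumption based on the
description above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    IGNORE = auto()          # broken replica is already cached: nothing to do
    LEAVE = auto()           # last sticky replica: not handled (assumption)
    CACHE = auto()           # clear the sticky bit only
    CACHE_AND_COPY = auto()  # clear the sticky bit, then make a new replica


@dataclass(frozen=True)
class Replica:
    # Hypothetical, simplified view of a replica on a pool.
    pool: str
    sticky: bool
    broken: bool


def plan_for_broken(replica, all_replicas, required):
    """Decide what to do with one broken replica of a file.

    `required` is the number of sticky replicas the file should have.
    """
    if not replica.broken:
        raise ValueError("replica is not broken")

    # The fix: a broken replica that is already cached is ignored, so the
    # operation terminates instead of "removing" it over and over.
    if not replica.sticky:
        return Action.IGNORE

    # Healthy sticky replicas other than the one being examined.
    others = [r for r in all_replicas
              if r is not replica and r.sticky and not r.broken]

    # Assumption: with no other sticky replica, nothing is done.
    if not others:
        return Action.LEAVE

    # Cache the broken replica; copy only if the count is now too low.
    if len(others) < required:
        return Action.CACHE_AND_COPY
    return Action.CACHE
```

The point of the sketch is the first branch: once "removal" means
clearing the sticky bit, a cached broken replica needs an explicit
no-op outcome, otherwise the same operation keeps being retried.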