Download error can cause stalling and error looping #3563
Comments
Thanks for this report! With regard to the connection to the linked issues: did the cvmfs2 process still respond to SIGKILL when you encountered this bug?
Hmm, good question, I didn't test that. However, I can confirm my user process moved into a |
I was able to kill the process, then re-mount and recover, so no issues there. Note that to trigger this I just have to generate a decent amount of CVMFS load and then kill a proxy server. That's enough to get one thread into a Backoff(), and then the stall/retry/fail loop starts.
for replication:
The fix is in #3556, but only for the newly introduced parallel decompression, as this puts the backoff outside the sequential handling of curl callbacks. Edit:
Ran into a situation where a single download error caused an error loop and hung cvmfs2. The cause is the Backoff() calls in VerifyAndFinalize(), which sleep between retries: cvmfs/cvmfs/network/download.cc, line 1602 in 6b4ccc1.

This is all done in the libcurl event loop in MainDownload(). When a single download sleeps during its retry, all downloads stall because the event loop is not being run. This then causes all kinds of new errors, such as timeouts and TooSlow errors. These in turn trigger more Backoff() calls, and depending on the conditions you can end up stuck in an endless fail/retry loop.

Possibly related to #3432 and #3378.
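To illustrate the general idea of keeping a backoff from blocking an event loop, here is a minimal sketch. It is not the cvmfs code and not the change in #3556; RetryScheduler, PendingRetry, Defer, NextTimeoutMs and PopDue are all hypothetical names, and time is passed in as plain millisecond counters. Instead of sleeping inside the loop thread, a failed transfer is recorded with a deadline, and the loop keeps servicing the other transfers until that deadline passes.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical types for illustration; not part of cvmfs or libcurl.
struct PendingRetry {
  int64_t due_ms;   // time at which the transfer may be retried
  int transfer_id;  // illustrative handle for the failed download
};

struct LaterFirst {
  bool operator()(const PendingRetry &a, const PendingRetry &b) const {
    return a.due_ms > b.due_ms;  // turns the priority queue into a min-heap
  }
};

class RetryScheduler {
 public:
  // Record the retry instead of sleeping in the event-loop thread.
  void Defer(int transfer_id, int64_t now_ms, int64_t backoff_ms) {
    assert(backoff_ms >= 0);
    queue_.push(PendingRetry{now_ms + backoff_ms, transfer_id});
  }

  // Upper bound on how long the loop may block while waiting for I/O.
  int64_t NextTimeoutMs(int64_t now_ms, int64_t max_wait_ms) const {
    if (queue_.empty()) return max_wait_ms;
    int64_t remaining = queue_.top().due_ms - now_ms;
    if (remaining < 0) remaining = 0;
    return remaining < max_wait_ms ? remaining : max_wait_ms;
  }

  // Transfers whose backoff has expired, ordered by due time.
  std::vector<int> PopDue(int64_t now_ms) {
    std::vector<int> due;
    while (!queue_.empty() && queue_.top().due_ms <= now_ms) {
      due.push_back(queue_.top().transfer_id);
      queue_.pop();
    }
    return due;
  }

 private:
  std::priority_queue<PendingRetry, std::vector<PendingRetry>, LaterFirst>
      queue_;
};
```

In a loop built this way, each iteration would use NextTimeoutMs() as the timeout for whatever poll or wait call it makes, handle socket activity for the healthy transfers, and then re-add the handles returned by PopDue(). A backoff on one transfer then only delays that transfer.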