No retry after a broken server is hit #138
Comments
Some more precise info: the redirector used has more than one server able to provide the file. I link here 2 logs from xrdcp -d 3:
https://www.dropbox.com/s/m0ranydm6ook2sl/log.fail?dl=0 thanks tom |
Hi Tom, Indeed that is not the expected behaviour with the patch I gave to Brian. Andy On Mon, 8 Sep 2014, Tommaso Boccali wrote:
|
Rats - it looks like the patch got clobbered in our move to 4.0.3 (and subsequent revert back to 3.2.4). Andy, what's the ticket again? I can't find the old patch. |
What was the reason for the revert? |
Build issue with ROOT; supposedly fixed in 5.34.20, although 5.34.20 has unrelated bugs that prevent CMS from using it. So, we wait again. |
Hi Brian, I believe the patch is: I will have to verify that this is all of it (I think it is). Did I send Andy On Tue, 9 Sep 2014, Brian Bockelman wrote:
|
from the comment, the linked patch seems to be solving the multiple DNS problems, not the "not enough retries". Is it the same issue? tom |
Hi Tom, Yes, I believe it’s the same issue. The loop would simply exit and not continue on to the next redirector. Andy |
uhm, but in this case the wanted behavior is to use another server under the same redirector... I would like to try, but the broken server is currently no longer broken, and breaking it on purpose is a bit ... nasty ;) tom |
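For readers following the thread, the behaviour being asked for (move on to another server when one is broken, rather than giving up) can be sketched roughly as below. This is only an illustration under stated assumptions, not the XRootD client code and not the patch discussed here: `open_with_failover`, `try_open`, and `AllServersFailed` are hypothetical names, and the server list stands in for whatever the redirector would offer.

```python
from typing import Callable, Iterable, Optional


class AllServersFailed(Exception):
    """Hypothetical error: every candidate server was tried and none worked."""


def open_with_failover(servers: Iterable[str],
                       try_open: Callable[[str], bytes]) -> bytes:
    """Try each candidate server in turn and return the first successful result.

    `try_open` is a stand-in (assumption) for the real connection attempt;
    it is expected to raise OSError (or a subclass) when a server is broken.
    """
    last_error: Optional[Exception] = None
    for server in servers:
        try:
            return try_open(server)
        except OSError as err:
            # The point of the sketch: record the failure and keep going
            # to the next candidate instead of leaving the loop.
            last_error = err
    # Only give up once the whole list has been exhausted.
    raise AllServersFailed("no server could provide the file") from last_error
```

With a fake `try_open` that raises `ConnectionError` for one host and returns data for another, the call returns the data from the second host; the failure of the first host is not reported unless every host fails.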
Hi Tom, I don't think you need to try. The patch should address the problems you Andy On Tue, 9 Sep 2014, Tommaso Boccali wrote:
|
OK, still waiting for what you really need to get the patch back into your On Tue, 9 Sep 2014, Brian Bockelman wrote:
|
Hi Brian, Please find the back-ported patch for this problem at: http://www.slac.stanford.edu/~abh/xrootd-3.2.4/ I will keep it there until the end of the year in case you lose it. Andy |
Pull request is in to the CMSSW team; just waiting for them to pick it up. In terms of Xrootd 4.0.3 -- they are currently trying to integrate ROOT 5.34.21. |
For 4.0.3 - Great! Please let me know if you spot any problems. |
Seems like no problems have been hit after this has been fixed. So, I am closing this. |
Ciao, in CMS we just moved our SAM tests to a newer version, in order to solve issues we had previously with a 3_1 xrootd version.
Now we use:
Tool info as configured in location /tmp/tboccali/CMSSW_7_1_7
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Name : xrootd
Version : 3.2.4-cms2
++++++++++++++++++++
What we see is that after a connection to a broken server (certificate problems, most probably), a retry does not seem to be issued:
http://dashb-cms-sum.cern.ch/dashboard/request.py/getMetricResultDetails?hostName=cmsrm-cream03.roma1.infn.it&flavour=CREAM-CE&metric=org.cms.WN-xrootd-fallback&timeStamp=2014-09-08T08:25:41Z
Is this expected? Brian told me we had a patch from you for a similar problem, so I wanted to make sure this is NOT the expected behavior.
thanks a lot
tom