More resilient IPv6/IPv4 dual stack handling #644
Hi Mats, What's the version of xrdcp you were using? Long story short, before 4.8.0, there was a single Connection Window for all the interfaces. The default size of the connection window was equal to the default connection timeout, meaning that by default there was time only to try one interface. This has been fixed in 4.8.0, and now the connection window is applied per interface. BTW, from what I see you are running with number of retries set to 1, right? Cheers, |
The problem sounds like the same one, but we have 4.8.0:
This comes from the OSG package xrootd-client-4.8.0-1.osg34.el6.x86_64 |
OK, after a careful examination I can see that your problem is quite different than the one in #625
Currently, this is considered by XRootD client as a fatal and unrecoverable error, it has nothing to do with the Connection Window. I'll provide a complementary patch. BTW I tried running: and now it works just fine, so I suppose you guys have fixed your setup, Michal |
Hi Michal, The problem is client-side above. The job landed on a host with a valid IPv6 public address -- but the network the host is attached to is not actually routing IPv6 packets. I think the request is to have the Xrootd client behave a bit more like
Since Xrootd actually wants to randomize address order (as this is how load-balancing is implemented), the idea is to randomize only within address families. So, if there are two IPv4 addresses followed by two IPv6 addresses, the randomization should be applied first to the two IPv4 addresses then the two IPv6 addresses - leaving the ordering of the address families the same. Brian |
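A minimal sketch of the ordering described above, assuming the resolver hands back a list of `(family, address)` pairs that is already grouped by address family. The function name and the sample addresses (taken from the documentation ranges) are hypothetical, and this is not the XRootD client code:

```python
import random
import socket
from itertools import groupby


def shuffle_within_families(addresses, rng=random):
    """Shuffle addresses inside each consecutive run of the same family.

    `addresses` is a list of (family, ip) tuples in resolver order.
    The order of the family runs themselves is left untouched.
    """
    result = []
    # groupby yields one group per consecutive run of equal keys
    for _family, group in groupby(addresses, key=lambda a: a[0]):
        block = list(group)
        rng.shuffle(block)  # load-balancing shuffle, limited to this family
        result.extend(block)
    return result


resolved = [
    (socket.AF_INET, "192.0.2.1"),
    (socket.AF_INET, "192.0.2.2"),
    (socket.AF_INET6, "2001:db8::1"),
    (socket.AF_INET6, "2001:db8::2"),
]
ordered = shuffle_within_families(resolved)
```

With this input the two IPv4 entries may swap places and the two IPv6 entries may swap places, but all IPv4 entries still come before all IPv6 entries.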
Hi Brian, By policy we try first IPv6, and I don't think that only because of wget (which is used by lazy sysadmin), we should change our ways ;-) On the other hand, I don't see why the client couldn't give it a try and use the next IP address in line. Michal |
Well, I certainly agree with the principle -- it's silly to deploy an IPv6 address but not the corresponding network. Further, IIRC, the various RFCs clearly state that IPv6 addresses should be tried first. It's also been how I've advocated to approach the problem previously. However, I think Mats has convinced me that the correct way may not be the best way because:
|
My proposal is to modify the client so that it does not abort after this kind of error; instead, the client should try all the IP addresses on the list (this way IPv6 still gets tried first, and if it's bogus we move to the next entry, possibly IPv4). I suppose this should make you guys happy, wouldn't it? ;-) @rynge : could you provide a host where I can test a patch? |
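The fallback behaviour proposed here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: `connect_with_fallback` and `tcp_connect` are hypothetical names, and the candidate list is assumed to be already ordered (IPv6 first by policy).

```python
import socket


def tcp_connect(addr, timeout=5.0):
    """Open a TCP connection to an (ip, port) pair; raises OSError on failure."""
    return socket.create_connection(addr, timeout=timeout)


def connect_with_fallback(candidates, connect=tcp_connect):
    """Try each address in order and return the first one that connects.

    A failure on one address (for example "network unreachable" on an
    IPv6 address with no route) is recorded and the next address is tried,
    instead of being treated as fatal.
    """
    errors = []
    for addr in candidates:
        try:
            return addr, connect(addr)
        except OSError as exc:
            errors.append((addr, exc))
    raise OSError(f"all {len(candidates)} addresses failed: {errors}")
```

The `connect` argument is injectable so that the ordering logic can be exercised without touching the network.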
@simonmichal - Yes, we left the host as it was to be able to test. Please email me at rynge@isi.edu your preferred username and an ssh key, and I will get you an account. |
Perfect :-) |
I have tested the patch, and it looks all good:
I also added a new envar XRD_PREFERIPV4 (by default set to 0), if set the client will try first with IPv4. I suppose I can close this one :-) |
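Michal, just out of curiosity, what differs between XRD_PREFERIPV4 versus XRD_NETWORKSTACK=IPv4? That's what actually worked when we used XRD_NETWORKSTACK=IPv4 xrdcp ... from Mat's client. |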
Hi Marian,
XRD_NETWORKSTACK=IPv4 means only use IPv4 for everything. While XRD_PREFERIPV4 means that if you have a choice of using IPv4 or IPv6, try IPv4 first and if that doesn’t work, try IPv6. The latter specifically controls how the client connects to a server when it’s allowed to use IPv4 or IPv6.
Andy
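The difference between the two settings can be modelled roughly like this. It is a sketch of the semantics described above, not the client's implementation; `candidate_order` and its parameters are hypothetical, and only the `IPv4` value of the network stack setting is modelled:

```python
import socket


def candidate_order(addresses, network_stack=None, prefer_ipv4=False):
    """Return the addresses a client would try, in order.

    network_stack="IPv4" models XRD_NETWORKSTACK=IPv4: IPv6 is never used.
    prefer_ipv4=True models XRD_PREFERIPV4=1: both families stay
    available, but IPv4 is tried first.
    """
    v4 = [a for a in addresses if a[0] == socket.AF_INET]
    v6 = [a for a in addresses if a[0] == socket.AF_INET6]
    if network_stack == "IPv4":
        return v4  # hard restriction, no IPv6 fallback at all
    return v4 + v6 if prefer_ipv4 else v6 + v4


resolved = [(socket.AF_INET6, "2001:db8::1"), (socket.AF_INET, "192.0.2.1")]
only_v4 = candidate_order(resolved, network_stack="IPv4")
v4_first = candidate_order(resolved, prefer_ipv4=True)
```

Here `only_v4` contains just the IPv4 entry, while `v4_first` keeps both entries with IPv4 moved to the front.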
From: Marian Zvada
Sent: Thursday, January 18, 2018 2:49 PM
To: xrootd/xrootd
Cc: Subscribed
Subject: Re: [xrootd/xrootd] More resilient IPv6/IPv4 dual stack handling (#644)
Michal, just curiosity, what differs between XRD_PREFERIPV4 versus XRD_NETWORKSTACK=IPv4? That's what actually worked when we used XRD_NETWORKSTACK=IPv4 xrdcp ... from Mat's client.
|
Thanks Andy, well explained! |
One of our OSG submit hosts, xd-login.opensciencegrid.org, has a somewhat odd IPv6 configuration. The IPv6 stack is enabled, the main external interface only has an IPv4 address, and a secondary internal interface has both IPv4 and IPv6 enabled. There is no default IPv6 route. Detailed interface and route information below.
The result is that the xrootd client thinks this host is IPv6 enabled, and only tries IPv6 connections. For example:
We understand that this host configuration is not perfect, and we will work on improving the setup. However, we have no indication that other tools using the network are having issues with the configuration, and we believe the reason is that most tools handle dual-stack more gracefully. For example, from wget's man page: "By default, an IPv6-aware Wget will use the address family specified by the host’s DNS record. If the DNS responds with both IPv4 and IPv6 addresses, Wget will try them in sequence until it finds one it can connect to.". Maybe xrootd should use a similar approach?
Please note that it is not really this host we are concerned about, but for example compute nodes outside our administrative domain, which could cause a lot of job failures.
To be more explicit, we would like xrootd to be made more resilient by trying address families in order (IPv6, and if that fails IPv4). If rrdns is used, the order of the addresses can be randomized, but only within the ordered address families.
More detailed debugging information: