http TPC fails in mixed environments (IPv6 / dual stack vs. IPv4 only) #968
Comments
@abh3 - I assume this is coming from here: https://github.com/xrootd/xrootd/blob/master/src/XrdTpc/XrdTpcTPC.cc#L184 Do we need to add special opaque information to the URL to get the cmsd to do the correct thing with respect to IPv4 / IPv6? |
No, the information has to be supplied when logging into the bridge.
Unfortunately, I noticed those flags were not documented in the reference
(I will do so). However, look at XrdXrootdXeq.cc:910 and the subsequent
few lines to see how the decision is made. I will look at the bridge code
for more clarification.
Andy
|
Why do you think this is coming from the bridge and not the SFS object? My understanding of @olifre's description is that everything else works except TPC (which is the only place where we invoke the filesystem directly). |
Just to confirm: That's true. To give examples, a streaming copy done by FTS works, GET / PUT works etc. Only TPC with our IPv4-only site against a dual-stacked site fails. |
OK, I didn't look at the HTTP TPC code. Then that means the correct flags
are not being set in the ErrInfo object (assuming the code directly uses
the OFS plugin). Before I get myself into a contorted state I really need
to understand what that code really does. Any pointers ahead of time would
be appreciated.
Andy
|
@olifre - do you have the ability to test out a patch? I think this will work:
It might take me a bit more time to completely reproduce your setup on my development VM. |
@bbockelm Sadly, still a no 😢 . We wanted to set up a test setup with DFS in addition to our current one long ago, but too many other windmills to fight against came up (including IPv6 preparations, but these will still take a few months at least for a production solution). |
Gotcha. Oddly enough, I don't have access to any hosts without IPv6, so I'm currently trying to remove "just enough" IPv6 support from Xrootd to reproduce the issue exactly. |
If we open a file using the SFS interface, the OFS plugin will, by default, only query for servers that support IPv6. If a cluster only has IPv4 addresses, then this will always fail. This changes the default to dual-stack: the transfer can be serviced by a server that speaks either IPv4 or IPv6 (or both!). This is likely the best we can do as we have no indication of whether the source side is IPv4, IPv6, or dual stack itself. Fixes xrootd#968
Ok, I figured out how to disable enough IPv6 on the development machine to trick xrootd into behaving as an "ipv4-only" host. I was able to reproduce the issue and confirm the fix works. Basically, Xrootd tries to find a server that can perform the transfer, with some amount of protocol-awareness. By default, it searches for a host that can handle an IPv6 transfer. I changed the default to query for hosts that can do transfers over either IPv4 or IPv6. Now, this assumes that the remote side is compatible with your cluster (i.e., an IPv6-only source will be matched to an IPv4-only cluster ... but a failure obviously still will occur further downstream). It seems the assumption that the two sides are compatible is better than assuming the remote side is always IPv6-only. @simonmichal - this fix would be very good to backport. |
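To illustrate the change described above, here is a minimal, self-contained sketch of the selection logic. It is not XRootD code: `NetStack`, `Server`, and `select_servers` are hypothetical names invented for this example, and the real decision is made inside the OFS plugin and cmsd. The sketch only assumes that each server advertises which address families it supports and that a query names the families it will accept.

```python
from dataclasses import dataclass
from enum import Flag, auto


class NetStack(Flag):
    """Address families a server can use (hypothetical, for illustration)."""
    IPV4 = auto()
    IPV6 = auto()


@dataclass(frozen=True)
class Server:
    name: str
    stacks: NetStack


def select_servers(servers, acceptable):
    """Return the servers that share at least one address family with `acceptable`."""
    return [s for s in servers if s.stacks & acceptable]


# A cluster whose data servers only have IPv4 addresses.
IPV4_ONLY_CLUSTER = [
    Server("ds1", NetStack.IPV4),
    Server("ds2", NetStack.IPV4),
]

# Old default: only servers reachable over IPv6 qualify, so nothing matches.
old_default = NetStack.IPV6
# New default: a server speaking IPv4, IPv6, or both qualifies.
new_default = NetStack.IPV4 | NetStack.IPV6

print([s.name for s in select_servers(IPV4_ONLY_CLUSTER, old_default)])
print([s.name for s in select_servers(IPV4_ONLY_CLUSTER, new_default)])
```

With the old default the candidate list for an IPv4-only cluster is empty, which corresponds to the "No servers are reachable via public IPv6 network" failure. With the new default both servers are candidates, although, as noted above, an incompatible remote side would still fail further downstream.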
That's for sure the case, IPv6-only is still rare, but dual-stack vs. IPv4-only is pretty common. And I hope that IPv4 only will die out earlier than IPv6 only arises, at least for us this will be true 😉. Many thanks for the fix and explanation! |
For the record, I can confirm this fixes our issue in practice 👍 Thanks again! |
Example from the log on a redirector which is IPv4 only:
The partner site is dual-stacked, and the PUSH request failed with: No servers are reachable via public IPv6 network to read the file.
Quoting @abh :