davix-client fails to follow redirection => missing urlencoding in xrdhttp-redirection? #577

Closed
olifre opened this issue Sep 4, 2017 · 15 comments

Contributor

olifre commented Sep 4, 2017

I am only 99% sure this is an xrootd issue, so I am first reporting this here...

Following setup (all xrootd 4.7):

  • Manager xrootd.somedomain with ip address: IPv4ADDRESSOFMGR
  • Data server xrootd001.somedomain with ip address: IPv4ADDRESSOFDS
  • Some client to test, with ip address IPv4ADDRESSOFCLIENT

Both servers contain in their configuration (amongst some authentication stuff):

all.manager xrootd.somedomain:1213
all.role server
all.role manager if xrootd.somedomain
http.selfhttps2http yes
http.desthttps no
if exec xrootd
xrd.protocol XrdHttp /usr/lib64/libXrdHttp.so
fi
http.secretkey REDACTED

Now I run:

davix-ls --debug -P Grid https://xrootd.somedomain:1094/beegfs/grid/atlas/atlaslocalgroupdisk/

I get (abbreviated a bit):

> PROPFIND /beegfs/grid/atlas/atlaslocalgroupdisk/ HTTP/1.1
> Host: xrootd.somedomain:1094
[... Lots of SSL handling which goes well...]
< HTTP/1.1 302 Redirect
< Content-Length: 0
< Location: http://[::ffff:IPv4ADDRESSOFMGR]:1094/beegfs/grid/atlas/atlaslocalgroupdisk/?xrdhttptk=s6Q7i5fjNHzy17b/z/2hXA==&xrdhttptime=1504543303&xrdhttpname=Oliver%20Freyermuth&xrdhttpvorg=atlas&xrdhttphost=%5B%3A%3Affff%3AIPv4ADDRESSOFCLIENT%5D&xrdhttpdn=REDACTED

So far, so good, self-redirection to http with a temporary key. Please note that the redacted xrdhttpdn is nicely URL escaped.

Now, I see:

> PROPFIND /beegfs/grid/atlas/atlaslocalgroupdisk/?xrdhttptk=s6Q7i5fjNHzy17b/z/2hXA==&xrdhttptime=1504543303&xrdhttpname=Oliver%20Freyermuth&xrdhttpvorg=atlas&xrdhttphost=%5B%3A%3Affff%3AIPv4ADDRESSOFCLIENT%5D&xrdhttpdn=REDACTED HTTP/1.1
> User-Agent: libdavix/0.6.4 neon/0.0.29

Still fine, also here, the redacted xrdhttpdn is nicely URL escaped.

Now, I receive:

< HTTP/1.1 302 Redirect
< Content-Length: 0
< Location: http://xrootd001.physik.uni-bonn.de:1094/beegfs/grid/atlas/atlaslocalgroupdisk/?xrdhttptk=s6Q7i5fjNHzy17b/z/2hXA==&xrdhttptime=1504543303&xrdhttpname=Oliver Freyermuth&xrdhttpvorg=atlas&xrdhttphost=[::ffff:IPv4ADDRESSOFCLIENT]&xrdhttpdn=/C=DE/O=GermanGrid/REMAININGPARTREDACTED

Here:

  • xrdhttphost is not urlencoded.
  • xrdhttpdn is not urlencoded.
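For comparison, here is a minimal sketch of what the percent-encoded values would look like, using Python's standard `urllib.parse.quote`. The client address `192.0.2.10` and the shortened DN are placeholders, and keeping `=` literal in the DN is only an assumption made to mirror the first redirect. It illustrates the expected encoding only and says nothing about how xrdhttp builds its URLs internally.

```python
from urllib.parse import quote

# Placeholder values; the real ones are redacted in the logs above.
name = "Oliver Freyermuth"
host = "[::ffff:192.0.2.10]"
dn = "/C=DE/O=GermanGrid"

# safe="" escapes everything except unreserved characters.
enc_name = quote(name, safe="")
enc_host = quote(host, safe="")
# Assumption: "=" stays literal, as seen in the first (correct) redirect.
enc_dn = quote(dn, safe="=")

query = f"xrdhttpname={enc_name}&xrdhttphost={enc_host}&xrdhttpdn={enc_dn}"
print(query)
```

The encoded host comes out in the same `%5B%3A%3Affff%3A...%5D` shape that the first redirect shows.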

Now, davix shows:

(Davix::HttpRequest) Error: Impossible to get the new redirected destination

and gives up - no request ever hits the data server.

Am I missing something, or is this a bug in xrdhttp?


xrootd-dev commented Sep 4, 2017 via email

Contributor Author

olifre commented Sep 4, 2017

For the records, which version of Davix are you using?

$ davix-ls --version
Version: 0.6.6-

Have you also tried with curl ?

I did that just now, here we go... Looks the same...

curl -E /tmp/x509up_u1000 -X PROPFIND -vkL https://xrootd.somedomain:1094/beegfs/grid/atlas/atlaslocalgroupdisk
[ ... lots of SSL stuff ... ]
> PROPFIND /beegfs/grid/atlas/atlaslocalgroupdisk HTTP/1.1
> Host: xrootd.somedomain:1094
[...]
< HTTP/1.1 302 Redirect
< Content-Length: 0
< Location: http://[::ffff:IPv4ADDRESSOFMGR]:1094/beegfs/grid/atlas/atlaslocalgroupdisk?xrdhttptk=wgvPwlDDRrv11zRRREiz0g==&xrdhttptime=1504550035&xrdhttpname=Oliver%20Freyermuth&xrdhttpvorg=atlas&xrdhttphost=%5B%3A%3Affff%3AIPv4ADDRESSOFCLIENT%5D&xrdhttpdn=%2FC=DE%2FO=GermanGrid%2FREMAININGPARTREDACTED
[...]
> PROPFIND /beegfs/grid/atlas/atlaslocalgroupdisk?xrdhttptk=wgvPwlDDRrv11zRRREiz0g==&xrdhttptime=1504550035&xrdhttpname=Oliver%20Freyermuth&xrdhttpvorg=atlas&xrdhttphost=%5B%3A%3Affff%3AIPv4ADDRESSOFCLIENT%5D&xrdhttpdn=%2FC=DE%2FO=GermanGrid%2FREMAININGPARTREDACTED HTTP/1.1
[...]
< HTTP/1.1 302 Redirect
< Content-Length: 0
< Location: http://xrootd001.somedomain:1094/beegfs/grid/atlas/atlaslocalgroupdisk?xrdhttptk=wgvPwlDDRrv11zRRREiz0g==&xrdhttptime=1504550035&xrdhttpname=Oliver Freyermuth&xrdhttpvorg=atlas&xrdhttphost=[::ffff:IPv4ADDRESSOFCLIENT]&xrdhttpdn=/C=DE/O=GermanGrid/REMAININGPARTREDACTED

After that, curl (as it was told) still performs the last, broken request, but receives an empty reply from the server.
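One way to spot such a redirect quickly is to scan the query part of the `Location` value for characters that RFC 3986 does not permit there. The sketch below does that. The helper name `unescaped_chars` and the sample URLs (including the `192.0.2.10` address) are hypothetical. Under these rules the unescaped `/` in the DN is actually legal inside a query, so the space and the square brackets are the likely culprits.

```python
import string

# Characters allowed in a URI query per RFC 3986:
# unreserved, sub-delims, ":", "@", "/", "?", plus "%" for escapes.
ALLOWED = set(
    string.ascii_letters + string.digits + "-._~" + "!$&'()*+,;=" + ":@/?" + "%"
)

def unescaped_chars(location: str) -> list[str]:
    """Return the distinct query characters that should have been escaped."""
    _, _, query = location.partition("?")
    return sorted({c for c in query if c not in ALLOWED})

broken = (
    "http://xrootd001.somedomain:1094/path?xrdhttpname=Oliver Freyermuth"
    "&xrdhttphost=[::ffff:192.0.2.10]&xrdhttpdn=/C=DE/O=GermanGrid"
)
fine = (
    "http://xrootd001.somedomain:1094/path?xrdhttpname=Oliver%20Freyermuth"
    "&xrdhttphost=%5B%3A%3Affff%3A192.0.2.10%5D&xrdhttpdn=%2FC=DE%2FO=GermanGrid"
)

print(unescaped_chars(broken))
print(unescaped_chars(fine))
```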


xrootd-dev commented Sep 5, 2017 via email


xrootd-dev commented Sep 5, 2017 via email


xrootd-dev commented Sep 5, 2017 via email

Contributor Author

olifre commented Sep 5, 2017

Hi,

many thanks for taking the effort to try to reproduce!

I have uploaded the log files as gists here on GitHub, produced using -d and also explicitly setting http.trace all. The client was a plain davix-ls again.

Here the log from the redirector:
https://gist.github.com/olifre/94ca70bf2ff158b78acb3f4ef33de7b8

And here the one from the data server:
https://gist.github.com/olifre/2a2a36d19af801f09406a8ca22a7883e

And here the full config file I use:
https://gist.github.com/olifre/b4cccbafa0006fd58bc822f9256ef3a3

I will of course change the secret key again before entering production with that ;).

Contributor Author

olifre commented Sep 5, 2017

I suspect some issues in passing through the pull request or merges or whatever. In theory they should be the same code.

Ah, that's good news - at least you don't need to hack through my logs then.
Good luck finding the difference between the two versions!

By the way, I am using epel (not epel-testing) and xrootd-stable and wlcg repositories here. Let me know if you need the exact source used for one of the packages, then I can check.


xrootd-dev commented Sep 5, 2017 via email

Contributor Author

olifre commented Sep 5, 2017

Hi,

If you disable https2http in the manager (you can keep it in the data servers) then it should just work. Please let me know.

Many thanks! You are correct, removing that on the manager fixes the issue.
On second thought, you are also right that this option makes little sense on the manager. Until I actually tested it, I believed http.selfhttps2http would simply do nothing on the manager (since it is not really useful there), but you are right: it creates one additional connection which is totally unnecessary.

Last thought... please let me know if it works now. Anyway, IMO one should preferably disable https in the redirector and let it just do the http->https redirection. Spreading this kind of load to the disk servers is certainly a more scalable approach.

This would be a good point. However, if I do that, how does the redirector know which paths the connecting user is allowed to access (since authentication with client certificate will not take place)?
Testing that in my setup, I get a 404 from the redirector, since it does not authorize the connecting user "none" to list any paths.
Or does it imply I have to effectively allow l to everybody, including unauthenticated users, on my redirector for all to-be-exported paths?

Cheers and many thanks,
Oliver


xrootd-dev commented Sep 5, 2017 via email

Contributor

bbockelm commented Sep 5, 2017

Hi,

FWIW - we locally require auth on the redirector. The added CPU usage from the redirector, if summed across the years of running the redirector and millions of connections, probably amounts to $10 of electricity.

HTTPS is dirt-cheap. No need to overthink it.

Brian

Contributor Author

olifre commented Sep 5, 2017

Hi,

So, is your cluster a vanilla xrootd+cmsd cluster ?

Yes, it is. However, in our case, all servers (redirector and data servers) see exactly the same filesystem (a BeeGFS). The multiple data servers are only used to increase the bandwidth, distribute the load and provide a failover mechanism if a data server fails.
Since our setup is meant mainly for R2D2 traffic, I guess the number of actual connections to the redirector will not be too huge (we use a VM as redirector right now). Since we are only an ATLAS T3, I think that even if we export some locations for users, they will be used by fewer than 100 people in total.

FWIW - we locally require auth on the redirector. The added CPU usage from the redirector, if summed across the years of running the redirector and millions of connections, probably amounts to $10 of electricity.

HTTPS is dirt-cheap. No need to overthink it.

Many thanks for this additional input! This is good to know. Right now, our redirector VM only has a single core dedicated to it, maybe this will really suffice then.

Cheers and all the best,
Oliver

@abh3 abh3 assigned abh3 and unassigned abh3 Sep 28, 2017
Contributor Author

olifre commented Apr 23, 2018

To add on this issue... or maybe it is a new one?
Even with https2http disabled on the manager, and http.desthttps no (and shared secret), I get a problem...
While davix-ls and davix-get and davix-put work fine, davix-move from https://manager-address/somefile.root to https://manager-address/somenewfile.root fails with:

< HTTP/1.1 409 Unknown
< Content-Length: 232
<
DAVIX(body): Read block (232 bytes):
[Renaming to relative path 'Freyermuth&xrdhttpvorg=atlasREDACTEDPROXYINFO /full/path/to/file/somefile.root' is disallowed.]
(Davix::mv) Error: HTTP 409 : Conflict, File Exist

This also looks like broken handling of URI encoding: my full name is "Oliver Freyermuth", so the space apparently messed things up.
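The shape of that error message is consistent with the two rename arguments being joined by a single space and then split at the first space. That is a guess from the message alone, not something confirmed from the server code. A minimal sketch of the ambiguity, with a hypothetical helper `naive_split` and shortened placeholder paths:

```python
def naive_split(args: str) -> tuple[str, str]:
    """Split 'old new' at the first space, assuming paths never contain one."""
    old, _, new = args.partition(" ")
    return old, new

# The decoded opaque data carries the user's name, which contains a space.
source = "/data/new.root?xrdhttpname=Oliver Freyermuth&xrdhttpvorg=atlas"
target = "/data/old.root"

old, new = naive_split(source + " " + target)
print(repr(old))
print(repr(new))

# The "new" path no longer starts with "/", so it looks like a relative path.
print(new.startswith("/"))
```

The second value begins with `Freyermuth&xrdhttpvorg=atlas`, which matches the beginning of the path quoted in the 409 response.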

This was after the following happened:

> MOVE /full/path/to/file/somenewfile.root HTTP/1.1
> User-Agent: libdavix/0.6.7 neon/0.0.29
> Keep-Alive: 
> Connection: Keep-Alive
> TE: trailers
> Host: manager-name:1094
> Destination: https://manager-name:1094/full/path/to/file/somefile.root
> 

< HTTP/1.1 302 Redirect
< Content-Length: 0
< Location: http://data-server-6:1094/full/path/to/file/somenewfile.root?xrdhttptk=XXXXXXXXXXXXXX&xrdhttptime=1524518226&xrdhttpname=%2FC=DE%2FO=GermanGrid%2FOU=UniBonn%2FCN=Oliver%20Freyermuth&xrdhttpvorg=atlas&REDACTED
< 
HTTP session to http://data-server-6:1094 begins.
> MOVE /full/path/to/file/somenewfile.root?xrdhttptk=XXXXXXXXXXXXXX&xrdhttptime=1524518226&xrdhttpname=%2FC=DE%2FO=GermanGrid%2FOU=UniBonn%2FCN=Oliver%20Freyermuth&xrdhttpvorg=atlas&REDACTED HTTP/1.1
> User-Agent: libdavix/0.6.7 neon/0.0.29
> Keep-Alive: 
> Connection: Keep-Alive
> TE: trailers
> Host: data-server-6:1094
> Destination: https://manager-name:1094/full/path/to/file/somefile.root
>

Things work again if I specify:

http.desthttps yes

on the manager, effectively forcing double authentication (and preventing those URI parameters in the redirect to the data-server).

Contributor

ffurano commented Aug 22, 2018

Hi, it's fixed now. If the filenames contain spaces, the kXR_mv request needs an additional field.

---------------------------------------------------------------
-bash-4.1$ davix-ls -l https://littlexrdhttp.cern.ch:1094/ | grep small1
-rwxrwxrwx 0     11         2017-10-04 11:32:14 small1 with spaces

-bash-4.1$ davix-mv -P grid  "https://littlexrdhttp.cern.ch:1094/small1%20with%20spaces" https://littlexrdhttp.cern.ch:1094/small1  --trace header
HTTP session to https://littlexrdhttp.cern.ch:1094 begins.
> MOVE /small1%20with%20spaces HTTP/1.1
> User-Agent: libdavix/0.6.8 neon/0.0.29
> Keep-Alive: 
> Connection: Keep-Alive
> TE: trailers
> Host: littlexrdhttp.cern.ch:1094
> Destination: https://littlexrdhttp.cern.ch:1094/small1
> 

< HTTP/1.1 201 Unknown
< Content-Length: 3
< 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ davix-ls -l https://littlexrdhttp.cern.ch:1094/ | grep small1
-rwxrwxrwx 0     11         2017-10-04 11:32:14 small1
-bash-4.1$ 
--------------------------------------------------------
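To illustrate why an additional field helps with names containing spaces, here is a rough sketch of carrying the length of the first path alongside the payload, so the receiver does not have to guess where the first path ends. The function names and the `first_len` field are hypothetical and do not reflect the actual wire layout of the kXR_mv request.

```python
def pack_mv_args(old: str, new: str) -> tuple[int, bytes]:
    """Return (byte length of the first path, 'old new' payload)."""
    old_b = old.encode("utf-8")
    payload = old_b + b" " + new.encode("utf-8")
    return len(old_b), payload

def unpack_mv_args(first_len: int, payload: bytes) -> tuple[str, str]:
    """Recover both paths using the explicit length instead of searching for a space."""
    old = payload[:first_len].decode("utf-8")
    new = payload[first_len + 1:].decode("utf-8")  # skip the separator
    return old, new

first_len, payload = pack_mv_args("/small1 with spaces", "/small1")
old, new = unpack_mv_args(first_len, payload)
print(repr(old), repr(new))
```

With the explicit length, spaces in either path survive the round trip unchanged.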

Contributor

ffurano commented Aug 22, 2018

Hi,
I think that these issues have been fixed by the recent commits. Please let me know if it's OK for you too.
Cheers
Fabrizio

@ffurano ffurano closed this as completed Aug 22, 2018