globus-url-copy always delegates the X.509 credential #153
Comments
Hi all, I do not think we have such a use case today and I agree I suggest we just comment out the code for now and |
You're saying no-one is doing third-party transfers any more? Delegation would be necessary for those, I'd say? |
Hi all, |
In case this isn't clear, FTP (and, by extension, GridFTP) supports third-party transfers without requiring delegation. This is because FTP has separate TCP connections for the "control" and "data" channels. Somewhat simplified, the procedure is:
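As a rough illustration of the control-channel exchange behind a third-party transfer, here is a minimal sketch based on the standard FTP commands from RFC 959 (PASV, PORT, STOR, RETR). The function names and the `third_party_plan` helper are hypothetical, and real GridFTP servers may use extended commands instead; this only shows how the address and port learned from one server are handed to the other.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The port is encoded as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def build_port_command(host: str, port: int) -> str:
    """Build the PORT command that points one server at the other's listener."""
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)


def third_party_plan(pasv_reply: str, src_path: str, dst_path: str) -> list[tuple[str, str]]:
    """Hypothetical helper: the ordered (server, command) pairs the client sends.

    Assumes the destination listens (PASV) and the source connects (PORT),
    so the source service is the data-channel client.
    """
    host, port = parse_pasv_reply(pasv_reply)
    return [
        ("destination", "PASV"),
        ("source", build_port_command(host, port)),
        ("destination", f"STOR {dst_path}"),
        ("source", f"RETR {src_path}"),
    ]
```

The client only relays the address and port between the two control channels; the file data flows directly between the servers, and no user credential is involved at any step.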
The port number is used as a secret with which to authorise the transfer; the transfer is not otherwise protected. If the data channel is encrypted (via TLS), then it would be possible for the data channel to have mutual authentication, with the data-channel client (the source service, in the above example) using an X.509 credential during the TLS handshake. This would require the data-channel client to have some kind of X.509 credential with which to authenticate, which could be the user's credential (if the client delegated) but could also be something else, such as the host credential.
On a more practical point, I think the easiest solution would be to make delegation optional, and disabled by default. I suspect the vast majority of The default could be made more sophisticated; for example, enabled if the command is a third-party transfer with a secure data channel and disabled otherwise.
The underlying problem is that GSI delegation is just broken. It happens too early and is controlled by the wrong agent. In general, only the server (not the client) knows whether or not delegation is needed for a particular operation. Moreover, the server can only say whether delegation is needed once it knows what the client wants to do. That happens only after the GSI handshake has completed. :-(
IMHO, the correct approach would be to define a specific error code to mean "requires delegation" and have separate FTP commands to allow the client to delegate a credential. However, I don't think anyone has the energy to implement this. |
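The "more sophisticated default" suggested in this comment (delegate only for a third-party transfer with a secure data channel, unless the user says otherwise) could be expressed roughly as follows. This is only a sketch: `TransferRequest`, `should_delegate` and the `user_choice` override are hypothetical names, not part of globus-url-copy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransferRequest:
    """Hypothetical summary of what the client has been asked to do."""
    third_party: bool          # both endpoints are remote servers
    secure_data_channel: bool  # the data channel is protected with TLS


def should_delegate(request: TransferRequest, user_choice: Optional[bool] = None) -> bool:
    """Decide whether the client delegates its credential.

    An explicit user choice always wins; otherwise delegation is enabled
    only for a third-party transfer with a secure data channel.
    """
    if user_choice is not None:
        return user_choice
    return request.third_party and request.secure_data_channel
```

As the rest of the thread points out, this is still a client-side guess: the server may need a credential in cases this rule misses, and may not need one in cases it covers.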
Hi Paul, |
Hi Maarten, Yes, the client knows whether it is asking for an encrypted third-party transfer, but it still doesn't know whether the server needs the client to delegate the credential. Here are some counter-examples where the client is requesting an encrypted third-party transfer that does not require delegation:
The last example is a specific case of a more general scenario, where delegation is avoided by having a high level of trust between the two services. Globus (Online)'s sharing use-case is one example. xcache can also support something similar for accessing embargoed data. As you mentioned earlier, there's also the counter-example where "normal" (non-third-party) transfers could require delegation: the FTP server is acting as a proxy in front of some other service(s) that require authentication. Therefore (almost?) all operations would require delegation. So, in general, only the server will know whether delegation is needed. The client can make intelligent guesses, but they are still only guesses. (As an aside, this server-tells-client-when-to-delegate model is how delegation works in HTTP-TPC: the client attempts a transfer and, if a credential is needed, the server tells the client that it must delegate and then retry the HTTP-TPC request.) |
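The server-tells-client-when-to-delegate model described above can be sketched as an attempt/delegate/retry loop. Everything here is illustrative: `DelegationRequired`, `FakeServer` and `transfer_with_lazy_delegation` are hypothetical stand-ins, not an existing GridFTP or HTTP-TPC API.

```python
class DelegationRequired(Exception):
    """Hypothetical 'requires delegation' error reported by the server."""


class FakeServer:
    """In-memory stand-in for a server; only it knows whether it needs a credential."""

    def __init__(self, needs_credential: bool):
        self.needs_credential = needs_credential
        self.credential = None

    def delegate(self, credential: str) -> None:
        self.credential = credential

    def transfer(self, request: str) -> str:
        if self.needs_credential and self.credential is None:
            raise DelegationRequired(request)
        return f"done: {request}"


def transfer_with_lazy_delegation(server: FakeServer, request: str, credential: str) -> tuple[str, bool]:
    """Try the transfer first; delegate and retry only if the server asks for it.

    Returns the server's result and whether delegation took place.
    """
    try:
        return server.transfer(request), False
    except DelegationRequired:
        server.delegate(credential)
        return server.transfer(request), True
```

With this shape the client never has to guess: a server that discards credentials simply never raises the error, so no credential is ever sent to it.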
Hi all, We have survived with this misfeature for 15+ years at scale. Hence I propose we just do nothing about this. Comments? |
Hi Maarten, all, I tend to agree with you. And we should certainly focus on security fixes and the really urgent fixes. If it is easy to change the default and add a flag, I think we could do that, but I'm worried it could be quite involved to find exactly how and where to do that. It could be quite deep inside the GSI code itself. |
@paulmillar @msalle @maarten-litmaath |
I also found more details in the GCT v6.2 GridFTP: Developer’s Guide. @fscheiner Do you know of a community that's using GridFTP Multicasting? It's certainly an interesting idea, albeit one with some drawbacks. IIRC, the client (guc) chooses whether to delegate, by sending a |
Thanks for that. I seemed to remember that multicasting illustration, but couldn't remember where it was used in the documentation, so referenced the paper about this functionality instead.
Actually not. But I could imagine it could be a good way to distribute images to a webserver cluster.
What drawbacks do you see? I guess it would be more efficient to replicate the data packets on the routers, though I assume this won't work when the data channel is encrypted and it would also nullify the possible network overlay which would be useful to "connect" GridFTP servers located in private networks to GridFTP servers located in public networks. And w/o local writes on the intermediate servers, this is more efficient than what I enabled with multi-step transfers with gtransfer.
I.e. delegation when using (1) 3rd party transfers and/or (2) multicasting. |
Well, the main drawback is that the client doing the upload needs to know all the places where the file should be written. In some cases, that might make sense (e.g., a CDN-like data placement model), but in others (e.g., a grid job that's just completed) it is not necessarily the best approach. Of course, the client could call out to some external service to learn where the files should be placed. However, in that case, the external agent could manage the transfers itself (e.g., Rucio) without involving the client. There's not much benefit to this multicast solution.
Some features become a little harder and perhaps a little more fragile. For example, if there was a data placement policy of "one copy in Europe (don't care which site) and one copy in the USA (don't care which site)" then it would require an early binding (this job knows to send its output to Fermilab and DESY; the next job knows to send its output to MIT and KIT, ...). It would be hard to change this binding once the job starts without the client calling out to an external service.
Another problem I see is how errors are handled. Suppose a transfer fails: how is this reported back? Which agent (if any) should retry the transfer? This might be described in the paper, but I didn't see it while skimming through.
Another aspect is that file-level checksums are usually calculated over the entire file's contents. That means that, if you care about data integrity, you will probably have to implement a store-and-forward model, to be sure there's no data corruption. Relying on TLS or using mode-X do offer alternatives, albeit with some limitations. So, if the implementation uses the store-and-forward model then I think almost all the benefits are gone: the same thing could be achieved by managing the transfers with an external data-placement service. IIRC, there are some cool projects that implement multicast file delivery, which use FEC to achieve some reliability.
Although the kind of multi-hop file delivery that Globus developed is certainly useful in some use-cases, I think you get more-or-less the same benefit using a higher-level data placement service (like Rucio). That approach also gives the users an additional level of flexibility (error-handling, dynamic changes to placement rules, etc). As usual, this is just my 2c-worth :-) |
GSI is distinct from TLS in that it supports optional X.509 delegation as part of the handshake. Whether or not delegation takes place is controlled by the client. The globus-url-copy command is the client. By default, it delegates its credential to the server and there does not appear to be any (documented) way to disable this delegation.
At least for dCache (and likely other GridFTP servers, too), the delegated credential is just thrown away. Delegation is useless for GridFTP.
Beyond being pointless, delegation is actually problematic for a number of reasons:
My suggestion would be to modify globus-url-copy so that either:
globus-url-copy so it does not delegate by default.