Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added initial support for tusd #341

Merged
merged 2 commits into from
Dec 3, 2021
Merged

Added initial support for tusd #341

merged 2 commits into from
Dec 3, 2021

Conversation

nuwang
Copy link
Member

@nuwang nuwang commented Oct 12, 2021

This builds on @mvdbeek's fantastic work on integrating tus with Galaxy: galaxyproject/galaxy#12656, by addingthe tusd daemon as an out-of-the-box component in the helm chart.

Things are at a point where tusd is receiving the requests, but something is slightly amiss with auth handling:

[tusd] 2021/10/12 14:06:23 Using '/galaxy/server/database/tmp' as directory storage.
[tusd] 2021/10/12 14:06:23 Using 0.00MB as maximum size.
[tusd] 2021/10/12 14:06:23 Using 'http://gxyarchivetest6-galaxy-nginx:8000/galaxy/api/upload/hooks' as the endpoint for hooks
[tusd] 2021/10/12 14:06:23 Enabled hook events: pre-create, post-create, post-receive, post-terminate, post-finish
[tusd] 2021/10/12 14:06:23 Using 0.0.0.0:1080 as address to listen.
[tusd] 2021/10/12 14:06:23 Using /files/ as the base path.
[tusd] 2021/10/12 14:06:23 Using /metrics as the metrics path.
[tusd] 2021/10/12 14:06:23 Supported tus extensions: creation,creation-with-upload,termination,concatenation,creation-defer-length
[tusd] 2021/10/12 14:06:23 You can now upload files to: http://0.0.0.0:1080/files/
[tusd] 2021/10/12 14:11:34 event="RequestIncoming" method="POST" path="" requestId="76a08832e52cef184ed906db1d5c4510"
[tusd] 2021/10/12 14:11:34 event="HookInvocationStart" type="pre-create" id=""
[tusd] 2021/10/12 14:11:34 event="ResponseOutgoing" status="403" method="POST" path="" error="pre-create hook failed: endpoint returned: 403 Forbidden" requestId="76a08832e52cef184ed906db1d5c4510"

session cookie not getting forwarded perhaps? Ping @mvdbeek

@mvdbeek
Copy link
Member

mvdbeek commented Oct 12, 2021

Ah! I have been using -hooks-http-forward-headers=X-Api-Key,Cookie ... I will update the docs!

Copy link
Member

@mvdbeek mvdbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, very happy you've got it (almost) working, save for me being unable to write correct docs :)

- "1080"
- "-upload-dir={{ .Values.persistence.mountPath }}/tmp"
- '-hooks-http=http://{{ include "galaxy.fullname" . }}-nginx:{{ .Values.service.port }}{{ template "galaxy.add_trailing_slash" .Values.ingress.path }}api/upload/hooks'
- "-hooks-http-forward-headers=X-Api-Key,sessioncookie"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- "-hooks-http-forward-headers=X-Api-Key,sessioncookie"
- "-hooks-http-forward-headers=X-Api-Key,Cookie"

proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
client_max_body_size {{ .Values.nginx.conf.client_max_body_size }};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be 0, a well-behaved client shouldn't send a body to api/upload/resumable_upload

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although there's probably no harm in it, I took this directly from https://github.com/tus/tusd/blob/master/examples/nginx.conf#L65

@nuwang
Copy link
Member Author

nuwang commented Oct 12, 2021

@mvdbeek The Cookie header solved the auth issue, thanks. But there's a small issue remaining, with Galaxy complaining that:

image

Uploaded temporary file (/galaxy/server/database/tmp/resumable_upload7b8816eaa9268e559a64702b63173feb) does not exist.

The content of the upload dir does not have resumable_upload prefixed in front.

/srv/tusd-data $ ls -alh /galaxy/server/database/tmp
total 140
drwxr-xr--    2 101      101         4.0K Oct 12 14:41 .
drwxrwxrwx   13 root     root        4.0K Oct 12 14:41 ..
-rw-r--r--    1 101      101       124.2K Oct 12 14:40 7b8816eaa9268e559a64702b63173feb
-rw-r--r--    1 101      101          570 Oct 12 14:40 7b8816eaa9268e559a64702b63173feb.info
``

@mvdbeek
Copy link
Member

mvdbeek commented Oct 12, 2021

Another doc problem ... sorry!

proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
client_max_body_size {{ .Values.nginx.conf.client_max_body_size }};
proxy_pass http://{{ template "galaxy.fullname" . }}-tusd:1080/files/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
proxy_pass http://{{ template "galaxy.fullname" . }}-tusd:1080/files/;
proxy_pass http://{{ template "galaxy.fullname" . }}-tusd:1080/files;

I also ran into the same thing and then didn't update the docs ...

@nuwang
Copy link
Member Author

nuwang commented Oct 12, 2021

@mvdbeek Works great now, thanks! One thing I noticed was that uploads that had failed prior to these changes can still not be uploaded. But files that are successful can be.
image

On the k8s side, we are still double proxied, so I may take a crack at exposing tusd directly over k8s ingress, instead of making it double proxied through the internal galaxy nginx.

Also, this only works for authenticated users right? Anonymous users still go through the previous mechanism?

@mvdbeek
Copy link
Member

mvdbeek commented Oct 12, 2021

Works great now, thanks! One thing I noticed was that uploads that had failed prior to these changes can still not be uploaded.

Did you click on reset in the upload box ? It did work for me locally, but you do have to reset the upload box

Also, this only works for authenticated users right? Anonymous users still go through the previous mechanism?

No, it also works for anonymous (i.e not logged in) users via the session cookie. You just can't send upload requests when you haven't gotten a session cookie (or an API key) first. If you want to open uploads up completely without any checks you could skip the -hooks-http flag, whose only purpose is making sure we know that we're actually dealing with a legitimate Galaxy user (logged in or not)

@nuwang
Copy link
Member Author

nuwang commented Oct 12, 2021

For whatever reason, it doesn't work if I log out, it's using the legacy upload:
image

If I log back in, I still can't upload that failed file. This is after a hard refresh.

@nuwang
Copy link
Member Author

nuwang commented Oct 12, 2021

@mvdbeek
This is the log after logging in:
image

@mvdbeek
Copy link
Member

mvdbeek commented Oct 12, 2021

Hmm, in both cases you have a sessioncookie, so that's all the formal requirements. But I see we do some additional logic that might disable chunked uploads. galaxyproject/galaxy@5f926d9 might fix this (I haven't tested it yet)

@nuwang nuwang marked this pull request as ready for review October 12, 2021 17:36
@nuwang
Copy link
Member Author

nuwang commented Oct 13, 2021

@mvdbeek Some additional data about the failure of previously failed files.

  • It only seems to happen for files that had failed previously due to one of the afore-mentioned failures like an invalid header name.
  • The tusd backend never receives a request, so it's probably happening entirely on the client.
  • Logging in as an anonymous user allows the file to be uploaded.
  • Renaming the file also allows it to be uploaded.
  • As shown in the last screen show above, all of the HEAD requests return 200, before finally logging this:
    Failed because: Error: tus: invalid or missing offset value, originated from request (method: HEAD, url: http://localhost/galaxy/api/upload/resumable_uploadf64e9c47874b96195d41b1f4d983b663, response code: 200, response text: , request id: n/a)
    It also seems to be issuing the same HEAD request over and over again.

@nuwang
Copy link
Member Author

nuwang commented Oct 13, 2021

Another question, if anonymous access is disabled, but a malicious user nevertheless directly accesses the tusd resumable_upload endpoint, does the http hook get triggered at that point and auth validated? Is the hook endpoint https://github.com/galaxyproject/galaxy/pull/12656/files#diff-90f10fef62ca598fe3d9ff6f19cac742cb3a9f2e4cd418584e2c3d28ed5993aeR29 still to be fleshed out?

@mvdbeek
Copy link
Member

mvdbeek commented Oct 13, 2021

Another question, if anonymous access is disabled, but a malicious user nevertheless directly accesses the tusd resumable_upload endpoint, does the http hook get triggered at that point and auth validated?

Yes, it's tusd that controls the hook. So even if you say break out from an IT and manage to get access to the internal tusd url the hook will run (and fail if you don't provide a valid API key / session key).

Is the hook endpoint https://github.com/galaxyproject/galaxy/pull/12656/files#diff-90f10fef62ca598fe3d9ff6f19cac742cb3a9f2e4cd418584e2c3d28ed5993aeR29 still to be fleshed out?

Not for the current functionality. The only purpose of the endpoint is checking that either a valid api key or session id is in use.
This is more of a protection against DOS attacks / someone spamming a Galaxy server with many small uploads than anything else.

If the hook endpoint returns 403, as it would for an unauthorized user, tusd will not accept the requests. We could even use something like the existing /api/users/current_user, but I think it's better to reserve this endpoint for future enhancements.
You can check the behavior by doing curl <galaxy_url>/api/upload/hooks and see it returns 403. Or you can use tusc (https://github.com/jackhftang/tusc) and compare upload with and without a header (so tusc client <galaxy_url>/api/upload/resumable_upload some_file-H "x-api-key: $API_KEY" or tusc client <galaxy_url>/api/upload/resumable_upload some_file)

@mvdbeek
Copy link
Member

mvdbeek commented Oct 13, 2021

  • As shown in the last screen show above, all of the HEAD requests return 200, before finally logging this:
    Failed because: Error: tus: invalid or missing offset value, originated from request (method: HEAD, url: http://localhost/galaxy/api/upload/resumable_uploadf64e9c47874b96195d41b1f4d983b663, response code: 200, response text: , request id: n/a)
    It also seems to be issuing the same HEAD request over and over again.

Yeah, I also noticed this just now. It seems to be quite selective about the file in question, which is weird. That is probably a client or nginx problem.

@mvdbeek
Copy link
Member

mvdbeek commented Oct 13, 2021

Yeah, I also noticed this just now. It seems to be quite selective about the file in question, which is weird. That is probably a client or nginx problem.

Yep, a missing slash in the client 😆 ... and this was only a problem for the HEAD requests that check if an upload had already been created.

@nuwang nuwang added the feature label Oct 14, 2021
@afgane afgane merged commit eb54cbc into master Dec 3, 2021
@afgane afgane deleted the add_tusd_support branch December 3, 2021 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants