Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace StreamBall, replace tarballs with zip #10919

Merged
merged 6 commits into from
Dec 21, 2020

Conversation

mvdbeek
Copy link
Member

@mvdbeek mvdbeek commented Dec 14, 2020

The way StreamBall writes to the wsgi response just doesn't work for ASGI frameworks.
The zipstream method seems superior in that it creates a uncompressed zip archive and doesn't need a tempfile.
Additionally I think it makes sense to return zip files, as that is probably the most well known compression format.
This also adds the option to have nginx serve zip archives directly using mod_zip

@mvdbeek mvdbeek added this to the 21.01 milestone Dec 14, 2020
@mvdbeek mvdbeek added this to In progress in Galaxy API modernization via automation Dec 15, 2020
@mvdbeek mvdbeek added this to In progress in Backend Working Group via automation Dec 15, 2020
@nsoranzo
Copy link
Member

I think we could still take into account the value of app.config.upstream_gzip and use ZIP_DEFLATED instead of ZIP_STORED if it is False ?
Otherwise we should remove the option.

The way StreamBall writes to the wsgi response just doesn't work for
ASGI frameworks. The zipstream method seems superior in that it creates
a uncompressed zip archive and doesn't need a tempfile.
Additionally I think it makes sense to return zip files, as that is
probably the most well known compression format, and we could replace
this conditionally with
https://www.nginx.com/resources/wiki/modules/zip/ (while we also work
on creating archives in a celery worker).
zipstream was one of the suggested libraries in galaxyproject#9794 (comment)
@mvdbeek mvdbeek force-pushed the zipstream_replace_tarball branch 2 times, most recently from 95f1352 to d122df8 Compare December 18, 2020 11:45
@mvdbeek
Copy link
Member Author

mvdbeek commented Dec 18, 2020

Alright, I've implemented returning a header that is compatible with mod-zip. admins (@natefoo @gmauro @hexylena), do you think https://github.com/galaxyproject/galaxy/blob/d122df8e9b9fb9b6c5a32c54d132487fa44cda30/doc/source/admin/nginx.md#creating-archives-with-mod-zip is reasonable ? An alternative we could implement (but that is more work, and I'm not sure time wouldn't be better spent completely offloading archive creation to celery or jobs) would be to return locations to /api/datasets/<dataset_id>/download/<filename> which would then be served by x-send-file. The downside of that would be many individual internal requests that go through galaxy's API, but the upside is that the exposed locations don't need to be set in the nginx config.

@mvdbeek mvdbeek force-pushed the zipstream_replace_tarball branch 2 times, most recently from e644a72 to 3af98cd Compare December 18, 2020 12:00
mvdbeek and others added 2 commits December 18, 2020 16:38
Co-authored-by: Nicola Soranzo <nicola.soranzo@gmail.com>
doc/source/admin/nginx.md Outdated Show resolved Hide resolved
doc/source/admin/nginx.md Outdated Show resolved Hide resolved
doc/source/admin/nginx.md Outdated Show resolved Hide resolved
Co-authored-by: Nicola Soranzo <nicola.soranzo@gmail.com>
@natefoo
Copy link
Member

natefoo commented Dec 18, 2020

FWIW, although the option specifically mentions gzip, the idea was that if your browser and proxy are (as all modern browsers would do) using compression with Content-Encoding then you don't want to double compress. For example, usegalaxy.org right now replies with Content-Encoding: gzip. But mod-zip is certainly a nicer option here.

@mvdbeek
Copy link
Member Author

mvdbeek commented Dec 18, 2020

The upstream_gzip option was only used together with StreamBall, afaict. I know this worked for tarballs, but if it also works for uncompressed zip archives I can restore the option and set application/zip in that case.

}
```

The `internal;` statement means that the location can only be used for internal nginx requests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@mvdbeek
Copy link
Member Author

mvdbeek commented Dec 19, 2020

Thanks @natefoo, adding application/zip to gzip_types actually nicely compresses the archive, so that's a good option to have when you can't use mod-zip. Added upstream_gzip back in 64b75e0

@jmchilton jmchilton merged commit c7d6f28 into galaxyproject:dev Dec 21, 2020
Galaxy API modernization automation moved this from In progress to Done Dec 21, 2020
Backend Working Group automation moved this from In progress to Done Dec 21, 2020
@nsoranzo nsoranzo deleted the zipstream_replace_tarball branch April 6, 2021 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants