Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable werkzeug header generation for files served via mod_xsendfile #7188

Merged
merged 1 commit into from
Jun 24, 2024

Conversation

zenmonkeykstop
Copy link
Contributor

Status

Draft mode

Description of Changes

Fixes #7186.

By default, when flask serves a file in response to a partial content request, it (or rather werkzeug) generates the necessary Content-length and Content-range headers itself. Production SD instances use mod_xsendfile to speed up downloads, but mod_xsendfile skips handling responses where said headers are already set.

This PR checks whether USE_X_SENDFILE is set, and if so, it disables extra header generation for partial content responses, allowing mod_xsendfile to handle them instead.

Testing

dev environment

prod environment (VMs ok)

Deployment

Any special considerations for deployment? Consider both:

n/a

Checklist

If you made non-trivial code changes:

  • I have written a test plan and validated it for this PR

@zenmonkeykstop zenmonkeykstop requested a review from a team as a code owner June 23, 2024 00:51
@zenmonkeykstop zenmonkeykstop changed the title Disable werkzeug header generation for files served via mod_xsendfile [2.9.0] Disable werkzeug header generation for files served via mod_xsendfile Jun 23, 2024
@eloquence
Copy link
Member

Looks good in dev env:

  • verify that you can download the full submission: curl -v -H "Authorization: Token $TOKEN" -o logo.png.gpg http://<JI_address>$DOWNLOAD_URL
  • verify that you can download the first 100 bytes of the submission: curl -v -H "Authorization: Token $TOKEN" -H "Range: bytes=0-99" -o logo.png.gpg.part-0-99 http://l<JI_address>$DOWNLOAD_URL
  • Verify that the first 100 bytes of the original submission has the same sha256 hash as the partial download.

Now for the prod test.

@@ -525,8 +525,13 @@ def col_download_all(cols_selected: List[str]) -> werkzeug.Response:

def serve_file_with_etag(db_obj: Union[Reply, Submission]) -> flask.Response:
file_path = Storage.get_default().path(db_obj.source.filesystem_id, db_obj.filename)
add_range_headers = not current_app.config["USE_X_SENDFILE"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment explaining why we're doing this might be nice so we don't forget.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid pushing further code changes I just beefed up the commit message. Please restamp if all looks well.

@eloquence
Copy link
Member

Shuttled the .deb from this branch to my prod server and installed it there.

(TIL that you have to run curl with curl --socks5-hostname 127.0.0.1:9150 via SOCKS proxy to avoid getting "Not resolving .onion address" errors.)

Looks good there as well.

  • verify that you can download the full submission: curl -v -H "Authorization: Token $TOKEN" -o logo.png.gpg http://<JI_address>$DOWNLOAD_URL
  • verify that you can download the first 100 bytes of the submission: curl -v -H "Authorization: Token $TOKEN" -H "Range: bytes=0-99" -o logo.png.gpg.part-0-99 http://l<JI_address>$DOWNLOAD_URL
  • Verify that the first 100 bytes of the original submission has the same sha256 hash as the partial download.

@eloquence
Copy link
Member

Response headers in prod look good as well:

 Date: Mon, 24 Jun 2024 21:04:14 GMT
< Server: Apache
< X-Frame-Options: DENY
< Referrer-Policy: no-referrer
< Cross-Origin-Opener-Policy: same-origin
< Cross-Origin-Embedder-Policy: same-origin
< Cross-Origin-Resource-Policy: same-origin
< X-Content-Type-Options: nosniff
< Content-Security-Policy: default-src 'none'; script-src 'self'; style-src 'self'; img-src 'self'; font-src 'self'; frame-ancestors 'none';
< Last-Modified: Mon, 24 Jun 2024 20:24:23 GMT
< Content-Disposition: attachment; filename=1-dynastic_bedspring-doc.gz.gpg
< Cache-Control: no-store
< Etag: sha256:1a74419e78276a2840fe6466c80979ee44e96e4f85aca500516579245524cb14
< Accept-Ranges: bytes
< Content-Length: 101
< Content-Range: bytes 0-100/994713
< Content-Type: application/pgp-encrypted
< 

Copy link
Member

@eloquence eloquence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and appears to work as intended! One suggestion inline, up to you. Other than that, should the title of this PR be "[2.9.0]" given that it is targeting develop?

@zenmonkeykstop
Copy link
Contributor Author

It's a 2.9.0 fix but I have no problem editing it

@zenmonkeykstop zenmonkeykstop changed the title [2.9.0] Disable werkzeug header generation for files served via mod_xsendfile Disable werkzeug header generation for files served via mod_xsendfile Jun 24, 2024
@eloquence
Copy link
Member

(I may be out of date, I'm used to the convention of having the "[x.x.x]" in the PR title if it's a backport targeting a release branch.)

For partial content requests, werkzeug's default behaviour is to calculate and
add the necessary Content* headers in the response. However, mod-xsendfile, used
by SD to deliver files more efficiently, will just pass through requests with
pre-existing Content* headers, not even removing the X-Sendfile header used to
invoke it.

If USE_X_SENDIFLE is set to True in the Flask config, we should stop werkzeug from
generating headers, and just let mod_xsendfile do it.
@zenmonkeykstop zenmonkeykstop merged commit 7a67ed8 into develop Jun 24, 2024
16 of 18 checks passed
rocodes added a commit that referenced this pull request Jun 25, 2024
[2.9.0] Backport #7188: Disable werkzeug header generation for files served via mod_xsendfile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[2.9.0] Range requests fail against prod VM install
2 participants