Handle binary/octet-stream content type #190

Merged
Mr0grog merged 2 commits into main from hotfix-boto-got-a-little-weird-with-content-types on Jan 24, 2025

Mr0grog
Member

@Mr0grog Mr0grog commented Jan 24, 2025

Some recent additions to our upload script (edgi-govdata-archiving/web-monitoring-processing#855) are now storing response bodies in S3 with the content type binary/octet-stream. This isn’t a valid media type (it should be application/octet-stream), and this appears to be a change in boto3 (the AWS SDK) or maybe a difference between it and AWS SDKs for other languages. Regardless, we now have data stored this way and we should handle it the same as application/octet-stream (essentially: this content-type tells us nothing one way or the other, so ignore it).

Obviously we should fix the upload script, too, but that is a secondary concern vs. actual data we have stored.

Some recent additions to our upload script are now storing response bodies in S3 with the content type `binary/octet-stream`. This appears to be a change in boto3 (the AWS SDK), which now seems to use that as a generic content type instead of `application/octet-stream`. The new type is not actually a valid media type; I'm not sure why they're doing it, but it is functionally the same as `application/octet-stream` and should not cause us to consider something as "definitely not HTML".

(Obviously we should fix the upload script, too, but that is a secondary concern.)
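
For illustration, here is a minimal sketch of the kind of check described above. It is not the code from this PR: the names `GENERIC_MEDIA_TYPES`, `normalize_media_type`, and `is_definitely_not_html`, and the list of known non-HTML prefixes, are all hypothetical. The idea is only that `binary/octet-stream` gets treated like `application/octet-stream`, i.e. as carrying no information either way.

```python
# Hypothetical sketch; names and type lists are illustrative assumptions,
# not the project's actual implementation.

# Media types that say nothing about the real content, so they are ignored.
GENERIC_MEDIA_TYPES = frozenset({
    "application/octet-stream",
    "binary/octet-stream",  # not a valid media type, but present in stored data
})

# Assumed examples of types that clearly rule out HTML.
NON_HTML_PREFIXES = ("image/", "audio/", "video/")
NON_HTML_TYPES = frozenset({"application/pdf", "application/zip"})


def normalize_media_type(content_type):
    """Strip parameters (e.g. charset) and lowercase the media type."""
    if not content_type:
        return ""
    return content_type.split(";", 1)[0].strip().lower()


def is_definitely_not_html(content_type):
    """Return True only when the content type positively rules out HTML."""
    media_type = normalize_media_type(content_type)
    if not media_type or media_type in GENERIC_MEDIA_TYPES:
        # Missing or generic type: no evidence one way or the other.
        return False
    return media_type in NON_HTML_TYPES or media_type.startswith(NON_HTML_PREFIXES)
```

With a check shaped like this, a body stored as `binary/octet-stream` falls through to whatever content sniffing happens next instead of being rejected up front.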
@Mr0grog Mr0grog merged commit e8c603c into main Jan 24, 2025
7 checks passed
@Mr0grog Mr0grog deleted the hotfix-boto-got-a-little-weird-with-content-types branch January 24, 2025 04:23
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this pull request Jan 24, 2025