Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Content-type header for StreamingStorage factory #80

Open
ghost opened this issue Mar 9, 2021 · 1 comment
Open

Incorrect Content-type header for StreamingStorage factory #80

ghost opened this issue Mar 9, 2021 · 1 comment

Comments

@ghost
Copy link

ghost commented Mar 9, 2021

This is a bug/new feature request in which Giftless is used with CKAN, ckanext-blob-storage and datapusher, see original discussion.

When using StreamingStorage (factory: giftless.transfer.basic_external:factory) the Content-type of a file to be stored is not preserved. For instance, after uploading a csv-file and then downloaded its Content-type returned by Giftless is text/html which causes an error at datapusher thus resulting that tabular data cannot be previewed in CKAN UI. ExternalStorage seems to work correctly in this sense.

@shevron
Copy link
Contributor

shevron commented Mar 9, 2021

Thanks for reporting, I think this would be useful to implement as some clients depend on the correct content type. I don't consider this a bug per se because handling content type properly is not a part of the Git LFS spec, but more of a Giftless-specific extension / capability that is currently only supported by some backends / transfer modes.

Some thoughts on this (kind of a brain dump rather than something organized):

  1. Supporting this in ExternalStorage depends on the storage backend, because it is the backend that sends the HTTP reply. I know we implemented support for content-type preservation / pre-setting in Azure / GCP but not all backends may support this.
  2. Supporting this in StreamingStorage is a bit different, because there could be different strategies to pass the Content-type header from the storage backend (if preserved / saved) or pre-set it when crafting upload / download URLs for example, like we do with JWT tokens and handling x-filename.
  3. Saving the content-type when LocalStorage is used will be harder - we will need to save it on disk somewhere. Or we can just rely on the file name or mime-magic type DB to guess the content-type and fall back to some default (application/octet-stream)
  4. There was some discussion about this here: Handle filenames for basic streaming transfer adapter. #68
  5. It may be easiest (and will solve the CKAN use case) if we just pass-through a value that is passed as a query param, then fall back to guess based on file extension (Python built-in capability ... I don't remember the standard library module that does this) then fall back to a default such as application/octet-stream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant