Gateway content type detection : ODS documents served as ZIP (MIME) #7252

Closed
hsanjuan opened this issue Apr 29, 2020 · 8 comments · Fixed by #7262
Labels
effort/hours, exp/novice, good first issue, help wanted, kind/bug, kind/enhancement, P2, status/ready, topic/gateway

Comments

@hsanjuan
Contributor

Version information: 0.5.0

Description:

It seems http.DetectContentType does not work well for these types.

We could switch to https://github.com/gabriel-vasile/mimetype ? (Also see https://stackoverflow.com/a/52266327 for more options.) I am unsure whether there is a reason we have stuck with the current detector.

Related #2164

Reported at https://discuss.ipfs.io/t/bugs-in-0-5-0-regarding-file-names-types/7845

@hsanjuan added the kind/bug and need/triage labels on Apr 29, 2020
@Stebalien
Member

Sounds like a great idea!

@hsanjuan added the exp/novice, effort/hours, good first issue, help wanted, kind/enhancement, P2, status/ready, and topic/gateway labels and removed the need/triage label on Apr 29, 2020
@xmaysonnave

Definitely room for improvement.
I ran some experiments with gz content. Nginx is able to gzip content, and modern browsers properly decompress gzipped HTML when it is served appropriately.
I would really like to proxy the proper content type. The filename trick while serving data blocks seems like a good avenue to investigate: /ipfs/cid?filename=whatever.gz
However, when doing this, browsers open a Save As dialog box.
Thanks.

@hsanjuan
Contributor Author

At first sight, it seems you are mixing up Content-Encoding and Content-Type, @xmaysonnave? For browsers to decompress on the fly, I think Content-Type should still be text/html, and Content-Encoding should be gzip.

If you add a gzip file directly to ipfs... well, the content type will be gzip.

@xmaysonnave

@hsanjuan Thanks for your fast reply. You're right, it probably needs more detail.
Here is my understanding (I could be wrong, do not hesitate to correct me).
It really depends on the front-end. I ran some quick tests with Infura (I have no idea how their front-end/proxy is set up, though).

1 - Added from js-ipfs-http-client in block mode, no filename specified.
From my nginx (all tests were made in incognito mode, no cache):
https://ipfs.bluelightav.org/ipfs/bafybeif2cdpb7jbcaeuzohsmsegvilsjmki4lmly4wqhmviudul23uwlra
Response-Header:
Content-Encoding: gzip
Content-Type: text/html

https://ipfs.bluelightav.org/ipfs/bafybeif2cdpb7jbcaeuzohsmsegvilsjmki4lmly4wqhmviudul23uwlra?filename=index.html
Response-Header
Content-Encoding: gzip
Content-Type: text/html

From Infura (no Content-Encoding in any of the tests)
https://ipfs.infura.io/ipfs/bafybeicvunyrpfm5kcgsoldsfd7olgiogq5zlkviiokx7e3hekfe32nmue
Response-Header
content-type: text/html; charset=utf-8

https://ipfs.infura.io/ipfs/bafybeicvunyrpfm5kcgsoldsfd7olgiogq5zlkviiokx7e3hekfe32nmue?filename=index.html
Response-Header
content-type: text/html; charset=utf-8

Now the test with an index.html.gz
https://ipfs.infura.io/ipfs/QmQ2x72Nw9oDhrPckfdbbjBEc6WiB3gqrnRZqmqxHMdmVS?filename=index.html
Response-Header
content-type: text/html; charset=utf-8
The browser displays the content without decompressing it.

https://ipfs.infura.io/ipfs/QmQ2x72Nw9oDhrPckfdbbjBEc6WiB3gqrnRZqmqxHMdmVS?filename=index.html.gz
Response-Header
content-type: application/x-gzip
The browser opens a Save As dialog box.

2 - Added an index.html.gz on my local server with curl
curl -X POST -F file=@/work/tiddly/tiddlywiki-ipfs/wiki/index.html.gz "http://localhost:5001/api/v0/add?pin=true"

From my local web server then:
https://ipfs.bluelightav.org/ipfs/QmeeqFYbLabqZA2KjmFTCRfAVpv4kjgRHNistw63V6Jp4X
Response-Header
Content-Type: application/x-gzip
I immediately get a Save As dialog box; however, Chrome suggests the following filename:
QmeeqFYbLabqZA2KjmFTCRfAVpv4kjgRHNistw63V6Jp4X.gz

https://ipfs.bluelightav.org/ipfs/QmeeqFYbLabqZA2KjmFTCRfAVpv4kjgRHNistw63V6Jp4X?filename=index.html.gz
Response Header
Content-Type: application/gzip

Note that the content type is different.
I get a Save As dialog box suggesting index.html.gz.

I hope this clarifies the situation. Once I start serving gz content, I no longer get any Content-Encoding.

Maybe the current issue is not related.

Thanks
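
One plausible reading of the results above is that the content type comes from the filename extension when one is supplied and from sniffing the bytes otherwise. Go's http.ServeContent documents that order (extension first, then the first block of content through http.DetectContentType); whether each gateway in these tests follows exactly that path is an assumption. The helper contentTypeFor below is a hypothetical name for this sketch.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"mime"
	"net/http"
	"path/filepath"
)

// gzipBytes (hypothetical helper) compresses b in memory.
func gzipBytes(b []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(b); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// contentTypeFor (hypothetical helper) prefers the type implied by the file
// name's extension and falls back to sniffing the first bytes of the content.
func contentTypeFor(name string, head []byte) string {
	if ct := mime.TypeByExtension(filepath.Ext(name)); ct != "" {
		return ct
	}
	return http.DetectContentType(head)
}

func main() {
	gz, err := gzipBytes([]byte("<html><body>hello</body></html>"))
	if err != nil {
		panic(err)
	}
	// No filename: only the gzip magic bytes are available.
	fmt.Println("no filename:  ", contentTypeFor("", gz))
	// A .html filename wins even though the bytes are gzip.
	fmt.Println("index.html:   ", contentTypeFor("index.html", gz))
	// The .gz mapping comes from the system MIME tables, so it can vary.
	fmt.Println("index.html.gz:", contentTypeFor("index.html.gz", gz))
}
```

Sniffing gzip bytes with the standard library yields application/x-gzip, while an extension lookup may yield application/gzip depending on the host's MIME tables, which would be consistent with the two different values observed with and without the filename parameter.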

@hsanjuan
Contributor Author

hsanjuan commented Apr 30, 2020

If you add a gzipped file, then it is a gzip file. The go-ipfs gateway does not guess that you want it to detect the type of the compressed content and use that as the Content-Type while streaming the original gzip. This is a feature that can probably be discussed in a different issue (unrelated to this one).

It opens a number of questions: accessing /ipfs/Qmxxx on the gateway with the browser, saving it, and adding it to IPFS would result in a different hash because decompression magic has happened, which is weird. Compression should probably be handled at the nginx level, like you do (it seems Infura doesn't).

But in any case -> open a new issue for that please.

@gowthamgts
Contributor

Can I work on this issue if it's not already assigned?

@gowthamgts
Contributor

gowthamgts commented May 1, 2020

Opened a PR: #7262

@xmaysonnave

> If you add a gzipped file, then it is a gzip file. The go-ipfs gateway does not guess that you want it to detect the type of the compressed content and use that as the Content-Type while streaming the original gzip. This is a feature that can probably be discussed in a different issue (unrelated to this one).
>
> It opens a number of questions: accessing /ipfs/Qmxxx on the gateway with the browser, saving it, and adding it to IPFS would result in a different hash because decompression magic has happened, which is weird. Compression should probably be handled at the nginx level, like you do (it seems Infura doesn't).
>
> But in any case -> open a new issue for that please.

@hsanjuan Thanks for your feedback. I'll open a feature request.

ralendor pushed a commit to ralendor/go-ipfs that referenced this issue Jun 6, 2020
ralendor pushed a commit to ralendor/go-ipfs that referenced this issue Jun 8, 2020
4 participants