simpleHttp gunzips retrieved tar-gzip file #30

Closed
erikd opened this Issue · 5 comments

Erik de Castro Lopo

If I do:

simpleHttp tgzUrl >>= L.writeFile storeLocation

where tgzUrl is the URL for a gzipped tar file, the written file ends up as just a plain tar file instead of the gzipped tar file I was hoping for.

Willing to fix this myself and send a pull request if you can point me in the right direction. Maybe a simpleHttpRaw function?

Michael Snoyman
Owner

It's funny you should mention this now; I ran into the same thing in the past few days. But as far as I can tell, http-enumerator is following the spec correctly here. If I download a .tar.gz from Hackage, for instance, I get the following headers:

HTTP/1.1 200 OK
Date: Mon, 29 Aug 2011 06:09:11 GMT
Server: Apache/2.2.9 (Debian) mod_python/3.3.1 Python/2.5.2
Last-Modified: Fri, 15 Jul 2011 08:08:34 GMT
ETag: "1f5a809-7ba2-4a81727e6e480"
Accept-Ranges: bytes
Content-Length: 31650
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-tar
Content-Encoding: x-gzip

The last two definitely imply that we should ungzip the output and return a tar file. I suppose we could add an extra option to never ungzip files ending in .gz, but that seems a bit hacky to me.

Erik de Castro Lopo

Yes, I agree that the Content-Encoding suggests that H.E should gunzip it, but if I download a tar.gz with a web browser I get a tar.gz file, not a tar file, so the current simpleHttp behaviour is surprising.

In this case (i.e. a tar.gz file), I think H.E should look at Content-Type as well as Content-Encoding and only gunzip the stream if the Content-Type is something like text or html.
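A minimal sketch of the rule proposed above, assuming the two header values are available as plain strings. The name `shouldGunzip` and the helper `mediaType` are hypothetical and not part of http-enumerator; "something like text or html" is interpreted here as any `text/*` media type, which is only one possible reading.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)

-- | Lower-case a header value and drop any parameters,
--   e.g. "Text/HTML; charset=utf-8" becomes "text/html".
mediaType :: String -> String
mediaType =
  map toLower . takeWhile (\c -> c /= ';' && c /= ' ') . dropWhile (== ' ')

-- | Hypothetical policy: gunzip only when the body is gzip-encoded
--   AND the Content-Type is textual (assumed here to mean text/*).
--   Arguments: Content-Type value, Content-Encoding value.
shouldGunzip :: Maybe String -> Maybe String -> Bool
shouldGunzip mType mEnc = isGzip mEnc && isTextual mType
  where
    isGzip    = maybe False ((`elem` ["gzip", "x-gzip"]) . mediaType)
    isTextual = maybe False (("text/" `isPrefixOf`) . mediaType)
```

With the Hackage headers quoted earlier (`application/x-tar` and `x-gzip`), this predicate is False, so the file would be written still gzipped, which matches what a browser download produces.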

Michael Snoyman
Owner

I sent an email to the Haskell cafe (http://www.haskell.org/pipermail/haskell-cafe/2011-August/094996.html) about this issue; let's see if anyone has a good suggestion.

Herbert Valerio Riedel
hvr commented

Please take into account that there's also the Transfer-Encoding header, which may specify gzip compression too. It describes how the message is transferred and does not pertain to the entity, whereas the Content-Encoding header is meta-information about the entity itself.
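The distinction can be sketched as a small decision function. This is an illustration under stated assumptions, not http-enumerator code: `Step`, `codings`, `hasGzip` and `decodeSteps` are hypothetical names, headers are modelled as a simple association list, and the Bool flag stands in for the "raw" option discussed in this issue. A gzip transfer coding is always removed, while a gzip content coding is removed only if the caller asks for the decoded entity.

```haskell
import Data.Char (toLower)

-- | Decoding steps a client might apply to a response body.
data Step = RemoveTransferGzip | RemoveContentGzip
  deriving (Eq, Show)

-- | Split a comma-separated header value into lower-cased codings,
--   e.g. "gzip, chunked" becomes ["gzip", "chunked"].
codings :: String -> [String]
codings s = case break (== ',') s of
  (c, [])       -> [clean c]
  (c, _ : rest) -> clean c : codings rest
  where
    clean = map toLower . filter (/= ' ')

-- | Does the named header (given in lower case) list a gzip coding?
hasGzip :: String -> [(String, String)] -> Bool
hasGzip name hs =
  any (`elem` ["gzip", "x-gzip"])
      (concatMap codings [v | (k, v) <- hs, map toLower k == name])

-- | The flag says whether the caller wants the entity decoded.
--   Transfer codings belong to the message, so they are undone regardless.
decodeSteps :: Bool -> [(String, String)] -> [Step]
decodeSteps wantEntityDecoded hs =
  [RemoveTransferGzip | hasGzip "transfer-encoding" hs]
    ++ [RemoveContentGzip | wantEntityDecoded && hasGzip "content-encoding" hs]
```

Under this sketch, a raw download of the Hackage tarball (`decodeSteps False` with only `Content-Encoding: x-gzip`) performs no decoding at all, so the .tar.gz arrives intact.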

Btw, why the x- in Content-Encoding: x-gzip? gzip is the proper registered name for gzip compression afaik...

Michael Snoyman
Owner

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.5 specifies that x-gzip is the same as gzip.
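In code, that equivalence amounts to normalising the coding name before comparing it. A small sketch follows; `canonicalCoding` is a hypothetical helper name, and the `x-compress` case reflects the same section of RFC 2616, which treats it as equivalent to `compress`. Content-coding values are case-insensitive, hence the lower-casing.

```haskell
import Data.Char (toLower)

-- | Map a content-coding token to its registered name.
--   Unknown codings are returned lower-cased but otherwise unchanged.
canonicalCoding :: String -> String
canonicalCoding raw =
  case map toLower raw of
    "x-gzip"     -> "gzip"
    "x-compress" -> "compress"
    other        -> other
```

A client can then test `canonicalCoding value == "gzip"` once instead of checking both spellings wherever the header is inspected.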
