wrong HTTP Content-Type for tar.gz files, breaking cabal-install behind some proxies. #615

bos opened this Issue May 24, 2012 · 6 comments

Haskell member

(Imported from Trac #622, reported by guest on 2010-01-03)

As can be seen here,

> HEAD http://hackage.haskell.org/packages/00-index.tar.gz
Content-Encoding: x-gzip
Content-Length: 1398234
Content-Type: application/x-tar

the file is declared as a tar file with a gzip content encoding.

This breaks with at least one virus-scanning proxy, namely AvkHttp (part of InternetSecurity by G DATA):

> cabal update -v3
Downloading the latest package list from hackage.haskell.org
GET /packages/archive/00-index.tar.gz HTTP/1.1
Host: hackage.haskell.org
User-Agent: cabal-install/0.8.0
Creating new connection to hackage.haskell.org
HTTP/1.1 200 OK
AvkHttp: timeout-protection
AvkHttp: timeout-protection
Date: Sun, 03 Jan 2010 09:52:53 GMT
Server: Apache/2.2.3 (Debian)
Last-Modified: Sun, 03 Jan 2010 09:38:56 GMT
ETag: "388d94-1555da-5fa0cc00"
Accept-Ranges: bytes
Content-Length: 17571840
Content-Type: application/x-tar
Downloaded to /home/nils/.cabal/packages/hackage.haskell.org/00-index.tar.gz
cabal: Codec.Compression.Zlib: incorrect header check

As you can see, the proxy unpacked the data and dropped the Content-Encoding header, so cabal-install tries to decompress a body that is no longer gzipped and fails.

https://bugs.launchpad.net/malone/+bug/173096 describes a similar problem, and has some discussion about why using Content-Encoding in this way is useful in some cases but harmful in others.

(Imported comment by guest on 2010-01-03)

After reading RFC 2616, it becomes clear that the proxy is buggy.

A transparent proxy (and a virus checker ought to be one) is not allowed to change Content-Encoding. Even for non-transparent proxies, http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.2 says this about Content-Encoding, Content-Range and Content-Type:

A non-transparent proxy MAY modify or add these fields to a message that does not include no-transform, but if it does so, it MUST add a Warning 214 (Transformation applied) if one does not already appear in the message (see section 14.46).

There's no such warning in the reply above.

I still think the headers should be different, but it doesn't seem to be a bug on the hackage side. Perhaps adding a Cache-Control: no-transform header to the HTTP requests from cabal-install would be a good idea as well.
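As an illustration of the no-transform suggestion, here is a minimal sketch in Python rather than cabal-install's own Haskell. The function name `build_index_request` and the `INDEX_URL` constant are hypothetical; the URL is the one from the GET in the log above. Note that a proxy that already ignores RFC 2616 may well ignore this directive too.

```python
from urllib.request import Request

# URL taken from the GET line in the log above.
INDEX_URL = "http://hackage.haskell.org/packages/archive/00-index.tar.gz"


def build_index_request(url=INDEX_URL):
    """Build a GET request that asks intermediaries not to alter the body."""
    req = Request(url, method="GET")
    # RFC 2616 section 14.9.5: with no-transform present, a proxy must not
    # change Content-Encoding, Content-Range or Content-Type.
    req.add_header("Cache-Control", "no-transform")
    return req
```

The request object is only constructed here; sending it (for example with `urllib.request.urlopen`) is left out so the sketch has no network dependency.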

(Imported comment by @dcoutts on 2010-01-03)

So it's clear that the Hackage server is sending the correct headers. I think the solution is for cabal-install to be slightly smarter and look at both the Content-Type and the Content-Encoding. The behaviour should be:

Content-Type       | Content-Encoding | Action
application/x-gzip | (none)           | decompress
application/x-tar  | x-gzip           | decompress
application/x-tar  | (none)           | none
_                  | _                | error

That's the ideal. In practice we probably need to add various other non-standard aliases like application/octet-stream, application/x-tar-gz etc.
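The table can be written down directly as a small decision function. This is a sketch in Python, not cabal-install code; the name `plan_action` is hypothetical. One assumption goes beyond the table: "gzip" is accepted alongside "x-gzip", since RFC 2616 section 3.5 treats the two as equivalent. The non-standard aliases mentioned above are deliberately left out.

```python
# Assumption beyond the table: RFC 2616 section 3.5 says "x-gzip" should be
# treated as equivalent to "gzip", so both are accepted here.
GZIP_ENCODINGS = {"gzip", "x-gzip"}


def plan_action(content_type, content_encoding=None):
    """Return "decompress" or "none"; raise ValueError for any other combination."""
    # Drop any media-type parameters and normalise case.
    ctype = (content_type or "").split(";")[0].strip().lower()
    encoding = (content_encoding or "").strip().lower()

    if ctype == "application/x-gzip" and encoding == "":
        return "decompress"
    if ctype == "application/x-tar" and encoding in GZIP_ENCODINGS:
        return "decompress"
    if ctype == "application/x-tar" and encoding == "":
        return "none"
    # The catch-all row of the table.
    raise ValueError(
        "unexpected Content-Type %r with Content-Encoding %r"
        % (content_type, content_encoding)
    )
```

For the headers in the original report, `plan_action("application/x-tar", "x-gzip")` gives "decompress", while the proxied response (`application/x-tar` with no Content-Encoding) gives "none".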

(Imported comment by finlay on 2010-01-06)

Replying to @dcoutts:

In the second case, can cabal-install be more tolerant of broken proxies (#686): try decompressing and, on failure, check whether the data has already been decompressed?
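A rough sketch of that fallback, in Python rather than cabal-install's Haskell, and not taken from any actual patch: try gzip first, and only if that fails accept the body when it parses as a plain tar archive. The function name `index_tar_bytes` is hypothetical.

```python
import gzip
import io
import tarfile
import zlib


def index_tar_bytes(body):
    """Return uncompressed tar bytes from a body that may or may not be gzipped."""
    try:
        return gzip.decompress(body)
    except (OSError, EOFError, zlib.error):
        # Not valid gzip data; fall through and test for plain tar.
        pass

    try:
        # Mode "r:" means "uncompressed tar only", so tarfile does not
        # silently retry other compression formats.
        with tarfile.open(fileobj=io.BytesIO(body), mode="r:"):
            pass
    except tarfile.TarError as exc:
        raise ValueError("body is neither gzip nor a readable tar archive") from exc
    return body
```

Opening the archive only validates the first header, which is enough to tell a tar file from, say, an HTML error page returned by a proxy.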

(Imported comment by adept on 2010-10-20)

Just for reference: this behavior can be reproduced locally on a Linux box by installing "ziproxy" and specifying "MaxSize = 0" and "Gzip = false" in its config.

(Imported comment by @dcoutts on 2010-10-27)

Wed Oct 27 10:50:34 BST 2010  Duncan Coutts <duncan@haskell.org>

So we now are quite tolerant, perhaps too tolerant. I'm leaving this ticket open because I think the "right thing" is to be slightly more paranoid. I think we should use the headers as described above to see what we expect to happen. In the case that we expected it to be compressed but it is actually not compressed, we should handle it as we do now, but perhaps in an even more paranoid way (e.g. checking for gzip magic number, checking for a valid tar header), and perhaps also emitting a warning about the broken proxy.
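To make the "more paranoid" idea concrete, here is a hedged sketch in Python (not cabal-install code; `sniff` and `check_body` are hypothetical names). It checks the gzip magic number (bytes 1f 8b) and the "ustar" magic at offset 257 of the first tar header, then compares the result with what the headers promised. It assumes the archive uses a ustar-style header (POSIX or GNU); old V7 tar files carry no magic and would be reported as unknown.

```python
GZIP_MAGIC = b"\x1f\x8b"
TAR_MAGIC = b"ustar"      # shared prefix of the POSIX and GNU magic fields
TAR_MAGIC_OFFSET = 257    # position of the magic field in a tar header block


def sniff(body):
    """Classify a body as "gzip", "tar" or "unknown" from its magic bytes."""
    if body[:2] == GZIP_MAGIC:
        return "gzip"
    if body[TAR_MAGIC_OFFSET:TAR_MAGIC_OFFSET + len(TAR_MAGIC)] == TAR_MAGIC:
        return "tar"
    return "unknown"


def check_body(expect_gzip, body):
    """Return (kind, warning); warning is None when headers and body agree."""
    kind = sniff(body)
    if kind == "unknown":
        raise ValueError("body is neither gzip nor a ustar-style tar archive")
    warning = None
    if expect_gzip and kind == "tar":
        warning = ("headers promised gzip but the body is plain tar; "
                   "a proxy may have decompressed it")
    elif not expect_gzip and kind == "gzip":
        warning = "headers promised plain tar but the body is gzip"
    return kind, warning
```

The caller would then decompress only when `kind` is "gzip" and print the warning, if any, so the user learns about the misbehaving proxy.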

Given that there has been no activity since 2010, perhaps this is resolved? At any rate, I propose closing due to inactivity. Please re-open or create a new issue if this is still occurring.

/cc @tibbe

tibbe closed this Feb 24, 2015