Not having an up-to-date package list causes indexing out of bounds error #262

Closed
bos opened this Issue May 24, 2012 · 3 comments


bos commented May 24, 2012

(Imported from Trac #269, reported by guest on 2008-04-11)

Not having an up-to-date package list causes

cabal: Data.ByteString.Lazy.index: index too large: 0
(.. which is a bad error message.)

bos commented May 24, 2012

(Imported comment by @dcoutts on 2008-04-11)

We think this is due to truncated downloads. We've got a sample 00-index.tar.gz that elicits this problem.

So there are two things to do:

  • work out where it's failing when reading truncated files (e.g. is it the gunzip or the tar format parsing that fails)
  • detect truncated downloads and decide what to do with them.
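The second bullet amounts to comparing what the server promised against what actually arrived. A minimal sketch in Python (illustrative only, not cabal's Haskell HTTP code; the `check_complete` helper and dict-shaped headers are assumptions for the example):

```python
def check_complete(headers: dict, body: bytes) -> None:
    """Raise if the response body is shorter than the server promised.

    Hypothetical helper: if the HTTP library does not report short
    reads itself, the caller can compare the Content-Length header
    against the number of bytes actually received.
    """
    expected = headers.get("Content-Length")
    if expected is not None and len(body) != int(expected):
        raise IOError(
            f"truncated download: got {len(body)} of {expected} bytes")
```

When the header is absent (e.g. chunked transfer encoding) there is nothing to compare, so the check simply passes.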

bos commented May 24, 2012

(Imported comment by @dcoutts on 2008-08-20)

I think what happens is that when we get a short download we still save it to disk. Then when we gunzip it, decompression fails once gzip notices that the file is short. However, we were already writing the output file, so the part that had been written remains on disk. If the user does not notice the error, they'll continue, and we end up trying to parse the truncated tar file.

So there are three cases of bad error handling here:

  • The HTTP lib is not notifying us that the download was shorter than expected. Perhaps we're expected to do this manually.
  • When we decompress the .tar.gz, we should do it into a temp file and only overwrite the target if the decompress succeeds.
  • The tar code needs auditing to check that it handles corrupt tar files correctly. There's newer tar code in the hackage-server that we could test with and steal if necessary.
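The second point above is the classic atomic-write pattern: decompress into a temporary file and rename it over the target only once decompression has fully succeeded. A sketch in Python (illustrative only, not cabal's actual code; a truncated `.gz` makes the `gzip` module raise before the rename, so the old target is never clobbered by partial output):

```python
import gzip
import os
import tempfile


def decompress_atomically(src: str, dest: str) -> None:
    """Decompress the .gz file at src to dest without ever leaving
    a partially written dest behind.

    Write into a temp file in the same directory, then rename it
    over dest only on success; os.replace is atomic on POSIX.
    """
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".")
    try:
        with os.fdopen(fd, "wb") as out, gzip.open(src, "rb") as gz:
            # Stream in chunks; a short input raises EOFError here,
            # before dest is touched.
            while chunk := gz.read(64 * 1024):
                out.write(chunk)
        os.replace(tmp, dest)
    except BaseException:
        os.unlink(tmp)  # discard the partial temp file
        raise
```

If decompression fails, the exception propagates to the caller and `dest` keeps whatever contents it had before.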

bos commented May 24, 2012

(Imported comment by @dcoutts on 2008-08-22)

Fixed points 2 and 3 above. This should be enough to give much better behaviour in the case of a truncated download.

Sat Aug 23 00:00:33 BST 2008  Duncan Coutts <duncan@haskell.org>
- Decompress the repo index atomically.
  So if decompression fails (e.g. if the index is corrupt) then
  the decompressed file does not get (partially) written.

Sun Aug 24 19:05:01 BST 2008  Duncan Coutts <duncan@haskell.org>
- Use updated tar code
  Much more robust. Correctly detects truncated archives.
  
For the remaining point see #338.

@bos bos closed this May 24, 2012
