Hackage doesn't show latest uploads #573

Closed
sjakobi opened this issue Mar 1, 2017 · 32 comments

Comments

@sjakobi
Member

sjakobi commented Mar 1, 2017

Reposting my comment from Reddit in case you haven't seen it yet:

https://www.reddit.com/r/haskell/comments/5wpkhh/heads_up_short_hackage_downtime_today_30_minutes/decpxvo/

@sjakobi
Member Author

sjakobi commented Mar 1, 2017

tasty-stats-0.2.0.2 also isn't listed in the recent uploads although https://www.stackage.org/diff/nightly-2017-02-28/nightly-2017-03-01 indicates that it has been uploaded on 2017-02-28.

[Screenshot: bildschirmfoto_2017-03-01_05-08-49]

@bitemyapp
Contributor

bitemyapp commented Mar 1, 2017

At a guess, CDN. Seeing newer results than that is a little confusing, though.

@gbaz
Contributor

gbaz commented Mar 1, 2017

Sadly, not CDN. Munging the URL doesn't change the results. There may have been a brief window while drives were being swapped where uploads were missed, although the procedure for the swap was developed to avoid this eventuality.

Checking the docbuilder, I see that between slack-web and mediabus, the packages that were uploaded but don't appear in the above listing were:

intro-0.1.0.9
tasty-stats-0.2.0.2
tasty-jenkins-xml-0.1.0.0
general-games-1.0.0
git-annex-6.20170228

We should probably contact the authors of those packages and warn them that they will need to upload again.

@gbaz
Contributor

gbaz commented Mar 1, 2017

@sjakobi
Member Author

sjakobi commented Mar 1, 2017

I don't understand how tasty-stats-0.2.0.2 made it into Stackage. cc @snoyberg.

@gbaz
Contributor

gbaz commented Mar 1, 2017

Same way it made it into the docbuilder so I could see it in the logs... it was uploaded properly to the "old" hackage and processed, and somehow the snapshot that the "new" hackage was set up with wasn't fully up-to-date. :-/

I hope there are no ramifications with hackage security for this, in terms of the induced rollback and rebranch of the signing. cc: @edsko and @dcoutts ...

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

The package made it into all-cabal-hashes, which is how it's included in Stackage right now: https://github.com/commercialhaskell/all-cabal-hashes/tree/hackage/tasty-stats/0.2.0.2.

The next version of Stack (and the Stackage build tool) will be using Hackage Security for getting the package list instead of all-cabal-hashes. That should mean better compatibility with any upstream issues going on with Hackage in the future. I'm not sure if "bug for bug compatibility" is a good thing or a bad thing here, simply stating the facts :)

@liskin

liskin commented Mar 1, 2017

Haha. They say you can't delete packages uploaded to Hackage and here I am, not uploading tasty-jenkins-xml again, pretending it never happened. :-)

(the full story is that in between writing the code and publishing on Hackage, a better solution appeared and my package is not necessary/useful any more)

@minad

minad commented Mar 1, 2017

Thx for looking into this! I re-uploaded the package versions again.

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

I just discovered that our all-cabal-hashes mirror is no longer updating, due to mismatched hashes:

ERROR: Received an unexpected exception while updating repositories: Mismatched hashes between S3 and Hackage: ("intro-0.1.0.9",(fromList [("MD5","aac49faf1c1657353926fbd9cd9167aa"),("SHA1","a36512c427da78f0262ce224ac143faa1bc92030"),("SHA256","963620591271c3a8e39cfc6b222b4ea9266da0121887298de7e6e59962d422ae"),("SHA512","7417b5bfc33286e6c3d5dcf75322e175d97953f24e3da59a15467c33fb4dcc59708be250046b6f15b1855b8e9a37b2089fd534432ce30c16a8b6226fd09368e9"),("Skein512_512","9997794dcf9564dfca22b203d82980842669ee409306b8f4015d3c05ad96ab929df5bc1cedac1763e762424290d8b24a73a68c4f63c03a3fd8958332dbc7ed3f")],11646),(fromList [("MD5","31b00b9e3327c58aa85c0d0421f0588e"),("SHA1","d5f6f18bde2954da93cf14f936ae68b627ed2818"),("SHA256","f714191d5e7f342c0504a77e7d9276a89ad585c0d9529a62b27c17a5901ecafb"),("SHA512","c5f1f29441fb69fb04c160f42522f438a7fce669961fc9e28b4f250d1708839c7fa8e001a7b84373a476a6ed5559914c1dbf468c8105dc52da597120519d45f1"),("Skein512_512","235482196c81457b1d22d8b8d881a3dd498383c6c2204ba104022fc53c8aa646ede98088394884cc0251c9c7710c79489abeba651f7963849b3ecf71745fb6c5")],11646))

This is becoming more worrying for me, as Hackage has reported two different versions of the same packages to the Hackage mirror tool. (Pinging @hvr, who may be affected as well.) I'm not really sure what the right thing to do here is. It seems like the simplest is to delete the tarball from the S3 mirror and hope everything rights itself.

@minad

minad commented Mar 1, 2017

@snoyberg Oops, sorry! Maybe I should have waited a little longer or just skipped this version.

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

@minad Were there actually differences in the two tarballs you uploaded?

@minad

minad commented Mar 1, 2017

I don't think so. I used "stack upload ." both times. Will this create different hashes each time? Maybe because of timestamps?

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

Yeah, it's probably timestamps. I'll confirm shortly...

@minad

minad commented Mar 1, 2017

Cool, thank you!

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

Yup, that's the case:

$ diff -r hackage s3/
$ ls -l hackage/intro-0.1.0.9/
total 16
-rw-r--r-- 1 michael staff 1081 Feb  4 16:07 LICENSE
-rw-r--r-- 1 michael staff 1491 Feb 14 17:42 README.md
-rw-r--r-- 1 michael staff   46 Dec 24 10:01 Setup.hs
-rw-r--r-- 1 michael staff 3749 Feb 28 19:26 intro.cabal
drwxr-xr-x 4 michael staff  136 Feb 28 15:27 src
drwxr-xr-x 5 michael staff  170 Jan  9 21:20 test
$ ls -l s3/intro-0.1.0.9/
total 16
-rw-r--r-- 1 michael staff 1081 Feb  4 16:07 LICENSE
-rw-r--r-- 1 michael staff 1491 Feb 14 17:42 README.md
-rw-r--r-- 1 michael staff   46 Dec 24 10:01 Setup.hs
-rw-r--r-- 1 michael staff 3749 Feb 28 19:26 intro.cabal
drwxr-xr-x 4 michael staff  136 Feb 28 23:38 src
drwxr-xr-x 5 michael staff  170 Jan  9 21:20 test

Note the different timestamp on src. I can't think of a solution to this except manually replacing the S3 tarball and hoping this doesn't negatively impact anything else.

Given this, I'd recommend that anyone else planning on reuploading use a new package version. It would be nice if Hackage could somehow block reuploads of these name/version combos too.
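
The timestamp effect described above can be reproduced in isolation. The following is only a minimal sketch, not what `stack upload` does internally: it builds two uncompressed tar archives in memory with identical file contents and a `src` directory entry whose modification time differs. The package name `pkg-0.1`, the file contents, and both timestamp values are hypothetical.

```python
import hashlib
import io
import tarfile


def build_tar(files, dir_mtime):
    """Build an uncompressed tar archive in memory.

    files maps path -> bytes. Every file gets mtime 0, so the only
    metadata that varies between archives is the directory timestamp.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # Directory entry for src, carrying the caller-supplied mtime.
        src = tarfile.TarInfo("pkg-0.1/src")
        src.type = tarfile.DIRTYPE
        src.mode = 0o755
        src.mtime = dir_mtime
        tar.addfile(src)
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(path)
            info.size = len(data)
            info.mtime = 0
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


# Hypothetical package contents and timestamps, for illustration only.
files = {"pkg-0.1/src/Lib.hs": b"module Lib where\n"}
first = build_tar(files, dir_mtime=1488295620)
second = build_tar(files, dir_mtime=1488325080)

print(len(first) == len(second))                # True: same archive size
print(sha256_hex(first) == sha256_hex(second))  # False: digests differ
```

This mirrors the error message earlier in the thread, where both tarballs had the same length (11646) but different hashes.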

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

I think we have a serious problem on Hackage right now. I just ran this script:

if [ ! -f 01-index.tar.gz ]
then
  wget https://hackage.haskell.org/01-index.tar.gz
fi

if [ ! -f intro-0.1.0.9.tar.gz ]
then
  wget https://hackage.haskell.org/package/intro-0.1.0.9.tar.gz
fi

tar zxfv 01-index.tar.gz intro/0.1.0.9/package.json
cat intro/0.1.0.9/package.json
echo
md5sum intro-0.1.0.9.tar.gz
sha256sum intro-0.1.0.9.tar.gz

The output was:

intro/0.1.0.9/package.json
{"signatures":[],"signed":{"_type":"Targets","expires":null,"targets":{"<repo>/package/intro-0.1.0.9.tar.gz":{"hashes":{"md5":"aac49faf1c1657353926fbd9cd9167aa","sha256":"963620591271c3a8e39cfc6b222b4ea9266da0121887298de7e6e59962d422ae"},"length":11646}},"version":0}}
31b00b9e3327c58aa85c0d0421f0588e  intro-0.1.0.9.tar.gz
f714191d5e7f342c0504a77e7d9276a89ad585c0d9529a62b27c17a5901ecafb  intro-0.1.0.9.tar.gz

Notice how the checksums differ between the package.json file and what I've calculated from the actual download. In fact, it appears that the original checksums that all-cabal-tool was complaining about (which are also the ones originally present on the S3 mirror) are still present in package.json, and in conflict with what Hackage is now serving. Looks like there was an incomplete sync.
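
The manual comparison above could be automated along these lines. This is a rough sketch that assumes the `package.json` layout shown in the output above (a single entry under `signed.targets` with `hashes` and `length`). The function `check_tarball` and the helper `make_package_json` are hypothetical names, and the byte strings stand in for real tarballs.

```python
import hashlib
import json


def check_tarball(package_json_text, tarball_bytes):
    """Return a list of (field, expected, actual) mismatches.

    An empty list means the tarball matches the index metadata.
    """
    doc = json.loads(package_json_text)
    targets = doc["signed"]["targets"]
    # Assumption: a per-package file describes exactly one tarball.
    meta = next(iter(targets.values()))
    mismatches = []
    if meta["length"] != len(tarball_bytes):
        mismatches.append(("length", meta["length"], len(tarball_bytes)))
    for algo, expected in meta["hashes"].items():
        actual = hashlib.new(algo, tarball_bytes).hexdigest()
        if actual != expected:
            mismatches.append((algo, expected, actual))
    return mismatches


def make_package_json(name, data):
    """Hypothetical helper: build metadata shaped like the output above."""
    target = {
        "hashes": {
            "md5": hashlib.md5(data).hexdigest(),
            "sha256": hashlib.sha256(data).hexdigest(),
        },
        "length": len(data),
    }
    return json.dumps({
        "signatures": [],
        "signed": {
            "_type": "Targets",
            "expires": None,
            "targets": {"<repo>/package/" + name: target},
            "version": 0,
        },
    })


# Stand-ins for the original upload and the re-upload (same length).
original = b"tarball-one"
reupload = b"tarball-two"
index_entry = make_package_json("intro-0.1.0.9.tar.gz", original)

print(check_tarball(index_entry, original))                     # []
print([field for field, _, _ in check_tarball(index_entry, reupload)])
```

With the stand-in data, the second call reports the `md5` and `sha256` fields as mismatched while the length check passes, which is the same pattern as in the real output.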

@dcoutts
Contributor

dcoutts commented Mar 1, 2017

So I don't yet know why we have different copies of certain packages, though almost certainly related to the server move, but the current inconsistency between the content of the intro-0.1.0.9.tar.gz and its expected content from the index is due to CDN caching.

Compare

At the time of writing, the 1st has the content that @snoyberg says above, while the 2nd has the expected content.

As an experiment we're purging the CDN cache for that file to check, so the above will likely no longer be true for other observers.

We should identify all the inconsistencies and check what is going on with them, to check if there's anything suspicious or otherwise try to identify the cause of the problem during the server move.

One thing to note is that the new server does have the content stored (in its blob store) for both the versions of the intro-0.1.0.9.tar.gz file we're talking about here.

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

Thanks @dcoutts, I can confirm that using hackage-origin solves the mismatched hash info (I must have gotten "original" and "new" confused in my previous comments). I've manually uploaded the correct version to our S3 mirror, hopefully that will suffice for the all-cabal-tool job to be satisfied.

@alanz

alanz commented Mar 1, 2017

The Hackage move happened in the middle of the S3 outage; maybe that is where the CDN corruption came in.

@snoyberg
Contributor

snoyberg commented Mar 1, 2017

@alanz My understanding of things is that the CDN is caching the original version of the intro package, which was uploaded to the old server and not synced to the new server. When the new server came online, it had no record of that intro release, so a new release was accepted. That new release had a different checksum. As @dcoutts has now demonstrated, Hackage is serving a matching 01-index.tar file and intro-0.1.0.9.tar.gz file (contrary to my claim above). The mismatched checksums in my demonstration were due to the CDN holding onto the original intro-0.1.0.9.tar.gz file.

tl;dr: I don't think the S3 outage has anything to do with this. The CDN isn't corrupt, it's just serving older content.

@joeyh

joeyh commented Mar 1, 2017 via email

@cgorski

cgorski commented Mar 1, 2017

Re-uploaded general-games under new minor version number 1.0.1

@dcoutts
Contributor

dcoutts commented Mar 1, 2017

Ok, so:

  • intro-0.1.0.9 currently has a mismatch between CDN and origin server, but origin server reports correct content
  • tasty-stats-0.2.0.2 still present with the expected content hashes
  • tasty-jenkins-xml-0.1.0.0 still missing
  • general-games-1.0.0 was missing, now uploaded under a new version number
  • git-annex-6.20170228 was missing, now uploaded under a new version number

On the intro-0.1.0.9 mismatch issue, it is interesting to note that service for end users using hackage-security has not been affected because it automatically selects another mirror (you can see this with cabal fetch -v3). Obviously this does not help for the mirrors themselves, but these should be using the hackage-origin DNS name to bypass the CDN.
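
For mirror tooling, the advice above amounts to rewriting download URLs so they point at the origin host rather than the CDN. A small sketch follows. The thread only mentions the "hackage-origin DNS name", so the full hostname `hackage-origin.haskell.org` and the function name `to_origin` are assumptions here, as is the idea that the origin serves the same paths.

```python
from urllib.parse import urlsplit, urlunsplit

CDN_HOST = "hackage.haskell.org"
# Assumed full hostname; the thread only says "hackage-origin".
ORIGIN_HOST = "hackage-origin.haskell.org"


def to_origin(url):
    """Point a Hackage URL at the origin host; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.hostname != CDN_HOST:
        return url
    return urlunsplit(parts._replace(netloc=ORIGIN_HOST))


print(to_origin("https://hackage.haskell.org/package/intro-0.1.0.9.tar.gz"))
# https://hackage-origin.haskell.org/package/intro-0.1.0.9.tar.gz
```

A mirror job could apply this to both the index URL and each tarball URL before fetching, so that a stale CDN entry cannot disagree with the index.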

@sjakobi
Member Author

sjakobi commented Mar 2, 2017

Would you mind writing a post-mortem once the investigation is complete? I'm mostly interested in precisely how this went wrong and how you'll try to prevent it from happening again.

@cgorski

cgorski commented Mar 2, 2017 via email

@gbaz
Contributor

gbaz commented Mar 2, 2017

From hvr's more complete list, also missing from that window was machines-amazonka. I'll make sure to inform the author (who doesn't seem to be on github). And that should cover it. Eeesh :-)

@theunixman

I'm reuploading machines-amazonka now, and I've got a GH account but rarely use it. Thanks!

@sjakobi
Member Author

sjakobi commented Mar 2, 2017

@gbaz

From hvr's more complete list

What are you referring to here?

@gbaz
Contributor

gbaz commented Mar 2, 2017

He had logs from matrix.hackage that included a single further package, and forwarded them to me offthread.

@sjakobi
Member Author

sjakobi commented Mar 2, 2017

He had logs from matrix.hackage that included a single further package, and forwarded them to me offthread.

Ah, nice! Good to hear that hvr is active, I'd been wondering where he'd been.

@gbaz
Contributor

gbaz commented Jul 14, 2017

The incident with this particular transition I think was resolved by the end of this thread. Going to close this ticket as part of cleaning things up.
