All DocPad releases since November 28, 2013 have disappeared!!! #4596

Closed
balupton opened this Issue Feb 3, 2014 · 14 comments

6 participants

@balupton
balupton commented Feb 3, 2014

Big problem: a whole bunch of DocPad releases have disappeared within the last 24 hours. I was on a plane, landed, and found everyone reporting that they can't get the latest DocPad release. Running npm info docpad shows none of the releases since November 28, 2013:

http://pastebin.com/raw.php?i=ggVDFSeV
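For anyone wanting to pin down exactly which releases are gone, one option is to diff the repository's git tags against the version list the registry reports. The sketch below assumes tags are named like `v6.63.3` and that both lists have already been collected (for example from `git tag --list` and `npm info docpad versions --json`); the helper names `parse_version` and `missing_versions` and the sample data are made up for illustration.

```python
import re

# Strict x.y.z matcher with an optional leading "v"; pre-release tags are ignored.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")

def parse_version(text):
    """Return (major, minor, patch) for strings like 'v6.63.3', else None."""
    match = SEMVER.match(text.strip())
    return tuple(int(part) for part in match.groups()) if match else None

def missing_versions(git_tags, published):
    """Versions that have a git tag but are absent from the registry listing."""
    tagged = {parse_version(tag) for tag in git_tags} - {None}
    listed = {parse_version(version) for version in published} - {None}
    # Sort numerically so that 1.0.10 comes after 1.0.2.
    return [".".join(map(str, version)) for version in sorted(tagged - listed)]

# Made-up sample data, not the real DocPad history.
tags = ["v6.63.1", "v6.63.2", "v6.63.3", "not-a-release"]
registry = ["6.63.1"]
print(missing_versions(tags, registry))  # ['6.63.2', '6.63.3']
```

Comparing parsed tuples rather than raw strings keeps the result in release order, which matters if the list is later used to republish.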

@balupton
balupton commented Feb 3, 2014

Just published v6.63.3 again (it is the most recent missing version) so that people can install the latest version of DocPad while this is sorted out.

$ npm publish
npm http PUT https://registry.npmjs.org/docpad
npm http 409 https://registry.npmjs.org/docpad
npm http GET https://registry.npmjs.org/docpad
npm http 200 https://registry.npmjs.org/docpad
npm http PUT https://registry.npmjs.org/docpad
npm http 201 https://registry.npmjs.org/docpad
+ docpad@6.63.3

@mikeumus
mikeumus commented Feb 3, 2014

:+1:

@balupton
balupton commented Feb 5, 2014

@isaacs any chance this and #4605 could be investigated? Or could I at least get a heads-up on how to proceed in the meantime: shall I write a script to republish all those versions, or shall I wait?

@balupton

... is disappearing releases / data corruption not an issue for npm or npm inc?

@stongo
stongo commented Feb 10, 2014

@isaacs +1 on further investigation of this issue if you could find the time. It definitely seems like an npmjs.org problem, and it breaks a lot of projects.

@balupton

@domenic perhaps you could look into this?

@stongo
stongo commented Feb 25, 2014

NPM team: bump

Starting to lose prospective DocPad users due to this issue.

For example from a user seeking help on IRC:

"im going to give docpad a miss until these issues are ironed out" - "issues" refers to the npm issue listed here.

@mikeumus

Hey guys, as an alternative, can we re-publish these manually?
Might be a faster temp-fix as we are but a fly among flies right now for NPM.

@balupton

Yeah, I can do up a script that republishes them from the git tags. Just that I'm not sure if that will make debugging harder for the npm team. Regardless though, perhaps we just need to give up and recover this ourselves.
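A republish-from-tags script could look roughly like the sketch below. It is not the script that was actually used: it assumes tags are named `v<version>`, that each tag checks out to a tree that can be published as-is (any build step is left out), and that the list of missing versions is already known. `republish_plan` and `run` are hypothetical helper names, and the default is a dry run that only prints the commands.

```python
import subprocess

def republish_plan(versions, tag_prefix="v"):
    """Build the commands to republish each version from its git tag."""
    plan = []
    for version in versions:
        plan.append(["git", "checkout", f"{tag_prefix}{version}"])
        plan.append(["npm", "publish"])
    return plan

def run(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in plan:
        print("$ " + " ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failure

# Dry run with two example versions: prints four commands, executes nothing.
run(republish_plan(["6.63.2", "6.63.3"]))
```

Publishing in ascending order, newest last, is the cautious choice here, since publishing an older version may move the `latest` tag; it is worth checking which version `latest` points to afterwards.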

@isaacs did tell me briefly that he had started looking into it.

@balupton

I've re-published the missing versions. /cc @stongo @mikeumus

@mikeumus

Cheers @balupton ! :+1:

@isaacs
npm member
isaacs commented Feb 25, 2014

Unfortunately, I've been unable to gain much insight on this matter.

What we know is that the packages disappeared sometime around the end of January, while @balupton was on a plane, a week before I made my snapshot backup and cut over from Nodejitsu to npm, Inc's new infrastructure.

Since we weren't running the registry at that time, I don't have backups or logs from that period. I can also confirm from @balupton that the disappearance started prior to our 2014-02-04 cutover. Before flipping the switch, I spent 2 days verifying that every document in our database matched the data in Nodejitsu's system. If anything was missing, it was missing at that time.

There was a significant registry outage on January 24th, but I don't see how that could be related.

What appears to have happened is that either a previous revision of the document was inserted into the database over the existing revision, or several revisions of the document were dropped. I don't know how this can happen, and it would certainly be impossible in standard CouchDB replication. Even if there was a collision or something, you'd just get conflicts, and a human would have to choose a leaf node by manually deleting the other leaf nodes. By the time anyone noticed this problem, the lost revisions had been compacted out.
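As a rough illustration of the point about conflicts, here is a toy model of a document's revision tree. It is only a sketch, not CouchDB's storage code: the `leaves` helper and the revision ids are invented, and real revs look like `3-<hash>`. It shows that two competing updates to the same parent leave two leaf revisions side by side rather than one silently replacing the other.

```python
def leaves(parents):
    """Revisions that no other revision names as its parent."""
    used_as_parent = set(parents.values())
    return sorted(rev for rev in parents if rev not in used_as_parent)

# Toy revision tree: maps each revision id to its parent (None for the root).
tree = {"1-a": None, "2-b": "1-a"}

# Two writers each update revision "2-b" without seeing the other's write.
tree["3-c"] = "2-b"
tree["3-d"] = "2-b"

# Both updates survive as conflicting leaves until someone deletes one.
print(leaves(tree))  # ['3-c', '3-d']
```

In this model, going back to a single leaf takes an explicit deletion of the other one, which is why revisions vanishing without a trace does not fit an ordinary replication conflict.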

I hate pointing fingers, especially where it regards npm, because this is my baby and I take responsibility for it. You've trusted npm with your package versions, and it let you down. I'm sorry. I don't want to sound like I'm trying to pass the buck. This sucks, and I feel terrible about it.

Also, I don't want to speculate about the effects of code I didn't write and don't maintain. Spreading FUD is undignified at best. So, I'll talk about the code we do maintain.

We have a single write-master, and the document update code has been refactored such that conflicts are extremely rare. (I hesitate to say outright impossible, though I can't see how they can occur in the current setup any more. But production is always full of surprises.) Also, we have no automated conflict-resolution, so any conflicts that DO occur will only ever be resolved by a human being, and conflicted leaf nodes will never be compacted out. Document deletions are no longer possible except by administrators, so even unpublishing a document will not cause a conflicted document rev to be removed entirely.

Also, for what it's worth (i.e., not much to you, here and now), we have been backing up our databases continuously. In the future, if there are events like this, we'll always be able to rewind to any point in the history and figure out what happened. Unfortunately, I can't do that in this case, because I was not hosting the registry at that time.

If the folks at Nodejitsu have any backups or logs from around the end of January, it might be very informative. Have you reached out to them?

@smikes
smikes commented Feb 2, 2015

Can this issue be closed?

Unfortunately I don't have any more information. @isaacs did what appears to be a definitive post-mortem above, using the information available to npm.

We are trying to clean up older npm issues, so if we don't hear back from you within a week, we will close this issue. (Don't worry -- you can always come back again and open a new issue!)

Thanks!

@balupton
balupton commented Feb 2, 2015

Closed. Thanks for looking into it.

@balupton balupton closed this Feb 2, 2015