Confusing LevelDB corruption. #203

Closed
cmumford opened this issue Sep 9, 2014 · 7 comments

cmumford commented Sep 9, 2014

Original issue 197 created by jtolds on 2013-07-29T15:48:54.000Z:

Like issue #196, we recently decided to enable paranoid mode to see how well LevelDB was actually doing wrt corruption and data integrity.

We found this wacky case of corruption and can't explain it. It appears as if two threads raced on adding a record to the log file: one with a short record, and one with a long record. The short record wrote, the long record wrote, the short record updated the pointer, then the long record updated the pointer. It ended up looking like some random bytes were inserted, but the rest of the records lined up on block boundaries perfectly. When loading, the reader sees the run of zeros (how coincidental) and jumps to the next block, which, fortunately, began with an end record type, and so it complained under paranoid mode. The scary part is that if the record at the beginning of the next block had been a full type, it would have been silent data loss, even under paranoid mode.
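The skip-to-next-block behaviour described above can be illustrated with a toy reader. This is only a sketch of the scenario as reported, not LevelDB's actual reader: it assumes the documented log layout (32 KiB blocks, 7-byte headers of checksum, little-endian length, and type), never verifies checksums, and the names `record` and `scan` are made up for illustration.

```python
import struct

BLOCK_SIZE = 32768          # log files are split into 32 KiB blocks
HEADER_SIZE = 7             # checksum (4) + length (2) + type (1)
ZERO, FULL, FIRST, MIDDLE, LAST = range(5)

def record(rtype, payload):
    """Build a record with a dummy checksum; this toy never verifies it."""
    return struct.pack("<IHB", 0, len(payload), rtype) + payload

def scan(log):
    """Toy reader: returns (recovered payloads, complaints)."""
    found, complaints = [], []
    fragment = None
    pos = 0
    while pos + HEADER_SIZE <= len(log):
        block_end = (pos // BLOCK_SIZE + 1) * BLOCK_SIZE
        if block_end - pos < HEADER_SIZE:
            pos = block_end             # too little room for a header: block trailer
            continue
        start = pos
        _crc, length, rtype = struct.unpack_from("<IHB", log, pos)
        if rtype == ZERO and length == 0:
            pos = block_end             # all-zero header: silently skip the rest of the block
            continue
        payload = log[pos + HEADER_SIZE:pos + HEADER_SIZE + length]
        pos += HEADER_SIZE + length
        if rtype == FULL:
            found.append(payload)
        elif rtype == FIRST:
            fragment = payload
        elif rtype in (MIDDLE, LAST) and fragment is None:
            complaints.append(f"type {rtype} record at offset {start} has no FIRST record")
        elif rtype == MIDDLE:
            fragment += payload
        elif rtype == LAST:
            found.append(fragment + payload)
            fragment = None
        else:
            complaints.append(f"unknown record type {rtype} at offset {start}")
    return found, complaints

# Block 0: one good record, then stray bytes that begin with zeros, then a
# record that ends up unreachable. The stray bytes mimic the dump in this report.
stray = bytes(35) + b"1384186707"
block0 = record(FULL, b"kept") + stray + record(FULL, b"lost")
block0 += bytes(BLOCK_SIZE - len(block0))

silent = scan(block0 + record(FULL, b"next"))   # next block starts with a FULL record
noisy = scan(block0 + record(LAST, b"tail"))    # next block starts with a LAST record
```

In the `silent` case the toy reader returns `b"kept"` and `b"next"` with no complaints, and `b"lost"` simply disappears. In the `noisy` case the orphaned LAST record produces a complaint, which matches the reported situation where paranoid mode happened to notice.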

All of the hex dumps are sequential bytes in the file, partitioned into headers, data, and the strange data in the middle of the log.

// record header. 0x3a bytes type 01
00000000  a4 36 e6 8e 3a 00 01                              |.6..:..|

// 0x003a bytes of data
00000000  e6 48 00 00 00 00 00 00  02 00 00 00 01 14 0a 02  |.H..............|
00000010  01 16 08 b9 51 38 ba 2c  50 51 d0 39 f5 34 61 6e  |....Q8.,PQ.9.4an|
00000020  5c 43 00 01 12 0a 03 08  b9 51 38 ba 2c 50 51 d0  |\C.......Q8.,PQ.|
00000030  39 f5 34 61 6e 5c 43 02  01 16                    |9.4an\C...|

// tail of some record? those numbers look like a unix epoch time and there are
// other records in the log with a similar format.
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 31 33 38 34 31  38 36 37 30 37           |...1384186707|

// record header. 0x35 bytes type 01
00000000  d4 13 bd a4 35 00 01                              |....5..|

// 0x35 bytes of data
00000000  e9 48 00 00 00 00 00 00  01 00 00 00 01 13 00 02  |.H..............|
00000010  00 42 e4 8a ea c1 53 f3  e1 e4 6e 74 a4 14 40 da  |.B....S...nt..@.|
00000020  90 13 08 01 18 00 50 fa  aa c4 89 ef ae b8 02 58  |......P........X|
00000030  d1 e8 36 60 01                                    |..6`.|

// record header ...
00000000  af 4d 0f 4b 20 00 01                              |.M.K ..|
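
For anyone wanting to check the dumps above, here is a small sketch that decodes the three 7-byte headers. It assumes the documented LevelDB log header layout (4-byte checksum, 2-byte little-endian length, 1-byte type); `parse_header` is a hypothetical helper name, not part of LevelDB.

```python
import struct

# Record type values from LevelDB's log format documentation.
RECORD_TYPES = {0: "ZERO", 1: "FULL", 2: "FIRST", 3: "MIDDLE", 4: "LAST"}

def parse_header(header):
    """Split a 7-byte log record header into (checksum, length, type name)."""
    if len(header) != 7:
        raise ValueError("a log record header is exactly 7 bytes")
    checksum, length, rtype = struct.unpack("<IHB", header)
    return checksum, length, RECORD_TYPES.get(rtype, "UNKNOWN")

first = parse_header(bytes.fromhex("a436e68e3a0001"))   # length 0x3a = 58, type FULL
second = parse_header(bytes.fromhex("d413bda4350001"))  # length 0x35 = 53, type FULL
third = parse_header(bytes.fromhex("af4d0f4b200001"))   # length 0x20 = 32, type FULL

# The stray region between the first record's data and the second header
# is 45 bytes: 35 zero bytes followed by the 10 ASCII digits.
stray_length = 35 + len(b"1384186707")
```

All three headers decode as FULL records, so the 45 stray bytes are the only thing in this excerpt that does not fit the record framing.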

I have no idea how this happened or how to fix it.

cmumford commented Sep 9, 2014

Comment #1 originally posted by gavinandresen on 2013-08-12T06:16:12.000Z:

We're seeing very similar corruption reported, running Bitcoin on OSX: bitcoin/bitcoin#2770

@cmumford cmumford self-assigned this Sep 9, 2014
cmumford commented Sep 9, 2014

Comment #2 originally posted by sudosurootdev on 2013-08-12T07:58:35.000Z:

This issue is affecting a ton of the cryptocurrency clients: bitcoin, litecoin, novacoin, worldcoin, on and on and on. They all get LevelDB errors, and for me it happens when I shut the client down and start it back up. It is so annoying because I have to delete the DB files and re-download the whole block chain. Also, it is not just OS X: I saw at least one person say Windows XP, and I am on Ubuntu 13.04 desktop (with Windows too) and on Ubuntu Server.

cmumford commented Sep 9, 2014

Comment #3 originally posted by jonas.schnelli on 2013-08-13T06:22:17.000Z:

I would recommend setting the priority of this defect to "Priority-High". People are ending up with unusable/destroyed LevelDB databases because of this issue.

cmumford commented Sep 9, 2014

Comment #4 originally posted by mh.in.england on 2013-08-15T08:06:46.000Z:

The issue on OS X may be that fsync apparently doesn't tell the hard disk to flush to the platters. There's a separate magic incantation for that.

cmumford commented Sep 9, 2014

Comment #5 originally posted by dana.powers on 2013-08-15T17:45:26.000Z:

I think it would be very helpful for bug reports on corruption to include version specifics for both OS and filesystem. This issue is probably related to getting writes flushed to disk properly, and the steps necessary to do that can depend on both the OS and the filesystem. LevelDB is likely tuned very well for the Linux stack used at Google, but for other stacks we may need to tweak the use of fsync/fdatasync etc -- I think this is what port/port_posix.h is intended for.

On Mac OS X, for most filesystems, for example, it will probably require using fcntl with F_FULLFSYNC, instead of a simple fsync(), in order to guarantee writes get to non-volatile storage before returning. Other OS/fs pairs may require other tweaks. Unfortunately that may significantly degrade performance, as F_FULLFSYNC will force all buffers to write, including those unrelated to LevelDB (i.e., I don't believe it is file-specific). Patch from my local git repo is attached.
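
The attached patch is not reproduced in this thread. As a rough illustration of the idea only, and not the patch itself, a sync helper could prefer F_FULLFSYNC where the platform defines it and fall back to a plain fsync() elsewhere. The sketch below uses Python's POSIX-only `fcntl` module; `durable_sync` is a hypothetical name, and the fallback on error is an assumption about how one might handle filesystems that reject the request.

```python
import fcntl
import os
import tempfile

def durable_sync(fd):
    """Flush fd to stable storage; returns the name of the mechanism used."""
    # The constant is only defined on platforms that support it (e.g. OS X).
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            pass  # assumption: fall back if the filesystem rejects the request
    os.fsync(fd)
    return "fsync"

# Usage: write a record, push it out of the process buffer, then sync.
with tempfile.TemporaryFile() as log_file:
    log_file.write(b"record bytes")
    log_file.flush()
    used = durable_sync(log_file.fileno())
```

On Linux this ends up calling fsync(), so the behaviour only changes on systems that expose the constant.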

cmumford commented Sep 9, 2014

Comment #6 originally posted by jtolds on 2013-08-15T18:16:15.000Z:

Come to think of it, the originally reported corruption in this ticket was on an OS X system as well.

cmumford commented Sep 9, 2014

Comment #7 originally posted by jtolds on 2013-08-15T18:21:11.000Z:

Oops, I was just informed that it was actually a Linux VM on top of OS X. I'd assume the VM stack issued the appropriate F_FULLFSYNC, but I don't know for certain, and I don't know the specific VM used at this point. :( Sorry, guys.
