Data corruption when not closing on Windows #476
That error triggers if there is a need to truncate the file. I can't reproduce this on Linux. Given no writes, this should produce a value log of zero length. Can you confirm?
The value log is actually 2,147,483,648 bytes long. Even if I write before crashing, the value log is still the same size.
I think on Windows we have to create a 2 GB file to make things work, which is why it is 2 GB. Unfortunately, I don't have a way to test Badger on Windows and could use help from the community.
@manishrjain
Hey @djdv, could you see if you can replicate this bug and suggest or test solutions?
The case above is the same for me, on the 3rd run I get
with a 2,147,483,648 byte vlog. Unfortunately, I may not be much help actually debugging the issue at the moment, but I can certainly run tests.
It seems as though something has to be done to handle dynamically sized memory-mapped files on Windows. Handling this without impacting performance seems like it could be complex.
We're already doing all that for Windows. Line 28 in fa35388
I remember that in Windows, we need to expand the size of a file to the max size upfront. Line 42 in fa35388
So, I think what's happening here is that this file, which has been expanded beyond its written data, gets left behind when Badger crashes on Windows. When replaying the value log, Badger determines that it needs to truncate the file to bring it back to its valid written data.

That truncation was changed recently to not auto-truncate, because of this issue: #434 (comment)

This is what is confusing users. On Linux, a truncation error means there might be data loss. But on Windows, that's just how it works: you need to truncate because we have to overallocate upfront due to the nature of how file mmap works. So it's not really corruption. On Windows, you must pass the Truncate option.

In fact, it could be tested by writing a few key-value pairs, then crashing the instance, and seeing if any of those ever get lost. I bet they wouldn't. I'm inclined to close this issue, unless someone can prove an actual data loss.

Summary: Set
Closing the issue. Feel free to reopen if there's an actual data loss.
@manishrjain How, then, can a user tell whether it's truncating because of how Windows works or because there was actual data loss that resulted in an inconsistent DB?
Hard to tell, honestly. The chance of actually losing confirmed data with sync writes on is almost nil. It can only happen if the hard drive has issues and is flipping bits.
I have a similar issue running dgraph on a Windows VM. Any time the VM is restarted, i.e. dgraph is forced to stop unexpectedly, I am unable to restart the dgraph server. Lock files exist (which can be deleted), but then I get:
I don't know about any data loss. What I don't know is how to get dgraph to restart at all.
Hmm... we'll have to set the truncate option in Dgraph. I'll raise a PR.
I've submitted a change; it will be part of the v1.0.7 release:
Where do I set the truncate option? I am getting the error "Unable to find log file. Please retry.", so I think it might be the same problem. Server and zero start, but some queries in the Ratel browser give this error. (I'm on Windows.)
according to dgraph-io/badger#476 (comment) Truncate needs to be set to true under Windows to make BadgerDB work without spitting out corruption errors. Signed-off-by: Luca Moser <moser.luca@gmail.com>
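
For applications that embed Badger directly, setting the option might look like the sketch below. It assumes the Badger 1.x API of the time, where `badger.DefaultOptions` is a struct value with `Dir`, `ValueDir`, and `Truncate` fields; later releases changed how options are built, so check the version you import. The `data` directory is a hypothetical path.

```go
package main

import (
	"log"

	"github.com/dgraph-io/badger"
)

func main() {
	// Assumption: Badger 1.x, where DefaultOptions is a struct value.
	opts := badger.DefaultOptions
	opts.Dir = "data"      // hypothetical directory for the LSM tree
	opts.ValueDir = "data" // hypothetical directory for the value log
	// Allow replay to cut the value log back to its last valid entry
	// instead of returning a truncation error on open.
	opts.Truncate = true

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```

This only covers programs that open Badger themselves. For Dgraph, the thread above says the option is set inside Dgraph as of the v1.0.7 release.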
#470 describes and fixes a lock file issue on Windows. The author also describes a data corruption problem on Windows. I can reproduce this problem as well. Quite easily.
On my 64-bit Windows 10 machine, to simulate a crash (opening Badger and not closing it), I ran the following program three times:
The first time I ran it, it terminated gracefully. The second run, Badger said there's a lock file, so I removed it and ran the program again. The third time, Badger told me it's corrupted:
This shouldn't happen. I'm using the default options, and badger's writes should be crash-safe.