
Dataloss possible when system crashes during loadDatabase #92

Closed
szwacz opened this issue Oct 18, 2013 · 27 comments

@szwacz
Contributor

szwacz commented Oct 18, 2013

Hello @louischatriot

loadDatabase, while running, rewrites the whole database file. This is a quite fragile moment: if the system crashes at this time, data will be lost.

I don't know how probable this issue is, but I ran a dummy test...

var Datastore = require('nedb');
var db = new Datastore({ filename: 'test.db' }); // file was 13MB

db.loadDatabase();

setTimeout(function () {
    process.exit(1);
}, 50);

...and the data was indeed lost.

A possible fix would of course be something like the following.
For file test.db:

  1. Change file name to test.db.bac
  2. Do loadDatabase and save test.db
  3. Remove test.db.bac
    And while loading data for the first time, we have to look first for test.db.bac.

What do you think? Is it a real problem, or too much of an edge case to bother with? Heh, life taught me there is no such thing as too much of an edge case ;)
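
A rough sketch of the save side of this idea, assuming Node's fs module (names and error handling are illustrative only, not actual nedb code):

var fs = require('fs');

// Rename the current datafile to a .bac backup before rewriting it,
// so a crash mid-write always leaves one intact copy on disk
function saveWithBackup(filename, data, callback) {
  var backup = filename + '.bac';
  fs.rename(filename, backup, function (err) {
    if (err && err.code !== 'ENOENT') { return callback(err); }   // "no datafile yet" is fine
    fs.writeFile(filename, data, function (err) {                 // crash here? .bac survives
      if (err) { return callback(err); }
      fs.unlink(backup, function () { callback(null); });         // cleanup; ignore unlink errors
    });
  });
}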

@louischatriot
Owner

Hello,

This is actually a very good point, I hadn't considered this case. Your solution is also the first idea I had; I need to give it a bit more thought, but that's definitely something I want to do.

Thanks !

@spolu

spolu commented Oct 20, 2013

This would break hard links, but it avoids copying the whole file. This is more or less how emacs behaves, so I guess it's acceptable.

@szwacz
Contributor Author

szwacz commented Nov 16, 2013

Hah, this just happened to me in real life.
I mistakenly clicked the launch icon of my node-webkit application, thought "I don't need you now", and immediately hit the close button. That was enough :) The database file is 0KB now.
I can patch it in my app by not letting the node-webkit process be killed until nedb is in a safe state, but a general, bullet-proof solution would of course be much better.

But this proves it is a real issue, not just theoretical speculation.

@spolu

spolu commented Nov 18, 2013

+1 to implement this

@louischatriot
Owner

I've been planning to do it for the last month but your last message gave me the nudge :) I'm on it.

@louischatriot
Owner

Hello @szwacz @spolu ,

I finally had the time to code this :) Since this is a crucial part of NeDB and you are the two who raised this issue, it would be great if you could take a look at the code and tell me if you think there is a problem! The branch is no-dataloss-during-loaddatabase.

The two parts of the failsafe loadDatabase are thoroughly tested:

  • Persistence.persistCachedDatabase which actually persists the database, and uses a temporary datafile to ensure no data is lost
  • Persistence.ensureDatafileIntegrity which is the first step of a loadDatabase and makes sure the data is in the datafile, even if a crash happened during a previous loadDatabase.
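
One possible shape of such a two-step scheme, purely illustrative (the actual branch code may differ):

var fs = require('fs');

// Keep the old data in a temporary file while the real datafile is rewritten,
// and delete the temporary file once the rewrite has succeeded
function persistCachedDatabase(filename, serializedData, callback) {
  var tempFilename = filename + '~';
  fs.readFile(filename, 'utf8', function (err, oldData) {
    if (err && err.code !== 'ENOENT') { return callback(err); }
    fs.writeFile(tempFilename, oldData || '', function (err) {
      if (err) { return callback(err); }
      fs.writeFile(filename, serializedData, function (err) {   // the fragile moment
        if (err) { return callback(err); }
        fs.unlink(tempFilename, function () { callback(null); });
      });
    });
  });
}

// If the datafile is missing or empty but a temp file exists, a crash happened
// during the previous persist: recover the old data from the temp file
function ensureDatafileIntegrity(filename, callback) {
  fs.exists(filename + '~', function (tempExists) {
    if (!tempExists) { return callback(null); }
    fs.stat(filename, function (err, stats) {
      if (!err && stats.size > 0) { return fs.unlink(filename + '~', callback); }
      fs.rename(filename + '~', filename, callback);
    });
  });
}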

There is also a test that simulates a crash during a loadDatabase. It is timed for my machine though; if you want to go that deep, you'll need to find the proper timing for your machine.

I'm planning on merging this into master and releasing the new version on Thursday. If you don't have time to check it, no worries, it is well tested.

Cheers,
Louis

@szwacz
Contributor Author

szwacz commented Nov 24, 2013

Thanks a lot. I will look into it tomorrow.

@spolu

spolu commented Nov 25, 2013

Looks good!

@spolu

spolu commented Nov 25, 2013

Ideally, it would be interesting to dig into vim/emacs' strategy to see if they lock the files while reading/moving/writing them. I guess that's not a "necessity" as you explicitly say that nedb does not support parallel access to a DB file.

In any case, I'm looking forward to seeing this merged in master!

@szwacz
Contributor Author

szwacz commented Nov 25, 2013

I have a few thoughts about the test suite.

  1. You are testing against "file exists" and "file is empty", which I guess is OK because nedb writes the whole file in one chunk, but...
    a) I don't know the details of how file systems work, but is it definitely impossible to end up with a non-empty file containing only part of the data?
    b) If in the future you migrate to stream writing, this will definitely become possible.
    But I imagine guarding against it would require some marker lines in the file to be sure about its integrity, so complications and backwards-compatibility issues.
  2. The test "If system crashes during a loadDatabase, the former version is not lost" is a pain in the butt :) All edge cases are simulated in the previous tests; if there is a need to test the main db class, I would do it the same way - by manipulating (simulating) malformed data on disk, not by trying to actually crash the process.
    But this is your test suite :)

@spolu

spolu commented Nov 25, 2013

1- Perfectly possible indeed! You're right.
@louischatriot An easy fix for that: move f to f~, then write to f.tmp, and finally move f.tmp to f.

@szwacz
Contributor Author

szwacz commented Nov 25, 2013

@spolu I don't think it solves anything. If you have malformed files with data inside, there is no way to tell which one to use other than checking which is both the newest and intact. In nedb's case this is actually hard, because data is always appended to the end of the file, so you can't tell which should truly be the last line of data just by looking at it.

@spolu

spolu commented Nov 25, 2013

@szwacz in that very case it does solve the problem. If the file is incomplete it means that the final rename hasn't happened and f will not exist. In that case the f~ file should be recovered.

The fact that the rename is atomic makes it work as a lock and you can't end up with a malformed f file. You can have a malformed f.tmp file, but in that case f does not exist and f~ should be recovered.

@szwacz
Contributor Author

szwacz commented Nov 25, 2013

Ok, now I understand. You want to:

Saving:

  1. rename main to old
  2. write tmp
  3. rename tmp to main
  4. delete old

Initial reading:

  1. look for main
  2. if main doesn't exist look for old (never bother with tmp)
  3. if neither exists, start a new db

So the additional rename works as a step ensuring the file is complete. It might indeed work.
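
A minimal sketch of that scheme, assuming Node's fs module (the .old and .tmp suffixes are just illustrative):

var fs = require('fs');

// Saving: rename main to old, write tmp, atomically rename tmp to main, delete old
function crashSafeSave(filename, data, callback) {
  var oldFilename = filename + '.old', tempFilename = filename + '.tmp';
  fs.rename(filename, oldFilename, function (err) {                 // 1. rename main to old
    if (err && err.code !== 'ENOENT') { return callback(err); }
    fs.writeFile(tempFilename, data, function (err) {               // 2. write tmp
      if (err) { return callback(err); }
      fs.rename(tempFilename, filename, function (err) {            // 3. rename tmp to main (atomic)
        if (err) { return callback(err); }
        fs.unlink(oldFilename, function () { callback(null); });    // 4. delete old
      });
    });
  });
}

// Initial reading: use main if it exists, otherwise fall back to old, otherwise start fresh
function resolveDatafile(filename, callback) {
  fs.exists(filename, function (mainExists) {
    if (mainExists) { return callback(filename); }
    fs.exists(filename + '.old', function (oldExists) {
      callback(oldExists ? filename + '.old' : null);   // null means: start a new db
    });
  });
}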

@spolu

spolu commented Nov 25, 2013

That's it!

@FlorianBruniaux

Hi,

Same issue here:

var Datastore = require('nedb'), db = new Datastore('database.db'); // database file = 5KB
db.loadDatabase(); // database file = 0KB

@szwacz
Contributor Author

szwacz commented Nov 25, 2013

@FlorianBruniaux I'm pretty certain the snippet you provided is not enough to cause any problem. If you have a similar issue, please tell us more about how it happens.

@FlorianBruniaux

Hi, here is my whole snippet:

var requirejs = require('requirejs'),
    Datastore = require('nedb'), db = new Datastore('database.db');

requirejs.config({
    nodeRequire: require
});

requirejs([

], function () {

    exports.colisages = function(req, res) {

        db.loadDatabase();

        db.find({}, function (err, docs) {
            res.send(docs);
            console.log(docs);
        });

    };

});

@szwacz
Contributor Author

szwacz commented Nov 25, 2013

I think it has nothing to do with this topic :). db.loadDatabase() is not a synchronous function, please see the documentation.
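
For reference, waiting for the loadDatabase callback (or using the autoload option) before querying looks something like this:

var Datastore = require('nedb');
var db = new Datastore({ filename: 'database.db' });

// loadDatabase is asynchronous: only query once its callback has fired
db.loadDatabase(function (err) {
  if (err) { return console.error(err); }
  db.find({}, function (err, docs) {
    console.log(docs);
  });
});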

@FlorianBruniaux

Yeah, exactly. Sorry!

But is it normal that this single line:

var Datastore = require('nedb'), db = new Datastore({ filename: 'database.db', autoload: true });

deletes all the data in database.db? (Maybe I missed something... ^^')

@szwacz
Contributor Author

szwacz commented Nov 25, 2013

@FlorianBruniaux This is definitely not normal, and it looks like such a huge bug that I can't believe it is the case. Please investigate your problem a little more, and if you still think you spotted a bug, report it in a separate issue.

@FlorianBruniaux

The problem came from the fact that I used JSON data from another test (with another db solution). The id of my object was

"id":"0001" 

instead of

"_id":"mjoNZvETbK3mBVeS"

Okay, I'm an idiot 👍!

Problem solved with a bit of attention ^^'

@louischatriot
Owner

@spolu @szwacz The scheme you suggested is much better than what I had done, so I changed it. I will merge this evening (it's 1pm in France, so in about 7 hours); feel free to comment before then if you want.

Thanks for the help!

@szwacz
Contributor Author

szwacz commented Nov 28, 2013

No further comments. Thanks for awesome work :)

@louischatriot
Owner

Thanks :) It's now published as v0.8.13.

@spolu

spolu commented Nov 29, 2013

Awesome

@louischatriot
Owner

This implementation was in fact not correct, and prone to data loss in case of an OS (not just process) crash if the kernel buffer cache was not flushed to disk.

The new implementation is very much inspired by Redis' AOF persistence; see this very interesting article by antirez: http://oldblog.antirez.com/post/redis-persistence-demystified.html

Implementation here if you want to look at it: https://github.com/louischatriot/nedb/blob/master/lib/storage.js

You'll notice it doesn't need two temp files to ensure data integrity, but what it does need, and what was missing before, are fsync calls after every data change.
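
A simplified sketch of what such an fsync-based write can look like (the real code lives in lib/storage.js linked above; this is not a copy of it):

var fs = require('fs');

// Write to a temporary file, fsync it so the kernel buffer cache reaches the disk,
// then atomically rename it over the real datafile
function crashSafeWriteFile(filename, data, callback) {
  var tempFilename = filename + '~';
  fs.writeFile(tempFilename, data, function (err) {
    if (err) { return callback(err); }
    fs.open(tempFilename, 'r+', function (err, fd) {
      if (err) { return callback(err); }
      fs.fsync(fd, function (err) {              // flush to disk before the rename
        if (err) { return callback(err); }
        fs.close(fd, function (err) {
          if (err) { return callback(err); }
          fs.rename(tempFilename, filename, callback);   // atomic replace
        });
      });
    });
  });
}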
