This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Large number of async file writes cause data corruption #808

Closed
TrevorBurnham opened this issue Mar 19, 2011 · 2 comments

Comments

@TrevorBurnham

I've been trying to generate a large file, writing one line at a time, and I ran into a serious problem. Here's a test case (the original CoffeeScript is in a gist):

var count, fd, fs, max, writeRandomNum;
fs = require('fs');
count = 0;
max = Math.pow(10, 7);
fd = fs.openSync('test.txt', 'w+');
(writeRandomNum = function() {
  var buffer, str;
  count++;
  str = Math.random() + '\n';
  buffer = new Buffer(str);
  fs.write(fd, buffer, 0, str.length, null);
  if (!str.match(/^(\d\.\d+(e-\d+)?)\n$/)) {
    console.log("At count " + count + ", invalid string: " + str);
  }
  if (count === max) {
    return fs.close(fd);
  } else {
    return process.nextTick(writeRandomNum);
  }
})();

This will simply write ten million random numbers to a file, separated by newlines. Try running it; it'll take 15+ minutes to complete. Then go to the command line and do

grep -nE '^[0-9][0-9]+' test.txt

This will match several lines starting with more than one decimal digit, which should be impossible (in fact, the code that wrote those lines would have written to the console to alert us, because such a line doesn't match /^(\d\.\d+(e-\d+)?)\n$/). For instance, in one run I got

8712:59
9979:3712P0.7081007240340114
10826:059193873312324
11673:714079976082
12952:68

...and so on for about 50 lines.
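The same check can be done against the full pattern from the test case rather than only the two-leading-digits case that grep catches. This is a minimal sketch; `findInvalidLines` and `LINE_PATTERN` are hypothetical names, not part of the original script:

```javascript
// Same pattern the test case uses, minus the trailing newline,
// because the text is split on '\n' before matching.
const LINE_PATTERN = /^\d\.\d+(e-\d+)?$/;

// Hypothetical helper: returns every line that does not look like the
// output of Math.random(), together with its 1-based line number.
function findInvalidLines(text) {
  const lines = text.split('\n');
  // A file that ends with '\n' yields one empty trailing element; drop it.
  if (lines[lines.length - 1] === '') lines.pop();

  const invalid = [];
  lines.forEach((line, i) => {
    if (!LINE_PATTERN.test(line)) {
      invalid.push({ lineNumber: i + 1, line: line });
    }
  });
  return invalid;
}
```

To run it on the generated file, something like `findInvalidLines(require('fs').readFileSync('test.txt', 'utf8'))` should work, assuming the whole file fits in memory.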

I've done repeated tests, and this never happens when I do a million writes. It's that jump to 10,000,000 that does it (even though the bad writes can, as you see above, occur well before the millionth one)!

For the record, I'm running Node 0.4.2 under Mac OS 10.6.6, with an SSD. I haven't run this test on any other systems.

@isaacs

isaacs commented Mar 19, 2011

The order of repeated fs.write() calls is not guaranteed. If you want to make sure that the order is constant, then you need to wait for the cb to fire before doing the next write, or use fs.createWriteStream, which buffers writes for you.
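Both suggestions can be sketched roughly as follows. This is an illustration written against a current Node.js API, not code from the thread; the helper names `writeLinesSequentially` and `writeLinesWithStream` are made up for the example:

```javascript
const fs = require('fs');

// Option 1 (hypothetical helper): issue the next fs.write() only after the
// previous callback has fired, so at most one write is in flight at a time.
function writeLinesSequentially(filePath, lines, done) {
  fs.open(filePath, 'w', (openErr, fd) => {
    if (openErr) return done(openErr);
    let index = 0;

    function finish(err) {
      fs.close(fd, (closeErr) => done(err || closeErr));
    }

    function writeNext() {
      if (index === lines.length) return finish(null);
      const buffer = Buffer.from(lines[index] + '\n');
      index++;
      writeAll(buffer, 0);
    }

    // fs.write() may write fewer bytes than requested, so keep going
    // until the whole buffer has been written.
    function writeAll(buffer, offset) {
      fs.write(fd, buffer, offset, buffer.length - offset, null, (err, written) => {
        if (err) return finish(err);
        if (offset + written < buffer.length) {
          return writeAll(buffer, offset + written);
        }
        writeNext();
      });
    }

    writeNext();
  });
}

// Option 2 (hypothetical helper): let a write stream queue the writes in
// order, pausing when write() returns false until 'drain' is emitted.
function writeLinesWithStream(filePath, lines, done) {
  const stream = fs.createWriteStream(filePath);
  stream.on('error', done);
  stream.on('finish', () => done(null));

  let index = 0;
  (function pump() {
    while (index < lines.length) {
      const ok = stream.write(lines[index++] + '\n');
      if (!ok) return stream.once('drain', pump);
    }
    stream.end();
  })();
}
```

Neither version relies on `process.nextTick` to pace the writes: the first is paced by the write callback, the second by the stream's `drain` event.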

@isaacs isaacs closed this as completed Mar 19, 2011
@TrevorBurnham

The docs do mention inconsistent order as a possibility, but it hadn't occurred to me that this could cause writes to overlap or be truncated (and goodness knows how that P got in there).

I've submitted a pull request to note this in the docs: #811

coolaj86 pushed a commit that referenced this issue Apr 15, 2011