Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.writeFileSync triggers two file change events #6112

Closed
jandrieu opened this issue Apr 8, 2016 · 7 comments
Closed

fs.writeFileSync triggers two file change events #6112

jandrieu opened this issue Apr 8, 2016 · 7 comments
Labels
fs Issues and PRs related to the fs subsystem / file system.

Comments

@jandrieu
Copy link

jandrieu commented Apr 8, 2016

This is an upstream report from fs.writeFileSync url can result in empty reads for external watchers

I use fs.watch to trigger browserify and I use vscode as my editor (which uses node's fs.write under the hood). I noticed that sometimes editing template files that were processed early in my build pipeline, the resulting renders would build out as if the templates were empty. I thought it was an encoding issue and my dust templates failing to trigger.

However, editing the same template file in emacs worked fine. The rendering and browserifying produced the expected output. So I figured it was a vscode problem. Certainly fs.watch (despite the caveats in the documentation) was performing fine with emacs, every time.

Working with vscode guys, I was able to boil it down to fs.writeFileSync generating two change events.

I've attached three files that will demonstrate the problem.
writeBug.zip

  • watched.txt -- the file that gets watched & changed
  • watcher.js -- watches with fs.watch; reads & dumps file to console on change
  • changer.js -- reads and appends a single character, '.', to watched.txt

To reproduce the error, start the watcher.js, then trigger a few changes by running changer.js a couple times.

linux:~/Software/watchTest$  ./watcher.js &
watching watched.txt for changes

linux:~/Software/watchTest$  ./changer.js
watched.txt changed
 Arbitrary text...
watched.txt changed
 Arbitrary text...

linux:~/Software/watchTest$  ./changer.js
watched.txt changed

watched.txt changed
 Arbitrary text....

linux:~/Software/watchTest$  ./changer.js
watched.txt changed

watched.txt changed
 Arbitrary text.....

The first time, both events enabled reading the actual file. But the other changes triggered two events, and when reading the file in the first event, it appears empty.

Trying the same code using emacs instead of changer.js, and there is just one changed event and the file content is always there.

FWIW, I have not tested fs.writeFile.

I have however, tested with fs.watchFile and another watch library chokidar, none of which helped. In fact, sadly, while chokidar did manage to produce only one event instead of two, it was inconsistent as to whether or not the file would have any content.

I'm wondering if the caveats around fs.watch are misattributed and really are problems with fs.write. In any case, emacs handles the file write correctly and I have a feeling I could write a c version of changer.js that also works correctly. So, hopefully we can find out where in node.js to change to get it to issue just one change event.

@mscdex mscdex added the fs Issues and PRs related to the fs subsystem / file system. label Apr 8, 2016
@bnoordhuis
Copy link
Member

Emacs probably writes to a temp file first, then renames it (which is atomic), whereas fs.writeFileSync() modifies the file in place, resulting in contents and metadata updates.

@jandrieu
Copy link
Author

jandrieu commented Apr 8, 2016

Someone else suggested that also. I haven't verified it, but I did test it with echo >>:

linux:~/Software/watchTest$  ./watcher.js &
watching watched.txt for changes

linux:~/Software/watchTest$ echo '.' >> watched.txt
watched.txt changed
 Arbitrary text...
.

If bash's append operator can do it, I would think fs.writeFileSync could...

@bnoordhuis
Copy link
Member

I'm going to guess that >> opens the file in append-only mode. Have you checked how fs.appendFileSync() behaves?

@jandrieu
Copy link
Author

I had not tried fs.appendFileSync.

So that's intriguing. As you expected, fs.appendFileSync only triggers one event, and the content of the file is readable.

So I thought I would check the redirect operator, >. which seems closer to fs.writeFileSync:

echo "replaced" > watched.txt

In this case, there are two events triggered, just like fs.writeFileSync. However, after both events, the contents of the file are accessible:

linux:~/Software/watchTest$ echo "replaced" > watched.txt
watched.txt changed
replaced

watched.txt changed
replaced

linux:~/Software/watchTest$ echo "replaced" > watched.txt
watched.txt changed
replaced

watched.txt changed
replaced

linux:~/Software/watchTest$ echo "replaced" > watched.txt
watched.txt changed
replaced

watched.txt changed
replaced

So I think this is still a bug. The frustrating part is that it is inconsistent across the two events. The file is never actually empty, but somehow, if you attempt to read it at just the right instant, (including in response to the first change event) fs.readFileSync returns empty. Maybe the error is in fs.readFileSync...

If the behavior matched >, I think that would be ok.

@bnoordhuis
Copy link
Member

If your expectation is that the file has been fully updated by the time you receive the event, that's a faulty assumption. File watching is inherently race-y, node.js just bubbles up the events as it receives them from the operating system.

@jandrieu
Copy link
Author

I guess my expectation is that node.js would handle file writes just as consistently as other programs. I also expected VSCode to trigger changes lik./e any other editor (but I can accept that the onus is on VSCode to do the temporary file trick).

To your point, yes, I assume that when I get a change event after fs.writeFileSync, that I can access the file and see what changed. I am surprised that's unreasonable, but after my own test results agree with you. It's still odd that other file writes do better than node.js...

That led me to wondering how a simple c++ app would work.

#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ofstream myfile;
  myfile.open ("watched.txt");
  myfile << "written.\n";
//  myfile.flush();
  myfile.close();
  return 0;
}

I tried this both with and without the myfile.flush() command and compared it to the changer.js already described (using fs.writeFileSync) and the echo. I compiled with

g++ changer.cpp -o changer

Running it by hand, it looked like the cpp versions were working all the time. But then I noticed a few times the contents weren't there. So I captured the output to a file and ran the changer in a loop to gather 1-5 seconds worth of samples. I used two terminals, but here's what I did:

linux:~/Software/watchTest$  ./watcher.js > fsWriteFileSync.log
linux:~/Software/watchTest$ while true; do ./changer.js; done
^C
linux:~/Software/watchTest$  ./watcher.js > echo.log
linux:~/Software/watchTest$ while true; do echo "written." > watched.txt ; done
^C
linux:~/Software/watchTest$  ./watcher.js > cpp_flush.log
linux:~/Software/watchTest$ while true; do ./changer; done
^C
linux:~/Software/watchTest$  ./watcher.js > cpp_no_flush.log
linux:~/Software/watchTest$ while true; do ./changer; done
^C

Obviously, for those last two, I recompiled changer.cpp with the flush excluded.

Perhaps unsuprising, the echo approach was the fastest and node.js the slowest. Booting up a node instance just for a file write isn't the best idea. But I would have expected a comparable set of change events from all four approaches.

Here's what I got

writer samples % of time file content is readable after both events
node.js 249 5.22% (94.78% failure)
echo 3,543 77.03% (22.97% failure)
flush_cpp 3,354 96.18% (3.82% failure)
noflush_cpp 3,352 96.54% (3.46% failure)

So, you are right. I shouldn't expect it to work. But by a huge margin, most of the time node gets it wrong (95%), while most of the time, the other approaches get it right.

I get that if it can't be fixed "correctly" it isn't worth much to partially fix it. Oh well. At least now I understand nobody gets it right at the level of a bare file write.

FWIW, maybe someone else who runs into this limitation can use this issue for a better idea of how to approach a work-around for their needs.

@bnoordhuis
Copy link
Member

So, you are right. I shouldn't expect it to work. But by a huge margin, most of the time node gets it wrong (95%), while most of the time, the other approaches get it right.

That's probably because of the overhead of moving data from JS to the kernel and back, the other three are much thinner in that respect. I expect that if you tried it with python or ruby, you'd see similar results.

Thanks for following up. I'll close the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

No branches or pull requests

3 participants