Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialize Writes to the Same File (Fixes EPERM) #22

Closed
wants to merge 4 commits into from

Conversation

ferm10n
Copy link
Contributor

@ferm10n ferm10n commented May 11, 2017

The TL;DR:

This PR fixes many issues on dependent packages related to the EPERM issue on Windows platforms. In addition, it eliminates the dependency on slide-flow-control and replaces it with Promises. Awesome!

How to reproduce the problem

This issue only occurs on Windows platforms! 🤤
In whatever directory you fancy, do $ npm i write-file-atomic tap then create test.js with the following contents:

let test = require('tap').test
let writeFileAtomic = require('write-file-atomic')

test('concurrency', function (t) {
  var ops = 15 // number of concurrent write operations to perform
  t.plan(ops)
  for (var i = 0; i < ops; i++) {
    writeFileAtomic('test.json', 'test', function (err) {
      if (err) t.fail(err)
      else t.pass()
    })
  }
})

Lastly, run $ node test.js and you should see multiple EPERM: operation not permitted, rename '...' -> '...' errors
If not, try increasing ops to increase the chances of getting a race condition

(Possibly) related issues

Background

About a month ago I originally encountered this problem while performing a deployment to a windows dev server. I was baffled that the errors were occurring because it worked fine on my laptop. Whenever anyone accessed a page on my site after logging in, many EPERM errors were spat out. After a lot of poking around, I discovered that the problem was caused when npm installed all the latest packages specified in my package.json. As usual it tried to get the latest stuff that should still be compatible, and I guess that somewhere a breaking change was accidentally introduced. It was very difficult hunting down when, where, and why the error was occurring because of the way write-file-atomic and especially its dependency, slide-flow-control, are written (I still think they're both awesome btw. They aren't impossible to follow, it's just awkward).
I'm always going to advocate readability over compactness, and that's why I've also created PRs for slide-flow-control and write-file-atomic which helps make it at least easier to follow. Check them out!

Maybe if the PR doesn't work out we can make a kickstarter to get Isaac a bigger projector screen? #sizematters

I eventually traced the problem's roots to: session-file-store > write-file-atomic > graceful-fs on this commit
Of course, you could get rid of these errors by simply adding a npm-shrinkwrap.json. But....

Here's why npm-shrinkwrap.json isn't enough

As Isaac wrote in the commit I mentioned above, the change is important because it causes a fast fail. For the project I was working on, requests to my site would trigger an update to the express session, which in turn triggered write-file-atomic to update the file store. Multiple requests would cause a sort of race condition, and only one attempt to modify the session would succeed. This explained why the temp files seemed to be successfully removed.
As a short term solution, I just added a npm-shrinkwrap.json to force an earlier version of graceful-fs.
But this does not fix the problem, it merely stifles it!!!

#justwindowsthings 🙄

While doing my research on this problem, I kept seeing similar issues popping up here and there. It seems like the EPERM problem just doesn't want to die! And it ALWAYS seemed to occur on Windows platforms.
So why the difference? Why isn't this a problem with *nix systems?
It boils down to the way that fs.rename is implemented. The POSIX standard says

If the link named by the new argument exists and the file's link count becomes 0 when it is removed and no process has the file open, the space occupied by the file shall be freed and the file shall no longer be accessible. If one or more processes have the file open when the last link is removed, the link shall be removed before rename() returns, but the removal of the file contents shall be postponed until all references to the file are closed.

Basically, *nix systems replace the destination file if it exists during rename. However on Windows systems, it's a bit more complicated! Also, unlike *nix systems, the Windows rename() is not atomic!

The fix

In order to truly make the behavior of writing files be atomic on ALL platforms, my modifications to write-file-atomic adds a queue to keep track of the requested write operations. Writes to different files are allowed to execute in parallel, but writes to the same file are serialized! I've tested that this does in fact solve my problems mentioned earlier, and I've even written tests to cover the added functionality. If you don't like my replacement of slide-flow-control with promises, I'd be happy to put together another PR that has just the serializing feature.

Hope this helps someone somewhere!

@ferm10n ferm10n mentioned this pull request May 11, 2017
@zkat zkat requested review from zkat and iarna May 13, 2017 07:05
@zkat zkat added the bug label May 13, 2017
@ferm10n
Copy link
Contributor Author

ferm10n commented May 15, 2017

Package version is 2.2.1 because it includes the bug fix and the new promise form

@zkat
Copy link
Contributor

zkat commented May 16, 2017

Hey, I wanted to check in right quick. This looks really interesting and I've been poking at looking at it the past few days.

The CLI team (which is usually the team that takes care of this project) is kinda in the middle of a bit of a crunch with npm5 coming out and I'm not sure we'll have time to give this the thorough review it deserves for another week or two (when npm5 leaves beta). Thanks a lot for the PR. I expect one of us will be able jump in when we can to get the new npm out the door. If it ends up fixing as much stuff as you suspect, this would be amazing.

From a surface read, my main hesitation right now is the big long chains of Promises, and the lack of a Promise-injection mechanism. It might be nice if you wrote a tiny promisify() function right in there that you could use to wrap local versions of the fs functions you're using, and added opts.Promise as an option so we can inject one if we want. I think this'll also make the code generally easier to review?

I see that you have a bunch of tests added, and I for one am super stoked about anything that gets rid of slide ;).

My last request for this first-pass review is that you also update package-lock.json, since this is one of the projects we've been using to dogfood the new lockfile. You should be able to npm i -g npm5 and then just npm5 rm slide to get everything in the right shape (modulo possible beta bugs we've been stomping).

Cheers!

Copy link
Contributor

@iarna iarna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not comfortable with this as a giant glob of changes. 19 commits is way too many, at the same time, the combined diffs are too big.

Maybe you could separate your refactor-with-no-behavior-changes patch from your new behavior patch?

Or well, alternatively we could pick up the idea that your PR implements it, but do it using the existing style and code. It does seem like the obviously right track.

@iarna
Copy link
Contributor

iarna commented Jun 27, 2017

Also, please don't merge write-file-atomic#master into your PR branch-- rebase on top of it. Anything else makes the commit history too hard to follow.

@ferm10n
Copy link
Contributor Author

ferm10n commented Jun 27, 2017

I can see how the glob of changes would be a pain. I'm trying to figure out the best way to approach this.
I suppose I can retry the commits and group them by the "refactor" and the "new behavior" parts.

Would it be easier if I just made a new branch that was a clone of your master and applied the commits needed? (but fewer this time haha) The alternative would be to force push to this branch and I'm worried that would cause more harm than good for you guys.

Advice? I'm just starting to learn how git rebase works.

Added a queue for when writes to the same file are requested, thereby preventing simultaneous rename operations.
Added tests for the queue (test/concurrency.js)
Updated package.json to reflect new changes (and removed slide)
options.Promise overrides the default native promise
promisify() cleans up control flow a bit, as suggested by zkat!
Updated documentation for new option
@ferm10n
Copy link
Contributor Author

ferm10n commented Aug 18, 2017

Hey there, it's been kinda quiet here for a while so I'd thought I'd check back in on this.
I believe I made all the changes that were requested. Next steps?

index.js Outdated
options.chown = { uid: stats.uid, gid: stats.gid }
}
return thenWriteFile()
new Promise(function (resolve) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a requirement on npm@4 (which we can't do without doing a new major of the library). So, our options are:

Use Bluebird here, or do a new major.

I'm actually pretty ok taking this as a major, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was npm@4 supposed to support versions of node that do not support native promises? I thought maybe just specifying 'engines' in package.json would be sufficient

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that isn't automatically transitive to libraries it uses. Each needs its own major bump. (Believe me, I've tried to slip those things in before, it breaks people and makes them sad. =D)

But it's not a big deal, npm can take write-file-atomic@3 just fine.

@iarna
Copy link
Contributor

iarna commented Aug 18, 2017

So thank you for reshuffling these commits, it's MUCH easier to see what you're doing now. I've merged them, with one notable change, I found the promisify function confusing as that has other meanings other libraries, so I made the injection as minimal as possible (see: f5e4eff).

Oh hey, and it turns out I was wrong, 2.0.0 was our "Node 4+" version of this library so we're all good there.

@ferm10n
Copy link
Contributor Author

ferm10n commented Aug 18, 2017

Awesome! Thanks a bunch. Closing since it was merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants