Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.move fails with chmod error when disk under heavy use #357

Closed
jdalton opened this issue Feb 16, 2017 · 25 comments
Closed

fs.move fails with chmod error when disk under heavy use #357

jdalton opened this issue Feb 16, 2017 · 25 comments

Comments

@jdalton
Copy link
Contributor

jdalton commented Feb 16, 2017

Using fs-extra v2.0.0 I noticed that fs.move, which uses /lib/copy/ncp under the hood, fails with a chmod error when the disk is under heavy use.

{ Error: ENOENT: no such file or directory, chmod '/the/path-a/file'
  errno: -2,
  code: 'ENOENT',
  syscall: 'chmod',
  path: '/the/path-b/file' }

In this case I'm moving a very large directory of a few files.
I'm using pify on fs.move:

const binary = (func) => (a, b) => func(a, b)
const move = binary(pify(fs.move))
// ...
move('/the/path-a', '/the/path-b')

With this error on the move operation what ends up happening is that path-a will still exist with one or so of its files left but path-b is created and contains all files (at least it contains a copy of the file remaining in path-a).

Related to #326. \cc @nicks.

@jprichardson
Copy link
Owner

Does this same problem happen with fs-extra@1.x?

@jdalton
Copy link
Contributor Author

jdalton commented Feb 16, 2017

Does this same problem happen with fs-extra@1.x?

I'll try this evening and report back. Is there a specific v1 release you'd like me to try?

@jprichardson
Copy link
Owner

I'll try this evening and report back. Is there a specific v1 release you'd like me to try?

Thank you! Anything before 2.0.0 is fine, I'm just trying to figure out if there was a regression introduced by 2.0.0

Also, which version of Node are you using?

@jdalton
Copy link
Contributor Author

jdalton commented Feb 16, 2017

I'm using Node 7.5.0.

It's probably not a regression but more of a continuation of an existing issue.
Based on #326 the issue exists in v1.0.0 too.

@jprichardson
Copy link
Owner

It's probably not a regression but more of a continuation of an existing issue.

Agreed. Although I just wanted to cross it off my list (if everyone said they were using 7.5.x then that may be suspect).

@jprichardson
Copy link
Owner

In this case I'm moving a very large directory of a few files.

@jdalton what does this mean? Since you say a "few" files and "large" directory, are you saying the files are of a large size (hence, large directory)?

@jdalton
Copy link
Contributor Author

jdalton commented Feb 16, 2017

are you saying the files are of a large size (hence, large directory)?

Yes. You can probably mimic the scenario with a folder containing a VM image and a readme.md.

@manidlou
Copy link
Collaborator

Using the current version v2.0.0, I tested this case

src
|__ Logs
|    |__VBox.log
|
|__ file1.vdi (47.61 GB)
|__ file2.vdi (8.36 GB)
|__ file3.vdi (8.71 GB)
const fs = require('fs-extra')
const pify = require('pify')

const binary = (func) => (a, b) => func(a, b)
const move = binary(pify(fs.move))

// on Ubuntu 16.04.1
const src = 'src'
const dest = 'dest/newsrc'

// On Windows 10
const src = 'E:\\src'
const dest = 'G:\\dest\\newsrc'

move(src, dest).then(() => {
  console.log('finished successfully.')
}).catch(err => {
  console.log(err)
})

on both Ubuntu 16.04.1 intel core i7 and Windows 10 powershell on virtualbox and the move operation finished without any problems!

So @jdalton would you please elaborate a little more on your case like the system you are using and exactly how large your src is? I wanna see if I am missing something! 😏

@jdalton
Copy link
Contributor Author

jdalton commented Feb 19, 2017

@manidlou It's a flavor of Linux on a nas. The issue seems to happen more if I periodically scan the same folder for changes (like say every couple of minutes) in other active scripts. You might move the log file into the same folder as the vdi's though your setup looks representative.

@manidlou
Copy link
Collaborator

Alright, here is my guess. I think the issue comes from the fact that in lib/copy/ncp.js, in copyFile() function the writestream uses the close event while the finish event should be used instead simply because it is directly mentioned in node docs that the close event will not necessarily be emitted by all Writable streams.

From the https://nodejs.org/api/stream.html#stream_class_stream_writable docs:

Event: 'close'

Added in: v0.9.4

The 'close' event is emitted when the stream and any of its underlying resources (a file descriptor, for example) have been closed. The event indicates that no more events will be emitted, and no further computation will occur.

Not all Writable streams will emit the 'close' event.

and

Event: 'finish'

Added in: v0.9.4

The 'finish' event is emitted after the stream.end() method has been called, and all data has been flushed to the underlying system.

Furthermore, If we look at https://github.com/AvianFlu/ncp/blob/master/lib/ncp.js#L127, they use finish for the writestream in copyFile() as well.

So, I will fix that and hope that will solve your issue @jdalton. In the meantime, everyone is welcome to bring their ideas about this.

@jprichardson
Copy link
Owner

jprichardson commented Feb 21, 2017

Thanks for looking into this @manidlou!

We've gone full-circle as #327 (changed finish for close) to fix #326.

@jdalton would you be willing to modify the installed fs-extra on your end from close to finish to see if that makes a difference?

@nicks would love your input here as well on the problems that you experienced.

@nicks
Copy link
Contributor

nicks commented Feb 21, 2017

I agree that the fs documentation is ambiguous.

If you look at the NodeJS source code, it's pretty easy to see that close comes after finish,
https://github.com/nodejs/node/blob/master/lib/fs.js#L1924
and close is after what actually closes the file handle
https://github.com/nodejs/node/blob/master/lib/fs.js#L1871

i wonder if there's an underlying bug in libuv that affects the network storage that @jdalton is using. Are you only able to repro on the nas?

@jdalton
Copy link
Contributor Author

jdalton commented Feb 21, 2017

@jprichardson

would you be willing to modify the installed fs-extra on your end from close to finish to see if that makes a difference?

Sure thing!
I've stopped seeing the issue as much now that I've stopped periodically scanning the same folder.

@nicks

Are you only able to repro on the nas?

The nas is the scenario where I spotted the issue. However, Node is running on the nas too, so it's not crossing any network boundaries or anything like that.

Is there any reason deferring to the OS's mv command, when available, isn't an option?

@nicks
Copy link
Contributor

nicks commented Feb 21, 2017

@jdalton This package does use mv when available. But native OS rename only works when both the source and destination are on the same device (for some definition of 'device'). The code falls back to copy only when it's moving cross-device
https://github.com/jprichardson/node-fs-extra/blob/master/lib/move/index.js#L64

@jdalton
Copy link
Contributor Author

jdalton commented Feb 21, 2017

@nicks can you point out the mv or cp usage. In my case I'm not going cross-device so I guess it should be deferring to the system utils. I tried searching for a child_process reference in this repo but didn't come up with any matches (maybe it's in a sub dependency).

@manidlou
Copy link
Collaborator

We've gone full-circle as #327 (changed finish for close) to fix #326. 😄

@nicks you are right. fs writestream emits close after finish.

So i guess we should keep the close event and not changing it to finish. Is that correct?

@manidlou
Copy link
Collaborator

@jdalton as @nicks pointed out, fs-extra uses https://github.com/andrewrk/node-mv module for move() mehod. In that, as long as src and dest are on the same device, the ncp.js will not be called and no copy will occur instead fs native methods like fs.rename is used to achieve the move operation.

@RyanZim
Copy link
Collaborator

RyanZim commented Feb 21, 2017

@jdalton fs-extra uses fs.rename when possible, if that's not possible it uses ncp and deletes the source.

@manidlou We don't use node-mv as a dependency; lib/move/index.js is a fork of node-mv.

@jdalton We don't want to shell out to mv for a few reasons:

  • child_process is kinda slow
  • It introduces additional cross-platform issues
  • Shelling out is a security issue; without some kind of escaping, malicious filenames could possibly call arbitrary commands on the shell.

@manidlou
Copy link
Collaborator

@manidlou We don't use node-mv as a dependency; lib/move/index.js is a fork of node-mv.

Very true. My bad! I should've mentioned that is a fork and not a dependency 😁!

@jdalton
Copy link
Contributor Author

jdalton commented Mar 1, 2017

@jprichardson

would you be willing to modify the installed fs-extra on your end from close to finish to see if that makes a difference?

I'm seeing the error more when I change close to finish.

@manidlou
Copy link
Collaborator

manidlou commented Mar 1, 2017

@jdalton, thanks for your feedback. as @nicks mentioned:

If you look at the NodeJS source code, it's pretty easy to see that close comes after finish,
https://github.com/nodejs/node/blob/master/lib/fs.js#L1924

close seems to be the appropriate event to use here. So, IDK maybe in your case the issue is related to periodically check the directory because at least on my machine (which is not a powerful machine), moving those large files takes good amount of time (around 10mins or so) to be completed.

@jdalton
Copy link
Contributor Author

jdalton commented Mar 1, 2017

@manidlou

maybe in your case the issue is related to periodically check the directory

I've stopped periodically scanning the folder a bit up the thread.

Update:

As a workaround, since the error is ENOEN, I'm wrapping await move(...) in a try-catch to swallow the error if its code is ENOEN and then calling remove(source).

Looking at lib/copy/ncp.js it looks like an off-by-one error could easily slip in. I think using async await and trasnspiling back with Babel and something like fast-async would help here.

@RyanZim
Copy link
Collaborator

RyanZim commented Mar 4, 2017

@jdalton ncp.js is currently undergoing a rewrite. Mind trying this branch: #374?

@jdalton
Copy link
Contributor Author

jdalton commented Mar 13, 2017

I'm closing this because after monitoring it some more it looks like what I was seeing may have been caused by checking a file's stat after it had been moved. So checking the old path which no longer exists. I was thrown off because the error messages are the same.

A rewrite or async await use couldn't hurt though so 👍 on #374.

@jdalton jdalton closed this as completed Mar 13, 2017
@jprichardson
Copy link
Owner

@jdalton thanks for checking back to let us know!! Much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants