fs: readdir optionally returning type information #22020
src/node_file.cc
Outdated
name_idx = 0;
}

name_argv[name_idx++] = Integer::New(env->isolate(), ent.type);
I know at one time V8 would optimize the case for arrays where each element was of the same type. Perhaps this might perform better if we had a separate array for the type values? That would also allow us to avoid having to do `i * 2` inside the JS loop, which is slower than just referencing `i`.
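To make the comparison concrete, here is a minimal sketch of the two layouts in plain JavaScript. The function names `fromInterleaved` and `fromParallel`, the sample names, and the numeric type codes are hypothetical and only for illustration; the binding code under review is C++ and may differ.

```javascript
'use strict';

// Layout 1 (hypothetical): names and types interleaved in a single array,
// [name0, type0, name1, type1, ...]. Reading entry i needs index math.
function fromInterleaved(flat) {
  const entries = [];
  for (let i = 0; i < flat.length / 2; i++) {
    entries.push({ name: flat[i * 2], type: flat[i * 2 + 1] });
  }
  return entries;
}

// Layout 2 (hypothetical): two parallel arrays, each holding one kind of
// value, so entry i is simply names[i] and types[i].
function fromParallel(names, types) {
  const entries = [];
  for (let i = 0; i < names.length; i++) {
    entries.push({ name: names[i], type: types[i] });
  }
  return entries;
}

// Placeholder data: the numbers stand in for directory entry type codes.
const a = fromInterleaved(['a.txt', 1, 'lib', 2]);
const b = fromParallel(['a.txt', 'lib'], [1, 2]);
console.log(JSON.stringify(a) === JSON.stringify(b)); // prints true
```

Both layouts carry the same information; the second keeps each array homogeneous, which is the property the comment is asking about.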
doc/api/fs.md
Outdated
@@ -2304,9 +2385,10 @@ changes:
* `path` {string|Buffer|URL}
* `options` {string|Object}
* `encoding` {string} **Default:** `'utf8'`
* `withTypes` {boolean} **Default:** `false`
Suggestion: Expand this to `withFileTypes`, since `types` can refer to a bunch of different things in a programming context.
doc/api/fs.md
Outdated
@@ -283,6 +283,87 @@ synchronous use libuv's threadpool, which can have surprising and negative
performance implications for some applications. See the
[`UV_THREADPOOL_SIZE`][] documentation for more information.

## Class: fs.DirectoryEntry
src/node_file.cc
Outdated
FSReqBase* req_wrap = FSReqBase::from_req(req);
FSReqAfterScope after(req_wrap, req);

if (after.Proceed()) {
nit: Maybe do the reverse condition and return, to save one level of indentation for the rest of the function?
src/node_file.cc
Outdated
if (name_idx >= arraysize(name_argv)) {
fn->Call(env->context(), names, name_idx, name_argv)
.ToLocalChecked();
I guess this is what we do for existing code, but do you mind moving to proper error-checking here? (i.e. return if the `Call` result is empty like you do below)
doc/api/fs.md
Outdated
option set to `true`, the resulting array is filled with `fs.DirectoryEntry`
objects, rather than strings or `Buffers`.

### dirent.name
- Missing YAML?
- It seems this property should be the last one, ABC-wise.
doc/api/fs.md
Outdated
* Returns: {boolean}

Returns `true` if the `fs.DirectoryEntry` object describes a file system directory.
Exceeds 80 characters length.
doc/api/fs.md
Outdated
* Returns: {boolean}

Returns `true` if the `fs.DirectoryEntry` object describes a first-in-first-out (FIFO)
Exceeds 80 characters length.
@addaleax @vsemozhetbyt I added a squash commit which I think addresses your comments. PTAL. @mscdex What would be better as a return value: an array of arrays like
@nodejs/fs
Since documentation additions are significant: @nodejs/documentation
why don't we just add scandir and scandirSync so we don't have to deal with polymorphic return types and weird options also +1 to
I would guess it's still faster to create objects in JS land, but what I was suggesting is to pass back two separate arrays to
req.oncomplete = (err, names, fileTypes) => {
  if (err) {
    callback(err);
    return;
  }
  const len = names.length;
  for (var i = 0; i < len; ++i)
    names[i] = new DirectoryEntry(names[i], fileTypes[i]);
  callback(null, names);
};
I'm not sure what V8 does optimization-wise when reusing the array like that, if it re-optimizes after the loop finishes or what, but that's the general idea I had in mind.
I'm definitely with @devsnek on making these separate API. Having one API return completely different things based on options is .. a bit too much. Different input format causing a different output format... At that point, I think it has just become a different function. Also, the name
@mscdex Right, that would make sense for the async case, but for the sync case, since a single value needs to be returned from the binding function, something holding both arrays is needed.
I had originally planned on doing it that way (as suggested in the linked issue), but most functions on
I did the All that being said: If there's consensus around
A 2-D array would probably work for that case. Perhaps an even better solution that would work for both scenarios would be to create the zero-length arrays in JS land and pass them (via the context object) to C++ land and fill them in there. Then do the same process as I showed in the code snippet earlier.
@mscdex Since the async version seems to be all set up to handle both callbacks and promises in C++, it seemed easier to just have a 2-D array in all cases, which I've done in the latest squash commit. And another CI run...
LGTM. Just some nits that do not block the PR. I only roughly looked at the C++ part.
I have no strong opinion about it being a new API or not.
if (testMethod === method) {
  assert.strictEqual(dirent[testMethod](), true);
} else {
  assert.strictEqual(dirent[testMethod](), false);
Nit: assert.strictEqual(dirent[testMethod](), testMethod === method);
lib/internal/fs/promises.js
Outdated
const len = names.length;
for (var i = 0; i < len; i++) {
  names[i] = new DirectoryEntry(names[i], types[i]);
}
Maybe move this part into `fs/utils.js`? It is done three times and could use the abstraction.
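A rough sketch of what such a shared helper could look like follows. The helper name `namesToEntries` and the stand-in `DirectoryEntry` class are hypothetical; the PR's real class and whatever helper was eventually added may look different.

```javascript
'use strict';

// Hypothetical stand-in for the entry class discussed in this PR.
class DirectoryEntry {
  constructor(name, type) {
    this.name = name;
    this.type = type;
  }
}

// Hypothetical shared helper: replaces each name in place with an entry
// object built from the name and the type at the same index.
function namesToEntries(names, types) {
  const len = names.length;
  for (let i = 0; i < len; i++) {
    names[i] = new DirectoryEntry(names[i], types[i]);
  }
  return names;
}

// Placeholder data: the numbers stand in for directory entry type codes.
const entries = namesToEntries(['a.txt', 'lib'], [1, 2]);
console.log(entries.map((e) => `${e.name}:${e.type}`).join(','));
// prints "a.txt:1,lib:2"
```

With one helper, the sync, callback, and promise code paths could each call it instead of repeating the loop.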
lib/fs.js
Outdated
return result;
if (!options.withFileTypes) {
  return result;
} else {
Super tiny nit: the else could be removed to reduce indentation. The same in other places.
I'm not sure what the AIX and SmartOS errors are all about. It seems like
Here's another CI run, hopefully fixing AIX and SmartOS...
One more CI, hopefully passing this time...
Resume Build: https://ci.nodejs.org/job/node-test-pull-request/16109/
Since this introduces a binding-layer breaking change.... CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1480/
Rebased to deal with New CI: https://ci.nodejs.org/job/node-test-pull-request/16131/
Post-rebase-fix CI: https://ci.nodejs.org/job/node-test-pull-request/16134/
readdir and readdirSync now have a "withFileTypes" option, which, when enabled, provides an array of DirectoryEntry objects, similar to Stats objects, which have the filename and the type information. Refs: #15699 PR-URL: #22020 Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Roman Reiss <me@silverwind.io> Reviewed-By: John-David Dalton <john.david.dalton@gmail.com>
Notable changes: * child_process: * `TypedArray` and `DataView` values are now accepted as input by `execFileSync` and `spawnSync`. #22409 * coverage: * Native V8 code coverage information can now be output to disk by setting the environment variable `NODE_V8_COVERAGE` to a directory. #22527 * deps: * The bundled npm was upgraded to version 6.4.1. #22591 * Changelogs: [6.3.0-next.0](https://github.com/npm/cli/releases/tag/v6.3.0-next.0) [6.3.0](https://github.com/npm/cli/releases/tag/v6.3.0) [6.4.0](https://github.com/npm/cli/releases/tag/v6.4.0) [6.4.1](https://github.com/npm/cli/releases/tag/v6.4.1) * fs: * The methods `fs.read`, `fs.readSync`, `fs.write`, `fs.writeSync`, `fs.writeFile` and `fs.writeFileSync` now all accept `TypedArray` and `DataView` objects. #22150 * A new boolean option, `withFileTypes`, can be passed to `fs.readdir` and `fs.readdirSync`. If set to true, the methods return an array of directory entries. These are objects that can be used to determine the type of each entry and filter them based on that without calling `fs.stat`. #22020 * http2: * The `http2` module is no longer experimental. #22466 * os: * Added two new methods: `os.getPriority` and `os.setPriority`, allowing to manipulate the scheduling priority of processes. #22394 * process: * Added `process.allowedNodeEnvironmentFlags`. This object can be used to programmatically validate and list flags that are allowed in the `NODE_OPTIONS` environment variable. #19335 * src: * Deprecated option variables in public C++ API. #22392 * Refactored options parsing. #22392 * vm: * Added `vm.compileFunction`, a method to create new JavaScript functions from a source body, with options similar to those of the other `vm` methods. #21571 * Added new collaborators: * [lundibundi](https://github.com/lundibundi) - Denys Otrishko PR-URL: #22716
Notable changes:

* child_process:
  * `TypedArray` and `DataView` values are now accepted as input by `execFileSync` and `spawnSync`. #22409
* coverage:
  * Native V8 code coverage information can now be output to disk by setting the environment variable `NODE_V8_COVERAGE` to a directory. #22527
* deps:
  * The bundled npm was upgraded to version 6.4.1. #22591
    * Changelogs: [6.3.0-next.0](https://github.com/npm/cli/releases/tag/v6.3.0-next.0) [6.3.0](https://github.com/npm/cli/releases/tag/v6.3.0) [6.4.0](https://github.com/npm/cli/releases/tag/v6.4.0) [6.4.1](https://github.com/npm/cli/releases/tag/v6.4.1)
* fs:
  * The methods `fs.read`, `fs.readSync`, `fs.write`, `fs.writeSync`, `fs.writeFile` and `fs.writeFileSync` now all accept `TypedArray` and `DataView` objects. #22150
  * A new boolean option, `withFileTypes`, can be passed to `fs.readdir` and `fs.readdirSync`. If set to true, the methods return an array of directory entries. These are objects that can be used to determine the type of each entry and filter them based on that without calling `fs.stat`. #22020
* http2:
  * The `http2` module is no longer experimental. #22466
* os:
  * Added two new methods: `os.getPriority` and `os.setPriority`, allowing to manipulate the scheduling priority of processes. #22407
* process:
  * Added `process.allowedNodeEnvironmentFlags`. This object can be used to programmatically validate and list flags that are allowed in the `NODE_OPTIONS` environment variable. #19335
* src:
  * Deprecated option variables in public C++ API. #22515
  * Refactored options parsing. #22392
* vm:
  * Added `vm.compileFunction`, a method to create new JavaScript functions from a source body, with options similar to those of the other `vm` methods. #21571
* Added new collaborators:
  * [lundibundi](https://github.com/lundibundi) - Denys Otrishko

PR-URL: #22716
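For readers who want to see the `withFileTypes` option from the release notes in use, here is a minimal sketch. The temporary directory, the file names, and the cleanup via `fs.rmSync` (which assumes a reasonably recent Node.js) are assumptions made for this example.

```javascript
'use strict';
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a throwaway directory with one file and one subdirectory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'readdir-demo-'));
fs.writeFileSync(path.join(dir, 'notes.txt'), 'hello');
fs.mkdirSync(path.join(dir, 'src'));

// With withFileTypes, each element is a directory entry object
// rather than a plain string, so no extra fs.stat call is needed.
const entries = fs.readdirSync(dir, { withFileTypes: true });
const subdirs = entries.filter((e) => e.isDirectory()).map((e) => e.name);
const files = entries.filter((e) => e.isFile()).map((e) => e.name);

console.log(subdirs); // [ 'src' ]
console.log(files);   // [ 'notes.txt' ]

fs.rmSync(dir, { recursive: true, force: true });
```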
If possible, I'd like to see this backported to 8.x.
const type = types[i];
if (type === UV_DIRENT_UNKNOWN) {
  const name = names[i];
  const stats = lazyLoadFs().statSync(pathModule.resolve(path, name));
Maybe that was asked before (I didn't read through all comments): why does this use `resolve` instead of `join`? At this point `name` is not expected to be an absolute path.
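To illustrate the difference behind this question, here is a small sketch comparing the two functions. It uses `path.posix` so the results do not depend on the host platform; the directory and entry names are made up for the example.

```javascript
'use strict';
const path = require('path');

const dir = 'some/dir';

// For a plain entry name, join keeps the result relative to `dir`...
const joined = path.posix.join(dir, 'entry.txt');
console.log(joined); // some/dir/entry.txt

// ...while resolve additionally prepends the current working directory,
// producing an absolute path.
const resolved = path.posix.resolve(dir, 'entry.txt');
console.log(path.posix.isAbsolute(resolved)); // true

// The two diverge when the second argument is absolute:
// join concatenates, resolve discards `dir` entirely.
const joinedAbs = path.posix.join(dir, '/etc/passwd');
const resolvedAbs = path.posix.resolve(dir, '/etc/passwd');
console.log(joinedAbs);   // some/dir/etc/passwd
console.log(resolvedAbs); // /etc/passwd
```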
const type = types[i];
if (type === UV_DIRENT_UNKNOWN) {
  const name = names[i];
  const stats = lazyLoadFs().statSync(pathModule.resolve(path, name));
Why does this use `statSync` instead of `lstatSync`? This makes it rather inconsistent, because in one case a Dirent may be reported as a symbolic link, while in the case where `readdir` gives `type === UV_DIRENT_UNKNOWN`, symlinks are resolved and the type of the symlinked file is used instead.
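The inconsistency described here can be reproduced with a short sketch that compares `fs.statSync` and `fs.lstatSync` on a symlink. The temporary directory and names are invented for the example; creating symlinks may require extra privileges on Windows, and `fs.rmSync` assumes a reasonably recent Node.js.

```javascript
'use strict';
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a real directory and a symlink pointing at it.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lstat-demo-'));
const target = path.join(dir, 'target-dir');
const link = path.join(dir, 'link');
fs.mkdirSync(target);
fs.symlinkSync(target, link, 'dir');

// stat follows the link and describes the target.
const viaStat = fs.statSync(link);
// lstat describes the link itself.
const viaLstat = fs.lstatSync(link);

console.log(viaStat.isDirectory(), viaStat.isSymbolicLink());   // true false
console.log(viaLstat.isDirectory(), viaLstat.isSymbolicLink()); // false true

fs.rmSync(dir, { recursive: true, force: true });
```

A fallback based on `lstatSync` would therefore report the entry as a symbolic link, matching what the directory entry type reports when it is known.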
This makes walking large directory trees much more efficient on Node 10.10 or later. See: https://lwn.net/Articles/606995/ https://www.python.org/dev/peps/pep-0471/ nodejs/node#22020 https://nodejs.org/en/blog/release/v10.10.0/ Signed-off-by: Anders Kaseorg <andersk@mit.edu>
…35286) This makes walking large directory trees much more efficient on Node 10.10 or later. See: https://lwn.net/Articles/606995/ https://www.python.org/dev/peps/pep-0471/ nodejs/node#22020 https://nodejs.org/en/blog/release/v10.10.0/ Signed-off-by: Anders Kaseorg <andersk@mit.edu>
…icrosoft#35286) This makes walking large directory trees much more efficient on Node 10.10 or later. See: https://lwn.net/Articles/606995/ https://www.python.org/dev/peps/pep-0471/ nodejs/node#22020 https://nodejs.org/en/blog/release/v10.10.0/ Signed-off-by: Anders Kaseorg <andersk@mit.edu>
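The following sketch shows, under stated assumptions, how a directory tree walk can rely on `withFileTypes` instead of a separate `stat` call per entry. The `walk` function and the temporary tree are hypothetical and are not the code from the referenced commit; `fs.rmSync` and recursive `fs.mkdirSync` assume a reasonably recent Node.js.

```javascript
'use strict';
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical iterative walk: collects every non-directory path under root.
// Symbolic links are not followed, because isDirectory() is false for them.
function walk(root) {
  const files = [];
  const stack = [root];
  while (stack.length > 0) {
    const current = stack.pop();
    for (const entry of fs.readdirSync(current, { withFileTypes: true })) {
      const full = path.join(current, entry.name);
      if (entry.isDirectory()) {
        stack.push(full); // type is already known, no stat call needed
      } else {
        files.push(full);
      }
    }
  }
  return files;
}

// Build a small throwaway tree to walk.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-demo-'));
fs.mkdirSync(path.join(root, 'sub', 'deep'), { recursive: true });
fs.writeFileSync(path.join(root, 'a.txt'), '');
fs.writeFileSync(path.join(root, 'sub', 'b.txt'), '');
fs.writeFileSync(path.join(root, 'sub', 'deep', 'c.txt'), '');

// Normalize to sorted, forward-slash relative paths for display.
const found = walk(root)
  .map((p) => path.relative(root, p).split(path.sep).join('/'))
  .sort();
console.log(found); // [ 'a.txt', 'sub/b.txt', 'sub/deep/c.txt' ]

fs.rmSync(root, { recursive: true, force: true });
```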
readdir and readdirSync now have a "withFileTypes" option, which, when enabled,
provides an array of DirectoryEntry objects, similar to Stats objects,
which have the filename and the type information.
Ref: #15699
Checklist
`make -j4 test` (UNIX), or `vcbuild test` (Windows) passes