fs: improve cpSync no-filter copyDir performance #58461

dario-piotrowicz
Member

Move the logic in `cpSync` that copies a directory from src to dest from JavaScript to C++, improving its performance.

Note: this improvement is not applied if the `filter` option is provided; that optimization will be looked into separately.


Example of perf improvement
// index.js
const fs = require('node:fs');
const path = require('node:path');

const fileName = path.basename(__filename);

// Start from a clean slate.
fs.rmSync('./tmp', { recursive: true, force: true });
fs.rmSync('./tmp-out', { recursive: true, force: true });

fs.mkdirSync('./tmp', { recursive: true });

// Build the source tree: 100 top-level files...
for (let i = 0; i < 100; i++) {
    fs.writeFileSync(`./tmp/file-${i}.txt`, `This is file number ${i}`);
}

// ...plus 10 subdirectories with 100 files each.
for (let i = 0; i < 10; i++) {
    fs.mkdirSync(`./tmp/dir-${i}`);
    for (let j = 0; j < 100; j++) {
        fs.writeFileSync(`./tmp/dir-${i}/file-${j}.txt`, `This is file number ${j} inside dir number ${i}`);
    }
}

// Recursive directory copy without a filter.
fs.cpSync(`./tmp`, `./tmp-out`, { recursive: true });

// Copy this script into ./tmp repeatedly (single-file copies).
const runs = 500_000;
for (let i = 0; i < runs; i++) {
    fs.cpSync(`./${fileName}`, `./tmp/${fileName}`, { errorOnExist: true });
}

[Image: Screenshot at 2025-05-26 00-29-31]
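
The script above does not print any timings itself. A minimal sketch of how the no-filter and filter paths could be timed side by side, assuming a small temporary tree and `performance.now()`; the helpers `makeTree` and `timeCopy` are hypothetical, and the numbers will vary by machine and Node.js version:

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { performance } = require('node:perf_hooks');

// Hypothetical helper: create `dirs` directories holding `files` files each.
function makeTree(root, dirs, files) {
  for (let i = 0; i < dirs; i++) {
    const dir = path.join(root, `dir-${i}`);
    fs.mkdirSync(dir, { recursive: true });
    for (let j = 0; j < files; j++) {
      fs.writeFileSync(path.join(dir, `file-${j}.txt`), `file ${j} in dir ${i}`);
    }
  }
}

// Hypothetical helper: time one recursive cpSync call, in milliseconds.
function timeCopy(src, dest, extraOpts = {}) {
  const start = performance.now();
  fs.cpSync(src, dest, { recursive: true, ...extraOpts });
  return performance.now() - start;
}

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'cp-bench-'));
const src = path.join(root, 'src');
makeTree(src, 5, 20);

// No filter: eligible for the optimization described in this PR.
const noFilterMs = timeCopy(src, path.join(root, 'out-a'));
// A filter that accepts everything still counts as "filter provided".
const filterMs = timeCopy(src, path.join(root, 'out-b'), { filter: () => true });

console.log({ noFilterMs, filterMs });
```

A single run on a tree this small is noisy; a larger tree and several repetitions would be needed before drawing conclusions.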

@nodejs-github-bot added the c++, lib / src, and needs-ci labels May 26, 2025
Comment on lines 496 to 526
// small wait to make sure that destStat.mtime.getTime() would produce a time
// different from srcStat.mtime.getTime() if preserveTimestamps wasn't set to true
await setTimeout(5);
Member Author

This change is not strictly necessary; however, I noticed that without this additional wait, a non-functioning `preserveTimestamps` implementation would not consistently make this test fail.
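
The property the test relies on can be sketched as follows. This is an illustration, not the PR's test: instead of the 5 ms wait it backdates the source file with `fs.utimesSync` (an assumption made here to keep the example synchronous and deterministic), so that a copy which ignored `preserveTimestamps` would end up with a clearly different mtime. The function name `copyAndCompareMtime` is hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: copy a directory with preserveTimestamps and report
// the source and destination mtimes of one file, in milliseconds.
function copyAndCompareMtime() {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'cp-mtime-'));
  const src = path.join(root, 'src');
  const dest = path.join(root, 'dest');
  const srcFile = path.join(src, 'a.txt');

  fs.mkdirSync(src);
  fs.writeFileSync(srcFile, 'hello');

  // Assumption: backdate the source instead of waiting, so that "now" and
  // the source mtime cannot coincide by accident.
  const past = new Date('2020-01-01T00:00:00Z');
  fs.utimesSync(srcFile, past, past);

  fs.cpSync(src, dest, { recursive: true, preserveTimestamps: true });

  const srcMtime = fs.statSync(srcFile).mtime.getTime();
  const destMtime = fs.statSync(path.join(dest, 'a.txt')).mtime.getTime();

  fs.rmSync(root, { recursive: true, force: true });
  return { srcMtime, destMtime };
}

const { srcMtime, destMtime } = copyAndCompareMtime();
console.log(srcMtime === destMtime ? 'timestamps preserved' : 'timestamps differ');
```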

if (!opts.filter) {
// the caller didn't provide a js filter function, in this case
// we can run the whole function faster in C++
// TODO(dario-piotrowicz): look into making cpSyncCopyDir also accept the potential filter function
Member Author

I can also look into the filter function in this PR if preferred, but I figured it could make sense to attempt that separately.
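
The dispatch described in the snippet above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the code in `cp-sync.js`: `fastCopyDir` is a hypothetical stand-in for the native binding (here it simply delegates to `fs.cpSync`), and `filteredCopyDir` is a simplified JavaScript walk that consults the filter for every entry.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical stand-in for the C++ fast path: no per-entry JS callback.
function fastCopyDir(src, dest) {
  fs.cpSync(src, dest, { recursive: true });
  return 'fast';
}

// Simplified JS path: every entry goes through the user-supplied filter.
function filteredCopyDir(src, dest, filter) {
  fs.mkdirSync(dest, { recursive: true });
  for (const entry of fs.readdirSync(src, { withFileTypes: true })) {
    const from = path.join(src, entry.name);
    const to = path.join(dest, entry.name);
    if (!filter(from, to)) continue; // skipped by the filter
    if (entry.isDirectory()) {
      filteredCopyDir(from, to, filter);
    } else {
      fs.copyFileSync(from, to);
    }
  }
  return 'filtered';
}

// Pick the fast path only when the caller did not provide a filter.
function copyDir(src, dest, opts = {}) {
  if (!opts.filter) return fastCopyDir(src, dest);
  return filteredCopyDir(src, dest, opts.filter);
}
```

The filter is the reason for the split: it is a JavaScript function that must be invoked for each entry, so the copy cannot run as a single uninterrupted native call.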

@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch 2 times, most recently from f776f77 to 54f1daf Compare May 26, 2025 11:35
@dario-piotrowicz dario-piotrowicz marked this pull request as draft May 26, 2025 12:17
codecov bot commented May 26, 2025

Codecov Report

Attention: Patch coverage is 77.34807% with 41 lines in your changes missing coverage. Please review.

Project coverage is 90.13%. Comparing base (c2d4c78) to head (92ed1cf).
Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/node_file.cc | 74.37% | 20 Missing and 21 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #58461      +/-   ##
==========================================
- Coverage   90.19%   90.13%   -0.07%     
==========================================
  Files         636      636              
  Lines      187705   187878     +173     
  Branches    36852    36869      +17     
==========================================
+ Hits       169306   169335      +29     
- Misses      11161    11293     +132     
- Partials     7238     7250      +12     
| Files with missing lines | Coverage Δ |
|---|---|
| lib/internal/fs/cp/cp-sync.js | 56.96% <100.00%> (-35.25%) ⬇️ |
| src/node_errors.h | 87.50% <ø> (ø) |
| src/node_file.cc | 75.56% <74.37%> (-0.54%) ⬇️ |

... and 29 files with indirect coverage changes


@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch 4 times, most recently from e08f0ad to fc55e9f Compare May 26, 2025 15:13
@dario-piotrowicz dario-piotrowicz marked this pull request as ready for review May 26, 2025 15:15
@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch 3 times, most recently from 9b4154d to 5cb540a Compare May 26, 2025 23:32
src/node_file.cc Outdated

static void CpSyncCopyDir(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
Isolate* isolate = env->isolate();
Member

You can use `args.GetIsolate()` directly; there is no need for an environment.

Member Author

but I am using the env in the function below, no? 🤔

@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch from b88a8dc to fed9c50 Compare June 1, 2025 13:46
@dario-piotrowicz dario-piotrowicz marked this pull request as draft June 1, 2025 14:30
src/node_file.cc Outdated
dir_entry.path(), dest_entry_path, file_copy_opts, error);
if (error) {
if (error.value() == EEXIST) {
THROW_ERR_FS_CP_EEXIST(
Member Author

I was thinking about this throw... and I think it could be nicer to just throw the EEXIST error from the C++ layer, catch it in the JS layer, and then throw an ERR_FS_CP_EEXIST error from there (using the appropriate class).

However, I did not go with that solution, because I think catching and re-throwing in the JS layer interferes with userland debuggers, which can stop on such exceptions through options like "Pause on caught exceptions". Please let me know if you think I'm wrong and that throwing the error from the JS layer would be a better choice here 🙂
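
For reference, the alternative that was considered and rejected would look roughly like the sketch below. Everything here is hypothetical: `nativeCopy` stands in for the C++ binding, `CpEexistError` stands in for the real error class, and the `Set` of existing paths replaces the filesystem so the example stays self-contained.

```javascript
// Hypothetical stand-in for the error class carrying the ERR_FS_CP_EEXIST code.
class CpEexistError extends Error {
  constructor(dest) {
    super(`Target already exists: ${dest}`);
    this.code = 'ERR_FS_CP_EEXIST';
  }
}

// Hypothetical stand-in for the C++ layer: throws a plain EEXIST error.
function nativeCopy(dest, existing) {
  if (existing.has(dest)) {
    const err = new Error(`EEXIST: file already exists, '${dest}'`);
    err.code = 'EEXIST';
    throw err;
  }
  existing.add(dest);
}

// The rejected approach: catch the low-level error in JS and re-throw a
// richer one. The catch block is what a debugger with "Pause on caught
// exceptions" enabled would stop on.
function copyWithRethrow(dest, existing) {
  try {
    nativeCopy(dest, existing);
  } catch (err) {
    if (err.code === 'EEXIST') throw new CpEexistError(dest);
    throw err;
  }
}
```

Throwing the final error directly from the native layer, as the PR does, avoids the intermediate caught exception entirely.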

@dario-piotrowicz dario-piotrowicz added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Jun 1, 2025
@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch from 34856b5 to 74ce1b2 Compare June 1, 2025 18:22
@dario-piotrowicz dario-piotrowicz removed the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Jun 1, 2025
@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch from 74ce1b2 to 8e068db Compare June 1, 2025 18:46
@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch from a3d1f74 to b71f878 Compare June 1, 2025 23:12
@dario-piotrowicz dario-piotrowicz requested a review from anonrig June 2, 2025 00:05
@dario-piotrowicz dario-piotrowicz marked this pull request as ready for review June 2, 2025 00:05
@dario-piotrowicz
Member Author

OK, finally it's all green (even on Windows)! 😅

Sorry @anonrig the PR changed significantly after your review, could you have another look? 🙏

move the logic in `cpSync` that copies a directory from
src to dest from JavaScript to C++ increasing its performance

Note: this improvement is not applied if the filter option is
      provided, such optimization will be looked into separately
@dario-piotrowicz dario-piotrowicz force-pushed the dario/move-cpsync-copyDir-to-cpp branch from b71f878 to 92ed1cf Compare June 7, 2025 12:03
@dario-piotrowicz dario-piotrowicz added the request-ci Add this label to start a Jenkins CI on a PR. label Jun 7, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jun 7, 2025
@dario-piotrowicz dario-piotrowicz added the commit-queue Add this label to land a pull request using GitHub Actions. label Jun 7, 2025
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Jun 7, 2025
@nodejs-github-bot nodejs-github-bot merged commit 3c351c2 into nodejs:main Jun 7, 2025
60 checks passed
@nodejs-github-bot
Collaborator

Landed in 3c351c2

@dario-piotrowicz dario-piotrowicz deleted the dario/move-cpsync-copyDir-to-cpp branch June 7, 2025 14:53
aduh95 pushed a commit that referenced this pull request Jun 7, 2025
move the logic in `cpSync` that copies a directory from
src to dest from JavaScript to C++ increasing its performance

Note: this improvement is not applied if the filter option is
      provided, such optimization will be looked into separately
PR-URL: #58461
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
aduh95 pushed a commit that referenced this pull request Jun 10, 2025
seriousme pushed a commit to seriousme/node that referenced this pull request Jun 10, 2025