
Investigate copy_to_directory performance #258

Closed
kormide opened this issue Oct 7, 2022 · 5 comments
Labels: cleanup Cleanup task

kormide commented Oct 7, 2022

See https://bazelbuild.slack.com/archives/CA31HN1T3/p1665178148461539.

A copy_to_directory on 2500 files is taking on the order of 10s. Investigate what's causing the slowness and see if there are any optimizations that can be made.

@cgrindel cgrindel added the cleanup Cleanup task label Oct 7, 2022

jwnx commented Oct 11, 2022

I'm experiencing the same problem: copying 513 files consistently takes ~2s.

[Screenshot: `screen 2022-10-11 at 13 10 01`]


matthewjh commented Oct 27, 2022

I am also having problems, but with the analysis-phase performance of this rule. Our project is a graph of various ts_project, npm_package_link, npm_package, and js_library targets. When making a change to a BUILD file, such as adding a dep, the Bazel process uses 500% CPU for 15-20s in the analysis phase (i.e. before execution).

Profiling shows that most of this time is spent in copy_to_directory_action:

[Screenshot: `MicrosoftTeams-image (6)`]


gregmagolan commented Nov 3, 2022

Should be improved with 350408b, which is included in the 1.16.0 release.

@matthewjh

Thanks, @gregmagolan , analysis performance is good now. 🙌

Execution performance is still problematic, though, which is what the OP was about. Are you guys still planning to investigate this? Should we open a new issue or re-open this one?


matthewjh commented Nov 24, 2022

It seems to me that, besides simplifying the logic, one thing that would certainly improve perf for larger directories is performing the copy in parallel on different subtrees. At the moment this rule makes very poor use of the CPU, as everything is serial and blocking. This should be easy to implement (e.g. in JS or Go), but I'm not sure how to do it in bash alone.
