Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: fast drop LinkStageOutput #842

Closed
wants to merge 6 commits into from
Closed

perf: fast drop LinkStageOutput #842

wants to merge 6 commits into from

Conversation

underfin
Copy link
Contributor

Description

Copy link

netlify bot commented Apr 12, 2024

Deploy Preview for rolldown-rs canceled.

Name Link
🔨 Latest commit 31c526e
🔍 Latest deploy log https://app.netlify.com/sites/rolldown-rs/deploys/66194f647d0d6600086988e2

Copy link

codecov bot commented Apr 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.58%. Comparing base (f2c0b8a) to head (31c526e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #842      +/-   ##
==========================================
+ Coverage   80.50%   80.58%   +0.07%     
==========================================
  Files         133      135       +2     
  Lines        6679     6705      +26     
==========================================
+ Hits         5377     5403      +26     
  Misses       1302     1302              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

codspeed-hq bot commented Apr 12, 2024

CodSpeed Performance Report

Merging #842 will not alter performance

Comparing perf-dorp (95d1fc2) with main (ba6b79d)

Summary

✅ 6 untouched benchmarks

@underfin underfin closed this Apr 12, 2024
@underfin underfin deleted the perf-dorp branch April 12, 2024 09:09
@underfin underfin restored the perf-dorp branch April 12, 2024 09:20
@underfin underfin reopened this Apr 12, 2024
@underfin underfin marked this pull request as draft April 12, 2024 11:17
@hyf0
Copy link
Member

hyf0 commented Apr 12, 2024

  1. What's the optimization for? Rust or Node performance?
  • If this is only for rust performance, we don't need this due to rust users could have their own fast-drop and drop our bundler

    • The reason we don't need this is that I think it's not our responsibility to do this optimization, since it could be done in user-land.
    • Second, this create threads implicitly in drop method
      • Implicitly here is bad. You couldn't predict creating thread behaviors by looking into the control flow.
      • it also can't be turned off. If users don't want this behavior, they need to either give up rolldown or raise a PR. So it doesn't make scene in the first place, especially user could this optimization on their own.
  • If this only for node performance

    • You need a benchmark data to show that this is worth it for such a big change.
    • The changed code should be limited to rolldown_binding
  1. How this optimization works?

Dropping data to another thread may show advantage on multiple time build but bad for one-time usage such as cli.


Saw you making this a draft, not worth for me to continue writing this.

@underfin
Copy link
Contributor Author

underfin commented Apr 12, 2024

I'm finding why the ci benchmark is not changed. The js benchmark makes it difficult to see changes at local, but the local rust benchmark will see the result. The pr branch is 180 ms compared to main branch 200 ms for threejs10x.

If this is only for rust performance, we don't need this due to rust users could have their own fast-drop and drop our bundler

It is a Rust problem because we never drop memory manually, it depends on Rust compiler. The problem is Module/AsT will be used at rolldown everywhere, the drop action happened at the final process, but the data is larger, it take 20% times to drop it, and it will block the main thread.

How this optimization works

The performance is due to has something to do before quit the main thread(drop data), move the action of drop data to other thread will run it and other something in parallel. Here has some detail for it.

@Boshen
Copy link
Member

Boshen commented Apr 12, 2024

The technique is legit. But maybe don't call it fast_drop, call it drop_in_another_thread?

@Boshen
Copy link
Member

Boshen commented Apr 12, 2024

To show evidence, you should use Mac Xcode Instruments and see if your CPUs are occupied with drops in the end.

crates/rolldown/src/drop_in_another_thread.rs Show resolved Hide resolved
Comment on lines +22 to +26
drop_in_another_thread(std::mem::take(self));
}
}

impl Drop for Symbols {

This comment was marked as outdated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I still think should do this explicitly rather do it in drop method.

@hyf0
Copy link
Member

hyf0 commented Apr 12, 2024

Dropping data to another thread may show advantage on multiple time build but bad for one-time usage such as cli.

With that being said, we probably could add a flag to show we are running in cli mode and we could call std::mem:forget instead of drop_in_another_thread on the bundler. Probably would be more faster.

@Brooooooklyn
Copy link
Contributor

Using std::mem::forget is absolutely a footgun. If you don't perform any memory release operations during a large project build, it will lead to OOM issues more frequently, especially in memory-constrained environments like GitHub Actions and Vercel build. Introducing more complex memory management mechanisms, such as GC algorithms or LRU, would significantly increase the complexity of memory management and the likelihood of bugs.

I also oppose manually spawning memory to another thread for release internally; on some platforms, like wasm32, this approach would be counterproductive.


impl Drop for Symbols {
fn drop(&mut self) {
drop_in_another_thread(std::mem::take(self));
Copy link
Member

@hyf0 hyf0 Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was so against implicity that I didn't notce this in the first place. Does this cause infinity loop? Every drop on Self will create a new Self and when the new Self is dropped another Self will be created again.

We end up create infinity threads on infinity Self.

@hyf0
Copy link
Member

hyf0 commented Apr 13, 2024

Overall, I think it's valueable. rspack use this optimization and has some improment. However as @Brooooooklyn said, it's not a best way in any situation, so what we need here is only apply this optimization in some conditions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants