perf: fast drop LinkStageOutput #842

underfin · 2024-04-12T08:40:21Z

Description

netlify · 2024-04-12T08:40:52Z

✅ Deploy Preview for rolldown-rs canceled.

Name	Link
🔨 Latest commit	`31c526e`
🔍 Latest deploy log	https://app.netlify.com/sites/rolldown-rs/deploys/66194f647d0d6600086988e2

codecov · 2024-04-12T08:42:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.58%. Comparing base (f2c0b8a) to head (31c526e).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #842      +/-   ##
==========================================
+ Coverage   80.50%   80.58%   +0.07%     
==========================================
  Files         133      135       +2     
  Lines        6679     6705      +26     
==========================================
+ Hits         5377     5403      +26     
  Misses       1302     1302


codspeed-hq · 2024-04-12T08:49:11Z

CodSpeed Performance Report

Merging #842 will not alter performance

Comparing perf-dorp (95d1fc2) with main (ba6b79d)

Summary

✅ 6 untouched benchmarks

hyf0 · 2024-04-12T11:27:40Z

What is the optimization for? Rust or Node performance?

If this is only for Rust performance, we don't need it, because Rust users can implement their own fast drop and drop our bundler themselves.
- I don't think it's our responsibility to do this optimization, since it can be done in user-land.
- Second, this creates threads implicitly in the drop method.
  - Implicit is bad here. You can't predict the thread-creating behavior by reading the control flow.
  - It also can't be turned off. If users don't want this behavior, they have to either give up rolldown or raise a PR. So it doesn't make sense in the first place, especially since users could do this optimization on their own.

If this is only for Node performance:
- You need benchmark data to show that this is worth it for such a big change.
- The changed code should be limited to rolldown_binding.

How does this optimization work?

Dropping data on another thread may show an advantage for repeated builds, but it is bad for one-time usage such as the CLI.

Saw you made this a draft, so it's not worth it for me to continue writing this.

underfin · 2024-04-12T12:15:22Z

I'm investigating why the CI benchmark didn't change. The JS benchmark makes it difficult to see changes locally, but the local Rust benchmark shows the result: the PR branch takes 180 ms compared to 200 ms on main for threejs10x.

> If this is only for rust performance, we don't need this due to rust users could have their own fast-drop and drop our bundler

It is a Rust problem, because we never free memory manually; that is left to the Rust compiler. The problem is that Module/AST data is used everywhere in rolldown, so the drop happens at the very end of the process. The data is large, dropping it takes about 20% of the time, and it blocks the main thread.

> How this optimization works

The gain comes from work that has to happen before the main thread exits (dropping data). Moving the drop to another thread lets it run in parallel with the other remaining work. Here are some details on it.
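
To make the mechanism concrete, here is a minimal sketch of dropping a value on another thread. The helper name `drop_in_another_thread` comes from the diff in this PR, but its signature is an assumption: the returned `JoinHandle`, the `BigOutput` type, and `finish_build` are hypothetical and only exist so the hand-off can be observed.

```rust
use std::thread::{self, JoinHandle};

/// Moves `value` onto a freshly spawned thread and drops it there, so the
/// calling thread does not pay for the deallocation.
/// Assumption: returning the JoinHandle is a choice of this sketch; the
/// helper in the PR may return nothing.
pub fn drop_in_another_thread<T: Send + 'static>(value: T) -> JoinHandle<()> {
    thread::spawn(move || drop(value))
}

/// Hypothetical stand-in for a large structure such as the link stage output.
pub struct BigOutput {
    pub modules: Vec<String>,
}

/// Hypothetical end-of-build step: hand the large value off explicitly and
/// let the caller continue with whatever work remains.
pub fn finish_build(output: BigOutput) -> JoinHandle<()> {
    drop_in_another_thread(output)
}
```

The caller can ignore the handle to detach the thread, or join it when it needs to know the memory has actually been released.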

Boshen · 2024-04-12T12:26:37Z

The technique is legit. But maybe don't call it fast_drop, call it drop_in_another_thread?

Boshen · 2024-04-12T12:28:13Z

To show evidence, you should use Mac Xcode Instruments and see if your CPUs are occupied with drops in the end.

hyf0 · 2024-04-13T07:19:03Z

crates/rolldown/src/drop_in_another_thread.rs

+    drop_in_another_thread(std::mem::take(self));
+  }
+}
+
+impl Drop for Symbols {


However, I still think we should do this explicitly rather than in the drop method.

hyf0 · 2024-04-12T15:30:06Z

> Dropping data to another thread may show advantage on multiple time build but bad for one-time usage such as cli.

That being said, we could probably add a flag to indicate that we are running in CLI mode, and call std::mem::forget instead of drop_in_another_thread on the bundler. That would probably be faster.
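
One way to express such a flag is an explicit strategy that the caller picks, sketched below. `DropStrategy` and `dispose` are hypothetical names for illustration and are not part of rolldown; only `std::mem::forget` and `std::thread::spawn` are real APIs here.

```rust
use std::thread;

/// Hypothetical knob describing how a large value should be released.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DropStrategy {
    /// Drop on the current thread (the default Rust behavior).
    Inline,
    /// Move the value to a detached thread and drop it there.
    AnotherThread,
    /// Never run the destructor; the OS reclaims the memory at process exit.
    /// Only reasonable for a one-shot process such as a CLI run.
    Forget,
}

/// Releases `value` according to the strategy chosen by the caller.
pub fn dispose<T: Send + 'static>(value: T, strategy: DropStrategy) {
    match strategy {
        DropStrategy::Inline => drop(value),
        DropStrategy::AnotherThread => {
            // The handle is discarded on purpose, which detaches the thread.
            thread::spawn(move || drop(value));
        }
        DropStrategy::Forget => std::mem::forget(value),
    }
}
```

Because the strategy is an argument, the behavior stays visible at the call site and can be switched off by whoever embeds the bundler.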

Brooooooklyn · 2024-04-13T07:52:33Z

Using std::mem::forget is absolutely a footgun. If you don't perform any memory release operations during a large project build, it will lead to OOM issues more frequently, especially in memory-constrained environments like GitHub Actions and Vercel build. Introducing more complex memory management mechanisms, such as GC algorithms or LRU, would significantly increase the complexity of memory management and the likelihood of bugs.

I also oppose manually sending memory to another thread to be released internally; on some platforms, like wasm32, this approach would be counterproductive.

hyf0 · 2024-04-13T08:08:46Z

crates/rolldown/src/drop_in_another_thread.rs

+
+impl Drop for Symbols {
+  fn drop(&mut self) {
+    drop_in_another_thread(std::mem::take(self));


I was so against the implicitness that I didn't notice this at first. Does this cause an infinite loop? Every drop of Self creates a new Self, and when the new Self is dropped, another Self is created again.

We end up creating infinite threads for infinite Selfs.
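
The concern follows from Rust's drop semantics, assuming the helper simply drops its argument on a new thread: the value moved out by `std::mem::take(self)` is still a `Symbols`, so dropping it there runs the same `Drop` impl again. A sketch that avoids the recursion takes the inner field instead of `Self`. The `names` field and the `HANDOFFS` counter are hypothetical, since the real fields of `Symbols` are not shown in this thread.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Counts hand-offs so the behavior can be observed; illustration only.
pub static HANDOFFS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the `Symbols` type from the diff.
#[derive(Default)]
pub struct Symbols {
    pub names: Vec<String>,
}

impl Drop for Symbols {
    fn drop(&mut self) {
        // Take only the field. A Vec<String> has no Drop impl that spawns
        // threads, so dropping it elsewhere cannot re-enter this method.
        let names = std::mem::take(&mut self.names);
        if names.is_empty() {
            // Nothing worth offloading; skip the thread entirely.
            return;
        }
        HANDOFFS.fetch_add(1, Ordering::SeqCst);
        thread::spawn(move || drop(names));
    }
}
```

Each dropped `Symbols` triggers at most one hand-off, and an empty one triggers none.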

hyf0 · 2024-04-13T08:10:56Z

Overall, I think it's valuable. rspack uses this optimization and has seen some improvement. However, as @Brooooooklyn said, it's not the best approach in every situation, so what we need here is to apply this optimization only under certain conditions.
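
As one example of such a condition, the wasm32 concern raised earlier in the thread could be handled at compile time. This is only a sketch under assumptions: the `Option<JoinHandle<()>>` return type is invented here to make the chosen path visible, and the thread does not say which conditions rolldown would actually use.

```rust
use std::thread::JoinHandle;

/// On targets other than wasm32, drop the value on a spawned thread.
#[cfg(not(target_arch = "wasm32"))]
pub fn drop_in_another_thread<T: Send + 'static>(value: T) -> Option<JoinHandle<()>> {
    Some(std::thread::spawn(move || drop(value)))
}

/// On wasm32, fall back to dropping inline and report that no thread was
/// used, since spawning a thread there is not generally available.
#[cfg(target_arch = "wasm32")]
pub fn drop_in_another_thread<T: Send + 'static>(value: T) -> Option<JoinHandle<()>> {
    drop(value);
    None
}
```

Callers see the same function on every target, and only the native build pays for the extra thread.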

github-actions bot added the needs-triage label Apr 12, 2024

underfin closed this Apr 12, 2024

underfin deleted the perf-dorp branch April 12, 2024 09:09

underfin restored the perf-dorp branch April 12, 2024 09:20

underfin reopened this Apr 12, 2024

underfin marked this pull request as draft April 12, 2024 11:17

underfin added 5 commits April 12, 2024 23:10

perf: fast drop LinkStageOutput

bab4641

chore: update

39091aa

chore: update

2565c69

refactor: update

c919d54

refactor: update

31ac0ab

underfin force-pushed the perf-dorp branch from 87af4f1 to 31ac0ab Compare April 12, 2024 15:11

chore: comment

31c526e

hyf0 requested changes Apr 12, 2024

hyf0 reviewed Apr 13, 2024

underfin closed this Apr 15, 2024

underfin mentioned this pull request Apr 15, 2024

Performance optimizations #867

Closed
