Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoids emitting the same output from multiple processes of the same process pool #3531

Merged
merged 4 commits into from
Feb 1, 2023

Conversation

sokra
Copy link
Member

@sokra sokra commented Jan 30, 2023

It uses a shared set of (line: [u8], occurences: u32) to avoid emitting lines that has been emitted by other processes already.

It still supports emitting the same line multiple times from one process.

The shared data might be later used for showing logging from a worker pool. It contains the merged output of all processes.

Fixes WEB-489

@sokra sokra requested a review from a team as a code owner January 30, 2023 18:13
@vercel
Copy link

vercel bot commented Jan 30, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated
examples-basic-web ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-cra-web ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-designsystem-docs ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-kitchensink-blog ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-native-web ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-nonmonorepo ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-svelte-web ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-tailwind-web ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
examples-vite-web ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)
turbo-site ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 1, 2023 at 3:17PM (UTC)

crates/turbopack-node/src/pool.rs Outdated Show resolved Hide resolved
crates/turbopack-node/src/pool.rs Outdated Show resolved Hide resolved
let mut shared = shared.lock().unwrap();
shared.insert((line.clone(), occurances))
};
if new_line {
Copy link
Member

@ForsakenHarmony ForsakenHarmony Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I'm understanding this correctly, a new line gets logged whenever there are more than the previous amount of occurrences?

This also means if one execution logs something 2 times and the next logs it 3 times, it will only output 1 time for the next execution? That seems like confusing behavior that might potentially hide information necessary for debugging

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it could be a bit confusing in some edge cases... Do you have a better idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe do it for the whole output and not line by line

@github-actions
Copy link
Contributor

Benchmark for 81e2498

Test Base PR % Significant %
bench_startup/Turbopack RCC/1000 modules 2522.97ms ± 16.98ms 2461.67ms ± 13.14ms -2.43% -0.04%
bench_startup/Turbopack RSC/1000 modules 2373.29ms ± 8.63ms 2337.04ms ± 6.72ms -1.53% -0.24%
Click to view full benchmark
Test Base PR % Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules 8773.03µs ± 66.68µs 8815.88µs ± 54.50µs +0.49%
bench_hmr_to_commit/Turbopack RCC/1000 modules 9104.05µs ± 59.48µs 9218.37µs ± 88.40µs +1.26%
bench_hmr_to_commit/Turbopack RSC/1000 modules 471.41ms ± 1.63ms 470.14ms ± 1.74ms -0.27%
bench_hmr_to_commit/Turbopack SSR/1000 modules 9073.32µs ± 55.73µs 9303.00µs ± 81.81µs +2.53%
bench_hmr_to_eval/Turbopack CSR/1000 modules 7934.67µs ± 118.65µs 8326.23µs ± 110.83µs +4.93%
bench_hmr_to_eval/Turbopack RCC/1000 modules 8132.99µs ± 87.64µs 8125.92µs ± 106.12µs -0.09%
bench_hmr_to_eval/Turbopack SSR/1000 modules 8123.13µs ± 88.68µs 8068.22µs ± 70.03µs -0.68%
bench_hydration/Turbopack RCC/1000 modules 3729.78ms ± 36.13ms 3673.57ms ± 11.71ms -1.51%
bench_hydration/Turbopack RSC/1000 modules 3181.43ms ± 9.22ms 3182.04ms ± 9.79ms +0.02%
bench_hydration/Turbopack SSR/1000 modules 2967.89ms ± 9.61ms 2984.31ms ± 11.81ms +0.55%
bench_startup/Turbopack CSR/1000 modules 2063.71ms ± 12.87ms 2035.64ms ± 17.45ms -1.36%
bench_startup/Turbopack RCC/1000 modules 2522.97ms ± 16.98ms 2461.67ms ± 13.14ms -2.43% -0.04%
bench_startup/Turbopack RSC/1000 modules 2373.29ms ± 8.63ms 2337.04ms ± 6.72ms -1.53% -0.24%
bench_startup/Turbopack SSR/1000 modules 1981.81ms ± 6.15ms 1991.41ms ± 6.04ms +0.48%

@github-actions
Copy link
Contributor

Benchmark for 2f063d8

Test Base PR % Significant %
bench_hmr_to_eval/Turbopack CSR/1000 modules 7427.42µs ± 71.40µs 7747.10µs ± 83.93µs +4.30% +0.12%
Click to view full benchmark
Test Base PR % Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules 8362.55µs ± 77.31µs 8199.91µs ± 105.81µs -1.94%
bench_hmr_to_commit/Turbopack RCC/1000 modules 8731.98µs ± 110.32µs 8559.38µs ± 67.00µs -1.98%
bench_hmr_to_commit/Turbopack RSC/1000 modules 463.45ms ± 1.86ms 462.23ms ± 2.02ms -0.26%
bench_hmr_to_commit/Turbopack SSR/1000 modules 8566.04µs ± 100.81µs 8766.64µs ± 83.27µs +2.34%
bench_hmr_to_eval/Turbopack CSR/1000 modules 7427.42µs ± 71.40µs 7747.10µs ± 83.93µs +4.30% +0.12%
bench_hmr_to_eval/Turbopack RCC/1000 modules 7615.38µs ± 72.28µs 7655.67µs ± 52.33µs +0.53%
bench_hmr_to_eval/Turbopack SSR/1000 modules 7690.92µs ± 75.89µs 7745.69µs ± 84.49µs +0.71%
bench_hydration/Turbopack RCC/1000 modules 3631.01ms ± 9.81ms 3620.49ms ± 13.14ms -0.29%
bench_hydration/Turbopack RSC/1000 modules 3197.02ms ± 7.48ms 3185.64ms ± 9.44ms -0.36%
bench_hydration/Turbopack SSR/1000 modules 2974.04ms ± 12.59ms 2962.40ms ± 12.82ms -0.39%
bench_startup/Turbopack CSR/1000 modules 1991.43ms ± 9.22ms 1985.25ms ± 14.66ms -0.31%
bench_startup/Turbopack RCC/1000 modules 2419.89ms ± 8.34ms 2420.20ms ± 8.59ms +0.01%
bench_startup/Turbopack RSC/1000 modules 2316.36ms ± 5.38ms 2308.72ms ± 6.05ms -0.33%
bench_startup/Turbopack SSR/1000 modules 1979.19ms ± 2.78ms 1970.76ms ± 5.24ms -0.43%

@github-actions
Copy link
Contributor

github-actions bot commented Jan 30, 2023

⚠️ CI failed ⚠️

The following steps have failed in CI:

  • Rust tests
  • Rust benchmark tests

See workflow summary for details

}
let line = Arc::from(take(&mut buffer).into_boxed_slice());
let occurance_number = *own_output
.entry(Arc::clone(&line))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we store a hash instead of the raw bytes? Wouldn't that save memory overall?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That work work (with a small risk of collisions), but that would loose the ability to read the logged output from the browser ui in future.

.or_insert(0);
let new_line = {
let mut shared = shared.lock().unwrap();
shared.insert((line.clone(), occurance_number))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need occurance_number? Wouldn't just a HashSet work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to allow logging lines multiple times from a process.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially planned to use HashMap<line: String, count: u32> but that won't preserve the order of logging. So it's a IndexSet<(line: String, occurance_number: u32)> which preserves all logging in the correct order.

We can display that in the browser UI for inspecting by iterating over the IndexSet. That will give all logging in the right order. Merged from all processes of the pool.

@github-actions
Copy link
Contributor

Benchmark for 9c2ea3a

Click to view benchmark
Test Base PR % Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules 9157.39µs ± 64.66µs 9149.88µs ± 85.25µs -0.08%
bench_hmr_to_commit/Turbopack RCC/1000 modules 9433.39µs ± 52.47µs 9513.88µs ± 92.22µs +0.85%
bench_hmr_to_commit/Turbopack RSC/1000 modules 494.07ms ± 3.33ms 491.14ms ± 2.95ms -0.59%
bench_hmr_to_commit/Turbopack SSR/1000 modules 9373.69µs ± 56.64µs 9319.47µs ± 84.85µs -0.58%
bench_hmr_to_eval/Turbopack CSR/1000 modules 8185.54µs ± 64.16µs 8199.27µs ± 63.15µs +0.17%
bench_hmr_to_eval/Turbopack RCC/1000 modules 8443.55µs ± 87.39µs 8432.42µs ± 44.39µs -0.13%
bench_hmr_to_eval/Turbopack SSR/1000 modules 8447.94µs ± 72.49µs 8502.64µs ± 68.31µs +0.65%
bench_hydration/Turbopack RCC/1000 modules 3787.77ms ± 13.30ms 3811.32ms ± 17.13ms +0.62%
bench_hydration/Turbopack RSC/1000 modules 3325.19ms ± 7.70ms 3313.89ms ± 13.66ms -0.34%
bench_hydration/Turbopack SSR/1000 modules 3074.01ms ± 11.06ms 3087.44ms ± 12.47ms +0.44%
bench_startup/Turbopack CSR/1000 modules 2074.08ms ± 12.69ms 2067.69ms ± 8.04ms -0.31%
bench_startup/Turbopack RCC/1000 modules 2530.89ms ± 9.13ms 2514.65ms ± 3.52ms -0.64%
bench_startup/Turbopack RSC/1000 modules 2437.12ms ± 5.93ms 2418.11ms ± 4.56ms -0.78%
bench_startup/Turbopack SSR/1000 modules 2049.07ms ± 2.84ms 2036.63ms ± 5.80ms -0.61%

Ok(_) => {}
}
let line = Arc::from(take(&mut buffer).into_boxed_slice());
let occurance_number = *own_output
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let occurance_number = *own_output
let occurrence_number = *own_output

.or_insert(0);
let new_line = {
let mut shared = shared.lock().unwrap();
shared.insert((line.clone(), occurance_number))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
shared.insert((line.clone(), occurance_number))
shared.insert((line.clone(), occurrence_number))

crates/turbopack-node/src/pool.rs Show resolved Hide resolved
stream: impl AsyncRead + Unpin,
shared: SharedOutputSet,
mut final_stream: impl AsyncWrite + Unpin,
) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If stream A reports:

error A:
   shared line
   non-shared line (A)

while stream B reports:

error B:
  shared line
  non-shared line (B)

We would end up with:

error A:
  shared line
  non-shared line (A)
error B:
  non-shared line (B)

I expect this to also happen with empty lines printed for formatting purposes.

Should we instead only dedupe the start of a stream, up until the first different character?

@github-actions
Copy link
Contributor

github-actions bot commented Feb 1, 2023

Benchmark for ba8c683

Click to view benchmark
Test Base PR % Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules 9685.64µs ± 72.89µs 9813.65µs ± 61.56µs +1.32%
bench_hmr_to_commit/Turbopack RCC/1000 modules 9975.32µs ± 99.36µs 10.10ms ± 0.09ms +1.22%
bench_hmr_to_commit/Turbopack RSC/1000 modules 508.85ms ± 2.39ms 515.80ms ± 2.09ms +1.37%
bench_hmr_to_commit/Turbopack SSR/1000 modules 9906.52µs ± 85.75µs 9928.39µs ± 67.37µs +0.22%
bench_hmr_to_eval/Turbopack CSR/1000 modules 8669.25µs ± 113.29µs 8711.47µs ± 62.51µs +0.49%
bench_hmr_to_eval/Turbopack RCC/1000 modules 9045.68µs ± 103.19µs 9084.23µs ± 52.95µs +0.43%
bench_hmr_to_eval/Turbopack SSR/1000 modules 9023.56µs ± 85.82µs 8911.38µs ± 65.99µs -1.24%
bench_hydration/Turbopack RCC/1000 modules 3888.72ms ± 18.51ms 3895.80ms ± 11.96ms +0.18%
bench_hydration/Turbopack RSC/1000 modules 3422.35ms ± 11.55ms 3379.98ms ± 10.57ms -1.24%
bench_hydration/Turbopack SSR/1000 modules 3163.31ms ± 13.00ms 3145.91ms ± 12.67ms -0.55%
bench_startup/Turbopack CSR/1000 modules 2164.49ms ± 12.62ms 2149.62ms ± 24.24ms -0.69%
bench_startup/Turbopack RCC/1000 modules 2597.56ms ± 9.82ms 2598.29ms ± 12.69ms +0.03%
bench_startup/Turbopack RSC/1000 modules 2498.36ms ± 8.89ms 2492.68ms ± 6.77ms -0.23%
bench_startup/Turbopack SSR/1000 modules 2119.44ms ± 5.75ms 2125.17ms ± 12.82ms +0.27%

@sokra sokra merged commit 7177ae4 into main Feb 1, 2023
@sokra sokra deleted the sokra/web-521-duplicate-logging-from-the-nodejs-pooled branch February 1, 2023 16:42
mehulkar pushed a commit that referenced this pull request Feb 1, 2023
…rocess pool (#3531)

It uses a `shared` set of `(line: [u8], occurences: u32)` to avoid
emitting lines that has been emitted by other processes already.

It still supports emitting the same line multiple times from one
process.

The `shared` data might be later used for showing logging from a worker
pool. It contains the merged output of all processes.

Fixes WEB-489
@ForsakenHarmony
Copy link
Member

Why did you merge this with 0 approvals and 1 request for change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants