feat[cuda]: add a cuda stream pool#6205

Merged
0ax1 merged 8 commits into develop from ji/fuzz-oom
Jan 29, 2026

Conversation

@joseph-isaacs
Contributor

Hopefully this will help avoid running GPUs out of memory (OOM).

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@codspeed-hq

codspeed-hq bot commented Jan 29, 2026

Merging this PR will degrade performance by 29.5%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 8 improved benchmarks
❌ 4 regressed benchmarks
✅ 1167 untouched benchmarks
⏩ 1323 skipped benchmarks [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | 1M_10pct[100000] | 21.4 µs | 27.1 µs | -21.14% |
| Simulation | canonical_into_non_nullable[(10000, 1, 0.0)] | 27.9 µs | 24.7 µs | +12.88% |
| Simulation | canonical_into_non_nullable[(10000, 1, 0.01)] | 34.2 µs | 31.1 µs | +10.02% |
| Simulation | canonical_into_non_nullable[(10000, 10, 0.0)] | 285.3 µs | 194.1 µs | +46.95% |
| Simulation | canonical_into_non_nullable[(10000, 10, 0.1)] | 471.6 µs | 380.6 µs | +23.91% |
| Simulation | canonical_into_non_nullable[(10000, 100, 0.01)] | 2.2 ms | 2.9 ms | -26.86% |
| Simulation | canonical_into_non_nullable[(10000, 100, 0.1)] | 3.7 ms | 4.6 ms | -18.18% |
| Simulation | into_canonical_non_nullable[(10000, 10, 0.1)] | 461.8 µs | 383.9 µs | +20.3% |
| Simulation | canonical_into_non_nullable[(10000, 10, 0.01)] | 312.1 µs | 221.1 µs | +41.2% |
| Simulation | into_canonical_non_nullable[(10000, 10, 0.01)] | 302.8 µs | 228.1 µs | +32.78% |
| Simulation | into_canonical_non_nullable[(10000, 10, 0.0)] | 276.1 µs | 200.6 µs | +37.67% |
| Simulation | canonical_into_non_nullable[(10000, 100, 0.0)] | 1.9 ms | 2.7 ms | -29.5% |

Comparing ji/fuzz-oom (2a4cc78) with develop (f568de5)

Footnotes

  1. 1323 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@0ax1 self-requested a review on January 29, 2026 15:49
@joseph-isaacs added the feature (A feature request) label on Jan 29, 2026
// The closure always returns Some, so fetch_update always succeeds.
let slot_idx = self
    .next_index
    .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |x| {

Use fetch_add here instead of fetch_update.

Make sure the capacity is a power of two (2^n), and apply % len to the returned value.
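
The two review suggestions above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code merged in this PR: `RoundRobinPool`, `slots`, and `next` are hypothetical names, and a generic `T` stands in for a CUDA stream handle. Only `next_index` is taken from the snippet under review. The power-of-two requirement matters because `fetch_add` wraps on overflow, and the rotation only continues without a skip across that wrap when the length divides the counter's range.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin pool; `T` stands in for a stream handle.
struct RoundRobinPool<T> {
    slots: Vec<T>,
    next_index: AtomicUsize,
}

impl<T> RoundRobinPool<T> {
    /// Assumption: the pool is built from a pre-filled vector of slots.
    fn new(slots: Vec<T>) -> Self {
        // A power-of-two length divides usize::MAX + 1, so the rotation
        // stays in order even when the counter wraps around.
        assert!(
            slots.len().is_power_of_two(),
            "capacity must be a power of two"
        );
        Self {
            slots,
            next_index: AtomicUsize::new(0),
        }
    }

    /// Hands out slots in round-robin order.
    fn next(&self) -> &T {
        // fetch_add returns the previous value and wraps on overflow,
        // so no compare-and-swap loop is needed.
        let ticket = self.next_index.fetch_add(1, Ordering::Relaxed);
        &self.slots[ticket % self.slots.len()]
    }
}
```

With four slots, six consecutive calls to `next` would visit slots 0, 1, 2, 3, 0, 1. Unlike a `fetch_update` closure that stores the already-reduced index, the counter here grows monotonically and the reduction happens on the returned value.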

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@joseph-isaacs added the changelog/feature (A new feature) label and removed the feature (A feature request) label on Jan 29, 2026
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@0ax1 left a comment

good

@0ax1 enabled auto-merge (squash) on January 29, 2026 16:21
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@0ax1 merged commit 61f4c99 into develop on Jan 29, 2026
43 of 44 checks passed
@0ax1 deleted the ji/fuzz-oom branch on January 29, 2026 17:08
AdamGS pushed a commit that referenced this pull request Feb 2, 2026
Hopefully it will help not to OOM gpus

---------

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
danking pushed a commit that referenced this pull request Feb 6, 2026
Hopefully it will help not to OOM gpus

---------

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Labels

changelog/feature A new feature
