
feat(cuda): add SEQUENCE source op to dyn dispatch#7078

Merged
0ax1 merged 1 commit into develop from ad/cuda-sequence
Mar 20, 2026

0ax1 (Contributor) commented Mar 20, 2026

No description provided.

0ax1 requested a review from joseph-isaacs March 20, 2026 12:11
0ax1 changed the title from "feat(cuda): add SEQUENCE source op to dyn dispatch kernel" to "feat(cuda): add SEQUENCE source op to dyn dispatch" Mar 20, 2026
0ax1 changed the title from "feat(cuda): add SEQUENCE source op to dyn dispatch" to "feat(cuda): add SEQUENCE source op to dyn dispatch" Mar 20, 2026
0ax1 force-pushed the ad/cuda-sequence branch from 91e891f to 3579936 March 20, 2026 12:12
0ax1 enabled auto-merge (squash) March 20, 2026 12:12
codspeed-hq bot commented Mar 20, 2026

Merging this PR will degrade performance by 15.51%

❌ 2 regressed benchmarks
✅ 1014 untouched benchmarks
⏩ 1522 skipped benchmarks [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | bitwise_not_vortex_buffer_mut[1024] | 477.2 ns | 535.6 ns | -10.89% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 317.8 ns | 376.1 ns | -15.51% |

Comparing ad/cuda-sequence (8aa9592) with develop (b1ab304)

Footnotes

  1. 1522 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

0ax1 added the changelog/feature (A new feature) label Mar 20, 2026

Add a SEQUENCE source op that generates value[i] = base + i * multiplier
directly in shared memory, enabling SequenceArray to participate in fused
dynamic dispatch plans instead of requiring a separate kernel launch.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
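
The formula in the commit message can be illustrated with a small CPU-side model. This is only a sketch under stated assumptions, not the kernel from this PR: `kTileLen`, `sequence_fill_tile`, and `sequence_plus_constant` are hypothetical names, the stack array stands in for a block's shared-memory tile, and the follow-up "add a constant" step is an invented stand-in for whatever ops a fused plan would chain after the source op.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tile length; stands in for the shared-memory slice one block owns.
constexpr std::size_t kTileLen = 1024;

// Source op model: fill one tile with value[i] = base + i * multiplier,
// where i is the global element index (tile_start + lane).
template <typename T>
void sequence_fill_tile(T* tile, std::size_t tile_start, std::size_t len,
                        T base, T multiplier) {
    assert(len <= kTileLen);
    for (std::size_t lane = 0; lane < len; ++lane) {
        tile[lane] = base + static_cast<T>(tile_start + lane) * multiplier;
    }
}

// Fused plan model: generate each tile of the sequence, apply a follow-up op
// (here: add a constant) while the tile is still local, then write the result.
// No intermediate full-length sequence buffer is materialized.
template <typename T>
std::vector<T> sequence_plus_constant(std::size_t n, T base, T multiplier, T addend) {
    std::vector<T> out(n);
    T tile[kTileLen];  // stand-in for a shared-memory tile
    for (std::size_t start = 0; start < n; start += kTileLen) {
        const std::size_t len = std::min(kTileLen, n - start);
        sequence_fill_tile(tile, start, len, base, multiplier);
        for (std::size_t lane = 0; lane < len; ++lane) {
            out[start + lane] = tile[lane] + addend;
        }
    }
    return out;
}
```

For example, `sequence_plus_constant<std::int64_t>(5, 10, 3, 1)` yields 11, 14, 17, 20, 23. Because every element depends only on its global index, each tile can be produced independently, which is what lets a source op like this sit at the head of a fused plan rather than in a separate pass.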
0ax1 force-pushed the ad/cuda-sequence branch from 3579936 to 8aa9592 March 20, 2026 12:21
0ax1 requested a review from robert3005 March 20, 2026 13:24
0ax1 merged commit 4a0ed9b into develop Mar 20, 2026
54 of 55 checks passed
0ax1 deleted the ad/cuda-sequence branch March 20, 2026 13:44