Kokkos: fix arithmetic bugs in existing ports + add 7 new ports by Copilot · Pull Request #63 · kento/HeCBench

Copilot · 2026-04-12T04:51:49Z

Reviews all 60 existing Kokkos benchmark ports for arithmetic correctness and begins porting the 432 benchmarks not yet on Kokkos.

Bug Fixes in Existing Ports

adam-kokkos: eps was 1e-10f instead of 1e-8f, causing the Adam denominator sqrt(v_corrected + eps) to be too small and producing oversized parameter updates
romberg-kokkos: getFirstSetBitPos used (int)(logf(n)/logf(2.f)), which truncates to the wrong integer at certain powers of two (e.g. logf(8192)/logf(2.f) = 12.999... → 12 instead of 13); replaced with log2f(n) to match the CUDA reference

New Kokkos Ports (7)

| Benchmark | Description |
| --- | --- |
| `cbsfil-kokkos` | CBS filter |
| `cobahh-kokkos` | Hodgkin-Huxley neuron simulation |
| `depixel-kokkos` | Depixelize |
| `ecdh-kokkos` | Elliptic Curve Diffie-Hellman |
| `expdist-kokkos` | Exponential distance |
| `memcpy-kokkos` | Memory copy bandwidth |
| `pso-kokkos` | Particle swarm optimization |

All new ports follow the established pattern: Kokkos::View + create_mirror_view/deep_copy for data movement, parallel_for/parallel_reduce for kernels, and Kokkos::atomic_* where needed. 425 benchmarks remain to be ported.

aobench-kokkos: fix transposed x/y pixel coordinates

The 1D->2D index decomposition used idx/h and idx%h, which assigned the row to x and the column to y (opposite of CUDA). Fix: y = idx/w (row), x = idx%w (column).

aop-kokkos: fix missing sums.w reduction in prepare_svd_kernel

The CUDA version reduces all four moment sums (x, y, z, w) for the QR/SVD assembly. The Kokkos port omitted the atomic_add for sums.w (sum of S^4 for in-the-money paths), leaving final_sums.w always zero and corrupting the SVD and subsequent regression.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: kento <1034379+kento@users.noreply.github.com>

adam-kokkos: eps constant was 1e-10f instead of 1e-8f from the CUDA reference. The smaller epsilon makes the Adam optimizer denominator smaller, producing numerically incorrect parameter updates.

romberg-kokkos: getFirstSetBitPos used logf(x)/logf(2.f) to compute log2. Due to float32 rounding, logf(8192)/logf(2.f) = 12.999..., which truncates to 12 instead of 13, and logf(32768)/logf(2.f) = 14.999..., which truncates to 14 instead of 15. This misroutes 5 of the 65535 function evaluations into the wrong Richardson extrapolation buckets. Fixed with the direct log2f intrinsic, matching the CUDA reference.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: kento <1034379+kento@users.noreply.github.com>

Agent-Logs-Url: https://github.com/kento/HeCBench/sessions/078c5e19-017b-4615-9e7a-4e2cd0914222 Co-authored-by: kento <1034379+kento@users.noreply.github.com>

Copilot AI and others added 3 commits April 12, 2026 03:47

Add 7 new kokkos ports and fix 2 bugs in existing implementations

cca7785

Copilot AI assigned Copilot and kento Apr 12, 2026

Copilot created this pull request from a session on behalf of kento April 12, 2026 04:51

kento marked this pull request as ready for review April 12, 2026 04:51

kento merged commit 77713d9 into master Apr 12, 2026

kento merged 3 commits into master from copilot/port-benchmarks-for-kokkos-again
