
cleanup(gpu): delete GPU stubs and migrate native benchmark to DSL #131

Merged
michalharakal merged 1 commit into develop from cleanup/drop-gpu-stubs on May 4, 2026

Conversation

@michalharakal
Contributor

Summary

There is no real GPU support in this repo — the GPU code paths in
`:llm-runtime:kllama` and the native benchmark engine are placeholders
that have never run on GPU and always fall back to CPU. This PR deletes
them and migrates the native benchmark to the DSL path, mirroring the
JVM cleanup from #127.

  • Deleted: `GpuAttentionBackend.kt`, `GpuTensorBridge.kt` from kllama commonMain
  • Stripped from kllama backend expect/actual (linux/macos/ios): `createGpuTensorBridge` and `createGraphAccelerator` (the latter was unused dead code — `GraphAccelerator` interface and the JVM `FusedQKVAccelerator` impl are CPU-side and stay)
  • Stripped from llm-performance macosMain: `createMetalContext`, `createMlxContext`, `createGpuBridge` (only `availableNativeBackends` remains)
  • Rewrote `NativeBenchmarkEngine`: dropped `GpuNativeLlamaAdapter` and the Metal/MLX scenario adapters; renamed scenario `native-backend-throughput` → `native-cpu-throughput`; migrated CPU adapter from legacy `LlamaRuntime` + `LlamaIngestion` to the DSL path (`DecoderGgufWeightLoader` + `LlamaNetworkLoader.fromWeights` + `OptimizedLLMRuntime` DIRECT)
  • Updated `AttentionBackend` kdoc to drop the `GpuAttentionBackend` reference

Net: +73 / −479 across 9 files.
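To make the "stubs that always fall back to CPU" claim concrete, here is a minimal, self-contained sketch of the pattern being deleted. All names and signatures are illustrative stand-ins, not the repo's actual `AttentionBackend` API: a "GPU" backend that contains no GPU dispatch at all and unconditionally delegates to the CPU implementation, which is why removing it cannot change behavior.

```kotlin
// Hypothetical sketch of the stub pattern this PR removes. Names are
// illustrative; the real repo interfaces may differ.

interface AttentionBackend {
    val name: String
    fun score(query: FloatArray, key: FloatArray): Float
}

class CpuAttentionBackend : AttentionBackend {
    override val name = "cpu"
    // Plain dot product as a stand-in for the real attention kernel.
    override fun score(query: FloatArray, key: FloatArray): Float =
        query.zip(key) { q, k -> q * k }.sum()
}

class GpuAttentionBackend(
    private val fallback: AttentionBackend = CpuAttentionBackend()
) : AttentionBackend {
    override val name = "gpu"
    override fun score(query: FloatArray, key: FloatArray): Float {
        // No GPU dispatch was ever implemented; every call lands here.
        return fallback.score(query, key)
    }
}

fun main() {
    val q = floatArrayOf(1f, 2f, 3f)
    val k = floatArrayOf(4f, 5f, 6f)
    val cpu = CpuAttentionBackend().score(q, k)
    val gpu = GpuAttentionBackend().score(q, k)
    // Identical results: the "GPU" backend is pure indirection over CPU.
    println("cpu=$cpu gpu=$gpu identical=${cpu == gpu}")
}
```

Because the GPU class is pure indirection, callers see byte-identical results with or without it, so the deletion is behavior-preserving by construction.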

Test plan

  • `./gradlew :llm-runtime:kllama:compileKotlinJvm :llm-runtime:kllama:compileKotlinLinuxX64 :llm-runtime:kllama:compileKotlinMacosArm64 :llm-runtime:kllama:compileKotlinWasmJs` — all green
  • `./gradlew :llm-performance:compileKotlinJvm :llm-performance:compileKotlinMacosArm64 :llm-performance:compileKotlinWasmJs` — all green
  • `./gradlew :llm-runtime:kllama:jvmTest :llm-performance:jvmTest` — pass
  • (optional) Run `bash tests/smoke/smoke-test.sh` to confirm CLI smoke harness still passes

🤖 Generated with Claude Code

Removes the placeholder GPU code paths in :llm-runtime:kllama and the
native benchmark engine. There is no real GPU support in this repo —
GpuAttentionBackend, GpuTensorBridge, and the createGpuBridge /
createMetalContext / createMlxContext expect/actual chains were stubs
that always fell back to CPU.

- Delete GpuAttentionBackend.kt and GpuTensorBridge.kt
- Strip createGpuTensorBridge / createGraphAccelerator from kllama
  BackendExpect.kt and the linux/macos/ios actuals (createGraphAccelerator
  was unused dead code)
- Drop createMetalContext / createMlxContext / createGpuBridge from
  llm-performance macosMain; only availableNativeBackends remains
- Rewrite NativeBenchmarkEngine: drop GpuNativeLlamaAdapter and the
  Metal/MLX scenario adapters; rename scenario to native-cpu-throughput
  and migrate the CPU adapter to the DSL path (DecoderGgufWeightLoader
  + LlamaNetworkLoader.fromWeights + OptimizedLLMRuntime DIRECT),
  mirroring #127's JVM cleanup
- Drop GpuAttentionBackend reference from AttentionBackend kdoc

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@michalharakal michalharakal merged commit 3d5cd31 into develop on May 4, 2026
2 checks passed
@michalharakal michalharakal deleted the cleanup/drop-gpu-stubs branch May 4, 2026 22:16
