fix(native): test b9016 runtime and expose mainGpu by leehack · Pull Request #113 · leehack/llamadart

leehack · 2026-05-04T19:26:09Z

Summary

Pins the native runtime hook from b8955 to b9016, consuming the published llamadart-native release assets.
Adds ModelParams.mainGpu and passes it through to llama_model_params.main_gpu for explicit primary GPU selection (#112, "Expose main_gpu on ModelParams for explicit Vulkan / CUDA device selection").
Adds ModelParams.splitMode and passes it through to llama_model_params.split_mode, matching upstream llama.cpp semantics for single-GPU selection via ModelSplitMode.none.
Resolves ggml backend registry/device APIs from the loaded ggml runtime DLL on Windows split bundles when the generated default FFI asset cannot see those symbols.
Documents the upstream ggml-vulkan cooperative-matrix workaround for driver crashes (GGML_VK_DISABLE_COOPMAT, GGML_VK_DISABLE_COOPMAT2).
Updates README, website runtime-parameter docs, support matrix docs, and CHANGELOG entries for the b9016/runtime-parameter/loader changes.

Validation

dart analyze
dart test test/unit/core/models/inference/model_params_test.dart
dart test -p vm test/unit/backends/llama_cpp/llama_cpp_service_test.dart
./tool/docs/build_site.sh
GitHub CI passed on docs-only head 577d10a2b.

Issue status

#111 "Blackwell (RTX 50-series, sm_120) unsupported on Windows: Vulkan crash + CUDA runtime too old", CUDA / Blackwell: reporter confirmed b9016 works on RTX 5050 Laptop with CUDA.
#111, Vulkan / Windows split bundle: reporter confirmed the loader fix works on Windows. Diagnostics now report registeredBackends=[CPU, Vulkan], concrete Vulkan devices, and registryApisUnavailable=false.
#111, Vulkan / driver crashes: the remaining Vulkan crashes on AMD Radeon 860M and NVIDIA RTX 5050 Laptop occur in vendor cooperative-matrix property queries. The reporter confirmed that setting GGML_VK_DISABLE_COOPMAT together with GGML_VK_DISABLE_COOPMAT2 works around them; this workaround is now documented.
#112 "Expose main_gpu on ModelParams for explicit Vulkan / CUDA device selection", device-selection API: addressed at the Dart API/native wiring level with mainGpu plus splitMode.
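
For reference, the cooperative-matrix workaround mentioned for #111 might be applied along these lines in a POSIX shell. This is a minimal sketch, not the documented procedure: the value 1 is an assumption (the variables are expected to take effect simply by being set), and the variables must be present in the environment before the process that loads the Vulkan backend starts.

```shell
# Sketch: disable the ggml-vulkan cooperative-matrix paths for this shell session.
# Assumption: any non-empty value ("1" here) is enough for the backend to treat
# the variable as set; check the upstream ggml-vulkan notes for your runtime build.
export GGML_VK_DISABLE_COOPMAT=1
export GGML_VK_DISABLE_COOPMAT2=1

# List both variables to confirm they will be inherited by child processes,
# such as the application that loads the llamadart native runtime.
env | grep '^GGML_VK_DISABLE_COOPMAT'
```

The reports in #111 come from Windows, where export is not available; there the same two variables would be set with the PowerShell or cmd equivalent before launching the application.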

codecov-commenter · 2026-05-05T01:02:08Z

Codecov Report

❌ Patch coverage is 29.20354% with 80 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.57%. Comparing base (3b8e101) to head (577d10a).
⚠️ Report is 1 commit behind head on main.

Files with missing lines	Patch %	Lines
lib/src/backends/llama_cpp/llama_cpp_service.dart	27.92%	80 Missing ⚠️

❌ Your patch status has failed because the patch coverage (29.20%) is below the target coverage (70.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #113      +/-   ##
==========================================
- Coverage   77.12%   76.57%   -0.55%     
==========================================
  Files          68       68              
  Lines        8594     8670      +76     
==========================================
+ Hits         6628     6639      +11     
- Misses       1966     2031      +65

Flag	Coverage Δ
unittests	`76.57% <29.20%> (-0.55%)`	⬇️

fix(native): pin b9016 runtime and expose mainGpu

1542b0e

This was referenced May 5, 2026

Blackwell (RTX 50-series, sm_120) unsupported on Windows: Vulkan crash + CUDA runtime too old #111

Closed

Expose main_gpu on ModelParams for explicit Vulkan / CUDA device selection #112

Closed

leehack marked this pull request as ready for review May 5, 2026 01:36

leehack added 4 commits May 5, 2026 06:39

fix(native): expose model split mode

fb348b7

docs(native): update b9016 runtime docs

3a92e95

fix(native): resolve ggml registry fallbacks

061161d

docs: document Vulkan coopmat workaround

577d10a

leehack merged commit 5a138a2 into main May 6, 2026
6 checks passed

leehack deleted the codex-native-b9016-cuda-test branch May 6, 2026 12:47

leehack merged 5 commits into main from codex-native-b9016-cuda-test
