
Conversation

@ggerganov
Member

ggerganov commented Sep 19, 2025

  • With this change, there is no longer a need to add ggml-ci to commit messages to trigger the CI
  • The V100 runner now runs both CUDA and Vulkan CI
  • Add T4 runner for CUDA, Vulkan Coopmat1 and Vulkan Coopmat2 runs
  • The Mac Mini runner will run both Metal and Vulkan CI
  • Disable test-backend-ops in Debug builds due to slowness (alternatively, we could add a Release build config with asserts enabled and run just that)
  • Add instructions for adding self-hosted runners (a rough workflow sketch follows this list)
  • Move the MUSA CI instructions to separate README
  • Add AMD V710 runner for Vulkan and ROCm workflows
  • Switch to more lightweight Qwen3 0.6B model
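
As a rough sketch of the shape of such a job (not the exact workflow from this PR): a self-hosted runner is selected by its labels and invokes the existing ci/run.sh driver. The label set, workflow name, and build toggle below are illustrative assumptions.

```yaml
# Illustrative sketch only -- labels and paths are assumptions,
# not the exact workflow added in this PR.
name: ggml-ci

on: [push, pull_request]   # fires on every push; no "ggml-ci" commit keyword required

jobs:
  cuda-v100:
    # the self-hosted runner is selected by its labels (assumed names)
    runs-on: [self-hosted, Linux, X64, V100]
    steps:
      - uses: actions/checkout@v4
      - name: Run ggml-ci (CUDA)
        run: |
          # ci/run.sh drives the full ggml-ci; GG_BUILD_CUDA selects the CUDA build
          GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
```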

@github-actions github-actions bot added the devops label (improvements to build systems and github actions) Sep 19, 2025
@ggerganov ggerganov force-pushed the gg/ggml-ci-self-host branch 3 times, most recently from e05d6c2 to b50a96f on September 19, 2025 17:47
@ggerganov ggerganov marked this pull request as ready for review September 19, 2025 19:12
@ggerganov ggerganov changed the title ci : try to migrate ggml ci to a self-hosted runner ci : migrate ggml ci to a self-hosted runner Sep 19, 2025
@ggerganov ggerganov changed the title ci : migrate ggml ci to a self-hosted runner ci : migrate ggml ci to a self-hosted runners Sep 19, 2025
@ggerganov ggerganov changed the title ci : migrate ggml ci to a self-hosted runners ci : migrate ggml ci to self-hosted runners Sep 19, 2025
@netrunnereve
Collaborator

It'll definitely be good to have a Vulkan CI. Is it not crashing now on Nvidia?

BTW the V100 is pretty old, and I wonder if it's worth upgrading to something newer with coopmat and coopmat2 support. Budget-wise it should be possible to move the existing ARM and x86 runs to the free GitHub machines (I think 16 GB and 4 cores are enough for the CI) or have the AMX machine do double duty as an x86 CPU runner.

@ggerganov
Member Author

It'll definitely be good to have a Vulkan CI. Is it not crashing now on Nvidia?

On both the V100 and the T4, the F16 EXP tests fail because the Vulkan F16 kernels return 65504.000000 (the largest finite F16 value) instead of inf:

[EXP] inf mismatch: Vulkan0=65504.000000 CPU=inf   EXP(type=f16,ne_a=[128,2,2,2],v=0): FAIL
[EXP] inf mismatch: Vulkan0=65504.000000 CPU=inf   EXP(type=f16,ne_a=[5,7,11,13],v=0): FAIL

Is this a known issue, and is there a workaround?
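
For anyone reproducing this, test-backend-ops can be narrowed to a single op and backend. A hypothetical CI step (the binary path is an assumption; flag spellings are worth double-checking against the tool's --help):

```yaml
- name: Reproduce the F16 EXP mismatch (illustrative)
  run: |
    # -o filters by op name, -b by backend name as printed in the log above
    ./build/bin/test-backend-ops test -b Vulkan0 -o EXP
```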

@am17an
Collaborator

am17an commented Sep 20, 2025

Maybe #15652?

@ggerganov
Member Author

Maybe #15652?

Yes, probably a patch like this is needed. I'll leave it to be fixed on master.

@netrunnereve

For the Vulkan runs, we now have:

  • V100 (no coopmat)
  • T4 with coopmat1
  • T4 with coopmat2
  • Will add a Mac workflow on Tuesday

Let me know if there are any additional Vulkan configs that we can add. Or any machines from the Azure cloud that can be useful to exercise in addition to the current ones.

@ggerganov ggerganov requested a review from slaren September 20, 2025 08:40
@netrunnereve
Collaborator

Let me know if there are any additional Vulkan configs that we can add. Or any machines from the Azure cloud that can be useful to exercise in addition to the current ones.

For Vulkan your runs already cover most cases, including the FP16 path (Mac with MoltenVK), the integer dot product path (V100), and coopmat 1 and 2 (T4). We also already run the tests for the FP16 path with llvmpipe using the standard GitHub machines. The only remaining path is the FP32 one (run with GGML_VK_DISABLE_F16, GGML_VK_DISABLE_COOPMAT, GGML_VK_DISABLE_COOPMAT2, and GGML_VK_DISABLE_INTEGER_DOT_PRODUCT), which is only used on old GPUs. If we have room for that then go for it; otherwise I don't think it's that important.
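
If the FP32 path does get added, a sketch of the step, assuming the four environment variables above are the only toggles needed (the binary path is also an assumption):

```yaml
- name: test-backend-ops, Vulkan FP32 path (illustrative)
  run: |
    # force plain FP32 shaders by disabling every fast path
    GGML_VK_DISABLE_F16=1 \
    GGML_VK_DISABLE_COOPMAT=1 \
    GGML_VK_DISABLE_COOPMAT2=1 \
    GGML_VK_DISABLE_INTEGER_DOT_PRODUCT=1 \
    ./build/bin/test-backend-ops test -b Vulkan0
```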

If you're thinking of getting another machine, the NVads V710 is an AMD one which could be used for the ROCm CI as well as an additional Vulkan test case for benchmarking and detecting driver bugs. Having an Intel GPU would be nice for SYCL and Vulkan as well, but Azure doesn't have those.

FYI, while I was looking into this I also discovered that the V100 Azure machines will be discontinued at the end of this month, so that'll have to be dealt with eventually.

@slaren
Member

slaren left a comment

I think this is a good starting point. Some notes:

  • A lot of this could be consolidated into a single job using a matrix (see the sketch after these notes)
  • It may be too heavy to run the full ggml-ci on every PR push; it may be necessary to leave the full version for master only and use lighter tests for PRs
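
A minimal sketch of the matrix idea, assuming one label per runner and GG_BUILD_* toggles like those used by ci/run.sh (all names here are illustrative):

```yaml
jobs:
  ggml-ci:
    strategy:
      matrix:
        include:
          - { runner: v100, build: GG_BUILD_CUDA=1 }    # assumed runner labels
          - { runner: v100, build: GG_BUILD_VULKAN=1 }
          - { runner: t4,   build: GG_BUILD_VULKAN=1 }
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - run: ${{ matrix.build }} bash ./ci/run.sh ./tmp/results ./tmp/mnt
```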

@ggerganov
Member Author

ggerganov commented Sep 21, 2025

@netrunnereve Do you know which driver I need to install for the AMD V710 GPU?

https://github.com/ggml-org/llama.cpp/actions/runs/17892383263/job/50874519651?pr=16116

Currently I have installed the Vulkan SDK and the project builds, but it does not detect the GPU.

Edit: I think I found it: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/azure-n-series-amd-gpu-driver-linux-installation-guide
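
For the record, the linked guide goes through AMD's amdgpu-install package. A heavily hedged outline of the steps, with <version> as a placeholder to be taken from the guide (this is one-time VM setup rather than a per-run workflow step):

```yaml
- name: Install AMD GPU driver (one-time setup, illustrative)
  run: |
    # <version> and the Ubuntu release path are placeholders -- use the guide's URLs
    wget https://repo.radeon.com/amdgpu-install/<version>/ubuntu/jammy/amdgpu-install_<version>_all.deb
    sudo apt install -y ./amdgpu-install_<version>_all.deb
    # "graphics" pulls in the Vulkan userspace driver, "rocm" adds the ROCm stack
    sudo amdgpu-install -y --usecase=graphics,rocm
```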

@CISC
Collaborator

CISC commented Sep 21, 2025

It may be too heavy to run the full ggml-ci on every PR push; it may be necessary to leave the full version for master only and use lighter tests for PRs

Perhaps limit them to pushes that touch the related backend?
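
GitHub Actions supports this directly with paths filters on the trigger; a sketch, assuming the current repo layout (the glob is an assumption):

```yaml
# run the Vulkan job only when Vulkan-related sources change (illustrative)
on:
  pull_request:
    paths:
      - 'ggml/src/ggml-vulkan/**'
```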

@ggerganov ggerganov merged commit 28baac9 into master Sep 21, 2025
1 check passed
@ggerganov ggerganov deleted the gg/ggml-ci-self-host branch September 21, 2025 13:50
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
* ci : migrate ggml ci to self-hosted runners

* ci : add T4 runner

* ci : add instructions for adding self-hosted runners

* ci : disable test-backend-ops from debug builds due to slowness

* ci : add AMD V710 runner (vulkan)

* cont : add ROCM workflow

* ci : switch to qwen3 0.6b model

* cont : fix the context size