server : fix swa-full logic#22288

Merged
ggerganov merged 1 commit into master from gg/server-fix-n-swa on Apr 24, 2026

Conversation

@ggerganov
Member

Overview

fix #21468
alt #21749

Simplify the logic by augmenting llama_model_n_swa with a server_context.n_swa member. When --swa-full is passed, we set n_swa = 0 to simulate a non-SWA model.
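A minimal sketch of the idea (the struct and function names below are illustrative, not the actual server code): once --swa-full allocates a full-size cache, reporting an effective n_swa of 0 lets the rest of the server treat the model as non-SWA.

```cpp
#include <cassert>
#include <cstdint>

// Hedged sketch, not the llama.cpp implementation: names are hypothetical.
struct server_context_sketch {
    int32_t n_swa = 0; // effective SWA window; 0 means "behave like a non-SWA model"
};

// With --swa-full a full-size KV cache is allocated, so the SWA window no
// longer constrains cache reuse; reporting 0 simulates a non-SWA model.
int32_t effective_n_swa(int32_t model_n_swa, bool swa_full) {
    return swa_full ? 0 : model_n_swa;
}
```

This keeps the branching in one place: callers only ever consult the effective value instead of re-checking the --swa-full flag.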

@shipped-it

I'm a bit unsure about the approach. Why would the model be reported as non-SWA? It is still SWA, just with a full-size cache.

Also, you suggested using --swa-full --no-mmproj. Correct me if I'm wrong, but in practice I've seen repetition issues when using Gemma 4.

I will confirm shortly if this fixes the issue, or if I get the repetition problem.

@ggerganov
Member Author

Also, you've made the suggestion to use --swa-full --no-mmproj.

Cache reuse does not work with mmproj.

Correct me if I'm wrong, but in practice I've seen repetitions issues when using Gemma 4.

I am not following. This fixes the cache reuse logic - I am not aware of any repetition issues.

@shipped-it

I confirm that it is fixed with this PR or #21749 (tested on ROCm).

Without PR:
warm req: prompt_n=821, prompt_ms=982

With PR:
warm req: prompt_n=5, prompt_ms=71 (about 13x faster)
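The drop from prompt_n=821 to prompt_n=5 is what cache reuse buys: only the tokens after the longest prefix shared with the cached context need to be evaluated again. A hedged illustration of that accounting (not llama.cpp code, names hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of prompt tokens that must be (re)processed on a warm request,
// given the tokens already in the cache. Tokens are plain ints here.
size_t tokens_to_process(const std::vector<int> & cached, const std::vector<int> & prompt) {
    size_t n = 0; // length of the longest shared prefix
    while (n < cached.size() && n < prompt.size() && cached[n] == prompt[n]) {
        n++;
    }
    return prompt.size() - n;
}
```

On a cold request the cache is empty, so the whole prompt is processed; on a warm request with a nearly identical prompt, only the few differing suffix tokens are.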

@ggerganov ggerganov merged commit ffdd983 into master Apr 24, 2026
42 of 47 checks passed
@ggerganov ggerganov deleted the gg/server-fix-n-swa branch April 24, 2026 07:17

Development

Successfully merging this pull request may close these issues.

cache reuse is not supported for Gemma 4 models despite -fa enabled and --swa-full
