fix(turboquant): guard upstream-only grpc-server fields for fork #10043

Merged
mudler merged 1 commit into master from worktree-fix-turboquant-grpc-server-bump on May 28, 2026

@localai-bot
Collaborator

Summary

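The turboquant backend reuses backend/cpp/llama-cpp/grpc-server.cpp but compiles it against an older llama.cpp fork (TheTom/llama-cpp-turboquant). Two recent changes added references to upstream-only struct fields outside the existing LOCALAI_LEGACY_LLAMA_CPP_SPEC guards, breaking the turboquant build for every flavor:

  • common_params::checkpoint_min_step (default + option handler), added with the ggml-org/llama.cpp 35c9b1f3 bump (#9998)
  • the common_params_speculative::draft tensor_buft_overrides sentinel termination (#9919), which sat after the guard's #endif
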
The fork has neither field, so grpc-server.cpp.o failed with 'struct common_params' has no member named 'checkpoint_min_step' and 'struct common_params_speculative' has no member named 'draft'.

This wraps the three references in #ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC (the same fork-compat pattern already used elsewhere in the file). The stock llama-cpp build is unchanged (macro undefined → code compiled as before); the turboquant build (macro injected by patch-grpc-server.sh) skips them. The patch-grpc-server.sh doc comment is updated to record what the macro now gates out.
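
The guard pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from grpc-server.cpp: demo_params, apply_defaults, handle_option, and the numeric values are hypothetical stand-ins. Only the macro name LOCALAI_LEGACY_LLAMA_CPP_SPEC and the field name checkpoint_min_step come from this PR.

```cpp
// Sketch only: hypothetical stand-ins for the real llama.cpp types and handlers.
#include <cassert>
#include <string>

// The fork build defines LOCALAI_LEGACY_LLAMA_CPP_SPEC (per this PR, it is
// injected by patch-grpc-server.sh). Uncomment to emulate that build here.
// #define LOCALAI_LEGACY_LLAMA_CPP_SPEC

struct demo_params {
    int n_ctx = 0;
#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    int checkpoint_min_step = 0;  // upstream-only field, missing in the fork
#endif
};

// Sets defaults; the upstream-only field is touched only in the stock build.
inline void apply_defaults(demo_params & p) {
    p.n_ctx = 4096;               // hypothetical default
#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    p.checkpoint_min_step = 1;    // hypothetical default
#endif
}

// Handles one key/value option; returns false when the key is not recognized.
inline bool handle_option(demo_params & p, const std::string & key, int value) {
    assert(!key.empty());         // option keys are never empty in this sketch
    if (key == "n_ctx") {
        p.n_ctx = value;
        return true;
    }
#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    if (key == "checkpoint_min_step") {
        p.checkpoint_min_step = value;
        return true;
    }
#endif
    return false;
}
```

With the macro undefined, all three guarded regions compile as ordinary code. With it defined, the preprocessor drops them, so the compiler never sees a reference to a member the fork's struct lacks.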

Test plan

  • Local fallback-flavor turboquant build: grpc-server.cpp compiles against the fork and local-ai-backend:turboquant image builds
  • Preprocessor verification: with the macro defined the missing-field references are gone; both views (stock/fork) stay brace-balanced
  • CI backend-jobs-multiarch -cpu-turboquant (amd64 + arm64) green

backend/cpp/llama-cpp/grpc-server.cpp is reused by the turboquant build,
which compiles against an older llama.cpp fork (TheTom/llama-cpp-turboquant).
Two recent changes added references to upstream-only struct fields outside the
existing LOCALAI_LEGACY_LLAMA_CPP_SPEC guards:

  - common_params::checkpoint_min_step (default + option handler), added with
    the ggml-org/llama.cpp 35c9b1f3 bump (#9998)
  - the common_params_speculative::draft tensor_buft_overrides sentinel
    termination (#9919), which sat after the guard's #endif

The fork has neither field, so grpc-server.cpp failed to compile for every
turboquant flavor. Wrap the three references in #ifndef
LOCALAI_LEGACY_LLAMA_CPP_SPEC, matching the existing fork-compat guards, so the
stock llama-cpp build is unchanged and the fork build skips them. Update
patch-grpc-server.sh's doc comment to record what the macro now gates out.
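
The second failure mode, a statement that landed after an existing guard's #endif, might look like the sketch below once fixed. The types and the finalize_overrides helper are hypothetical, and the sketch assumes that "sentinel termination" means appending a null entry to the end of the override list. Only the macro name and the draft and tensor_buft_overrides member names are taken from this commit message.

```cpp
// Sketch only: hypothetical stand-ins, not the real llama.cpp structs.
#include <cassert>
#include <cstddef>
#include <vector>

struct demo_override {
    const char * pattern;  // nullptr marks the terminating sentinel entry
    const char * buft;
};

struct demo_draft {
    std::vector<demo_override> tensor_buft_overrides;
};

struct demo_speculative {
#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    demo_draft draft;      // upstream-only member, missing in the fork
#endif
};

// Appends the sentinel if it is missing and returns the list length.
// In the fork build the whole body is gated out and 0 is returned.
inline std::size_t finalize_overrides(demo_speculative & spec) {
#ifndef LOCALAI_LEGACY_LLAMA_CPP_SPEC
    auto & list = spec.draft.tensor_buft_overrides;
    if (list.empty() || list.back().pattern != nullptr) {
        list.push_back({nullptr, nullptr});
    }
    assert(!list.empty()); // the list always ends with the sentinel here
    return list.size();
#else
    (void) spec;           // nothing to terminate in the fork build
    return 0;
#endif
}
```

Keeping the statement inside the guard, rather than after its #endif, is what lets the same source file compile against both struct layouts.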

Verified by a local fallback-flavor turboquant build: grpc-server.cpp compiles
against the fork and the backend image builds.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
mudler merged commit 1c92b00 into master on May 28, 2026
70 of 71 checks passed
mudler deleted the worktree-fix-turboquant-grpc-server-bump branch on May 28, 2026 at 15:37