fix: unbreak master CI (docs, kokoros, vibevoice-cpp ABI) #9682

Merged: mudler merged 4 commits into master from fix/ci-quick-wins on May 6, 2026

@localai-bot
Collaborator

Summary

Three independent fixes for failures observed on master after 06a15241:

  • Deploy docs to GitHub Pages ❌ — Hugo build aborts on two broken relrefs (/docs/features/distributed-mode was using a URL-style path; tts.md doesn't exist, the page is text-to-audio.md).
  • tests-extras / tests-kokoros ❌ — recent backend.proto additions (Diarize, AudioTransform, AudioTransformStream) extended the gRPC Backend trait. kokoros-grpc didn't pick them up and failed compilation with E0046. Adds Unimplemented stubs matching the pattern used for the other non-applicable RPCs in this TTS-only backend.
  • tests-extras / tests-vibevoice-cpp + tests-vibevoice-cpp-grpc-tts ❌ — mudler/vibevoice.cpp reshaped vv_capi_tts twice in quick succession (3bd759c inserted ref_audio_path, ad856bd promoted it to (const char* const* ref_audio_paths, int n_ref_audio_paths) for multi-speaker). purego resolves by symbol name, so the build kept linking; at runtime the misaligned arguments turned the closed-loop TTS→ASR test into a SIGSEGV inside cgo.

The vibevoice fix wires the new ABI properly and uses the moment to expose voice-cloning support:

  • CppTTS purego binding switched to the 9-arg form. []*byte marshals as **char (nil/empty → NULL).
  • New ref_audio gallery option (comma-separated, repeatable), taking one WAV per speaker for the 1.5B path. The last commit in this PR replaces it with the existing audio_path field.
  • TTSRequest.Voice routes by extension/shape: .wav or a comma-list goes to ref_audio_paths; anything else stays on voice_path for the 0.5B pre-baked voice gguf.
  • VIBEVOICE_CPP_VERSION pinned to ad856bda6b1311b7f3d7c4a667be43eeb8a8249a. Floating on master is what allowed the silent ABI break to reach CI in the first place.
  • Added mudler/vibevoice.cpp to .github/workflows/bump_deps.yaml so future upstream rolls land as reviewable PRs alongside the other backends.
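
The TTSRequest.Voice routing rule in the list above can be sketched in a few lines of Go. This is a minimal illustration, not the code from the PR: `routeVoice` and `voiceRoute` are hypothetical names, and the case-insensitive extension match and whitespace trimming are assumptions the description does not confirm.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// voiceRoute is a hypothetical result type. At most one field is populated,
// mirroring the voice_path / ref_audio_paths split described in the PR.
type voiceRoute struct {
	VoicePath     string   // pre-baked voice gguf (realtime-0.5B path)
	RefAudioPaths []string // one WAV per speaker (1.5B path)
}

// routeVoice is an illustrative helper, not LocalAI's actual function.
// A ".wav" value or a comma-separated list is treated as reference audio;
// anything else is treated as a voice file path.
func routeVoice(voice string) voiceRoute {
	voice = strings.TrimSpace(voice)
	if voice == "" {
		return voiceRoute{}
	}
	// Assumption: the extension check ignores case.
	isWav := strings.EqualFold(filepath.Ext(voice), ".wav")
	if !isWav && !strings.Contains(voice, ",") {
		return voiceRoute{VoicePath: voice}
	}
	var refs []string
	for _, p := range strings.Split(voice, ",") {
		if p = strings.TrimSpace(p); p != "" {
			refs = append(refs, p) // entry i conditions Speaker i
		}
	}
	return voiceRoute{RefAudioPaths: refs}
}

func main() {
	for _, v := range []string{"voices/en.gguf", "alice.wav", "alice.wav, bob.wav"} {
		fmt.Printf("%q -> %+v\n", v, routeVoice(v))
	}
}
```

Keeping the decision in one pure function makes the rule easy to unit-test without loading the shared library.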

Test plan

  • Deploy docs to GitHub Pages succeeds on this PR
  • tests-extras / tests-kokoros compiles and passes
  • tests-extras / tests-vibevoice-cpp closed-loop TTS→ASR test passes
  • tests-extras / tests-vibevoice-cpp-grpc-tts and …-grpc-transcription pass
  • No regressions on the other extras (rerankers is failing for an unrelated reason — torch infer_schema rejecting stringified 'torch.Tensor' annotations; not in scope here)

Out of scope

  • 1.5B model gallery entry: mudler/vibevoice.cpp-models doesn't ship a 1.5B GGUF yet. The backend already supports the model via ref_audio: (audio_path: after the last commit in this PR); the gallery row is a follow-up once the model file is published.
  • tests-rerankers torch/transformers version drift — separate workstream.

mudler added 4 commits May 6, 2026 07:44
The Hugo build has been failing on master since the relevant pages
landed:

- text-generation.md:720 referenced `/docs/features/distributed-mode`,
  but Hugo `relref` paths are relative to the content root, not the
  rendered URL. Drop the `/docs/` prefix so the lookup matches the
  existing `features/...` form used elsewhere in the file.
- audio-transform.md:144 referenced `tts.md`; the actual page is
  `text-to-audio.md`.

Assisted-by: Claude:claude-opus-4-7[1m]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
The recent backend.proto additions (Diarize, AudioTransform,
AudioTransformStream) extended the gRPC Backend trait, breaking
kokoros-grpc compilation with E0046 because the Rust implementation
hadn't picked up the new methods. Add Unimplemented stubs matching the
existing pattern for non-applicable RPCs in this TTS-only backend.

Assisted-by: Claude:claude-opus-4-7[1m]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Two recent commits in mudler/vibevoice.cpp reshaped the vv_capi_tts
signature without a corresponding bump on the LocalAI side:

  3bd759c "1.5b: unify into a single tts entry point" inserted a
          ref_audio_path parameter between voice_path and dst_wav_path.
  ad856bd "1.5b: multi-speaker dialog support" promoted that to a
          (const char* const* ref_audio_paths, int n_ref_audio_paths)
          pair for per-speaker conditioning.

Because purego resolves symbols by name and not by signature, the
build kept linking; at runtime the misaligned arguments turned the
TTS->ASR closed-loop test into a SIGSEGV inside cgo. Track HEAD
explicitly and bring the bridge in line with it:

  * Update the CppTTS purego binding to the 9-arg form. purego
    marshals []*byte as a **char by handing the C side the underlying
    array address; nil/empty maps to NULL, which matches the C
    contract for "no reference audio" on the realtime-0.5B path.
  * Add a `ref_audio` gallery option (comma-separated, repeatable)
    that the 1.5B path consumes for runtime voice cloning. Multiple
    entries are interpreted as one WAV per speaker (Speaker 0..n-1).
  * TTSRequest.Voice now routes by extension/shape: `.wav` or a
    comma-separated list goes to ref_audio_paths; anything else stays
    on voice_path (realtime-0.5B's pre-baked voice gguf).
  * Pin VIBEVOICE_CPP_VERSION to ad856bd and wire the Makefile into
    the existing bump_deps matrix so future upstream rolls land as
    reviewable PRs instead of a silent CI break.

Assisted-by: Claude:claude-opus-4-7[1m]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
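
As a rough illustration of the `(ref_audio_paths, n_ref_audio_paths)` pair described in the commit message above, the Go side could build its `[]*byte` argument along these lines. This is a sketch using only the standard library: `cStringArray` is a hypothetical helper name, and the actual binding in the PR may construct the slice differently.

```go
package main

import "fmt"

// cStringArray is a hypothetical helper. It copies each Go string into a
// NUL-terminated byte buffer and returns pointers to the first byte of each
// buffer, plus the element count for the int parameter on the C side.
// A nil or empty input yields (nil, 0), matching the "no reference audio"
// contract described in the commit message.
func cStringArray(paths []string) ([]*byte, int32) {
	if len(paths) == 0 {
		return nil, 0
	}
	ptrs := make([]*byte, len(paths))
	for i, p := range paths {
		buf := make([]byte, len(p)+1) // the extra zero byte is the C terminator
		copy(buf, p)
		ptrs[i] = &buf[0] // the interior pointer keeps buf reachable
	}
	return ptrs, int32(len(ptrs))
}

func main() {
	ptrs, n := cStringArray([]string{"alice.wav", "bob.wav"})
	fmt.Println(len(ptrs), n) // prints "2 2"

	none, zero := cStringArray(nil)
	fmt.Println(none == nil, zero) // prints "true 0"
}
```

The caller would need to keep the returned slice reachable until the foreign call returns (for example with `runtime.KeepAlive`), since the C side only sees raw addresses.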
Use the existing audio_path field from ModelOptions (already plumbed
through config_file's `audio_path:` YAML and consumed by other audio
backends like kokoros) instead of inventing a custom `ref_audio:`
Options[] string. Multi-speaker setups stay on a single comma-
separated value.

No behavior change beyond the gallery key name; per-call routing via
TTSRequest.Voice is unchanged.

Assisted-by: Claude:claude-opus-4-7[1m]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler merged commit a8d7d37 into master May 6, 2026
48 checks passed
@mudler mudler deleted the fix/ci-quick-wins branch May 6, 2026 08:37
@localai-bot localai-bot added the bug Something isn't working label May 9, 2026