Sub-issue of #35 (item 2). Tracks the blocked work: root-cause and fix the Test (OTP 27 / Elixir 1.17) failure so the gate can be re-armed. Do not remove continue-on-error until the suite is green on main.
Root-cause lead (evidence gathered 2026-05-17)
The failure is almost certainly a full-app-boot failure under mix test --no-start, not individual assertions:
Justfile:245 → cd server && mix test --no-start.
server/test/test_helper.exs → Application.ensure_all_started(:burble) (manually boots the whole app because --no-start suppressed the automatic start).
server/lib/burble/application.ex start/2 has no test-env guard — it unconditionally boots the full production supervision tree: Burble.Store (VeriSimDB @ http://localhost:8081), Burble.Transport.RTSP (TCP 8554), Burble.Bolt.Listener (UDP 7373), Burble.LLM.Supervisor (QUIC 8503 via :quicer/msquic NIF), Burble.Media.Engine, Zig coprocessor NIF (server/priv/libburble_coprocessor.so).
config/test.exs only sets Endpoint server: false; it does not disable Burble.Store or the network/NIF children.
- In CI none of these externals exist: VeriSimDB is not running on :8081, the Zig FFI .so may be absent (the "Build Zig FFI" step is itself continue-on-error, so a failed build silently yields no .so), and the quicer NIF/msquic may be unavailable.
⇒ The first hard-crashing child collapses Burble.Supervisor; ensure_all_started(:burble) returns {:error, …}; every test fails at startup — exactly the predicted mode.
(Local repro on this machine is non-authoritative: toolchain is Elixir 1.18.4/OTP 25 vs CI 1.17/OTP 27, and a separate Guardian.Plug.Pipeline compile error blocks local boot. Confirming which child crashes first still needs the blocking prerequisite below.)
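Once CI logs are reachable, the question of which child crashes first could be answered from the test output itself. The following is a minimal sketch, not the project's actual test_helper.exs: BootProbe is a hypothetical helper name, and the nested error tuple it matches is the shape a supervisor child failure during start/2 usually takes, assumed here rather than confirmed from the failing run.

```elixir
defmodule BootProbe do
  @moduledoc false
  # Hypothetical helper: turns the return value of
  # Application.ensure_all_started/1 into a one-line description that names
  # the failing child when the error has the usual supervisor shape.

  def describe({:ok, apps}), do: "started: #{inspect(apps)}"

  # Shape usually produced when a supervisor child fails during start/2
  # (assumed, not taken from the actual CI log).
  def describe({:error, {app, {{:shutdown, {:failed_to_start_child, child, reason}}, _mfa}}}) do
    "#{inspect(app)} failed: child #{inspect(child)} crashed with #{inspect(reason)}"
  end

  # Any other failure (missing .app file, a dependency that failed to start, ...).
  def describe({:error, {app, reason}}) do
    "#{inspect(app)} failed: #{inspect(reason)}"
  end
end

# In test_helper.exs this would run before ExUnit.start():
result = Application.ensure_all_started(:burble)
IO.puts(:stderr, "[boot] " <> BootProbe.describe(result))
```

Printing to stderr keeps the line visible even when every test then fails at startup, so a single pasted log would identify the first crashing child.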
Blocking prerequisite (unchanged from #35)
ONE of: widen cloud-sandbox allowlist to repo.hex.pm + builds.hex.pm; or an actions:read token / Actions-logs MCP tool; or paste the failing ** ( / N) test … failed block.
Remediation directions (decide once root cause confirmed)
- A. Test-env start path: filter Burble.Application children via Application.compile_env(:burble, :start_children) so test boots only the minimal set (PubSub/registries/store-stub), excluding network listeners + external-dep children.
- B. Drop --no-start; disable heavy children via config/test.exs flags the supervisor honours.
- C. Build the Zig coprocessor .so before tests and re-arm the FFI gate (the FFI continue-on-error is part of the failure surface).
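Direction A could look roughly like the sketch below. It is an illustration under assumptions, not the current application.ex: the module name Burble.ChildFilterSketch and the tags (:store, :rtsp, :bolt, :llm, :media) are hypothetical, and the shape of :start_children (either :all or a list of tags) is a guess, since the issue only names the key.

```elixir
defmodule Burble.ChildFilterSketch do
  @moduledoc false

  # Read once at compile time. The default :all keeps today's behaviour
  # (boot every child). The value shape is an assumption of this sketch.
  @start_children Application.compile_env(:burble, :start_children, :all)

  # Tag => child spec. Tags are hypothetical; the modules are the children
  # named in the root-cause notes.
  @all_children [
    store: Burble.Store,
    rtsp: Burble.Transport.RTSP,
    bolt: Burble.Bolt.Listener,
    llm: Burble.LLM.Supervisor,
    media: Burble.Media.Engine
  ]

  # Returns the child specs to hand to Supervisor.start_link/2.
  def children(enabled \\ @start_children)

  def children(:all), do: Keyword.values(@all_children)

  def children(enabled) when is_list(enabled) do
    # Keep declaration order so start order stays predictable.
    for {tag, spec} <- @all_children, tag in enabled, do: spec
  end
end
```

With something like this in place, config/test.exs could set start_children to a short list (or an empty one), and start/2 would pass the filtered list to the supervisor. Because compile_env is resolved at compile time, changing the value requires a recompile, which is the usual trade-off against a runtime Application.get_env lookup.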
Acceptance criteria (from #35)
- just test-server passes on main for OTP 27 / Elixir 1.17
- continue-on-error: true removed from "Run server tests"; a planted failing test turns CI red
- dialyzer (it needs: test) and re-arming the "Build Zig FFI" gate
Refs #35