
build(deps): bump actions/checkout from 4 to 6 #3

Merged
nicolotognoni merged 1 commit into main from dependabot/github_actions/actions/checkout-6 on Apr 10, 2026
Conversation

dependabot[bot] commented on behalf of github on Apr 10, 2026

Bumps actions/checkout from 4 to 6.

Release notes

Sourced from actions/checkout's releases.

v6.0.0

What's Changed

Full Changelog: actions/checkout@v5.0.0...v6.0.0

v6-beta

What's Changed

Updated persist-credentials to store the credentials under $RUNNER_TEMP instead of directly in the local git config.

This requires a minimum Actions Runner version of v2.329.0 to access the persisted credentials for Docker container action scenarios.

v5.0.1

What's Changed

Full Changelog: actions/checkout@v5...v5.0.1

v5.0.0

What's Changed

⚠️ Minimum Compatible Runner Version

v2.327.1
Release Notes

Make sure your runner is updated to this version or newer to use this release.

Full Changelog: actions/checkout@v4...v5.0.0

v4.3.1

What's Changed

Full Changelog: actions/checkout@v4...v4.3.1

v4.3.0

What's Changed

... (truncated)

Changelog

Sourced from actions/checkout's changelog.

Changelog

v6.0.2

v6.0.1

v6.0.0

v5.0.1

v5.0.0

v4.3.1

v4.3.0

v4.2.2

v4.2.1

v4.2.0

v4.1.7

v4.1.6

... (truncated)

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the labels dependencies (Pull requests that update a dependency file) and github_actions (Pull requests that update GitHub Actions code) on Apr 10, 2026
nicolotognoni merged commit c5ddba8 into main on Apr 10, 2026
6 checks passed
dependabot[bot] deleted the dependabot/github_actions/actions/checkout-6 branch on April 10, 2026 at 16:22
nicolotognoni added a commit that referenced this pull request Apr 21, 2026
…#66)

* fix(deps): pin websockets>=14 and add python-multipart

Fixes BUG #7 and #9 from acceptance suite.

- websockets: pin >=14,<16. The 'additional_headers=' kwarg used by the
  OpenAI Realtime, Deepgram STT and ElevenLabs ConvAI adapters is only
  supported on the new asyncio client that became the default in 14.0.
  Under 13.x the call failed with 'got an unexpected keyword argument
  additional_headers', blocking every streaming provider.
- python-multipart: add to the base install. Starlette >= 0.45 raises on
  'await request.form()' without python-multipart installed, so every
  Twilio webhook returned 422 and the call was silently dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server): repair Twilio & Telnyx webhook stack

Fixes BUG #6, #8, #16 from acceptance suite.

- #8 Request/Response import lifted to the top of server.py. With
  ``from __future__ import annotations`` in place, FastAPI's
  ``get_type_hints(handler)`` resolved the 'Request' annotation against
  module globals where only WebSocket was imported. The ForwardRef stayed
  unresolved, FastAPI classified the parameter as a query-string field
  and every Twilio/Telnyx webhook POST returned HTTP 422 before the
  handler body could run. Local mode was fundamentally broken on 0.4.3.
- #6 dashboard tracking of failed outbound calls: new route
  ``POST /webhooks/twilio/status`` consumes Twilio statusCallback events
  (initiated/ringing/answered/completed/no-answer/busy/failed) and feeds
  them into MetricsStore.update_call_status. Operators now see every
  dialled attempt in the dashboard, including ones that never reach
  media.
- #16 Telnyx Call Control: ``/webhooks/telnyx/voice`` now POSTs
  ``actions/answer`` on call.initiated and ``actions/streaming_start``
  on call.answered against the REST API and returns empty HTTP 200.
  Previously the route returned a JSON ``{commands: [...]}`` body that
  Telnyx silently discards — the call rang forever.

Twilio voice route also falls back to the ``Caller`` / ``Called`` form
fields when ``From`` / ``To`` are empty (see BUG #6 notes).
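
The unresolved-annotation failure described for BUG #8 can be reproduced with the standard library alone. This is a minimal sketch, not the project's code: ``Request`` is a hypothetical stand-in class, ``can_resolve`` is an illustrative helper, and the string annotation imitates what ``from __future__ import annotations`` does to every annotation.

```python
import typing


class Request:
    """Hypothetical stand-in for the framework's Request class."""


def handler(request: "Request") -> None:
    """The string annotation mimics `from __future__ import annotations`."""


def can_resolve(namespace: dict) -> bool:
    """Return True when every annotation on handler resolves in namespace."""
    try:
        typing.get_type_hints(handler, globalns=namespace)
    except NameError:
        return False
    return True


# Request missing from the module globals: the name cannot be resolved.
without_import = can_resolve({})
# Request imported at module top: the annotation resolves to the class.
with_import = can_resolve({"Request": Request})
```

How a framework reacts to the unresolved name is its own business; the sketch only shows that resolution depends on the name being present in the namespace the hints are evaluated against.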

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(telnyx): WS event shape, frame format, track filter, audio sender

Fixes BUG #17, #18, #19 from acceptance suite.

- #17 Media-stream WebSocket events use ``event`` (start / media / stop /
  dtmf / error / connected), not the Call Control REST notification
  ``event_type``. Audio payload lives in ``data.media.payload`` (base64),
  caller/callee live in ``data.start.{from,to}``. Previously the bridge
  matched ``event_type == "stream_started"`` and looked for audio in
  ``payload.audio.chunk`` — no media chunk was ever decoded, so the
  agent never heard the caller.
- #18 Outbound wire format corrected to
  ``{"event":"media","media":{"payload":b64}}`` and
  ``{"event":"clear"}``. The legacy ``event_type``/``payload.audio.chunk``
  shape was silently dropped by Telnyx, so the caller heard silence.
- #19 When ``stream_track=both_tracks`` Telnyx emits media for both the
  caller leg and the agent's own outbound leg; forwarding the outbound
  echo broke OpenAI Realtime turn detection ("speech_started" never
  fired). The bridge now filters ``media.track != "inbound"`` before
  forwarding.

OpenAI Realtime handler on Telnyx is now configured with
``audio_format="g711_ulaw"`` to match the PCMU 8 kHz bidirectional
stream. The TelnyxAudioSender transcodes PCM16 16 kHz → mulaw 8 kHz for
pipeline / ConvAI providers (PCM16 TTS output) and passes mulaw bytes
through when OpenAI Realtime provides them directly.
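
The wire shapes named in #17, #18 and #19 can be sketched with ``json`` and ``base64``. The function names below are hypothetical and the field layout is taken only from the description above (``event``, ``media.payload``, ``media.track``); treating a missing ``track`` as inbound is an assumption.

```python
import base64
import json
from typing import Optional


def encode_media_frame(audio: bytes) -> str:
    """Outbound audio frame: {"event": "media", "media": {"payload": <base64>}}."""
    payload = base64.b64encode(audio).decode("ascii")
    return json.dumps({"event": "media", "media": {"payload": payload}})


def encode_clear_frame() -> str:
    """Outbound frame asking the far end to drop queued audio."""
    return json.dumps({"event": "clear"})


def decode_inbound_audio(raw: str) -> Optional[bytes]:
    """Return caller audio from a media frame, or None for anything else."""
    data = json.loads(raw)
    if data.get("event") != "media":
        return None
    media = data.get("media") or {}
    # Assumption: a frame without a track field is treated as inbound.
    if media.get("track", "inbound") != "inbound":
        return None  # drop the agent's own outbound echo
    return base64.b64decode(media.get("payload", ""))


round_trip = decode_inbound_audio(encode_media_frame(b"\x00\xff"))
echo = decode_inbound_audio(
    json.dumps({"event": "media", "media": {"track": "outbound", "payload": "AA=="}})
)
stop = decode_inbound_audio(json.dumps({"event": "stop"}))
```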

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(twilio): OpenAI Realtime audio format + pass-through audio sender

Fixes BUG #10 from acceptance suite.

OpenAI Realtime emits PCM16 at 24 kHz natively. The Twilio handler
previously left ``audio_format`` at the pcm16 default and fed the bytes
into TwilioAudioSender, which unconditionally ran
``resample_16k_to_8k(pcm) → pcm16_to_mulaw`` assuming 16 kHz input.
24 kHz bytes run through a 16→8 kHz resampler come out at ~66% of the
correct rate — the caller heard a deep, slurred voice.

Fix: on the Twilio path construct
``OpenAIRealtimeStreamHandler(..., audio_format="g711_ulaw")`` so
OpenAI emits Twilio-native mulaw 8 kHz directly. Pair it with
``TwilioAudioSender(..., input_is_mulaw_8k=True)`` which skips the
resample+mulaw encode and forwards the bytes as-is. Pipeline and ConvAI
still produce PCM16 @ 16 kHz and go through the default transcoding
path.
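
The "~66% of the correct rate" figure follows from simple arithmetic, sketched here with an illustrative helper (``seconds_heard`` is not part of the SDK). One second of 24 kHz audio is 24,000 samples; a resampler that believes it is halving 16 kHz audio emits 12,000 samples, which take 1.5 s to play at 8 kHz.

```python
def seconds_heard(actual_rate_hz: int, assumed_rate_hz: int,
                  output_rate_hz: int = 8000) -> float:
    """Playback time for one second of source audio after a mislabelled resample."""
    samples_in = actual_rate_hz  # one second of real audio
    samples_out = samples_in * output_rate_hz / assumed_rate_hz
    return samples_out / output_rate_hz


stretched = seconds_heard(24000, 16000)   # 24 kHz fed to a 16 -> 8 kHz resampler
correct = seconds_heard(16000, 16000)     # input rate matches the assumption
speed = 1 / stretched                     # fraction of the intended playback speed
```

A duration of 1.5 s for one second of speech corresponds to playback at two thirds of the intended speed, which matches the slowed, lowered voice described above.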

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(pipeline): STT path + hooks + barge-in + dedup + hallucination filter

Fixes BUG #12, #15, #20, #22 from acceptance suite.

- #12 Pipeline on Twilio: the bridge converts mulaw 8 kHz → PCM16 16 kHz
  before STT. The STT adapter used to be built with ``for_twilio=True``
  (mulaw 8 kHz) — Deepgram decoded the already-PCM bytes as mulaw and
  produced garbage transcripts. The pipeline now always configures
  linear16 @ 16 kHz.
- #15 ``PipelineHooks.before_send_to_stt`` was declared but never
  invoked. ``PipelineStreamHandler.on_audio_received`` now runs the
  hook on every inbound chunk and drops the chunk when it returns
  ``None``.
- #20 Pipeline barge-in: ``on_audio_received`` used to skip STT when
  ``_is_speaking=True``, blocking any barge-in detection. It now keeps
  forwarding caller audio to STT during TTS (unless
  ``agent.barge_in_threshold_ms == 0``), and ``_stt_loop`` flips
  ``_is_speaking=False`` + ``send_clear`` on any Deepgram transcript
  with text observed while speaking. Effective latency floor is
  ~800 ms (Deepgram interim), so noisy / short TTS sentences may not
  actually be interrupted — full sub-second barge-in requires a
  server-side VAD (Silero, already supported via ``agent.vad=``).
- #22 Dedup + throttle + hallucination filter. Low-quality STT (Whisper
  on mulaw 8 kHz) emits several nearly-identical final transcripts in
  1–2 s ("you", "you", "you") and hallucinates short fillers from
  silence / TTS echo. Each used to kick off a new LLM+TTS turn, and
  consecutive turns overlapped on the caller's line. Fix in
  ``_stt_loop``: dedup identical finals within 2 s, drop any final
  within 500 ms of the last committed turn, drop a curated blacklist
  of fillers (``you``, ``thank you``, ``yeah``, ``uh``, ``.``…).

Also adds the 8 kHz output path used by the Telnyx handler via a
shared linear16 STT factory in ``handlers/common.py``.
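
The three rules from #22 (filler blacklist, 2 s duplicate window, 500 ms throttle) could be combined roughly as follows. This is a sketch under stated assumptions, not the ``_stt_loop`` code: ``TranscriptGate`` is a hypothetical name, the filler set is only the subset quoted above, the normalisation is guessed, and the clock is injected so the behaviour can be checked deterministically.

```python
import string
import time

# Subset of the fillers quoted in the commit message; the real list is longer.
FILLERS = {"you", "thank you", "yeah", "uh"}


class TranscriptGate:
    """Decide whether a final transcript should start a new turn."""

    def __init__(self, clock=time.monotonic, dedup_window_s=2.0, throttle_s=0.5):
        self._clock = clock
        self._dedup_window_s = dedup_window_s
        self._throttle_s = throttle_s
        self._last_text = None
        self._last_commit = None

    def accept(self, text: str) -> bool:
        # Assumed normalisation: lowercase, strip punctuation and whitespace.
        norm = text.lower().strip(string.punctuation + string.whitespace)
        if not norm or norm in FILLERS:
            return False  # hallucinated filler or empty after stripping
        now = self._clock()
        if self._last_commit is not None:
            elapsed = now - self._last_commit
            if elapsed < self._throttle_s:
                return False  # too soon after the last committed turn
            if norm == self._last_text and elapsed < self._dedup_window_s:
                return False  # same final repeated inside the window
        self._last_text, self._last_commit = norm, now
        return True


fake_now = [0.0]
gate = TranscriptGate(clock=lambda: fake_now[0])
results = []
for at, text in [(0.0, "Hello there"), (0.3, "What time is it"),
                 (1.0, "hello there."), (1.0, "Thank you."), (3.5, "Hello there")]:
    fake_now[0] = at
    results.append(gate.accept(text))
```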

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(providers): voice name resolver, Deepgram knobs, TTS streaming resample

Fixes BUG #11, #13, #23 from acceptance suite.

- #11 ElevenLabs voice-name resolver. ``Patter.elevenlabs(voice="rachel")``
  (the quickstart default) used to pass "rachel" verbatim into the
  /text-to-speech/{voice_id}/stream URL, which 404s because the API
  only accepts the opaque 20-char voice IDs. The new ``resolve_voice_id``
  helper maps ~45 common display names (rachel, adam, matilda, alloy, …)
  to their voice IDs and returns unknown strings unchanged so custom voices
  keep working. Removes the ad-hoc "alloy" substitution in
  stream_handler.
- #13 DeepgramSTT exposes ``endpointing_ms`` / ``utterance_end_ms`` /
  ``smart_format`` / ``interim_results`` / ``vad_events`` kwargs and the
  ``Patter.deepgram(...)`` factory forwards them via ``STTConfig.options``.
  Defaults tuned for telephony (endpointing_ms=150, utterance_end_ms=1000).
  The transcript gate is loosened to ``is_final OR speech_final`` so we
  don't wait up to utterance_end_ms on every turn. Pipeline turn latency
  on Twilio drops from ~4 s to ~2.2 s.
- #23 OpenAI TTS streaming resample. ``response_format=pcm`` returns
  24 kHz PCM16 chunks that must be downsampled to 16 kHz. The old
  implementation did the 3:2 downsample chunk-by-chunk without
  preserving filter state, so cross-chunk alignment drifted and the
  caller heard pops / dropped audio. Now uses ``audioop.ratecv`` with
  a persistent ``state`` and stashes odd trailing bytes between calls.
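
The name-to-ID fallback described for #11 can be sketched as a dictionary lookup that returns its input when the name is unknown. The IDs below are placeholders, not real ElevenLabs voice IDs, and the case-insensitive match is an assumption.

```python
# Placeholder values only: these are NOT real ElevenLabs voice IDs.
_VOICE_IDS = {
    "rachel": "PLACEHOLDER_ID_RACHEL",
    "adam": "PLACEHOLDER_ID_ADAM",
}


def resolve_voice_id(voice: str) -> str:
    """Map a known display name to its ID; pass anything else through."""
    # Assumption: display names are matched case-insensitively.
    return _VOICE_IDS.get(voice.strip().lower(), voice)


known = resolve_voice_id("Rachel")
custom = resolve_voice_id("myCustomVoice0123456")  # unknown, returned unchanged
```

Returning the input unchanged is what keeps custom voice IDs working without a separate code path.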

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scheduler,fallback): per-loop schedulers + async close + cancel probes

Fixes BUG #2, #3, #5 from acceptance suite.

- #3 Scheduler singleton dies across event loops. The old
  ``_scheduler_singleton`` bound to the first loop it saw; pytest-asyncio
  closed that loop at the end of every test and the next scheduled
  callback crashed with ``Event loop is closed``. Replaced by
  ``_schedulers_by_loop`` — a dict keyed on ``id(asyncio.get_event_loop())``
  that drops stale entries when the owning loop has been closed. Adds
  ``reset_for_tests()`` to tear down every cached scheduler; the public
  ``shutdown()`` is now an alias for it.
- #2 ``FallbackLLMProvider.complete_stream`` — convenience wrapper
  that flattens ``{"type": "text"}`` chunks so callers don't have to
  switch on chunk type. Mirrors the TS SDK's ``completeStream``.
- #5 ``FallbackLLMProvider`` recovery task leak. ``_probe`` tasks
  created by ``_start_recovery`` were never awaited, and pytest-asyncio
  tears the loop down before they finish. Adds ``aclose()`` and async
  context manager support (``__aenter__``/``__aexit__``) so callers can
  ``async with FallbackLLMProvider(...)`` and have the probes cancelled
  + awaited on exit.
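
The cancel-and-await pattern described for #5 might look like the sketch below. ``ProbeOwner`` and ``start_probe`` are hypothetical names standing in for the provider and its recovery probes; only the ``aclose()`` / ``async with`` shape is taken from the description above.

```python
import asyncio


class ProbeOwner:
    """Owns background tasks and guarantees they are finished on exit."""

    def __init__(self):
        self._tasks = set()

    def start_probe(self, delay_s: float) -> asyncio.Task:
        # asyncio.sleep stands in for a real recovery probe.
        task = asyncio.get_running_loop().create_task(asyncio.sleep(delay_s))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def aclose(self) -> None:
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        # Await the cancelled tasks so none outlives the event loop.
        await asyncio.gather(*tasks, return_exceptions=True)

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        await self.aclose()


async def demo() -> bool:
    async with ProbeOwner() as owner:
        task = owner.start_probe(3600)
    return task.cancelled()


probe_was_cancelled = asyncio.run(demo())
```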

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tools): @tool adapter unpacks kwargs into user function

Fixes BUG #21 from acceptance suite.

``@tool`` exposed the raw user function as ``handler`` but
``services/tool_executor._execute_handler`` always calls
``handler(arguments_dict, call_context_dict)``. Every typed tool — e.g.
``async def check_order(order_id: str)`` — crashed at runtime with
"takes 1 positional argument but 2 were given" and OpenAI Realtime
received a fallback error JSON instead of the tool's result.

The decorator now wraps the user function in an async adapter whose
signature matches the executor's contract ``(arguments, call_context)``.
The adapter inspects the original signature: if it already takes
``(arguments, call_context)`` positionally it passes through unchanged,
otherwise it filters ``arguments`` to the user function's declared
parameter names and calls ``fn(**args)``. The original function is
still reachable via ``handler.__wrapped__`` for introspection.
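
A minimal sketch of such an adapter, assuming the pass-through case is detected by parameter names (the actual detection rule is not stated above). ``adapt_tool`` is a hypothetical name; ``functools.wraps`` is what provides ``__wrapped__``.

```python
import asyncio
import functools
import inspect


def adapt_tool(fn):
    """Wrap fn so it can be called as handler(arguments, call_context)."""
    params = inspect.signature(fn).parameters
    # Assumption: pass-through is detected by these exact parameter names.
    passthrough = list(params) == ["arguments", "call_context"]

    @functools.wraps(fn)
    async def handler(arguments: dict, call_context: dict):
        if passthrough:
            result = fn(arguments, call_context)
        else:
            # Keep only the keys the user function actually declares.
            kwargs = {k: v for k, v in arguments.items() if k in params}
            result = fn(**kwargs)
        if inspect.isawaitable(result):
            result = await result  # supports both sync and async tools
        return result

    return handler


async def check_order(order_id: str) -> str:
    return f"order {order_id}: shipped"


handler = adapt_tool(check_order)
result = asyncio.run(handler({"order_id": "A1", "unused": 1}, {"call_id": "c-1"}))
```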

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dashboard): track failed & no-answer outbound calls

Fixes BUG #6 from acceptance suite.

The embedded dashboard used to show only calls that made it to the
media channel. An outbound dial that rang out (``status=no-answer``,
``busy``, ``failed``) never produced a webhook hit, so the row never
appeared in the UI even though Twilio billed for the attempt.

Changes:

- ``MetricsStore.record_call_initiated({call_id, caller, callee, …})``
  pre-registers the call when ``Patter.call()`` returns, so the row
  shows up the moment the dial is dispatched.
- ``MetricsStore.update_call_status(call_id, status, **extra)`` promotes
  the record through the lifecycle (ringing → in-progress → completed /
  no-answer / busy / failed / canceled). Terminal states move the row
  from active to the completed list so the UI timer freezes. Fed by
  the new ``/webhooks/twilio/status`` route.
- ``MetricsStoreProtocol`` extended with the two new methods.
- ``call_end`` now synthesises a minimal metrics shim when the call
  ended without a full CallMetrics payload, so the UI can still render
  duration / status.
- Dashboard UI: new ``STATUS`` column, filter pills (all / completed /
  failed), colour-coded badges (green / yellow / red / orange), red
  row tint for failed statuses, and SSE listeners for the new
  ``call_initiated`` and ``call_status`` events. The duration timer
  respects ``data-ended`` so rows that already received call_end stop
  ticking.
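
The lifecycle described above (pre-register, promote, move terminal states to the completed list) can be sketched with an in-memory store. ``InMemoryCallStore`` is a hypothetical class; the method names follow the commit message, while the initial ``initiated`` status and the handling of unknown call IDs are assumptions.

```python
TERMINAL_STATUSES = {"completed", "no-answer", "busy", "failed", "canceled"}


class InMemoryCallStore:
    """Tracks outbound calls from dial to terminal status."""

    def __init__(self):
        self.active = {}
        self.completed = []

    def record_call_initiated(self, call: dict) -> None:
        # Register the row as soon as the dial is dispatched.
        self.active[call["call_id"]] = {**call, "status": "initiated"}

    def update_call_status(self, call_id: str, status: str, **extra) -> None:
        record = self.active.get(call_id)
        if record is None:
            return  # assumption: unknown call IDs are ignored
        record.update(extra, status=status)
        if status in TERMINAL_STATUSES:
            # Terminal rows leave the active map so their timer can stop.
            self.completed.append(self.active.pop(call_id))


store = InMemoryCallStore()
store.record_call_initiated({"call_id": "CA1", "caller": "+1555", "callee": "+3906"})
store.update_call_status("CA1", "ringing")
still_active = "CA1" in store.active
store.update_call_status("CA1", "no-answer", duration=0)
```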

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): ring_timeout + agent.hooks/vad/audio_filter forwarding + call pre-register

Fixes BUG #14 + IMP2 + completes BUG #6 from acceptance suite.

- #14 ``Patter.agent(...)`` used to drop ``hooks``, ``text_transforms``,
  ``vad``, ``audio_filter``, ``background_audio`` and
  ``barge_in_threshold_ms`` even though the ``Agent`` dataclass accepted
  them. The factory now forwards all fields.
- IMP2 ``ring_timeout: int | None`` kwarg on ``Patter.call(...)``.
  Forwarded to Twilio as ``Timeout=`` and to Telnyx as ``timeout_secs``
  (added to ``TelnyxAdapter.initiate_call``). Italian mobile carriers
  silence-drop the default ~28 s ring on US→IT calls; the quickstart
  now works with ``ring_timeout=60``.
- #6 ``Patter.call()`` pre-registers the dialled call in the
  MetricsStore via ``record_call_initiated(...)`` before returning, so
  the dashboard shows the attempt even when the callee never picks up.
  The Twilio branch also passes ``StatusCallbackEvent="initiated
  ringing answered completed"`` so we receive every state transition.

Also exposes the new Deepgram knobs on the ``Patter.deepgram(...)``
factory (``model``, ``endpointing_ms``, ``utterance_end_ms``,
``smart_format``, ``interim_results``).
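
How ``ring_timeout`` could map onto each provider's request parameters is sketched below. The helper names are hypothetical and the dictionaries show only the fields mentioned above, not a complete request for either API; the space-separated ``StatusCallbackEvent`` string follows the commit message.

```python
from typing import Optional


def twilio_call_params(to: str, from_: str, status_callback_url: str,
                       ring_timeout: Optional[int] = None) -> dict:
    """Subset of the form fields for an outbound Twilio call."""
    params = {
        "To": to,
        "From": from_,
        "StatusCallback": status_callback_url,
        "StatusCallbackEvent": "initiated ringing answered completed",
    }
    if ring_timeout is not None:
        params["Timeout"] = ring_timeout
    return params


def telnyx_call_params(to: str, from_: str,
                       ring_timeout: Optional[int] = None) -> dict:
    """Subset of the JSON body for an outbound Telnyx call."""
    params = {"to": to, "from": from_}
    if ring_timeout is not None:
        params["timeout_secs"] = ring_timeout
    return params


twilio = twilio_call_params("+3906000000", "+15550000000",
                            "https://example.invalid/webhooks/twilio/status",
                            ring_timeout=60)
telnyx = telnyx_call_params("+3906000000", "+15550000000", ring_timeout=60)
default = twilio_call_params("+3906000000", "+15550000000",
                             "https://example.invalid/webhooks/twilio/status")
```

Leaving the key out when ``ring_timeout`` is ``None`` keeps each provider's own default ring duration in effect.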

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): models barge_in_threshold_ms + STT/TTS options, top-level mix_pcm, docstring

Rolls up the smaller API additions — BUG #1, #04g, extras from #13/#15.

- ``Agent.barge_in_threshold_ms`` (default 300) — hangover window before
  treating caller audio as barge-in. Used by PipelineStreamHandler and
  mirrored on TS ``AgentOptions.bargeInThresholdMs``.
- ``STTConfig.options`` / ``TTSConfig.options`` — provider-specific
  knobs bag (e.g. Deepgram endpointing) that ``common._create_stt_from_config``
  unpacks when building the adapter. Keeps older ``STTConfig`` callers
  forward-compatible.
- Top-level ``patter.mix_pcm(agent, bg, ratio)`` — parity alias for the
  TS ``mixPcm(...)`` standalone helper (BUG #04g). Thin wrapper over
  the existing ``PcmMixer`` class with an explicit ratio.
- ``patter/__init__.py`` docstring enumerates the installable extras
  (scheduling, anthropic, groq, cerebras, google, …) so ``pip install
  getpatter`` users discover them without hitting a
  ``RuntimeError: Scheduling requires the 'apscheduler' package`` at
  call time (BUG #1).
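
For orientation, mixing two PCM16 buffers with a ratio can be sketched as below. This is not the ``PcmMixer`` implementation: the meaning of ``ratio`` (a gain applied to the background track) is an assumption, ``mix_pcm16`` is a hypothetical name, and ``array`` uses the host byte order, which matches little-endian PCM on common platforms.

```python
import array


def mix_pcm16(agent: bytes, background: bytes, ratio: float) -> bytes:
    """Add background * ratio onto agent audio, clamping to the int16 range."""
    a = array.array("h")
    a.frombytes(agent)
    b = array.array("h")
    b.frombytes(background)
    out = array.array("h", a)  # copy; samples beyond the background stay as-is
    for i in range(min(len(a), len(b))):
        mixed = a[i] + int(b[i] * ratio)
        out[i] = max(-32768, min(32767, mixed))  # avoid integer wrap-around
    return out.tobytes()


agent_pcm = array.array("h", [1000, 30000]).tobytes()
background_pcm = array.array("h", [2000, 20000]).tobytes()
mixed = array.array("h")
mixed.frombytes(mix_pcm16(agent_pcm, background_pcm, 0.5))
```

The clamp matters for the second sample: 30000 + 10000 exceeds the int16 maximum and would otherwise wrap to a negative value.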

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: align Python tests with BUG #12/#16/#17/#18/#19/#21 fixes

- ``test_local_mode``: pipeline Twilio bridge test now patches
  ``DeepgramSTT`` directly instead of ``DeepgramSTT.for_twilio`` —
  after BUG #12 the pipeline path uses the default linear16 16 kHz
  adapter on both telephony providers.
- ``test_new_features``: ``machine_detection=False`` no longer asserts
  an empty extra_params dict; BUG #6 now always wires a
  ``StatusCallback`` so the dashboard sees failed attempts. The test
  keeps its original intent (AMD-specific params absent) and additionally
  checks the status callback is set.
- ``test_server_unit::TestTelnyxVoiceRoute``: rewritten to assert the
  REST ``actions/answer`` POST after BUG #16 — the route no longer
  returns a JSON commands body.
- ``test_telnyx_bridge_unit``: helper messages updated to the
  ``{event: start|media|stop}`` wire shape from BUG #17; the OpenAI
  Realtime audio_format assertion now expects ``g711_ulaw`` (from #18).
- ``test_telnyx_handler_unit``: TelnyxAudioSender test uses
  ``input_is_mulaw_8k=True`` so the round-trip byte assertion still
  holds with the new PCM16→mulaw transcode path (#18). Wire format
  asserts ``event == "media"`` / ``event == "clear"``.
- ``test_tool_decorator``: invokes handlers with the new adapter
  signature ``(arguments_dict, call_context_dict)`` (#21), including a
  sync-wrapped handler awaited through the adapter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ts/api): Python parity — auto-detect local, static factories, ring_timeout

Brings TS to parity with Python on the BUG #4 items, the #14 agent fields
and IMP2 ring_timeout.

- Auto-detect local mode: ``new Patter({twilioSid, twilioToken, …})``
  without explicit ``mode: 'local'`` is now treated as local when
  apiKey is missing (mirrors Python).
- Static provider factories: ``Patter.deepgram(...)``,
  ``Patter.elevenlabs(...)``, ``Patter.whisper(...)``,
  ``Patter.openaiTts(...)``, ``Patter.cartesia(...)``, ``Patter.rime(...)``,
  ``Patter.lmnt(...)``.
- ``STTConfig.toDict`` / ``TTSConfig.toDict`` are now optional — plain
  object literals ``{provider, apiKey, language}`` are accepted
  everywhere (fallback serialisation is handled via
  ``sttConfigToDict`` / ``ttsConfigToDict`` helpers).
- ``STTConfig`` gets an ``options`` bag (parity with Python BUG #13).
- ``LocalCallOptions.ringTimeout`` forwarded to Twilio as ``Timeout``
  and Telnyx as ``timeout_secs`` — plus ``StatusCallbackEvent`` wired
  so the dashboard sees ringing/no-answer/busy/failed transitions
  (BUG #6).
- ``AgentOptions.bargeInThresholdMs`` (parity with #20 on Python).
- ``LocalOptions.deepgramKey`` / ``elevenlabsKey`` added as
  provider-level defaults (parity with Python Patter() kwargs).
- ``Patter.call()`` Twilio branch pre-registers the dialled call with
  ``metricsStore.recordCallInitiated`` so no-answer / busy / failed
  attempts still show up in the dashboard.
- ``providers.deepgram(...)`` factory exposes the Deepgram knobs
  (model / endpointing_ms / utterance_end_ms / smart_format /
  interim_results) and carries them in ``STTConfig.options``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts/providers): voice resolver, Deepgram knobs, TTS streaming resample

TS parity port of Python BUG #11, #13, #23.

- ElevenLabs: ``resolveVoiceId()`` maps display names (rachel, adam,
  matilda, alloy, …) to the opaque 20-char voice IDs accepted by the
  /text-to-speech/{voice_id}/stream endpoint. Map mirrors the Python
  SDK byte-for-byte.
- DeepgramSTT: constructor overloaded to accept ``DeepgramSTTOptions``
  (endpointingMs / utteranceEndMs / smartFormat / interimResults /
  vadEvents) alongside the legacy positional form. Transcript gate
  loosened to ``is_final OR speech_final`` so short utterances don't
  wait for Deepgram's utterance_end commit.
- OpenAITTS: streaming 24 kHz → 16 kHz resample now carries state
  (``carryByte`` + ``leftover`` samples) between chunks so cross-chunk
  alignment doesn't drift. The legacy ``resample24kTo16k`` static is
  kept as a thin wrapper around the streaming path for the existing
  unit tests.
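
The byte-carrying half of that fix can be illustrated in Python. The sketch covers only sample alignment (holding back a dangling byte until its partner arrives), not the resampling itself; ``SampleAligner`` is a hypothetical name.

```python
class SampleAligner:
    """Emit only whole 16-bit samples, carrying a dangling byte forward."""

    def __init__(self):
        self._carry = b""

    def feed(self, chunk: bytes) -> bytes:
        data = self._carry + chunk
        cut = len(data) - (len(data) % 2)  # largest even length
        self._carry = data[cut:]           # zero or one leftover byte
        return data[:cut]


aligner = SampleAligner()
first = aligner.feed(b"\x01\x02\x03")   # third byte is held back
second = aligner.feed(b"\x04")          # completes the held-back sample
single = SampleAligner().feed(b"\x09")  # a lone byte yields nothing yet
```

The last case matches the behaviour where a 1-byte input produces an empty buffer instead of being returned verbatim.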

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts): Telnyx stack, pipeline hooks/barge-in/dedup, dashboard status, scheduler sync

TS parity port of the Python fixes for BUG #2/#3/#6/#12/#15/#16/#17/#18/#19/#20/#22.

- ``stream-handler.ts``: ``handleAudio`` now runs the
  ``before_send_to_stt`` hook (#15), transcodes Twilio mulaw 8 kHz →
  PCM16 16 kHz unconditionally on the pipeline path (#12), and keeps
  forwarding caller audio during TTS so barge-in can trigger (#20).
  ``processTranscript`` implements the dedup + 500 ms throttle +
  hallucination-word blacklist from #22 and flips ``isSpeaking`` +
  ``sendClear`` on any transcript with text while the agent is
  speaking (#20).
- ``server.ts``: ``TelnyxBridge.sendAudio`` / ``sendClear`` use the
  correct ``{event:"media",media:{payload:b64}}`` wire format (#18);
  the Telnyx WS handler matches ``data.event`` (start / media / stop /
  dtmf / error / connected) and filters ``media.track !== "inbound"``
  before forwarding (#17, #19); the ``/webhooks/telnyx/voice`` route
  POSTs ``actions/answer`` and ``actions/streaming_start`` via the
  Call Control REST API and returns empty HTTP 200 (#16).
  ``TwilioBridge.createStt`` picks linear16 16 kHz when
  ``provider === 'pipeline'`` so Deepgram doesn't decode already-PCM
  bytes as mulaw (#12). A new ``/webhooks/twilio/status`` handler
  consumes Twilio status callbacks and updates the dashboard (#6).
- ``scheduler.ts``: ``scheduleCron`` returns a ``ScheduleHandle``
  synchronously (lazy node-cron import happens in the background) —
  parity with Python #4. ``scheduleInterval`` accepts
  ``{intervalMs}`` or ``{seconds}`` in addition to the legacy
  positional ms, matching Python ``schedule_interval(seconds=...)``.
- ``fallback-provider.ts``: ``completeStream()`` text-only convenience
  generator (#2), plus ``aclose()`` + ``Symbol.asyncDispose`` so that
  ``await using fallback = ...`` mirrors Python's
  ``async with FallbackLLMProvider(...)`` (#5).
- ``dashboard/store.ts``: ``recordCallInitiated`` pre-registers
  outbound attempts, ``updateCallStatus`` promotes rows through
  ringing / no-answer / busy / failed and moves terminal states to
  the completed list (#6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ts): align with 0.4.4 wire-format & provider API changes

- ``providers.test.ts``: toDict now surfaces ``options`` when set,
  knobs forwarding verified.
- ``types.test.ts``: toDict optional chain covered.
- ``openai-tts.test.ts``: 1-byte input no longer returns the byte
  verbatim — the streaming resampler stashes it as ``carryByte`` and
  the stateless wrapper flushes only complete samples, so the test now
  asserts an empty buffer.
- ``integration/twilio-pipeline.test.ts`` + ``integration/telnyx-pipeline.test.ts``:
  ``handleAudio`` is now async; tests await it. Telnyx fixture feeds
  mulaw 8 kHz and asserts the transcoded PCM16 16 kHz lands on the STT
  mock (BUG #12 + #19).
- ``unit/server-routes.test.ts``: Telnyx webhook tests assert the
  REST ``actions/answer`` + ``actions/streaming_start`` POSTs and the
  empty HTTP 200 response (BUG #16).
- ``package-lock.json``: refreshed for the sdk-ts worktree so the
  ``0.4.3`` → ``0.4.3-worktree`` alignment is consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(unit): regression coverage for BUG #6/#22/#23 + ring_timeout (IMP2)

Three new unit test files lock in fixes that previously lived in the
acceptance suite as live-call checks:

  test_pipeline_dedup.py (13 tests)
    - Hallucination blacklist: "you", "thank you", ".", case/punctuation
      variants, empty-after-strip all drop silently.
    - 2-second duplicate window with time.time monkeypatched so parity
      with the live Whisper feedback loop is deterministic.
    - 500 ms back-to-back throttle covering legitimate vs spurious
      second turns.
    - Interim / empty finals must not fire on_transcript.

  test_openai_tts_resample.py (7 tests)
    - Cross-chunk ratecv state: multi-chunk stream output matches a
      single-shot resample byte-for-byte.
    - Odd-byte boundary: a chunk ending on a dangling byte must not
      drop the sample.
    - Empty / single-byte / tiny chunks must not crash.
    - Response is always aclosed on both successful and early-exit paths.

  test_twilio_status_and_ring_timeout.py (13 tests)
    - /webhooks/twilio/status routes to update_call_status with parsed
      duration, and survives missing SID, bad duration, and the
      dashboard-disabled path.
    - Twilio signature enforcement on the status endpoint.
    - Twilio ring_timeout -> Timeout REST param, Telnyx -> timeout_secs.
    - Twilio StatusCallback / StatusCallbackEvent are always registered
      on outbound calls so BUG #6 cannot regress.

Full unit suite: 728 passed, 2 skipped.

* docs+ci: latency/provider caveats + audit workflow

README
  - Pipeline turn-latency floor documented (~2.0–2.8 s) with per-stage
    breakdown so users know to switch to `provider="openai_realtime"`
    for sub-second UX.
  - ElevenLabs free-tier library-voice restriction (402) with pointer
    to `ELEVENLABS_VOICE_ID`.
  - Telnyx outbound D38 Outbound Profile requirement.
  - Google Gemini free-tier quota=0 caveat.
  - Whisper hallucination filter documented.
  - `ring_timeout` + status callback description added to call().

.github/workflows/audit.yml (new)
  - pip-audit on sdk-py runtime deps.
  - npm audit on sdk-ts production deps.
  - bandit static analysis with SARIF upload to GitHub Security.
  - Runs on dep-manifest changes, weekly schedule, and manual dispatch.
  - Findings are advisory-only to keep the pipeline from flaking on
    upstream CVE churn (telephony stack pulls many C-wrapped libs).

Baseline audit run: npm=0, bandit medium+/high-confidence=0,
pip-audit=2 (pytest dev-only + transformers optional-extra only).

* docs(readme): remove local-measured latency numbers from Voice Modes

The millisecond ranges previously listed for each provider came from a
single local benchmark run and are neither representative nor a target.
Keep the modes table qualitative and replace the per-stage breakdown
with a short note that latency is inherited from the chosen providers,
so callers have no hard numbers to anchor on.

* test(unit): bug coverage gaps — BUG #15/#19/#20

Three new unit test modules fill the remaining coverage gap for the
bugs fixed on this branch:

  test_pipeline_bargein.py (7 tests) — BUG #20
    - Interim transcript during TTS triggers send_clear + is_speaking=False.
    - record_turn_interrupted is fired on the metrics accumulator.
    - send_clear throwing does not crash the STT loop (fail-open).
    - No barge-in when the agent is idle or the transcript has no text.
    - Final transcripts also trigger the barge-in branch before the
      downstream LLM turn runs.

  test_before_send_to_stt_hook.py (9 tests) — BUG #15
    - Sync / async hook returning None drops the chunk (zero STT sends).
    - Returning modified bytes forwards the new buffer verbatim.
    - Hook receives the decoded PCM, not the raw mulaw payload.
    - Raising hooks fail-open: original audio still reaches STT.
    - Missing hook / hooks instance with before_send_to_stt=None are
      both bypass paths that must still forward audio.

  test_telnyx_track_filter.py (5 tests) — BUG #19
    - track=inbound forwards, track=outbound drops.
    - Missing `track` field defaults to inbound (legacy Telnyx payloads).
    - Mixed stream: only inbound frames reach the handler, in order.
    - Unknown track values are skipped defensively.
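
The hook semantics listed for test_before_send_to_stt_hook.py (None drops the chunk, modified bytes are forwarded, a raising hook fails open, a missing hook is a bypass) can be sketched as one helper. ``run_stt_hook`` and the sample hooks are hypothetical names, not the SDK's internals.

```python
import asyncio
import inspect
from typing import Optional


async def run_stt_hook(hook, pcm: bytes) -> Optional[bytes]:
    """Return the bytes to send to STT, or None to drop the chunk."""
    if hook is None:
        return pcm  # no hook configured: forward unchanged
    try:
        result = hook(pcm)
        if inspect.isawaitable(result):
            result = await result  # async hooks are awaited
    except Exception:
        return pcm  # fail-open: a broken hook must not mute the caller
    return result


def drop(pcm: bytes) -> None:
    return None


async def shout(pcm: bytes) -> bytes:
    return pcm.upper()


def boom(pcm: bytes) -> bytes:
    raise RuntimeError("hook failed")


outcomes = [asyncio.run(run_stt_hook(hook, b"ab"))
            for hook in (None, drop, shout, boom)]
```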

Full unit suite: 749 passed, 2 skipped (+21 from this commit).

* feat(sdk-py): add cartesia/rime/lmnt static factories + vad_events to deepgram

Brings Python SDK to parity with sdk-ts:
- Adds Patter.cartesia / Patter.rime / Patter.lmnt static methods so local-mode
  users can configure these TTS providers the same way they do in TypeScript.
- Adds the missing vad_events keyword to Patter.deepgram and the
  patter.providers.deepgram factory — the DeepgramSTT ctor already accepted
  it, but the public config helper silently dropped the flag.

* chore: bump to 0.4.4

Regression suites re-run after the bump:
  - sdk-py: 749 passed, 2 skipped
  - sdk-ts: 932 passed (57 test files, including soak)

* fix(ci): integration tests on 0.4.4 wire format + misc hygiene

Addresses the five failing CI checks on PR #66.

Telnyx integration tests (test_telnyx_{convai,pipeline,realtime}.py)
  - ``_telnyx_stream_started`` / ``_telnyx_media_event`` /
    ``_telnyx_stream_stopped`` helpers migrated from the pre-0.4.4
    ``{event_type, payload.audio.chunk}`` shape to the real Telnyx
    media-stream wire format ``{event, start|media.payload}`` (BUG
    #17/#18). Without this the bridge silently drops every test frame
    and 11 integration tests fail with "handler called 0 times".
  - ``test_audio_format_pcm16`` renamed to ``test_audio_format_g711_ulaw``
    and the assertion flipped — Telnyx is PCMU 8 kHz bidirectional
    (BUG #19), Realtime runs on ``g711_ulaw`` so both legs stay
    pass-through.

sdk-ts/src/scheduler.ts
  - Removed the trailing blank line that broke the pre-commit
    ``end-of-file-fixer`` hook.

.github/workflows/audit.yml
  - Bandit stock CLI doesn't support ``-f sarif`` — install
    ``bandit-sarif-formatter`` alongside bandit, and guard the
    upload-sarif step with ``hashFiles`` so future formatter breakage
    doesn't fail the job.

Local verification: 802 passed, 4 skipped (sdk-py unit + integration).

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nicolotognoni added a commit that referenced this pull request Apr 21, 2026
* fix(deps): pin websockets>=14 and add python-multipart

Fixes BUG #7 and #9 from acceptance suite.

- websockets: pin >=14,<16. The 'additional_headers=' kwarg used by the
  OpenAI Realtime, Deepgram STT and ElevenLabs ConvAI adapters is only
  supported on the new asyncio client that became the default in 14.0.
  Under 13.x the call failed with 'got an unexpected keyword argument
  additional_headers', blocking every streaming provider.
- python-multipart: add to the base install. Starlette >= 0.45 raises on
  'await request.form()' without python-multipart installed, so every
  Twilio webhook returned 422 and the call was silently dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server): repair Twilio & Telnyx webhook stack

Fixes BUG #6, #8, #16 from acceptance suite.

- #8 Request/Response import lifted to the top of server.py. With
  ``from __future__ import annotations`` in place, FastAPI's
  ``get_type_hints(handler)`` resolved the 'Request' annotation against
  module globals where only WebSocket was imported. The ForwardRef stayed
  unresolved, FastAPI classified the parameter as a query-string field
  and every Twilio/Telnyx webhook POST returned HTTP 422 before the
  handler body could run. Local mode was fundamentally broken on 0.4.3.
- #6 dashboard tracking of failed outbound calls: new route
  ``POST /webhooks/twilio/status`` consumes Twilio statusCallback events
  (initiated/ringing/answered/completed/no-answer/busy/failed) and feeds
  them into MetricsStore.update_call_status. Operators now see every
  dialled attempt in the dashboard, including ones that never reach
  media.
- #16 Telnyx Call Control: ``/webhooks/telnyx/voice`` now POSTs
  ``actions/answer`` on call.initiated and ``actions/streaming_start``
  on call.answered against the REST API and returns empty HTTP 200.
  Previously the route returned a JSON ``{commands: [...]}`` body that
  Telnyx silently discards — the call rang forever.

Twilio voice route also falls back to the ``Caller`` / ``Called`` form
fields when ``From`` / ``To`` are empty (see BUG #6 notes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(telnyx): WS event shape, frame format, track filter, audio sender

Fixes BUG #17, #18, #19 from acceptance suite.

- #17 Media-stream WebSocket events use ``event`` (start / media / stop /
  dtmf / error / connected), not the Call Control REST notification
  ``event_type``. Audio payload lives in ``data.media.payload`` (base64),
  caller/callee live in ``data.start.{from,to}``. Previously the bridge
  matched ``event_type == "stream_started"`` and looked for audio in
  ``payload.audio.chunk`` — no media chunk was ever decoded, so the
  agent never heard the caller.
- #18 Outbound wire format corrected to
  ``{"event":"media","media":{"payload":b64}}`` and
  ``{"event":"clear"}``. The legacy ``event_type``/``payload.audio.chunk``
  shape was silently dropped by Telnyx, so the caller heard silence.
- #19 When ``stream_track=both_tracks`` Telnyx emits media for both the
  caller leg and the agent's own outbound leg; forwarding the outbound
  echo broke OpenAI Realtime turn detection ("speech_started" never
  fired). The bridge now filters ``media.track != "inbound"`` before
  forwarding.
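
The frame handling described in #17, #18 and #19 can be sketched roughly as follows. This is a minimal illustration, not the SDK's bridge code: the function names are hypothetical, and the only facts taken from the notes above are the ``event`` / ``media.payload`` / ``media.track`` fields and the outbound ``media`` / ``clear`` shapes. Treating a missing ``track`` as inbound is an assumption borrowed from the track-filter tests.

```python
import base64
import json
from typing import Optional


def extract_inbound_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded caller audio from a media frame, or None to skip it."""
    data = json.loads(raw_message)
    if data.get("event") != "media":
        # start / stop / dtmf / error / connected frames carry no audio here.
        return None
    media = data.get("media", {})
    # Assumption: a frame without a track field is treated as inbound.
    if media.get("track", "inbound") != "inbound":
        return None  # drop the agent's own outbound echo and unknown tracks
    return base64.b64decode(media["payload"])


def build_media_frame(mulaw_bytes: bytes) -> str:
    """Outbound audio frame in the {"event":"media",...} shape."""
    payload = base64.b64encode(mulaw_bytes).decode("ascii")
    return json.dumps({"event": "media", "media": {"payload": payload}})


def build_clear_frame() -> str:
    """Outbound frame asking the far end to flush queued audio."""
    return json.dumps({"event": "clear"})
```

With this shape, a frame tagged ``"track": "outbound"`` returns ``None`` and never reaches the Realtime handler, which is the behaviour the #19 fix relies on.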

OpenAI Realtime handler on Telnyx is now configured with
``audio_format="g711_ulaw"`` to match the PCMU 8 kHz bidirectional
stream. The TelnyxAudioSender transcodes PCM16 16 kHz → mulaw 8 kHz for
pipeline / ConvAI providers (PCM16 TTS output) and passes mulaw bytes
through when OpenAI Realtime provides them directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(twilio): OpenAI Realtime audio format + pass-through audio sender

Fixes BUG #10 from acceptance suite.

OpenAI Realtime emits PCM16 at 24 kHz natively. The Twilio handler
previously left ``audio_format`` at the pcm16 default and fed the bytes
into TwilioAudioSender, which unconditionally ran
``resample_16k_to_8k(pcm) → pcm16_to_mulaw`` assuming 16 kHz input.
24 kHz audio pushed through a 16→8 kHz resampler is only halved to
12 kHz, so Twilio plays it back at about two-thirds of the correct
speed and the caller heard a deep, slurred voice.

Fix: on the Twilio path construct
``OpenAIRealtimeStreamHandler(..., audio_format="g711_ulaw")`` so
OpenAI emits Twilio-native mulaw 8 kHz directly. Pair it with
``TwilioAudioSender(..., input_is_mulaw_8k=True)`` which skips the
resample+mulaw encode and forwards the bytes as-is. Pipeline and ConvAI
still produce PCM16 @ 16 kHz and go through the default transcoding
path.
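
The rate mismatch behind BUG #10 is simple arithmetic, worked through below as a small sketch. The helper names are illustrative only and are not part of the SDK.

```python
def effective_rate_hz(actual_input_hz: int, assumed_input_hz: int, out_hz: int) -> float:
    """Real sample rate of audio leaving a resampler fed the wrong input rate.

    A resampler built for assumed_input_hz -> out_hz keeps
    out_hz / assumed_input_hz of the samples it is given.
    """
    return actual_input_hz * out_hz / assumed_input_hz


def playback_speed_ratio(actual_input_hz: int, assumed_input_hz: int, out_hz: int) -> float:
    """Speed the listener hears when that audio is played back at out_hz."""
    return out_hz / effective_rate_hz(actual_input_hz, assumed_input_hz, out_hz)


# OpenAI Realtime PCM16 at 24 kHz fed into a 16 kHz -> 8 kHz resampler.
rate = effective_rate_hz(24_000, 16_000, 8_000)
speed = playback_speed_ratio(24_000, 16_000, 8_000)
print(f"effective rate: {rate:.0f} Hz, playback speed: {speed:.3f}x")
```

When the input really is 16 kHz the ratio is 1.0, which is why the pipeline and ConvAI paths were unaffected.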

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(pipeline): STT path + hooks + barge-in + dedup + hallucination filter

Fixes BUG #12, #15, #20, #22 from acceptance suite.

- #12 Pipeline on Twilio: the bridge converts mulaw 8 kHz → PCM16 16 kHz
  before STT. The STT adapter used to be built with ``for_twilio=True``
  (mulaw 8 kHz) — Deepgram decoded the already-PCM bytes as mulaw and
  produced garbage transcripts. The pipeline now always configures
  linear16 @ 16 kHz.
- #15 ``PipelineHooks.before_send_to_stt`` was declared but never
  invoked. ``PipelineStreamHandler.on_audio_received`` now runs the
  hook on every inbound chunk and drops the chunk when it returns
  ``None``.
- #20 Pipeline barge-in: ``on_audio_received`` used to skip STT when
  ``_is_speaking=True``, blocking any barge-in detection. It now keeps
  forwarding caller audio to STT during TTS (unless
  ``agent.barge_in_threshold_ms == 0``), and ``_stt_loop`` flips
  ``_is_speaking=False`` + ``send_clear`` on any Deepgram transcript
  with text observed while speaking. Effective latency floor is
  ~800 ms (Deepgram interim), so noisy / short TTS sentences may not
  actually be interrupted — full sub-second barge-in requires a
  server-side VAD (Silero, already supported via ``agent.vad=``).
- #22 Dedup + throttle + hallucination filter. Low-quality STT (Whisper
  on mulaw 8 kHz) emits several nearly-identical final transcripts in
  1–2 s ("you", "you", "you") and hallucinates short fillers from
  silence / TTS echo. Each used to kick off a new LLM+TTS turn, and
  consecutive turns overlapped on the caller's line. Fix in
  ``_stt_loop``: dedup identical finals within 2 s, drop any final
  within 500 ms of the last committed turn, drop a curated blacklist
  of fillers (``you``, ``thank you``, ``yeah``, ``uh``, ``.``…).
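
A rough sketch of the #22 gate, assuming the three rules are applied to final transcripts in the order filler check, throttle, dedup. The class name, the normalisation step and the exact filler set are assumptions for illustration; only the 2 s and 500 ms windows and the example fillers come from the note above.

```python
from typing import Optional

# Illustrative subset of the blacklist; the real list is longer.
FILLERS = {"you", "thank you", "yeah", "uh"}
DEDUP_WINDOW_S = 2.0
THROTTLE_S = 0.5


def _normalise(text: str) -> str:
    """Lower-case and strip surrounding whitespace and punctuation."""
    return text.strip().lower().strip(".,!?… ")


class FinalTranscriptGate:
    """Decides whether a final transcript should start a new turn."""

    def __init__(self) -> None:
        self._last_text: Optional[str] = None
        self._last_commit_at: Optional[float] = None

    def accept(self, text: str, now: float) -> bool:
        norm = _normalise(text)
        if not norm or norm in FILLERS:
            return False  # empty after stripping, or a known hallucination
        if self._last_commit_at is not None:
            elapsed = now - self._last_commit_at
            if elapsed < THROTTLE_S:
                return False  # too close to the last committed turn
            if norm == self._last_text and elapsed < DEDUP_WINDOW_S:
                return False  # same final repeated inside the dedup window
        self._last_text = norm
        self._last_commit_at = now
        return True
```

Passing ``now`` in explicitly (instead of calling ``time.time()`` inside) keeps the sketch deterministic to test.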

Also adds the 8 kHz output path used by the Telnyx handler via a
shared linear16 STT factory in ``handlers/common.py``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(providers): voice name resolver, Deepgram knobs, TTS streaming resample

Fixes BUG #11, #13, #23 from acceptance suite.

- #11 ElevenLabs voice-name resolver. ``Patter.elevenlabs(voice="rachel")``
  (the quickstart default) used to pass "rachel" verbatim into the
  /text-to-speech/{voice_id}/stream URL, which 404s because the API
  only accepts the opaque 20-char voice IDs. The new ``resolve_voice_id``
  helper maps ~45 common display names (rachel, adam, matilda, alloy, …)
  to their voice IDs and returns unknown strings unchanged so custom voices
  keep working. Removes the ad-hoc "alloy" substitution in
  stream_handler.
- #13 DeepgramSTT exposes ``endpointing_ms`` / ``utterance_end_ms`` /
  ``smart_format`` / ``interim_results`` / ``vad_events`` kwargs and the
  ``Patter.deepgram(...)`` factory forwards them via ``STTConfig.options``.
  Defaults tuned for telephony (endpointing_ms=150, utterance_end_ms=1000).
  The transcript gate is loosened to ``is_final OR speech_final`` so we
  don't wait up to utterance_end_ms on every turn. Pipeline turn latency
  on Twilio drops from ~4 s to ~2.2 s.
- #23 OpenAI TTS streaming resample. ``response_format=pcm`` returns
  24 kHz PCM16 chunks that must be downsampled to 16 kHz. The old
  implementation did the 3:2 downsample chunk-by-chunk without
  preserving filter state, so cross-chunk alignment drifted and the
  caller heard pops / dropped audio. Now uses ``audioop.ratecv`` with
  a persistent ``state`` and stashes odd trailing bytes between calls.
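
To illustrate why carrying state between chunks matters for #23, here is a deliberately simplified pure-Python sketch. It is not the ``audioop.ratecv`` implementation described above: it swaps in a naive 3:2 decimation (keep the first sample of each group of three, average the other two) with no anti-alias filtering, and the class name is hypothetical. The point is only the bookkeeping: an odd trailing byte and any incomplete sample group are held back until the next chunk arrives.

```python
import struct


class StreamingDownsampler24kTo16k:
    """Naive stateful 24 kHz -> 16 kHz PCM16 (little-endian, mono) downsampler."""

    def __init__(self) -> None:
        # Bytes not yet consumed: an odd byte and/or up to two whole samples.
        self._pending = b""

    def feed(self, chunk: bytes) -> bytes:
        data = self._pending + chunk
        # Only consume whole groups of 3 samples (6 bytes); keep the rest.
        usable = len(data) - (len(data) % 6)
        self._pending = data[usable:]
        out = bytearray()
        for a, b, c in struct.iter_unpack("<3h", data[:usable]):
            out += struct.pack("<2h", a, (b + c) // 2)
        return bytes(out)
```

Because nothing is discarded at chunk boundaries, feeding a stream in arbitrary chunk sizes yields the same bytes as resampling it in one call, which is the property the resample regression tests check for.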

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scheduler,fallback): per-loop schedulers + async close + cancel probes

Fixes BUG #2, #3, #5 from acceptance suite.

- #3 Scheduler singleton dies across event loops. The old
  ``_scheduler_singleton`` bound to the first loop it saw; pytest-asyncio
  closed that loop at the end of every test and the next scheduled
  callback crashed with ``Event loop is closed``. Replaced by
  ``_schedulers_by_loop`` — a dict keyed on ``id(asyncio.get_event_loop())``
  that drops stale entries when the owning loop has been closed. Adds
  ``reset_for_tests()`` to tear down every cached scheduler; the public
  ``shutdown()`` is now an alias for it.
- #2 ``FallbackLLMProvider.complete_stream`` — convenience wrapper
  that flattens ``{"type": "text"}`` chunks so callers don't have to
  switch on chunk type. Mirrors the TS SDK's ``completeStream``.
- #5 ``FallbackLLMProvider`` recovery task leak. ``_probe`` tasks
  created by ``_start_recovery`` were never awaited, and pytest-asyncio
  tears the loop down before they finish. Adds ``aclose()`` and async
  context manager support (``__aenter__``/``__aexit__``) so callers can
  ``async with FallbackLLMProvider(...)`` and have the probes cancelled
  + awaited on exit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tools): @tool adapter unpacks kwargs into user function

Fixes BUG #21 from acceptance suite.

``@tool`` exposed the raw user function as ``handler`` but
``services/tool_executor._execute_handler`` always calls
``handler(arguments_dict, call_context_dict)``. Every typed tool — e.g.
``async def check_order(order_id: str)`` — crashed at runtime with
"takes 1 positional argument but 2 were given" and OpenAI Realtime
received a fallback error JSON instead of the tool's result.

The decorator now wraps the user function in an async adapter whose
signature matches the executor's contract ``(arguments, call_context)``.
The adapter inspects the original signature: if it already takes
``(arguments, call_context)`` positionally it passes through unchanged,
otherwise it filters ``arguments`` to the user function's declared
parameter names and calls ``fn(**args)``. The original function is
still reachable via ``handler.__wrapped__`` for introspection.
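
A minimal sketch of such an adapter, under stated assumptions: the pass-through case is detected here by parameter names (``arguments``, ``call_context``), which may differ from the SDK's actual check, and the real decorator also attaches tool metadata that is omitted.

```python
import asyncio
import functools
import inspect


def tool(fn):
    """Wrap fn so it can be called as handler(arguments, call_context)."""
    params = inspect.signature(fn).parameters
    # Assumption: legacy handlers are recognised by their parameter names.
    passthrough = list(params) == ["arguments", "call_context"]

    @functools.wraps(fn)  # also exposes the original as handler.__wrapped__
    async def handler(arguments, call_context):
        if passthrough:
            result = fn(arguments, call_context)
        else:
            # Keep only the keys the user function actually declares.
            kwargs = {k: v for k, v in arguments.items() if k in params}
            result = fn(**kwargs)
        if inspect.isawaitable(result):
            result = await result  # supports both sync and async tools
        return result

    return handler


@tool
async def check_order(order_id: str) -> str:
    return f"order {order_id}"


@tool
def legacy(arguments, call_context):
    return arguments["x"] + 1
```

Calling ``await check_order({"order_id": "42", "extra": 1}, {})`` then passes only ``order_id`` to the user function, and the unexpected ``extra`` key is ignored rather than raising.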

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dashboard): track failed & no-answer outbound calls

Fixes BUG #6 from acceptance suite.

The embedded dashboard used to show only calls that made it to the
media channel. An outbound dial that rang out (``status=no-answer``,
``busy``, ``failed``) never produced a webhook hit, so the row never
appeared in the UI even though Twilio billed for the attempt.

Changes:

- ``MetricsStore.record_call_initiated({call_id, caller, callee, …})``
  pre-registers the call when ``Patter.call()`` returns, so the row
  shows up the moment the dial is dispatched.
- ``MetricsStore.update_call_status(call_id, status, **extra)`` promotes
  the record through the lifecycle (ringing → in-progress → completed /
  no-answer / busy / failed / canceled). Terminal states move the row
  from active to the completed list so the UI timer freezes. Fed by
  the new ``/webhooks/twilio/status`` route.
- ``MetricsStoreProtocol`` extended with the two new methods.
- ``call_end`` now synthesises a minimal metrics shim when the call
  ended without a full CallMetrics payload, so the UI can still render
  duration / status.
- Dashboard UI: new ``STATUS`` column, filter pills (all / completed /
  failed), colour-coded badges (green / yellow / red / orange), red
  row tint for failed statuses, and SSE listeners for the new
  ``call_initiated`` and ``call_status`` events. The duration timer
  respects ``data-ended`` so rows that already received call_end stop
  ticking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): ring_timeout + agent.hooks/vad/audio_filter forwarding + call pre-register

Fixes BUG #14 + IMP2 + completes BUG #6 from acceptance suite.

- #14 ``Patter.agent(...)`` used to drop ``hooks``, ``text_transforms``,
  ``vad``, ``audio_filter``, ``background_audio`` and
  ``barge_in_threshold_ms`` even though the ``Agent`` dataclass accepted
  them. The factory now forwards all fields.
- IMP2 ``ring_timeout: int | None`` kwarg on ``Patter.call(...)``.
  Forwarded to Twilio as ``Timeout=`` and to Telnyx as ``timeout_secs``
  (added to ``TelnyxAdapter.initiate_call``). Italian mobile carriers
  silence-drop the default ~28 s ring on US→IT calls; the quickstart
  now works with ``ring_timeout=60``.
- #6 ``Patter.call()`` pre-registers the dialled call in the
  MetricsStore via ``record_call_initiated(...)`` before returning, so
  the dashboard shows the attempt even when the callee never picks up.
  The Twilio branch also passes ``StatusCallbackEvent="initiated
  ringing answered completed"`` so we receive every state transition.

Also exposes the new Deepgram knobs on the ``Patter.deepgram(...)``
factory (``model``, ``endpointing_ms``, ``utterance_end_ms``,
``smart_format``, ``interim_results``).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): models barge_in_threshold_ms + STT/TTS options, top-level mix_pcm, docstring

Rolls up the smaller API additions — BUG #1, #04g, extras from #13/#15.

- ``Agent.barge_in_threshold_ms`` (default 300) — hangover window before
  treating caller audio as barge-in. Used by PipelineStreamHandler and
  mirrored on TS ``AgentOptions.bargeInThresholdMs``.
- ``STTConfig.options`` / ``TTSConfig.options`` — provider-specific
  knobs bag (e.g. Deepgram endpointing) that ``common._create_stt_from_config``
  unpacks when building the adapter. Keeps older ``STTConfig`` callers
  forward-compatible.
- Top-level ``patter.mix_pcm(agent, bg, ratio)`` — parity alias for the
  TS ``mixPcm(...)`` standalone helper (BUG #04g). Thin wrapper over
  the existing ``PcmMixer`` class with an explicit ratio.
- ``patter/__init__.py`` docstring enumerates the installable extras
  (scheduling, anthropic, groq, cerebras, google, …) so ``pip install
  getpatter`` users discover them without hitting a
  ``RuntimeError: Scheduling requires the 'apscheduler' package`` at
  call time (BUG #1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: align Python tests with BUG #12/#16/#17/#18/#19/#21 fixes

- ``test_local_mode``: pipeline Twilio bridge test now patches
  ``DeepgramSTT`` directly instead of ``DeepgramSTT.for_twilio`` —
  after BUG #12 the pipeline path uses the default linear16 16 kHz
  adapter on both telephony providers.
- ``test_new_features``: ``machine_detection=False`` no longer asserts
  an empty extra_params dict; BUG #6 now always wires a
  ``StatusCallback`` so the dashboard sees failed attempts. The test
  keeps its original intent (AMD-specific params absent) and additionally
  checks the status callback is set.
- ``test_server_unit::TestTelnyxVoiceRoute``: rewritten to assert the
  REST ``actions/answer`` POST after BUG #16 — the route no longer
  returns a JSON commands body.
- ``test_telnyx_bridge_unit``: helper messages updated to the
  ``{event: start|media|stop}`` wire shape from BUG #17; the OpenAI
  Realtime audio_format assertion now expects ``g711_ulaw`` (from #18).
- ``test_telnyx_handler_unit``: TelnyxAudioSender test uses
  ``input_is_mulaw_8k=True`` so the round-trip byte assertion still
  holds with the new PCM16→mulaw transcode path (#18). Wire format
  asserts ``event == "media"`` / ``event == "clear"``.
- ``test_tool_decorator``: invokes handlers with the new adapter
  signature ``(arguments_dict, call_context_dict)`` (#21), including a
  sync-wrapped handler awaited through the adapter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ts/api): Python parity — auto-detect local, static factories, ring_timeout

Brings the TS SDK to parity with Python on the BUG #4 items, the #14
agent fields and IMP2 ring_timeout.

- Auto-detect local mode: ``new Patter({twilioSid, twilioToken, …})``
  without explicit ``mode: 'local'`` is now treated as local when
  apiKey is missing (mirrors Python).
- Static provider factories: ``Patter.deepgram(...)``,
  ``Patter.elevenlabs(...)``, ``Patter.whisper(...)``,
  ``Patter.openaiTts(...)``, ``Patter.cartesia(...)``, ``Patter.rime(...)``,
  ``Patter.lmnt(...)``.
- ``STTConfig.toDict`` / ``TTSConfig.toDict`` are now optional — plain
  object literals ``{provider, apiKey, language}`` are accepted
  everywhere (fallback serialisation is handled via
  ``sttConfigToDict`` / ``ttsConfigToDict`` helpers).
- ``STTConfig`` gets an ``options`` bag (parity with Python BUG #13).
- ``LocalCallOptions.ringTimeout`` forwarded to Twilio as ``Timeout``
  and Telnyx as ``timeout_secs`` — plus ``StatusCallbackEvent`` wired
  so the dashboard sees ringing/no-answer/busy/failed transitions
  (BUG #6).
- ``AgentOptions.bargeInThresholdMs`` (parity with #20 on Python).
- ``LocalOptions.deepgramKey`` / ``elevenlabsKey`` added as
  provider-level defaults (parity with Python Patter() kwargs).
- ``Patter.call()`` Twilio branch pre-registers the dialled call with
  ``metricsStore.recordCallInitiated`` so no-answer / busy / failed
  attempts still show up in the dashboard.
- ``providers.deepgram(...)`` factory exposes the Deepgram knobs
  (model / endpointing_ms / utterance_end_ms / smart_format /
  interim_results) and carries them in ``STTConfig.options``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts/providers): voice resolver, Deepgram knobs, TTS streaming resample

TS parity port of Python BUG #11, #13, #23.

- ElevenLabs: ``resolveVoiceId()`` maps display names (rachel, adam,
  matilda, alloy, …) to the opaque 20-char voice IDs accepted by the
  /text-to-speech/{voice_id}/stream endpoint. Map mirrors the Python
  SDK byte-for-byte.
- DeepgramSTT: constructor overloaded to accept ``DeepgramSTTOptions``
  (endpointingMs / utteranceEndMs / smartFormat / interimResults /
  vadEvents) alongside the legacy positional form. Transcript gate
  loosened to ``is_final OR speech_final`` so short utterances don't
  wait for Deepgram's utterance_end commit.
- OpenAITTS: streaming 24 kHz → 16 kHz resample now carries state
  (``carryByte`` + ``leftover`` samples) between chunks so cross-chunk
  alignment doesn't drift. The legacy ``resample24kTo16k`` static is
  kept as a thin wrapper around the streaming path for the existing
  unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts): Telnyx stack, pipeline hooks/barge-in/dedup, dashboard status, scheduler sync

TS parity port of the Python fixes for BUG #2/#3/#6/#12/#15/#16/#17/#18/#19/#20/#22.

- ``stream-handler.ts``: ``handleAudio`` now runs the
  ``before_send_to_stt`` hook (#15), transcodes Twilio mulaw 8 kHz →
  PCM16 16 kHz unconditionally on the pipeline path (#12), and keeps
  forwarding caller audio during TTS so barge-in can trigger (#20).
  ``processTranscript`` implements the dedup + 500 ms throttle +
  hallucination-word blacklist from #22 and flips ``isSpeaking`` +
  ``sendClear`` on any transcript with text while the agent is
  speaking (#20).
- ``server.ts``: ``TelnyxBridge.sendAudio`` / ``sendClear`` use the
  correct ``{event:"media",media:{payload:b64}}`` wire format (#18);
  the Telnyx WS handler matches ``data.event`` (start / media / stop /
  dtmf / error / connected) and filters ``media.track !== "inbound"``
  before forwarding (#17, #19); the ``/webhooks/telnyx/voice`` route
  POSTs ``actions/answer`` and ``actions/streaming_start`` via the
  Call Control REST API and returns empty HTTP 200 (#16).
  ``TwilioBridge.createStt`` picks linear16 16 kHz when
  ``provider === 'pipeline'`` so Deepgram doesn't decode already-PCM
  bytes as mulaw (#12). A new ``/webhooks/twilio/status`` handler
  consumes Twilio status callbacks and updates the dashboard (#6).
- ``scheduler.ts``: ``scheduleCron`` returns a ``ScheduleHandle``
  synchronously (lazy node-cron import happens in the background) —
  parity with Python #4. ``scheduleInterval`` accepts
  ``{intervalMs}`` or ``{seconds}`` in addition to the legacy
  positional ms, matching Python ``schedule_interval(seconds=...)``.
- ``fallback-provider.ts``: ``completeStream()`` text-only convenience
  generator (#2), plus ``aclose()`` + ``Symbol.asyncDispose`` so
  ``await using fallback = ...`` works, for parity with Python's
  ``async with FallbackLLMProvider(...)`` (#5).
- ``dashboard/store.ts``: ``recordCallInitiated`` pre-registers
  outbound attempts, ``updateCallStatus`` promotes rows through
  ringing / no-answer / busy / failed and moves terminal states to
  the completed list (#6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ts): align with 0.4.4 wire-format & provider API changes

- ``providers.test.ts``: toDict now surfaces ``options`` when set,
  knobs forwarding verified.
- ``types.test.ts``: toDict optional chain covered.
- ``openai-tts.test.ts``: 1-byte input no longer returns the byte
  verbatim — the streaming resampler stashes it as ``carryByte`` and
  the stateless wrapper flushes only complete samples, so the test now
  asserts an empty buffer.
- ``integration/twilio-pipeline.test.ts`` + ``integration/telnyx-pipeline.test.ts``:
  ``handleAudio`` is now async; tests await it. Telnyx fixture feeds
  mulaw 8 kHz and asserts the transcoded PCM16 16 kHz lands on the STT
  mock (BUG #12 + #19).
- ``unit/server-routes.test.ts``: Telnyx webhook tests assert the
  REST ``actions/answer`` + ``actions/streaming_start`` POSTs and the
  empty HTTP 200 response (BUG #16).
- ``package-lock.json``: refreshed for the sdk-ts worktree so the
  ``0.4.3`` → ``0.4.3-worktree`` alignment is consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(unit): regression coverage for BUG #6/#22/#23 + ring_timeout (IMP2)

Three new unit test files lock in fixes that previously lived in the
acceptance suite as live-call checks:

  test_pipeline_dedup.py (13 tests)
    - Hallucination blacklist: "you", "thank you", ".", case/punctuation
      variants, empty-after-strip all drop silently.
    - 2-second duplicate window with time.time monkeypatched so parity
      with the live Whisper feedback loop is deterministic.
    - 500 ms back-to-back throttle covering legitimate vs spurious
      second turns.
    - Interim / empty finals must not fire on_transcript.

  test_openai_tts_resample.py (7 tests)
    - Cross-chunk ratecv state: multi-chunk stream output matches a
      single-shot resample byte-for-byte.
    - Odd-byte boundary: a chunk ending on a dangling byte must not
      drop the sample.
    - Empty / single-byte / tiny chunks must not crash.
    - Response is always aclosed on both successful and early-exit paths.

  test_twilio_status_and_ring_timeout.py (13 tests)
    - /webhooks/twilio/status routes to update_call_status with parsed
      duration, and survives missing SID, bad duration, and the
      dashboard-disabled path.
    - Twilio signature enforcement on the status endpoint.
    - Twilio ring_timeout -> Timeout REST param, Telnyx -> timeout_secs.
    - Twilio StatusCallback / StatusCallbackEvent are always registered
      on outbound calls so BUG #6 cannot regress.

Full unit suite: 728 passed, 2 skipped.

* docs+ci: latency/provider caveats + audit workflow

README
  - Pipeline turn-latency floor documented (~2.0–2.8 s) with per-stage
    breakdown so users know to switch to `provider="openai_realtime"`
    for sub-second UX.
  - ElevenLabs free-tier library-voice restriction (402) with pointer
    to `ELEVENLABS_VOICE_ID`.
  - Telnyx outbound D38 Outbound Profile requirement.
  - Google Gemini free-tier quota=0 caveat.
  - Whisper hallucination filter documented.
  - `ring_timeout` + status callback description added to call().

.github/workflows/audit.yml (new)
  - pip-audit on sdk-py runtime deps.
  - npm audit on sdk-ts production deps.
  - bandit static analysis with SARIF upload to GitHub Security.
  - Runs on dep-manifest changes, weekly schedule, and manual dispatch.
  - Findings are advisory-only to keep the pipeline from flaking on
    upstream CVE churn (telephony stack pulls many C-wrapped libs).

Baseline audit run: npm=0, bandit medium+/high-confidence=0,
pip-audit=2 (pytest dev-only + transformers optional-extra only).

* docs(readme): remove local-measured latency numbers from Voice Modes

The millisecond ranges previously listed for each provider came from a
single local benchmark run and are neither representative nor a target.
Keep the modes table qualitative and replace the per-stage breakdown
with a short note that latency is inherited from the chosen providers,
with no hard numbers for callers to anchor on.

* test(unit): bug coverage gaps — BUG #15/#19/#20

Three new unit test modules fill the remaining coverage gap for the
bugs fixed on this branch:

  test_pipeline_bargein.py (7 tests) — BUG #20
    - Interim transcript during TTS triggers send_clear + is_speaking=False.
    - record_turn_interrupted is fired on the metrics accumulator.
    - send_clear throwing does not crash the STT loop (fail-open).
    - No barge-in when the agent is idle or the transcript has no text.
    - Final transcripts also trigger the barge-in branch before the
      downstream LLM turn runs.

  test_before_send_to_stt_hook.py (9 tests) — BUG #15
    - Sync / async hook returning None drops the chunk (zero STT sends).
    - Returning modified bytes forwards the new buffer verbatim.
    - Hook receives the decoded PCM, not the raw mulaw payload.
    - Raising hooks fail-open: original audio still reaches STT.
    - Missing hook / hooks instance with before_send_to_stt=None are
      both bypass paths that must still forward audio.

  test_telnyx_track_filter.py (5 tests) — BUG #19
    - track=inbound forwards, track=outbound drops.
    - Missing `track` field defaults to inbound (legacy Telnyx payloads).
    - Mixed stream: only inbound frames reach the handler, in order.
    - Unknown track values are skipped defensively.

Full unit suite: 749 passed, 2 skipped (+21 from this commit).

* feat(sdk-py): add cartesia/rime/lmnt static factories + vad_events to deepgram

Brings Python SDK to parity with sdk-ts:
- Adds Patter.cartesia / Patter.rime / Patter.lmnt static methods so local-mode
  users can configure these TTS providers the same way they do in TypeScript.
- Adds the missing vad_events keyword to Patter.deepgram and the
  patter.providers.deepgram factory — the DeepgramSTT ctor already accepted
  it, but the public config helper silently dropped the flag.

* chore: bump to 0.4.4

Regression suites re-run after the bump:
  - sdk-py: 749 passed, 2 skipped
  - sdk-ts: 932 passed (57 test files, including soak)

* fix(ci): integration tests on 0.4.4 wire format + misc hygiene

Addresses the five failing CI checks on PR #66.

Telnyx integration tests (test_telnyx_{convai,pipeline,realtime}.py)
  - ``_telnyx_stream_started`` / ``_telnyx_media_event`` /
    ``_telnyx_stream_stopped`` helpers migrated from the pre-0.4.4
    ``{event_type, payload.audio.chunk}`` shape to the real Telnyx
    media-stream wire format ``{event, start|media.payload}`` (BUG
    #17/#18). Without this the bridge silently drops every test frame
    and 11 integration tests fail with "handler called 0 times".
  - ``test_audio_format_pcm16`` renamed to ``test_audio_format_g711_ulaw``
    and the assertion flipped — Telnyx is PCMU 8 kHz bidirectional
    (BUG #19), Realtime runs on ``g711_ulaw`` so both legs stay
    pass-through.

sdk-ts/src/scheduler.ts
  - Removed the trailing blank line that broke the pre-commit
    ``end-of-file-fixer`` hook.

.github/workflows/audit.yml
  - Bandit stock CLI doesn't support ``-f sarif`` — install
    ``bandit-sarif-formatter`` alongside bandit, and guard the
    upload-sarif step with ``hashFiles`` so future formatter breakage
    doesn't fail the job.

Local verification: 802 passed, 4 skipped (sdk-py unit + integration).

* docs: update SDK reference for 0.4.4 features

- Update version to 0.4.4 in API reference
- Add static factories: cartesia(), rime(), lmnt() for TTS
- Document new agent() parameters: hooks, text_transforms, vad, audio_filter, background_audio, barge_in_threshold_ms
- Add ring_timeout parameter to call() signature
- Document Deepgram tuning options: endpointing_ms, utterance_end_ms, vad_events
- Synchronize Python and TypeScript API documentation for parity

* docs: document barge_in_threshold_ms configuration

Update barge-in feature documentation to reflect new barge_in_threshold_ms parameter
(default 300ms). Document how to customize or disable via agent configuration.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nicolotognoni added a commit that referenced this pull request Apr 29, 2026
Five fixes uncovered by the 0.5.5 acceptance matrix run, ranging from a
HIGH-severity onnxruntime-node version mismatch that blocks Silero VAD
on macOS x86_64 to a misleading metric that makes healthy calls look
slow.

**Bug #1 (HIGH) — SileroVAD onnxruntime-node 1.24+ API drift**
* ``optionalDependencies.onnxruntime-node`` tightened from ``^1.18.0`` to
  ``~1.18.0`` — the caret was resolving to 1.24.x where
  ``listSupportedBackends`` was removed and the prebuilt ``bin/`` layout
  drifted, so ``import('onnxruntime-node')`` failed on macOS x86_64.
* ``loadOnnxRuntime`` now classifies the underlying error
  (``missing`` / ``binding`` / ``api-drift`` / ``unknown``) and surfaces a
  targeted remedy plus the original error chain via ``Error.cause`` —
  previously the failure mode was hidden behind a single "could not be
  resolved" string.

**Bug #2 (MEDIUM) — ElevenLabsConvAI agent_id error message**
* The env-var fallback already worked but the error message did not say
  *where* to get an agent ID from (the dashboard, not the API key).
  Updated both Python and TypeScript constructors to point users at
  https://elevenlabs.io/app/conversational-ai and reiterate that the
  agent ID is per-deployed-agent.
* Python ``ConvAI.__post_init__`` now raises when ``agent_id`` is empty
  (was silently passing through) — TypeScript already did this. Parity.

**Bug #3 (MEDIUM) — ElevenLabs WS payment_required**
* New typed exception ``ElevenLabsPlanError`` (subclass of
  ``ElevenLabsTTSError``) raised when the WS endpoint returns
  ``payment_required``. Free / Starter plans now get a clear "upgrade
  or use the HTTP class (drop-in API)" message instead of an opaque
  ``ElevenLabsTTSError: ElevenLabs WS error: payment_required``.
* Detection is case-insensitive and matches both the exact server
  string and any ``payment_required`` substring.
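
The detection rule for Bug #3 can be sketched as below. The two exception classes are local stand-ins that share the names used above, the helper function is hypothetical, and the message wording is illustrative rather than the SDK's actual text.

```python
class ElevenLabsTTSError(Exception):
    """Stand-in for the SDK's base ElevenLabs TTS error."""


class ElevenLabsPlanError(ElevenLabsTTSError):
    """Raised when the account plan does not cover the WS endpoint."""


def error_for_ws_message(message: str) -> ElevenLabsTTSError:
    """Map a WS error string to the most specific exception type."""
    # Case-insensitive substring match covers the exact server string too.
    if "payment_required" in message.lower():
        return ElevenLabsPlanError(
            "ElevenLabs WS endpoint requires a paid plan: "
            "upgrade or use the HTTP class"
        )
    return ElevenLabsTTSError(f"ElevenLabs WS error: {message}")
```

Because ``ElevenLabsPlanError`` subclasses ``ElevenLabsTTSError``, existing ``except ElevenLabsTTSError`` handlers keep working.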

**Bug #5 (MEDIUM) — barge-in fragile in pipeline mode without VAD**
* On tunnel + speakerphone setups the agent's own TTS leaks into the
  inbound mic feed, STT transcribes it, and the legacy
  "always forward + bargeInThresholdMs" heuristic fails to fire the
  cancel — the agent talks over the user.
* ``serve()`` now logs a one-shot warning at startup when
  ``agent.engine`` is undefined, ``agent.vad`` is undefined, and
  ``bargeInThresholdMs > 0``, recommending ``SileroVAD`` or
  ``bargeInThresholdMs: 0``. Both Python and TypeScript.

**Bug #6 (LOW) — pipeline ``total_ms`` misleading on long utterances**
* ``total_ms`` spans the user's entire utterance (including pauses)
  because it includes ``stt_ms``, which itself measures STT-stream-open
  to transcript-finalisation. On a 4 s user turn ``total_ms`` reads
  ~5.5 s even though the agent's TTFA after end-of-speech is ~1.3-1.5
  s — misleading as a p95 / SLO metric.
* New ``LatencyBreakdown.agent_response_ms`` field (Python +
  TypeScript). Computed as ``endpoint_ms + llm_ttft_ms + tts_ms`` when
  all three signals are available, ``undefined`` / ``None`` otherwise.
  This is the user-perceived latency dashboards should track.
* ``total_ms`` kept unchanged for backward compatibility.
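
As a worked example of the ``agent_response_ms`` rule, here is a standalone sketch. The free function is illustrative only (in the SDK this is a field on ``LatencyBreakdown``), and the sample numbers are made up.

```python
from typing import Optional


def agent_response_ms(
    endpoint_ms: Optional[float],
    llm_ttft_ms: Optional[float],
    tts_ms: Optional[float],
) -> Optional[float]:
    """Sum the three post-speech stages, or None if any signal is missing."""
    parts = (endpoint_ms, llm_ttft_ms, tts_ms)
    if any(part is None for part in parts):
        return None
    return sum(parts)


# Made-up numbers: a long user turn inflates stt_ms (and so total_ms),
# but the time the caller actually waits is only the sum below.
print(agent_response_ms(300, 600, 450))
print(agent_response_ms(300, None, 450))
```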

**Bug #7 (HIGH) — outbound TwiML races tunnel startup**
* The documented ``void phone.serve(...) → setTimeout → phone.call(...)``
  pattern reads ``localConfig.webhookUrl`` while the cloudflared
  hostname is still resolving, producing
  ``wss://undefined/...`` in the dial TwiML and a Twilio 11100 call
  drop on answer.
* New ``phone.tunnelReady`` Promise (TS) / ``phone.tunnel_ready``
  ``asyncio.Future`` (Python). Resolves to the public webhook hostname
  once ``serve()`` knows it (immediately for static webhookUrl,
  after ``startTunnel`` for ``tunnel: true``). Rejects if ``serve()``
  fails before the hostname is known.
* Documented pattern is now ``await phone.tunnelReady`` instead of
  ``setTimeout(10_000)`` — deterministic, no race.
* Same root-cause fix likely also addresses Bug #4 (intermittent WS
  upgrade race) which the acceptance run flagged as a related symptom.

Test totals after the fixes: Python 1064 PASS / 7 skip, TypeScript
1163 PASS / 67 files, cross-SDK chunker parity 53 PASS / 8 XFAIL / 0
FAIL on the 61-case fixture. No regressions.
nicolotognoni added a commit that referenced this pull request May 1, 2026
…#81)

* test(parity): cross-SDK sentence chunker fixture + standalone runner

Add a 61-case fixture documenting expected sentence-chunker output for
every supported edge case across English, Italian, CJK, Hindi, Arabic,
Khmer, Burmese, Armenian, and Ethiopic scripts. Each case carries the
ideal `expected_sentences` plus an optional `current_behavior` field that
documents known regressions / by-design quirks so the runner can xfail
them without blocking CI.

Standalone runner (`sentence_chunker_parity.py`) executes each case
through the Python `SentenceChunker`, spawns `node sentence_chunker_shim.js`
for the TypeScript equivalent, and compares emissions case-by-case.
Self-contained — does not depend on the main `tests/parity/run.py`
runner (which currently fails on the recent `patter` -> `getpatter`
package rename).

Result on the current main branch: 53 PASS / 8 XFAIL / 0 FAIL /
0 PARITY_FAIL — Python and TypeScript chunkers produce identical
sentence streams for every covered case.

* feat(chunker): IT/EN abbreviations, multilingual terminators, aggressive first-flush

Three layered improvements to ``SentenceChunker`` (parity Py↔TS), all
additive — no breaking change to the default behaviour:

**Italian + English abbreviations** (Phase 1, 7)
* Prefix list adds Sig, Sgr, Dott, Prof, Avv, Ing, Geom, Rag, Arch, On,
  Egr, Spett, Gent, Ill (Italian honorifics) plus Gen, Sen, Rep, Lt,
  Cpt, Capt, Col, Cmdr, Adm (Pipecat NLTK Punkt).
* Suffix list adds ecc, cit, cap, sez, art, pag, fig, tab, cfr, vol, ed
  (Italian) plus vs, etc, No, Vol, pp, cf, ca, op, Mt, Hwy, Rt, Pl, Ave,
  Blvd, Sq (Pipecat).
* Suffix-followed-by-starter pattern now preserves the trailing period
  (e.g. ``Patter Inc. He left.`` keeps ``Inc.`` instead of dropping it).
* All-caps name fix (Pipecat #1692): the maybe-short-flush gate-5
  acronym guard previously blocked any uppercase-preceded period, so
  ``"...with RAMESH."`` would never flush. Now only purely uppercase
  ASCII words ≤3 chars (U/US/USA/NATO patterns) are treated as acronyms.
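
The narrowed acronym guard can be illustrated with a minimal sketch. The helper name ``looks_like_acronym`` is hypothetical and not taken from the SDK; only the rule itself (purely uppercase ASCII, at most 3 characters) comes from the description above.

```python
def looks_like_acronym(word: str) -> bool:
    """Return True only for short, purely uppercase ASCII words (e.g. US, USA).

    Hypothetical helper: the length limit of 3 mirrors the commit text,
    the function name and signature are illustrative only.
    """
    return (
        0 < len(word) <= 3
        and word.isascii()
        and word.isalpha()
        and word.isupper()
    )


# Short acronyms still block the flush; a longer all-caps name does not.
print(looks_like_acronym("USA"))     # True
print(looks_like_acronym("RAMESH"))  # False
```

Under this rule a sentence ending in ``RAMESH.`` is no longer mistaken for an abbreviation, while ``U.`` / ``US.`` / ``USA.`` keep the conservative behaviour.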

**Multilingual terminator support** (Phase 7)
* Added ASCII semicolon ``;``, Unicode ellipsis ``…``, full-width
  semicolon/period/Japanese half-width to the terminator set.
* Ported Pipecat's ``UNAMBIGUOUS_NON_LATIN_TERMINATORS`` (BSD-2): Hindi
  Devanagari ``। ॥``, Arabic ``؟ ؛ ۔ ؏``, Khmer ``។ ៕``, Burmese ``။``,
  Armenian ``։``, Ethiopic ``። ፧``, Tibetan ``༎ ༏``.
* Final ``<stop>`` regex builds its character class from the merged set.
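
As a rough sketch of how a character class can be built from a merged terminator set: the non-Latin characters below are the ones listed above, while the base Latin set, the function names, and the naive splitting loop are assumptions made for illustration and are not the SDK's actual chunker.

```python
import re

# Assumed base set plus the two additions named in the commit text.
LATIN_TERMINATORS = {".", "!", "?", ";", "…"}
# Characters copied from the list above (Hindi, Arabic, Khmer, Burmese,
# Armenian, Ethiopic, Tibetan).
NON_LATIN_TERMINATORS = {
    "।", "॥", "؟", "؛", "۔", "؏", "។", "៕", "။", "։", "።", "፧", "༎", "༏",
}


def build_stop_pattern(*groups: set) -> "re.Pattern[str]":
    """Merge the sets and compile a single-character class from them."""
    merged = set().union(*groups)
    char_class = "".join(re.escape(ch) for ch in sorted(merged))
    return re.compile(f"[{char_class}]")


STOP = build_stop_pattern(LATIN_TERMINATORS, NON_LATIN_TERMINATORS)


def split_on_terminators(text: str) -> list:
    """Naive splitter: cut after every terminator, keep it with its sentence."""
    parts, start = [], 0
    for match in STOP.finditer(text):
        parts.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        parts.append(tail)
    return [p for p in parts if p]
```

Because the class is derived from the merged set, adding a script only requires adding its terminators to one of the sets. The splitter here ignores abbreviations and ellipsis runs, which the real chunker handles separately.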

**Opt-in aggressive first-clause flush** (Phase 2)
* New constructor option ``aggressive_first_flush`` (Python) /
  ``aggressiveFirstFlush`` (TypeScript). **Default OFF.**
* When enabled, emits the first clause of the response on a soft
  punctuation boundary (``,``, em-dash, en-dash) once the buffer reaches
  ``aggressive_first_min_len`` (default 40 chars). Saves 200–500 ms TTFA
  on the first sentence of each turn.
* Eight guards prevent regressions on the safe-but-aggressive path:
  min-length, decimal-comma (``3,14``), thousands-separator
  (``1,000,000``), currency (``$1,000``, ``€1.000,50``), balanced
  parens/brackets/braces/double-quotes (protects JSON), ellipsis
  (``...``, ``…``), comma-before-quote, sub-token ambiguity (requires
  one char after the terminator).
* Italian (``language="it"``) hard-disables the feature regardless of
  caller preference — Italian inverts EN convention (``,`` decimal,
  ``.`` thousands), so a comma-flush would split mid-number.
* New ``Agent.aggressive_first_flush: bool = False`` field on Python
  ``Agent`` model. TypeScript ``AgentOptions.aggressiveFirstFlush`` is
  shipped in the after_llm 3-tier commit alongside the rest of the
  ``types.ts`` surface.
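
A simplified sketch of the soft-boundary decision follows. It covers only a subset of the guards (min-length, digit-comma-digit, open delimiters and double quotes, comma-before-quote, one char after the boundary, Italian hard-disable). The function name, the exact guard order, and the reading of "buffer reaches 40 chars" as a check on total buffer length are all assumptions.

```python
from typing import Optional

SOFT_BOUNDARIES = {",", "\u2014", "\u2013"}  # comma, em dash, en dash
OPENERS = "([{"
CLOSERS = ")]}"


def first_clause_end(buffer: str, language: str = "en", min_len: int = 40) -> Optional[int]:
    """Return the index just past a safe soft boundary, or None to keep buffering."""
    if language == "it":          # Italian hard-disable
        return None
    if len(buffer) < min_len:     # min-length guard
        return None
    depth = 0
    in_quotes = False
    for i, ch in enumerate(buffer):
        if ch in OPENERS:
            depth += 1
        elif ch in CLOSERS:
            depth = max(0, depth - 1)
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch in SOFT_BOUNDARIES:
            if i + 1 >= len(buffer):
                return None       # need one char after the boundary to decide
            prev_ch = buffer[i - 1] if i > 0 else ""
            next_ch = buffer[i + 1]
            if depth > 0 or in_quotes:
                continue          # inside parens/brackets/braces/quotes
            if prev_ch.isdigit() and next_ch.isdigit():
                continue          # 3,14 or 1,000,000 style numbers
            if next_ch in "\"'":
                continue          # comma directly before a quote
            return i + 1
    return None
```

For example, ``first_clause_end("Sure, I can help you with that booking right now")`` returns ``5``, so ``"Sure,"`` could be sent to TTS while the rest of the sentence is still streaming.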

Test coverage: +11 Python unit tests and +11 TypeScript unit tests for
the aggressive first-flush feature, plus parity-fixture cases for RAMESH,
Hindi danda, Arabic question mark, ASCII semicolon, Unicode ellipsis,
and the vs./etc./Gen./Sen. abbreviations.

Sentence-chunker constants and abbreviation lists ported from Pipecat
(BSD-2-Clause, Daily) and from the LiveKit-derived regex base
(Apache-2.0).

* feat(hooks): after_llm 3-tier API with deprecated legacy callable adapter

The ``after_llm`` pipeline hook used to be a single callable
``(text, ctx) → str`` that received the full LLM response only after
the stream completed. Buffering the entire response added 500 ms – 2 s
of TTFA for any agent that configured the hook.

This commit introduces a 3-tier API that lets callers pick the right
latency budget for their transform:

* ``onChunk`` (sync, ~0 ms) — per-token transform applied inline before
  the stream-handler ever sees the token. Use for: regex replace,
  markdown strip, profanity char-swap. Does NOT block streaming.
* ``onSentence`` (async, 50–300 ms) — runs between the sentence chunker
  and TTS. Returns rewritten sentence, ``null`` to keep the original,
  ``""`` to drop the sentence. Use for: PII redaction, persona overlay,
  refusal swap. Adds latency only on the rewritten sentence, not the
  full turn.
* ``onResponse`` (async, 500 ms – 2 s) — full-response rewrite that
  buffers the LLM stream then runs once. **Blocks streaming TTS.** Use
  only when sentence-level rewrite is insufficient (e.g. structured
  output validation that needs the full text).
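
The ``onSentence`` return semantics can be sketched in Python as below. The wrapper ``apply_sentence_hook`` and the one-argument hook signature are hypothetical simplifications; the fail-open and non-string handling are assumptions about reasonable behaviour, not a statement of what the SDK does.

```python
import asyncio
import re
from typing import Awaitable, Callable, Optional

SentenceHook = Callable[[str], Awaitable[Optional[str]]]


async def apply_sentence_hook(sentence: str, hook: Optional[SentenceHook]) -> Optional[str]:
    """Return the text to hand to TTS, or None when the sentence is dropped."""
    if hook is None:
        return sentence
    try:
        result = await hook(sentence)
    except Exception:
        return sentence            # assumed fail-open: keep the original
    if result is None or not isinstance(result, str):
        return sentence            # None (or a non-string) keeps the original
    if result == "":
        return None                # empty string drops the sentence
    return result


async def redact_digits(sentence: str) -> Optional[str]:
    """Example hook: drop sentences with 'secret', mask digits elsewhere."""
    if "secret" in sentence:
        return ""
    redacted = re.sub(r"\d", "#", sentence)
    return redacted if redacted != sentence else None


print(asyncio.run(apply_sentence_hook("Call 555 now.", redact_digits)))  # Call ### now.
```

Only sentences the hook actually rewrites pay its latency; untouched sentences pass straight through to synthesis.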

Backward compatibility
----------------------
The legacy single callable ``afterLlm: (text, ctx) => string`` still
works and is mapped to ``onResponse`` with a one-shot
``PatterDeprecationWarning`` (Python — subclass of both
``DeprecationWarning`` and ``UserWarning`` so it surfaces by default in
library code) or ``console.warn`` (TypeScript). Removal scheduled for
v0.7.0.

Detection in TypeScript uses ``typeof hook === 'function'`` (not
``hook.length`` arity sniffing — that pattern breaks under minifiers
and arrow defaults). Detection in Python uses ``callable(hook)`` plus
``_has_tier_attrs(hook)`` to disambiguate from object-form hooks.
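
A minimal sketch of the Python-side detection, assuming snake_case tier names (``on_chunk`` / ``on_sentence`` / ``on_response``). ``PatterDeprecationWarning`` is redefined locally as a stand-in, and ``normalise_hook`` is a hypothetical name; the one-shot behaviour is not reproduced here.

```python
import warnings


class PatterDeprecationWarning(DeprecationWarning, UserWarning):
    """Local stand-in: subclassing UserWarning makes it visible by default."""


TIER_NAMES = ("on_chunk", "on_sentence", "on_response")


def _has_tier_attrs(hook) -> bool:
    """True when the object exposes at least one callable tier attribute."""
    return any(callable(getattr(hook, name, None)) for name in TIER_NAMES)


def normalise_hook(hook) -> dict:
    """Map any supported hook form to a {tier_name: callable} dict."""
    if hook is None:
        return {}
    if isinstance(hook, dict):
        return {n: hook[n] for n in TIER_NAMES if callable(hook.get(n))}
    if _has_tier_attrs(hook):      # object form wins over plain callable
        return {n: getattr(hook, n) for n in TIER_NAMES
                if callable(getattr(hook, n, None))}
    if callable(hook):             # legacy single callable
        warnings.warn(
            "Passing a bare callable as after_llm is deprecated; "
            "use on_response instead.",
            PatterDeprecationWarning,
            stacklevel=2,
        )
        return {"on_response": hook}
    raise TypeError(f"Unsupported after_llm hook: {type(hook).__name__}")
```

Checking the tier attributes before ``callable(hook)`` means an object that defines both ``__call__`` and ``on_sentence`` is treated as the new object form and does not trigger the warning.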

Wire-up
-------
* ``llm_loop.py`` / ``llm-loop.ts`` — ``has_after_llm_response`` (and
  the legacy callable that maps to it) gates token buffering.
  ``has_after_llm_chunk`` triggers per-token transform inline before
  yield.
* ``stream_handler.py`` / ``stream-handler.ts`` — applies
  ``has_after_llm_sentence`` between the chunker emit and the TTS
  synthesise call. Both the streaming-LLM path and the non-streaming
  ``_speakFinalResponse`` path apply the hook for parity.
* The same ``stream_handler`` change wires
  ``Agent.aggressive_first_flush`` / ``AgentOptions.aggressiveFirstFlush``
  into the chunker constructor (Phase 2 wire-up that needed
  ``stream_handler`` and ``types.ts`` to land here alongside the hook
  changes — separating them would have required interactive patch
  staging on the same hunks).

Test coverage
-------------
* +11 Python pytest cases under ``TestAfterLlmThreeTier`` covering: no
  hook pass-through, legacy callable maps to ``on_response`` with
  deprecation warning, dict / Protocol / object forms, drop-by-empty,
  fail-open on hook exception, type confusion (non-string return),
  legacy alias methods (``has_after_llm`` / ``run_after_llm``) preserved.
* +9 TypeScript Vitest cases covering the equivalent surface.

* feat(tts): ElevenLabsWebSocketTTS — opt-in low-latency WS variant

New TTS provider that targets ElevenLabs' streaming-input WebSocket
endpoint (``/v1/text-to-speech/{voice}/stream-input``) instead of the
HTTP ``/stream`` endpoint used by ``ElevenLabsTTS``. Saves ~50 ms HTTP
request setup per utterance and avoids the TLS cold-start handshake on
bursty calls.

Drop-in API matching ``ElevenLabsTTS``:

* Same ``synthesize`` (Python) / ``synthesizeStream`` (TypeScript)
  signature returning ``AsyncGenerator<bytes>``.
* Same ``for_twilio()`` / ``for_telnyx()`` factories.
* Same default model ``eleven_flash_v2_5``.
* Top-level export ``getpatter.ElevenLabsWebSocketTTS`` (Py) /
  ``import { ElevenLabsWebSocketTTS } from "getpatter"`` (TS).

Defaults
--------
* ``auto_mode=true`` — server picks chunk timing.
* ``inactivity_timeout=60`` (range 5–180).
* Per-utterance lifecycle. Documented as a known trade-off vs Pipecat's
  per-session pool (pooling is on the roadmap for v0.6.x).
* ``eleven_v3*`` is rejected at construction with a clear error — the
  WS stream-input endpoint does not support v3; users must fall back
  to the HTTP class.

Resilience contract (post-review hardening)
-------------------------------------------
* **Connect timeout 5 s** (Pipecat-aligned, was 15 s in earlier
  drafts) bounds DNS + TLS handshake.
* **Per-frame receive timeout 30 s** prevents the generator hanging
  forever on a stalled server.
* **Permanent error handler attached BEFORE the open await** — closes
  a window where an error fired after the once-listener resolved would
  surface as ``uncaughtException`` in Node.
* **All ws listeners removed in ``finally``** — no closure leak past
  socket close.
* **Server ``error`` raises ``ElevenLabsTTSError``** instead of
  silently completing — caller can distinguish "synthesis succeeded
  with empty text" from "synth failed mid-stream".
* **Best-effort EOS ``{"text":""}`` in ``finally``** — tells
  ElevenLabs to stop billing for unconsumed audio. Sending it
  immediately after ``flush:true`` (the previous draft) risked
  truncating tail audio under ``auto_mode=true``.
* **Audio frame size cap 512 KB** prevents OOM via malicious /
  malformed base64 (real frames are ~75 KB decoded).
* **Server error string sanitised** before logging (strips CR/LF/NUL,
  truncates to 200 chars) — defends against log-line injection.
* **``api_key`` private** (``_api_key`` + read-only ``api_key``
  property) so ``vars(tts)`` / dataclass-style introspection cannot
  surface the secret.
* **``eleven_v3`` prefix-based reject** also blocks
  ``eleven_v3_preview``, ``eleven_v3_alpha``.
* **Public wrapper exposes the full options surface**
  (``voice_settings``, ``language_code``, ``inactivity_timeout``,
  ``chunk_length_schedule``) — earlier drafts dropped them.
* **Default voice consistency**: the public wrapper no longer
  overrides the provider class default — both layers use Rachel
  (``21m00Tcm4TlvDq8ikWAM``) so direct-construct and wrapped-construct
  paths agree.
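
Two of the hardening items above lend themselves to a short sketch: log sanitisation and the prefix-based ``eleven_v3`` reject. Both helper names are hypothetical, and the choice of ``ValueError`` is an assumption; only the rules (strip CR/LF/NUL, truncate to 200 chars, reject by prefix) come from the list.

```python
# Delete CR, LF and NUL outright so a server string cannot forge log lines.
_STRIP_TABLE = {ord("\r"): None, ord("\n"): None, ord("\x00"): None}


def sanitise_for_log(message: str, max_len: int = 200) -> str:
    """Remove control characters used for log injection and cap the length."""
    return message.translate(_STRIP_TABLE)[:max_len]


def reject_v3_models(model_id: str) -> None:
    """Raise for eleven_v3 and any variant sharing the prefix."""
    if model_id.startswith("eleven_v3"):
        raise ValueError(
            f"{model_id!r} is not supported on the WS stream-input endpoint; "
            "use the HTTP class instead."
        )


print(sanitise_for_log("bad\r\nINFO forged entry\x00"))  # badINFO forged entry
```

Matching on the prefix rather than an exact model name is what lets the same check cover ``eleven_v3_preview`` and ``eleven_v3_alpha`` without listing them.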

Public surface
--------------
* ``getpatter/providers/elevenlabs_ws_tts.py`` — provider class
  ``ElevenLabsWebSocketTTS`` + ``ElevenLabsTTSError``.
* ``getpatter/tts/elevenlabs_ws.py`` — wrapper class ``TTS`` re-exported
  as ``ElevenLabsWebSocketTTS`` from the package root.
* ``sdk-ts/src/providers/elevenlabs-ws-tts.ts`` + corresponding
  TypeScript wrapper at ``sdk-ts/src/tts/elevenlabs-ws.ts``.
* ``sdk-ts/src/providers/elevenlabs-tts.ts`` — ``resolveVoiceId``
  promoted from module-private to public export so the WS variant can
  share the voice-name → voice-id resolution table without
  duplicating the lookup map.
* ``sdk-py/getpatter/__init__.py`` and ``sdk-ts/src/index.ts`` —
  top-level re-exports.

Test coverage
-------------
* +20 Python pytest cases (construction, factories, URL build, send
  sequence, ``isFinal`` termination, voice settings in init,
  ``chunk_length_schedule`` only with ``auto_mode=False``,
  ``eleven_v3`` rejection + variants, env-var resolution).
* +11 TypeScript Vitest cases covering the equivalent surface,
  including a faked ``ws`` module that records sent frames.

The HTTP ``ElevenLabsTTS`` class is **untouched** — both transports
coexist and the user picks per-call.

* release: 0.5.5 — latency pass 1 (chunker + after_llm 3-tier + WS TTS)

Bump ``getpatter`` to 0.5.5 across both SDKs (Python ``pyproject.toml``,
TypeScript ``package.json`` + ``package-lock.json``, and the SDK
``__version__`` / ``VERSION`` constants kept in sync).

CHANGELOG entry covers the four user-visible additions shipped in this
release:

* Sentence chunker — IT/EN abbreviations + multilingual terminators +
  RAMESH-style all-caps flush bug fix (Pipecat #1692). Default
  behaviour unchanged for existing users.
* Opt-in ``aggressive_first_flush`` / ``aggressiveFirstFlush`` on
  ``Agent`` / ``AgentOptions`` — emits the first clause of each turn
  on a soft-punctuation boundary (",", em-dash, en-dash) once the
  buffer reaches ~40 chars. Saves 200–500 ms TTFA. Italian
  hard-disabled (decimal-comma + dot-thousands inversion). 8 guards
  prevent regressions on decimals, currency, JSON, ellipsis,
  open-delimiters, comma-before-quote, sub-token ambiguity.
* New 3-tier ``after_llm`` API (``onChunk`` / ``onSentence`` /
  ``onResponse``). Legacy single-callable form still works (mapped to
  ``onResponse``) but emits a one-shot ``PatterDeprecationWarning`` /
  ``console.warn``. Removal: v0.7.0.
* New opt-in ``ElevenLabsWebSocketTTS`` class — drop-in replacement
  for ``ElevenLabsTTS`` (HTTP) using the ``stream-input`` WebSocket
  endpoint. Saves ~50 ms HTTP setup + TLS cold-start per utterance.
  Per-utterance lifecycle (per-session pooling on the roadmap).

Test totals after this release: Python 1064 PASS / 7 skip,
TypeScript 1163 PASS / 67 files, cross-SDK chunker parity 53 PASS / 8
XFAIL / 0 FAIL on a 61-case fixture spanning EN, IT, CJK, Hindi,
Arabic, Khmer, Burmese, Armenian, and Ethiopic scripts.

Cumulative review hardening from 11 parallel review agents
(Python-reviewer, TypeScript-reviewer, provider-reviewer, sdk-parity,
security-reviewer, code-reviewer, code-simplifier, refactor-cleaner,
docs-sync, build-validator, examples-validator) is folded into the
phase-specific commits — see the per-feature commits in this branch
for the detailed CRITICAL / HIGH fix lists.

* docs: Mintlify pages for 0.5.5 — WS TTS, after_llm 3-tier, aggressive flush

Document the four user-visible additions shipped in 0.5.5:

* **ElevenLabsWebSocketTTS** — new provider sub-pages
  ``docs/{python,typescript}-sdk/providers/elevenlabs-websocket.mdx``.
  What it is, why use it, ``for_twilio`` / ``for_telnyx`` factories,
  full constructor params table, ``eleven_v3*`` limitation,
  per-utterance lifecycle trade-off, ``ElevenLabsTTSError``. Both
  sub-pages added to the TTS group navigation in ``docs/docs.json``.
  Existing ``tts.mdx`` providers table updated with the new row plus a
  callout pointing at the WS variant.

* **``after_llm`` 3-tier API** — new "Pipeline Hooks" section in
  ``docs/{python,typescript}-sdk/events.mdx``: per-tier table for
  ``onChunk`` (sync, ~0 ms), ``onSentence`` (async, 50–300 ms), and
  ``onResponse`` (async, 500 ms – 2 s, blocks streaming). Return
  semantics (``null`` keep / ``""`` drop), legacy callable migration
  path with ``PatterDeprecationWarning`` (Python) / one-shot
  ``console.warn`` (TypeScript), removal in v0.7.0.

* **``aggressive_first_flush`` opt-in** — new row in the
  ``AgentOptions`` / ``Agent`` parameters tables in
  ``docs/{python,typescript}-sdk/agents.mdx`` and ``reference.mdx``
  with the Italian hard-disable note. Python ``features.mdx`` adds a
  dedicated section with code example and the 8-guard summary.

* **Chunker improvements** — Python ``features.mdx`` documents the
  expanded EN abbreviations (``vs.``, ``etc.``, ``Gen.``, ``Sen.``),
  IT abbreviations (``Sig.``, ``Dott.``, ``S.p.A.``, ``ecc.``), and
  multilingual terminator support (Hindi / Arabic / Armenian /
  Ethiopic / Khmer / Burmese / Tibetan). TypeScript SDK has no
  chunker page so no equivalent change required.

``docs.json`` JSON validated end-to-end. No source / examples /
CHANGELOG / NOTICE files touched.

* fix: 5 bugs from 2026-04-29 acceptance run (sdk-ts 0.5.5)

Five fixes uncovered by the 0.5.5 acceptance matrix run, ranging from a
HIGH-severity onnxruntime-node version mismatch that blocks Silero VAD
on macOS x86_64 to a misleading metric that makes healthy calls look
slow.

**Bug #1 (HIGH) — SileroVAD onnxruntime-node 1.24+ API drift**
* ``optionalDependencies.onnxruntime-node`` tightened from ``^1.18.0`` to
  ``~1.18.0`` — the caret was resolving to 1.24.x where
  ``listSupportedBackends`` was removed and the prebuilt ``bin/`` layout
  drifted, so ``import('onnxruntime-node')`` failed on macOS x86_64.
* ``loadOnnxRuntime`` now classifies the underlying error
  (``missing`` / ``binding`` / ``api-drift`` / ``unknown``) and surfaces a
  targeted remedy plus the original error chain via ``Error.cause`` —
  previously the failure mode was hidden behind a single "could not be
  resolved" string.

**Bug #2 (MEDIUM) — ElevenLabsConvAI agent_id error message**
* The env-var fallback already worked but the error message did not say
  *where* to get an agent ID from (the dashboard, not the API key).
  Updated both Python and TypeScript constructors to point users at
  https://elevenlabs.io/app/conversational-ai and reiterate that the
  agent ID is per-deployed-agent.
* Python ``ConvAI.__post_init__`` now raises when ``agent_id`` is empty
  (was silently passing through) — TypeScript already did this. Parity.

**Bug #3 (MEDIUM) — ElevenLabs WS payment_required**
* New typed exception ``ElevenLabsPlanError`` (subclass of
  ``ElevenLabsTTSError``) raised when the WS endpoint returns
  ``payment_required``. Free / Starter plans now get a clear "upgrade
  or use the HTTP class (drop-in API)" message instead of an opaque
  ``ElevenLabsTTSError: ElevenLabs WS error: payment_required``.
* Detection is case-insensitive and matches both the exact server
  string and any ``payment_required`` substring.
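
The detection rule can be sketched as follows. The two exception classes are local stand-ins that reuse the names from the description, and ``error_for_server_message`` plus the message wording are hypothetical.

```python
class ElevenLabsTTSError(Exception):
    """Local stand-in for the provider's base error."""


class ElevenLabsPlanError(ElevenLabsTTSError):
    """Local stand-in for the plan-related subclass."""


def error_for_server_message(message: str) -> ElevenLabsTTSError:
    """Pick the exception type from the server's error string."""
    # Case-insensitive substring match covers the exact string as well.
    if "payment_required" in message.lower():
        return ElevenLabsPlanError(
            "The WebSocket endpoint is not available on this plan: "
            "upgrade, or use the HTTP class (drop-in API)."
        )
    return ElevenLabsTTSError(f"ElevenLabs WS error: {message}")
```

Because ``ElevenLabsPlanError`` subclasses ``ElevenLabsTTSError``, existing ``except ElevenLabsTTSError`` handlers keep working, while callers that want to fall back to HTTP can catch the narrower type first.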

**Bug #5 (MEDIUM) — barge-in fragile in pipeline mode without VAD**
* On tunnel + speakerphone setups the agent's own TTS leaks into the
  inbound mic feed, STT transcribes it, and the legacy
  "always forward + bargeInThresholdMs" heuristic fails to fire the
  cancel — the agent talks over the user.
* ``serve()`` now logs a one-shot warning at startup when
  ``agent.engine`` is undefined, ``agent.vad`` is undefined, and
  ``bargeInThresholdMs > 0``, recommending ``SileroVAD`` or
  ``bargeInThresholdMs: 0``. Both Python and TypeScript.
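
The startup check reduces to a three-way condition plus a one-shot flag. This sketch uses a hypothetical function name, a module-level flag, and Python-style ``None`` in place of ``undefined``; the log wording is illustrative.

```python
import logging

logger = logging.getLogger("patter.sketch")
_warned = False  # module-level flag gives the one-shot behaviour


def warn_if_barge_in_fragile(engine, vad, barge_in_threshold_ms: int) -> bool:
    """Log once when pipeline mode runs without VAD but with a barge-in threshold.

    Returns True only on the call that actually emitted the warning.
    """
    global _warned
    fragile = engine is None and vad is None and barge_in_threshold_ms > 0
    if fragile and not _warned:
        _warned = True
        logger.warning(
            "Barge-in without VAD is unreliable: configure SileroVAD "
            "or set the barge-in threshold to 0."
        )
        return True
    return False
```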

**Bug #6 (LOW) — pipeline ``total_ms`` misleading on long utterances**
* ``total_ms`` spans the user's entire utterance (including pauses)
  because it includes ``stt_ms``, which itself measures STT-stream-open
  to transcript-finalisation. On a 4 s user turn ``total_ms`` reads
  ~5.5 s even though the agent's TTFA after end-of-speech is ~1.3-1.5
  s — misleading as a p95 / SLO metric.
* New ``LatencyBreakdown.agent_response_ms`` field (Python +
  TypeScript). Computed as ``endpoint_ms + llm_ttft_ms + tts_ms`` when
  all three signals are available, ``undefined`` / ``None`` otherwise.
  This is the user-perceived latency dashboards should track.
* ``total_ms`` kept unchanged for backward compatibility.
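
The computation described above is small enough to sketch directly. ``LatencySketch`` is a hypothetical stand-in for ``LatencyBreakdown`` that carries only the three inputs; the sample values are illustrative, not measurements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatencySketch:
    """Hypothetical subset of LatencyBreakdown (all values in milliseconds)."""
    endpoint_ms: Optional[float] = None
    llm_ttft_ms: Optional[float] = None
    tts_ms: Optional[float] = None

    @property
    def agent_response_ms(self) -> Optional[float]:
        """Sum of the three signals, or None if any of them is missing."""
        parts = (self.endpoint_ms, self.llm_ttft_ms, self.tts_ms)
        if any(p is None for p in parts):
            return None
        return sum(parts)


print(LatencySketch(300, 600, 400).agent_response_ms)   # 1300
print(LatencySketch(300, None, 400).agent_response_ms)  # None
```

Returning ``None`` instead of a partial sum keeps dashboards from averaging in turns where one of the signals was never recorded.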

**Bug #7 (HIGH) — outbound TwiML races tunnel startup**
* The documented ``void phone.serve(...) → setTimeout → phone.call(...)``
  pattern reads ``localConfig.webhookUrl`` while the cloudflared
  hostname is still resolving, producing
  ``wss://undefined/...`` in the dial TwiML and a Twilio 11100 call
  drop on answer.
* New ``phone.tunnelReady`` Promise (TS) / ``phone.tunnel_ready``
  ``asyncio.Future`` (Python). Resolves to the public webhook hostname
  once ``serve()`` knows it (immediately for static webhookUrl,
  after ``startTunnel`` for ``tunnel: true``). Rejects if ``serve()``
  fails before the hostname is known.
* Documented pattern is now ``await phone.tunnelReady`` instead of
  ``setTimeout(10_000)`` — deterministic, no race.
* Same root-cause fix likely also addresses Bug #4 (intermittent WS
  upgrade race) which the acceptance run flagged as a related symptom.

Test totals after the fixes: Python 1064 PASS / 7 skip, TypeScript
1163 PASS / 67 files, cross-SDK chunker parity 53 PASS / 8 XFAIL / 0
FAIL on the 61-case fixture. No regressions.

* fix(bug-4): outbound WS upgrade race — encoded events + ready signal + diagnostics

Three layered fixes targeting the intermittent "outbound call connects
but never receives the WS upgrade" failure (Twilio 11100 on answer)
documented in BUGS.md.

**Root cause A — StatusCallbackEvent encoding**
Twilio expects ``StatusCallbackEvent`` as a multi-value parameter
(repeated keys), NOT a space-separated single value. The previous
``'initiated ringing answered completed'`` form triggered Twilio
notification 21626 ("invalid statusCallbackEvents") on every outbound
call and, on some ingestion paths, also broke the answer-handler
webhook, which is exactly the symptom that produced 11100.

* TypeScript: use ``params.append('StatusCallbackEvent', evt)`` four
  times so URLSearchParams emits repeated query keys.
* Python: pass the canonical twilio-python snake_case key
  ``status_callback_event`` as a list — twilio-python serialises it as
  the multi-value form Twilio expects.
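
The difference between the two encodings can be seen with the standard library alone. This sketch uses ``urllib.parse.urlencode`` rather than twilio-python, so it only illustrates the wire format; the function names are hypothetical.

```python
from urllib.parse import urlencode

EVENTS = ["initiated", "ringing", "answered", "completed"]


def encode_space_separated(events: list) -> str:
    """Previous form: one key whose value is a space-separated string."""
    return urlencode({"StatusCallbackEvent": " ".join(events)})


def encode_repeated(events: list) -> str:
    """Multi-value form: the key is repeated once per event."""
    return urlencode({"StatusCallbackEvent": events}, doseq=True)


print(encode_space_separated(EVENTS))
# StatusCallbackEvent=initiated+ringing+answered+completed
print(encode_repeated(EVENTS))
# StatusCallbackEvent=initiated&StatusCallbackEvent=ringing&StatusCallbackEvent=answered&StatusCallbackEvent=completed
```

The repeated-key output is the same shape that four ``params.append('StatusCallbackEvent', evt)`` calls produce with ``URLSearchParams`` on the TypeScript side.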

**Root cause B — server-not-yet-listening race**
The previous ``phone.tunnelReady`` (TS) / ``phone.tunnel_ready`` (Py)
signal resolves as soon as the cloudflared hostname is known, BEFORE
the embedded HTTP / WS server has finished initialising. ``phone.call``
placed immediately afterwards races the Twilio Media Streams upgrade
and produces a half-ready route → 11100.

New ``phone.ready`` (TS Promise / Py Future) resolves only after:
1. Tunnel hostname known
2. Carrier auto-config complete
3. EmbeddedServer in ``listen`` state (TS) / uvicorn ``started`` flag
   set (Py)

Outbound pattern is now:

```ts
void phone.serve({ agent, tunnel: true });
await phone.ready;        // <-- safe for outbound
await phone.call(...);
```

``tunnelReady`` is kept as a separate signal for integrations that
only need the hostname (e.g. webhook registration), with a docstring
note pointing at ``ready`` for outbound use.
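
To show why the two signals differ, here is a self-contained asyncio sketch. ``PhoneSketch`` is a toy model, not the SDK's class: the sleeps stand in for tunnel resolution, carrier auto-config and server start-up, and the hostname is a placeholder.

```python
import asyncio


class PhoneSketch:
    """Toy model with a hostname-only future and a fully-ready future."""

    def __init__(self) -> None:
        loop = asyncio.get_running_loop()  # construct inside a running loop
        self.tunnel_ready = loop.create_future()
        self.ready = loop.create_future()
        self.listening = False

    async def serve(self) -> None:
        await asyncio.sleep(0.01)              # 1. tunnel hostname known
        host = "example.invalid"               # placeholder hostname
        self.tunnel_ready.set_result(host)
        await asyncio.sleep(0.01)              # 2. carrier auto-config
        await asyncio.sleep(0.01)              # 3. server starts listening
        self.listening = True
        self.ready.set_result(host)

    async def call(self) -> str:
        # A call placed before the server listens has nothing to upgrade to.
        return "connected" if self.listening else "dropped"


async def demo():
    phone = PhoneSketch()
    task = asyncio.create_task(phone.serve())
    await phone.tunnel_ready
    early = await phone.call()   # hostname known, server not yet listening
    await phone.ready
    late = await phone.call()    # all three conditions met
    await task
    return early, late


print(asyncio.run(demo()))  # ('dropped', 'connected')
```

Awaiting only the hostname signal reproduces the race in miniature, while awaiting the later signal guarantees the server is listening before the call is placed.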

**Root cause C — opaque diagnostics**
On call drop the user could not tell whether Twilio rejected the dial,
the tunnel resolved late, or the WS upgrade failed. The new
``phone.call`` flow logs the Twilio notifications URL on every
outbound call ("check here if the call drops with no audio") so
self-diagnosis does not require learning the Twilio API.

**Test parity**
Updated ``test_twilio_statuscallback_always_registered`` to read the
new ``status_callback_event`` key (with fallback to the legacy
``StatusCallbackEvent`` for forward compat). Python 1064 PASS / 7
skip, TypeScript 1163 PASS / 67 files. No regressions.

* chore(docs): mintignore DEVLOG and superpowers/ to unblock Mintlify deployment

DEVLOG.md and superpowers/specs/2026-04-24-patter-feature-test-notebook-design.md fail Mintlify's MDX parser (filenames begin with digits, which MDX treats as JSX expressions). Skip both paths so the docs site can deploy.

* chore: drop DEVLOG/superpowers, fix CI failures

- Remove docs/DEVLOG.md and docs/superpowers/ (internal planning notes, no value to public docs site). The .mintignore introduced in the previous commit is no longer needed and is removed too.
- sdk-ts/src/client.ts: attach a no-op `.catch` to `_ready` and `_tunnelReady` so callers that never await them don't trigger Node's unhandled-rejection warning when serve() validates inputs synchronously. Awaiters of `phone.ready` / `phone.tunnelReady` still see the rejection.
- sdk-ts/package-lock.json: add trailing newline (end-of-file-fixer).
- examples/notebooks/**.ipynb: nbstripout pass — clear cell outputs and execution counts to match the repo convention enforced by .pre-commit-config.yaml.

Labels

dependencies: Pull requests that update a dependency file
github_actions: Pull requests that update GitHub Actions code
