build(deps-dev): bump vitest from 2.1.9 to 4.1.4 in /sdk-ts #6

Closed
dependabot[bot] wants to merge 1 commit into main from dependabot/npm_and_yarn/sdk-ts/vitest-4.1.4


dependabot[bot] commented on behalf of github on Apr 10, 2026

Bumps vitest from 2.1.9 to 4.1.4.

Release notes

Sourced from vitest's releases.

v4.1.4

   🚀 Experimental Features

   🐞 Bug Fixes


v4.1.3

   🚀 Experimental Features

   🐞 Bug Fixes


v4.1.2

This release bumps Vitest's flatted version and removes version pinning to resolve flatted's CVE related issues (vitest-dev/vitest#9975).

... (truncated)

Commits
  • ac04bac chore: release v4.1.4
  • 82c858d chore: Remove no-op function in plugin config logic (#8501)
  • d4fbb5c feat(experimental): support aria snapshot (#9668)
  • b77de96 feat(reporter): add filterMeta option to json reporter (#10078)
  • a120e3a feat(experimental): expose assertion as a public field (#10095)
  • 5375780 feat(coverage): default to text reporter skipFull if agent detected (#10018)
  • a1b5f0f fix: make expect(..., message) consistent as error message prefix (#10068)
  • 203f07a fix: use "black" foreground for labeled terminal message to ensure contrast (...
  • 2dc0d62 chore: release v4.1.3
  • 7827363 feat: add experimental.preParse flag (#10070)
  • Additional commits viewable in compare view
Maintainer changes

This version was pushed to npm by [GitHub Actions](https://www.npmjs.com/~GitHub Actions), a new releaser for vitest since your current version.


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest) from 2.1.9 to 4.1.4.
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.4/packages/vitest)

---
updated-dependencies:
- dependency-name: vitest
  dependency-version: 4.1.4
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies and javascript labels on Apr 10, 2026
nicolotognoni deleted the dependabot/npm_and_yarn/sdk-ts/vitest-4.1.4 branch on April 10, 2026 at 16:23
dependabot Bot commented on behalf of github Apr 10, 2026

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. You can also ignore all major, minor, or patch releases for a dependency by adding an ignore condition with the desired update_types to your config file.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.

nicolotognoni added a commit that referenced this pull request Apr 21, 2026
…#66)

* fix(deps): pin websockets>=14 and add python-multipart

Fixes BUG #7 and #9 from acceptance suite.

- websockets: pin >=14,<16. The 'additional_headers=' kwarg used by the
  OpenAI Realtime, Deepgram STT and ElevenLabs ConvAI adapters is only
  supported on the new asyncio client that became the default in 14.0.
  Under 13.x the call failed with 'got an unexpected keyword argument
  additional_headers', blocking every streaming provider.
- python-multipart: add to the base install. Starlette >= 0.45 raises on
  'await request.form()' without python-multipart installed, so every
  Twilio webhook returned 422 and the call was silently dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server): repair Twilio & Telnyx webhook stack

Fixes BUG #6, #8, #16 from acceptance suite.

- #8 Request/Response import lifted to the top of server.py. With
  ``from __future__ import annotations`` in place, FastAPI's
  ``get_type_hints(handler)`` resolved the 'Request' annotation against
  module globals where only WebSocket was imported. The ForwardRef stayed
  unresolved, FastAPI classified the parameter as a query-string field
  and every Twilio/Telnyx webhook POST returned HTTP 422 before the
  handler body could run. Local mode was fundamentally broken on 0.4.3.
- #6 dashboard tracking of failed outbound calls: new route
  ``POST /webhooks/twilio/status`` consumes Twilio statusCallback events
  (initiated/ringing/answered/completed/no-answer/busy/failed) and feeds
  them into MetricsStore.update_call_status. Operators now see every
  dialled attempt in the dashboard, including ones that never reach
  media.
- #16 Telnyx Call Control: ``/webhooks/telnyx/voice`` now POSTs
  ``actions/answer`` on call.initiated and ``actions/streaming_start``
  on call.answered against the REST API and returns empty HTTP 200.
  Previously the route returned a JSON ``{commands: [...]}`` body that
  Telnyx silently discards — the call rang forever.

Twilio voice route also falls back to the ``Caller`` / ``Called`` form
fields when ``From`` / ``To`` are empty (see BUG #6 notes).
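
Illustrative sketch of the #8 failure mode, using only ``typing`` from the
standard library rather than FastAPI itself. ``handler`` and the placeholder
``Request`` class are hypothetical stand-ins, and the string annotation mimics
what ``from __future__ import annotations`` does to every annotation:

```python
import typing


def handler(request: "Request") -> None:
    """Stand-in webhook handler; 'Request' is intentionally not imported."""


# Postponed evaluation stores the annotation as the plain string 'Request'.
raw_annotation = handler.__annotations__["request"]

# Resolution fails while the name is missing from the lookup namespace.
try:
    typing.get_type_hints(handler, globalns={})
    resolved_before = True
except NameError:
    resolved_before = False


class Request:
    """Hypothetical placeholder for the real request class."""


# Supplying the name (which is what a top-level import does for the module
# globals) lets the same call resolve the annotation to the class.
hints_after = typing.get_type_hints(handler, globalns={"Request": Request})
```

The explicit ``globalns`` argument only keeps the sketch self-contained; in a
real module the lookup namespace is the module's own globals.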

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(telnyx): WS event shape, frame format, track filter, audio sender

Fixes BUG #17, #18, #19 from acceptance suite.

- #17 Media-stream WebSocket events use ``event`` (start / media / stop /
  dtmf / error / connected), not the Call Control REST notification
  ``event_type``. Audio payload lives in ``data.media.payload`` (base64),
  caller/callee live in ``data.start.{from,to}``. Previously the bridge
  matched ``event_type == "stream_started"`` and looked for audio in
  ``payload.audio.chunk`` — no media chunk was ever decoded, so the
  agent never heard the caller.
- #18 Outbound wire format corrected to
  ``{"event":"media","media":{"payload":b64}}`` and
  ``{"event":"clear"}``. The legacy ``event_type``/``payload.audio.chunk``
  shape was silently dropped by Telnyx, so the caller heard silence.
- #19 When ``stream_track=both_tracks`` Telnyx emits media for both the
  caller leg and the agent's own outbound leg; forwarding the outbound
  echo broke OpenAI Realtime turn detection ("speech_started" never
  fired). The bridge now drops every frame whose ``media.track`` is not
  ``"inbound"`` before forwarding.

OpenAI Realtime handler on Telnyx is now configured with
``audio_format="g711_ulaw"`` to match the PCMU 8 kHz bidirectional
stream. The TelnyxAudioSender transcodes PCM16 16 kHz → mulaw 8 kHz for
pipeline / ConvAI providers (PCM16 TTS output) and passes mulaw bytes
through when OpenAI Realtime provides them directly.
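
A minimal sketch of the wire shapes described in #17/#18/#19, assuming frames
arrive as JSON text. The function names are hypothetical and not the bridge's
actual API; only the ``event`` / ``media.payload`` / ``media.track`` fields
come from the notes above:

```python
import base64
import json
from typing import Optional


def build_media_frame(audio: bytes) -> str:
    """Outbound audio frame: base64 payload under media.payload."""
    payload = base64.b64encode(audio).decode("ascii")
    return json.dumps({"event": "media", "media": {"payload": payload}})


def build_clear_frame() -> str:
    """Outbound frame asking the far end to flush queued audio."""
    return json.dumps({"event": "clear"})


def extract_inbound_audio(raw: str) -> Optional[bytes]:
    """Return decoded caller audio, or None for frames that must be skipped."""
    message = json.loads(raw)
    if message.get("event") != "media":
        return None  # start / stop / dtmf / error / connected carry no audio
    media = message.get("media", {})
    # Assumption: a missing track is treated as inbound (older payloads).
    if media.get("track", "inbound") != "inbound":
        return None  # drop the agent's own outbound echo
    return base64.b64decode(media["payload"])
```

A frame built with ``build_media_frame`` has no ``track`` field, so it decodes
back to the original bytes, while the same frame tagged ``"track": "outbound"``
is skipped.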

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(twilio): OpenAI Realtime audio format + pass-through audio sender

Fixes BUG #10 from acceptance suite.

OpenAI Realtime emits PCM16 at 24 kHz natively. The Twilio handler
previously left ``audio_format`` at the pcm16 default and fed the bytes
into TwilioAudioSender, which unconditionally ran
``resample_16k_to_8k(pcm) → pcm16_to_mulaw`` assuming 16 kHz input.
24 kHz bytes run through a 16→8 kHz resampler play back at about two-thirds
(16/24 ≈ 67%) of the correct speed, so the caller heard a deep, slurred voice.

Fix: on the Twilio path construct
``OpenAIRealtimeStreamHandler(..., audio_format="g711_ulaw")`` so
OpenAI emits Twilio-native mulaw 8 kHz directly. Pair it with
``TwilioAudioSender(..., input_is_mulaw_8k=True)`` which skips the
resample+mulaw encode and forwards the bytes as-is. Pipeline and ConvAI
still produce PCM16 @ 16 kHz and go through the default transcoding
path.
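
The rate mismatch can be worked through numerically. This is only the
arithmetic behind the slowdown, with hypothetical helper names, not code from
the handler:

```python
def playback_speed(actual_hz: int, assumed_hz: int) -> float:
    """Speed factor heard when audio sampled at actual_hz is treated as assumed_hz."""
    return assumed_hz / actual_hz


def stretched_duration_s(n_samples: int, assumed_hz: int, out_hz: int) -> float:
    """Seconds of audio on the wire after resampling assumed_hz -> out_hz."""
    out_samples = n_samples * out_hz / assumed_hz
    return out_samples / out_hz


# One second of 24 kHz audio (24 000 samples) misread as 16 kHz input:
speed = playback_speed(24_000, 16_000)
duration = stretched_duration_s(24_000, 16_000, 8_000)
```

One second of speech is stretched to 1.5 s on the 8 kHz line, which is the
two-thirds playback speed the caller heard.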

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(pipeline): STT path + hooks + barge-in + dedup + hallucination filter

Fixes BUG #12, #15, #20, #22 from acceptance suite.

- #12 Pipeline on Twilio: the bridge converts mulaw 8 kHz → PCM16 16 kHz
  before STT. The STT adapter used to be built with ``for_twilio=True``
  (mulaw 8 kHz) — Deepgram decoded the already-PCM bytes as mulaw and
  produced garbage transcripts. The pipeline now always configures
  linear16 @ 16 kHz.
- #15 ``PipelineHooks.before_send_to_stt`` was declared but never
  invoked. ``PipelineStreamHandler.on_audio_received`` now runs the
  hook on every inbound chunk and drops the chunk when it returns
  ``None``.
- #20 Pipeline barge-in: ``on_audio_received`` used to skip STT when
  ``_is_speaking=True``, blocking any barge-in detection. It now keeps
  forwarding caller audio to STT during TTS (unless
  ``agent.barge_in_threshold_ms == 0``), and ``_stt_loop`` sets
  ``_is_speaking=False`` and calls ``send_clear`` on any Deepgram transcript
  with text observed while speaking. Effective latency floor is
  ~800 ms (Deepgram interim), so noisy / short TTS sentences may not
  actually be interrupted — full sub-second barge-in requires a
  server-side VAD (Silero, already supported via ``agent.vad=``).
- #22 Dedup + throttle + hallucination filter. Low-quality STT (Whisper
  on mulaw 8 kHz) emits several nearly-identical final transcripts in
  1–2 s ("you", "you", "you") and hallucinates short fillers from
  silence / TTS echo. Each used to kick off a new LLM+TTS turn, and
  consecutive turns overlapped on the caller's line. Fix in
  ``_stt_loop``: dedup identical finals within 2 s, drop any final
  within 500 ms of the last committed turn, drop a curated blacklist
  of fillers (``you``, ``thank you``, ``yeah``, ``uh``, ``.``…).

Also adds the 8 kHz output path used by the Telnyx handler via a
shared linear16 STT factory in ``handlers/common.py``.
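
A rough sketch of the #22 gate, under stated assumptions: the class name,
the normalisation step and the choice to measure both windows from the last
committed turn are illustrative, and the filler set is only the subset quoted
above, not the curated list in ``_stt_loop``:

```python
import string
from typing import Optional

FILLERS = {"you", "thank you", "yeah", "uh", ""}  # illustrative subset
DEDUP_WINDOW_S = 2.0
THROTTLE_S = 0.5


class TranscriptGate:
    """Decides whether a final transcript should start a new LLM+TTS turn."""

    def __init__(self) -> None:
        self._last_text: Optional[str] = None
        self._last_commit: Optional[float] = None

    def accept(self, text: str, now: float) -> bool:
        # Normalise case and surrounding punctuation so "You." matches "you".
        normalized = text.strip().strip(string.punctuation).strip().lower()
        if normalized in FILLERS:
            return False  # hallucinated filler or empty after stripping
        if self._last_commit is not None:
            elapsed = now - self._last_commit
            if elapsed < THROTTLE_S:
                return False  # too close to the previous committed turn
            if normalized == self._last_text and elapsed < DEDUP_WINDOW_S:
                return False  # identical final inside the dedup window
        self._last_text = normalized
        self._last_commit = now
        return True
```

Passing ``now`` explicitly, rather than reading the clock inside the gate,
keeps the sketch deterministic to test.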

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(providers): voice name resolver, Deepgram knobs, TTS streaming resample

Fixes BUG #11, #13, #23 from acceptance suite.

- #11 ElevenLabs voice-name resolver. ``Patter.elevenlabs(voice="rachel")``
  (the quickstart default) used to pass "rachel" verbatim into the
  /text-to-speech/{voice_id}/stream URL, which 404s because the API
  only accepts the opaque 20-char voice IDs. The new ``resolve_voice_id``
  helper maps ~45 common display names (rachel, adam, matilda, alloy, …)
  to their voice IDs and returns unknown strings unchanged so custom voices
  keep working. Removes the ad-hoc "alloy" substitution in
  stream_handler.
- #13 DeepgramSTT exposes ``endpointing_ms`` / ``utterance_end_ms`` /
  ``smart_format`` / ``interim_results`` / ``vad_events`` kwargs and the
  ``Patter.deepgram(...)`` factory forwards them via ``STTConfig.options``.
  Defaults tuned for telephony (endpointing_ms=150, utterance_end_ms=1000).
  The transcript gate is loosened to ``is_final OR speech_final`` so we
  don't wait up to utterance_end_ms on every turn. Pipeline turn latency
  on Twilio drops from ~4 s to ~2.2 s.
- #23 OpenAI TTS streaming resample. ``response_format=pcm`` returns
  24 kHz PCM16 chunks that must be downsampled to 16 kHz. The old
  implementation did the 3:2 downsample chunk-by-chunk without
  preserving filter state, so cross-chunk alignment drifted and the
  caller heard pops / dropped audio. Now uses ``audioop.ratecv`` with
  a persistent ``state`` and stashes odd trailing bytes between calls.
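
Why #23 needs state between chunks can be shown with a deliberately
simplified 3:2 decimator. This is a stand-in for illustration only: it is not
``audioop.ratecv`` and not the SDK's resampler, and the class name is
hypothetical. It assumes little-endian PCM16 input:

```python
import struct


class Streaming24kTo16k:
    """Toy 24 kHz -> 16 kHz converter that carries leftover bytes across chunks."""

    def __init__(self) -> None:
        self._pending = b""  # bytes that did not fill a whole 3-sample group

    def feed(self, chunk: bytes) -> bytes:
        data = self._pending + chunk
        usable = len(data) - (len(data) % 6)  # 3 samples x 2 bytes per group
        self._pending = data[usable:]
        out = bytearray()
        for s0, s1, s2 in struct.iter_unpack("<3h", data[:usable]):
            # Emit 2 samples for every 3: the first, then the mean of the rest.
            out += struct.pack("<2h", s0, (s1 + s2) // 2)
        return bytes(out)


pcm = struct.pack("<12h", *range(0, 1200, 100))  # 12 samples of fake audio
one_shot = Streaming24kTo16k().feed(pcm)

# Same bytes split at awkward (odd-byte) boundaries.
stream = Streaming24kTo16k()
chunked = b"".join(stream.feed(part) for part in (pcm[:5], pcm[5:12], pcm[12:13], pcm[13:]))
```

Because the leftover bytes are held back rather than discarded, the chunked
output matches the single-shot output byte for byte.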

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scheduler,fallback): per-loop schedulers + async close + cancel probes

Fixes BUG #2, #3, #5 from acceptance suite.

- #3 Scheduler singleton dies across event loops. The old
  ``_scheduler_singleton`` bound to the first loop it saw; pytest-asyncio
  closed that loop at the end of every test and the next scheduled
  callback crashed with ``Event loop is closed``. Replaced by
  ``_schedulers_by_loop`` — a dict keyed on ``id(asyncio.get_event_loop())``
  that drops stale entries when the owning loop has been closed. Adds
  ``reset_for_tests()`` to tear down every cached scheduler; the public
  ``shutdown()`` is now an alias for it.
- #2 ``FallbackLLMProvider.complete_stream`` — convenience wrapper
  that flattens ``{"type": "text"}`` chunks so callers don't have to
  switch on chunk type. Mirrors the TS SDK's ``completeStream``.
- #5 ``FallbackLLMProvider`` recovery task leak. ``_probe`` tasks
  created by ``_start_recovery`` were never awaited, and pytest-asyncio
  tears the loop down before they finish. Adds ``aclose()`` and async
  context manager support (``__aenter__``/``__aexit__``) so callers can
  ``async with FallbackLLMProvider(...)`` and have the probes cancelled
  + awaited on exit.
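
A minimal sketch of the cancel-and-await pattern described in #5, using only
``asyncio``. ``ProbeOwner`` and ``start_probe`` are hypothetical names; the
real ``FallbackLLMProvider`` internals are not shown in this message:

```python
import asyncio
from typing import List


class ProbeOwner:
    """Owns background probe tasks and guarantees they are cleaned up."""

    def __init__(self) -> None:
        self._probes: List[asyncio.Task] = []

    def start_probe(self, interval_s: float = 60.0) -> asyncio.Task:
        task = asyncio.get_running_loop().create_task(self._probe(interval_s))
        self._probes.append(task)
        return task

    async def _probe(self, interval_s: float) -> None:
        while True:  # placeholder for a periodic health check
            await asyncio.sleep(interval_s)

    async def aclose(self) -> None:
        for task in self._probes:
            task.cancel()
        # Await the cancelled tasks so nothing is left pending on the loop.
        await asyncio.gather(*self._probes, return_exceptions=True)
        self._probes.clear()

    async def __aenter__(self) -> "ProbeOwner":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.aclose()


async def demo() -> bool:
    async with ProbeOwner() as owner:
        task = owner.start_probe()
        await asyncio.sleep(0)  # let the probe start running
    return task.cancelled()
```

``asyncio.run(demo())`` returns once the probe has been cancelled and awaited,
so the loop closes without a pending-task warning.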

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tools): @tool adapter unpacks kwargs into user function

Fixes BUG #21 from acceptance suite.

``@tool`` exposed the raw user function as ``handler`` but
``services/tool_executor._execute_handler`` always calls
``handler(arguments_dict, call_context_dict)``. Every typed tool — e.g.
``async def check_order(order_id: str)`` — crashed at runtime with
"takes 1 positional argument but 2 were given" and OpenAI Realtime
received a fallback error JSON instead of the tool's result.

The decorator now wraps the user function in an async adapter whose
signature matches the executor's contract ``(arguments, call_context)``.
The adapter inspects the original signature: if it already takes
``(arguments, call_context)`` positionally it passes through unchanged,
otherwise it filters ``arguments`` to the user function's declared
parameter names and calls ``fn(**args)``. The original function is
still reachable via ``handler.__wrapped__`` for introspection.
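
One way such an adapter could look, as a sketch only. ``adapt_tool`` is a
hypothetical name, and detecting the pass-through case by parameter names is
an assumption; the decorator's actual detection logic is not spelled out here:

```python
import asyncio
import functools
import inspect
from typing import Any, Awaitable, Callable, Dict


def adapt_tool(fn: Callable[..., Any]) -> Callable[[Dict[str, Any], Dict[str, Any]], Awaitable[Any]]:
    """Wrap fn so it can be called as handler(arguments, call_context)."""
    params = list(inspect.signature(fn).parameters)
    passthrough = params == ["arguments", "call_context"]

    @functools.wraps(fn)  # also sets handler.__wrapped__ = fn
    async def handler(arguments: Dict[str, Any], call_context: Dict[str, Any]) -> Any:
        if passthrough:
            result = fn(arguments, call_context)
        else:
            # Keep only the keys the user function actually declares.
            result = fn(**{k: v for k, v in arguments.items() if k in params})
        if inspect.isawaitable(result):
            result = await result  # supports both sync and async tools
        return result

    return handler


async def check_order(order_id: str) -> str:
    return f"order {order_id}"


order_handler = adapt_tool(check_order)
order_result = asyncio.run(order_handler({"order_id": "A1", "unused": 1}, {}))
```

The extra ``unused`` key is filtered out before the call, so the typed tool
receives only ``order_id``.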

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dashboard): track failed & no-answer outbound calls

Fixes BUG #6 from acceptance suite.

The embedded dashboard used to show only calls that made it to the
media channel. An outbound dial that rang out (``status=no-answer``,
``busy``, ``failed``) never produced a webhook hit, so the row never
appeared in the UI even though Twilio billed for the attempt.

Changes:

- ``MetricsStore.record_call_initiated({call_id, caller, callee, …})``
  pre-registers the call when ``Patter.call()`` returns, so the row
  shows up the moment the dial is dispatched.
- ``MetricsStore.update_call_status(call_id, status, **extra)`` promotes
  the record through the lifecycle (ringing → in-progress → completed /
  no-answer / busy / failed / canceled). Terminal states move the row
  from active to the completed list so the UI timer freezes. Fed by
  the new ``/webhooks/twilio/status`` route.
- ``MetricsStoreProtocol`` extended with the two new methods.
- ``call_end`` now synthesises a minimal metrics shim when the call
  ended without a full CallMetrics payload, so the UI can still render
  duration / status.
- Dashboard UI: new ``STATUS`` column, filter pills (all / completed /
  failed), colour-coded badges (green / yellow / red / orange), red
  row tint for failed statuses, and SSE listeners for the new
  ``call_initiated`` and ``call_status`` events. The duration timer
  respects ``data-ended`` so rows that already received call_end stop
  ticking.
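
The lifecycle above can be sketched as a small in-memory store. This is an
illustration under assumptions, not the real ``MetricsStore``: the class name,
the ``"initiated"`` starting status and the record layout are made up, and
only the two method names and the terminal statuses come from the notes above:

```python
from typing import Any, Dict, List

TERMINAL_STATUSES = {"completed", "no-answer", "busy", "failed", "canceled"}


class MiniMetricsStore:
    """Tracks outbound calls from dispatch until a terminal status arrives."""

    def __init__(self) -> None:
        self.active: Dict[str, Dict[str, Any]] = {}
        self.completed: List[Dict[str, Any]] = []

    def record_call_initiated(self, call: Dict[str, Any]) -> None:
        # Pre-register the row as soon as the dial is dispatched.
        self.active[call["call_id"]] = {**call, "status": "initiated"}

    def update_call_status(self, call_id: str, status: str, **extra: Any) -> None:
        record = self.active.get(call_id)
        if record is None:
            return  # unknown or already-finished call: ignore the event
        record.update(extra, status=status)
        if status in TERMINAL_STATUSES:
            # Terminal states leave the active set so the UI timer can stop.
            self.completed.append(self.active.pop(call_id))
```

A call that rings out therefore still ends up in ``completed`` with status
``no-answer`` even though no media webhook was ever received.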

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): ring_timeout + agent.hooks/vad/audio_filter forwarding + call pre-register

Fixes BUG #14 + IMP2 + completes BUG #6 from acceptance suite.

- #14 ``Patter.agent(...)`` used to drop ``hooks``, ``text_transforms``,
  ``vad``, ``audio_filter``, ``background_audio`` and
  ``barge_in_threshold_ms`` even though the ``Agent`` dataclass accepted
  them. The factory now forwards all fields.
- IMP2 ``ring_timeout: int | None`` kwarg on ``Patter.call(...)``.
  Forwarded to Twilio as ``Timeout=`` and to Telnyx as ``timeout_secs``
  (added to ``TelnyxAdapter.initiate_call``). Italian mobile carriers
  silence-drop the default ~28 s ring on US→IT calls; the quickstart
  now works with ``ring_timeout=60``.
- #6 ``Patter.call()`` pre-registers the dialled call in the
  MetricsStore via ``record_call_initiated(...)`` before returning, so
  the dashboard shows the attempt even when the callee never picks up.
  The Twilio branch also passes ``StatusCallbackEvent="initiated
  ringing answered completed"`` so we receive every state transition.

Also exposes the new Deepgram knobs on the ``Patter.deepgram(...)``
factory (``model``, ``endpointing_ms``, ``utterance_end_ms``,
``smart_format``, ``interim_results``).
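
A sketch of how one ``ring_timeout`` value maps onto the two providers. The
helper name is hypothetical; the parameter names ``Timeout``,
``timeout_secs`` and ``StatusCallbackEvent`` are the ones quoted above, and
nothing here performs a real API call:

```python
from typing import Any, Dict, Optional


def outbound_call_params(provider: str, ring_timeout: Optional[int] = None) -> Dict[str, Any]:
    """Build the provider-specific extras for an outbound dial."""
    params: Dict[str, Any] = {}
    if provider == "twilio":
        # Always request every state transition so failed dials are visible.
        params["StatusCallbackEvent"] = "initiated ringing answered completed"
        if ring_timeout is not None:
            params["Timeout"] = ring_timeout
    elif provider == "telnyx":
        if ring_timeout is not None:
            params["timeout_secs"] = ring_timeout
    else:
        raise ValueError(f"unsupported provider: {provider}")
    return params
```

Leaving ``ring_timeout`` as ``None`` omits the key entirely, so the provider's
own default ring duration applies.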

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): models barge_in_threshold_ms + STT/TTS options, top-level mix_pcm, docstring

Rolls up the smaller API additions — BUG #1, #04g, extras from #13/#15.

- ``Agent.barge_in_threshold_ms`` (default 300) — hangover window before
  treating caller audio as barge-in. Used by PipelineStreamHandler and
  mirrored on TS ``AgentOptions.bargeInThresholdMs``.
- ``STTConfig.options`` / ``TTSConfig.options`` — provider-specific
  knobs bag (e.g. Deepgram endpointing) that ``common._create_stt_from_config``
  unpacks when building the adapter. Keeps older ``STTConfig`` callers
  forward-compatible.
- Top-level ``patter.mix_pcm(agent, bg, ratio)`` — parity alias for the
  TS ``mixPcm(...)`` standalone helper (BUG #04g). Thin wrapper over
  the existing ``PcmMixer`` class with an explicit ratio.
- ``patter/__init__.py`` docstring enumerates the installable extras
  (scheduling, anthropic, groq, cerebras, google, …) so ``pip install
  getpatter`` users discover them without hitting a
  ``RuntimeError: Scheduling requires the 'apscheduler' package`` at
  call time (BUG #1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: align Python tests with BUG #12/#16/#17/#18/#19/#21 fixes

- ``test_local_mode``: pipeline Twilio bridge test now patches
  ``DeepgramSTT`` directly instead of ``DeepgramSTT.for_twilio`` —
  after BUG #12 the pipeline path uses the default linear16 16 kHz
  adapter on both telephony providers.
- ``test_new_features``: ``machine_detection=False`` no longer asserts
  an empty extra_params dict; BUG #6 now always wires a
  ``StatusCallback`` so the dashboard sees failed attempts. The test
  keeps its original intent (AMD-specific params absent) and additionally
  checks the status callback is set.
- ``test_server_unit::TestTelnyxVoiceRoute``: rewritten to assert the
  REST ``actions/answer`` POST after BUG #16 — the route no longer
  returns a JSON commands body.
- ``test_telnyx_bridge_unit``: helper messages updated to the
  ``{event: start|media|stop}`` wire shape from BUG #17; the OpenAI
  Realtime audio_format assertion now expects ``g711_ulaw`` (from #18).
- ``test_telnyx_handler_unit``: TelnyxAudioSender test uses
  ``input_is_mulaw_8k=True`` so the round-trip byte assertion still
  holds with the new PCM16→mulaw transcode path (#18). Wire format
  asserts ``event == "media"`` / ``event == "clear"``.
- ``test_tool_decorator``: invokes handlers with the new adapter
  signature ``(arguments_dict, call_context_dict)`` (#21), including a
  sync-wrapped handler awaited through the adapter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ts/api): Python parity — auto-detect local, static factories, ring_timeout

Brings TS parity with Python on BUG #4 parity items + #14 agent fields
+ IMP2 ring_timeout.

- Auto-detect local mode: ``new Patter({twilioSid, twilioToken, …})``
  without explicit ``mode: 'local'`` is now treated as local when
  apiKey is missing (mirrors Python).
- Static provider factories: ``Patter.deepgram(...)``,
  ``Patter.elevenlabs(...)``, ``Patter.whisper(...)``,
  ``Patter.openaiTts(...)``, ``Patter.cartesia(...)``, ``Patter.rime(...)``,
  ``Patter.lmnt(...)``.
- ``STTConfig.toDict`` / ``TTSConfig.toDict`` are now optional — plain
  object literals ``{provider, apiKey, language}`` are accepted
  everywhere (fallback serialisation is handled via
  ``sttConfigToDict`` / ``ttsConfigToDict`` helpers).
- ``STTConfig`` gets an ``options`` bag (parity with Python BUG #13).
- ``LocalCallOptions.ringTimeout`` forwarded to Twilio as ``Timeout``
  and Telnyx as ``timeout_secs`` — plus ``StatusCallbackEvent`` wired
  so the dashboard sees ringing/no-answer/busy/failed transitions
  (BUG #6).
- ``AgentOptions.bargeInThresholdMs`` (parity with #20 on Python).
- ``LocalOptions.deepgramKey`` / ``elevenlabsKey`` added as
  provider-level defaults (parity with Python Patter() kwargs).
- ``Patter.call()`` Twilio branch pre-registers the dialled call with
  ``metricsStore.recordCallInitiated`` so no-answer / busy / failed
  attempts still show up in the dashboard.
- ``providers.deepgram(...)`` factory exposes the Deepgram knobs
  (model / endpointing_ms / utterance_end_ms / smart_format /
  interim_results) and carries them in ``STTConfig.options``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts/providers): voice resolver, Deepgram knobs, TTS streaming resample

TS parity port of Python BUG #11, #13, #23.

- ElevenLabs: ``resolveVoiceId()`` maps display names (rachel, adam,
  matilda, alloy, …) to the opaque 20-char voice IDs accepted by the
  /text-to-speech/{voice_id}/stream endpoint. Map mirrors the Python
  SDK byte-for-byte.
- DeepgramSTT: constructor overloaded to accept ``DeepgramSTTOptions``
  (endpointingMs / utteranceEndMs / smartFormat / interimResults /
  vadEvents) alongside the legacy positional form. Transcript gate
  loosened to ``is_final OR speech_final`` so short utterances don't
  wait for Deepgram's utterance_end commit.
- OpenAITTS: streaming 24 kHz → 16 kHz resample now carries state
  (``carryByte`` + ``leftover`` samples) between chunks so cross-chunk
  alignment doesn't drift. The legacy ``resample24kTo16k`` static is
  kept as a thin wrapper around the streaming path for the existing
  unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts): Telnyx stack, pipeline hooks/barge-in/dedup, dashboard status, scheduler sync

TS parity port of the Python fixes for BUG #2/#3/#6/#12/#15/#16/#17/#18/#19/#20/#22.

- ``stream-handler.ts``: ``handleAudio`` now runs the
  ``before_send_to_stt`` hook (#15), transcodes Twilio mulaw 8 kHz →
  PCM16 16 kHz unconditionally on the pipeline path (#12), and keeps
  forwarding caller audio during TTS so barge-in can trigger (#20).
  ``processTranscript`` implements the dedup + 500 ms throttle +
  hallucination-word blacklist from #22 and flips ``isSpeaking`` +
  ``sendClear`` on any transcript with text while the agent is
  speaking (#20).
- ``server.ts``: ``TelnyxBridge.sendAudio`` / ``sendClear`` use the
  correct ``{event:"media",media:{payload:b64}}`` wire format (#18);
  the Telnyx WS handler matches ``data.event`` (start / media / stop /
  dtmf / error / connected) and drops frames where
  ``media.track !== "inbound"`` before forwarding (#17, #19); the
  ``/webhooks/telnyx/voice`` route
  POSTs ``actions/answer`` and ``actions/streaming_start`` via the
  Call Control REST API and returns empty HTTP 200 (#16).
  ``TwilioBridge.createStt`` picks linear16 16 kHz when
  ``provider === 'pipeline'`` so Deepgram doesn't decode already-PCM
  bytes as mulaw (#12). A new ``/webhooks/twilio/status`` handler
  consumes Twilio status callbacks and updates the dashboard (#6).
- ``scheduler.ts``: ``scheduleCron`` returns a ``ScheduleHandle``
  synchronously (lazy node-cron import happens in the background) —
  parity with Python #4. ``scheduleInterval`` accepts
  ``{intervalMs}`` or ``{seconds}`` in addition to the legacy
  positional ms, matching Python ``schedule_interval(seconds=...)``.
- ``fallback-provider.ts``: ``completeStream()`` text-only convenience
  generator (#2), ``aclose()`` + ``Symbol.asyncDispose`` so
  ``await using fallback = ...`` parity with Python's
  ``async with FallbackLLMProvider(...)`` (#5).
- ``dashboard/store.ts``: ``recordCallInitiated`` pre-registers
  outbound attempts, ``updateCallStatus`` promotes rows through
  ringing / no-answer / busy / failed and moves terminal states to
  the completed list (#6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ts): align with 0.4.4 wire-format & provider API changes

- ``providers.test.ts``: toDict now surfaces ``options`` when set,
  knobs forwarding verified.
- ``types.test.ts``: toDict optional chain covered.
- ``openai-tts.test.ts``: 1-byte input no longer returns the byte
  verbatim — the streaming resampler stashes it as ``carryByte`` and
  the stateless wrapper flushes only complete samples, so the test now
  asserts an empty buffer.
- ``integration/twilio-pipeline.test.ts`` + ``integration/telnyx-pipeline.test.ts``:
  ``handleAudio`` is now async; tests await it. Telnyx fixture feeds
  mulaw 8 kHz and asserts the transcoded PCM16 16 kHz lands on the STT
  mock (BUG #12 + #19).
- ``unit/server-routes.test.ts``: Telnyx webhook tests assert the
  REST ``actions/answer`` + ``actions/streaming_start`` POSTs and the
  empty HTTP 200 response (BUG #16).
- ``package-lock.json``: refreshed for the sdk-ts worktree so the
  ``0.4.3`` → ``0.4.3-worktree`` alignment is consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(unit): regression coverage for BUG #6/#22/#23 + ring_timeout (IMP2)

Three new unit test files lock in fixes that previously lived in the
acceptance suite as live-call checks:

  test_pipeline_dedup.py (13 tests)
    - Hallucination blacklist: "you", "thank you", ".", case/punctuation
      variants, empty-after-strip all drop silently.
    - 2-second duplicate window with time.time monkeypatched so parity
      with the live Whisper feedback loop is deterministic.
    - 500 ms back-to-back throttle covering legitimate vs spurious
      second turns.
    - Interim / empty finals must not fire on_transcript.

  test_openai_tts_resample.py (7 tests)
    - Cross-chunk ratecv state: multi-chunk stream output matches a
      single-shot resample byte-for-byte.
    - Odd-byte boundary: a chunk ending on a dangling byte must not
      drop the sample.
    - Empty / single-byte / tiny chunks must not crash.
    - Response is always aclosed on both successful and early-exit paths.

  test_twilio_status_and_ring_timeout.py (13 tests)
    - /webhooks/twilio/status routes to update_call_status with parsed
      duration, and survives missing SID, bad duration, and the
      dashboard-disabled path.
    - Twilio signature enforcement on the status endpoint.
    - Twilio ring_timeout -> Timeout REST param, Telnyx -> timeout_secs.
    - Twilio StatusCallback / StatusCallbackEvent are always registered
      on outbound calls so BUG #6 cannot regress.

Full unit suite: 728 passed, 2 skipped.

* docs+ci: latency/provider caveats + audit workflow

README
  - Pipeline turn-latency floor documented (~2.0–2.8 s) with per-stage
    breakdown so users know to switch to `provider="openai_realtime"`
    for sub-second UX.
  - ElevenLabs free-tier library-voice restriction (402) with pointer
    to `ELEVENLABS_VOICE_ID`.
  - Telnyx outbound D38 Outbound Profile requirement.
  - Google Gemini free-tier quota=0 caveat.
  - Whisper hallucination filter documented.
  - `ring_timeout` + status callback description added to call().

.github/workflows/audit.yml (new)
  - pip-audit on sdk-py runtime deps.
  - npm audit on sdk-ts production deps.
  - bandit static analysis with SARIF upload to GitHub Security.
  - Runs on dep-manifest changes, weekly schedule, and manual dispatch.
  - Findings are advisory-only to keep the pipeline from flaking on
    upstream CVE churn (telephony stack pulls many C-wrapped libs).

Baseline audit run: npm=0, bandit medium+/high-confidence=0,
pip-audit=2 (pytest dev-only + transformers optional-extra only).

* docs(readme): remove local-measured latency numbers from Voice Modes

The millisecond ranges previously listed for each provider came from a
single local benchmark run and are neither representative nor a target.
Keep the modes table qualitative and replace the per-stage breakdown
with a short note that latency is inherited from the chosen providers —
no hard numbers we don't want callers anchoring on.

* test(unit): bug coverage gaps — BUG #15/#19/#20

Three new unit test modules fill the remaining coverage gap for the
bugs fixed on this branch:

  test_pipeline_bargein.py (7 tests) — BUG #20
    - Interim transcript during TTS triggers send_clear + is_speaking=False.
    - record_turn_interrupted is fired on the metrics accumulator.
    - send_clear throwing does not crash the STT loop (fail-open).
    - No barge-in when the agent is idle or the transcript has no text.
    - Final transcripts also trigger the barge-in branch before the
      downstream LLM turn runs.

  test_before_send_to_stt_hook.py (9 tests) — BUG #15
    - Sync / async hook returning None drops the chunk (zero STT sends).
    - Returning modified bytes forwards the new buffer verbatim.
    - Hook receives the decoded PCM, not the raw mulaw payload.
    - Raising hooks fail-open: original audio still reaches STT.
    - Missing hook / hooks instance with before_send_to_stt=None are
      both bypass paths that must still forward audio.

  test_telnyx_track_filter.py (5 tests) — BUG #19
    - track=inbound forwards, track=outbound drops.
    - Missing `track` field defaults to inbound (legacy Telnyx payloads).
    - Mixed stream: only inbound frames reach the handler, in order.
    - Unknown track values are skipped defensively.

Full unit suite: 749 passed, 2 skipped (+21 from this commit).

* feat(sdk-py): add cartesia/rime/lmnt static factories + vad_events to deepgram

Brings Python SDK to parity with sdk-ts:
- Adds Patter.cartesia / Patter.rime / Patter.lmnt static methods so local-mode
  users can configure these TTS providers the same way they do in TypeScript.
- Adds the missing vad_events keyword to Patter.deepgram and the
  patter.providers.deepgram factory — the DeepgramSTT ctor already accepted
  it, but the public config helper silently dropped the flag.

* chore: bump to 0.4.4

Regression suites re-run after the bump:
  - sdk-py: 749 passed, 2 skipped
  - sdk-ts: 932 passed (57 test files, including soak)

* fix(ci): integration tests on 0.4.4 wire format + misc hygiene

Addresses the five failing CI checks on PR #66.

Telnyx integration tests (test_telnyx_{convai,pipeline,realtime}.py)
  - ``_telnyx_stream_started`` / ``_telnyx_media_event`` /
    ``_telnyx_stream_stopped`` helpers migrated from the pre-0.4.4
    ``{event_type, payload.audio.chunk}`` shape to the real Telnyx
    media-stream wire format ``{event, start|media.payload}`` (BUG
    #17/#18). Without this the bridge silently drops every test frame
    and 11 integration tests fail with "handler called 0 times".
  - ``test_audio_format_pcm16`` renamed to ``test_audio_format_g711_ulaw``
    and the assertion flipped — Telnyx is PCMU 8 kHz bidirectional
    (BUG #19), Realtime runs on ``g711_ulaw`` so both legs stay
    pass-through.

sdk-ts/src/scheduler.ts
  - Removed the trailing blank line that broke the pre-commit
    ``end-of-file-fixer`` hook.

.github/workflows/audit.yml
  - Bandit stock CLI doesn't support ``-f sarif`` — install
    ``bandit-sarif-formatter`` alongside bandit, and guard the
    upload-sarif step with ``hashFiles`` so future formatter breakage
    doesn't fail the job.

Local verification: 802 passed, 4 skipped (sdk-py unit + integration).

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nicolotognoni added a commit that referenced this pull request Apr 21, 2026
* fix(deps): pin websockets>=14 and add python-multipart

Fixes BUG #7 and #9 from acceptance suite.

- websockets: pin >=14,<16. The 'additional_headers=' kwarg used by the
  OpenAI Realtime, Deepgram STT and ElevenLabs ConvAI adapters is only
  supported on the new asyncio client that became the default in 14.0.
  Under 13.x the call failed with 'got an unexpected keyword argument
  additional_headers', blocking every streaming provider.
- python-multipart: add to the base install. Starlette >= 0.45 raises on
  'await request.form()' without python-multipart installed, so every
  Twilio webhook returned 422 and the call was silently dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server): repair Twilio & Telnyx webhook stack

Fixes BUG #6, #8, #16 from acceptance suite.

- #8 Request/Response import lifted to the top of server.py. With
  ``from __future__ import annotations`` in place, FastAPI's
  ``get_type_hints(handler)`` resolved the 'Request' annotation against
  module globals where only WebSocket was imported. The ForwardRef stayed
  unresolved, FastAPI classified the parameter as a query-string field
  and every Twilio/Telnyx webhook POST returned HTTP 422 before the
  handler body could run. Local mode was fundamentally broken on 0.4.3.
- #6 dashboard tracking of failed outbound calls: new route
  ``POST /webhooks/twilio/status`` consumes Twilio statusCallback events
  (initiated/ringing/answered/completed/no-answer/busy/failed) and feeds
  them into MetricsStore.update_call_status. Operators now see every
  dialled attempt in the dashboard, including ones that never reach
  media.
- #16 Telnyx Call Control: ``/webhooks/telnyx/voice`` now POSTs
  ``actions/answer`` on call.initiated and ``actions/streaming_start``
  on call.answered against the REST API and returns empty HTTP 200.
  Previously the route returned a JSON ``{commands: [...]}`` body that
  Telnyx silently discards — the call rang forever.

Twilio voice route also falls back to the ``Caller`` / ``Called`` form
fields when ``From`` / ``To`` are empty (see BUG #6 notes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(telnyx): WS event shape, frame format, track filter, audio sender

Fixes BUG #17, #18, #19 from acceptance suite.

- #17 Media-stream WebSocket events use ``event`` (start / media / stop /
  dtmf / error / connected), not the Call Control REST notification
  ``event_type``. Audio payload lives in ``data.media.payload`` (base64),
  caller/callee live in ``data.start.{from,to}``. Previously the bridge
  matched ``event_type == "stream_started"`` and looked for audio in
  ``payload.audio.chunk`` — no media chunk was ever decoded, so the
  agent never heard the caller.
- #18 Outbound wire format corrected to
  ``{"event":"media","media":{"payload":b64}}`` and
  ``{"event":"clear"}``. The legacy ``event_type``/``payload.audio.chunk``
  shape was silently dropped by Telnyx, so the caller heard silence.
- #19 When ``stream_track=both_tracks`` Telnyx emits media for both the
  caller leg and the agent's own outbound leg; forwarding the outbound
  echo broke OpenAI Realtime turn detection ("speech_started" never
  fired). The bridge now filters ``media.track != "inbound"`` before
  forwarding.

OpenAI Realtime handler on Telnyx is now configured with
``audio_format="g711_ulaw"`` to match the PCMU 8 kHz bidirectional
stream. The TelnyxAudioSender transcodes PCM16 16 kHz → mulaw 8 kHz for
pipeline / ConvAI providers (PCM16 TTS output) and passes mulaw bytes
through when OpenAI Realtime provides them directly.
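
A minimal sketch of the inbound filter and outbound frames described above, assuming the message is already a JSON text frame. The helper names ``extract_inbound_audio``, ``build_media_frame`` and ``build_clear_frame`` are hypothetical and not taken from the SDK; only the ``event`` / ``media.payload`` / ``media.track`` field names come from this commit.

```python
import base64
import json
from typing import Optional


def extract_inbound_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded audio for inbound media frames, or None for anything else."""
    data = json.loads(raw_message)
    if data.get("event") != "media":
        return None  # start / stop / dtmf / error / connected carry no audio
    media = data.get("media") or {}
    # A missing track is treated as inbound; any other value is dropped.
    if media.get("track", "inbound") != "inbound":
        return None
    payload = media.get("payload")
    if not payload:
        return None
    return base64.b64decode(payload)


def build_media_frame(mulaw_bytes: bytes) -> str:
    """Serialise outbound audio in the {"event": "media", ...} shape."""
    encoded = base64.b64encode(mulaw_bytes).decode("ascii")
    return json.dumps({"event": "media", "media": {"payload": encoded}})


def build_clear_frame() -> str:
    """Serialise the frame that asks the provider to flush queued audio."""
    return json.dumps({"event": "clear"})
```

A frame built with ``build_media_frame`` carries no ``track`` field, so it round-trips through ``extract_inbound_audio``; the same frame with ``track`` set to ``outbound`` yields ``None``.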

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(twilio): OpenAI Realtime audio format + pass-through audio sender

Fixes BUG #10 from acceptance suite.

OpenAI Realtime emits PCM16 at 24 kHz natively. The Twilio handler
previously left ``audio_format`` at the pcm16 default and fed the bytes
into TwilioAudioSender, which unconditionally ran
``resample_16k_to_8k(pcm) → pcm16_to_mulaw`` assuming 16 kHz input.
24 kHz bytes run through a 16→8 kHz resampler come out at ~66% of the
correct rate — the caller heard a deep, slurred voice.

Fix: on the Twilio path construct
``OpenAIRealtimeStreamHandler(..., audio_format="g711_ulaw")`` so
OpenAI emits Twilio-native mulaw 8 kHz directly. Pair it with
``TwilioAudioSender(..., input_is_mulaw_8k=True)`` which skips the
resample+mulaw encode and forwards the bytes as-is. Pipeline and ConvAI
still produce PCM16 @ 16 kHz and go through the default transcoding
path.
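
The rate mismatch above can be checked with a line of arithmetic. This is an illustrative helper only; ``playback_speed`` is a hypothetical name, not part of the SDK.

```python
def playback_speed(actual_rate_hz: int, assumed_rate_hz: int) -> float:
    """Speed factor heard when audio sampled at actual_rate_hz is treated as assumed_rate_hz."""
    return assumed_rate_hz / actual_rate_hz


# 24 kHz PCM16 fed into a path that assumes 16 kHz input.
ratio = playback_speed(actual_rate_hz=24_000, assumed_rate_hz=16_000)
print(f"playback runs at {ratio:.3f}x the intended speed")
```

The ratio is 16/24, i.e. two thirds, which matches the slowed, lower-pitched voice reported by callers.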

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(pipeline): STT path + hooks + barge-in + dedup + hallucination filter

Fixes BUG #12, #15, #20, #22 from acceptance suite.

- #12 Pipeline on Twilio: the bridge converts mulaw 8 kHz → PCM16 16 kHz
  before STT. The STT adapter used to be built with ``for_twilio=True``
  (mulaw 8 kHz) — Deepgram decoded the already-PCM bytes as mulaw and
  produced garbage transcripts. The pipeline now always configures
  linear16 @ 16 kHz.
- #15 ``PipelineHooks.before_send_to_stt`` was declared but never
  invoked. ``PipelineStreamHandler.on_audio_received`` now runs the
  hook on every inbound chunk and drops the chunk when it returns
  ``None``.
- #20 Pipeline barge-in: ``on_audio_received`` used to skip STT when
  ``_is_speaking=True``, blocking any barge-in detection. It now keeps
  forwarding caller audio to STT during TTS (unless
  ``agent.barge_in_threshold_ms == 0``), and ``_stt_loop`` flips
  ``_is_speaking=False`` + ``send_clear`` on any Deepgram transcript
  with text observed while speaking. Effective latency floor is
  ~800 ms (Deepgram interim), so noisy / short TTS sentences may not
  actually be interrupted — full sub-second barge-in requires a
  server-side VAD (Silero, already supported via ``agent.vad=``).
- #22 Dedup + throttle + hallucination filter. Low-quality STT (Whisper
  on mulaw 8 kHz) emits several nearly-identical final transcripts in
  1–2 s ("you", "you", "you") and hallucinates short fillers from
  silence / TTS echo. Each used to kick off a new LLM+TTS turn, and
  consecutive turns overlapped on the caller's line. Fix in
  ``_stt_loop``: dedup identical finals within 2 s, drop any final
  within 500 ms of the last committed turn, drop a curated blacklist
  of fillers (``you``, ``thank you``, ``yeah``, ``uh``, ``.``…).

Also adds the 8 kHz output path used by the Telnyx handler via a
shared linear16 STT factory in ``handlers/common.py``.
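
A rough sketch of the #22 gate, assuming finals arrive one at a time as plain strings. ``TranscriptGate``, the ``FILLERS`` subset and the injectable ``clock`` are illustrative assumptions; the real logic lives in ``_stt_loop`` and may differ in detail (for example in how the 2 s window is anchored).

```python
import string
import time
from typing import Callable, Optional

# Illustrative subset of the filler blacklist; "" covers punctuation-only finals.
FILLERS = {"you", "thank you", "yeah", "uh", ""}


class TranscriptGate:
    """Decides whether a final transcript should start a new LLM+TTS turn."""

    def __init__(
        self,
        dedup_window_s: float = 2.0,
        throttle_s: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._dedup_window_s = dedup_window_s
        self._throttle_s = throttle_s
        self._clock = clock
        self._last_text: Optional[str] = None
        self._last_commit_at: Optional[float] = None

    def accept(self, text: str) -> bool:
        now = self._clock()
        normalized = text.strip().lower().strip(string.punctuation + " ")
        if normalized in FILLERS:
            return False  # hallucinated filler or empty after stripping
        if self._last_commit_at is not None:
            elapsed = now - self._last_commit_at
            if elapsed < self._throttle_s:
                return False  # too soon after the last committed turn
            if normalized == self._last_text and elapsed < self._dedup_window_s:
                return False  # repeat of the previous final
        self._last_text = normalized
        self._last_commit_at = now
        return True
```

Passing a fake ``clock`` keeps the time windows deterministic in tests, similar in spirit to monkeypatching ``time.time``.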

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(providers): voice name resolver, Deepgram knobs, TTS streaming resample

Fixes BUG #11, #13, #23 from acceptance suite.

- #11 ElevenLabs voice-name resolver. ``Patter.elevenlabs(voice="rachel")``
  (the quickstart default) used to pass "rachel" verbatim into the
  /text-to-speech/{voice_id}/stream URL, which 404s because the API
  only accepts the opaque 20-char voice IDs. The new ``resolve_voice_id``
  helper maps ~45 common display names (rachel, adam, matilda, alloy, …)
  to their voice IDs and returns unknown strings unchanged so custom voices

  keep working. Removes the ad-hoc "alloy" substitution in
  stream_handler.
- #13 DeepgramSTT exposes ``endpointing_ms`` / ``utterance_end_ms`` /
  ``smart_format`` / ``interim_results`` / ``vad_events`` kwargs and the
  ``Patter.deepgram(...)`` factory forwards them via ``STTConfig.options``.
  Defaults tuned for telephony (endpointing_ms=150, utterance_end_ms=1000).
  The transcript gate is loosened to ``is_final OR speech_final`` so we
  don't wait up to utterance_end_ms on every turn. Pipeline turn latency
  on Twilio drops from ~4 s to ~2.2 s.
- #23 OpenAI TTS streaming resample. ``response_format=pcm`` returns
  24 kHz PCM16 chunks that must be downsampled to 16 kHz. The old
  implementation did the 3:2 downsample chunk-by-chunk without
  preserving filter state, so cross-chunk alignment drifted and the
  caller heard pops / dropped audio. Now uses ``audioop.ratecv`` with
  a persistent ``state`` and stashes odd trailing bytes between calls.
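
The odd-byte stash in #23 can be sketched on its own. This example only shows the sample-alignment step and assumes 16-bit mono PCM; the rate conversion itself (``audioop.ratecv`` with its persistent ``state``) is left out, and ``SampleAligner`` is a hypothetical name.

```python
class SampleAligner:
    """Holds back a dangling byte so every emitted chunk contains whole 16-bit samples."""

    def __init__(self) -> None:
        self._carry = b""

    def feed(self, chunk: bytes) -> bytes:
        data = self._carry + chunk
        usable = len(data) - (len(data) % 2)  # largest even prefix
        self._carry = data[usable:]           # zero or one byte kept for the next call
        return data[:usable]
```

Each ``feed`` result would then be handed to the stateful resampler, so a chunk boundary that splits a sample no longer shifts the alignment of everything after it.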

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scheduler,fallback): per-loop schedulers + async close + cancel probes

Fixes BUG #2, #3, #5 from acceptance suite.

- #3 Scheduler singleton dies across event loops. The old
  ``_scheduler_singleton`` bound to the first loop it saw; pytest-asyncio
  closed that loop at the end of every test and the next scheduled
  callback crashed with ``Event loop is closed``. Replaced by
  ``_schedulers_by_loop`` — a dict keyed on ``id(asyncio.get_event_loop())``
  that drops stale entries when the owning loop has been closed. Adds
  ``reset_for_tests()`` to tear down every cached scheduler; the public
  ``shutdown()`` is now an alias for it.
- #2 ``FallbackLLMProvider.complete_stream`` — convenience wrapper
  that flattens ``{"type": "text"}`` chunks so callers don't have to
  switch on chunk type. Mirrors the TS SDK's ``completeStream``.
- #5 ``FallbackLLMProvider`` recovery task leak. ``_probe`` tasks
  created by ``_start_recovery`` were never awaited, and pytest-asyncio
  tears the loop down before they finish. Adds ``aclose()`` and async
  context manager support (``__aenter__``/``__aexit__``) so callers can
  ``async with FallbackLLMProvider(...)`` and have the probes cancelled
  + awaited on exit.
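
A minimal sketch of the cancel-and-await pattern from #5, assuming probes are plain ``asyncio`` tasks. ``ProbeOwner`` and ``start_probe`` are stand-ins invented for illustration; they are not the ``FallbackLLMProvider`` API.

```python
import asyncio
from typing import Set


class ProbeOwner:
    """Owns background probe tasks and guarantees they are finished on exit."""

    def __init__(self) -> None:
        self._probes: Set[asyncio.Task] = set()

    def start_probe(self, delay_s: float) -> asyncio.Task:
        # A sleep stands in for a real provider health check.
        task = asyncio.create_task(asyncio.sleep(delay_s))
        self._probes.add(task)
        task.add_done_callback(self._probes.discard)
        return task

    async def aclose(self) -> None:
        probes = list(self._probes)
        for task in probes:
            task.cancel()
        # Await the cancelled tasks so none is left pending at loop teardown.
        await asyncio.gather(*probes, return_exceptions=True)
        self._probes.clear()

    async def __aenter__(self) -> "ProbeOwner":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.aclose()
```

Used as ``async with ProbeOwner() as owner: ...``, any probe still pending when the block exits is cancelled and awaited before control returns.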

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tools): @tool adapter unpacks kwargs into user function

Fixes BUG #21 from acceptance suite.

``@tool`` exposed the raw user function as ``handler`` but
``services/tool_executor._execute_handler`` always calls
``handler(arguments_dict, call_context_dict)``. Every typed tool — e.g.
``async def check_order(order_id: str)`` — crashed at runtime with
"takes 1 positional argument but 2 were given" and OpenAI Realtime
received a fallback error JSON instead of the tool's result.

The decorator now wraps the user function in an async adapter whose
signature matches the executor's contract ``(arguments, call_context)``.
The adapter inspects the original signature: if it already takes
``(arguments, call_context)`` positionally it passes through unchanged,
otherwise it filters ``arguments`` to the user function's declared
parameter names and calls ``fn(**args)``. The original function is
still reachable via ``handler.__wrapped__`` for introspection.
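
The adapter described above could look roughly like this. It is a sketch under assumptions: ``adapt_tool`` is a hypothetical name, and detecting the pass-through case by the exact parameter names ``arguments`` and ``call_context`` is a guess at how the real decorator inspects the signature.

```python
import asyncio
import functools
import inspect
from typing import Any, Callable, Dict


def adapt_tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap fn so it can be called as handler(arguments, call_context)."""
    params = inspect.signature(fn).parameters
    # Assumption: a function already using the executor contract names its
    # parameters exactly (arguments, call_context).
    passthrough = list(params) == ["arguments", "call_context"]

    @functools.wraps(fn)  # also exposes the original as handler.__wrapped__
    async def handler(arguments: Dict[str, Any], call_context: Dict[str, Any]) -> Any:
        if passthrough:
            result = fn(arguments, call_context)
        else:
            # Keep only the keys the user function actually declares.
            kwargs = {k: v for k, v in arguments.items() if k in params}
            result = fn(**kwargs)
        if inspect.isawaitable(result):
            result = await result  # supports both sync and async tools
        return result

    return handler
```

With this shape, a typed tool such as ``async def check_order(order_id: str)`` receives ``order_id`` as a keyword argument, and extra keys in the arguments dict are ignored rather than raising.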

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dashboard): track failed & no-answer outbound calls

Fixes BUG #6 from acceptance suite.

The embedded dashboard used to show only calls that made it to the
media channel. An outbound dial that rang out (``status=no-answer``,
``busy``, ``failed``) never produced a webhook hit, so the row never
appeared in the UI even though Twilio billed for the attempt.

Changes:

- ``MetricsStore.record_call_initiated({call_id, caller, callee, …})``
  pre-registers the call when ``Patter.call()`` returns, so the row
  shows up the moment the dial is dispatched.
- ``MetricsStore.update_call_status(call_id, status, **extra)`` promotes
  the record through the lifecycle (ringing → in-progress → completed /
  no-answer / busy / failed / canceled). Terminal states move the row
  from active to the completed list so the UI timer freezes. Fed by
  the new ``/webhooks/twilio/status`` route.
- ``MetricsStoreProtocol`` extended with the two new methods.
- ``call_end`` now synthesises a minimal metrics shim when the call
  ended without a full CallMetrics payload, so the UI can still render
  duration / status.
- Dashboard UI: new ``STATUS`` column, filter pills (all / completed /
  failed), colour-coded badges (green / yellow / red / orange), red
  row tint for failed statuses, and SSE listeners for the new
  ``call_initiated`` and ``call_status`` events. The duration timer
  respects ``data-ended`` so rows that already received call_end stop
  ticking.
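
The lifecycle handling can be sketched as a small in-memory store. ``CallTracker`` is a simplified stand-in for ``MetricsStore``; the ``"initiated"`` starting status and the exact record fields are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Statuses after which a call no longer changes.
TERMINAL_STATUSES = {"completed", "no-answer", "busy", "failed", "canceled"}


class CallTracker:
    """Tracks outbound calls from dial dispatch to a terminal status."""

    def __init__(self) -> None:
        self.active: Dict[str, Dict[str, Any]] = {}
        self.completed: List[Dict[str, Any]] = []

    def record_call_initiated(self, call: Dict[str, Any]) -> None:
        # Register the row as soon as the dial is dispatched.
        self.active[call["call_id"]] = {**call, "status": "initiated"}

    def update_call_status(self, call_id: str, status: str, **extra: Any) -> None:
        record = self.active.get(call_id)
        if record is None:
            return  # unknown or already finished call: ignore the callback
        record.update(extra)
        record["status"] = status
        if status in TERMINAL_STATUSES:
            # Move the row out of the active set so its timer can stop.
            self.completed.append(self.active.pop(call_id))
```

A ``no-answer`` callback therefore still produces a visible row: it is pre-registered at dial time and moved to the completed list when the terminal status arrives.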

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): ring_timeout + agent.hooks/vad/audio_filter forwarding + call pre-register

Fixes BUG #14 + IMP2 + completes BUG #6 from acceptance suite.

- #14 ``Patter.agent(...)`` used to drop ``hooks``, ``text_transforms``,
  ``vad``, ``audio_filter``, ``background_audio`` and
  ``barge_in_threshold_ms`` even though the ``Agent`` dataclass accepted
  them. The factory now forwards all fields.
- IMP2 ``ring_timeout: int | None`` kwarg on ``Patter.call(...)``.
  Forwarded to Twilio as ``Timeout=`` and to Telnyx as ``timeout_secs``
  (added to ``TelnyxAdapter.initiate_call``). Italian mobile carriers
  silence-drop the default ~28 s ring on US→IT calls; the quickstart
  now works with ``ring_timeout=60``.
- #6 ``Patter.call()`` pre-registers the dialled call in the
  MetricsStore via ``record_call_initiated(...)`` before returning, so
  the dashboard shows the attempt even when the callee never picks up.
  The Twilio branch also passes ``StatusCallbackEvent="initiated
  ringing answered completed"`` so we receive every state transition.

Also exposes the new Deepgram knobs on the ``Patter.deepgram(...)``
factory (``model``, ``endpointing_ms``, ``utterance_end_ms``,
``smart_format``, ``interim_results``).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): models barge_in_threshold_ms + STT/TTS options, top-level mix_pcm, docstring

Rolls up the smaller API additions — BUG #1, #04g, extras from #13/#15.

- ``Agent.barge_in_threshold_ms`` (default 300) — hangover window before
  treating caller audio as barge-in. Used by PipelineStreamHandler and
  mirrored on TS ``AgentOptions.bargeInThresholdMs``.
- ``STTConfig.options`` / ``TTSConfig.options`` — provider-specific
  knobs bag (e.g. Deepgram endpointing) that ``common._create_stt_from_config``
  unpacks when building the adapter. Keeps older ``STTConfig`` callers
  forward-compatible.
- Top-level ``patter.mix_pcm(agent, bg, ratio)`` — parity alias for the
  TS ``mixPcm(...)`` standalone helper (BUG #04g). Thin wrapper over
  the existing ``PcmMixer`` class with an explicit ratio.
- ``patter/__init__.py`` docstring enumerates the installable extras
  (scheduling, anthropic, groq, cerebras, google, …) so ``pip install
  getpatter`` users discover them without hitting a
  ``RuntimeError: Scheduling requires the 'apscheduler' package`` at
  call time (BUG #1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: align Python tests with BUG #12/#16/#17/#18/#19/#21 fixes

- ``test_local_mode``: pipeline Twilio bridge test now patches
  ``DeepgramSTT`` directly instead of ``DeepgramSTT.for_twilio`` —
  after BUG #12 the pipeline path uses the default linear16 16 kHz
  adapter on both telephony providers.
- ``test_new_features``: ``machine_detection=False`` no longer asserts
  an empty extra_params dict; BUG #6 now always wires a
  ``StatusCallback`` so the dashboard sees failed attempts. The test
  keeps its original intent (AMD-specific params absent) and additionally
  checks the status callback is set.
- ``test_server_unit::TestTelnyxVoiceRoute``: rewritten to assert the
  REST ``actions/answer`` POST after BUG #16 — the route no longer
  returns a JSON commands body.
- ``test_telnyx_bridge_unit``: helper messages updated to the
  ``{event: start|media|stop}`` wire shape from BUG #17; the OpenAI
  Realtime audio_format assertion now expects ``g711_ulaw`` (from #18).
- ``test_telnyx_handler_unit``: TelnyxAudioSender test uses
  ``input_is_mulaw_8k=True`` so the round-trip byte assertion still
  holds with the new PCM16→mulaw transcode path (#18). Wire format
  asserts ``event == "media"`` / ``event == "clear"``.
- ``test_tool_decorator``: invokes handlers with the new adapter
  signature ``(arguments_dict, call_context_dict)`` (#21), including a
  sync-wrapped handler awaited through the adapter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ts/api): Python parity — auto-detect local, static factories, ring_timeout

Brings TS to parity with Python on the BUG #4 parity items, the #14
agent fields and the IMP2 ring_timeout.

- Auto-detect local mode: ``new Patter({twilioSid, twilioToken, …})``
  without explicit ``mode: 'local'`` is now treated as local when
  apiKey is missing (mirrors Python).
- Static provider factories: ``Patter.deepgram(...)``,
  ``Patter.elevenlabs(...)``, ``Patter.whisper(...)``,
  ``Patter.openaiTts(...)``, ``Patter.cartesia(...)``, ``Patter.rime(...)``,
  ``Patter.lmnt(...)``.
- ``STTConfig.toDict`` / ``TTSConfig.toDict`` are now optional — plain
  object literals ``{provider, apiKey, language}`` are accepted
  everywhere (fallback serialisation is handled via
  ``sttConfigToDict`` / ``ttsConfigToDict`` helpers).
- ``STTConfig`` gets an ``options`` bag (parity with Python BUG #13).
- ``LocalCallOptions.ringTimeout`` forwarded to Twilio as ``Timeout``
  and Telnyx as ``timeout_secs`` — plus ``StatusCallbackEvent`` wired
  so the dashboard sees ringing/no-answer/busy/failed transitions
  (BUG #6).
- ``AgentOptions.bargeInThresholdMs`` (parity with #20 on Python).
- ``LocalOptions.deepgramKey`` / ``elevenlabsKey`` added as
  provider-level defaults (parity with Python Patter() kwargs).
- ``Patter.call()`` Twilio branch pre-registers the dialled call with
  ``metricsStore.recordCallInitiated`` so no-answer / busy / failed
  attempts still show up in the dashboard.
- ``providers.deepgram(...)`` factory exposes the Deepgram knobs
  (model / endpointing_ms / utterance_end_ms / smart_format /
  interim_results) and carries them in ``STTConfig.options``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts/providers): voice resolver, Deepgram knobs, TTS streaming resample

TS parity port of Python BUG #11, #13, #23.

- ElevenLabs: ``resolveVoiceId()`` maps display names (rachel, adam,
  matilda, alloy, …) to the opaque 20-char voice IDs accepted by the
  /text-to-speech/{voice_id}/stream endpoint. Map mirrors the Python
  SDK byte-for-byte.
- DeepgramSTT: constructor overloaded to accept ``DeepgramSTTOptions``
  (endpointingMs / utteranceEndMs / smartFormat / interimResults /
  vadEvents) alongside the legacy positional form. Transcript gate
  loosened to ``is_final OR speech_final`` so short utterances don't
  wait for Deepgram's utterance_end commit.
- OpenAITTS: streaming 24 kHz → 16 kHz resample now carries state
  (``carryByte`` + ``leftover`` samples) between chunks so cross-chunk
  alignment doesn't drift. The legacy ``resample24kTo16k`` static is
  kept as a thin wrapper around the streaming path for the existing
  unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ts): Telnyx stack, pipeline hooks/barge-in/dedup, dashboard status, scheduler sync

TS parity port of the Python fixes for BUG #2/#3/#6/#12/#15/#16/#17/#18/#19/#20/#22.

- ``stream-handler.ts``: ``handleAudio`` now runs the
  ``before_send_to_stt`` hook (#15), transcodes Twilio mulaw 8 kHz →
  PCM16 16 kHz unconditionally on the pipeline path (#12), and keeps
  forwarding caller audio during TTS so barge-in can trigger (#20).
  ``processTranscript`` implements the dedup + 500 ms throttle +
  hallucination-word blacklist from #22 and flips ``isSpeaking`` +
  ``sendClear`` on any transcript with text while the agent is
  speaking (#20).
- ``server.ts``: ``TelnyxBridge.sendAudio`` / ``sendClear`` use the
  correct ``{event:"media",media:{payload:b64}}`` wire format (#18);
  the Telnyx WS handler matches ``data.event`` (start / media / stop /
  dtmf / error / connected) and filters ``media.track !== "inbound"``
  before forwarding (#17, #19); the ``/webhooks/telnyx/voice`` route
  POSTs ``actions/answer`` and ``actions/streaming_start`` via the
  Call Control REST API and returns empty HTTP 200 (#16).
  ``TwilioBridge.createStt`` picks linear16 16 kHz when
  ``provider === 'pipeline'`` so Deepgram doesn't decode already-PCM
  bytes as mulaw (#12). A new ``/webhooks/twilio/status`` handler
  consumes Twilio status callbacks and updates the dashboard (#6).
- ``scheduler.ts``: ``scheduleCron`` returns a ``ScheduleHandle``
  synchronously (lazy node-cron import happens in the background) —
  parity with Python #4. ``scheduleInterval`` accepts
  ``{intervalMs}`` or ``{seconds}`` in addition to the legacy
  positional ms, matching Python ``schedule_interval(seconds=...)``.
- ``fallback-provider.ts``: ``completeStream()`` text-only convenience
  generator (#2), plus ``aclose()`` + ``Symbol.asyncDispose`` so that
  ``await using fallback = ...`` works, for parity with Python's
  ``async with FallbackLLMProvider(...)`` (#5).
- ``dashboard/store.ts``: ``recordCallInitiated`` pre-registers
  outbound attempts, ``updateCallStatus`` promotes rows through
  ringing / no-answer / busy / failed and moves terminal states to
  the completed list (#6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ts): align with 0.4.4 wire-format & provider API changes

- ``providers.test.ts``: toDict now surfaces ``options`` when set,
  knobs forwarding verified.
- ``types.test.ts``: toDict optional chain covered.
- ``openai-tts.test.ts``: 1-byte input no longer returns the byte
  verbatim — the streaming resampler stashes it as ``carryByte`` and
  the stateless wrapper flushes only complete samples, so the test now
  asserts an empty buffer.
- ``integration/twilio-pipeline.test.ts`` + ``integration/telnyx-pipeline.test.ts``:
  ``handleAudio`` is now async; tests await it. Telnyx fixture feeds
  mulaw 8 kHz and asserts the transcoded PCM16 16 kHz lands on the STT
  mock (BUG #12 + #19).
- ``unit/server-routes.test.ts``: Telnyx webhook tests assert the
  REST ``actions/answer`` + ``actions/streaming_start`` POSTs and the
  empty HTTP 200 response (BUG #16).
- ``package-lock.json``: refreshed for the sdk-ts worktree so the
  ``0.4.3`` → ``0.4.3-worktree`` alignment is consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(unit): regression coverage for BUG #6/#22/#23 + ring_timeout (IMP2)

Three new unit test files lock in fixes that previously lived in the
acceptance suite as live-call checks:

  test_pipeline_dedup.py (13 tests)
    - Hallucination blacklist: "you", "thank you", ".", case/punctuation
      variants, empty-after-strip all drop silently.
    - 2-second duplicate window with time.time monkeypatched so parity
      with the live Whisper feedback loop is deterministic.
    - 500 ms back-to-back throttle covering legitimate vs spurious
      second turns.
    - Interim / empty finals must not fire on_transcript.

  test_openai_tts_resample.py (7 tests)
    - Cross-chunk ratecv state: multi-chunk stream output matches a
      single-shot resample byte-for-byte.
    - Odd-byte boundary: a chunk ending on a dangling byte must not
      drop the sample.
    - Empty / single-byte / tiny chunks must not crash.
    - Response is always aclosed on both successful and early-exit paths.

  test_twilio_status_and_ring_timeout.py (13 tests)
    - /webhooks/twilio/status routes to update_call_status with parsed
      duration, and survives missing SID, bad duration, and the
      dashboard-disabled path.
    - Twilio signature enforcement on the status endpoint.
    - Twilio ring_timeout -> Timeout REST param, Telnyx -> timeout_secs.
    - Twilio StatusCallback / StatusCallbackEvent are always registered
      on outbound calls so BUG #6 cannot regress.

Full unit suite: 728 passed, 2 skipped.

* docs+ci: latency/provider caveats + audit workflow

README
  - Pipeline turn-latency floor documented (~2.0–2.8 s) with per-stage
    breakdown so users know to switch to `provider="openai_realtime"`
    for sub-second UX.
  - ElevenLabs free-tier library-voice restriction (402) with pointer
    to `ELEVENLABS_VOICE_ID`.
  - Telnyx outbound D38 Outbound Profile requirement.
  - Google Gemini free-tier quota=0 caveat.
  - Whisper hallucination filter documented.
  - `ring_timeout` + status callback description added to call().

.github/workflows/audit.yml (new)
  - pip-audit on sdk-py runtime deps.
  - npm audit on sdk-ts production deps.
  - bandit static analysis with SARIF upload to GitHub Security.
  - Runs on dep-manifest changes, weekly schedule, and manual dispatch.
  - Findings are advisory-only to keep the pipeline from flaking on
    upstream CVE churn (telephony stack pulls many C-wrapped libs).

Baseline audit run: npm=0, bandit medium+/high-confidence=0,
pip-audit=2 (pytest dev-only + transformers optional-extra only).

* docs(readme): remove local-measured latency numbers from Voice Modes

The millisecond ranges previously listed for each provider came from a
single local benchmark run and are neither representative nor a target.
Keep the modes table qualitative and replace the per-stage breakdown
with a short note that latency is inherited from the chosen providers,
with no hard numbers for callers to anchor on.

* test(unit): bug coverage gaps — BUG #15/#19/#20

Three new unit test modules fill the remaining coverage gap for the
bugs fixed on this branch:

  test_pipeline_bargein.py (7 tests) — BUG #20
    - Interim transcript during TTS triggers send_clear + is_speaking=False.
    - record_turn_interrupted is fired on the metrics accumulator.
    - send_clear throwing does not crash the STT loop (fail-open).
    - No barge-in when the agent is idle or the transcript has no text.
    - Final transcripts also trigger the barge-in branch before the
      downstream LLM turn runs.

  test_before_send_to_stt_hook.py (9 tests) — BUG #15
    - Sync / async hook returning None drops the chunk (zero STT sends).
    - Returning modified bytes forwards the new buffer verbatim.
    - Hook receives the decoded PCM, not the raw mulaw payload.
    - Raising hooks fail-open: original audio still reaches STT.
    - Missing hook / hooks instance with before_send_to_stt=None are
      both bypass paths that must still forward audio.

  test_telnyx_track_filter.py (5 tests) — BUG #19
    - track=inbound forwards, track=outbound drops.
    - Missing `track` field defaults to inbound (legacy Telnyx payloads).
    - Mixed stream: only inbound frames reach the handler, in order.
    - Unknown track values are skipped defensively.

Full unit suite: 749 passed, 2 skipped (+21 from this commit).

* feat(sdk-py): add cartesia/rime/lmnt static factories + vad_events to deepgram

Brings Python SDK to parity with sdk-ts:
- Adds Patter.cartesia / Patter.rime / Patter.lmnt static methods so local-mode
  users can configure these TTS providers the same way they do in TypeScript.
- Adds the missing vad_events keyword to Patter.deepgram and the
  patter.providers.deepgram factory — the DeepgramSTT ctor already accepted
  it, but the public config helper silently dropped the flag.

* chore: bump to 0.4.4

Regression suites re-run after the bump:
  - sdk-py: 749 passed, 2 skipped
  - sdk-ts: 932 passed (57 test files, including soak)

* fix(ci): integration tests on 0.4.4 wire format + misc hygiene

Addresses the five failing CI checks on PR #66.

Telnyx integration tests (test_telnyx_{convai,pipeline,realtime}.py)
  - ``_telnyx_stream_started`` / ``_telnyx_media_event`` /
    ``_telnyx_stream_stopped`` helpers migrated from the pre-0.4.4
    ``{event_type, payload.audio.chunk}`` shape to the real Telnyx
    media-stream wire format ``{event, start|media.payload}`` (BUG
    #17/#18). Without this the bridge silently drops every test frame
    and 11 integration tests fail with "handler called 0 times".
  - ``test_audio_format_pcm16`` renamed to ``test_audio_format_g711_ulaw``
    and the assertion flipped — Telnyx is PCMU 8 kHz bidirectional
    (BUG #19), Realtime runs on ``g711_ulaw`` so both legs stay
    pass-through.

sdk-ts/src/scheduler.ts
  - Removed the trailing blank line that broke the pre-commit
    ``end-of-file-fixer`` hook.

.github/workflows/audit.yml
  - Bandit stock CLI doesn't support ``-f sarif`` — install
    ``bandit-sarif-formatter`` alongside bandit, and guard the
    upload-sarif step with ``hashFiles`` so future formatter breakage
    doesn't fail the job.

Local verification: 802 passed, 4 skipped (sdk-py unit + integration).

* docs: update SDK reference for 0.4.4 features

- Update version to 0.4.4 in API reference
- Add static factories: cartesia(), rime(), lmnt() for TTS
- Document new agent() parameters: hooks, text_transforms, vad, audio_filter, background_audio, barge_in_threshold_ms
- Add ring_timeout parameter to call() signature
- Document Deepgram tuning options: endpointing_ms, utterance_end_ms, vad_events
- Synchronize Python and TypeScript API documentation for parity

* docs: document barge_in_threshold_ms configuration

Update barge-in feature documentation to reflect new barge_in_threshold_ms parameter
(default 300ms). Document how to customize or disable via agent configuration.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nicolotognoni added a commit that referenced this pull request Apr 29, 2026
Five fixes uncovered by the 0.5.5 acceptance matrix run, ranging from a
HIGH-severity onnxruntime-node version mismatch that blocks Silero VAD
on macOS x86_64 to a misleading metric that makes healthy calls look
slow.

**Bug #1 (HIGH) — SileroVAD onnxruntime-node 1.24+ API drift**
* ``optionalDependencies.onnxruntime-node`` tightened from ``^1.18.0`` to
  ``~1.18.0`` — the caret was resolving to 1.24.x where
  ``listSupportedBackends`` was removed and the prebuilt ``bin/`` layout
  drifted, so ``import('onnxruntime-node')`` failed on macOS x86_64.
* ``loadOnnxRuntime`` now classifies the underlying error
  (``missing`` / ``binding`` / ``api-drift`` / ``unknown``) and surfaces a
  targeted remedy plus the original error chain via ``Error.cause`` —
  previously the failure mode was hidden behind a single "could not be
  resolved" string.

**Bug #2 (MEDIUM) — ElevenLabsConvAI agent_id error message**
* The env-var fallback already worked but the error message did not say
  *where* to get an agent ID from (the dashboard, not the API key).
  Updated both Python and TypeScript constructors to point users at
  https://elevenlabs.io/app/conversational-ai and reiterate that the
  agent ID is per-deployed-agent.
* Python ``ConvAI.__post_init__`` now raises when ``agent_id`` is empty
  (was silently passing through) — TypeScript already did this. Parity.

**Bug #3 (MEDIUM) — ElevenLabs WS payment_required**
* New typed exception ``ElevenLabsPlanError`` (subclass of
  ``ElevenLabsTTSError``) raised when the WS endpoint returns
  ``payment_required``. Free / Starter plans now get a clear "upgrade
  or use the HTTP class (drop-in API)" message instead of an opaque
  ``ElevenLabsTTSError: ElevenLabs WS error: payment_required``.
* Detection is case-insensitive and matches both the exact server
  string and any ``payment_required`` substring.
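
A sketch of the classification step, assuming the server error arrives as a plain string. The two exception names come from this commit; ``error_for_ws_message`` and the message wording are illustrative only.

```python
class ElevenLabsTTSError(Exception):
    """Base error for ElevenLabs TTS failures."""


class ElevenLabsPlanError(ElevenLabsTTSError):
    """Raised when the account plan does not allow the WebSocket endpoint."""


def error_for_ws_message(message: str) -> ElevenLabsTTSError:
    """Map a WS error string to the most specific exception type."""
    # Case-insensitive substring match also covers the exact server string.
    if "payment_required" in message.lower():
        return ElevenLabsPlanError(
            "ElevenLabs WS streaming is not available on this plan: "
            "upgrade or use the HTTP class"
        )
    return ElevenLabsTTSError(f"ElevenLabs WS error: {message}")
```

Because ``ElevenLabsPlanError`` subclasses ``ElevenLabsTTSError``, existing ``except ElevenLabsTTSError`` handlers keep working while new code can catch the plan error specifically.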

**Bug #5 (MEDIUM) — barge-in fragile in pipeline mode without VAD**
* On tunnel + speakerphone setups the agent's own TTS leaks into the
  inbound mic feed, STT transcribes it, and the legacy
  "always forward + bargeInThresholdMs" heuristic fails to fire the
  cancel — the agent talks over the user.
* ``serve()`` now logs a one-shot warning at startup when
  ``agent.engine`` is undefined, ``agent.vad`` is undefined, and
  ``bargeInThresholdMs > 0``, recommending ``SileroVAD`` or
  ``bargeInThresholdMs: 0``. Both Python and TypeScript.
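
The startup check above could look roughly like this. The function name, logger name and module-level flag are assumptions made for illustration; only the three-part condition and the one-shot behaviour come from the notes.

```python
import logging
from typing import Optional

logger = logging.getLogger("patter.sketch")  # hypothetical logger name
_warned = False  # one-shot guard, assumed to live at module level


def maybe_warn_barge_in(
    engine: Optional[object], vad: Optional[object], barge_in_threshold_ms: int
) -> bool:
    """Log the warning at most once; return True when it fired on this call."""
    global _warned
    fragile = engine is None and vad is None and barge_in_threshold_ms > 0
    if fragile and not _warned:
        _warned = True
        logger.warning(
            "Pipeline mode without VAD: use SileroVAD or set the barge-in "
            "threshold to 0"
        )
        return True
    return False
```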

**Bug #6 (LOW) — pipeline ``total_ms`` misleading on long utterances**
* ``total_ms`` spans the user's entire utterance (including pauses)
  because it includes ``stt_ms``, which itself measures STT-stream-open
  to transcript-finalisation. On a 4 s user turn ``total_ms`` reads
  ~5.5 s even though the agent's TTFA after end-of-speech is ~1.3–1.5
  s, which is misleading as a p95 / SLO metric.
* New ``LatencyBreakdown.agent_response_ms`` field (Python +
  TypeScript). Computed as ``endpoint_ms + llm_ttft_ms + tts_ms`` when
  all three signals are available, ``undefined`` / ``None`` otherwise.
  This is the user-perceived latency dashboards should track.
* ``total_ms`` kept unchanged for backward compatibility.
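
The ``agent_response_ms`` rule can be sketched as follows. The dataclass here is a trimmed stand-in, not the SDK's real ``LatencyBreakdown``, and modelling the value as a property is an assumption; the notes only say it is the sum of the three signals when all are present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatencyBreakdown:
    # Only the three inputs named in the notes; the real class has more fields.
    endpoint_ms: Optional[float] = None
    llm_ttft_ms: Optional[float] = None
    tts_ms: Optional[float] = None

    @property
    def agent_response_ms(self) -> Optional[float]:
        parts = (self.endpoint_ms, self.llm_ttft_ms, self.tts_ms)
        if any(part is None for part in parts):
            return None  # a missing signal makes the sum meaningless
        return sum(parts)
```

Returning ``None`` instead of a partial sum keeps a dashboard from averaging in values that silently omit one stage.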

**Bug #7 (HIGH) — outbound TwiML races tunnel startup**
* The documented ``void phone.serve(...) → setTimeout → phone.call(...)``
  pattern reads ``localConfig.webhookUrl`` while the cloudflared
  hostname is still resolving, producing
  ``wss://undefined/...`` in the dial TwiML and a Twilio 11100 call
  drop on answer.
* New ``phone.tunnelReady`` Promise (TS) / ``phone.tunnel_ready``
  ``asyncio.Future`` (Python). Resolves to the public webhook hostname
  once ``serve()`` knows it (immediately for static webhookUrl,
  after ``startTunnel`` for ``tunnel: true``). Rejects if ``serve()``
  fails before the hostname is known.
* Documented pattern is now ``await phone.tunnelReady`` instead of
  ``setTimeout(10_000)`` — deterministic, no race.
* Same root-cause fix likely also addresses Bug #4 (intermittent WS
  upgrade race) which the acceptance run flagged as a related symptom.

Test totals after the fixes: Python 1064 PASS / 7 skip, TypeScript
1163 PASS / 67 files, cross-SDK chunker parity 53 PASS / 8 XFAIL / 0
FAIL on the 61-case fixture. No regressions.
nicolotognoni added a commit that referenced this pull request May 1, 2026
…#81)

* test(parity): cross-SDK sentence chunker fixture + standalone runner

Add a 61-case fixture documenting expected sentence-chunker output for
every supported edge case across English, Italian, CJK, Hindi, Arabic,
Khmer, Burmese, Armenian, and Ethiopic scripts. Each case carries the
ideal `expected_sentences` plus an optional `current_behavior` field that
documents known regressions / by-design quirks so the runner can xfail
them without blocking CI.

Standalone runner (`sentence_chunker_parity.py`) executes each case
through the Python `SentenceChunker`, spawns `node sentence_chunker_shim.js`
for the TypeScript equivalent, and compares emissions case-by-case.
Self-contained — does not depend on the main `tests/parity/run.py`
runner (which currently fails on the recent `patter` -> `getpatter`
package rename).

Result on the current main branch: 53 PASS / 8 XFAIL / 0 FAIL /
0 PARITY_FAIL — Python and TypeScript chunkers produce identical
sentence streams for every covered case.
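
One way the four result buckets could be derived per case is sketched below. It assumes ``current_behavior`` holds the list of sentences the chunkers emit today; the fixture's real schema and the runner's actual logic are not shown in this message, so treat the function as hypothetical.

```python
from typing import List, Optional


def classify_case(
    expected: List[str],
    py_out: List[str],
    ts_out: List[str],
    current_behavior: Optional[List[str]] = None,
) -> str:
    if py_out != ts_out:
        return "PARITY_FAIL"  # the two SDKs disagree with each other
    if py_out == expected:
        return "PASS"
    if current_behavior is not None and py_out == current_behavior:
        return "XFAIL"  # documented deviation, does not block CI
    return "FAIL"
```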

* feat(chunker): IT/EN abbreviations, multilingual terminators, aggressive first-flush

Three layered improvements to ``SentenceChunker`` (parity Py↔TS), all
additive — no breaking change to the default behaviour:

**Italian + English abbreviations** (Phase 1, 7)
* Prefix list adds Sig, Sgr, Dott, Prof, Avv, Ing, Geom, Rag, Arch, On,
  Egr, Spett, Gent, Ill (Italian honorifics) plus Gen, Sen, Rep, Lt,
  Cpt, Capt, Col, Cmdr, Adm (Pipecat NLTK Punkt).
* Suffix list adds ecc, cit, cap, sez, art, pag, fig, tab, cfr, vol, ed
  (Italian) plus vs, etc, No, Vol, pp, cf, ca, op, Mt, Hwy, Rt, Pl, Ave,
  Blvd, Sq (Pipecat).
* Suffix-followed-by-starter pattern now preserves the trailing period
  (e.g. ``Patter Inc. He left.`` keeps ``Inc.`` instead of dropping it).
* All-caps name fix (Pipecat #1692): the maybe-short-flush gate-5
  acronym guard previously blocked any uppercase-preceded period, so
  ``"...with RAMESH."`` would never flush. Now only purely uppercase
  ASCII words ≤3 chars (U/US/USA/NATO patterns) are treated as acronyms.
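
A sketch of the narrowed acronym guard, read literally as a strict three-character limit. The function name is hypothetical and the real gate sits inside the chunker's short-flush logic.

```python
def looks_like_acronym(word: str) -> bool:
    # Purely uppercase ASCII letters, at most three characters long.
    return (
        0 < len(word) <= 3
        and word.isascii()
        and word.isalpha()
        and word.isupper()
    )
```

Under this rule ``RAMESH`` is no longer mistaken for an acronym, so a period after it can end the sentence.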

**Multilingual terminator support** (Phase 7)
* Added ASCII semicolon ``;``, Unicode ellipsis ``…``, full-width
  semicolon/period/Japanese half-width to the terminator set.
* Ported Pipecat's ``UNAMBIGUOUS_NON_LATIN_TERMINATORS`` (BSD-2): Hindi
  Devanagari ``। ॥``, Arabic ``؟ ؛ ۔ ؏``, Khmer ``។ ៕``, Burmese ``။``,
  Armenian ``։``, Ethiopic ``። ፧``, Tibetan ``༎ ༏``.
* Final ``<stop>`` regex builds its character class from the merged set.
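
Building the character class from a merged set might look like this. The base set ``.!?`` and all names are assumptions; the non-Latin characters are the ones listed above.

```python
import re

BASE_TERMINATORS = [".", "!", "?"]  # assumed pre-existing set
EXTRA_TERMINATORS = [";", "\u2026"]  # ASCII semicolon, Unicode ellipsis
NON_LATIN_TERMINATORS = [
    "।", "॥",            # Hindi Devanagari
    "؟", "؛", "۔", "؏",  # Arabic
    "។", "៕",            # Khmer
    "။",                 # Burmese
    "։",                 # Armenian
    "።", "፧",            # Ethiopic
    "༎", "༏",            # Tibetan
]


def build_stop_pattern(*groups: list) -> "re.Pattern[str]":
    merged = sorted({ch for group in groups for ch in group})
    # re.escape keeps characters such as "." and "?" literal inside the class.
    char_class = "".join(re.escape(ch) for ch in merged)
    return re.compile(f"[{char_class}]")


STOP = build_stop_pattern(BASE_TERMINATORS, EXTRA_TERMINATORS, NON_LATIN_TERMINATORS)
```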

**Opt-in aggressive first-clause flush** (Phase 2)
* New constructor option ``aggressive_first_flush`` (Python) /
  ``aggressiveFirstFlush`` (TypeScript). **Default OFF.**
* When enabled, emits the first clause of the response on a soft
  punctuation boundary (``,``, em-dash, en-dash) once the buffer reaches
  ``aggressive_first_min_len`` (default 40 chars). Saves 200–500 ms TTFA
  on the first sentence of each turn.
* Eight guards prevent regressions on the safe-but-aggressive path:
  min-length, decimal-comma (``3,14``), thousands-separator
  (``1,000,000``), currency (``$1,000``, ``€1.000,50``), balanced
  parens/brackets/braces/double-quotes (protects JSON), ellipsis
  (``...``, ``…``), comma-before-quote, sub-token ambiguity (requires
  one char after the terminator).
* Italian (``language="it"``) hard-disables the feature regardless of
  caller preference — Italian inverts EN convention (``,`` decimal,
  ``.`` thousands), so a comma-flush would split mid-number.
* New ``Agent.aggressive_first_flush: bool = False`` field on Python
  ``Agent`` model. TypeScript ``AgentOptions.aggressiveFirstFlush`` is
  shipped in the after_llm 3-tier commit alongside the rest of the
  ``types.ts`` surface.
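
The sketch below illustrates only four of the listed guards (min-length, the Italian hard-disable, the digit-comma-digit check, and the one-character look-ahead). It is a simplified stand-in with hypothetical names, not the shipped chunker.

```python
from typing import Optional

SOFT_BOUNDARIES = (",", "\u2014", "\u2013")  # comma, em dash, en dash


def first_clause_split(
    buffer: str, language: str = "en", min_len: int = 40
) -> Optional[int]:
    """Return the index just past a safe soft boundary, or None."""
    if language == "it" or len(buffer) < min_len:
        return None
    for i, ch in enumerate(buffer):
        if ch not in SOFT_BOUNDARIES:
            continue
        if i + 1 >= len(buffer):
            return None  # need one character after the boundary to decide
        prev_is_digit = i > 0 and buffer[i - 1].isdigit()
        if ch == "," and prev_is_digit and buffer[i + 1].isdigit():
            continue  # 3,14 or 1,000: the comma is part of a number
        return i + 1
    return None
```

For example, ``buffer[:first_clause_split(buffer)]`` would be the clause handed to TTS early when a split point exists.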

Test coverage: +11 Python and +11 TypeScript unit tests for the
aggressive first-flush feature, plus parity-fixture cases for RAMESH,
Hindi danda, Arabic question mark, ASCII semicolon, Unicode ellipsis,
vs./etc./Gen./Sen. abbreviations.

Sentence-chunker constants and abbreviation lists ported from Pipecat
(BSD-2-Clause, Daily) and from the LiveKit-derived regex base
(Apache-2.0).

* feat(hooks): after_llm 3-tier API with deprecated legacy callable adapter

The ``after_llm`` pipeline hook used to be a single callable
``(text, ctx) → str`` that received the full LLM response only after
the stream completed. Buffering the entire response added 500 ms – 2 s
of TTFA for any agent that configured the hook.

This commit introduces a 3-tier API that lets callers pick the right
latency budget for their transform:

* ``onChunk`` (sync, ~0 ms) — per-token transform applied inline before
  the stream-handler ever sees the token. Use for: regex replace,
  markdown strip, profanity char-swap. Does NOT block streaming.
* ``onSentence`` (async, 50–300 ms) — runs between the sentence chunker
  and TTS. Returns rewritten sentence, ``null`` to keep the original,
  ``""`` to drop the sentence. Use for: PII redaction, persona overlay,
  refusal swap. Adds latency only on the rewritten sentence, not the
  full turn.
* ``onResponse`` (async, 500 ms – 2 s) — full-response rewrite that
  buffers the LLM stream then runs once. **Blocks streaming TTS.** Use
  only when sentence-level rewrite is insufficient (e.g. structured
  output validation that needs the full text).
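
The sentence-tier return semantics can be sketched in Python as follows (``None`` standing in for ``null``). The wrapper name and the three sample hooks are hypothetical; the fail-open handling of exceptions and non-string returns is an assumption based on the test list further down.

```python
import asyncio
from typing import Awaitable, Callable, Optional

SentenceHook = Callable[[str], Awaitable[Optional[str]]]


async def apply_sentence_hook(
    sentence: str, hook: Optional[SentenceHook]
) -> Optional[str]:
    """Return the text to synthesise, or None when the sentence is dropped."""
    if hook is None:
        return sentence
    try:
        result = await hook(sentence)
    except Exception:
        return sentence  # assumed fail-open: speak the original
    if result is None or not isinstance(result, str):
        return sentence  # keep the original sentence
    if result == "":
        return None  # empty string drops the sentence
    return result


async def redact_card(sentence: str) -> Optional[str]:
    return sentence.replace("4111", "****")


async def keep_original(sentence: str) -> Optional[str]:
    return None


async def drop_sentence(sentence: str) -> Optional[str]:
    return ""
```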

Backward compatibility
----------------------
The legacy single callable ``afterLlm: (text, ctx) => string`` still
works and is mapped to ``onResponse`` with a one-shot
``PatterDeprecationWarning`` (Python — subclass of both
``DeprecationWarning`` and ``UserWarning`` so it surfaces by default in
library code) or ``console.warn`` (TypeScript). Removal scheduled for
v0.7.0.

Detection in TypeScript uses ``typeof hook === 'function'`` (not
``hook.length`` arity sniffing — that pattern breaks under minifiers
and arrow defaults). Detection in Python uses ``callable(hook)`` plus
``_has_tier_attrs(hook)`` to disambiguate from object-form hooks.
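
A rough Python sketch of the legacy-callable adapter. ``PatterDeprecationWarning`` and ``_has_tier_attrs`` are named in this message; ``normalize_after_llm``, the snake_case tier names and the returned dict shape are assumptions, and the one-shot bookkeeping is omitted.

```python
import warnings


class PatterDeprecationWarning(DeprecationWarning, UserWarning):
    """Also a UserWarning, so default filters show it from library code."""


TIER_NAMES = ("on_chunk", "on_sentence", "on_response")


def _has_tier_attrs(hook: object) -> bool:
    return any(hasattr(hook, name) for name in TIER_NAMES)


def normalize_after_llm(hook: object) -> dict:
    if hook is None:
        return {}
    if isinstance(hook, dict):
        return {name: hook[name] for name in TIER_NAMES if name in hook}
    if _has_tier_attrs(hook):
        return {n: getattr(hook, n) for n in TIER_NAMES if hasattr(hook, n)}
    if callable(hook):
        warnings.warn(
            "single-callable after_llm is deprecated; use on_response",
            PatterDeprecationWarning,
            stacklevel=2,
        )
        return {"on_response": hook}
    raise TypeError("unsupported after_llm hook")


def legacy_hook(text, ctx):
    return text.upper()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tiers = normalize_after_llm(legacy_hook)
```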

Wire-up
-------
* ``llm_loop.py`` / ``llm-loop.ts`` — ``has_after_llm_response`` (and
  the legacy callable that maps to it) gates token buffering.
  ``has_after_llm_chunk`` triggers per-token transform inline before
  yield.
* ``stream_handler.py`` / ``stream-handler.ts`` — applies
  ``has_after_llm_sentence`` between the chunker emit and the TTS
  synthesise call. Both the streaming-LLM path and the non-streaming
  ``_speakFinalResponse`` path apply the hook for parity.
* The same ``stream_handler`` change wires
  ``Agent.aggressive_first_flush`` / ``AgentOptions.aggressiveFirstFlush``
  into the chunker constructor (Phase 2 wire-up that needed
  ``stream_handler`` and ``types.ts`` to land here alongside the hook
  changes — separating them would have required interactive patch
  staging on the same hunks).

Test coverage
-------------
* +11 Python pytest cases under ``TestAfterLlmThreeTier`` covering: no
  hook pass-through, legacy callable maps to ``on_response`` with
  deprecation warning, dict / Protocol / object forms, drop-by-empty,
  fail-open on hook exception, type confusion (non-string return),
  legacy alias methods (``has_after_llm`` / ``run_after_llm``) preserved.
* +9 TypeScript Vitest cases covering the equivalent surface.

* feat(tts): ElevenLabsWebSocketTTS — opt-in low-latency WS variant

New TTS provider that targets ElevenLabs' streaming-input WebSocket
endpoint (``/v1/text-to-speech/{voice}/stream-input``) instead of the
HTTP ``/stream`` endpoint used by ``ElevenLabsTTS``. Saves ~50 ms HTTP
request setup per utterance and avoids the TLS cold-start handshake on
bursty calls.

Drop-in API matching ``ElevenLabsTTS``:

* Same ``synthesize`` (Python) / ``synthesizeStream`` (TypeScript)
  signature returning ``AsyncGenerator<bytes>``.
* Same ``for_twilio()`` / ``for_telnyx()`` factories.
* Same default model ``eleven_flash_v2_5``.
* Top-level export ``getpatter.ElevenLabsWebSocketTTS`` (Py) /
  ``import { ElevenLabsWebSocketTTS } from "getpatter"`` (TS).

Defaults
--------
* ``auto_mode=true`` — server picks chunk timing.
* ``inactivity_timeout=60`` (range 5–180).
* Per-utterance lifecycle. Documented as a known trade-off vs Pipecat's
  per-session pool (pooling is on the roadmap for v0.6.x).
* ``eleven_v3*`` is rejected at construction with a clear error — the
  WS stream-input endpoint does not support v3; users must fall back
  to the HTTP class.

Resilience contract (post-review hardening)
-------------------------------------------
* **Connect timeout 5 s** (Pipecat-aligned, was 15 s in earlier
  drafts) bounds DNS + TLS handshake.
* **Per-frame receive timeout 30 s** prevents the generator hanging
  forever on a stalled server.
* **Permanent error handler attached BEFORE the open await** — closes
  a window where an error fired after the once-listener resolved would
  surface as ``uncaughtException`` in Node.
* **All ws listeners removed in ``finally``** — no closure leak past
  socket close.
* **Server ``error`` raises ``ElevenLabsTTSError``** instead of
  silently completing — caller can distinguish "synthesis succeeded
  with empty text" from "synth failed mid-stream".
* **Best-effort EOS ``{"text":""}`` in ``finally``** — tells
  ElevenLabs to stop billing for unconsumed audio. Sending it
  immediately after ``flush:true`` (the previous draft) risked
  truncating tail audio under ``auto_mode=true``.
* **Audio frame size cap 512 KB** prevents OOM via malicious /
  malformed base64 (real frames are ~75 KB decoded).
* **Server error string sanitised** before logging (strips CR/LF/NUL,
  truncates to 200 chars) — defends against log-line injection.
* **``api_key`` private** (``_api_key`` + read-only ``api_key``
  property) so ``vars(tts)`` / dataclass-style introspection cannot
  surface the secret.
* **``eleven_v3`` prefix-based reject** also blocks
  ``eleven_v3_preview``, ``eleven_v3_alpha``.
* **Public wrapper exposes the full options surface**
  (``voice_settings``, ``language_code``, ``inactivity_timeout``,
  ``chunk_length_schedule``) — earlier drafts dropped them.
* **Default voice consistency**: the public wrapper no longer
  overrides the provider class default — both layers use Rachel
  (``21m00Tcm4TlvDq8ikWAM``) so direct-construct and wrapped-construct
  paths agree.
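
Three of the smaller hardening items lend themselves to a short sketch. The function and constant names are hypothetical, and applying the 512 KB cap to the decoded size is an assumption.

```python
import base64

MAX_AUDIO_FRAME_BYTES = 512 * 1024


def sanitize_server_error(raw: str, limit: int = 200) -> str:
    # Drop CR / LF / NUL so a server string cannot forge extra log lines.
    cleaned = raw.replace("\r", "").replace("\n", "").replace("\x00", "")
    return cleaned[:limit]


def is_ws_supported_model(model_id: str) -> bool:
    # Prefix check also rejects eleven_v3_preview, eleven_v3_alpha, etc.
    return not model_id.startswith("eleven_v3")


def decode_audio_frame(b64_audio: str) -> bytes:
    # base64 grows data by about 4/3, so oversize input is refused pre-decode.
    if len(b64_audio) > MAX_AUDIO_FRAME_BYTES * 4 // 3 + 4:
        raise ValueError("audio frame exceeds size cap")
    audio = base64.b64decode(b64_audio)
    if len(audio) > MAX_AUDIO_FRAME_BYTES:
        raise ValueError("audio frame exceeds size cap")
    return audio
```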

Public surface
--------------
* ``getpatter/providers/elevenlabs_ws_tts.py`` — provider class
  ``ElevenLabsWebSocketTTS`` + ``ElevenLabsTTSError``.
* ``getpatter/tts/elevenlabs_ws.py`` — wrapper class ``TTS`` re-exported
  as ``ElevenLabsWebSocketTTS`` from the package root.
* ``sdk-ts/src/providers/elevenlabs-ws-tts.ts`` + corresponding
  TypeScript wrapper at ``sdk-ts/src/tts/elevenlabs-ws.ts``.
* ``sdk-ts/src/providers/elevenlabs-tts.ts`` — ``resolveVoiceId``
  promoted from module-private to public export so the WS variant can
  share the voice-name → voice-id resolution table without
  duplicating the lookup map.
* ``sdk-py/getpatter/__init__.py`` and ``sdk-ts/src/index.ts`` —
  top-level re-exports.

Test coverage
-------------
* +20 Python pytest cases (construction, factories, URL build, send
  sequence, ``isFinal`` termination, voice settings in init,
  ``chunk_length_schedule`` only with ``auto_mode=False``,
  ``eleven_v3`` rejection + variants, env-var resolution).
* +11 TypeScript Vitest cases covering the equivalent surface,
  including a faked ``ws`` module that records sent frames.

The HTTP ``ElevenLabsTTS`` class is **untouched** — both transports
coexist and the user picks per-call.

* release: 0.5.5 — latency pass 1 (chunker + after_llm 3-tier + WS TTS)

Bump ``getpatter`` to 0.5.5 across both SDKs (Python ``pyproject.toml``,
TypeScript ``package.json`` + ``package-lock.json``, and the SDK
``__version__`` / ``VERSION`` constants kept in sync).

CHANGELOG entry covers the four user-visible additions shipped in this
release:

* Sentence chunker — IT/EN abbreviations + multilingual terminators +
  RAMESH-style all-caps flush bug fix (Pipecat #1692). Default
  behaviour unchanged for existing users.
* Opt-in ``aggressive_first_flush`` / ``aggressiveFirstFlush`` on
  ``Agent`` / ``AgentOptions`` — emits the first clause of each turn
  on a soft-punctuation boundary (",", em-dash, en-dash) once the
  buffer reaches ~40 chars. Saves 200–500 ms TTFA. Italian
  hard-disabled (decimal-comma + dot-thousands inversion). 8 guards
  prevent regressions on decimals, currency, JSON, ellipsis,
  open-delimiters, comma-before-quote, sub-token ambiguity.
* New 3-tier ``after_llm`` API (``onChunk`` / ``onSentence`` /
  ``onResponse``). Legacy single-callable form still works (mapped to
  ``onResponse``) but emits a one-shot ``PatterDeprecationWarning`` /
  ``console.warn``. Removal: v0.7.0.
* New opt-in ``ElevenLabsWebSocketTTS`` class — drop-in replacement
  for ``ElevenLabsTTS`` (HTTP) using the ``stream-input`` WebSocket
  endpoint. Saves ~50 ms HTTP setup + TLS cold-start per utterance.
  Per-utterance lifecycle (per-session pooling on the roadmap).

Test totals after this release: Python 1064 PASS / 7 skip,
TypeScript 1163 PASS / 67 files, cross-SDK chunker parity 53 PASS / 8
XFAIL / 0 FAIL on a 61-case fixture spanning EN, IT, CJK, Hindi,
Arabic, Khmer, Burmese, Armenian, and Ethiopic scripts.

Cumulative review hardening from 11 parallel review agents
(Python-reviewer, TypeScript-reviewer, provider-reviewer, sdk-parity,
security-reviewer, code-reviewer, code-simplifier, refactor-cleaner,
docs-sync, build-validator, examples-validator) is folded into the
phase-specific commits — see the per-feature commits in this branch
for the detailed CRITICAL / HIGH fix lists.

* docs: Mintlify pages for 0.5.5 — WS TTS, after_llm 3-tier, aggressive flush

Document the four user-visible additions shipped in 0.5.5:

* **ElevenLabsWebSocketTTS** — new provider sub-pages
  ``docs/{python,typescript}-sdk/providers/elevenlabs-websocket.mdx``.
  What it is, why use it, ``for_twilio`` / ``for_telnyx`` factories,
  full constructor params table, ``eleven_v3*`` limitation,
  per-utterance lifecycle trade-off, ``ElevenLabsTTSError``. Both
  sub-pages added to the TTS group navigation in ``docs/docs.json``.
  Existing ``tts.mdx`` providers table updated with the new row plus a
  callout pointing at the WS variant.

* **``after_llm`` 3-tier API** — new "Pipeline Hooks" section in
  ``docs/{python,typescript}-sdk/events.mdx``: per-tier table for
  ``onChunk`` (sync, ~0 ms), ``onSentence`` (async, 50–300 ms), and
  ``onResponse`` (async, 500 ms – 2 s, blocks streaming). Return
  semantics (``null`` keep / ``""`` drop), legacy callable migration
  path with ``PatterDeprecationWarning`` (Python) / one-shot
  ``console.warn`` (TypeScript), removal in v0.7.0.

* **``aggressive_first_flush`` opt-in** — new row in the
  ``AgentOptions`` / ``Agent`` parameters tables in
  ``docs/{python,typescript}-sdk/agents.mdx`` and ``reference.mdx``
  with the Italian hard-disable note. Python ``features.mdx`` adds a
  dedicated section with code example and the 8-guard summary.

* **Chunker improvements** — Python ``features.mdx`` documents the
  expanded EN abbreviations (``vs.``, ``etc.``, ``Gen.``, ``Sen.``),
  IT abbreviations (``Sig.``, ``Dott.``, ``S.p.A.``, ``ecc.``), and
  multilingual terminator support (Hindi / Arabic / Armenian /
  Ethiopic / Khmer / Burmese / Tibetan). TypeScript SDK has no
  chunker page so no equivalent change required.

``docs.json`` JSON validated end-to-end. No source / examples /
CHANGELOG / NOTICE files touched.

* fix: 5 bugs from 2026-04-29 acceptance run (sdk-ts 0.5.5)

Five fixes uncovered by the 0.5.5 acceptance matrix run, ranging from a
HIGH-severity onnxruntime-node version mismatch that blocks Silero VAD
on macOS x86_64 to a misleading metric that makes healthy calls look
slow.

**Bug #1 (HIGH) — SileroVAD onnxruntime-node 1.24+ API drift**
* ``optionalDependencies.onnxruntime-node`` tightened from ``^1.18.0`` to
  ``~1.18.0`` — the caret was resolving to 1.24.x where
  ``listSupportedBackends`` was removed and the prebuilt ``bin/`` layout
  drifted, so ``import('onnxruntime-node')`` failed on macOS x86_64.
* ``loadOnnxRuntime`` now classifies the underlying error
  (``missing`` / ``binding`` / ``api-drift`` / ``unknown``) and surfaces a
  targeted remedy plus the original error chain via ``Error.cause`` —
  previously the failure mode was hidden behind a single "could not be
  resolved" string.

**Bug #2 (MEDIUM) — ElevenLabsConvAI agent_id error message**
* The env-var fallback already worked but the error message did not say
  *where* to get an agent ID from (the dashboard, not the API key).
  Updated both Python and TypeScript constructors to point users at
  https://elevenlabs.io/app/conversational-ai and reiterate that the
  agent ID is per-deployed-agent.
* Python ``ConvAI.__post_init__`` now raises when ``agent_id`` is empty
  (was silently passing through) — TypeScript already did this. Parity.

**Bug #3 (MEDIUM) — ElevenLabs WS payment_required**
* New typed exception ``ElevenLabsPlanError`` (subclass of
  ``ElevenLabsTTSError``) raised when the WS endpoint returns
  ``payment_required``. Free / Starter plans now get a clear "upgrade
  or use the HTTP class (drop-in API)" message instead of an opaque
  ``ElevenLabsTTSError: ElevenLabs WS error: payment_required``.
* Detection is case-insensitive and matches both the exact server
  string and any ``payment_required`` substring.

**Bug #5 (MEDIUM) — barge-in fragile in pipeline mode without VAD**
* On tunnel + speakerphone setups the agent's own TTS leaks into the
  inbound mic feed, STT transcribes it, and the legacy
  "always forward + bargeInThresholdMs" heuristic fails to fire the
  cancel — the agent talks over the user.
* ``serve()`` now logs a one-shot warning at startup when
  ``agent.engine`` is undefined, ``agent.vad`` is undefined, and
  ``bargeInThresholdMs > 0``, recommending ``SileroVAD`` or
  ``bargeInThresholdMs: 0``. Both Python and TypeScript.

**Bug #6 (LOW) — pipeline ``total_ms`` misleading on long utterances**
* ``total_ms`` spans the user's entire utterance (including pauses)
  because it includes ``stt_ms``, which itself measures STT-stream-open
  to transcript-finalisation. On a 4 s user turn ``total_ms`` reads
  ~5.5 s even though the agent's TTFA after end-of-speech is ~1.3–1.5
  s, which is misleading as a p95 / SLO metric.
* New ``LatencyBreakdown.agent_response_ms`` field (Python +
  TypeScript). Computed as ``endpoint_ms + llm_ttft_ms + tts_ms`` when
  all three signals are available, ``undefined`` / ``None`` otherwise.
  This is the user-perceived latency dashboards should track.
* ``total_ms`` kept unchanged for backward compatibility.

**Bug #7 (HIGH) — outbound TwiML races tunnel startup**
* The documented ``void phone.serve(...) → setTimeout → phone.call(...)``
  pattern reads ``localConfig.webhookUrl`` while the cloudflared
  hostname is still resolving, producing
  ``wss://undefined/...`` in the dial TwiML and a Twilio 11100 call
  drop on answer.
* New ``phone.tunnelReady`` Promise (TS) / ``phone.tunnel_ready``
  ``asyncio.Future`` (Python). Resolves to the public webhook hostname
  once ``serve()`` knows it (immediately for static webhookUrl,
  after ``startTunnel`` for ``tunnel: true``). Rejects if ``serve()``
  fails before the hostname is known.
* Documented pattern is now ``await phone.tunnelReady`` instead of
  ``setTimeout(10_000)`` — deterministic, no race.
* Same root-cause fix likely also addresses Bug #4 (intermittent WS
  upgrade race) which the acceptance run flagged as a related symptom.

Test totals after the fixes: Python 1064 PASS / 7 skip, TypeScript
1163 PASS / 67 files, cross-SDK chunker parity 53 PASS / 8 XFAIL / 0
FAIL on the 61-case fixture. No regressions.

* fix(bug-4): outbound WS upgrade race — encoded events + ready signal + diagnostics

Three layered fixes targeting the intermittent "outbound call connects
but never receives the WS upgrade" failure (Twilio 11100 on answer)
documented in BUGS.md.

**Root cause A — StatusCallbackEvent encoding**
Twilio expects ``StatusCallbackEvent`` as a multi-value parameter
(repeated keys), NOT a space-separated single value. The previous
``'initiated ringing answered completed'`` form triggered Twilio
notification 21626 ("invalid statusCallbackEvents") on every outbound
call and, on some ingestion paths, also broke the answer-handler
webhook, which is exactly the symptom that produced 11100.

* TypeScript: use ``params.append('StatusCallbackEvent', evt)`` four
  times so URLSearchParams emits repeated query keys.
* Python: pass the canonical twilio-python snake_case key
  ``status_callback_event`` as a list — twilio-python serialises it as
  the multi-value form Twilio expects.
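
The difference between the two wire encodings can be seen with the standard library alone. This is an illustration of form encoding, not the SDK's or twilio-python's actual request code, and the helper names are made up.

```python
from urllib.parse import urlencode

EVENTS = ["initiated", "ringing", "answered", "completed"]


def encode_repeated(events: list) -> str:
    # One key per event: the multi-value form.
    return urlencode([("StatusCallbackEvent", evt) for evt in events])


def encode_space_separated(events: list) -> str:
    # Single key holding all events: the form that was rejected.
    return urlencode({"StatusCallbackEvent": " ".join(events)})
```

``encode_repeated(EVENTS)`` yields four ``StatusCallbackEvent=...`` pairs joined by ``&``, whereas the space-separated variant yields one pair whose value is ``initiated+ringing+answered+completed``.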

**Root cause B — server-not-yet-listening race**
The previous ``phone.tunnelReady`` (TS) / ``phone.tunnel_ready`` (Py)
signal resolves as soon as the cloudflared hostname is known, BEFORE
the embedded HTTP / WS server has finished initialising. ``phone.call``
placed immediately afterwards races the Twilio Media Streams upgrade
and produces a half-ready route → 11100.

New ``phone.ready`` (TS Promise / Py Future) resolves only after:
1. Tunnel hostname known
2. Carrier auto-config complete
3. EmbeddedServer in ``listen`` state (TS) / uvicorn ``started`` flag
   set (Py)

Outbound pattern is now:

```ts
void phone.serve({ agent, tunnel: true });
await phone.ready;        // <-- safe for outbound
await phone.call(...);
```

``tunnelReady`` is kept as a separate signal for integrations that
only need the hostname (e.g. webhook registration), with a docstring
note pointing at ``ready`` for outbound use.
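
For the Python side, the ordering guarantee can be modelled with two futures. ``PhoneSketch`` is a toy stand-in with a placeholder hostname, not the SDK class; it only shows that ``ready`` resolves after all three steps while ``tunnel_ready`` resolves after the first.

```python
import asyncio
from typing import List


class PhoneSketch:
    def __init__(self) -> None:
        self.events: List[str] = []

    def bind(self) -> None:
        # Futures are created inside the running loop.
        loop = asyncio.get_running_loop()
        self.tunnel_ready = loop.create_future()
        self.ready = loop.create_future()

    async def serve(self) -> None:
        host = "example.invalid"  # placeholder hostname
        await asyncio.sleep(0)
        self.events.append("tunnel")
        self.tunnel_ready.set_result(host)  # hostname known
        await asyncio.sleep(0)
        self.events.append("carrier-config")
        await asyncio.sleep(0)
        self.events.append("listening")
        self.ready.set_result(host)  # only now safe for outbound calls

    async def call(self) -> None:
        self.events.append("call")


async def main() -> List[str]:
    phone = PhoneSketch()
    phone.bind()
    task = asyncio.create_task(phone.serve())
    await phone.ready
    await phone.call()
    await task
    return phone.events


EVENTS = asyncio.run(main())
```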

**Root cause C — opaque diagnostics**
On call drop the user could not tell whether Twilio rejected the dial,
the tunnel resolved late, or the WS upgrade failed. The new
``phone.call`` flow logs the Twilio notifications URL on every
outbound call ("check here if the call drops with no audio") so
self-diagnosis does not require learning the Twilio API.

**Test parity**
Updated ``test_twilio_statuscallback_always_registered`` to read the
new ``status_callback_event`` key (with fallback to the legacy
``StatusCallbackEvent`` for backward compatibility). Python 1064 PASS / 7
skip, TypeScript 1163 PASS / 67 files. No regressions.

* chore(docs): mintignore DEVLOG and superpowers/ to unblock Mintlify deployment

DEVLOG.md and superpowers/specs/2026-04-24-patter-feature-test-notebook-design.md fail Mintlify's MDX parser (filenames begin with digits, which MDX treats as JSX expressions). Skip both paths so the docs site can deploy.

* chore: drop DEVLOG/superpowers, fix CI failures

- Remove docs/DEVLOG.md and docs/superpowers/ (internal planning notes, no value to public docs site). The .mintignore introduced in the previous commit is no longer needed and is removed too.
- sdk-ts/src/client.ts: attach a no-op `.catch` to `_ready` and `_tunnelReady` so callers that never await them don't trigger Node's unhandled-rejection warning when serve() validates inputs synchronously. Awaiters of `phone.ready` / `phone.tunnelReady` still see the rejection.
- sdk-ts/package-lock.json: add trailing newline (end-of-file-fixer).
- examples/notebooks/**.ipynb: nbstripout pass — clear cell outputs and execution counts to match the repo convention enforced by .pre-commit-config.yaml.
nicolotognoni added a commit that referenced this pull request May 8, 2026
…wave (#83)

* chore: scrub competitor lineage + bug fixes + phone preamble

Repo-wide pass to remove external license headers, "ported from" notes
and competitor product names from source files, plus three runtime fixes
and one missing best-practice feature surfaced by the audit.

## Cleanup (zero residual livekit/pipecat/apache references)

- Removed Apache 2.0 header blocks from 12 Python + 12 TypeScript provider
  files (the headers travelled in from external ports; Patter ships under
  the root MIT LICENSE only — no per-file copyright notices).
- Stripped "Adapted from livekit-plugins-X" / "Ported from pipecat" /
  "Based on LiveKit Agents" provenance comments across ~40 source files
  in sdk-py/getpatter/{services,providers,observability,resources,evals}/
  and sdk-ts/src/{services,providers,observability}/, including the
  cartesia-stt USER_AGENT integration tag.
- Rewrote competitor framing in 12 docs MDX pages (provider docs,
  patter-tool, call-logging) — descriptions now stand on Patter's own
  shape, no migration-from-X language.
- Renamed test fixtures and variables that named LiveKit/Pipecat in
  sentence_chunker tests (Py + TS) and the parity scenario JSON;
  test logic preserved.
- Removed personal-name copyright in LICENSE / sdk-py/LICENSE /
  sdk-ts/LICENSE in favour of "Patter Contributors".
- .gitignore: ignore .ruff_cache/, sdk/ (legacy build dir from the
  pre-rename Python SDK), .agents/, skills-lock.json.

## Bug fixes

- llm_loop.py:420-421 (Python): the loop read the Anthropic-style keys
  cache_read_input_tokens / cache_creation_input_tokens, but every
  Python provider emits cache_read_tokens / cache_write_tokens. The fix
  reads the keys the providers actually emit, so OpenAI / Google
  cache attribution is no longer silently zeroed.
- llm-loop.ts:304-308 (TS): non-OK upstream HTTP responses were logged
  and silently swallowed; callers couldn't distinguish empty model
  output from API failure. Now throws PatterConnectionError with the
  status + truncated body.

## Performance

- text_transforms.py: precompiled the 14 markdown regex patterns and 2
  emoji-cleanup helpers as module-level constants — they previously
  recompiled on every sentence flush. Drop-in win, public API and
  37/37 existing tests unchanged.
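
The change amounts to hoisting ``re.compile`` calls to module scope. The three patterns below are illustrative stand-ins, not the 14 patterns the module actually ships.

```python
import re

# Compiled once at import time rather than on every sentence flush.
_BOLD = re.compile(r"\*\*(.+?)\*\*")
_HEADING = re.compile(r"^#{1,6}\s+", re.MULTILINE)
_BULLET = re.compile(r"^\s*[-*]\s+", re.MULTILINE)


def strip_markdown(text: str) -> str:
    text = _BOLD.sub(r"\1", text)   # **bold** -> bold
    text = _HEADING.sub("", text)   # drop leading #, ##, ...
    text = _BULLET.sub("", text)    # drop list markers
    return text
```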

## Feature: default phone-call preamble

- New Agent.disable_phone_preamble (Py) / disablePhonePreamble (TS)
  field, default False. When False, LLMLoop prepends a short
  spoken-language preamble to system_prompt instructing the model to
  avoid markdown / emojis / bullet lists and keep replies concise.
- Wired through stream_handler and test_mode in both SDKs.
- Adds two Py tests and one TS test covering the new behaviour.

## Test status

- Python: 1466 passed, 8 skipped
- TypeScript: 1164/1164 passed

* chore(env): add per-SDK .env.example, drop obsolete cloud variant

- sdk-py/.env.example, sdk-ts/.env.example: full inventory of every
  env var the SDK reads at runtime, grouped by role (telephony, LLM
  providers, STT, TTS, tracing, Patter tunables). Only OPENAI_API_KEY
  + a telephony carrier is required; the rest are uncommented as the
  user enables specific provider integrations.
- .env.example.cloud removed — variables (PATTER_DATABASE_URL,
  PATTER_ENCRYPTION_KEY, PATTER_REDIS_URL, etc.) belonged to the
  hosted cloud surface that was retired in 0.5.3.
- Root .env.example kept as a minimal quickstart sample.

* refactor(pricing): introduce PricingUnit enum

Replace the magic strings ``"minute"`` / ``"1k_chars"`` / ``"token"``
sprinkled across DEFAULT_PRICING with a named enum, so the pricing
table reads as a typed shape rather than free-form dicts.

- Python: ``PricingUnit(StrEnum)`` — ``MINUTE``, ``THOUSAND_CHARS``,
  ``TOKEN``. Subclassing ``str`` keeps the dict JSON-serialisable and
  unchanged for any consumer that compares against the literal string.
- TypeScript: ``PricingUnit`` const object + ``PricingUnitValue`` union
  type. ``ProviderPricing.unit`` accepts ``PricingUnitValue | string``
  so user overrides loaded from JSON / env config still flow through
  ``mergePricing`` without type gymnastics.
- Behaviour preserved end-to-end: 143 Python pricing/metrics tests pass,
  18 TypeScript pricing tests pass, full suites 1466 Py / 1164 TS green.
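
A sketch of the Python enum, with member values taken from the magic strings listed above. The fallback class is an addition of this sketch so it also runs on interpreters older than 3.11.

```python
import json
from enum import Enum

try:
    from enum import StrEnum  # Python 3.11+
except ImportError:
    class StrEnum(str, Enum):
        """Minimal stand-in for older interpreters (sketch only)."""


class PricingUnit(StrEnum):
    MINUTE = "minute"
    THOUSAND_CHARS = "1k_chars"
    TOKEN = "token"


# Members still compare equal to the literal strings and serialise as such.
ENCODED = json.dumps({"unit": PricingUnit.THOUSAND_CHARS})
```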

* refactor: reorganize as libraries/{python,typescript}; drop in-repo examples

mcp-use-style monorepo layout: each SDK gets its own library folder with
README, CLAUDE.md, .env.example, tests/, and the package source. Sample
code is maintained in separate example repos and is no longer tracked
here (notebooks tutorial preserved — it's the documentation, not an
example).

## Layout

```
libraries/
├── python/         (was sdk-py/)
│   ├── README.md, CLAUDE.md, LICENSE, .env.example
│   ├── pyproject.toml, pytest.ini
│   ├── getpatter/
│   └── tests/
└── typescript/     (was sdk-ts/)
    ├── README.md, CLAUDE.md, LICENSE, .env.example
    ├── package.json, tsconfig.json, vitest.config.ts, tsup.config.ts
    ├── src/
    └── tests/
```

## What changed

- 405 ``git mv`` renames so history follows every file. ``sdk-py/`` and
  ``sdk-ts/`` no longer exist on disk.
- Per-library CLAUDE.md guides (~40 lines each); .gitignore exception
  ``!libraries/*/CLAUDE.md`` so the library guides ARE tracked while the
  root guide stays ignored.
- CI: ``.github/workflows/{audit,release,test,docs-feature-drift}.yml``
  rewritten to the new paths. ``scripts/check_feature_docs_drift.py``
  also fixed (it had a stale ``patter/__init__.py`` from the pre-rename
  era).
- Pre-commit, pre-push, ``scripts/pr-validate.sh``, top-level README and
  CONTRIBUTING.md re-pointed at ``libraries/{python,typescript}``.
- Internal package re-organisation (``handlers → telephony``, splitting
  ``audio/``, ``tools/``) deliberately deferred to a follow-up PR — that
  layer of import-path churn doesn't belong in the same commit as the
  outer rename.

## Examples

``examples/{developer,enterprise,startup,integrations}/`` removed (24
files + the index README). Sample code is published in dedicated repos.
``examples/notebooks/`` kept — it's the 24-notebook tutorial series
documented in the Mintlify site and depended on by
``.github/workflows/notebooks.yml`` and ``.pre-commit-config.yaml``.

PatterTool docs now point at the external example repo (a TODO comment
marks the canonical URL, to be filled in once the examples repo is
public).

## Test status

- Python: 1413 passed, 6 skipped (pytest libraries/python/tests)
- TypeScript: 1164 passed, 67 files (vitest run libraries/typescript)
- TypeScript: ``tsc --noEmit`` clean (one pre-existing
  ``@ts-expect-error`` in silero-vad — predates this branch)

* refactor(types,providers): enum-ify config + tighten Agent.provider type

Wave 2 of the cleanup pass — covers half of the provider integrations.
Replaces hardcoded model/voice/format/sample-rate strings with typed
enums (Python ``StrEnum`` / ``IntEnum``, TypeScript ``const`` objects +
union types) so user code gets autocomplete and the type system catches
typos at the call site instead of at the provider's HTTP 400.
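
A minimal sketch of why a ``StrEnum`` member can stand in for a raw
string. ``DemoModel`` and its values are hypothetical, not the SDK's
actual enums, and the fallback class only exists so the snippet also
runs on Python older than 3.11. The check shown here is a runtime one;
the static benefit described above comes from the type checker.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # older interpreters: same equality semantics via the str mix-in
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class DemoModel(StrEnum):
    # Hypothetical members, for illustration only.
    FAST = "demo-fast"
    LARGE = "demo-large"


def build_request(model):
    # DemoModel(...) accepts a member or its raw string value and raises
    # ValueError for anything else, so a typo fails before any HTTP call.
    return {"model": DemoModel(model).value}
```

Both ``build_request(DemoModel.FAST)`` and ``build_request("demo-fast")``
produce the same payload, which is the backward-compatibility property
the enum migration relies on.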

## Agent / public types

- ``Agent.provider`` (Python) tightened from ``str`` to a
  ``ProviderMode = Literal["openai_realtime", "elevenlabs_convai",
  "pipeline"]`` alias. TS counterpart was already a string union.
- Expanded ``Agent`` (Py) and ``AgentOptions`` (TS) docstrings to
  document the precedence rule for fields that appear both on the agent
  and on the engine adapter (``voice``, ``model``, ``language``):
  explicit kwarg on ``agent()`` wins; otherwise the engine value
  populates the agent via ``_unpack_engine``; otherwise the default.
- No behaviour change. ``StrEnum`` subclasses ``str``; existing callers
  passing raw strings keep working.
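
The precedence rule can be sketched as a three-step fallback. This is
an illustration only: ``resolve_field`` is a hypothetical helper, not
the SDK's ``_unpack_engine``, and it assumes ``None`` means "not
provided".

```python
from typing import Optional


def resolve_field(explicit: Optional[str], engine_value: Optional[str], default: str) -> str:
    """Explicit kwarg wins, then the engine's value, then the default."""
    if explicit is not None:
        return explicit
    if engine_value is not None:
        return engine_value
    return default
```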

## Providers covered

Python: ``anthropic_llm``, ``cartesia_tts``, ``cerebras_llm``,
``deepgram_stt``, ``elevenlabs_tts``, ``google_llm``, ``groq_llm``,
``lmnt_tts``, ``openai_realtime``, ``rime_tts``.

TypeScript: ``anthropic-llm``, ``cerebras-llm``, ``deepgram-stt``,
``elevenlabs-tts``, ``google-llm``, ``groq-llm``, ``lmnt-tts``,
``openai-realtime``, ``rime-tts``.

Each module now exposes its own ``<Provider>Model`` /
``<Provider>OutputFormat`` / ``<Provider>Voice`` / etc. New enums are
re-exported from ``__init__.py`` and ``index.ts`` in dedicated
"provider-specific enums" sections.

## Still pending

The following providers still hold magic strings — covered in a
follow-up commit:
``assemblyai_stt``, ``soniox_stt``, ``speechmatics_stt``,
``cartesia_stt``, ``telnyx_stt``, ``whisper_stt``,
``elevenlabs_ws_tts``, ``openai_tts``, ``telnyx_tts``, ``gemini_live``,
``ultravox_realtime``, ``silero_vad``, ``silero_onnx``, ``krisp_*``.
The TS ``cartesia-tts.ts`` enums also still need to land (Py is
already covered).

## Test status

- Python: 1466 passed, 8 skipped
- TypeScript: 1164/1164 passed; ``tsc --noEmit`` clean (one pre-existing
  silero-vad warning unchanged)

* refactor(providers,server): finalize provider enums + bug-fix wave

Provider enum residuals (Wave 2.5)
- Python: assemblyai_stt, cartesia_stt, soniox_stt, speechmatics_stt,
  telnyx_stt, whisper_stt, elevenlabs_ws_tts, openai_tts, telnyx_tts,
  gemini_live, ultravox_realtime, silero_vad, silero_onnx, krisp_*
- TypeScript: assemblyai-stt, cartesia-stt, cartesia-tts, soniox-stt
- All hardcoded model/voice/format strings now live behind StrEnum/IntEnum
  (Python) or const-object + value union (TypeScript)

Bug fixes (Wave 3a)
- stream_handler: barge-in now sets asyncio.Event / AbortController to
  cancel in-flight LLM stream instead of letting it run to completion
- server (Py): SSRF validator on outbound webhook URLs + per-IP WS cap
  (MAX_WS_PER_IP=10) for parity with TS
- server (Py): voicemail POST gets explicit 10s timeout
- metrics (Py): agent_response_ms accepts 0.0 instead of treating it as
  "missing" (the gate is now "is None", not truthiness)
- metrics (TS): emit llm/stt/tts TTFB events on the event bus
- observability/event_bus (Py): listener errors now surface to logger
  instead of being swallowed
- server (TS): queryTelephonyCost catch logs instead of silent return
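
A small sketch of the agent_response_ms fix, assuming the old code used
a truthiness check. Both function names are hypothetical; only the
"is None" gate comes from the note above.

```python
def build_metrics_truthy(agent_response_ms):
    payload = {}
    if agent_response_ms:  # bug: 0.0 is falsy, so a real reading is dropped
        payload["agent_response_ms"] = agent_response_ms
    return payload


def build_metrics(agent_response_ms):
    payload = {}
    if agent_response_ms is not None:  # only None means "missing"
        payload["agent_response_ms"] = agent_response_ms
    return payload
```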

* feat(errors): add ErrorCode enum to exception taxonomy

Stable, machine-readable error codes attached to every Patter exception
class. Existing class-name-based catches keep working; the enum is
additive.

ErrorCode values (10): CONFIG, CONNECTION, AUTH, TIMEOUT, RATE_LIMIT,
WEBHOOK_VERIFICATION, INPUT_VALIDATION, PROVIDER_ERROR, PROVISION,
INTERNAL.

- Python: StrEnum on `exceptions.py`; class-default `code` attribute
  per subclass (PatterError → INTERNAL, PatterConnectionError →
  CONNECTION, AuthenticationError → AUTH, ProvisionError → PROVISION,
  RateLimitError → RATE_LIMIT). Optional `code=` kwarg on the base
  ctor lets callers override per-instance.
- TypeScript: const-object + value union in `errors.ts`; `readonly
  code: ErrorCode` on every class; optional `{ code }` constructor
  option. Same class→code mapping byte-for-byte with Python.
- Both SDKs re-export `ErrorCode` from the package root.
- Test parity asserts the enum value sets match between SDKs.
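
A minimal sketch of the class-default plus per-instance override
pattern described above. The string values of the enum members are an
assumption (the commit lists only the names), and only two of the
exception classes are shown.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # fallback so the sketch runs on older interpreters
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class ErrorCode(StrEnum):
    # Member names come from the commit; the values are assumed.
    CONFIG = "CONFIG"
    CONNECTION = "CONNECTION"
    AUTH = "AUTH"
    TIMEOUT = "TIMEOUT"
    RATE_LIMIT = "RATE_LIMIT"
    WEBHOOK_VERIFICATION = "WEBHOOK_VERIFICATION"
    INPUT_VALIDATION = "INPUT_VALIDATION"
    PROVIDER_ERROR = "PROVIDER_ERROR"
    PROVISION = "PROVISION"
    INTERNAL = "INTERNAL"


class PatterError(Exception):
    code = ErrorCode.INTERNAL  # class-level default

    def __init__(self, message="", *, code=None):
        super().__init__(message)
        if code is not None:
            self.code = code  # per-instance override


class RateLimitError(PatterError):
    code = ErrorCode.RATE_LIMIT  # subclass default
```

Callers that already catch by class keep working; callers that prefer
a stable identifier can branch on ``err.code`` instead.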

* feat(errors): wire ErrorCode enum into exceptions module + package roots

Companion to 8b8c503 (test files). Ships the actual enum + class wiring:

- libraries/python/getpatter/exceptions.py — ErrorCode StrEnum, default
  .code per subclass, optional code= kwarg on PatterError.__init__
- libraries/python/getpatter/__init__.py — re-export ErrorCode
- libraries/typescript/src/errors.ts — ErrorCode const-object + value
  union, readonly code on every class, optional { code } ctor option
- libraries/typescript/src/index.ts — re-export ErrorCode

* perf(elevenlabs-ws-tts): auto-flip output_format=ulaw_8000 when paired with Twilio

ElevenLabs WS TTS streams `ulaw_8000` natively. When the carrier is
Twilio (mulaw 8 kHz), we can let ElevenLabs do the encoding server-side
and skip the SDK-side mulaw transcode entirely.

- ElevenLabsWebSocketTTS.set_telephony_carrier(carrier) / TS
  setTelephonyCarrier(carrier) — duck-typed hook called by the stream
  handler after TTS construction. Maps "twilio" → "ulaw_8000",
  "telnyx" → "pcm_16000" (least conversion work).
- output_format constructor arg becomes truly optional (sentinel) —
  user-passed format wins over the carrier hint.
- for_twilio / for_telnyx factories already pass explicit formats →
  the carrier hint is a no-op for those callers.
- 7 new unit cases per SDK in TestCarrierAutoFlip / equivalent: default
  flip, URL contains ulaw_8000, telnyx no-op, explicit format respected,
  factory wins, unknown carrier no-op.

No public-API break — existing constructor calls behave identically
when no carrier hook is wired up.
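
A sketch of the sentinel approach, under stated assumptions:
`DemoWebSocketTTS` is a hypothetical stand-in for the real adapter, and
the `pcm_16000` default is assumed (it is consistent with the "telnyx
no-op" test case but not confirmed here).

```python
_UNSET = object()  # sentinel: distinguishes "not passed" from any real value

CARRIER_FORMATS = {"twilio": "ulaw_8000", "telnyx": "pcm_16000"}


class DemoWebSocketTTS:
    """Hypothetical stand-in for a TTS adapter with a carrier hint."""

    DEFAULT_FORMAT = "pcm_16000"  # assumed default

    def __init__(self, output_format=_UNSET):
        self._explicit = output_format is not _UNSET
        self.output_format = output_format if self._explicit else self.DEFAULT_FORMAT

    def set_telephony_carrier(self, carrier: str) -> None:
        if self._explicit:
            return  # a user-passed format always wins over the hint
        # Unknown carriers leave the current format untouched.
        self.output_format = CARRIER_FORMATS.get(carrier, self.output_format)
```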

* perf(openai-tts): add direct 24k→8k resample path (opt-in via target_sample_rate)

OpenAI TTS streams 24 kHz audio. The default 24k→16k resample stays for
the Telnyx (PCM 16 kHz) carrier; for Twilio (mulaw 8 kHz) the chained
24→16 + 16→8 used to cost two ratecv passes. New `target_sample_rate=8000`
constructor opt-in collapses the two passes into a single 3:1 decimation
with a tighter LPF (Nyquist ≈ 4 kHz).

- Python: getpatter.services.transcoding.create_resampler_24k_to_8k()
  factory; OpenAITTS gains optional `target_sample_rate=16000` (default
  preserves existing behaviour).
- TypeScript: createResampler24kTo8k() + 24000→8000 case in
  StatefulResampler; OpenAITTS gains optional positional
  `targetSampleRate=16000` with `LPF_ALPHA_8K=0.45` for proper
  anti-aliasing at 4 kHz Nyquist.

Auto-engagement on Twilio carriers is deferred — the audio sender
currently assumes 16 kHz PCM input. Manual opt-in keeps the change
narrowly scoped.
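
For illustration, a single-pass 3:1 decimation could look like the
sketch below. The one-pole filter form is an assumption (only the
`0.45` coefficient appears above, on the TS side), and the function is
stateless, unlike a streaming resampler. A one-pole filter is a weak
anti-aliasing stage; treat this as a shape of the idea, not the SDK's
filter.

```python
from typing import List


def resample_24k_to_8k_sketch(samples: List[int], alpha: float = 0.45) -> List[int]:
    """One-pole low-pass followed by 3:1 decimation (stateless sketch)."""
    out: List[int] = []
    state = float(samples[0]) if samples else 0.0
    for i, sample in enumerate(samples):
        # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        state += alpha * (sample - state)
        if i % 3 == 2:  # keep every third filtered sample
            out.append(int(round(state)))
    return out
```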

* feat(sentence-chunker): per-language honorifics + single-word flush

Bug #48 — per-language honorifics
- New HONORIFICS_{EN,IT,ES,DE,FR,PT} constants merged into HONORIFICS_ALL
  (sorted longest-first). Module-level HONORIFICS_REGEX_ALT alternation
  built once. The aggregation is union-of-all regardless of `language`
  (mixed-language deployments are common; safer default).
- splitSentences prefix regex sources from the union — sentences like
  "Ho incontrato il Sig. Rossi alla riunione" no longer split mid-honorific
  in any of the supported languages.

Bug #49 — single-word "Yes." never flushed
- DEFAULT_MIN_WORDS_FOR_SHORT_FLUSH lowered from 2 → 1; single-word
  replies ending in "."/"!"/"?" now flush on the terminator.
- New gate #6 in maybeShortFlush blocks flushes whose trailing word is a
  known honorific — prevents "Mr." / "Sig." escaping as a standalone
  sentence.
- Legacy escape hatch: pass `minWordsForShortFlush=2` to restore the
  pre-fix behaviour.

Tests: 22 Python + 21 TS new honorific cases; 12 + 12 single-word flush
cases. Existing tests updated where they asserted the old buffered
behaviour for single-word replies. Both suites green (Py 1538, TS 1224).
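
A rough sketch of how the single-word flush and the honorific gate
interact. `should_short_flush` and the honorific subset are
hypothetical; the real chunker has more gates than the three shown.

```python
# Illustrative subset only, not the SDK's per-language lists.
HONORIFICS = {"mr", "mrs", "dr", "prof", "sig", "sra"}


def should_short_flush(buffer: str, min_words: int = 1) -> bool:
    text = buffer.strip()
    if not text or text[-1] not in ".!?":
        return False  # no sentence terminator yet
    words = text.split()
    if len(words) < min_words:
        return False  # too short for the configured threshold
    last_word = words[-1].rstrip(".!?").lower()
    return last_word not in HONORIFICS  # never flush on a trailing honorific
```

With `min_words=2` the function reproduces the pre-fix behaviour in
which a lone "Yes." stays buffered.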

* docs(changelog,chore): unreleased section + tool_executor docstring + silero-vad lint

- CHANGELOG.md: comprehensive Unreleased section covering reorg,
  provider enums, error taxonomy, bug-fix wave, perf wins, and
  cleanup work landing on this branch.
- tool_executor.py: add module-level docstring describing the SSRF
  guard, response-size cap, and OTel span emission.
- silero-vad.ts:127: replace stale @ts-expect-error directive (now
  a TS2578 warning since onnxruntime-node types resolve at build) with
  a plain comment explaining the optional-peer-dep dynamic import.

* refactor(layout,py): split internal layout into telephony/, audio/, tools/

Internal restructure of the Python SDK; PUBLIC API surface unchanged.

- handlers/{twilio,telnyx,common}_handler.py → telephony/{twilio,telnyx,common}.py
  ("_handler" suffix dropped — the parent module name already conveys
  intent). stream_handler.py promoted out of handlers/ to package root
  since it's the per-call orchestrator, not a telephony adapter.
  handlers/ folder removed.
- services/{transcoding,pcm_mixer,background_audio}.py → audio/* (audio
  pipeline collected in one place).
- services/{tool_decorator,tool_executor}.py → tools/* (tool-decoration
  & webhook-execution kept together).
- Other services/* stay where they are: llm_loop, metrics,
  sentence_chunker, text_transforms, ivr, fallback_provider,
  pipeline_hooks, chat_context, call_log, remote_message.
- tts/ and stt/ namespaces kept — they expose
  getpatter.{tts,stt}.<provider>.{TTS,STT} with env-var auto-resolve
  and are public surface.
- File moves use git mv so blame/history follow.
- Imports rewritten across providers, server, services, tests, and
  package-root re-exports. Python tests: 1538 passed.

TS side ships in a separate commit.

* refactor(layout,ts): mirror Python telephony/, audio/, tools/ layout

TS internal restructure for parity with the Python d5d9391 commit.
Public API surface unchanged.

- carriers/{twilio,telnyx}.ts → telephony/{twilio,telnyx}.ts (rename for
  naming parity with Py; "carrier" was the original term, "telephony"
  reads better next to twilio/telnyx).
- transcoding.ts → audio/transcoding.ts.
- services/background-audio.ts → audio/background-audio.ts.
- tool-decorator.ts → tools/tool-decorator.ts.
- Imports rewritten across client, index, types, stream-handler,
  deepfilternet-filter, plus 5 test files. TS tests: 1224 passed,
  tsc --noEmit clean.

The telephony/audio/tools triad now matches between Python and
TypeScript SDKs.

* docs(claude.md): reflect new telephony/audio/tools layout

Update per-library AI-agent quickstarts to match the post-restructure
package tree. Adds the new folder names (telephony/, audio/, tools/)
and a one-line description per folder.

* docs(changelog): document internal layout reorg in Unreleased section

* docs(getpatter): fill missing docstrings in services/llm/tts/stt/observability/dashboard/top-level

* docs(getpatter): fill missing docstrings in providers/telephony/audio/tools

Adds 1-3 line docstrings to public symbols (modules, classes, methods)
in libraries/python/getpatter/{providers,telephony,audio,tools} that
previously had none. No behaviour changes; pre-existing docstrings are
left untouched.

* docs(getpatter-ts): fill missing JSDoc in providers/telephony/audio/tools

* docs(getpatter-ts): fill missing JSDoc in services/llm/tts/stt/observability/dashboard/top-level

Adds short JSDoc summaries to public classes, interfaces, type aliases, and
exported functions that were missing them. Existing JSDoc is preserved
verbatim — this is a fill-the-gaps pass only, no rewrites.

* docs(changelog): note docstring/JSDoc sweep in Unreleased

* chore(comments): update sdk-py/sdk-ts path refs after libraries/ reorg

Mechanical replace of stale path strings in docstrings, comments, and
.env.example headers:
- sdk-ts/src/* → libraries/typescript/src/*
- sdk-py/getpatter/* → libraries/python/getpatter/*
- conceptual "(sdk-py)" → "the Python SDK"

No behaviour change; tests still 1538 passed, tsc clean.

* chore(ts): remove playwright e2e tests + devDeps

The e2e Playwright suite (tests/e2e/*.spec.ts + playwright.config.ts +
@playwright/test / playwright devDeps) was inherited from an earlier
"comprehensive test suite" PR but never integrated with downstream
test infra after the libraries/ reorg. Per CLAUDE.md, end-to-end call
testing lives in a separate downstream test repo.

- Drop libraries/typescript/playwright.config.ts.
- Drop libraries/typescript/tests/e2e/ (6 .spec.ts + test-server.ts).
- Remove @playwright/test and playwright from package.json devDeps.
- Refresh package-lock.json (npm install).
- silero-vad.ts: switch back to @ts-ignore on the optional
  onnxruntime-node import — the dynamic-import line surfaces a TS7016
  warning when types are unresolved post-lock-refresh.

* feat(parity): port DefaultToolExecutor, LLMChunk, builtin_clip_path, select_sound_from_list, resample_24k_to_16k from TypeScript SDK

Closes 5 public-surface gaps in the Python SDK so every symbol exported
from libraries/typescript/src/index.ts now has a Python counterpart.

- ``DefaultToolExecutor`` — async tool dispatcher with retry/fallback,
  webhook SSRF validation via ``server.validate_webhook_url``, and the
  same JSON error shape as the TypeScript class. Added to
  ``services/llm_loop.py``.
- ``LLMChunk`` — frozen dataclass mirror of the TS ``LLMChunk``
  interface (text/tool_call/done/usage). ``to_dict()`` produces the same
  shape as ``OpenAILLMProvider.stream`` for callers that prefer dicts.
- ``builtin_clip_path`` — top-level helper resolving ``BuiltinAudioClip``
  values (or raw filenames) to absolute paths. ``BuiltinAudioClip.path``
  now delegates to the new function for a single source of truth.
- ``select_sound_from_list`` — promoted from a private static method on
  ``BackgroundAudioPlayer`` to a public top-level function. The static
  method is preserved as a backward-compatible delegator.
- ``resample_24k_to_16k`` — stateless one-shot helper following the
  existing ``resample_8k_to_16k`` / ``resample_16k_to_8k`` convention,
  including the per-process ``DeprecationWarning`` latch.

All five symbols are re-exported from ``getpatter.__init__`` and listed
in ``__all__``. The five ``TODO(parity)`` markers are removed in the
same commit.

25 unit tests added in ``tests/unit/test_parity_ports.py`` covering
public-symbol reachability, ``LLMChunk`` round-trip, real
handler/webhook dispatch through ``DefaultToolExecutor`` (including the
SSRF-blocked branch), bundled clip resolution, weighted selection
empirics, and equivalence of ``resample_24k_to_16k`` to a single-shot
``StatefulResampler``.

Tests: 1546 → 1571, all passing.

* docs: document ErrorCode enum and disable_phone_preamble

* docs: document PricingUnit, OpenAITTS targetSampleRate, WS carrier auto-detect, and LLM primitives

* docs: document SentenceChunker language, provider enums, and audio helpers

* fix(ci): defer silero_onnx imports + remove dead E2E workflow

CI failures on PR #83:

1. Python SDK Tests (3.11/3.13) and Security Tests fail with
   ModuleNotFoundError: No module named 'numpy'. Cause: a recent commit
   eagerly re-exported OnnxExecutionProvider and SileroOnnxSampleRate
   from getpatter/__init__.py, which transitively imports numpy via
   silero_onnx. Move the re-export into the existing __getattr__ lazy
   path (parallel with SileroVAD / KrispSampleRate) so importing
   getpatter no longer requires the silero extra.

2. E2E Tests job tries to run playwright on a folder we deleted in
   47a97f0. Drop the job — E2E lives in a downstream test repo.

Local verify: `python3 -c 'import getpatter'` works with numpy blocked
via meta_path; pytest still 1563 passed.
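
The lazy re-export pattern can be sketched with a module-level
`__getattr__` (PEP 562). Everything here is illustrative: `demo_pkg`
is a throwaway module and the stdlib names stand in for the optional
silero symbols.

```python
import importlib
import types

# Exported name -> module that defines it (stdlib stand-ins for optional extras).
_LAZY_EXPORTS = {"Fraction": "fractions", "Decimal": "decimal"}


def _lazy_getattr(name: str):
    """PEP 562 hook: runs only when normal attribute lookup fails."""
    try:
        module_name = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}") from None
    # The import happens here, on first access, not at package import time.
    return getattr(importlib.import_module(module_name), name)


demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = _lazy_getattr
```

Accessing `demo_pkg.Fraction` triggers the import on demand, while
creating `demo_pkg` itself never touches the optional dependency.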

* release: 0.6.0

Bump 0.5.5 → 0.6.0 for the cleanup + restructure release. Minor bump
because of two breaking changes:

1. Agent.provider is now a closed enum (ProviderMode); arbitrary strings
   are rejected.
2. Internal import paths reorganised — callers reaching into
   getpatter.handlers.* / getpatter.services.{transcoding,pcm_mixer,
   background_audio,tool_decorator,tool_executor} must migrate to the
   new telephony/, audio/, tools/ folders. Public top-level imports
   (`from getpatter import Patter`) are unchanged.

Migration guidance and the full surface diff live in CHANGELOG.md
under the 0.6.0 section.

- libraries/python/pyproject.toml: 0.5.5 → 0.6.0
- libraries/python/getpatter/__init__.py: __version__ = "0.6.0"
- libraries/typescript/package.json: 0.5.5 → 0.6.0
- CHANGELOG.md: rename Unreleased section to 0.6.0 (2026-05-03)

* fix(silero-vad): align defaults with upstream Silero VAD

Three SileroVAD.load defaults were tuned for telephony in an earlier
pass, but they diverged from the upstream Silero defaults documented
in snakers4/silero-vad utils_vad.py. The upstream values are conservative
and well-vetted; align with them so callers who follow the official
Silero docs get consistent behaviour.

  | param                  | old (telephony-tuned) | upstream Silero | new      |
  |------------------------|-----------------------|-----------------|----------|
  | min_speech_duration    | 0.05 s (50 ms)        | 250 ms          | 0.25 s   |
  | min_silence_duration   | 0.55 s (550 ms)       | 100 ms          | 0.10 s   |
  | prefix_padding         | 0.5 s  (500 ms)       | 30 ms           | 0.03 s   |

Activation threshold (0.5) and the derived deactivation threshold
(max(t-0.15, 0.01) = 0.35 for t = 0.5) already matched upstream and stay unchanged.

Both SDKs match byte-for-byte; no test references the prior literals.
Tests: Py 1563 passed, TS 1224 passed, tsc --noEmit clean.

Callers that previously relied on the telephony-tuned defaults can
restore them explicitly via `SileroVAD.load(min_speech_duration=0.05,
min_silence_duration=0.55, prefix_padding_duration=0.5)` (Py) or the
`minSpeechDuration: 0.05, minSilenceDuration: 0.55, prefixPaddingDuration: 0.5`
options (TS).

* feat(silero-vad): auto-VAD in pipeline mode + forPhoneCall preset + robust ONNX path

Three changes that make SileroVAD usable out of the box. Closes the
"have to recreate VAD every time" complaint and removes the
createRequire(import.meta.url).resolve("getpatter") workaround callers
needed under bundlers that break import.meta.url.

1. Auto-VAD in pipeline mode
   - PipelineStreamHandler auto-loads SileroVAD.for_phone_call (Py) /
     SileroVAD.forPhoneCall (TS) when agent.vad is not provided.
   - Falls back silently to the STT-endpoint heuristic when the silero
     extra / onnxruntime-node is not installed.
   - No opt-out flag — auto-VAD is a strict upgrade over the heuristic
     when the optional dep is available.

2. forPhoneCall / for_phone_call factory
   - Convenience wrapper around load(...) that pre-applies the
     telephony preset (sample_rate=16000, min_silence_duration=1.0).
   - Pass overrides for noisy environments (e.g. minSilenceDuration=1.5
     for tunnel + speakerphone echo).

3. Robust ONNX model resolution (TS)
   - silero-vad.ts now probes 4 anchor candidates (__dirname,
     import.meta.url, createRequire(import.meta.url).resolve("getpatter/
     package.json"), createRequire(cwd).resolve) crossed with 3 path
     shapes (<dir>/resources/, <dir>/../resources/, <dir>/dist/
     resources/). Mirrors the user-side workaround directly inside the
     SDK so callers stop needing it.

Tests: Py 1563 passed, TS 1224 passed, tsc --noEmit clean.

* fix(tunnel): block phone.ready until tunnel hostname is publicly resolvable

Outbound calls placed immediately after `phone.ready` would race the
public DNS edge: cloudflared returns the trycloudflare.com URL the
moment its control plane has issued it, but the edge can take several
seconds to start serving the hostname. Twilio (and any webhook caller)
gets HTTP 502 "Unknown host" and the call is torn down before
ever reaching the WS media stream.

`phone.ready` now blocks until:
1. Embedded server is in `listen` state (existing behaviour)
2. Tunnel hostname resolves through public DNS (1.1.1.1 / 8.8.8.8)
3. 2.5 s grace window passes for the cloudflared origin bridge

DNS resolution bypasses the OS resolver to dodge macOS mDNSResponder's
aggressive NXDOMAIN cache. Why DNS-only and not full HTTP probing:
trycloudflare quick tunnels frequently fail same-host loopback (NAT
hairpinning / IPv4 vs IPv6 race) even when the URL is reachable from
external hosts. Twilio's edge resolves via public DNS, so DNS resolution
is the correct proxy for "Twilio can reach us".

- Python: raw UDP DNS query parser (~30 lines, no new dependency).
  Smoke-test verified against cloudflare.com (returns IP) and
  non-existent host (returns None). Fallback 1.1.1.1 → 8.8.8.8.
- TypeScript: Node's `dns.Resolver` with custom servers (3 lines, c-ares
  built-in already bypasses OS cache).
- Static / explicit-webhookUrl callers skip the probe (the operator
  already knows the host is up).
- 30 s total timeout with exponential backoff (250 ms → 2 s capped).
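
The retry loop can be sketched as below. The doubling factor is an
assumption (the commit only says "exponential"), and `resolve` is an
injected callable so the sketch stays free of network access; it is
not the SDK's DNS code.

```python
import time
from typing import Callable, Iterator, Optional


def backoff_delays(initial: float = 0.25, cap: float = 2.0) -> Iterator[float]:
    """Yield 0.25, 0.5, 1.0, 2.0, 2.0, ... (doubling, capped)."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)


def wait_until_resolvable(
    resolve: Callable[[str], Optional[str]],
    hostname: str,
    budget_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the first resolved address, or None once the budget is spent."""
    deadline = clock() + budget_s
    for delay in backoff_delays():
        address = resolve(hostname)
        if address is not None:
            return address
        if clock() + delay > deadline:
            return None  # the next sleep would overrun the total budget
        sleep(delay)
    return None  # unreachable: backoff_delays() is infinite
```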

Cleanup pass on the surrounding docstrings: aligned the Py rationale
with the TS one (the previous Py docstring incorrectly claimed an
HTTP /health probe). Both now explain DNS-only + grace as a single
coherent design choice.

Tests: Py 1563 passed, TS 1224 passed, tsc --noEmit clean.

* fix: pipeline-mode hardening — assemblyai handshake race, barge-in trailing chunk, tunnel resolver budget, alarm fatigue cleanup

Five small but customer-visible fixes uncovered while running outbound
pipeline tests on this branch.

1. AssemblyAISTT.sendAudio: silently drop audio on WS not OPEN
   - Mirrors Deepgram / Cartesia / Soniox / Telnyx parity. Twilio starts
     streaming media frames immediately on call connect; the first
     ~10–25 frames (200–500 ms) race the AssemblyAI WS handshake. The
     previous `throw` propagated out of `handleAudio` and killed the
     call. Lost frames during connect are acceptable — the user is
     still saying "Hello" — and the connect path retries on close.
   - Architectural race in server.ts (handleAudio fires concurrently
     with handleCallStart) is flagged out-of-scope; this fix is the
     symptom-level guard.

2. afterSynthesize hook + barge-in race
   - Both Py and TS pipelines re-check `isSpeaking` after the
     `afterSynthesize` hook returns. The hook's await yields long
     enough for the VAD path to fire BARGE-IN and flip `isSpeaking` to
     false; without the re-check, exactly one trailing TTS chunk
     (~20–100 ms of audio) raced past the cancel and prolonged the
     perceived "agent didn't stop" window.

3. SileroVAD.for_phone_call / forPhoneCall — match upstream defaults
   - Stop pinning min_silence_duration=1.0s. The factory now only pins
     sample_rate=16000 (the only rate Patter's pipeline-mode bus uses);
     every other parameter mirrors snakers4/silero-vad upstream
     defaults. Docstring documents the override path
     (min_silence_duration=0.5–1.0 s) for deployments that experience
     truncation on natural pauses.

4. Tunnel reachability — c-ares budget fix + 60s ceiling
   - `Resolver({ timeout: 1500, tries: 1 })` overrides c-ares's default
     5000 ms × 4 = up to 20 s per resolve4 call so the outer retry
     loop actually retries. Without this each NXDOMAIN burned 5–20 s
     of wall-clock and the budget ran out after 1–2 attempts.
   - Total budget raised 30 s → 60 s for slow Cloudflare propagation.

5. Stale "Pipeline mode without VAD" warning removed (Py + TS)
   - The warning fired even now that auto-VAD lands a working
     SileroVAD when onnxruntime is installed. Operators saw a scary
     warning AND a successful auto-VAD log on every call —
     alarm-fatigue territory. The auto-VAD path already logs a single
     accurate message in the rare case the load fails, so the
     server-level warning is pure noise.
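
Fix 2 boils down to re-checking state after an await. A minimal asyncio
sketch, with a hypothetical `DemoPipeline` class and snake_case names
standing in for the real handlers:

```python
import asyncio


class DemoPipeline:
    def __init__(self, after_synthesize):
        self.is_speaking = True
        self.sent = []
        self._after_synthesize = after_synthesize

    async def emit_chunk(self, chunk: bytes) -> None:
        if not self.is_speaking:
            return
        # The hook may yield to the event loop, letting a barge-in run.
        chunk = await self._after_synthesize(chunk)
        if not self.is_speaking:  # re-check: state may have flipped during the await
            return
        self.sent.append(chunk)
```

Without the second check, a chunk that entered `emit_chunk` just
before the interruption would still be sent.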

Tests: Py 1563 passed, TS 1224 passed, tsc --noEmit clean.

* fix(llm-loop): cancel in-flight LLM stream on barge-in (architectural)

Barge-in used to set llmAbort/llm_cancel_event but the upstream
provider fetch was never aborted — the loop only checked
``signal.aborted`` between tokens. With no token arriving (fetch
blocked on the network response), the loop sat blocked until the
provider's own 30 s timeout.

Symptom: after the user interrupted the agent, the next utterance
was queued but never processed because ``transcriptProcessing`` /
the equivalent Py guard stayed true until the original LLM fetch
timed out — agent stayed silent up to 30 s.

This commit plumbs the per-turn cancel signal end-to-end:

TypeScript
- New ``LLMStreamOptions { signal?: AbortSignal }`` shape.
- ``LLMProvider.stream`` accepts ``opts?: LLMStreamOptions``;
  ``LLMLoop.run`` accepts ``opts`` and forwards it.
- Built-in ``OpenAILLMProvider`` and the four standalone providers
  (cerebras, anthropic, groq, google) now combine ``opts.signal``
  with their existing 30 s timeout via ``AbortSignal.any([...])`` so
  a barge-in tears the fetch down immediately.
- ``runPipelineLlm`` passes ``{ signal: llmSignal }`` into
  ``llmLoop.run``.

Python
- ``LLMProvider.stream`` accepts ``cancel_event: asyncio.Event |
  None``; ``LLMLoop.run`` forwards it.
- Built-in ``OpenAILLMProvider``, plus ``cerebras_llm``,
  ``anthropic_llm``, ``google_llm`` now check the event between
  upstream chunks and short-circuit (``await response.close()`` /
  break out of ``async with`` / ``return``).
- ``PipelineStreamHandler`` passes ``self._llm_cancel_event`` into
  ``llm_loop.run``.

Backward compat: every parameter is keyword-only with a default of
None; existing callers keep working. Test mocks updated with
``**_kwargs`` so they accept the new kwarg without rewrites.
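
A sketch of the Python-side pattern, assuming a provider that checks
the event between upstream chunks. ``fake_upstream`` and ``stream`` are
hypothetical stand-ins, not the SDK's provider classes.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def fake_upstream(tokens: List[str]) -> AsyncIterator[str]:
    for token in tokens:
        await asyncio.sleep(0)  # simulate a network boundary
        yield token


async def stream(tokens: List[str], *, cancel_event: Optional[asyncio.Event] = None):
    async for token in fake_upstream(tokens):
        if cancel_event is not None and cancel_event.is_set():
            return  # short-circuit: stop consuming the upstream response
        yield token


async def demo() -> List[str]:
    cancel = asyncio.Event()
    received: List[str] = []
    async for token in stream(["a", "b", "c", "d"], cancel_event=cancel):
        received.append(token)
        if len(received) == 2:
            cancel.set()  # simulated barge-in
    return received
```

Because ``cancel_event`` is keyword-only with a ``None`` default,
callers that never pass it see the full stream as before.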

Tests: Py 1563 passed (10 mock signatures patched), TS 1224 passed,
tsc --noEmit clean.

* fix(llm-loop,ts): mergeAbortSignals polyfill — Node 18 compat

The previous commit's signal-merge used AbortSignal.any() unconditionally,
but that API only landed in Node 20.3. Patter's engines.node says
">=18.0.0" so users on Node 18.x would have crashed with
"AbortSignal.any is not a function" on the first LLM call —
worse-than-original-bug regression introduced by the cancel fix.

Add a small ``mergeAbortSignals(...signals)`` helper exported from
``llm-loop.ts``:
- Falls through to ``AbortSignal.any(filtered)`` when available (Node
  20.3+, browsers).
- Polyfills via aggregating ``AbortController`` + ``addEventListener
  ('abort')`` listeners on Node 18 / older runtimes.
- Single-signal inputs short-circuit to the input itself (no allocation).

All five LLM provider stream sites (built-in OpenAI in llm-loop.ts +
the four standalones cerebras / anthropic / groq / google) now call
``mergeAbortSignals(opts?.signal, AbortSignal.timeout(30_000))`` which
behaves identically on Node 20.3+ and gracefully on Node 18.

Tests: TS 1224 passed, lint clean; Py 1563 passed.

* fix(client): clear tunnel-owned webhookUrl on disconnect()

Bug reproduced today against the published 0.6.0 dist:

  1. Patter.serve() starts a cloudflared tunnel and stores the freshly-
     minted hostname in localConfig.webhookUrl so subsequent call()s in
     the same process resolve to the right host.
  2. Patter.disconnect() stops the tunnel handle and embedded server but
     never clears localConfig.webhookUrl.
  3. Plugins / integrations that wrap Patter often dispose+rebuild on
     agent-identity changes via disconnect() → serve(). On the second
     serve() the stale webhookUrl is still set AND the constructor still
     wants a Cloudflare tunnel → the guard "Cannot use both tunnel: true
     and webhookUrl. Pick one." throws and the plugin breaks.

Fix tracks ownership of the webhookUrl: a new private flag
``tunnelOwnsWebhookUrl`` (TS) / ``_tunnel_owns_webhook_url`` (Py) goes
true the moment serve() pulls the hostname out of the tunnel handle.
disconnect() clears the field IF AND ONLY IF the flag is set, leaving
explicit / Static-tunnel hostnames in place. It also drops the
``ready`` / ``tunnel_ready`` deferreds so a follow-up serve() recreates
them fresh — without this the next ``await phone.ready`` resolved with
the previous lifecycle's hostname.

- libraries/typescript/src/client.ts: tunnelOwnsWebhookUrl flag, set
  in serve() after tunnel start, cleared in disconnect() along with the
  Promise pair (re-derived with pre-resolve when an explicit webhookUrl
  is still configured).
- libraries/python/getpatter/client.py: parity port. Frozen-dataclass
  config gets ``replace(webhook_url="")``; deferreds are cleared to
  None so the lazy ``ready`` / ``tunnel_ready`` properties recreate
  them on next access.
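
The ownership idea can be sketched as follows. ``DemoClient`` and
``DemoConfig`` are hypothetical, and the deferred handling is left out;
only the flag and the frozen-dataclass ``replace`` come from the
description above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DemoConfig:
    webhook_url: str = ""


class DemoClient:
    def __init__(self, webhook_url: str = ""):
        self.config = DemoConfig(webhook_url=webhook_url)
        self._tunnel_owns_webhook_url = False

    def serve(self, tunnel_hostname: str) -> None:
        if self.config.webhook_url:
            raise ValueError("Cannot use both tunnel: true and webhookUrl. Pick one.")
        self.config = replace(self.config, webhook_url=tunnel_hostname)
        self._tunnel_owns_webhook_url = True  # the tunnel minted this URL

    def disconnect(self) -> None:
        if self._tunnel_owns_webhook_url:  # never clear an operator-supplied URL
            self.config = replace(self.config, webhook_url="")
            self._tunnel_owns_webhook_url = False
```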

Tests added (5 Py + 4 TS):
- disconnect clears the tunnel-owned webhookUrl
- explicit webhookUrl is preserved across disconnect
- disconnect is idempotent
- ready / tunnelReady are fresh handles after disconnect
- _server reference is null after disconnect

Suites: Py 1566 (+3), TS 1228 (+4), tsc clean.

* feat(audio): NLMS acoustic echo cancellation (opt-in)

Bug #2 from the barge-in audit: on speakerphone / tunnel-loop
deployments the agent's outbound TTS bleeds back into the mic. VAD
sees that bleed as continuous voice-like energy and never transitions
out of "speaking" state, so a caller interruption only registers
during natural TTS pauses → "interrupt sometimes works, sometimes the
agent keeps talking" intermittent symptom.

Fix at the source — proper acoustic echo cancellation. NLMS adaptive
filter (2048 taps @ 16 kHz, 128 ms history) subtracts an estimate of
the TTS-derived echo from the mic stream before VAD/STT see it.
Geigel double-talk detector freezes adaptation when the caller is
speaking on top of the agent so the filter does not learn the user's
voice as part of the channel response.

Convergence on the synthetic narrowband test signal:
- ~24 dB ERLE after 1 s of TTS-only training
- Near-end speech preserved within 0 dB during double-talk

Not a drop-in replacement for WebRTC AEC3 (state-of-the-art needs
adaptive sub-band processing + comfort noise + nonlinear post-filter
that this scope does not cover). For production-grade quality, wrap
a binding to ``webrtc-audio-processing-2`` externally.

- libraries/python/getpatter/audio/aec.py — NlmsEchoCanceller class.
- libraries/typescript/src/audio/aec.ts — TS parity.
- Agent.echo_cancellation / AgentOptions.echoCancellation — opt-in
  flag, default false. Handset / headset deployments don't need it
  and the 0.5–2 s convergence period would briefly attenuate caller
  speech if they spoke before any TTS played.
- PipelineStreamHandler.start() (Py) / StreamHandler.initPipeline
  (TS) instantiate the canceller when the flag is on. Far-end tap
  fires before the carrier transcode in synthesizeSentence; near-end
  runs after the inbound 8k→16k resample, before VAD.
- 8 unit tests per SDK covering convergence, double-talk preservation,
  construction validation, pass-through-before-priming, reset, empty
  buffers.
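
For readers unfamiliar with NLMS, a toy version is sketched below. It
is far smaller than the configuration described above: the 8-tap
length, 0.5 step, 0.5 Geigel threshold, and the synthetic echo path are
all assumptions chosen so the demo converges quickly in pure Python.

```python
import math
import random


class DemoNlms:
    """Tiny NLMS echo canceller with a Geigel-style double-talk gate."""

    def __init__(self, taps: int = 8, step: float = 0.5, geigel: float = 0.5):
        self.w = [0.0] * taps  # estimated echo path
        self.x = [0.0] * taps  # recent far-end samples, newest first
        self.step = step
        self.geigel = geigel

    def process(self, far: float, near: float) -> float:
        self.x = [far] + self.x[:-1]
        estimate = sum(w * x for w, x in zip(self.w, self.x))
        error = near - estimate  # echo-reduced near-end sample
        # Geigel gate: adapt only while the near end is quiet relative to the far end.
        if abs(near) <= self.geigel * max(abs(v) for v in self.x):
            norm = sum(v * v for v in self.x) + 1e-9
            gain = self.step * error / norm
            self.w = [w + gain * x for w, x in zip(self.w, self.x)]
        return error


def demo_erle_db(n: int = 3000, seed: int = 0) -> float:
    """Echo-only run; returns ERLE in dB over the last 500 samples."""
    rng = random.Random(seed)
    aec = DemoNlms()
    far = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    echo_energy = residual_energy = 0.0
    for i in range(n):
        # Synthetic echo path: two delayed, attenuated copies of the far end.
        echo = 0.3 * (far[i - 2] if i >= 2 else 0.0) + 0.1 * (far[i - 5] if i >= 5 else 0.0)
        residual = aec.process(far[i], echo)
        if i >= n - 500:
            echo_energy += echo * echo
            residual_energy += residual * residual
    return 10.0 * math.log10(echo_energy / (residual_energy + 1e-30))
```

The demo is noise-free and echo-only, so it shows convergence of the
filter rather than realistic ERLE figures.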

Tests: Py 1574 passed (+8), TS 1236 passed (+8), tsc clean.

* fix(client,py): expose aggressive_first_flush / disable_phone_preamble / echo_cancellation in agent() builder

Pre-existing parity violation surfaced during the AEC audit: the Py
``Patter.agent(...)`` builder enumerates kwargs explicitly, so any field
not listed silently drops on the floor. Three boolean flags on the Agent
dataclass — ``aggressive_first_flush``, ``disable_phone_preamble``, and
the freshly added ``echo_cancellation`` — were unreachable through the
builder, forcing users to construct ``Agent(...)`` directly.

TS does not have this problem because ``agent(opts)`` spreads the whole
``AgentOptions`` object, so every field passes through.

Add the three flags to the Py builder signature and forward them to
``Agent(...)``. Defaults match the dataclass (all ``False``) so existing
callers keep their behaviour.

2 new tests:
- builder defaults match dataclass defaults (no silent True leak)
- explicit ``aggressive_first_flush=True`` / ``disable_phone_preamble=
  True`` / ``echo_cancellation=True`` reach the resulting Agent

Tests: Py 1576 passed (+2), TS 1236 unchanged, tsc clean.

* fix(audio): NLMS AEC — 512 taps + adaptive step for fast convergence

Real cellular-call test on 0.6.0 with the initial 2048-tap +
constant-0.1-step config exposed an 8–12 s convergence window during
which the user's first turn was either over-cancelled to silence
(filter eats voice while learning the channel) or contaminated by
residual echo (Deepgram transcribes garbage and discards). The user
report: ~11 s of perceived silence after firstMessage, then everything
worked from turn 4 onward. Net first-turn UX was worse than no AEC.

The architectural fix the user asked for ("source-level, no workaround,
solid"): two NLMS hyperparameter changes that compress convergence
into the first ~250 ms — the same window where the agent's
firstMessage finishes playing.

1. **512 taps (was 2048)**: 4× fewer coefficients to converge with no
   measurable cancellation loss on cellular / VoIP paths whose RT60
   stays under ~50 ms after the carrier's own echo suppression. Pass
   ``filter_taps=2048`` explicitly for landline hairpin loops where the
   tail extends beyond 32 ms @ 16 kHz.
2. **Adaptive step**: aggressive warm-up step (0.5) for the first 0.5 s
   of processed audio, then taper to the textbook 0.1 for steady-state
   tracking. The Geigel double-talk detector still gates updates so the
   larger step does not learn the caller's voice into the echo model.

Verification: a regression test that feeds a broadband synthetic signal
(3 sinusoids + white noise) in realistic 20 ms frames hits **17–19 dB
ERLE in the very first 250 ms** with the new defaults, well above the
previous 0 dB at the 1.25 s mark.

- New constructor knobs: ``warmup_step_size`` (default 0.5),
  ``warmup_seconds`` (default 0.5). Step branch is constant within a
  frame so the inner sample-by-sample loop stays branch-free.
- Validation extended for the two new fields.
- ``reset()`` now clears the ``processed_samples`` counter so the
  warm-up window re-engages on filter reset.
- 1 new regression test per SDK enforces the "≥10 dB ERLE in the first
  250 ms with defaults" guarantee on a broadband signal.
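
The two-stage step size can be illustrated with a small pure-Python NLMS loop. This is a sketch under stated assumptions, not the contents of `aec.py`: the class name `WarmupNlms` and the method `process_frame` are hypothetical, the Geigel double-talk gate is left out, and only the defaults (512 taps, steps 0.5 and 0.1, 0.5 s warm-up) come from the text above.

```python
class WarmupNlms:
    """Illustrative NLMS echo canceller with a warm-up step size."""

    def __init__(self, taps=512, sample_rate=16000, step_size=0.1,
                 warmup_step_size=0.5, warmup_seconds=0.5, eps=1e-6):
        self.taps = taps
        self.step_size = step_size
        self.warmup_step_size = warmup_step_size
        self.warmup_samples = int(warmup_seconds * sample_rate)
        self.eps = eps  # keeps the normalisation finite on silence
        self.reset()

    def reset(self):
        self.weights = [0.0] * self.taps
        self.history = [0.0] * self.taps  # newest far-end sample first
        self.processed_samples = 0        # cleared so warm-up re-engages

    def process_frame(self, far, near):
        # The step is picked once per frame, so the per-sample loop below
        # contains no warm-up branch.
        if self.processed_samples < self.warmup_samples:
            mu = self.warmup_step_size
        else:
            mu = self.step_size
        out = []
        for x, d in zip(far, near):
            self.history.insert(0, x)
            self.history.pop()
            estimate = sum(w * h for w, h in zip(self.weights, self.history))
            error = d - estimate  # near-end signal with the echo estimate removed
            power = sum(h * h for h in self.history) + self.eps
            gain = mu * error / power
            self.weights = [w + gain * h for w, h in zip(self.weights, self.history)]
            out.append(error)
        self.processed_samples += len(far)
        return out
```

Pure-Python list arithmetic is far too slow for 512 taps in real time; the sketch is meant only to show where the step-size switch and the `processed_samples` counter sit.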

Tests: Py 1577 passed (+1), TS 1237 passed (+1), tsc --noEmit clean.

* fix(stream-handler): firstMessage isSpeaking + AEC tap; barge-in-only ring flush

Two fixes for the speakerphone "agent unresponsive on first turn / mid-call
gets stuck after a few exchanges" symptom reported on 0.6.0 cellular tests.

1. firstMessage was bypassing beginSpeaking + AEC far-end tap
   The ``firstMessage`` block streamed TTS chunks directly to the carrier
   without (a) marking ``isSpeaking=true`` and (b) pushing each chunk to
   ``aec.pushFarEnd()``. Consequence on speakerphone: while the intro
   played, the self-hearing guard never engaged, the user's audio (mixed
   with TTS bleed) was forwarded to STT and produced garbage transcripts;
   AEC had no reference signal so the bleed survived in the inbound
   channel. Wraps the firstMessage TTS streaming loop in
   ``beginSpeaking()`` + ``try/finally { endSpeakingWithGrace() }`` and
   pushes each chunk to ``aec.pushFarEnd()`` before encoding for the
   carrier. Mirrors the per-turn behaviour of ``runPipelineLlm`` /
   ``_process_streaming_response``.

2. Ring buffer must NOT flush on natural turn end
   An earlier iteration also flushed ``inboundAudioRing`` from
   ``endSpeakingWithGrace`` so user audio captured during the agent's TTS
   that never tripped VAD would still reach STT. In practice this raced
   live STT input post-grace: the ring contained partially-cancelled echo
   (AEC still adapting) and possibly over-cancelled user voice (Geigel
   rho=0.6 misses quiet double-talk). Replaying on every turn produced
   phantom transcripts that confused the LLM and caused the "out of order
   responses + agent gets stuck" symptom the user observed mid-call.
   Reverted: flush only on real barge-in (where VAD confirmed user
   speech). Audio captured during the agent's turn that VAD did not
   classify as speech is intentionally dropped at the next
   ``beginSpeaking`` — the user can repeat themselves rather than have
   the LLM react to a stale phantom transcript.

The barge-in flush remains: extracted into ``flushInboundAudioRing()`` /
``_flush_inbound_audio_ring()`` helpers (clean refactor, 1 caller now).

Stale "2048 taps + 0.5–2 s convergence" log message updated to the
post-AEC-tuning "512 taps + 0.5 s warmup μ=0.5 → ~250 ms convergence".

Tests: Py 1577 passed, TS 1237 passed, tsc --noEmit clean.

* fix(stream-handler): gate barge-in on minimum agent-speaking duration

The previous fix wrapped the firstMessage TTS in
``beginSpeaking`` + ``endSpeakingWithGrace`` so the self-hearing guard
could engage during the intro. This worked, but exposed a second
defect: the AEC filter needs ~500 ms of TTS reference to converge, and
during that warmup window residual TTS bleed in the inbound mic stream
still looks like speech to VAD. With ``isSpeaking=true`` from frame
zero of the firstMessage, the very first chunk of bleed triggered an
immediate barge-in cancel — the firstMessage was killed before a
single byte had been played. Test reported "agent never speaks".

Fix: gate both barge-in entry points (VAD ``speech_start`` and
transcript-based) on a 1-second minimum agent-speaking duration. Real
users almost never start interrupting within the first second of an
agent turn anyway, and the gate cleanly covers the AEC convergence
period (500 ms warmup + safety margin).

- TypeScript: ``MIN_AGENT_SPEAKING_MS_BEFORE_BARGE_IN = 1000`` static
  on ``StreamHandler``. New ``speakingStartedAt: number | null`` field
  set in ``beginSpeaking()`` and cleared in ``cancelSpeaking()`` and
  the grace flip. New ``canBargeIn()`` helper used by both barge-in
  sites; suppressed events log at debug level so call-debug logs still
  show why the cancel did not fire.
- Python: ``MIN_AGENT_SPEAKING_S_BEFORE_BARGE_IN = 1.0`` module-level
  constant. ``_speaking_started_at`` field with the same lifecycle.
  ``_can_barge_in()`` helper applied at the VAD speech_start path in
  ``on_audio_received`` and at the entry of ``_handle_barge_in``.
  Helper uses ``getattr`` so test fixtures that bypass
  ``_begin_speaking`` still permit barge-in to fire.
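
A minimal sketch of the Python-side gate, assuming a bare `Handler` class in place of the real stream handler. The injectable `now` argument is an assumption added here so the behaviour can be checked without sleeping; the constant name, the field name, and the ``getattr`` fallback follow the description above.

```python
import time

MIN_AGENT_SPEAKING_S_BEFORE_BARGE_IN = 1.0


class Handler:
    def _begin_speaking(self, now=None):
        # Record when the agent turn started.
        self._speaking_started_at = time.monotonic() if now is None else now

    def _cancel_speaking(self):
        # Clearing the timestamp means "no active turn".
        self._speaking_started_at = None

    def _can_barge_in(self, now=None):
        # getattr covers fixtures that never called _begin_speaking: with no
        # recorded start there is nothing to gate, so barge-in is allowed.
        started = getattr(self, "_speaking_started_at", None)
        if started is None:
            return True
        now = time.monotonic() if now is None else now
        return (now - started) >= MIN_AGENT_SPEAKING_S_BEFORE_BARGE_IN
```

Both barge-in entry points would consult `_can_barge_in()` before cancelling, so the gate is enforced in one place.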

New unit tests (3 Py + 5 TS):
- ``canBargeIn() / _can_barge_in()`` returns true with no active turn,
  false within the gate window, true past the gate window.
- ``handleBargeIn / _handle_barge_in`` returns without cancelling during
  the warmup window; ``isSpeaking`` stays true.
- ``handleBargeIn / _handle_barge_in`` fires normally past the gate.

Tests: Py 1579 passed (+2), TS 1242 passed (+5), tsc --noEmit clean.

* fix(stream-handler): warn that NLMS AEC is wrong for PSTN; keep it off-default

The previous AEC commits added a server-side NLMS adaptive filter and
exposed an ``echoCancellation`` flag. Real-call testing on cellular
PSTN turned up a fundamental architectural mismatch the early
benchmarks did not catch: the round-trip echo path on Twilio Media
Streams is ~250–1500 ms (jitter buffer + carrier loop), but a 512-tap
NLMS filter at 16 kHz can only see the most recent 32 ms of far-end
samples. The echo never lands inside the filter's window, the weights
stay near zero, and the filter silently no-ops. Worse, with
``isSpeaking=true`` during firstMessage and a barge-in gate of 1 s,
any residual bleed that reaches VAD once the gate releases triggers an
immediate self-cancel: the agent stops talking right after starting.

Industry consensus from this round of research:
- LiveKit & Pipecat handle echo cancellation at the transport layer
  for browser/native paths only.
- Twilio's own guidance is to "rely on network echo cancellers" for
  telephone scenarios.
- Vapi, Retell, Bland do not run server-side AEC. They rely on the
  carrier's network echo suppression and the caller device's built-in
  AEC (modern handsets ship one).

Server-side NLMS is the right tool only when the SDK owns the audio
path end-to-end and the loop latency is on the order of the filter
window (~30 ms — browser WebRTC, mobile native). PSTN does not meet
that bar and never will under realistic carrier conditions.
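
The mismatch is plain arithmetic, worked through below. The helper names are hypothetical; the figures (512 taps, 16 kHz, a 250–1500 ms round trip) are the ones quoted in this commit message.

```python
def filter_window_ms(taps: int, sample_rate_hz: int) -> float:
    # An FIR filter with N taps spans N / fs seconds of far-end history.
    return 1000.0 * taps / sample_rate_hz


def echo_inside_window(echo_delay_ms: float, taps: int, sample_rate_hz: int) -> bool:
    # The filter can only model an echo that arrives within its own window.
    return echo_delay_ms <= filter_window_ms(taps, sample_rate_hz)


def taps_needed(echo_delay_ms: float, sample_rate_hz: int) -> int:
    # Tap count required just to reach the echo; tail length is ignored.
    return int(echo_delay_ms * sample_rate_hz / 1000.0)
```

With these numbers the 512-tap window is 32 ms, so even the best-case 250 ms round trip falls outside it, and reaching that delay at all would take about 4000 taps.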

This commit:

- ``echoCancellation`` stays opt-in (default false) so existing PSTN
  callers see no change in behaviour.
- When ``echoCancellation: true`` is detected on a Twilio or Telnyx
  carrier, log a clear warning explaining why it will not work as
  intended and what to do instead. The filter is still instantiated so
  curious operators can compare; the warning makes the recommendation
  explicit.

For PSTN deployments, the working stack is: Patter's self-hearing
guard + 1 s barge-in cooldown + Silero VAD with the phone-tuned preset
+ carrier / handset native echo suppression. No server-side AEC.

Tests: Py 1579 passed, TS 1242 passed, tsc --noEmit clean.

* fix: barge-in robustness + AMD on-by-default + STT finalize + post-cancel drain

Six architectural fixes for the post-barge-in failure modes surfaced during
the 0.6.0 acceptance pass against real PSTN calls. Validated end-to-end on
six pipeline stacks (Deepgram + Groq/OpenAI/Anthropic/Cerebras/Google +
Cartesia/OpenAI TTS) with verbose Italian conversation flow.

1. Adaptive barge-in gate
   - MIN_AGENT_SPEAKING_MS_BEFORE_BARGE_IN_AEC = 1000 (covers AEC warmup)
   - MIN_AGENT_SPEAKING_MS_BEFORE_BARGE_IN_NO_AEC = 250 (anti-flicker only)
   - canBargeIn() picks the right gate based on whether AEC is wired.
   - Suppression call sites log at INFO level with the AEC state.

2. Inbound audio ring cap reduced from 30 frames (~600 ms) to 13 (~260 ms)
   to match VAD minSpeechDuration. Pre-fix, the replay was dragging in
   ~350 ms of agent TTS bleed which Deepgram (default English) transcribed
   as English garbage and committed to the LLM as phantom user input.

3. STT.finalize() on VAD speech_end
   - New optional finalize() on STTAdapter / STTProvider.
   - DeepgramSTT.finalize() exposes {type: 'Finalize'} as a public method.
   - StreamHandler calls stt.finalize() whenever the SDK's VAD signals
     speech_end so the provider returns is_final immediately rather than
     waiting on its own (slow) endpointing heuristic.

4. AMD on by default + onMachineDetection callback (Twilio + Telnyx parity)
   - New MachineDetectionResult carrier-agnostic shape.
   - Twilio: MachineDetection=DetectMessageEnd + AsyncAmd=true (no
     answer-latency penalty on human pickups).
   - Telnyx: answering_machine_detection=greeting_end.
   - Callback fires on both webhooks before the legacy voicemail-drop
     path so callers see the result regardless of voicemailMessage.

5. Post-cancel drain window of 150 ms
   - Tracks lastCancelAt timestamp on every barge-in cancel (both
     VAD-path and transcript-path).
   - beginSpeaking() is now async and awaits the drain remainder so the
     remote PSTN player has time to flush the cancelled turn's tail
     before the next TTS chunk lands on top of it. Eliminates the
     "doubled first sentence" audio artefact reported during testing.

6. AssemblyAI accepts a parity-only `language` field for cross-provider
   uniformity (forwarded as no-op; AssemblyAI selects language by model).
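
The post-cancel drain in item 5 can be sketched as below. It assumes a stripped-down `Speaker` class and a module-level constant; neither name comes from the SDK. Only the 150 ms window, the `lastCancelAt` timestamp, and the async `beginSpeaking()` that awaits the remainder are taken from the text.

```python
import asyncio
import time

POST_CANCEL_DRAIN_S = 0.150


def drain_remainder_s(last_cancel_at, now):
    # Time still to wait before new TTS may start; zero if never cancelled
    # or if the drain window has already elapsed.
    if last_cancel_at is None:
        return 0.0
    return max(0.0, POST_CANCEL_DRAIN_S - (now - last_cancel_at))


class Speaker:
    def __init__(self):
        self.last_cancel_at = None
        self.is_speaking = False

    def cancel_speaking(self):
        # Every barge-in cancel stamps the time, whichever path fired it.
        self.is_speaking = False
        self.last_cancel_at = time.monotonic()

    async def begin_speaking(self):
        # Await only the unexpired part of the window so a turn that starts
        # well after the cancel pays no extra latency.
        wait = drain_remainder_s(self.last_cancel_at, time.monotonic())
        if wait > 0:
            await asyncio.sleep(wait)
        self.is_speaking = True
```

Waiting for the remainder rather than a fixed 150 ms keeps the cost at zero for turns that begin long after the cancel.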

Both SDKs (TypeScript and Python) updated with identical defaults,
constants, and call-site coverage. Unit tests: TS 24/24 passing, Python
33/33 passing. Includes [DIAG] INFO logs in TS deepgram-stt.ts and
stream-handler.ts for the diagnostic phase; these can be removed in a
follow-up commit once the bleed-transcription root cause is sealed.

* feat(sdk): tool platform overhaul + Realtime fixes + persist option + tunnel grace

Bundles the SDK changes from a focused work session: 5 bug fixes + 6
new feature areas, with full Python ↔ TypeScript parity.

Bug fixes
---------
* fix(client): bump cloudflared quick-tunnel grace 2.5 → 5 s. The 2.5 s
  window covered HTTP propagation only — Twilio's WSS upgrade for the
  media stream goes through a different cloudflared edge route that
  takes ~1-3 s longer; ~5 % of first calls dropped silently at pickup
  with no media. 5 s drops the failure rate to <1 %. (client.ts /
  client.py)
* fix(realtime): handler-only tools were silently ignored in TS Realtime
  mode (CRITICAL). `handleFunctionCall` only dispatched `webhookUrl`
  tools; tools with an in-process `handler` callback (the default
  pattern in the demos) fell through without sending
  `function_call_output`, hanging the model.
* fix(realtime): `onTranscript({ role: 'assistant' })` was never fired.
  Assistant text was pushed into history but never surfaced via the
  user-supplied callback, so demos only saw `[user]` lines.
* fix(realtime): dashboard transcript shown out of order. OpenAI Realtime
  emits `input_audio_transcription.completed` AFTER `response.done`, so
  the naïve push order was [assistant, user, ...]. Added a per-call
  buffer (`pendingAssistantTurn` + 3 s fallback timer) that holds the
  assistant turn until the matching user transcript arrives.
* fix(realtime): tool invocations were invisible in the transcript
  timeline. Added `emitToolEvent` that pushes `role: 'tool'` history
  entries and fires `onTranscript({ role: 'tool', tool_name, tool_args,
  tool_result, ... })` for the call/return semantics.
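
The transcript-ordering fix can be illustrated with a small buffer. This sketch is written in Python for brevity and swaps the real timer for an explicit `tick(now)` call; the class and method names are hypothetical, and only the hold-until-user-transcript behaviour and the 3 s fallback come from the description above.

```python
class TranscriptOrderer:
    FALLBACK_S = 3.0

    def __init__(self):
        self.history = []      # (role, text) in display order
        self._pending = None   # (assistant_text, deadline) or None

    def on_assistant_done(self, text, now):
        # Flush any older held turn first so nothing is overwritten.
        self._flush()
        self._pending = (text, now + self.FALLBACK_S)

    def on_user_transcript(self, text):
        # The user transcript arrives late but belongs before the held
        # assistant turn, so it is pushed first.
        self.history.append(("user", text))
        self._flush()

    def tick(self, now):
        # Fallback: release the assistant turn if no user transcript came.
        if self._pending is not None and now >= self._pending[1]:
            self._flush()

    def _flush(self):
        if self._pending is not None:
            self.history.append(("assistant", self._pending[0]))
            self._pending = None
```

The fallback matters for turns with no matching user transcript, such as an opening greeting, which would otherwise stay buffered indefinitely.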

Features
--------
* feat(api): `Patter({ persist })` opt-in dashboard persistence. The
  on-disk per-call records (metadata.json, transcript.jsonl, events.jsonl)
  were previously opt-in only via `PATTER_LOG_DIR`. New explicit option:
  `false` (default), `true` (platform default location), or a custom
  string path. Env var still works as deployment-time override.
* feat(tools): JSON-schema validation at `agent()` build time +
  OpenAI strict-mode opt-in. Schemas are validated structurally for
  every tool; `Tool({ strict: true })` additionally enforces OpenAI's
  strict-mode requirements (recursive `additionalProperties: false`,
  every property in `required`). Catches typos at build time.
* feat(tools): retry with exponential backoff + per-tool circuit
  breaker. Both handler and webhook paths now get 3 attempts with
  jittered backoff (capped at 5 s). New `CircuitBreakerRegistry` trips
  OPEN after 5 consecutive failures and stays OPEN for 30 s before
  allowing a HALF_OPEN probe; while OPEN it returns
  `{error, fallback: true, circuit_state: "open", retry_after_ms}`.
* feat(tools): reassurance auto-message during long tool calls. New
  `Tool({ reassurance: "Let me check..." })` (or
  `{ message, afterMs }`) bridges the silence on slow tools by
  enqueueing the message via `OpenAIRealtimeAdapter.sendText` after
  `afterMs` (default 1500 ms) — cancelled if the tool returns earlier.
  Realtime-only for now.
* feat(tools): MCP (Model Context Protocol) client integration (MVP).
  New `agent({ mcpServers: [...] })` plugs the agent into MCP servers
  (Google Workspace, PayPal, Postgres, GitHub, ...) without writing
  wrapper handlers. Each server is queried at call start via
  `tools/list`; discovered tools are wrapped with synthetic handlers
  that dispatch to `tools/call` and merged into `agent.tools`.
  Optional dependency: `@modelcontextprotocol/sdk` (TS) /
  `mcp` (Py extra). Streamable-HTTP transport only for now.
* feat(tools): streaming results via async generator handlers. Tool
  handlers can be `async function*` (TS) / `async def ... yield` (Py)
  generators that emit `{ progress: "..." }` updates while running;
  each yield is sent to the agent via `sendText` for inline status.
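
The circuit-breaker behaviour described above (OPEN after 5 consecutive failures, 30 s hold, then a HALF_OPEN probe) can be sketched like this. It is a single-tool illustration rather than the `CircuitBreakerRegistry` itself: the method names and the `"circuit open"` error string are assumptions, and the sketch does not limit HALF_OPEN to exactly one in-flight probe.

```python
class CircuitBreaker:
    FAILURE_THRESHOLD = 5
    OPEN_SECONDS = 30.0

    def __init__(self):
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def allow(self, now):
        # While OPEN, reject calls until the hold time has elapsed, then
        # move to HALF_OPEN and let a probe through.
        if self.state == "open":
            if now - self.opened_at >= self.OPEN_SECONDS:
                self.state = "half_open"
                return True
            return False
        return True

    def record_success(self):
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now):
        self.failures += 1
        # A failed probe re-opens immediately; otherwise trip on threshold.
        if self.state == "half_open" or self.failures >= self.FAILURE_THRESHOLD:
            self.state = "open"
            self.opened_at = now

    def fallback(self, now):
        # Shape mirrors the payload quoted in the commit message.
        remaining = self.OPEN_SECONDS - (now - self.opened_at)
        return {
            "error": "circuit open",
            "fallback": True,
            "circuit_state": "open",
            "retry_after_ms": max(0, int(remaining * 1000)),
        }
```

A caller would check `allow(now)` before each attempt and return `fallback(now)` to the model instead of invoking the tool while the breaker is open.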

New files
---------
* libraries/typescript/src/tools/schema-validation.ts
* libraries/typescript/src/tools/circuit-breaker.ts
* libraries/typescript/src/tools/mcp-client.ts
* libraries/python/getpatter/tools/schema_validation.py
* libraries/python/getpatter/tools/circuit_breaker.py
* libraries/python/getpatter/tools/mcp_client.py
* test files: 4 TS + 3 Py covering schema validation, breaker, streaming, reassurance

Tests
-----
1280 TS · 1156 Py · 0 regressions. Updates two stale tests (AMD
on-by-default test in new-features.test.ts; handler retry count in
llm-loop.test.ts) to reflect new behaviour.

* feat(dashboard): React + Vite SPA replaces inline HTML/CSS template

The dashboard is now a real SPA in `dashboard-app/` (Vite + React +
TypeScript) instead of a 700-line HTML/CSS/JS string embedded in
`dashboard/ui.{ts,py}`. The build pipeline produces a single
self-contained HTML file (vite-plugin-singlefile inlines JS + CSS)
which is committed to `libraries/typescript/src/dashboard/ui.html` and
mirrored to the Python package via `dashboard-app/scripts/sync.mjs`.

At runtime the SDK serves the same `GET /` endpoint as before — the
inlined HTML is loaded by tsup's esbuild ``text`` loader (TS) or the
package-data file (Py). Customer-side: zero change in start-up UX
(`phone.serve()` → http://127.0.0.1:8000/), but the dashboard is now
typed, modular, and maintainable as proper React.

Why this approach (option D from the design discussion):
* No CDN dependency at runtime (no unpkg.com / Babel-in-browser).
* No new runtime deps in the SDK — React + Vite live only at build time
  in the dev repo; the published package ships static HTML.
* Self-contained bundle: the SDK still works air-gapped and behind
  corporate firewalls.
* Type safety end-to-end (TSX components, tsconfig strict).

Components ported from the reference design:
* Topbar, PageHeader, Metric cards
* CallTable with row selection + search
* LiveCallPanel (transcript stream + call controls)
* LatencyPanel (p50 / p95 / STT / TTS bars)
* CostPanel (per-provider breakdown)

Hooks:
* useDashboardData — fetches `/api/dashboard/calls` + subscribes to the
  SSE stream at `/api/dashboard/stream`
* useTranscript — incremental transcript updates per selected call
* mappers.ts — maps the wire format (CallRecord) to the UI shape

Build:
* `dashboard-app/` is its own Vite project with `npm run build && npm
  run sync` — sync copies the inlined HTML to both SDKs.
* `libraries/typescript/tsup.config.ts` adds the ``.html`` text loader
  so the asset is inlined into `dist/index.{js,mjs}`.
* `libraries/python/pyproject.toml` declares `ui.html` as
  `getpatter.dashboard` package-data so it ships with `pip install`.
* `libraries/typescript/package.json` `files` array includes
  `src/dashboard/ui.html` so npm packs it.

* docs(changelog): unreleased entries for tool platform + Realtime fixes + persist

Documents the two preceding commits in CHANGELOG.md under
``## Unreleased``:

* Added: ``Patter(persist=...)`` option, JSON-schema validation +
  strict mode, retry + circuit breaker, reassurance auto-message,
  MCP client integration, streaming results.
* Fixed: Realtime handler-tool dispatch, assistant ``onTranscript``,
  transcript ordering buffer, tool transcript events, cloudflared
  quick-tunnel WSS upgrade race.

Per the project rule (``.claude/rules/documentation-best-practices.md``
invariant 0): every user-visible change updates ``## Unreleased`` in
the same unit of work. The dashboard rewrite is intentionally NOT in
the changelog — same URL, same UX, same `phone.serve()` entry point;
the SPA migration is internal and customer-invisible.

* chore(dashboard): commit Python sync of ui.html

Mirror of the built dashboard SPA into the Python package — produced by
``dashboard-app/scripts/sync.mjs`` alongside the TS-side
``libraries/typescript/src/dashboard/ui.html``. Should have been part
of the dashboard SPA commit; tracking it now keeps the two SDKs in
parity for ``pip install getpatter``.

* chore: black reformatting of test_dashboard.py

Pure formatter pass — splits long argument lists across multiple lines
and adds the missing blank line after the conditional ``import
fastapi``. No logic changes; the test still verifies the dashboard
store and routes the same way.

* feat(dashboard): polish — real logo, range filter, interactive sparklines, realtime mode

Iterative refinement of the React/Vite dashboard SPA shipped in
3877719. Customer-side it remains a single embedded HTML file served
from `phone.serve()` at `/`, but the UX is now markedly closer to the
target design.

UI changes:
* Real Patter logo: mark (wireframe stack-tile from the favicon path,
  thin stroke instead of the chunky filled silhouette in
  `docs/logo/light.svg`) + tightened-viewBox wordmark, sized
  independently so the wordmark stays large while the mark line weight
  stays light.
* Tab title: "Patter | Dashboard". Favicon: stack-tile SVG inline,
  matching the previous SDK dashboard.
* Topbar: dropped Bell / Settings / Avatar buttons and "Place call" CTA
  (will reintroduce when actually wired). Phone-number pill always shown,
  derived from the most recent call's Patter-side number.
* Live chip pulse: peach static when zero calls, green pulsing when ≥1
  is active.
* Latency + Cost merged into one MetricsPanel with a peach segmented
  switcher, fixing the right-rail clipping that hid Cost on short
  viewports. Realtime mode collapses the STT/LLM/TTS waterfall to a
  single end-to-end bar (those metrics aren't meaningful when the
  provider does the round-trip in one model call).

Range filter (1h / 24h / 7d / All) is now real:
* Bucket strategy aligned to natural boundaries — 12 × 5min, 24 × 1h,
  7 × 1day, plus 9-bucket auto for All. Tooltip ranges read as
  "11:00 → 12:00" instead of "11:39 → 12:33".
* Filtered call list, headline counters (Calls / Latency p95 / Spend),
  and sparklines all reflect the active range. Live calls always stay
  visible even when out of the range so users see what's happening now.
* Sparkline scaling: tallest bar normalises to 100, no more lonely
  single bar surrounded by ghost grey lines.
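
Boundary-aligned bucketing and the sparkline scaling can be sketched as follows. The dashboard itself is TypeScript; this Python version is only an illustration with hypothetical function names, and it aligns on epoch seconds, so day buckets follow UTC rather than the viewer's local midnight.

```python
def aligned_buckets(now_s: int, width_s: int, count: int):
    # Snap the right edge up to the next multiple of the bucket width so
    # every edge lands on a natural boundary.
    end = (now_s // width_s + 1) * width_s
    start = end - count * width_s
    return [(start + i * width_s, start + (i + 1) * width_s) for i in range(count)]


def normalise_bars(counts):
    # Scale so the tallest bar is 100; an all-zero range stays all zero.
    peak = max(counts, default=0)
    if peak == 0:
        return [0] * len(counts)
    return [round(100 * c / peak) for c in counts]
```

For the 1h range this would be called with `width_s=300` and `count=12`; at 11:39 the last bucket then covers 11:35 to 11:40 rather than ending at an arbitrary minute.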

Sparklines are now interactive:
* Hover any bar → custom tooltip (instant, dark surface, mono numerics
  in peach) showing the bucket window, call count, and a 4-call sample
  (number / status / cost). React-driven, replaces the slow native
  `title=""`.
* Click → selects the newest call in that bucket into the right rail.
* Empty buckets are invisible (no grey ghosts).
* Bars now sit flush against the card bottom (flex column +
  `margin-top: auto`), matching the original design.

Export CSV button is now wired to `/api/dashboard/export/calls?format=csv`
via a transient anchor download.

Backend additions: none — every change above is in `dashboard-app/`
plus the synced `ui.html` rebundles in both SDKs. Pre-publish flow is
still `cd dashboard-app && npm run build && npm run sync`.

* feat(tts): add Inworld TTS provider (TTS-2 default, NDJSON streaming)

New TTS adapter calling Inworld's HTTP NDJSON streaming endpoint
`POST https://api.inworld.ai/tts/v1/voice:stream`. Defaults to
`inworld-tts-2` (sub-200 ms TTFB, 100+ languages, natural-language voice
steering); pass `model: "inworld-tts-1.5-max"` for the prior generation.
Default audio output is PCM_S16LE at 16 kHz so the result feeds straight
into the Patter pipeline without transcoding.

Public API parity:
- TS:  `import { InworldTTS } from "getpatter"` / `getpatter/tts/inworld`
- Py:  `from getpatter import InworldTTS`        / `getpatter.tts.inworld`
- Env-var auto-resolve via `INWORLD_API_KEY` (paste the Base64 token from
  the Inworld dashboard — already in `Authorization: Basic <token>` form).
- Optional knobs: `language` (BCP-47), `temperature` (TTS-1.5 only),
  `speakingRate` (0.5-1.5), `deliveryMode` (`EXPRESSIVE`/`BALANCED`/
  `STABLE` — TTS-2 only), `bitrate`.

Pricing entry `inworld` added to both pricing tables (placeholder
$0.020/1k chars — verify against current platform tier). Optional
dependency `getpatter[inworld]` adds `aiohttp>=3.10`.

7 mocked unit tests per SDK covering payload shape, NDJSON line
interleave (`audio, timestamp, audio`), base64 audio decoding, optional
field omission, env-var fallback, and non-200 error surfacing.
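
The NDJSON handling the tests exercise can be sketched generically. The record layout is an assumption: the `audio` key below is a placeholder and is not taken from Inworld's documented schema, so it is exposed as a parameter. Only the interleaving of audio and timestamp lines and the base64 decoding come from the text above.

```python
import base64
import json


def iter_audio_chunks(lines, audio_key="audio"):
    """Yield decoded audio bytes from an iterable of NDJSON lines."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        record = json.loads(line)
        payload = record.get(audio_key)
        if payload is None:
            continue  # e.g. a timestamp record between two audio records
        yield base64.b64decode(payload)
```

Treating non-audio records as skippable rather than as errors is what lets an `audio, timestamp, audio` sequence decode to two consecutive chunks.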

New files:
- libraries/typescript/src/providers/inworld-tts.ts
- libraries/typescript/src/tts/inworld.ts
- libraries/python/getpatter/providers/inworld_tts.py
- libraries/python/getpatter/tts/inworld.py
- libraries/{typescript,python}/{tests/unit/inworld-tts*.test.*,tests/unit/test_inworld_tts.py}

* feat(observability): speech-edge events for turn-taking instrumentation

Adds seven optional async callbacks to every Patter instance plus a read-only
conversation_state snapshot, mirroring the public APIs of LiveKit Agents,
Pipecat and OpenAI Realtime so downstream metrics map onto the canonical
Hamming AI / Coval / Cekura voice-agent metric set without translation:

  on_user_speech_started   - raw VAD positive edge
  on_user_speech_ended     - raw VAD trailing edge (not EOU)
  on_user_speech_eos       - committed end-of-utterance (canonical "user
                             finished" — anchors eos_to_first_token_ms)
  on_agent_speech_started  - first wire-time chunk (what user hears)
  on_agent_speech_ended    - last wire chunk; payload includes interrupted
  on_llm_token             - TTFT marker, fires once per turn
  on_audio_out             - first TTS chunk per turn (warmup, distinct
                             from wire-time)

Each event also records an OpenTelemetry span event on the current call
span (patter.event.*), with gen_ai.* attributes for the LLM event per the
OTel GenAI semconv. OTel branch is a zero-cost no-op when the peer dep is
missing.

Wired into the realtime stream handler so the user/agent edge events fire
automatically on the OpenAI Realtime + Twilio/Telnyx path; LLM/TTS-warmup
events are exposed on the dispatcher for adapter/pipeline integrations.

Public API: SpeechEvents, SpeechEventCallback, ConversationStateSnapshot,
UserState, AgentState, EouTrigger.

Tests: 16 unit tests Py + 15 unit tests TS covering payload schema,
state transitions, idempotency, OTel attach contract, callback-exception
isolation, and Patter-level proxy mirroring.
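
Callback-exception isolation, one of the properties listed above, can be sketched with a tiny dispatcher. The class name, the `on` / `emit` methods, and the logger name are hypothetical; only the callback names and the idea that a failing user callback must not break the call come from the commit message.

```python
import asyncio
import logging

logger = logging.getLogger("speech_events_sketch")


class SpeechEventDispatcher:
    def __init__(self):
        self._callbacks = {}

    def on(self, event_name, callback):
        # Callbacks are optional; unregistered events are simply skipped.
        self._callbacks[event_name] = callback

    async def emit(self, event_name, payload):
        callback = self._callbacks.get(event_name)
        if callback is None:
            return False
        try:
            await callback(payload)
        except Exception:
            # A failing user callback is logged and swallowed so it cannot
            # take down the audio pipeline that emitted the event.
            logger.exception("speech-event callback %s failed", event_name)
            return False
        return True
```

Returning a boolean is a choice made for this sketch so the outcome is observable; the real dispatcher may report failures differently.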

Motivated by patter-agent-runner's 15 turn-taking acceptance verbs that
previously auto-skipped because the SDK did not surface per-side speech
edges.

* fix: Realtime first_message role swap, dashboard 404 spam, SDK plumbing for speech-edge events

Three Realtime mode fixes (Python + TypeScript parity) plus the host-
binding / observability plumbing required to drive the speech-edge
event suite from external test runners.

Realtime: first_message role swap
------------------…
