Releases: AlanY1an/echotwin

EchoTwin v0.1.1

12 Jun 06:40

What's new

English deployments actually hear English now. The persona language field auto-selects the matching streaming-ASR model: the Chinese-first bilingual zipformer for zh, the English zipformer for en. Previously, fully English speech was garbled by the bilingual model. Switching personas rebuilds the recognizers on the fly. Leaving asr.sherpa_stream.repo empty means auto-select; an explicitly set repo still wins.
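
The selection rule above can be sketched roughly as follows. This is an illustration only, not EchoTwin's actual code: the function name `resolve_asr_repo`, the `DEFAULT_STREAM_REPOS` table, and the repo identifiers are all hypothetical placeholders, since these notes do not name the real model repos.

```python
from typing import Optional

# Placeholder identifiers (assumption): the real model repos are not
# named in the release notes.
DEFAULT_STREAM_REPOS = {
    "zh": "example/zh-en-bilingual-zipformer",
    "en": "example/en-zipformer",
}


def resolve_asr_repo(language: str, explicit_repo: Optional[str] = None) -> str:
    """Pick a streaming-ASR repo: an explicit setting wins, empty means auto."""
    # A non-empty explicit repo overrides the language-based default.
    if explicit_repo and explicit_repo.strip():
        return explicit_repo.strip()
    # Empty or missing: auto-select from the persona language.
    try:
        return DEFAULT_STREAM_REPOS[language]
    except KeyError:
        raise ValueError(f"unsupported persona language: {language!r}")
```

Under these assumptions, `resolve_asr_repo("en", "")` falls back to the English default, while `resolve_asr_repo("en", "my/custom-model")` returns the explicit value unchanged.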

Admin commands work in guild channels. /persona-admin, /voice-admin and /admin no longer force you into a DM — still owner-only, replies are ephemeral. (Command contexts re-sync on bot restart; global propagation can take up to 1h.)

Smoother first run. scripts/download_models.sh now prefetches both streaming ASR models (zh + en, ~200 MB) and drops a dead ~900 MB prefetch; quick start gained the missing git clone step; CONTRIBUTING.md added.

Full diff: v0.1.0...v0.1.1

EchoTwin v0.1.0

12 Jun 06:11

First public release.

What works today

  • One-on-one voice conversation in a cloned voice, sub-second mouth-to-ear (best 361 ms, typically 0.6–1.1 s measured live)
  • Streaming ASR (sherpa-onnx zipformer) with speculative ASR/LLM execution, pre-opened TTS sockets, cached fillers
  • Barge-in, tool calls (time/date/weather), hot-swappable personas with per-persona Fish Audio TTS tuning
  • Per-turn cost ledger with daily/monthly budget caps
  • Bilingual prompt system: persona-level language: zh|en switches every LLM-facing prompt
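
To make the cost-ledger bullet concrete, here is a minimal sketch of a per-turn ledger with daily and monthly caps. It is an assumption about how such a ledger could work, not the project's implementation; the class `CostLedger` and its methods are hypothetical names.

```python
from collections import defaultdict
from datetime import date


class CostLedger:
    """Hypothetical per-turn ledger; costs are in an arbitrary currency unit."""

    def __init__(self, daily_cap: float, monthly_cap: float) -> None:
        self.daily_cap = daily_cap
        self.monthly_cap = monthly_cap
        self._by_day = defaultdict(float)  # date -> total cost recorded that day

    def record(self, day: date, cost: float) -> None:
        # Called once per conversational turn with that turn's cost.
        self._by_day[day] += cost

    def spent_on(self, day: date) -> float:
        return self._by_day.get(day, 0.0)

    def spent_in_month(self, day: date) -> float:
        return sum(
            cost
            for d, cost in self._by_day.items()
            if (d.year, d.month) == (day.year, day.month)
        )

    def can_spend(self, day: date, estimate: float = 0.0) -> bool:
        # A turn is allowed only if it fits under both caps.
        return (
            self.spent_on(day) + estimate <= self.daily_cap
            and self.spent_in_month(day) + estimate <= self.monthly_cap
        )
```

A caller would check `can_spend(today, estimate)` before starting a turn and call `record(today, actual_cost)` once the turn finishes.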

Experimental

  • Organic multi-party mode: three-layer addressee pipeline (table-lookup reflexes → LLM arbiter with room context → golden-set-tested heuristics)
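
One plausible reading of the three-layer pipeline is a chain in which each layer either decides or defers to the next, in the order the arrows give. The sketch below assumes exactly that; every name in it (`reflex_layer`, `make_arbiter_layer`, `heuristic_layer`, `is_addressed`, the reflex table entries, and the bot name "echo") is hypothetical, and the LLM arbiter is a caller-supplied stub rather than a real model call.

```python
from typing import Callable, Optional, Sequence

# A layer returns True/False when it can decide, or None to defer.
Layer = Callable[[str, Sequence[str]], Optional[bool]]


def reflex_layer(utterance: str, room_context: Sequence[str]) -> Optional[bool]:
    # Layer 1: exact table lookup for obvious cases (entries are made up).
    table = {"hey echo": True, "never mind": False}
    return table.get(utterance.strip().lower())


def make_arbiter_layer(ask_llm: Layer) -> Layer:
    # Layer 2: delegate to an LLM arbiter that sees the room context.
    # ask_llm is supplied by the caller and may return None when unsure.
    def layer(utterance: str, room_context: Sequence[str]) -> Optional[bool]:
        return ask_llm(utterance, room_context)

    return layer


def heuristic_layer(utterance: str, room_context: Sequence[str]) -> Optional[bool]:
    # Layer 3: cheap fallback heuristic that always decides.
    return "echo" in utterance.lower()


def is_addressed(
    utterance: str, room_context: Sequence[str], layers: Sequence[Layer]
) -> bool:
    # The first layer with a definite verdict wins (an assumption).
    for layer in layers:
        verdict = layer(utterance, room_context)
        if verdict is not None:
            return verdict
    return False
```

With an arbiter stub that always defers, `is_addressed("Echo, what time is it?", [], layers)` falls through to the heuristic layer and returns `True`.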

See docs/SETUP.md to go from zero to a talking bot in ~15 minutes.