xiaozhi-go

A slimmed-down Go port of xiaozhi-esp32-server that runs in OpenClaw passthrough mode.

Architecture

ESP32 device                 xiaozhi-go                    External services
┌──────────┐    WebSocket    ┌──────────────────────┐
│   Mic    │ ──── Opus ────→ │ Opus decode → VAD    │
│          │                 │    ↓                 │
│          │                 │ PCM → WAV            │
│          │                 │    ↓                 │      ┌─────────────┐
│          │                 │ ASR (Whisper API)    │ ───→ │ ASR service │
│          │                 │    ↓                 │      └─────────────┘
│          │                 │ Text                 │
│          │                 │    ↓                 │      ┌─────────────┐
│          │                 │ LLM (OpenClaw)       │ ───→ │ OpenClaw GW │
│          │                 │    ↓ (streaming)     │      └─────────────┘
│          │                 │ Sentence split → TTS │
│          │                 │    ↓                 │      ┌─────────────┐
│ Speaker  │ ←── Opus ────── │ PCM → Opus encode    │ ←─── │ TTS service │
└──────────┘                 └──────────────────────┘      └─────────────┘
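
The "sentence split → TTS" step in the diagram can be sketched as below: streamed LLM text is buffered and handed to TTS one sentence at a time, so speech can start before the full reply arrives. This is a minimal sketch, not the project's actual code; the `Splitter` type, its methods, and the punctuation set are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// sentenceEnders lists the runes treated as sentence boundaries.
// The exact set is an assumption; note that "." also splits decimals like "3.14".
const sentenceEnders = "。！？；.!?;\n"

// Splitter (hypothetical) buffers streamed text and emits complete sentences.
type Splitter struct {
	buf strings.Builder
}

// Feed appends one streamed chunk and returns the sentences it completes.
func (s *Splitter) Feed(chunk string) []string {
	var out []string
	for _, r := range chunk {
		s.buf.WriteRune(r)
		if strings.ContainsRune(sentenceEnders, r) {
			if sent := strings.TrimSpace(s.buf.String()); sent != "" {
				out = append(out, sent)
			}
			s.buf.Reset()
		}
	}
	return out
}

// Flush returns whatever text is left once the stream has ended.
func (s *Splitter) Flush() string {
	rest := strings.TrimSpace(s.buf.String())
	s.buf.Reset()
	return rest
}

func main() {
	var s Splitter
	// Chunk boundaries from a streaming LLM rarely line up with sentences.
	for _, chunk := range []string{"你好，今天", "天气不错。要不要", "出去走走？好"} {
		for _, sent := range s.Feed(chunk) {
			fmt.Println("to TTS:", sent)
		}
	}
	if rest := s.Flush(); rest != "" {
		fmt.Println("to TTS:", rest)
	}
}
```

Splitting on punctuation keeps the latency to first audio low; the trade-off is that very short sentences produce many small TTS requests.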

Dependencies

Go 1.22+
libopus-dev (Opus encoding and decoding)
ffmpeg (TTS audio conversion)
edge-tts (default TTS; pip install edge-tts)

macOS:

brew install opus ffmpeg
pip install edge-tts

Ubuntu:

apt install libopus-dev ffmpeg
pip install edge-tts

Build and run

# Download dependencies
go mod tidy

# Build
go build -o xiaozhi-go .

# Run
./xiaozhi-go -config config.yaml

Configuration

Edit config.yaml:

# OpenClaw Gateway
llm:
  base_url: "http://your-gateway:18789/v1"
  api_key: "your-token"
  model: "openclaw"

# ASR (OpenAI Whisper-compatible)
asr:
  base_url: "https://api.openai.com/v1"
  api_key: "sk-xxx"
  model: "whisper-1"
  language: "zh"

# TTS
tts:
  provider: "edge"
  voice: "zh-CN-XiaoxiaoNeural"

# Device → agent mapping
device_agents:
  "AA:BB:CC:DD:EE:FF": "kitchen_agent"

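To show how the config.yaml keys might map onto Go types, here is a hedged sketch. The type names, field names, and the `AgentFor` helper (including its case-insensitive MAC matching and fallback argument) are assumptions, not the contents of internal/config/config.go. The struct tags would be read by a YAML library such as gopkg.in/yaml.v3; the sketch builds the struct by hand so it runs with the standard library only.

```go
package main

import (
	"fmt"
	"strings"
)

// Service holds the settings shared by the llm and asr sections.
type Service struct {
	BaseURL  string `yaml:"base_url"`
	APIKey   string `yaml:"api_key"`
	Model    string `yaml:"model"`
	Language string `yaml:"language,omitempty"` // only present under asr
}

// TTS mirrors the tts section.
type TTS struct {
	Provider string `yaml:"provider"`
	Voice    string `yaml:"voice"`
}

// Config mirrors config.yaml. All names here are hypothetical.
type Config struct {
	LLM          Service           `yaml:"llm"`
	ASR          Service           `yaml:"asr"`
	TTS          TTS               `yaml:"tts"`
	DeviceAgents map[string]string `yaml:"device_agents"`
}

// AgentFor returns the agent mapped to a device MAC address, or fallback
// when the device is not listed. Matching ignores case and outer spaces
// (an assumption of this sketch).
func (c *Config) AgentFor(mac, fallback string) string {
	for k, agent := range c.DeviceAgents {
		if strings.EqualFold(strings.TrimSpace(k), strings.TrimSpace(mac)) {
			return agent
		}
	}
	return fallback
}

func main() {
	cfg := Config{
		LLM: Service{BaseURL: "http://your-gateway:18789/v1", APIKey: "your-token", Model: "openclaw"},
		ASR: Service{BaseURL: "https://api.openai.com/v1", APIKey: "sk-xxx", Model: "whisper-1", Language: "zh"},
		TTS: TTS{Provider: "edge", Voice: "zh-CN-XiaoxiaoNeural"},
		DeviceAgents: map[string]string{
			"AA:BB:CC:DD:EE:FF": "kitchen_agent",
		},
	}
	fmt.Println(cfg.AgentFor("aa:bb:cc:dd:ee:ff", "default_agent"))
	fmt.Println(cfg.AgentFor("11:22:33:44:55:66", "default_agent"))
}
```
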
Code structure

xiaozhi-go/
├── main.go                         # Entry point
├── config.yaml                     # Configuration
├── internal/
│   ├── config/config.go            # Config loading
│   ├── server/
│   │   ├── server.go               # WebSocket server
│   │   └── connection.go           # Core pipeline: Audio→VAD→ASR→LLM→TTS→Audio
│   ├── protocol/messages.go        # Xiaozhi protocol messages
│   ├── audio/
│   │   ├── opus.go                 # Opus codec + PCM/WAV utilities
│   │   └── vad.go                  # Energy-based VAD
│   ├── asr/openai_asr.go           # ASR: Whisper API
│   ├── llm/openclaw.go             # LLM: OpenClaw (SSE streaming)
│   └── tts/tts.go                  # TTS: Edge TTS / HTTP API
└── go.mod
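
The "SSE streaming" noted for llm/openclaw.go can be illustrated with a small parser. This sketch assumes the gateway emits OpenAI-style chat completion events (`data: {...}` lines carrying `choices[].delta.content`, ended by `data: [DONE]`), which the `/v1` base URL suggests but the README does not confirm. `ReadSSE` and `CollectSSE` are hypothetical names.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// streamChunk holds the only field this sketch reads from each event.
type streamChunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// ReadSSE scans an SSE body line by line and calls onText for every
// non-empty content delta. It stops at "data: [DONE]" or end of input.
// bufio.Scanner's default line limit (64 KiB) is assumed to be enough.
func ReadSSE(r io.Reader, onText func(string)) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // blank separators, comments, other SSE fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			return nil
		}
		var c streamChunk
		if err := json.Unmarshal([]byte(payload), &c); err != nil {
			return fmt.Errorf("decode chunk: %w", err)
		}
		for _, ch := range c.Choices {
			if ch.Delta.Content != "" {
				onText(ch.Delta.Content)
			}
		}
	}
	return sc.Err()
}

// CollectSSE joins all deltas from a string body; handy for tests.
func CollectSSE(body string) (string, error) {
	var sb strings.Builder
	err := ReadSSE(strings.NewReader(body), func(s string) { sb.WriteString(s) })
	return sb.String(), err
}

func main() {
	body := strings.Join([]string{
		`data: {"choices":[{"delta":{"content":"Hel"}}]}`,
		`data: {"choices":[{"delta":{"content":"lo."}}]}`,
		`data: [DONE]`,
	}, "\n\n")
	err := ReadSSE(strings.NewReader(body), func(s string) { fmt.Printf("delta: %q\n", s) })
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

In a real client the reader would be the HTTP response body, and each delta would feed the sentence splitter rather than being printed.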

Comparison with the Python version

Aspect               Python (xiaozhi-server)    Go (xiaozhi-go)
Lines of code        26,693                     ~900
Files                170                        10
Startup time         10-30 s (model loading)    <1 s
Memory usage         ~500 MB (PyTorch)          ~20 MB
Concurrency          asyncio + threading        goroutines (native)
VAD                  SileroVAD (PyTorch)        Energy VAD (zero dependencies)
Local ASR            FunASR                     Requires an external API
LLM                  Multiple providers         OpenClaw passthrough
TTS                  22 providers               Edge TTS / HTTP
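
To make the "energy VAD (zero dependencies)" entry concrete, the sketch below classifies 16-bit PCM frames by RMS amplitude and ends an utterance after a run of quiet frames. It is an illustration under stated assumptions, not the code in audio/vad.go: the `EnergyVAD` type, the threshold of 500, and the hangover of 2 frames are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// FrameRMS returns the root-mean-square amplitude of a 16-bit PCM frame.
func FrameRMS(frame []int16) float64 {
	if len(frame) == 0 {
		return 0
	}
	var sum float64
	for _, s := range frame {
		v := float64(s)
		sum += v * v
	}
	return math.Sqrt(sum / float64(len(frame)))
}

// EnergyVAD (hypothetical) tracks whether speech is in progress.
type EnergyVAD struct {
	Threshold float64 // frames with RMS above this count as voiced
	Hangover  int     // consecutive quiet frames that end an utterance

	speaking bool
	quiet    int
}

// Process consumes one frame. speaking reports whether an utterance is
// in progress after this frame; ended is true on the frame that closes it.
func (v *EnergyVAD) Process(frame []int16) (speaking, ended bool) {
	voiced := FrameRMS(frame) > v.Threshold
	switch {
	case voiced:
		v.speaking = true
		v.quiet = 0
	case v.speaking:
		v.quiet++
		if v.quiet >= v.Hangover {
			v.speaking = false
			v.quiet = 0
			return false, true
		}
	}
	return v.speaking, false
}

func main() {
	loud := []int16{3000, -3000, 3000, -3000}
	quiet := []int16{10, -10, 10, -10}

	vad := EnergyVAD{Threshold: 500, Hangover: 2}
	for i, f := range [][]int16{quiet, loud, loud, quiet, quiet} {
		speaking, ended := vad.Process(f)
		fmt.Printf("frame %d: rms=%.0f speaking=%v ended=%v\n", i, FrameRMS(f), speaking, ended)
	}
}
```

A fixed threshold is cheap but sensitive to background noise, which is the usual reason to move to a model-based VAD such as Silero.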

Future extensions

Silero VAD: load the Silero ONNX model with onnxruntime-go to replace the energy VAD
sherpa-onnx ASR: use the sherpa-onnx Go bindings for local, offline ASR
More TTS: add TTS providers such as Alibaba Cloud and Volcano Engine
