tts-web

Browser-native text-to-speech running 100% client-side via Rust/WASM.

Disclaimer: This is an experimental port. The model is Pocket TTS by Kyutai Labs.

Requirements

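A modern browser with WebAssembly support. Building from source also needs wasm-pack (with a Rust toolchain) and Bun, as used in Quick Start.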

Quick Start

# 1. Build WASM package
wasm-pack build crates/tts-wasm --target web

# 2. Start dev server
bun web/serve.mjs

Architecture

TTS model (crates/tts-wasm/): Pocket TTS compiled to WebAssembly via Candle. Generates speech from text tokens using a voice embedding.
mimi-rs: Shared Rust library for the Mimi audio codec (encoder + decoder + streaming transformer). Used by both tts-web and stt-web.
Web UI (web/): A Web Worker orchestrates model loading and generation and streams audio chunks back to the main thread for real-time playback.
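
The worker-to-main-thread flow in the Web UI can be modeled outside the browser with a thread and a channel. The sketch below is an analogy only: in the browser the boundary is a Web Worker and message passing, not std::thread. The names Msg, generate_chunks, and stream, and the chunk length of 1024 samples, are hypothetical and not taken from this repo.

```rust
use std::sync::mpsc;
use std::thread;

/// Messages sent from the worker to the main thread (hypothetical shape).
enum Msg {
    Chunk(Vec<f32>), // a slice of PCM samples ready for playback
    Done,            // generation finished
}

/// Stand-in for the TTS model: emits `n_chunks` chunks of silence.
fn generate_chunks(n_chunks: usize, chunk_len: usize, tx: mpsc::Sender<Msg>) {
    for _ in 0..n_chunks {
        // A real worker would run one generation step here.
        if tx.send(Msg::Chunk(vec![0.0; chunk_len])).is_err() {
            return; // receiver dropped, stop generating
        }
    }
    let _ = tx.send(Msg::Done);
}

/// Runs the worker on its own thread and consumes chunks as they arrive.
/// Returns the total number of samples received.
fn stream(n_chunks: usize, chunk_len: usize) -> usize {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || generate_chunks(n_chunks, chunk_len, tx));

    let mut total = 0;
    for msg in rx {
        match msg {
            // Playback would be scheduled here, before generation has finished.
            Msg::Chunk(samples) => total += samples.len(),
            Msg::Done => break,
        }
    }
    worker.join().expect("worker thread panicked");
    total
}

fn main() {
    let total = stream(5, 1024);
    println!("received {total} samples");
}
```

The point of the pattern is that the consumer handles each chunk as soon as it is sent, so playback can begin while later chunks are still being generated.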

Quantization

The model ships as a GGUF Q8_0 file (~130 MB). Weights are loaded directly as Q8_0 through Candle's QMatMul, so the ~97M quantized parameters occupy ~103 MB in memory instead of ~388 MB as F32. This cuts the memory traffic per inference step by roughly 4x.
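
The in-memory figures follow from the Q8_0 block layout used by GGML/GGUF: 32 int8 weights share one 16-bit scale, so each block takes 34 bytes. Below is a minimal sketch of that arithmetic, assuming exactly 97M parameters, all of them quantized. The function names are illustrative and not part of this repo.

```rust
/// Q8_0 (GGML/GGUF) block: 32 int8 weights plus one f16 scale.
const BLOCK_WEIGHTS: u64 = 32;
const BLOCK_BYTES: u64 = 32 + 2;

/// Bytes needed to hold `params` weights as Q8_0, rounded up to whole blocks.
fn q8_0_bytes(params: u64) -> u64 {
    let blocks = (params + BLOCK_WEIGHTS - 1) / BLOCK_WEIGHTS;
    blocks * BLOCK_BYTES
}

/// Bytes needed to hold `params` weights as 32-bit floats.
fn f32_bytes(params: u64) -> u64 {
    params * 4
}

fn main() {
    // Assumption: the "~97M" figure above, taken as an exact count.
    let params: u64 = 97_000_000;
    let q8 = q8_0_bytes(params);
    let full = f32_bytes(params);
    println!("Q8_0: {:.1} MB", q8 as f64 / 1e6);
    println!("F32:  {:.1} MB", full as f64 / 1e6);
    println!("ratio: {:.2}x", full as f64 / q8 as f64);
}
```

Under these assumptions the sketch gives about 103 MB for Q8_0 and 388 MB for F32, a ratio of about 3.76, which is consistent with the rounded "~4x" figure.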

