otoji (音字)

realtime speech ⇄ text: 音を字に ("sound into text")

otoji is a Rust workspace that wires up streaming ASR, LLM-polished transcripts, and TTS behind a single react-ink-style terminal UI built on ratatui.

mic / file ──► AudioChunk ──► AsrProvider ──► AsrEvent ──► Polisher ──► TUI
                                                                    └─► transcript.md
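The data flow above can be sketched with plain channels. This is only an illustration under stated assumptions: the type and trait names come from the workspace table, but every field, method signature, and the `FakeAsr` provider here are hypothetical and may not match the real crates.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for `otoji_core::AudioChunk`.
#[derive(Debug, Clone)]
struct AudioChunk {
    samples: Vec<i16>, // assumed: 16-bit PCM samples
}

/// Hypothetical stand-in for `otoji_core::AsrEvent`.
#[derive(Debug, Clone, PartialEq)]
enum AsrEvent {
    Partial(String),                     // hypothesis that may still change
    Final { seg_id: u32, text: String }, // confirmed segment
}

/// Assumed trait shape: audio goes in, zero or more events come out.
trait AsrProvider {
    fn feed(&mut self, chunk: &AudioChunk) -> Vec<AsrEvent>;
}

/// Assumed trait shape: raw text in, cleaned text out.
trait Polisher {
    fn polish(&self, raw: &str) -> String;
}

struct NoopPolisher;
impl Polisher for NoopPolisher {
    fn polish(&self, raw: &str) -> String {
        raw.to_string()
    }
}

/// Fake provider for the demo: one partial per chunk, one final every `every` chunks.
struct FakeAsr {
    seen: u32,
    every: u32,
    next_seg: u32,
}

impl AsrProvider for FakeAsr {
    fn feed(&mut self, chunk: &AudioChunk) -> Vec<AsrEvent> {
        self.seen += 1;
        let mut out = vec![AsrEvent::Partial(format!("{} samples", chunk.samples.len()))];
        if self.seen % self.every == 0 {
            out.push(AsrEvent::Final {
                seg_id: self.next_seg,
                text: format!("segment {}", self.next_seg),
            });
            self.next_seg += 1;
        }
        out
    }
}

/// Runs audio -> ASR -> polish and returns the transcript lines.
fn run_pipeline(
    chunks: Vec<AudioChunk>,
    mut asr: impl AsrProvider + Send + 'static,
    polisher: &dyn Polisher,
) -> Vec<String> {
    let (audio_tx, audio_rx) = mpsc::channel::<AudioChunk>();
    let (event_tx, event_rx) = mpsc::channel::<AsrEvent>();

    // The ASR stage runs on its own thread and ends when the audio channel closes.
    let worker = thread::spawn(move || {
        for chunk in audio_rx {
            for ev in asr.feed(&chunk) {
                if event_tx.send(ev).is_err() {
                    return; // consumer went away
                }
            }
        }
    });

    for chunk in chunks {
        audio_tx.send(chunk).expect("ASR worker hung up");
    }
    drop(audio_tx); // signal end of audio

    let mut transcript = Vec::new();
    for ev in event_rx {
        match ev {
            AsrEvent::Partial(text) => eprintln!("partial: {text}"),
            AsrEvent::Final { seg_id, text } => {
                transcript.push(format!("[{}] {}", seg_id, polisher.polish(&text)));
            }
        }
    }
    worker.join().expect("ASR worker panicked");
    transcript
}

fn main() {
    let chunks = (0..4).map(|_| AudioChunk { samples: vec![0; 640] }).collect();
    let asr = FakeAsr { seen: 0, every: 2, next_seg: 0 };
    for line in run_pipeline(chunks, asr, &NoopPolisher) {
        println!("{line}");
    }
}
```

Only confirmed segments reach the transcript in this sketch; partials are logged and discarded, which mirrors the split between the TUI's live hypothesis and the segments written to transcript.md.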

Workspace layout

Crate	Purpose
`otoji-core`	Shared types: `AudioChunk`, `AsrEvent`, `Word`, `OtojiError`
`otoji-audio`	Audio sources — `cpal` mic capture (with resampling) and PCM file replay
`otoji-asr`	`AsrProvider` trait + `iflytek_rtasr` (HMAC-SHA1 `signa` signature, WebSocket)
`otoji-tts`	`TtsProvider` trait + `iflytek_tts` (HMAC-SHA256 auth, MP3/PCM streaming)
`otoji-polish`	`Polisher` trait + `NoopPolisher` and `AnthropicPolisher` (Claude Haiku 4.5 default)
`otoji-cli`	`otoji` binary — clap subcommands + ratatui TUI
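To make the "with resampling" note on `otoji-audio` concrete, here is a minimal sketch of converting a mono stream from a device rate to the 16 kHz used elsewhere in this README. It uses naive linear interpolation, which is an assumption for illustration: the crate's actual resampler is not described here, and `resample_linear` is a hypothetical name.

```rust
/// Resamples mono `f32` samples from `from_hz` to `to_hz` by linear interpolation.
/// Illustrative only: there is no low-pass filter, so downsampling can alias.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == to_hz {
        return input.to_vec();
    }
    // Number of output samples covering the same time span as the input.
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // Distance between consecutive output samples, measured in input samples.
    let step = from_hz as f64 / to_hz as f64;

    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            // Clamp at the last sample so the final output does not read out of bounds.
            let b = input[(idx + 1).min(input.len() - 1)];
            a + (b - a) * frac
        })
        .collect()
}

fn main() {
    // A 48 kHz ramp reduced to 16 kHz keeps every third sample.
    let ramp: Vec<f32> = (0..6).map(|i| i as f32).collect();
    println!("{:?}", resample_linear(&ramp, 48_000, 16_000));
}
```

A production path would normally filter before decimating; the sketch only shows the index arithmetic.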

See ./docs/ for the architecture rationale and a comparison of real-time ASR providers (iFlytek RTASR / CoLi / SenseVoice / Whisper / Deepgram).

Build

cargo build --release

Usage

# 1) Live mic → RTASR → polished TUI
export IFLYTEK_APP_ID=...
export IFLYTEK_API_KEY=...
export ANTHROPIC_API_KEY=...   # optional, enables LLM polish layer
cargo run -p otoji-cli -- listen

# 2) Replay a 16kHz mono PCM file in real time
cargo run -p otoji-cli -- file 16k_10.pcm

# 3) Synthesize speech via iFlytek TTS
export IFLYTEK_TTS_API_KEY=...
export IFLYTEK_TTS_API_SECRET=...
cargo run -p otoji-cli -- speak "你好，世界" --out hello.mp3
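Replaying a file "in real time" (example 2) means pacing the frames by their audio duration rather than sending the file as fast as it can be read. The sketch below shows the arithmetic for 16 kHz mono PCM. The 16-bit sample width and the 40 ms frame size are assumptions, and `frame_bytes`, `duration_of`, and `schedule` are hypothetical helpers, not functions from `otoji-audio`.

```rust
use std::time::{Duration, Instant};

const SAMPLE_RATE_HZ: u32 = 16_000;
const BYTES_PER_SAMPLE: u32 = 2; // assumption: 16-bit samples
const FRAME_MS: u32 = 40; // assumption: 40 ms frames

/// Bytes in one full frame of mono PCM.
fn frame_bytes() -> usize {
    (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * FRAME_MS / 1000) as usize
}

/// Playback duration of `len` bytes of mono PCM.
fn duration_of(len: usize) -> Duration {
    let samples = len as u64 / BYTES_PER_SAMPLE as u64;
    Duration::from_micros(samples * 1_000_000 / SAMPLE_RATE_HZ as u64)
}

/// Splits `pcm` into frames, each paired with its send time relative to the start.
fn schedule(pcm: &[u8]) -> Vec<(Duration, &[u8])> {
    let mut offset = Duration::ZERO;
    pcm.chunks(frame_bytes())
        .map(|frame| {
            let at = offset;
            offset += duration_of(frame.len());
            (at, frame)
        })
        .collect()
}

fn main() {
    let pcm = vec![0u8; 3200]; // 100 ms of silence
    let start = Instant::now();
    for (at, frame) in schedule(&pcm) {
        // Sleep until the frame's deadline, measured from the start to avoid drift.
        if let Some(wait) = at.checked_sub(start.elapsed()) {
            std::thread::sleep(wait);
        }
        println!("t={:?} send {} bytes", at, frame.len());
    }
}
```

Under these assumptions a frame is 1280 bytes, so a 3200-byte buffer yields two full frames and one 640-byte tail.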

TUI

The transcript view shows:

Confirmed segments, each prefixed with [seg_id], in white bold once polished or gray while still raw and awaiting polish
The current partial hypothesis as ░ ... in dark gray italic
A header with provider state and counts
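The view rules above can be modelled as a small state type that maps events to styled lines. This is a sketch under assumptions: `Transcript`, `Segment`, `LineStyle`, and the rule that a confirmed segment clears the pending partial are all hypothetical, and the real TUI would translate the style tags into ratatui styles rather than return them as an enum.

```rust
/// Style tags matching the three cases listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineStyle {
    WhiteBold,      // confirmed and polished
    Gray,           // confirmed, raw, awaiting polish
    DarkGrayItalic, // current partial hypothesis
}

struct Segment {
    seg_id: u32,
    raw: String,
    polished: Option<String>,
}

#[derive(Default)]
struct Transcript {
    segments: Vec<Segment>,
    partial: Option<String>,
}

impl Transcript {
    /// A new partial replaces the previous one.
    fn on_partial(&mut self, text: &str) {
        self.partial = Some(text.to_string());
    }

    /// Assumption: confirming a segment clears the pending partial.
    fn on_final(&mut self, seg_id: u32, raw: &str) {
        self.partial = None;
        self.segments.push(Segment { seg_id, raw: raw.to_string(), polished: None });
    }

    /// Polished text arrives later and upgrades the matching segment in place.
    fn on_polished(&mut self, seg_id: u32, text: &str) {
        if let Some(seg) = self.segments.iter_mut().find(|s| s.seg_id == seg_id) {
            seg.polished = Some(text.to_string());
        }
    }

    /// Lines to draw, top to bottom.
    fn lines(&self) -> Vec<(LineStyle, String)> {
        let mut out: Vec<(LineStyle, String)> = self
            .segments
            .iter()
            .map(|s| match &s.polished {
                Some(p) => (LineStyle::WhiteBold, format!("[{}] {}", s.seg_id, p)),
                None => (LineStyle::Gray, format!("[{}] {}", s.seg_id, s.raw)),
            })
            .collect();
        if let Some(p) = &self.partial {
            out.push((LineStyle::DarkGrayItalic, format!("░ {}", p)));
        }
        out
    }
}

fn main() {
    let mut t = Transcript::default();
    t.on_final(0, "hello wrld");
    t.on_final(1, "second segment");
    t.on_polished(0, "Hello, world.");
    t.on_partial("third seg");
    for (style, text) in t.lines() {
        println!("{style:?}: {text}");
    }
}
```

Keeping the raw text alongside the optional polished text lets a segment render immediately in gray and switch to white bold whenever the polisher responds.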

Press q / Esc / Ctrl-C to quit.

Roadmap

otoji-asr/coli.rs — CoLi ASR via ListenHub
otoji-asr/sensevoice.rs — FunASR self-host bridge
otoji-tts/edge_tts.rs — Microsoft Edge TTS as a free fallback
otoji-cli record — write transcripts to *.md next to the source audio
Bench harness (CER / latency / cost) under crates/otoji-bench

License

MIT
