Skill files in this repo are auto-overwritten on each release. For feedback or fixes, please open an issue instead of submitting a PR.
Voice in your AI workflow. Six skills that let any AI coding agent (Claude Code · Cursor · Codex · Gemini CLI · Cline) speak in 200+ voices, generate podcasts, dub videos, transcribe audio, and turn text into card images or narrated card videos — through one CLI.
Why VoxFlow over a raw TTS API? One CLI handles auth, voice search, multi-speaker dialogue, video pipelines, and quota. The skills layer makes it native to whichever agent you're already using — no new context-switch.
The simplest path — let the VoxFlow CLI auto-detect your agent and run the right command:
npm install -g voxflow
voxflow skills install

It detects Claude Code / Cursor / Codex / Gemini / WorkBuddy / OpenClaw on your $PATH, picks the right install command, asks for confirmation, runs it, and prints next steps. Use --all to install for every detected agent, or --for <agent> to force one.
If you'd rather run the install command directly — one command for every agent:
npx -y skills add VoxFlowStudio/skills --all --yes --global

The skills npm package detects every AI agent on your machine (Claude Code, Cursor, Codex CLI, Gemini CLI, Cline, Amp, Antigravity, CodeBuddy, OpenClaw…) and writes the 6 VoxFlow skills (hub, podcast, transcribe, video, slice, card) to each agent's standard skill location in a single shot.
npm install -g voxflow
voxflow login # one-time browser auth

Six focused skills, each loaded on demand:
| Skill | Invoked as | What it covers |
|---|---|---|
| hub | voxflow:hub | say · narrate · story · voices · auth · quota · feedback |
| podcast | voxflow:podcast | Multi-speaker AI podcast from topic / URL / script |
| transcribe | voxflow:transcribe | asr · asr-jobs · translate · dub · video-translate · summarize · publish |
| video | voxflow:video | picstory · present · slides · explain · image |
| slice | voxflow:slice | Article → vertical card video (1080×1920); 13 editorial / poster / magazine themes |
| card | voxflow:card | Text → shareable card images (HTML/CSS + Playwright); 1:1 / 3:4 / 9:16, editorial design system. Optional narrated MP4 video via voxflow card render (TTS + FFmpeg, in-project output) |
Use the hub skill as the starting point — it routes to the others automatically.
| You say | Agent runs |
|---|---|
| "Read this README out loud" | voxflow narrate README.md -o readme.mp3 |
| "Make a 5-min podcast on AI agents" | voxflow podcast "AI agents" --length short |
| "Dub this tutorial into Japanese" | voxflow video-translate tutorial.mp4 --to ja |
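| "把这段话合成语音" (Synthesize this passage into speech) | voxflow say "..." -o output.mp3 |
| "生成一个 AI 播客" (Generate an AI podcast) | voxflow podcast "topic" --length medium |
| "把这个视频翻译成日语" (Translate this video into Japanese) | voxflow video-translate video.mp4 --to ja |
| "转录这段录音" (Transcribe this recording) | voxflow asr recording.mp3 |
| "做一个 AI 知识短视频" (Make a short AI explainer video) | voxflow picstory "topic" --style sketchnote |
| "生成一套演示幻灯片" (Generate a presentation slide deck) | voxflow slides "topic" --slides 8 |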
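
The narrate row above extends naturally to a batch job. The following is a minimal sketch, assuming only the `voxflow narrate <file> -o <output>` form shown in the table. The `narrate_all` function and the `DRY_RUN` variable are hypothetical helpers for illustration, not part of the VoxFlow CLI.

```shell
# Hypothetical helper: narrate each Markdown file given as an argument.
# With DRY_RUN=1 it only prints the commands it would run.
narrate_all() {
  for _f in "$@"; do
    _out="${_f%.md}.mp3"   # README.md -> README.mp3 (keeps the original case)
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "voxflow narrate $_f -o $_out"
    else
      # Stop at the first failure so a quota or auth error is not repeated.
      voxflow narrate "$_f" -o "$_out" || return 1
    fi
  done
}
```

Running with `DRY_RUN=1` first is a cheap way to review the output names before spending quota on a large batch.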
voxflow/
  hub/SKILL.md         # TTS, voice search, auth, quota, feedback
  podcast/SKILL.md     # AI dialogue podcast
  transcribe/SKILL.md  # ASR, translation, dubbing
  video/SKILL.md       # AI short video, slides, images
  slice/SKILL.md       # Article → vertical card video (13 themes)
  card/SKILL.md        # Text → shareable card images (1:1 / 3:4 / 9:16)
registry.json # VoxFlow CLI add-on recipes index
voxflow/
  dub-anime-jp-zh/     # Anime fan-dub voice preset (JP→ZH)
Install a recipe:
voxflow add dub-anime-jp-zh

Free tier: 10,000 / month. Check before large jobs:
voxflow status

| Operation | Cost |
|---|---|
| say (1 call) | ~100 |
| narrate (per segment) | ~100 |
| podcast (medium) | ~5,000 |
| picstory (5 scenes) | ~3,100 |
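
To judge whether a planned job fits the free tier, the approximate costs in the table can be multiplied out before running anything. The sketch below hard-codes the table's rounded figures. The function names and the `FREE_TIER` variable are illustrative assumptions, not part of the CLI, and `voxflow status` remains the authoritative source for your actual balance.

```shell
# Approximate per-operation costs copied from the table above.
FREE_TIER=10000

op_cost() {
  case "$1" in
    say)      echo 100 ;;    # 1 call
    narrate)  echo 100 ;;    # per segment
    podcast)  echo 5000 ;;   # medium length
    picstory) echo 3100 ;;   # 5 scenes
    *)        echo "unknown operation: $1" >&2; return 1 ;;
  esac
}

# estimate_cost OP COUNT: approximate total for COUNT runs of OP.
estimate_cost() {
  _unit=$(op_cost "$1") || return 1
  echo $(( _unit * $2 ))
}

# fits_free_tier OP COUNT: succeeds if the estimate is within FREE_TIER.
fits_free_tier() {
  _total=$(estimate_cost "$1" "$2") || return 1
  [ "$_total" -le "$FREE_TIER" ]
}
```

For example, `estimate_cost picstory 3` prints 9300, which fits the monthly free tier, while a fourth five-scene picstory would exceed it.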
- No API keys or tokens in this repository.
- Use voxflow login for interactive auth or VOXFLOW_TOKEN env var for CI.
- See SECURITY.md for vulnerability disclosure.
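
The split between interactive login and token auth can be checked up front in a script. This is a minimal sketch: only `voxflow login` and `VOXFLOW_TOKEN` come from this README, while the `voxflow_auth_mode` function and the use of a `CI` environment variable to detect a CI runner are assumptions for illustration.

```shell
# Hypothetical guard: report which auth path applies in the current shell.
voxflow_auth_mode() {
  if [ -n "${VOXFLOW_TOKEN:-}" ]; then
    echo "token"         # token supplied through the environment
  elif [ -n "${CI:-}" ]; then
    echo "missing"       # CI runner without a token: no browser available
    return 1
  else
    echo "interactive"   # local shell: run `voxflow login` once
  fi
}
```

A CI job could call this first and fail fast on "missing" instead of hanging on a browser prompt. Keep the token in the CI provider's secret store rather than in the repository.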
- VoxFlow Studio: Web app
- CLI on npm: npm install -g voxflow
- CLI docs: Full command reference
