VibeTube

Local-first voice cloning, character creation, story assembly, and talking-character rendering.

VibeTube started as a fork/vendored copy of Voicebox, and the original project deserves huge credit. This repo now builds on that foundation with VibeTube's own workflow and UI.

VibeTube is a local-first app for building voice-driven character content with a React frontend and a FastAPI backend. In development it runs as a web app against a local server; in packaged use it runs as a Tauri desktop app with a bundled backend. The current product centers on Characters, Generate, Stories, Broadcast, and VibeTube rendering. It uses the local runtime by default and can be pointed at a different server through its connection settings where needed.

Demo

Watch the current app demo on YouTube: https://youtu.be/Oco9v5mhcpg?si=VhFm2XoPrfx1EYQc

This walkthrough predates the current Broadcast feature. Broadcast is already available in the app, and a newer demo video will be added later.

What VibeTube Includes

Characters

Create and edit characters from voice samples
Attach avatar images and VibeTube avatar state packs
Import and export voice-profile data where supported
Assign characters to channels for generation workflows

Generate

Generate speech with the Qwen3-TTS-backed workflow
Reuse saved characters in the main generation flow
Review generation history and replay outputs locally
Download generated audio from the app

Stories

Build multi-clip, multi-character story compositions
Arrange clips in a timeline/track editor
Reuse generated clips inside story workflows
Keep story editing separate from one-off generations

Broadcast

Use your avatar as a PNGTuber in live sessions
Stream VibeTube output into OBS Studio
Drive avatar speaking states from your voice in real time
Use Broadcast as a live workflow separate from offline render exports

VibeTube Rendering

Render talking-character output from avatar states
Tune render settings for motion, timing, and output size
Use background color or uploaded background images
Support character/video workflows, not just standalone audio

Transcription and Audio Capture

Record audio inside the app
Transcribe audio with Whisper-backed tooling
Capture system audio where the platform supports it
Manage model downloads needed for transcription and generation

Runtime and Model Management

Bundled backend in packaged desktop builds
Separate backend process in development
Model download, status, and cache management
Local-first operation with configurable server connection settings

Download

Windows installers are available from the latest GitHub release. macOS and Linux packaged downloads are not published yet.

Platform	Download
macOS	TBD
Windows	Latest release
Linux	TBD

Development

For full setup and contribution details, see CONTRIBUTING.md and DEV_RUN.md.

Quick local run

Prerequisites:

Bun
Python 3.11+
Rust (needed only for Tauri desktop development)

# Install JavaScript dependencies
bun install

# Create a Python virtual environment, then activate it with the line for your platform
python -m venv .venv

# Windows PowerShell
.\.venv\Scripts\Activate.ps1

# macOS / Linux
source .venv/bin/activate

# Install backend dependencies
pip install -r backend/requirements.txt

# Start the backend on http://127.0.0.1:17493
python -m uvicorn backend.main:app --host 127.0.0.1 --port 17493 --reload

# In another terminal, start the web app on http://127.0.0.1:5173
bun run dev:web -- --host 127.0.0.1

Desktop app

# Starts the Tauri app in development mode
bun run dev

In development, the desktop app expects the backend to be started separately. In packaged desktop builds, the backend is bundled and started by the app.
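
Because the development desktop app does not start the backend itself, it can help to confirm that something is listening on the backend port before running `bun run dev`. The following is a minimal sketch, not part of the repo: the function name `backend_is_listening` is hypothetical, and it only checks that a TCP connection to 127.0.0.1:17493 succeeds, not that the process behind it is actually the VibeTube backend.

```python
import socket


def backend_is_listening(host: str = "127.0.0.1", port: int = 17493, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; the defaults match the address
    used in the quick local run commands above.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if backend_is_listening():
        print("Backend port is open; start the desktop app with: bun run dev")
    else:
        print("Nothing is listening on 127.0.0.1:17493; start the backend first.")
```

A port check is deliberately the weakest possible probe: it makes no assumption about which routes the backend exposes.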

API

VibeTube exposes a FastAPI backend for the same core workflows used by the app:

characters / voice profiles
generation
transcription
stories
models
active task tracking

When the backend is running locally, interactive API docs are available at:

http://127.0.0.1:17493/docs
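
To see which routes the running backend actually exposes without opening a browser, the OpenAPI schema behind the docs page can be read directly. The sketch below assumes the schema is served at FastAPI's default `/openapi.json` path, which holds unless the app overrides it; the helper names `fetch_openapi_schema` and `list_operations` are hypothetical and not part of VibeTube.

```python
import json
from urllib.request import urlopen


def fetch_openapi_schema(base_url: str = "http://127.0.0.1:17493", timeout: float = 5.0) -> dict:
    """Download the OpenAPI schema from a locally running backend."""
    # Assumption: the schema lives at FastAPI's default /openapi.json path.
    with urlopen(f"{base_url}/openapi.json", timeout=timeout) as response:
        return json.load(response)


def list_operations(schema: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI schema, sorted by path."""
    http_methods = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items can also hold non-operation keys such as "parameters".
            if method in http_methods:
                operations.append((method.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


if __name__ == "__main__":
    try:
        for method, path in list_operations(fetch_openapi_schema()):
            print(f"{method:7} {path}")
    except OSError as exc:
        # URLError is a subclass of OSError; raised when the backend is not running.
        print(f"Could not reach the backend: {exc}")
```

Reading the schema rather than hard-coding paths keeps the listing accurate as the API changes.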

Project Structure

VibeTube/
|-- app/         # Shared application UI
|-- web/         # Web wrapper/runtime
|-- tauri/       # Desktop wrapper/runtime
|-- backend/     # FastAPI backend and model orchestration
|-- docs/        # Documentation site content
|-- scripts/     # Build and maintenance scripts
`-- legacy_cli/  # Archived legacy CLI work

Contributing

Contribution guidelines live in CONTRIBUTING.md.

Security

Security reporting information is in SECURITY.md.

License

VibeTube is released under the MIT License. See LICENSE. For an informational responsibility notice that does not alter the MIT terms, see DISCLAIMER.md.
