A desktop background agent that observes local streaming context and controls local streaming tools through model-generated, safety-validated action plans.
AuTuber runs as a background desktop app that captures your streaming environment (camera, screen, audio), sends structured observations to a language model, receives an action plan, validates it for safety, and executes approved actions through OBS and VTube Studio.
Primary workflow:
Capture inputs → Build observation → Call model → Parse action plan → Validate actions → Execute local actions
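The workflow above can be sketched as a single "tick" function. This is a minimal illustration, not the app's actual implementation: the `Observation`, `Action`, `ActionPlan`, and `PipelineStages` types and the `parsePlan` and `runTick` functions are hypothetical names chosen for this example, and the real data contracts are defined in SPEC.md.

```typescript
// Hypothetical types for illustration; the real contracts live in SPEC.md.
interface Observation {
  timestamp: number;
  transcript?: string;
  obsScene?: string;
}

interface Action {
  type: string;
  payload?: Record<string, unknown>;
}

interface ActionPlan {
  actions: Action[];
}

// Each pipeline stage is injected so it can be swapped for a mock.
interface PipelineStages {
  capture(): Promise<Observation>;
  callModel(observation: Observation): Promise<string>; // raw model text
  validate(action: Action): boolean;
  execute(action: Action): Promise<void>;
}

// Parse the raw model reply; malformed output yields an empty plan
// instead of throwing, so a bad reply never triggers an action.
function parsePlan(raw: string): ActionPlan {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const candidate = (parsed as { actions?: unknown }).actions;
      if (Array.isArray(candidate)) {
        const actions = candidate.filter(
          (a): a is Action =>
            typeof a === "object" &&
            a !== null &&
            typeof (a as { type?: unknown }).type === "string",
        );
        return { actions };
      }
    }
  } catch {
    // fall through to the empty plan below
  }
  return { actions: [] };
}

// One pass: capture -> model -> parse -> validate -> execute.
async function runTick(
  stages: PipelineStages,
): Promise<{ executed: Action[]; rejected: Action[] }> {
  const observation = await stages.capture();
  const raw = await stages.callModel(observation);
  const plan = parsePlan(raw);

  const executed: Action[] = [];
  const rejected: Action[] = [];
  for (const action of plan.actions) {
    if (stages.validate(action)) {
      await stages.execute(action);
      executed.push(action);
    } else {
      rejected.push(action);
    }
  }
  return { executed, rejected };
}
```

Injecting the stages keeps the loop testable with a mock provider: a fake `callModel` can return a fixed JSON string, and the test can check which actions reach `execute`.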
- Multi-source capture: Camera frames, screen/window capture, and audio transcription
- Model-agnostic: Supports OpenRouter, self-hosted OpenAI-compatible models, and mock providers
- Action automation: VTube Studio hotkey triggers, parameter adjustments, OBS scene/source control, overlay messages
- Safety by default: Action validation, cooldowns, autonomy levels, and confirmation gates
- Local control: OBS WebSocket and VTube Studio API integration
- Structured logging: Full pipeline visibility and debugging
- Settings UI: Configure capture sources, model providers, safety policies, and hotkey mappings
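As a rough illustration of how validation, cooldowns, autonomy levels, and confirmation gates could combine, here is a minimal sketch. Everything in it is an assumption made for this example: the autonomy level names (`observe`, `suggest`, `auto`), the `SafetyPolicy` shape, the `ActionValidator` class, and the action type strings are hypothetical and may not match the app's actual policy model.

```typescript
// Hypothetical autonomy levels and verdicts; the real names may differ.
type AutonomyLevel = "observe" | "suggest" | "auto";
type Verdict = "execute" | "needs_confirmation" | "rejected";

interface SafetyPolicy {
  autonomy: AutonomyLevel;
  allowedTypes: string[]; // action types the user has enabled
  cooldownMs: Record<string, number>; // per-type minimum spacing
  confirmTypes: string[]; // types that always need a manual OK
}

class ActionValidator {
  private readonly policy: SafetyPolicy;
  private readonly lastRun = new Map<string, number>();

  constructor(policy: SafetyPolicy) {
    this.policy = policy;
  }

  // Decide what to do with one proposed action at time `now` (ms).
  check(type: string, now: number): Verdict {
    // "observe" never acts; unknown action types are always refused.
    if (this.policy.autonomy === "observe") return "rejected";
    if (!this.policy.allowedTypes.includes(type)) return "rejected";

    // Enforce the cooldown since the last executed action of this type.
    const last = this.lastRun.get(type);
    const cooldown = this.policy.cooldownMs[type] ?? 0;
    if (last !== undefined && now - last < cooldown) return "rejected";

    // Confirmation gate: "suggest" mode or a gated action type.
    if (
      this.policy.autonomy === "suggest" ||
      this.policy.confirmTypes.includes(type)
    ) {
      return "needs_confirmation";
    }

    // Only actions that will actually run start a new cooldown window.
    this.lastRun.set(type, now);
    return "execute";
  }
}
```

In this sketch the cooldown clock starts only when an action is approved for execution, so a rejected or pending action does not delay the next legitimate one. Passing `now` explicitly, rather than reading the clock inside `check`, keeps the validator deterministic in tests.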
The app is built as an Electron desktop application with a clear separation of concerns:
- Main Process: Core logic, external API calls, IPC handlers, service layer
- Renderer: React UI for setup, status, controls, and logs
- Hidden Capture Window: Browser media APIs for frame/audio sampling
- Services: Modular services for OBS, VTube Studio, capture orchestration, and model routing
- Electron for cross-platform desktop app
- React for UI
- TypeScript for type safety
- Vite for fast builds
- Zod for data validation
- OBS WebSocket JS for OBS integration
- WebSocket (ws) for VTube Studio connection
pnpm install
pnpm dev
This starts the Vite dev server and the Electron app.
pnpm build
pnpm build:electron
autuber/
├── electron/ # Main Electron app
│ ├── src/
│ │ ├── main/ # Main process & services
│ │ ├── renderer/ # React UI
│ │ ├── preload/ # Secure IPC bridge
│ │ └── shared/ # Schemas & types
│ └── ...
├── apps/ # Future additional apps
├── packages/ # Shared packages
├── docs/ # Architecture & setup docs
├── models/ # Prompts & provider config
└── scripts/ # Build & dev scripts
The app uses a JSON config file (created on first run) with sensible defaults. Key settings:
- Model provider: OpenRouter, self-hosted, or mock
- Capture: Camera/screen FPS, resolution, audio sample rate
- Automation: Tick interval, max actions per tick, autonomy level
- Safety: Confirmation gates for scene changes and source visibility
- OBS & VTS: WebSocket endpoints and credentials
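To make the settings groups above concrete, here is a hypothetical sketch of a config shape and a defaults merge. The field names, default values, and the `withDefaults` helper are assumptions for illustration only and are not taken from the actual config file. The endpoints shown are the usual defaults for obs-websocket 5.x (port 4455) and the VTube Studio API (port 8001); adjust them if your setup differs.

```typescript
// Hypothetical config shape; the real schema is described in SPEC.md.
interface AppConfig {
  model: { provider: "openrouter" | "self-hosted" | "mock"; baseUrl?: string };
  capture: { cameraFps: number; screenFps: number; audioSampleRate: number };
  automation: {
    tickIntervalMs: number;
    maxActionsPerTick: number;
    autonomy: "observe" | "suggest" | "auto";
  };
  safety: { confirmSceneChanges: boolean; confirmSourceVisibility: boolean };
  obs: { url: string; password?: string };
  vts: { url: string };
}

// Illustrative defaults only: conservative values with confirmation gates on.
const DEFAULT_CONFIG: AppConfig = {
  model: { provider: "mock" },
  capture: { cameraFps: 1, screenFps: 1, audioSampleRate: 16000 },
  automation: { tickIntervalMs: 5000, maxActionsPerTick: 2, autonomy: "suggest" },
  safety: { confirmSceneChanges: true, confirmSourceVisibility: true },
  obs: { url: "ws://127.0.0.1:4455" },
  vts: { url: "ws://127.0.0.1:8001" },
};

// A user file may override any subset of fields within each section.
type PartialConfig = { [K in keyof AppConfig]?: Partial<AppConfig[K]> };

// Merge section by section so a partial section keeps its other defaults.
function withDefaults(user: PartialConfig): AppConfig {
  return {
    model: { ...DEFAULT_CONFIG.model, ...user.model },
    capture: { ...DEFAULT_CONFIG.capture, ...user.capture },
    automation: { ...DEFAULT_CONFIG.automation, ...user.automation },
    safety: { ...DEFAULT_CONFIG.safety, ...user.safety },
    obs: { ...DEFAULT_CONFIG.obs, ...user.obs },
    vts: { ...DEFAULT_CONFIG.vts, ...user.vts },
  };
}
```

For example, `withDefaults({ automation: { autonomy: "auto" } })` changes only the autonomy level and leaves the tick interval, capture rates, and endpoints at their defaults.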
See SPEC.md for full data contracts, service interfaces, and implementation details.
The app demonstrates a complete loop through the following steps:
- Connect to OBS and VTube Studio
- Configure a model provider
- Click "Analyze Now"
- The app captures OBS/VTS state and optional text
- The model generates an action plan
- The app validates and executes the actions
- View full results in logs
- SPEC.md — Complete technical specification, data contracts, service interfaces
- docs/architecture.md — System design & component interactions
- docs/security.md — Security model & secret handling
- docs/setup.md — Detailed setup instructions
Private project.