podcommentators

podcommentators is a browser-based AI commentary companion for live audio and video. It can listen to your microphone, camera, screen share, or a local/remote stream URL, transcribe the audio in real time, and let a configurable cast of AI commentators react live in an overlay.

The app is built with Next.js, React, and TypeScript, and runs fully on the client. Your keys and commentator settings are stored in browser localStorage.

What It Does

Captures audio from:
- microphone
- camera
- screen share
- stream URL
Shows video for:
- camera
- screen share
- video stream URLs
Transcribes live audio with:
- ElevenLabs Scribe, if configured
- Gemini audio transcription, if configured
- Web Speech API fallback for basic mic-only browser support
Streams live AI reactions from a configurable set of commentators
Supports two viewing modes:
- Regular: source + transcript only
- Enhanced: source + transcript + commentator overlay
Includes an in-app settings editor for commentator CRUD:
- create
- edit
- duplicate
- enable/disable
- delete

Current Status

podcommentators is ready for local use and repo publishing.

Implemented:

local Next.js app
browser-based settings persistence
editable commentators
camera, screen, mic, and stream URL input modes
focused video view with a header stop button
OBS guidance for camera and local HLS workflows
saved YouTube ingest settings in the UI

Not yet implemented:

direct YouTube live publishing from the browser app itself

Why not: YouTube's standard live ingest is RTMP/RTMPS, which a browser page cannot speak directly. Publishing therefore requires an encoder or relay layer that this app does not currently provide.

Requirements

Node.js 20.9 or newer (the minimum supported by Next.js 16)
npm
Chrome or Edge recommended for the best media capture support

Optional accounts and keys:

Gemini API key for AI commentary and Gemini transcription
ElevenLabs API key for higher-quality transcription

Quick Start

git clone https://github.com/YOUR_USERNAME/podcommentators.git
cd podcommentators
npm install
npm run dev

Then open:

http://localhost:3000

First Run Setup

Open the app in your browser.
Click Settings.
Add your API keys:
- Gemini API key
- ElevenLabs API key, optional
Save settings.
Choose a source and start listening.

Source Modes

Mic

Use your microphone or a virtual audio device as the input source.

Best for:

quick testing
podcasts routed into a virtual input
voice-only sessions

Camera

Uses getUserMedia() for live video plus audio capture.

Best for:

webcam sessions
OBS Virtual Camera workflows

Notes:

the camera feed becomes the main video view
when video is visible, the app switches to focused video mode
click the header Stop button to return to the source/transcript screen
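
As a rough illustration of the camera request, here is a minimal sketch. The helper name buildCameraConstraints and its parameters are hypothetical and not taken from the app's source; only the shape of the returned object follows the standard getUserMedia() constraints dictionary.

```typescript
// Hypothetical helper: builds a constraints object for getUserMedia().
// The interface below is a local stand-in for the relevant part of the
// standard MediaStreamConstraints dictionary.
interface CameraConstraints {
  video: true | { deviceId: { exact: string } };
  audio: boolean;
}

function buildCameraConstraints(deviceId?: string, withAudio = true): CameraConstraints {
  return {
    // Pin a specific device (for example OBS Virtual Camera) when its id is
    // known; otherwise let the browser pick the default camera.
    video: deviceId ? { deviceId: { exact: deviceId } } : true,
    audio: withAudio,
  };
}

// In the browser (not executed in this sketch):
// const stream = await navigator.mediaDevices.getUserMedia(buildCameraConstraints());
// videoElement.srcObject = stream;
```

Device ids can be discovered with navigator.mediaDevices.enumerateDevices() once camera permission has been granted.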

Screen

Uses getDisplayMedia() for screen video and optional system audio, then mixes mic audio in separately.

Best for:

presenting a browser tab or app window
reacting to livestreams or desktop content
capturing screen video in the main preview

Notes:

for tab or system audio, the browser share dialog must include the audio-sharing option
Firefox support for system audio is limited
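
The mixing step described above can be sketched with the Web Audio API. This is an assumption about how such mixing is commonly done, not a copy of the app's code; mixAudio and the small structural interfaces are hypothetical names introduced so the sketch does not depend on DOM typings.

```typescript
// Minimal structural types standing in for MediaStream, AudioNode,
// MediaStreamAudioDestinationNode, and AudioContext.
interface StreamLike { getAudioTracks(): unknown[] }
interface NodeLike { connect(destination: NodeLike): unknown }
interface DestinationLike<S> extends NodeLike { stream: S }
interface AudioContextLike<S> {
  createMediaStreamSource(stream: S): NodeLike;
  createMediaStreamDestination(): DestinationLike<S>;
}

// Routes the audio of every input stream into one destination node and
// returns that node's output stream.
function mixAudio<S extends StreamLike>(ctx: AudioContextLike<S>, inputs: S[]): S {
  const destination = ctx.createMediaStreamDestination();
  for (const input of inputs) {
    // A screen share started without the audio option has no audio tracks,
    // and createMediaStreamSource() throws for such streams, so skip them.
    if (input.getAudioTracks().length === 0) continue;
    ctx.createMediaStreamSource(input).connect(destination);
  }
  return destination.stream;
}

// In the browser (not executed in this sketch):
// const screen = await navigator.mediaDevices.getDisplayMedia({ video: true, audio: true });
// const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
// const mixed = mixAudio(new AudioContext(), [screen, mic]);
```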

Stream URL

Loads a direct media URL and can show video if the URL points to a supported video stream or file.

Typical examples:

.m3u8
.mp4
.webm
.mov
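
A minimal sketch of how a URL could be sorted into these categories by extension. classifyStreamUrl and StreamKind are hypothetical names, and the README does not say whether the app inspects extensions this way.

```typescript
type StreamKind = "hls" | "file" | "unknown";

// Direct-file extensions taken from the examples listed above.
const FILE_EXTENSIONS = [".mp4", ".webm", ".mov"];

function classifyStreamUrl(url: string): StreamKind {
  // Drop the query string and fragment, then compare the path suffix.
  const path = url.replace(/[?#].*$/, "").toLowerCase();
  if (path.endsWith(".m3u8")) return "hls";
  if (FILE_EXTENSIONS.some((ext) => path.endsWith(ext))) return "file";
  return "unknown";
}
```

An "unknown" result does not mean the URL cannot play; a server may deliver a supported format without a recognizable extension.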

Focused Video Mode

When video is active through:

camera
screen share
video stream URL

podcommentators hides:

the source panel
the transcript panel

And shows:

the video preview
the commentator overlay in Enhanced mode
a Stop button in the top header

Clicking Stop returns the app to the source/transcript screen.

Regular vs Enhanced

Regular

source controls visible
transcript visible
commentators hidden

Enhanced

source controls visible when not in focused video mode
transcript visible when not in focused video mode
commentator overlay enabled
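
The visibility rules for the two modes and focused video mode can be summarized as a small pure function. This is a sketch only: resolveLayout and the Layout field names are hypothetical, and it assumes focused video mode hides the source and transcript panels in Regular mode as well as in Enhanced mode.

```typescript
type ViewMode = "regular" | "enhanced";

interface Layout {
  sourcePanel: boolean;
  transcriptPanel: boolean;
  videoPreview: boolean;
  commentatorOverlay: boolean;
  headerStopButton: boolean;
}

// videoActive is true while a camera, screen share, or video stream URL
// is producing video.
function resolveLayout(mode: ViewMode, videoActive: boolean): Layout {
  return {
    sourcePanel: !videoActive,
    transcriptPanel: !videoActive,
    videoPreview: videoActive,
    commentatorOverlay: mode === "enhanced",
    headerStopButton: videoActive,
  };
}
```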

AI Commentators

Commentators are fully editable from Settings.

For each commentator you can change:

name
title
icon
accent color
enabled/disabled state
search grounding on/off
cooldown
temperature
max tokens
system prompt
relevance prompt

Supported CRUD actions:

add a new commentator
edit an existing commentator
duplicate a commentator
delete a commentator

Settings are stored in localStorage per browser profile.
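
For orientation, the editable fields could map onto a record like the one below. The interface and every field name in it are illustrative guesses, not the definitions in src/types/index.ts, and the cooldown unit in particular is an assumption.

```typescript
// Hypothetical shape of one commentator, mirroring the editable fields above.
interface Commentator {
  id: string;
  name: string;
  title: string;
  icon: string;
  accentColor: string;
  enabled: boolean;
  searchGrounding: boolean;
  cooldownSeconds: number; // unit is an assumption
  temperature: number;
  maxTokens: number;
  systemPrompt: string;
  relevancePrompt: string;
}

// Duplicate: copy every field, then give the copy its own id and a
// distinguishable name. The source object is left untouched.
function duplicateCommentator(source: Commentator, newId: string): Commentator {
  return { ...source, id: newId, name: `${source.name} (copy)` };
}
```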

API Keys and Privacy

Keys are stored locally in the browser via localStorage.

They are used for:

Gemini requests directly from the client to Google
ElevenLabs transcription requests directly from the client to ElevenLabs

There is no backend in this repo handling your secrets.

That is convenient for local use, but it also means:

do not use this architecture for a public multi-user deployment without a backend
anyone with access to that browser profile can read the keys stored in it
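
A minimal sketch of this kind of local key storage. The storage key, the ApiKeys shape, and the helper names are hypothetical; the store is passed in as a parameter so the same code works with window.localStorage or an in-memory stand-in.

```typescript
// The subset of the Web Storage interface this sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface ApiKeys {
  geminiApiKey: string;
  elevenLabsApiKey: string;
}

const STORAGE_KEY = "podcommentators.apiKeys"; // hypothetical key name
const EMPTY_KEYS: ApiKeys = { geminiApiKey: "", elevenLabsApiKey: "" };

function saveApiKeys(store: KeyValueStore, keys: ApiKeys): void {
  // Stored as plain JSON: readable by anyone who can open this profile.
  store.setItem(STORAGE_KEY, JSON.stringify(keys));
}

function loadApiKeys(store: KeyValueStore): ApiKeys {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return { ...EMPTY_KEYS };
  try {
    const parsed = JSON.parse(raw) as Partial<ApiKeys>;
    return { ...EMPTY_KEYS, ...parsed };
  } catch {
    // Corrupt or hand-edited data falls back to empty keys.
    return { ...EMPTY_KEYS };
  }
}

// In the browser: saveApiKeys(window.localStorage, keys);
```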

OBS and Streaming Software

podcommentators includes an OBS setup guide in the UI. There are two main workflows.

OBS Virtual Camera

Use this when you want OBS output to appear as a webcam inside podcommentators.

High-level flow:

Start Virtual Camera in OBS.
In podcommentators, choose Camera.
Start camera and select OBS Virtual Camera.
Route audio separately with a virtual audio device if needed.

Local HLS via MediaMTX

Use this when you want OBS to publish to a local RTMP server and podcommentators to load the resulting HLS URL.

High-level flow:

Run MediaMTX locally.
Point OBS to the local RTMP endpoint.
Load the HLS URL in podcommentators as a stream URL.

This is often the best local setup for full video + audio capture without relying on virtual camera behavior.
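
As a sketch of how the two endpoints relate, the helper below derives both URLs from one stream path. It assumes MediaMTX is running with its default ports (1935 for RTMP, 8888 for HLS); mediaMtxUrls and the path name are hypothetical, so adjust them to match your own configuration.

```typescript
interface LocalStreamUrls {
  rtmpPublishUrl: string; // where OBS publishes
  hlsUrl: string;         // what to paste into the stream URL field
}

// Assumes MediaMTX default ports: RTMP on 1935, HLS on 8888.
function mediaMtxUrls(path: string, host = "localhost"): LocalStreamUrls {
  return {
    rtmpPublishUrl: `rtmp://${host}:1935/${path}`,
    hlsUrl: `http://${host}:8888/${path}/index.m3u8`,
  };
}
```

For example, mediaMtxUrls("live") pairs rtmp://localhost:1935/live for OBS with http://localhost:8888/live/index.m3u8 for the app.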

YouTube Live

The settings screen includes fields for:

YouTube ingest URL
YouTube stream key

This currently helps you store your destination details and jump into YouTube Live Control Room, but podcommentators does not yet push the stream to YouTube directly.

To support direct YouTube publishing, the project would need one of these:

an OBS-based outbound workflow
a local FFmpeg relay
a backend or desktop helper that can encode and publish RTMP/RTMPS
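
To make the FFmpeg relay option concrete, here is a sketch of the arguments such a relay might be launched with. Nothing like this exists in the project today. buildYouTubeRelayArgs is a hypothetical helper, and the stream-copy flags assume the input already carries codecs YouTube accepts (typically H.264 video and AAC audio); otherwise the relay would need to re-encode.

```typescript
// Builds the argument list for an `ffmpeg` process that reads a local
// stream and republishes it to an RTMP ingest endpoint without re-encoding.
function buildYouTubeRelayArgs(inputUrl: string, ingestUrl: string, streamKey: string): string[] {
  // Join ingest URL and stream key with exactly one slash.
  const target = `${ingestUrl.replace(/\/+$/, "")}/${streamKey}`;
  return ["-i", inputUrl, "-c", "copy", "-f", "flv", target];
}
```

The ingest URL and stream key are the two values the settings screen already stores.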

Scripts

npm run dev
npm run build
npm run start
npm run lint

Tech Stack

Next.js 16
React 19
TypeScript
@google/genai
browser media APIs:
- getUserMedia
- getDisplayMedia
- MediaRecorder
- Web Speech API
- AudioContext

Project Structure

src/
  app/
    layout.tsx
    page.tsx
    globals.css
    page.module.css
  components/
    AudioSourcePanel.tsx
    CommentatorRail.tsx
    OBSSetupGuide.tsx
    SettingsModal.tsx
    TranscriptPanel.tsx
    VideoDisplay.tsx
    WaveformCanvas.tsx
  context/
    SettingsContext.tsx
  hooks/
    usePersonaOrchestrator.ts
    useTranscript.ts
  lib/
    gemini.ts
    elevenlabs.ts
    personas.ts
    settings.ts
  prompts/
    ...
  types/
    index.ts

Development Notes

State Persistence

app settings are stored in localStorage
commentators are part of persisted settings
settings updates are distributed through useSyncExternalStore
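
A minimal sketch of the store contract that useSyncExternalStore expects: a subscribe function that returns an unsubscribe callback, and a getSnapshot function that returns the same reference until the data changes. createStore is a hypothetical helper, not the code in SettingsContext.tsx.

```typescript
type Listener = () => void;

interface Store<T> {
  subscribe(listener: Listener): () => void;
  getSnapshot(): T;
  set(next: T): void;
}

function createStore<T>(initial: T): Store<T> {
  let snapshot = initial;
  const listeners = new Set<Listener>();
  return {
    subscribe(listener) {
      listeners.add(listener);
      // The returned function removes the listener again.
      return () => {
        listeners.delete(listener);
      };
    },
    // Returns the cached value, so the reference is stable between updates.
    getSnapshot: () => snapshot,
    set(next) {
      snapshot = next;
      listeners.forEach((listener) => listener());
    },
  };
}

// In a component (not executed in this sketch):
// const settings = useSyncExternalStore(store.subscribe, store.getSnapshot);
```

A real settings store would also write to localStorage inside set and read the initial value from it.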

Prompt System

Default commentator prompts are stored in:

src/prompts/*/system.md
src/prompts/*/relevance.md

Those defaults are copied into local settings and can then be edited in the UI.

Browser-Centric Architecture

This project currently assumes:

local development
a trusted single user
no backend secret management

If you want to deploy it publicly, the first things to add are:

a server-side API layer
secret storage
auth
rate limiting
media publishing infrastructure

Troubleshooting

No video appears

confirm the selected source is camera, screen, or a video stream URL
confirm the browser permission prompt was approved
for screen share, confirm you actually selected a screen/window/tab

Screen share has no system audio

enable the browser checkbox for tab/system audio during sharing
prefer Chrome or Edge for the best support

Transcript does not start

confirm you added a Gemini or ElevenLabs key for screen/stream workflows; the Web Speech API fallback only covers mic input
confirm your browser has mic permission if the selected source needs mic input
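
The checks above follow from how a transcriber gets chosen. The sketch below is one plausible selection order, not the app's confirmed logic: it assumes ElevenLabs is preferred when both keys are present and that the Web Speech API fallback applies only to the mic source. pickTranscriber and its type names are hypothetical.

```typescript
type Provider = "elevenlabs" | "gemini" | "web-speech" | "none";
type Source = "mic" | "camera" | "screen" | "stream";

interface TranscriptionKeys {
  gemini?: string;
  elevenLabs?: string;
}

function pickTranscriber(source: Source, keys: TranscriptionKeys): Provider {
  // Assumed priority: ElevenLabs first, then Gemini.
  if (keys.elevenLabs) return "elevenlabs";
  if (keys.gemini) return "gemini";
  // Without a key, only mic input can fall back to the Web Speech API.
  return source === "mic" ? "web-speech" : "none";
}
```

A result of "none" corresponds to the case where the transcript never starts.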

Commentators do not show

switch the top mode toggle to Enhanced
confirm at least one commentator is enabled in Settings
confirm you have a Gemini API key configured
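
The three checks can be expressed as a small diagnostic function. This is an illustrative sketch with hypothetical names (commentatorBlockers and its parameters), not part of the app.

```typescript
// Returns the reasons the commentator overlay would stay empty;
// an empty array means none of the three known blockers applies.
function commentatorBlockers(
  mode: "regular" | "enhanced",
  commentators: { enabled: boolean }[],
  geminiApiKey: string,
): string[] {
  const reasons: string[] = [];
  if (mode !== "enhanced") reasons.push("mode toggle is not set to Enhanced");
  if (!commentators.some((c) => c.enabled)) reasons.push("no commentator is enabled");
  if (geminiApiKey.trim() === "") reasons.push("no Gemini API key configured");
  return reasons;
}
```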

Stream URL fails to load

check that the URL is reachable from your browser
verify the file extension or stream format is supported
if using local HLS, confirm your local media server is actually running
