Skip to content

jmviz/substrata

Repository files navigation

substrata

A browser extension that displays multiple simultaneous subtitles on YouTube videos. It combines YouTube's built-in captions with AI-powered transcription (Whisper) and AI-powered translation (LLM chat completions), so you can watch a video with subtitles in two languages at once.

How it works

The extension adds an "S" button to the YouTube player controls. From there you can pick a primary subtitle track (from YouTube's own captions or AI transcription) and a secondary translation track (generated via an LLM). Subtitles are displayed as a draggable overlay on the video.

AI features require API keys and a native messaging host (a small Python program) that handles audio download, compression, and API calls. The native host uses yt-dlp for audio download and ffmpeg for compression.

Installation

Prerequisites

  • Firefox 128+ or Chrome
  • Python 3.10+ with uv (curl -LsSf https://astral.sh/uv/install.sh | sh)
  • yt-dlp and ffmpeg — e.g. brew install yt-dlp ffmpeg on macOS

1. Install the extension

Download the latest release from GitHub Releases:

Firefox:

  • Download the .xpi file
  • Open the file in Firefox (or drag it onto the Firefox window)
  • Click "Add" when prompted

Chrome:

  • Download the -chrome.zip file and unzip it
  • Go to chrome://extensions, enable "Developer mode" (top right)
  • Click "Load unpacked" and select the unzipped folder
  • Note the extension ID shown on the card — you'll need it for the native host
Build from source (unsigned)

Since you'll be cloning the repo in the next step anyway, you can build and load the extension yourself instead of downloading a release. You'll need Node.js 18+ and npm.

git clone https://github.com/jmviz/substrata.git
cd substrata
npm install
npm run build           # Firefox
npm run build:chrome    # Chrome

The built extension lands in .output/firefox-mv3/ or .output/chrome-mv3/.

Firefox — load it as a temporary add-on (lasts until browser restart):

  • Go to about:debugging → "This Firefox" → "Load Temporary Add-on..."
  • Select .output/firefox-mv3/manifest.json

For a persistent unsigned install, use Firefox Developer Edition or Nightly, which allow disabling signature enforcement: go to about:config and set xpinstall.signatures.required to false, then drag the .xpi produced by npm run zip onto the Firefox window.

Chrome — load it as an unpacked extension:

  • Go to chrome://extensions, enable "Developer mode" (top right)
  • Click "Load unpacked" and select .output/chrome-mv3/
  • Note the extension ID shown on the card — you'll need it for the native host

2. Install the native messaging host

The native host is a small Python program that handles audio download and API calls for AI transcription and translation. It's required for AI features (YouTube's built-in captions work without it).

git clone https://github.com/jmviz/substrata.git
cd substrata/native
./install.sh              # Firefox (default)
./install.sh chrome ID    # Chrome — pass your extension ID from step 1
./install.sh all ID       # Both browsers

Restart your browser after installing.

What does install.sh do?
  1. Creates a Python virtual environment in native/.venv/ using uv sync
  2. Symlinks your existing yt-dlp, ffmpeg, and ffprobe binaries into the venv (browsers launch native hosts with a restricted PATH, so these need to be discoverable)
  3. Writes a JSON manifest to your browser's native messaging hosts directory (e.g for Firefox on macOS: ~/Library/Application Support/Mozilla/NativeMessagingHosts/) — this is how the browser knows where the host binary is and which extensions can use it

You can read the full script at native/install.sh.

3. Configure

Open any YouTube video and click the S button in the player controls. In the settings panel, enter your API keys and endpoints for the models you want to use. For AI transcription, you can use any API provider compatible with OpenAI Whisper API. For AI translation, you can use any provider compatible with OpenAI Chat Completions API. The extension will use the native host to make API calls on your behalf. Local models can be used as well. Just spin up a local model server (e.g. vLLM) that implements the expected API and point the extension to the local server address.

Development

Requirements

  • Node.js 18+
  • npm
  • Python 3.10+ with uv
  • yt-dlp, ffmpeg/ffprobe

Setup

npm install
cd native && ./install.sh && cd ..

Dev server (hot reload)

npm run dev           # Firefox
npm run dev:chrome    # Chrome

This launches WXT in dev mode. The extension auto-reloads on file changes.

Build

npm run build           # Firefox
npm run build:chrome    # Chrome

Output goes to .output/. To produce a distributable zip:

npm run zip             # Firefox
npm run zip:chrome      # Chrome

Tests

npm test              # Run once
npm run test:watch    # Watch mode

Tests use vitest with jsdom. Test files live in __tests__/ directories alongside the code they test.

Updating AGENTS.md

After making significant changes, run the update-agents-md skill to keep AGENTS.md in sync with the codebase. In VS Code Copilot or any compatible agent, type /update-agents-md.

Project layout

entrypoints/
  background.js             Background service worker (native messaging bridge)
  bridge.content.js         MAIN world content script (YouTube player API access)
  youtube.content/          ISOLATED world content script (UI, logic, API calls)
utils/                      Shared modules (state, constants, parsers, etc.)
native/                     Python native messaging host (yt-dlp, Whisper, LLM translation)

About

advanced multilingual subtitles for youtube

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors