Semantic search across thousands of hours of running podcast transcripts. Ask a question in plain English — get answers with exact episode sources and timestamps.
Live: runcast-intelligence.vercel.app · Backend: Railway · Auth: Clerk (Google login)
Most podcast knowledge is locked behind episode titles and show notes. RunCast transcribes every episode with Whisper, embeds the content into a vector database, and lets you search across everything at once using natural language.
Ask: "How do elites taper for a marathon?" and get an answer synthesised from 4,000+ episodes, with the exact timestamp so you can jump straight to the source.
┌─────────────────────────────────────────────────────────────────┐
│                      User asks a question                       │
└────────────────────────────┬────────────────────────────────────┘
                             │
                   Embed query (OpenAI)
                             │
                             ▼
                  ┌─────────────────────┐
                  │  Supabase pgvector  │  ← similarity search
                  │   4,151 episodes    │
                  │ chunked + embedded  │
                  └──────────┬──────────┘
                             │  top-k chunks with timestamps
                             ▼
                  ┌─────────────────────┐
                  │  LLM (OpenRouter)   │  ← RAG answer synthesis
                  └──────────┬──────────┘
                             │
                             ▼
           Answer + episode sources + timestamps
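
The query flow in the diagram can be sketched in a few lines of Python. This is a minimal in-memory stand-in, not RunCast's actual code: in the deployed app the similarity search runs inside Supabase pgvector and the embeddings come from OpenAI. The `Chunk` class and the `top_k`, `format_timestamp`, and `build_prompt` helpers are hypothetical names, the 2-dimensional vectors are toy data, and cosine similarity is assumed as the ranking metric.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    episode: str            # episode title (hypothetical field name)
    start_seconds: int      # offset of the chunk within the episode audio
    text: str
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_embedding: list[float], chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    # Rank every chunk by similarity to the query and keep the best k.
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_embedding, c.embedding),
        reverse=True,
    )
    return ranked[:k]


def format_timestamp(seconds: int) -> str:
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_prompt(question: str, hits: list[Chunk]) -> str:
    # Each excerpt carries its episode and timestamp so the LLM can cite it.
    sources = "\n".join(
        f"[{c.episode} @ {format_timestamp(c.start_seconds)}] {c.text}" for c in hits
    )
    return f"Answer using only these excerpts and cite them.\n\n{sources}\n\nQuestion: {question}"


# Toy data: real embeddings come from the embedding model, not hand-written vectors.
chunks = [
    Chunk("Episode A", 754, "Cut volume but keep some intensity during the taper.", [0.9, 0.1]),
    Chunk("Episode B", 3671, "Trail shoes need more grip.", [0.1, 0.9]),
]
hits = top_k([1.0, 0.0], chunks, k=1)
print(build_prompt("How do elites taper for a marathon?", hits))
```

Carrying `start_seconds` alongside each chunk is what lets the final answer point to an exact timestamp rather than just an episode.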
RSS Feeds (9 podcasts, 4,151 episodes)
    │
    ▼
Crawler (scripts/crawl.py)
    │  stores episode metadata
    ▼
Supabase
    │
    ├── Transcription pipeline
    │     ffmpeg (split >25MB audio) → OpenAI Whisper → raw transcript
    │     stored in Supabase with chunk offsets
    │
    └── Embedding pipeline
          text-embedding-3-small → pgvector
          chunk size: 500 tokens, 50-token overlap

FastAPI (src/api/)
    ├── POST /search → embed query → pgvector → LLM → response
    └── GET /health

Next.js frontend
    ├── Public homepage
    └── Search (Clerk auth required)
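
The embedding pipeline's chunking settings (500 tokens with a 50-token overlap) amount to a sliding window. The sketch below illustrates that idea only. It assumes the transcript is already a list of tokens, and `chunk_tokens` is a hypothetical helper rather than the function used in this repo; a real pipeline would count tokens with the embedding model's tokenizer.

```python
def chunk_tokens(tokens: list[str], size: int = 500, overlap: int = 50) -> list[list[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the transcript
    return chunks


# Placeholder tokens standing in for a 1,200-token transcript.
tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [500, 500, 300]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.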
| Podcast | Host |
|---|---|
| The Running Explained Podcast | Elisabeth Scott |
| Ali on the Run Show | Ali Feller |
| The Strength Running Podcast | Jason Fitzgerald |
| The CITIUS MAG Podcast | Chris Chavez |
| The Morning Shakeout Podcast | Mario Fraioli |
| Run to the Top | Runners Connect |
| Some Work, All Play | David & Megan Roche |
| Real Talk Running | — |
| The Planted Runner | — |
| Layer | Technology |
|---|---|
| Backend | Python · FastAPI |
| Database | Supabase (PostgreSQL + pgvector) |
| Transcription | OpenAI Whisper (with ffmpeg chunking for >25MB files) |
| Embeddings | OpenAI text-embedding-3-small |
| LLM | OpenRouter |
| Frontend | Next.js · TypeScript · Tailwind |
| Auth | Clerk (Google login) |
| Backend hosting | Railway |
| Frontend hosting | Vercel |
- Python 3.11+
- Node 20+
- ffmpeg (`brew install ffmpeg`)
- A Supabase project (free tier)
- An OpenAI API key
- An OpenRouter API key
git clone https://github.com/lmenta/runcast-intelligence
cd runcast-intelligence
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env

Fill in `.env`:
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
OPENAI_API_KEY=sk-...
OPENROUTER_API_KEY=sk-or-...

Open the Supabase SQL editor and run both migrations in order:
# Copy the contents of each file and run in Supabase SQL editor
supabase/migrations/001_initial_schema.sql
supabase/migrations/002_add_transcript.sql

This creates the `episodes`, `chunks`, and `podcasts` tables with pgvector enabled.
make setup # seeds 9 podcasts and crawls all RSS feeds
make check-feeds # verify all feeds are reachable

This populates the episodes table with metadata (title, date, audio URL) but no transcripts yet.
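
For a sense of what the crawl step extracts, here is a minimal sketch that pulls title, date, and audio URL out of an RSS feed using only the standard library. It is not the code in `scripts/crawl.py`: `parse_feed`, the dictionary keys, and the sample feed are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up feed for illustration; real feeds carry many more fields.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Running Podcast</title>
    <item>
      <title>Marathon taper basics</title>
      <pubDate>Mon, 06 Jan 2025 08:00:00 +0000</pubDate>
      <enclosure url="https://example.com/audio/ep1.mp3" type="audio/mpeg" length="12345"/>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text: str) -> list[dict]:
    root = ET.fromstring(xml_text)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # no audio attached, nothing to transcribe later
        pub_date = item.findtext("pubDate")
        episodes.append({
            "title": item.findtext("title", default="").strip(),
            # RSS dates use the RFC 822 format, which email.utils can parse.
            "published_at": parsedate_to_datetime(pub_date).isoformat() if pub_date else None,
            "audio_url": enclosure.get("url"),
        })
    return episodes


for episode in parse_feed(SAMPLE_FEED):
    print(episode)
```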
make transcribe # transcribes 3 episodes (~$0.15 in OpenAI credits)

For large audio files (>25MB), the pipeline automatically splits them with ffmpeg before sending to Whisper.
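
One way to plan such a split is to derive a segment length from the file size and duration, then hand it to ffmpeg's segment muxer. The sketch below makes several assumptions: a roughly constant bitrate, a 25 MB limit read as 25 × 1024 × 1024 bytes, and a 10% safety margin. `plan_segments` and `ffmpeg_split_command` are hypothetical helpers, not the functions this repo uses.

```python
import math

MAX_BYTES = 25 * 1024 * 1024  # assumed reading of the 25MB limit mentioned above


def plan_segments(file_bytes: int, duration_seconds: float, safety: float = 0.9) -> tuple[int, int]:
    """Return (number_of_parts, seconds_per_part) so each part should stay under the limit."""
    if file_bytes <= MAX_BYTES:
        return 1, math.ceil(duration_seconds)
    # Aim below the hard limit because real bitrates fluctuate.
    parts = math.ceil(file_bytes / (MAX_BYTES * safety))
    return parts, math.ceil(duration_seconds / parts)


def ffmpeg_split_command(src: str, seconds_per_part: int, pattern: str = "part_%03d.mp3") -> list[str]:
    # "-f segment" with "-segment_time" cuts the input into fixed-length pieces;
    # "-c copy" avoids re-encoding the audio.
    return [
        "ffmpeg", "-i", src,
        "-f", "segment", "-segment_time", str(seconds_per_part),
        "-c", "copy", pattern,
    ]


parts, seconds = plan_segments(file_bytes=60 * 1024 * 1024, duration_seconds=3600)
print(parts, seconds)  # 3 1200
print(" ".join(ffmpeg_split_command("episode.mp3", seconds)))
```

The command is returned as a list so it can be passed to `subprocess.run` without shell quoting issues. Each segment's start offset is worth recording, since transcript timestamps from later parts need to be shifted back onto the full episode's timeline.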
make embed # chunks transcripts and stores embeddings in pgvector

make search
# Query: how do elites taper for a marathon?

make api # FastAPI on http://localhost:8000
make dev # Next.js on http://localhost:3000 (in a second terminal)

- Connect this repo to Railway
- Add all environment variables from `.env`
- Railway picks up `railway.toml` automatically; no extra config is needed
- Connect the `frontend/` directory to Vercel
- Add environment variables:
  - `NEXT_PUBLIC_API_URL`: your Railway backend URL
  - `NEXT_PUBLIC_USE_MOCK=false`
  - Clerk keys (`NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY`, `CLERK_SECRET_KEY`)
For processing large backlogs without keeping a local machine running:
pip install modal
modal secret create runcast \
SUPABASE_URL=... \
SUPABASE_SERVICE_KEY=... \
OPENAI_API_KEY=...
modal deploy src/transcription/modal_worker.py

Deploys a serverless worker that transcribes new episodes on GPU. Cost: ~$0.05/hour of audio.
| Service | Cost |
|---|---|
| Supabase | Free tier |
| Railway | ~$5/month |
| Vercel | Free |
| OpenAI (embeddings) | ~$0.02/episode |
| OpenAI (Whisper) | ~$0.006/minute of audio |
| OpenRouter (search) | ~$0.001/query |
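
To turn the per-unit rates in the table into a budget, a rough estimate can be computed as below. The rates are copied from the table; the 60-minute average episode length is an assumption for illustration, and `ingestion_cost` is a hypothetical helper.

```python
WHISPER_USD_PER_MINUTE = 0.006    # from the table above
EMBEDDING_USD_PER_EPISODE = 0.02  # from the table above


def ingestion_cost(episodes: int, avg_minutes: float) -> float:
    """Estimate the one-off cost in USD of transcribing and embedding a batch of episodes."""
    whisper = episodes * avg_minutes * WHISPER_USD_PER_MINUTE
    embeddings = episodes * EMBEDDING_USD_PER_EPISODE
    return round(whisper + embeddings, 2)


# Assumed average length: 60 minutes per episode.
print(ingestion_cost(episodes=1, avg_minutes=60))    # 0.38
print(ingestion_cost(episodes=100, avg_minutes=60))  # 38.0
```

Under these assumptions transcription dominates the ingestion cost, so actual episode length matters far more than the embedding rate.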