Chat with your video library. Upload videos to mixedbread, then ask questions and get answers grounded in video transcriptions with inline citations, timestamps, and YouTube links.
Built with Next.js, Vercel AI SDK, Google Gemini, and the mixedbread SDK.
- Bun (or Node.js)
- A mixedbread API key
- A Google AI API key (for Gemini)
bun install
cp .env.example .env # fill in your keys
bun run dev

The app runs at http://localhost:3000.
| Variable | Description |
|---|---|
| MXBAI_API_KEY | mixedbread API key |
| GOOGLE_GENERATIVE_AI_API_KEY | Google Gemini API key |
Videos are uploaded into a mixedbread store called videos. mixedbread automatically transcribes, chunks, and embeds the video content — no preprocessing required.
Supported video formats: MP4, WebM, MOV, AVI, OGV.
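A simple pre-upload guard can reject unsupported files before they hit the API. This is an illustrative sketch (not part of the app's actual code); the extension list mirrors the supported formats above:

```typescript
// Extensions accepted for upload, per the supported-formats list above.
const SUPPORTED_EXTENSIONS = new Set([".mp4", ".webm", ".mov", ".avi", ".ogv"]);

function isSupportedVideo(filename: string): boolean {
  const dot = filename.lastIndexOf(".");
  if (dot === -1) return false; // no extension at all
  return SUPPORTED_EXTENSIONS.has(filename.slice(dot).toLowerCase());
}
```

The lowercase comparison means `lecture.MP4` passes as well as `lecture.mp4`.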
If the store doesn't exist yet:
import { Mixedbread } from "@mixedbread/sdk";
const mxbai = new Mixedbread({ apiKey: process.env.MXBAI_API_KEY });
await mxbai.stores.create({ name: "videos" });

import * as fs from "node:fs";
import { Mixedbread } from "@mixedbread/sdk";
const mxbai = new Mixedbread({ apiKey: process.env.MXBAI_API_KEY });
const file = await mxbai.stores.files.upload({
storeIdentifier: "videos",
file: fs.createReadStream("Lec 01. Introduction to Deep Learning [6FkRvTtUc-o].mp4"),
});
console.log(file.id); // file is now processing

Files go through pending → in_progress → completed. For large files (>100 MB), multipart upload kicks in automatically. You can customize the behavior:
await mxbai.stores.files.upload({
storeIdentifier: "videos",
file: fs.createReadStream("./long-lecture.mp4"),
multipartUpload: {
threshold: 50 * 1024 * 1024, // trigger at 50 MB instead of 100 MB
concurrency: 10, // parallel upload streams (default: 5)
onPartUpload: ({ partNumber, totalParts, uploadedBytes, totalBytes }) => {
console.log(`Part ${partNumber}/${totalParts} — ${Math.round((uploadedBytes / totalBytes) * 100)}%`);
},
},
});

You can also upload files through the mixedbread dashboard.
Name your video files like this:
Title [YOUTUBE_ID].mp4
For example: Lec 01. Introduction to Deep Learning [6FkRvTtUc-o].mp4
The YouTube ID in brackets is used to generate thumbnail URLs and link citations back to the original YouTube video at the correct timestamp.
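The parsing and link-building can be sketched as a few small helpers. The function names here are illustrative, not the app's actual code; the URL patterns are standard YouTube conventions:

```typescript
// Pull the 11-character YouTube ID out of a filename like "Title [YOUTUBE_ID].mp4".
function extractYouTubeId(filename: string): string | null {
  const match = filename.match(/\[([A-Za-z0-9_-]{11})\]/);
  return match ? match[1] : null;
}

// Standard YouTube thumbnail URL for a video ID.
function thumbnailUrl(id: string): string {
  return `https://img.youtube.com/vi/${id}/hqdefault.jpg`;
}

// Link to the original video at a given start time (whole seconds).
function timestampedLink(id: string, startSeconds: number): string {
  return `https://www.youtube.com/watch?v=${id}&t=${Math.floor(startSeconds)}s`;
}
```

For the example filename above, `extractYouTubeId` returns `6FkRvTtUc-o`, which yields a thumbnail at `img.youtube.com/vi/6FkRvTtUc-o/hqdefault.jpg`.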
When mixedbread processes a video, it automatically generates metadata on each chunk including:
- start_time_seconds / end_time_seconds — timestamp range of the chunk
- total_duration_seconds — total length of the video
- transcription — the transcribed text for that chunk
This metadata powers the timestamped citations in the chat UI.
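Rendering a citation label from this metadata is a matter of formatting `start_time_seconds`. A minimal sketch (the helper name is hypothetical):

```typescript
// Format a chunk's start_time_seconds as the "m:ss" / "h:mm:ss" label
// shown next to a citation.
function formatTimestamp(seconds: number): string {
  const s = Math.floor(seconds);
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  const sec = s % 60;
  const pad = (n: number) => String(n).padStart(2, "0");
  return h > 0 ? `${h}:${pad(m)}:${pad(sec)}` : `${m}:${pad(sec)}`;
}
```

For example, a chunk starting at 3,671 seconds renders as `1:01:11`, and one at 65 seconds as `1:05`.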
Each video has a lecture metadata key. The app exposes a GET /api/lectures endpoint that uses mixedbread's metadata facets to list all available lectures with their chunk counts:
const response = await mxbai.stores.metadataFacets({
store_identifiers: ["videos"],
facets: ["lecture"],
});
// response.facets.lecture → { "Introduction to Deep Learning": 42, "Backpropagation": 38, ... }

Search happens automatically when you chat. Every user message triggers a searchKnowledge tool call that:
- Calls mxbai.stores.search() on the videos store with top_k: 5
- Expands each result by ±1 neighboring chunks for more context
- Merges overlapping regions within the same file
- Returns transcriptions with timestamps and relevance scores
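The expand-and-merge steps above can be sketched as a pure function over chunk indices. The `ChunkRange` shape here is an assumption for illustration, not the app's actual types:

```typescript
interface ChunkRange {
  fileId: string;
  start: number; // first chunk index in the range
  end: number;   // last chunk index in the range (inclusive)
}

// Expand each search hit by ±neighbors chunks, then merge overlapping
// or adjacent ranges within the same file.
function expandAndMerge(hits: ChunkRange[], neighbors = 1): ChunkRange[] {
  const expanded = hits.map((h) => ({
    ...h,
    start: Math.max(0, h.start - neighbors),
    end: h.end + neighbors,
  }));
  // Sort by file, then by start index, so overlaps are adjacent in the array.
  expanded.sort((a, b) =>
    a.fileId === b.fileId ? a.start - b.start : a.fileId.localeCompare(b.fileId),
  );
  const merged: ChunkRange[] = [];
  for (const r of expanded) {
    const last = merged[merged.length - 1];
    if (last && last.fileId === r.fileId && r.start <= last.end + 1) {
      last.end = Math.max(last.end, r.end); // overlapping/adjacent: extend
    } else {
      merged.push({ ...r });
    }
  }
  return merged;
}
```

Two hits at chunks 3 and 5 of the same file expand to [2, 4] and [4, 6], which merge into a single [2, 6] range, so the model sees one continuous transcript span instead of two fragments with duplicated context.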
The AI synthesizes an answer with inline citations (e.g. [1], [2]) that link back to the source videos. Hover a citation to see the video thumbnail, timestamp, and transcription snippet.
const results = await mxbai.stores.search({
query: "How does backpropagation work?",
store_identifiers: ["videos"],
top_k: 5,
});
for (const chunk of results.data) {
console.log(chunk.score, chunk.transcription);
}

Search supports additional options like filters for metadata filtering, rerank for second-stage ranking, and rewrite_query for AI-powered query expansion. See the search docs for details.