🌐 Try the free web app → · drop a .srt or .vtt, fix the sync, download. Nothing uploaded.
You exported captions and the player won't take them (it wants WebVTT, you have
SRT). Or the subtitles play 2 seconds late. Or they slowly drift out of
sync by the end of the video. Fixing this by hand means editing dozens of
00:01:23,456 timestamps without breaking the format — and one stray comma or
missing millisecond makes the whole file invalid.
captionkit does it correctly: convert SRT ⇄ WebVTT, shift every cue, scale/resync to fix drift, and fix overlaps, with exact millisecond timecode math and zero dependencies, running entirely on your machine.
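To make "exact millisecond timecode math" concrete, here is a minimal sketch of turning a timecode into integer milliseconds. `timecodeToMs` is a hypothetical helper written for this illustration, not captionkit's actual implementation, and it assumes the full `HH:MM:SS` form (WebVTT also permits omitting the hours, which this sketch does not handle).

```typescript
// Hypothetical helper for illustration; not captionkit's actual code.
// Accepts "HH:MM:SS,mmm" (SRT style) or "HH:MM:SS.mmm" (WebVTT style)
// and returns the time in whole milliseconds.
function timecodeToMs(timecode: string): number {
  const match = /^(\d{2,}):([0-5]\d):([0-5]\d)[,.](\d{3})$/.exec(timecode.trim());
  if (match === null) {
    // A stray character or a missing millisecond digit is rejected outright.
    throw new Error(`Invalid timecode: ${timecode}`);
  }
  const [, hours, minutes, seconds, millis] = match;
  // Integer arithmetic only, so no floating-point rounding is involved.
  return (
    Number(hours) * 3600000 +
    Number(minutes) * 60000 +
    Number(seconds) * 1000 +
    Number(millis)
  );
}
```

With this sketch, `timecodeToMs("00:01:23,456")` evaluates to `83456`, and a malformed value such as `"00:01:23,45"` throws instead of being silently accepted.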
📸 Screenshot / demo GIF: `./web/screenshot.png` (record the live app dropping an out-of-sync SRT and nudging it into place).
- AI can't reliably do this. Re-timing every cue and emitting spec-valid SRT/WebVTT (comma vs dot, `WEBVTT` header, numbering rules, CRLF) is exact, fiddly format work, and a chatbot will quietly corrupt a timestamp. It's a job for a small, tested, deterministic tool.
- Privacy & friction. Online subtitle converters make you upload your file and wait. captionkit runs on your machine, instantly.
- The drift fix is the "aha". `resync` linearly maps the first and last cue to where they should be, repairing subtitles that slip out of sync as the video goes on. Most tools only do a flat shift.
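The format details named in the first bullet (comma vs dot, the `WEBVTT` header, cue numbering) can be sketched as two small serializers. The `Cue` shape and the names `formatTimecode`, `cuesToSrt`, and `cuesToVtt` are assumptions made for this example, not captionkit's internals, and the sketch uses `\n` line endings for brevity.

```typescript
// Hypothetical types and helpers for illustration; not captionkit's actual code.
interface Cue {
  start: number; // milliseconds
  end: number; // milliseconds
  text: string; // may contain "\n" for multi-line cues
}

// Format milliseconds as HH:MM:SS<sep>mmm. SRT uses "," and WebVTT uses ".".
function formatTimecode(ms: number, separator: "," | "."): string {
  const total = Math.max(0, Math.round(ms));
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}${separator}${pad(total % 1000, 3)}`;
}

// SRT: each cue is numbered from 1 and timecodes use a comma.
function cuesToSrt(cues: Cue[]): string {
  return cues
    .map(
      (cue, i) =>
        `${i + 1}\n${formatTimecode(cue.start, ",")} --> ${formatTimecode(cue.end, ",")}\n${cue.text}\n`
    )
    .join("\n");
}

// WebVTT: the file starts with a "WEBVTT" header and timecodes use a dot.
function cuesToVtt(cues: Cue[]): string {
  const body = cues
    .map(
      (cue) =>
        `${formatTimecode(cue.start, ".")} --> ${formatTimecode(cue.end, ".")}\n${cue.text}\n`
    )
    .join("\n");
  return `WEBVTT\n\n${body}`;
}
```

The only differences between the two outputs here are the header, the cue numbers, and one punctuation character per timestamp, which is exactly the kind of detail that is easy to get wrong by hand.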
Video creators & YouTubers, marketers (captioned social video), educators, accessibility teams, translators, and developers building media tooling who want a tiny subtitle library.
No install — just open the web app.
For the library:
npm install captionkit

Zero dependencies. ESM + CJS + TypeScript types. Runs in the browser, Node, Deno and Bun.
import { parse, convert, shift, resync, toVTT } from "captionkit";
// Convert SRT → WebVTT (auto-detects the input)
convert(srtText, "vtt");
// Parse, then re-time
const cues = parse(srtText);
shift(cues, 2500); // everything 2.5s later
shift(cues, -1000); // …or earlier (clamped at 0)
// Fix drift: map the first cue to 1.0s and the last to 10:00.0
resync(cues, { firstStart: 1000, lastStart: 600000 });
toVTT(cues); // serialize back out

import { scale, fixOverlaps, totalDuration } from "captionkit";
scale(cues, 25 / 23.976); // framerate conversion
fixOverlaps(cues, 40); // no two cues overlap (40ms min gap)
totalDuration(cues); // total on-screen time (ms)

| Function | Description |
|---|---|
| `parse(text)` | Parse SRT or WebVTT → `Cue[]` (`{ index, start, end, text }`, ms). |
| `toSRT(cues)` / `toVTT(cues)` / `convert(text, to)` | Serialize / convert. |
| `detectFormat(text)` | `"srt"` or `"vtt"`. |
| `shift(cues, ms)` | Offset all cues. |
| `scale(cues, factor)` | Multiply all timestamps. |
| `resync(cues, { firstStart, lastStart })` | Linear drift correction. |
| `fixOverlaps(cues, minGap?)` | Remove overlaps. |
| `renumber` / `totalDuration` | Helpers. |
| `parseTimecode` / `formatSRT` / `formatVTT` | Timecode ⇄ ms. |
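As an illustration of what "remove overlaps" can mean, the sketch below shows one plausible strategy: trim the end of each cue so the next one starts at least `minGap` milliseconds later. This is an assumption about the approach, not a description of how captionkit's `fixOverlaps` is implemented; `fixOverlapsSketch` and the simplified `Cue` shape are hypothetical.

```typescript
// Hypothetical sketch; captionkit's real fixOverlaps may use a different strategy.
interface Cue {
  start: number; // milliseconds
  end: number; // milliseconds
  text: string;
}

// Returns new cues sorted by start time. Each cue's end is trimmed so that
// the following cue begins at least `minGap` ms after it. Input is not mutated.
function fixOverlapsSketch(cues: Cue[], minGap = 0): Cue[] {
  const sorted = cues.map((cue) => ({ ...cue })).sort((a, b) => a.start - b.start);
  sorted.forEach((cue, i) => {
    const next = sorted[i + 1];
    if (next === undefined) return; // the last cue has nothing to collide with
    const latestEnd = next.start - minGap;
    if (cue.end > latestEnd) {
      // Never let a cue end before it starts.
      cue.end = Math.max(cue.start, latestEnd);
    }
  });
  return sorted;
}
```

For example, with cues spanning 0–2000 ms and 1500–3000 ms and a 40 ms gap, the first cue's end becomes 1460 ms and the second cue is left unchanged.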
Is my subtitle file uploaded anywhere? No. Everything runs on your device — no server, no telemetry, works offline.
What formats are supported? SRT and WebVTT today. ASS/SSA and SBV are on the roadmap — open an issue.
What's the difference between shift, scale and resync? Shift adds a fixed offset (captions are uniformly early/late). Scale multiplies timestamps (framerate mismatch). Resync pins the first and last cue to target times and interpolates the rest — the fix for gradual drift.
Will it keep my line breaks and styling? Multi-line cue text is preserved. Inline WebVTT tags pass through as text; full ASS styling is not converted (yet).
Can I use it in a build or batch script? Yes — the library works in Node; map the functions over your files.
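The difference between shift, scale and resync described in the FAQ above comes down to which function maps an old timestamp to a new one. The sketch below spells that out under stated assumptions: `retime`, `shiftSketch`, `scaleSketch`, and `resyncSketch` are hypothetical names, the cues are assumed to be sorted by start time, and whether the real library mutates its input or returns new cues is not specified here, so this version returns copies.

```typescript
// Hypothetical sketch of the three re-timing operations; not captionkit's actual code.
interface Cue {
  start: number; // milliseconds
  end: number; // milliseconds
  text: string;
}

// Apply one timestamp mapping to every cue, rounding to whole ms and clamping at 0.
function retime(cues: Cue[], map: (ms: number) => number): Cue[] {
  const apply = (ms: number): number => Math.max(0, Math.round(map(ms)));
  return cues.map((cue) => ({ ...cue, start: apply(cue.start), end: apply(cue.end) }));
}

// Shift: t -> t + offset (captions uniformly early or late).
const shiftSketch = (cues: Cue[], offsetMs: number): Cue[] =>
  retime(cues, (t) => t + offsetMs);

// Scale: t -> t * factor (framerate mismatch).
const scaleSketch = (cues: Cue[], factor: number): Cue[] =>
  retime(cues, (t) => t * factor);

// Resync: the line through (first.start -> firstStart) and (last.start -> lastStart).
function resyncSketch(
  cues: Cue[],
  target: { firstStart: number; lastStart: number }
): Cue[] {
  const first = cues[0];
  const last = cues[cues.length - 1];
  if (first === undefined || last === undefined || last.start === first.start) {
    return retime(cues, (t) => t); // nothing to interpolate between
  }
  const factor = (target.lastStart - target.firstStart) / (last.start - first.start);
  return retime(cues, (t) => target.firstStart + (t - first.start) * factor);
}
```

In this sketch, resync is a scale and a shift combined: the factor stretches or compresses the timeline, and the offset pins the first cue to its target. A flat shift alone cannot repair drift because it leaves the spacing between cues unchanged.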
Contributions welcome! See CONTRIBUTING.md and the Code of Conduct.
git clone https://github.com/didrod205/captionkit.git
cd captionkit
npm install
npm test # run the suite
npm run dev # run the web app locally

captionkit is free, MIT-licensed, and built in spare time. If it saved you from hand-editing timestamps, please consider supporting it:
- ⭐ Star this repo — free, and it genuinely helps others find it.
- 🍋 Sponsor via Lemon Squeezy — one-time or recurring support.
Where your support goes: more formats (ASS/SSA, SBV, SAMI), split/merge, characters-per-second warnings, an auto-shift-by-waveform helper, a CLI, keeping the free web app online, and fast issue responses.
MIT © captionkit contributors