danielsemerjya/tabscribe


TabScribe

Free and open-source. Record any browser tab — get a clean, speaker-labeled transcript. The only thing you need is a Gemini API key from Google AI Studio (free tier works).

No subscription. No account. No backend. No telemetry. Just a Chrome extension and your own key.


Why TabScribe

Other meeting transcribers run on remote servers, charge monthly, or require a native desktop app. TabScribe is none of those things.

  • 100% free, MIT-licensed. No paid tier, no trial limits, no "premium features". The whole codebase is here.
  • Free Gemini tier is usually enough. A free Google AI Studio key handles typical personal use. Heavy users pay Google directly — pennies per hour on gemini-3-flash-preview.
  • Local-first. Audio and transcripts live in your browser. The only outbound request is to Gemini, with your own key.
  • No install ceremony. It's a single unpacked Chrome extension. No native binary, no build step, no sign-up.
  • Readable code. Vanilla JS, no bundler, ~1000 lines total. Fork it, change it, audit it.

What you'll need

Two things, both free:

  1. Google Chrome (or any Chromium-based browser that supports Manifest V3 extensions)
  2. A free Gemini API key from Google AI Studio — sign in with Google, click "Create API key", copy

What it does

  • Captures audio from the active tab — Meet, Zoom (web), YouTube, podcasts, anything that plays through Chrome
  • Mixes in your microphone so your own voice is in the transcript too
  • Sends the recording to Gemini for transcription with auto-detected language
  • Returns clean markdown with Participants, Transcript, and Summary sections
  • Lets you supply a name list and a global dictionary so proper nouns are spelled correctly
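The capture-and-mix step can be sketched roughly as follows. This is an illustration, not the extension's actual source: `mixStreams` and `startRecording` are hypothetical names, and the sketch assumes a tab `streamId` has already been obtained elsewhere in the extension. The audio context is passed in so the mixing logic can be exercised with a stub outside the browser.

```javascript
// Hypothetical helper: merge tab audio and microphone audio into one stream.
// `audioCtx` is expected to behave like a Web Audio AudioContext.
function mixStreams(audioCtx, tabStream, micStream) {
  const destination = audioCtx.createMediaStreamDestination();

  const tabSource = audioCtx.createMediaStreamSource(tabStream);
  tabSource.connect(destination);
  // Chrome stops playing a captured tab's audio to the user by default,
  // so the tab source is also routed back to the speakers.
  tabSource.connect(audioCtx.destination);

  // The microphone goes into the recording only, not to the speakers,
  // which avoids echoing your own voice.
  const micSource = audioCtx.createMediaStreamSource(micStream);
  micSource.connect(destination);

  return destination.stream;
}

// Browser-only usage sketch (assumed wiring, defined here but not called).
async function startRecording(streamId) {
  const tabStream = await navigator.mediaDevices.getUserMedia({
    audio: { mandatory: { chromeMediaSource: 'tab', chromeMediaSourceId: streamId } },
  });
  const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const mixed = mixStreams(new AudioContext(), tabStream, micStream);

  const chunks = [];
  const recorder = new MediaRecorder(mixed, { mimeType: 'audio/webm;codecs=opus' });
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) chunks.push(event.data);
  };
  recorder.start(1000); // emit a chunk roughly every second
  return { recorder, chunks };
}
```

Routing the tab source to both the recorder and the speakers is what lets you keep hearing the tab while it is being captured.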

Install (Developer Mode)

This extension isn't on the Chrome Web Store. Install it manually in 30 seconds.

  1. Clone or download this repo:
    git clone https://github.com/<your-user>/tabscribe.git
  2. Open chrome://extensions in Chrome
  3. Toggle Developer mode in the top-right corner
  4. Click Load unpacked → select the tabscribe/ folder
  5. Pin the TabScribe icon to your toolbar for quick access

Updates: run git pull in the tabscribe/ folder, then click the refresh icon on the extension card in chrome://extensions.

First-time setup

When you first open the popup you'll see a setup checklist. Two steps:

1. Add your Gemini API key

2. Grant microphone access

  • Right-click the TabScribe icon → Options (opens a real tab — extension permission prompts only work from visible tabs)
  • Click Grant Microphone Access → choose Allow on every visit
  • Click Test Microphone to verify it's actually capturing audio

When both checkmarks turn green, the setup banner disappears and the Record button activates.

Usage

Recording. Open any tab, click TabScribe → Record tab. Chrome shows a red dot on the tab to indicate capture. You'll still hear the tab audio normally. Click Stop recording when done — transcription starts automatically and lands in the list when ready.

Reading. Click any recording to see the transcript. Copy Markdown drops the full formatted transcript onto your clipboard — paste into Notion, Obsidian, Linear, anywhere.

Continuing. If a recording was interrupted (closed tab, network drop, you forgot to hit record at the start), open it and click Resume recording. The new audio is appended to the existing recording, and the transcript is regenerated from the combined file.
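A minimal sketch of the append step, assuming Resume joins the stored audio and the new MediaRecorder chunks byte-for-byte into one WebM blob. `appendRecording` is a hypothetical name used for illustration only.

```javascript
// Hypothetical helper: join a stored recording with newly recorded chunks.
// Blob parts are concatenated in order, so the old audio comes first.
function appendRecording(existingBlob, newChunks, mimeType = 'audio/webm') {
  return new Blob([existingBlob, ...newChunks], { type: mimeType });
}

// Example with placeholder bytes standing in for real audio data.
const stored = new Blob([new Uint8Array(3)], { type: 'audio/webm' });
const resumedChunks = [new Blob([new Uint8Array(2)]), new Blob([new Uint8Array(4)])];
const combined = appendRecording(stored, resumedChunks);
console.log(combined.size, combined.type);
```

The combined blob would then replace the stored one and be sent for transcription again.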

Getting names right. Two layers help here:

  • Per-recording Participants (in the recording detail view): one name per line. Gemini tries to attribute each spoken line to a real name based on context.
  • Global Dictionary (Settings → Dictionary): proper nouns, acronyms, your last name. Sent with every transcription so Gemini spells them right.
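One way the two layers could be folded into the transcription prompt is sketched below. The prompt wording and the `buildPrompt` helper are assumptions made for illustration. The extension's real prompt may differ.

```javascript
// Hypothetical helper: build prompt text from per-recording participants
// (one name per line) and the global dictionary (list of terms).
function buildPrompt({ participants = '', dictionary = [] } = {}) {
  const names = participants.split('\n').map((s) => s.trim()).filter(Boolean);
  const terms = dictionary.map((s) => s.trim()).filter(Boolean);

  const lines = [
    'Transcribe the attached audio. Detect the spoken language automatically.',
    'Return markdown with three sections: Participants, Transcript, Summary.',
  ];
  if (names.length > 0) {
    lines.push(`Known participants (use these names when context allows): ${names.join(', ')}`);
  }
  if (terms.length > 0) {
    lines.push(`Spell these terms exactly as written: ${terms.join(', ')}`);
  }
  return lines.join('\n');
}

console.log(buildPrompt({
  participants: 'Ana\nBen',
  dictionary: ['TabScribe', 'IndexedDB'],
}));
```

Blank lines and stray whitespace are dropped, so an empty Participants field adds nothing to the prompt.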

Features

| Feature | Where |
| --- | --- |
| Speaker diarization with real names | Recording detail → Participants |
| Global vocabulary hints | Settings → Dictionary |
| Resume after interruption | Recording detail → Resume recording |
| Re-run transcription with new prompt | Recording detail → Re-transcribe |
| Download raw audio (.webm) | Recording detail → Download Audio |
| Auto-save on tab close | Automatic, no setup needed |
| Bulk cleanup | Settings → Maintenance |

Models

The dropdown is locked to four Gemini 3 models. Pick based on speed/quality tradeoff:

| Model | Best for |
| --- | --- |
| gemini-3-flash-preview | Default. Fast, cheap, great quality for most meetings. |
| gemini-3-1-flash-lite | Even cheaper. Use for long, low-stakes recordings. |
| gemini-3-pro-preview | Higher accuracy. Slower. Long meetings with many overlapping speakers. |
| gemini-3-1-pro-preview | Latest pro variant. Use when others don't satisfy. |

If your key doesn't have access to a model, it appears greyed out in the dropdown.
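The greyed-out state could be derived by comparing the fixed dropdown list against the models your key can list. The sketch below assumes the check uses the Gemini API's model listing endpoint. `markAvailability` and `fetchModelList` are hypothetical names, and pagination of the listing is ignored for brevity.

```javascript
// Model IDs as they appear in the dropdown table above.
const DROPDOWN_MODELS = [
  'gemini-3-flash-preview',
  'gemini-3-1-flash-lite',
  'gemini-3-pro-preview',
  'gemini-3-1-pro-preview',
];

// Hypothetical helper: flag dropdown entries that the key cannot see.
// `listResponse` has the shape { models: [{ name: 'models/<id>' }, ...] }.
function markAvailability(dropdownModels, listResponse) {
  const available = new Set(
    (listResponse.models ?? []).map((m) => m.name.replace(/^models\//, '')),
  );
  return dropdownModels.map((id) => ({ id, disabled: !available.has(id) }));
}

// Network call (assumed wiring, defined here but not invoked).
async function fetchModelList(apiKey) {
  const res = await fetch('https://generativelanguage.googleapis.com/v1beta/models', {
    headers: { 'x-goog-api-key': apiKey },
  });
  if (!res.ok) throw new Error(`Listing models failed with status ${res.status}`);
  return res.json();
}
```

Each entry with `disabled: true` would be rendered as a disabled option in the dropdown.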

Privacy

  • Your audio reaches exactly one external endpoint: generativelanguage.googleapis.com (Gemini API)
  • The request uses your own API key — Google bills you, not us
  • The extension has no other network access. No telemetry, no analytics, no error reporting
  • API key, recordings, and transcripts are stored in your browser's IndexedDB
  • Nothing is synced to your Google account or any cloud

How it works

  • Manifest V3 extension, plain HTML/CSS/JS, no build step
  • chrome.tabCapture.getMediaStreamId for the tab audio stream
  • navigator.mediaDevices.getUserMedia for the microphone
  • Web Audio API mixes both into a single MediaStreamDestination
  • MediaRecorder (WebM/Opus) writes chunks every second
  • Recording runs in an Offscreen Document so it survives popup closes
  • IndexedDB stores audio blobs and metadata
  • Gemini REST API: inline base64 for audio ≤ 18 MB, Files API (resumable upload) for larger files
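The last bullet above can be sketched as a size check plus a request builder for the inline path. This is an illustration under stated assumptions: the 18 MB limit is treated as 18 × 1024 × 1024 bytes, the helper names are hypothetical, and the Files API path for larger recordings is not shown.

```javascript
// Assumption: "18 MB" means 18 MiB. The real cutoff may be defined differently.
const INLINE_LIMIT_BYTES = 18 * 1024 * 1024;

// Decide how the audio should be delivered to Gemini.
function chooseUploadStrategy(byteLength) {
  return byteLength <= INLINE_LIMIT_BYTES ? 'inline' : 'files-api';
}

// Base64-encode bytes in slices so large buffers do not overflow the
// argument limit of String.fromCharCode.
function bytesToBase64(bytes) {
  const SLICE = 0x8000;
  let binary = '';
  for (let i = 0; i < bytes.length; i += SLICE) {
    binary += String.fromCharCode(...bytes.subarray(i, i + SLICE));
  }
  return btoa(binary);
}

// Build (but do not send) a generateContent request with inline audio.
function buildInlineRequest({ model, apiKey, prompt, audioBytes, mimeType = 'audio/webm' }) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'x-goog-api-key': apiKey },
      body: JSON.stringify({
        contents: [{
          parts: [
            { text: prompt },
            { inline_data: { mime_type: mimeType, data: bytesToBase64(audioBytes) } },
          ],
        }],
      }),
    },
  };
}

// Sending it from the extension would look like:
//   const { url, init } = buildInlineRequest({ ... });
//   const response = await fetch(url, init);
```

Keeping the request builder separate from `fetch` makes it easy to inspect exactly what leaves the browser.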

Limits

  • Works only with tabs in this Chrome window. Native Zoom / Teams / Meet apps don't expose a tab to capture.
  • Transcription quality depends on audio quality. Mic gain too high or too low = worse output.
  • Concatenated WebM (from Resume) plays fine in modern apps but may glitch in very old players. Gemini handles it without issue.
  • Chrome may terminate the Service Worker on long idle. Active recording keeps it alive via the offscreen document.

Contributing

The codebase is intentionally small and free of frameworks. Pull requests welcome. Ideas: options page UI improvements, transcript editing, export to other formats, content-script integration with Meet/Zoom DOM to auto-fill participant names.

License

MIT. Use it, fork it, ship your own version — just don't ship it as-is to the Chrome Web Store under the TabScribe name.

About

Record any browser tab + mic, get speaker-labeled markdown transcripts via Gemini. Local-first Chrome extension. BYO API key.
