Gemini Video Transcript CLI

This repository provides a command-line workflow that converts batches of local video files to audio (.m4a) and submits the audio to Google’s Gemini 2.5 Flash model for automatic transcription. Use it to bulk-process recordings you already have; no YouTube links, web UI, or manual uploads are required.

Features

- Scans a directory for videos with a specified extension (default mp4).
- Uses ffmpeg to extract audio only when an .m4a copy does not already exist.
- Uploads each audio file to Gemini 2.5 Flash and retrieves a verbatim transcript, which is saved beside the source video.
- Deletes uploaded assets from the Gemini account afterwards to avoid orphaned files.
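
The upload, transcribe, and clean-up steps can be sketched roughly as follows. This is an illustration under assumptions, not the code in app.py: the function names `transcribe_audio` and `write_transcript`, the prompt text, and the choice to pass the `google.generativeai` module in as a parameter (so a stub can stand in for it) are all hypothetical. The calls themselves (`upload_file`, `GenerativeModel`, `generate_content`, `delete_file`) come from the google-generativeai package.

```python
from pathlib import Path

# Hypothetical prompt; the wording used by app.py is not shown in this README.
PROMPT = "Transcribe this audio verbatim."


def transcribe_audio(audio_path, genai, model_name="gemini-2.5-flash"):
    """Upload one audio file, return its transcript, and always delete the upload.

    `genai` is expected to be the `google.generativeai` module after
    `genai.configure(api_key=...)` has been called. It is a parameter here
    only so the sketch can be exercised without network access.
    """
    uploaded = genai.upload_file(path=str(audio_path))
    try:
        model = genai.GenerativeModel(model_name)
        response = model.generate_content([PROMPT, uploaded])
        return response.text
    finally:
        # Remove the remote copy even if transcription fails, so no
        # orphaned files are left in the Gemini account.
        genai.delete_file(uploaded.name)


def write_transcript(video_path, text):
    """Store the transcript beside the source video as <name>.txt."""
    out_path = Path(video_path).with_suffix(".txt")
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

Putting the delete call in a `finally` block is one way to make sure a failed transcription does not leave the uploaded file behind.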

Requirements

- Python 3.10+
- ffmpeg available on your PATH
- Google Generative AI API access and a GOOGLE_API_KEY
- Python dependencies from requirements.txt (google-generativeai, python-dotenv)

Setup

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
touch .env
echo "GOOGLE_API_KEY=your_key" >> .env

Usage

python app.py --dir data/videos --format mp4

- --dir points to the folder with your source files.
- --format is the input video extension (omit the dot). The default is mp4; change to mov, mkv, etc., as needed.
Each processed video produces:
- <name>.m4a (audio) — skipped if it already exists.
- <name>.txt — plain-text transcript emitted next to the video.
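
The options and outputs described above could be wired together along these lines. This is a sketch under assumptions rather than a copy of app.py: the helper names `parse_args`, `plan_outputs`, and `ffmpeg_command` are hypothetical, `--dir` is assumed to be required, and the ffmpeg flags (`-vn` to drop video, `-c:a aac` to encode AAC audio) are one reasonable choice that the real script may not share.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two documented options; --dir being required is an assumption."""
    parser = argparse.ArgumentParser(description="Transcribe local videos with Gemini.")
    parser.add_argument("--dir", required=True, help="Folder containing the source videos.")
    parser.add_argument("--format", default="mp4", help="Video extension without the dot.")
    return parser.parse_args(argv)


def plan_outputs(directory, extension="mp4"):
    """Yield (video, audio, transcript, needs_extraction) for each matching video."""
    for video in sorted(Path(directory).glob(f"*.{extension}")):
        audio = video.with_suffix(".m4a")
        transcript = video.with_suffix(".txt")
        # Extraction is only needed when no .m4a copy exists yet.
        yield video, audio, transcript, not audio.exists()


def ffmpeg_command(video, audio):
    """Build an audio-only ffmpeg invocation as an argument list."""
    # -vn drops the video stream; -c:a aac encodes audio for the .m4a container.
    return ["ffmpeg", "-i", str(video), "-vn", "-c:a", "aac", str(audio)]
```

A caller would loop over `plan_outputs(args.dir, args.format)` and pass `ffmpeg_command(video, audio)` to `subprocess.run(..., check=True)` only for rows where the last element is true.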

Troubleshooting

- ffmpeg missing: install via your package manager (e.g., brew install ffmpeg, choco install ffmpeg) and reopen the shell.
- Authentication errors: ensure .env contains a valid GOOGLE_API_KEY and that the key has access to Gemini 2.5 Flash.
- Quota limits: the script stops when Gemini rejects an upload; retry after confirming usage limits or switch to a paid tier.
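
The first two problems above can be caught before any video is processed. The following preflight check is a hypothetical helper, not part of app.py; it assumes the .env file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`), and it cannot tell whether a key that is present is actually valid.

```python
import os
import shutil
import sys


def preflight(env=None, which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means ready.

    The parameters exist only so the check can be exercised with stand-in
    values; by default it inspects the real environment.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    problems = []
    if version < (3, 10):
        problems.append("Python 3.10+ is required.")
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH.")
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set.")
    return problems
```

Calling `preflight()` at startup and exiting with the returned messages gives a clearer failure than an error raised partway through a batch.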

Contributing

Issues and pull requests are welcome. Please describe the scenario you processed (--dir, sample formats), list commands you ran, and attach relevant logs or transcript snippets to expedite reviews.

howardpchen/Video-Transcript-Summary
