Build knowledge base from YouTube video transcripts
ytq (short for YouTube Query) is a CLI tool that processes YouTube videos to create a searchable knowledge base. It:
- Downloads and extracts transcripts from YouTube videos
- Uses LLMs to generate structured summaries
- Each transcript is split into multiple chunks (subsections). Each section preserves its start and end times timestemps. Then chunks are embedded using openai 'text-embedding-3-small' model. Created embeddings are used when
--semanticsearch flag is enabled. - Stores everything in a searchable SQLite database
- Provides a CLI for adding videos, searching, and viewing summaries
Install this tool using pip:
pip install ytqIf you are using uv then you can run directly the cli in temporary enviironment like so:
uvx ytq <command> <args>or you can also install it as a tool:
uv tool install ytq
# and then
ytq <command> <args>To add a YouTube video to your knowledge base, use the add command:
ytq add <video_url>Optional parameters:
--chunk-size: Maximum size of each text chunk (default: 1000 characters)--chunk-overlap: Overlap between chunks (default: 100 characters)--provider: LLM summarization provider (default: "openai")--model: LLM summarization model (default: "gpt-4o-mini")
Example:
ytq add https://youtube.com/watch?v=example --chunk-size 1500 --provider anthropicIf you try storing a video that is already in the db, the old version is removed and replaced with the new version.
Search your knowledge base using the query command:
ytq query <search_term>Search options:
--chunks: Enable chunk-level search--semantic: Enable semantic search (when chunk search is enabled)--limit: Maximum number of results (default: 3)
Examples:
# Video-level search (default)
ytq query "machine learning"
# Chunk-level keyword search
ytq query "neural networks" --chunks
# Semantic chunk-level search
ytq query "types of algorithms" --chunks --semanticTo view a summary of a specific video:
ytq summary <video_id>Example:
ytq summary dQw4w9WgXcQTo remove a video from the knowledge base:
ytq delete <video_id>To check the version of ytq:
ytq --versionTo contribute to this tool, first checkout the code. Then create a new virtual environment:
cd ytq
python -m venv venv
source venv/bin/activateNow install the dependencies and test dependencies:
pip install -e '.[test]'To run the tests:
python -m pytest