"Your tonalli for learning" (toh-NAH-lee)
Transform spoken wisdom into written knowledge.
In Aztec/Mexica cosmology, tonalli represents one's vital energy, destiny, and inner fire - the warmth and consciousness housed in one's head that drives learning and spiritual growth. We honor this sacred concept as we help capture and illuminate the knowledge within video and audio.
- Video & Audio Transcription: Upload video files (.mp4, .mov, .mkv, etc.) or audio files (.mp3, .wav, .m4a, etc.) and get accurate transcriptions using OpenAI's Whisper model
- Project Management: Organize multiple videos into projects for collective analysis
- AI-Powered Q&A: Ask questions about your video content using Google Gemini AI
- Smart Summaries: Generate comprehensive overviews with key topics and suggested questions
- Modern UI: Clean, Bootstrap-based interface with custom color scheme
- Persistent Storage: Transcripts are saved permanently while original files are cleaned up automatically
- Security-First: No API keys stored in code, temporary file cleanup, input validation
- Backend: FastAPI, Python 3.11+
- Transcription: OpenAI Whisper (large model, configurable)
- AI Integration: Google Gemini API
- Audio Processing: FFmpeg
- Database: SQLite with SQLAlchemy
- Frontend: HTML, JavaScript, Bootstrap 5
- Python 3.11 or higher
- FFmpeg (for video processing)
- (Optional) Google Gemini API key for AI features
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Windows: Download from https://ffmpeg.org/download.html
cd tldr-vid
./setup.sh
This will:
- Create a Python virtual environment
- Install all dependencies
- Create necessary directories
- Set up the .env file
Edit .env to add your Gemini API key for AI features:
GEMINI_API_KEY=your_api_key_here
WHISPER_MODEL=large
MAX_FILE_SIZE_MB=1024
./run.sh
Or manually:
source venv/bin/activate
uvicorn main:app --reload
Navigate to: http://localhost:8000
- Click "New Project" in the sidebar
- Enter a project name and optional description
- Click "Create Project"
- Select a project from the sidebar
- Drag and drop a video/audio file into the upload zone, or click "Select File"
- Wait for the transcription to complete (this may take a few minutes depending on file size)
Generate Overview:
- After uploading transcripts, click "Generate Overview"
- View the summary, key topics, and suggested questions
Ask Questions:
- Type your question in the Q&A section
- Press Enter or click the send button
- The AI will answer based on all transcripts in the project
- Conversation history is maintained for context
- Click on a project to view its transcripts and conversations
- Use the trash icon to delete a project (this will delete all associated data)
- Download individual transcripts using the download button on each transcript card
tldr-vid/
├── main.py # FastAPI application entry point
├── config.py # Configuration management
├── models.py # Database models
├── transcription.py # Whisper & FFmpeg integration
├── ai_integration.py # Gemini AI integration
├── static/
│ ├── index.html # Frontend HTML
│ ├── script.js # Frontend JavaScript
│ └── styles.css # Custom styles
├── uploads/ # Temporary file storage
├── transcripts/ # Permanent transcript storage
├── setup.sh # Setup script
├── run.sh # Run script
├── requirements.txt # Python dependencies
├── .env.example # Environment variables template
├── .env # Your configuration (not in git)
├── .gitignore # Git ignore rules
├── LICENSE # BSD 3-Clause License
└── README.md # This file
Edit .env to customize:
| Variable | Default | Description |
|---|---|---|
| GEMINI_API_KEY | - | Google Gemini API key (optional) |
| WHISPER_MODEL | large | Whisper model size (tiny/base/small/medium/large) |
| MAX_FILE_SIZE_MB | 1024 | Maximum upload size in MB |
| HOST | 0.0.0.0 | Server host |
| PORT | 8000 | Server port |
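As an illustration of how these variables and their defaults fit together, here is a minimal sketch that reads them from the process environment using only the standard library. It is not the project's actual config.py: `Settings`, `load_settings`, and `VALID_WHISPER_MODELS` are hypothetical names, and the sketch assumes the values from .env have already been loaded into the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical names for illustration; the real config.py may differ.
VALID_WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: Optional[str]
    whisper_model: str
    max_file_size_mb: int
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    model = env.get("WHISPER_MODEL", "large")
    if model not in VALID_WHISPER_MODELS:
        raise ValueError(f"Unknown WHISPER_MODEL: {model!r}")
    return Settings(
        gemini_api_key=env.get("GEMINI_API_KEY") or None,  # optional key
        whisper_model=model,
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "1024")),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
    )
```

Calling `load_settings({})` yields the defaults from the table, and an unrecognized model name fails early instead of at transcription time.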
- tiny: Fastest, least accurate (~1GB)
- base: Fast, basic accuracy (~1GB)
- small: Balanced (~2GB)
- medium: Good accuracy (~3GB)
- large: Best accuracy, slower (~6GB) - Default
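To try the speed and accuracy trade-off outside the web app, a small script along these lines may help. It is a sketch, not the code in transcription.py: `pick_model` and `transcribe` are hypothetical helpers, and it assumes the openai-whisper package and FFmpeg are installed.

```python
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def pick_model(requested=None, default="large"):
    """Return a valid model name, falling back to the default when unset."""
    name = (requested or default).strip().lower()
    if name not in WHISPER_MODELS:
        raise ValueError(f"Unsupported Whisper model: {name!r}")
    return name


def transcribe(path, model_name="base"):
    """Transcribe one audio or video file and return the text."""
    # Imported lazily so the helper above works without the package installed.
    import whisper  # provided by the openai-whisper package

    model = whisper.load_model(pick_model(model_name))  # downloads on first use
    result = model.transcribe(path)
    return result["text"]
```

For example, `transcribe("lecture.mp3", "small")` would transcribe a hypothetical file with the small model.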
- POST /api/projects - Create a new project
- GET /api/projects - List all projects
- GET /api/projects/{id} - Get project details
- DELETE /api/projects/{id} - Delete a project
- POST /api/transcribe - Upload and transcribe a file
- GET /api/projects/{id}/transcripts - List project transcripts
- GET /api/transcripts/{id}/download - Download transcript
- POST /api/ai/overview - Generate AI overview
- POST /api/ai/ask - Ask a question
- GET /api/projects/{id}/conversation - Get conversation history
- GET /api/health - System health check
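The endpoints can also be called from a script. The sketch below uses only the standard library and is hedged on one point: the README does not document request or response schemas, so the JSON fields `name` and `description` are an assumption based on the "New Project" form, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from this README


def create_project_request(name, description="", base_url=BASE_URL):
    """Build a POST /api/projects request.

    Assumption: the endpoint accepts a JSON body with "name" and
    "description" fields; check main.py for the real schema.
    """
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/projects",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def health_request(base_url=BASE_URL):
    """Build a GET /api/health request."""
    return urllib.request.Request(f"{base_url}/api/health", method="GET")


def send(request, timeout=10):
    """Send a request and decode the reply, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(health_request())` is a quick way to confirm the application is reachable before creating projects.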
- No API Key Storage: API keys are only in .env (excluded from git)
- Input Validation: File types, sizes, and MIME types are validated
- Automatic Cleanup: Uploaded videos/audio are deleted after transcription
- No Shell Injection: Uses safe subprocess calls
- CORS Protection: Configurable CORS middleware
- Error Handling: No internal paths or stack traces exposed
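To illustrate what a safe subprocess call looks like in practice, here is a sketch of invoking FFmpeg with an argument list rather than a shell string. The function names and the chosen output format (16 kHz mono audio) are assumptions for the example, not a description of what transcription.py does.

```python
import subprocess


def ffmpeg_extract_audio_cmd(src, dst):
    """Build an FFmpeg command that extracts 16 kHz mono audio from src."""
    return [
        "ffmpeg",
        "-nostdin",      # never read from stdin
        "-y",            # overwrite the output file if it exists
        "-i", str(src),  # input file, passed as a single argument
        "-vn",           # drop the video stream
        "-ac", "1",      # mono
        "-ar", "16000",  # 16 kHz sample rate
        str(dst),
    ]


def extract_audio(src, dst):
    """Run FFmpeg without a shell and raise CalledProcessError on failure."""
    # A list with the default shell=False means the file name is handed to
    # FFmpeg as one argument and is never interpreted by a shell.
    subprocess.run(ffmpeg_extract_audio_cmd(src, dst), check=True, capture_output=True)
```

Because no shell is involved, a file name such as `my video; rm -rf ~.mp4` is treated as an ordinary, if odd, path.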
If you see "FFmpeg not installed":
- Install FFmpeg using instructions above
- Ensure FFmpeg is in your system PATH
- Restart the application
If AI features are unavailable:
- Check that GEMINI_API_KEY is set in .env
- Verify your API key is valid
- Check your internet connection
- Review the console for error messages
- Ensure the file format is supported
- Check file size is under the limit
- Verify FFmpeg is working: ffmpeg -version
- Check disk space for temporary files
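The checks above can be run in one pass before uploading. This is a rough sketch using only the standard library: `preflight` is a hypothetical helper, the extension set contains only the formats named in this README (the application may accept more), and the disk-space test simply assumes one temporary copy of the file is needed.

```python
import shutil
from pathlib import Path

# Extensions named in this README; the application may accept more.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def preflight(path, max_size_mb=1024, work_dir="."):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    file = Path(path)
    if file.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {file.suffix or '(none)'}")
    if not file.is_file():
        problems.append("file does not exist")
    else:
        size = file.stat().st_size
        if size > max_size_mb * 1024 * 1024:
            problems.append(f"file is larger than {max_size_mb} MB")
        if shutil.disk_usage(work_dir).free < size:
            problems.append("not enough free disk space for a temporary copy")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems
```

Running `preflight("talk.mp4")` on a hypothetical file returns an empty list when nothing obvious is wrong.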
If port 8000 is occupied:
- Edit .env and change PORT=8000 to another port
- Or stop the other application using port 8000
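If it is unclear whether the port is really taken, a short standard-library check can confirm it before editing .env. The helper names below are hypothetical, and the check only detects something listening on the given host.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start=8000, attempts=20):
    """Return the first port at or after start that nothing is listening on."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    raise RuntimeError("no free port found in the scanned range")
```

The value returned by `first_free_port()` can then be used for PORT in .env.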
source venv/bin/activate
uvicorn main:app --reload --log-level debug
source venv/bin/activate
pip install package-name
pip freeze > requirements.txt
- Use smaller Whisper models (base/small) for faster transcription
- Keep video files under 500MB for optimal performance
- The first transcription will be slower as Whisper downloads the model
- Subsequent transcriptions are faster as the model is cached
This tool is designed for personal/educational use. Feel free to fork and modify for your needs.
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.
- OpenAI Whisper for speech recognition
- Google Gemini for AI-powered Q&A
- FastAPI for the web framework
- FFmpeg for audio/video processing
For issues, questions, or suggestions:
- Check the troubleshooting section above
- Review the console output for error messages
- Ensure all prerequisites are installed correctly
☀️ Tonalli - Built with respect and reverence for learning, inspired by ancient wisdom