☀️ Tonalli

"Your tonalli for learning" (toh-NAH-lee)

Transform spoken wisdom into written knowledge.

In Aztec/Mexica cosmology, tonalli represents one's vital energy, destiny, and inner fire - the warmth and consciousness housed in one's head that drives learning and spiritual growth. We honor this sacred concept as we help capture and illuminate the knowledge within video and audio.

Features

Video & Audio Transcription: Upload video files (.mp4, .mov, .mkv, etc.) or audio files (.mp3, .wav, .m4a, etc.) and get accurate transcriptions using OpenAI's Whisper model
Project Management: Organize multiple videos into projects for collective analysis
AI-Powered Q&A: Ask questions about your video content using Google Gemini AI
Smart Summaries: Generate comprehensive overviews with key topics and suggested questions
Modern UI: Clean, Bootstrap-based interface with custom color scheme
Persistent Storage: Transcripts are saved permanently while original files are cleaned up automatically
Security-First: No API keys stored in code, temporary file cleanup, input validation

Tech Stack

Backend: FastAPI, Python 3.11+
Transcription: OpenAI Whisper (large model, configurable)
AI Integration: Google Gemini API
Audio Processing: FFmpeg
Database: SQLite with SQLAlchemy
Frontend: HTML, JavaScript, Bootstrap 5

Prerequisites

Python 3.11 or higher
FFmpeg (for audio and video processing)
(Optional) Google Gemini API key for AI features

Installing FFmpeg

macOS:

brew install ffmpeg

Ubuntu/Debian:

sudo apt update
sudo apt install ffmpeg

Windows: Download from https://ffmpeg.org/download.html

Quick Start

1. Clone or Download

Clone or download the repository, then change into the project directory:

cd tldr-vid

2. Run Setup

./setup.sh

This will:

Create a Python virtual environment
Install all dependencies
Create necessary directories
Set up the .env file

3. Configure (Optional)

Edit .env to add your Gemini API key for AI features:

GEMINI_API_KEY=your_api_key_here
WHISPER_MODEL=large
MAX_FILE_SIZE_MB=1024

4. Run the Application

./run.sh

Or manually:

source venv/bin/activate
uvicorn main:app --reload

5. Open Your Browser

Navigate to: http://localhost:8000

Usage Guide

Creating a Project

Click "New Project" in the sidebar
Enter a project name and optional description
Click "Create Project"

Uploading Videos

Select a project from the sidebar
Drag and drop a video/audio file into the upload zone, or click "Select File"
Wait for the transcription to complete (this may take a few minutes depending on file size)

Using AI Features

Generate Overview:

After uploading transcripts, click "Generate Overview"
View the summary, key topics, and suggested questions

Ask Questions:

Type your question in the Q&A section
Press Enter or click the send button
The AI will answer based on all transcripts in the project
Conversation history is maintained for context
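
As a rough illustration of how an answer can draw on every transcript in a project plus the earlier conversation, the sketch below assembles a single prompt string. The function name `build_prompt`, the message layout, and the `max_chars` budget are assumptions made for illustration, not Tonalli's actual implementation, and the call to Gemini itself is left out.

```python
from typing import Dict, List


def build_prompt(
    transcripts: List[str],
    history: List[Dict[str, str]],
    question: str,
    max_chars: int = 20000,
) -> str:
    """Combine project transcripts, past Q&A turns, and a new question.

    Hypothetical helper: the layout and the max_chars budget are assumptions.
    """
    # Join all non-empty transcripts, then truncate to stay within the budget.
    context = "\n\n".join(t.strip() for t in transcripts if t.strip())
    context = context[:max_chars]

    # Render earlier turns so follow-up questions can be resolved in context.
    turns = "\n".join(f"{m['role']}: {m['content']}" for m in history)

    parts = ["Answer using only the transcripts below.", "TRANSCRIPTS:", context]
    if turns:
        parts += ["CONVERSATION SO FAR:", turns]
    parts += ["QUESTION:", question.strip()]
    return "\n".join(parts)
```

Truncating by characters is the simplest possible budget; a real integration would more likely count tokens or summarize long transcripts first.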

Managing Projects

Click on a project to view its transcripts and conversations
Use the trash icon to delete a project (this will delete all associated data)
Download individual transcripts using the download button on each transcript card

Project Structure

tldr-vid/
├── main.py                 # FastAPI application entry point
├── config.py               # Configuration management
├── models.py               # Database models
├── transcription.py        # Whisper & FFmpeg integration
├── ai_integration.py       # Gemini AI integration
├── static/
│   ├── index.html         # Frontend HTML
│   ├── script.js          # Frontend JavaScript
│   └── styles.css         # Custom styles
├── uploads/               # Temporary file storage
├── transcripts/           # Permanent transcript storage
├── setup.sh               # Setup script
├── run.sh                 # Run script
├── requirements.txt       # Python dependencies
├── .env.example           # Environment variables template
├── .env                   # Your configuration (not in git)
├── .gitignore            # Git ignore rules
├── LICENSE               # BSD 3-Clause License
└── README.md             # This file

Configuration

Edit .env to customize:

| Variable | Default | Description |
|----------|---------|-------------|
| `GEMINI_API_KEY` | - | Google Gemini API key (optional) |
| `WHISPER_MODEL` | large | Whisper model size (tiny/base/small/medium/large) |
| `MAX_FILE_SIZE_MB` | 1024 | Maximum upload size in MB |
| `HOST` | 0.0.0.0 | Server host |
| `PORT` | 8000 | Server port |

Whisper Model Options

tiny: Fastest, least accurate (~1GB)
base: Fast, basic accuracy (~1GB)
small: Balanced (~2GB)
medium: Good accuracy (~3GB)
large: Best accuracy, slower (~6GB) - Default
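
The variables and model sizes listed above could be read and validated along these lines. This is a minimal sketch, not the contents of config.py: the `Settings` class and `load_settings` function are hypothetical names, and it assumes the values from .env have already been loaded into the process environment by some other means.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Model sizes listed under "Whisper Model Options".
VALID_MODELS = ("tiny", "base", "small", "medium", "large")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: Optional[str]
    whisper_model: str
    max_file_size_mb: int
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, applying the documented defaults."""
    model = env.get("WHISPER_MODEL", "large").strip().lower()
    if model not in VALID_MODELS:
        raise ValueError(f"WHISPER_MODEL must be one of {VALID_MODELS}, got {model!r}")
    return Settings(
        gemini_api_key=env.get("GEMINI_API_KEY") or None,  # optional: AI features off when unset
        whisper_model=model,
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "1024")),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
    )
```

Rejecting an unknown model name at startup gives a clearer error than a failure in the middle of the first transcription.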

API Endpoints

Projects

POST /api/projects - Create a new project
GET /api/projects - List all projects
GET /api/projects/{id} - Get project details
DELETE /api/projects/{id} - Delete a project

Transcription

POST /api/transcribe - Upload and transcribe a file
GET /api/projects/{id}/transcripts - List project transcripts
GET /api/transcripts/{id}/download - Download transcript

AI Features

POST /api/ai/overview - Generate AI overview
POST /api/ai/ask - Ask a question
GET /api/projects/{id}/conversation - Get conversation history

System

GET /api/health - System health check
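
The endpoints above can be exercised from a script as well as from the browser. The sketch below uses only the Python standard library. The helper names `build_request` and `call` are hypothetical, and the JSON body for creating a project (`name`, `description`) is an assumption based on the Usage Guide; the actual request and response schemas are not documented here.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8000"  # default address from Quick Start


def build_request(method: str, path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the endpoints listed above."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(method: str, path: str, payload: Optional[dict] = None, timeout: float = 10.0) -> Any:
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(build_request(method, path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage with the server running:
#   call("GET", "/api/health")
#   call("POST", "/api/projects", {"name": "Lectures", "description": "Week 1"})  # assumed body shape
#   call("GET", "/api/projects")
```

File uploads to POST /api/transcribe need a multipart request, which is omitted to keep the sketch short.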

Security Features

No API Keys in Code: API keys live only in .env (excluded from git)
Input Validation: File types, sizes, and MIME types are validated
Automatic Cleanup: Uploaded videos/audio are deleted after transcription
No Shell Injection: Uses safe subprocess calls
CORS Protection: Configurable CORS middleware
Error Handling: No internal paths or stack traces exposed
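
Two of the measures above, input validation and shell-free subprocess calls, can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: `validate_upload` and `ffmpeg_extract_audio_args` are hypothetical names, the extension list contains only the formats named under Features, MIME-type checks are omitted, and the FFmpeg options shown (mono, 16 kHz) are one plausible choice for speech audio.

```python
from pathlib import Path
from typing import List

# Extensions named in the Features section; the real allow-list may be longer.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def validate_upload(filename: str, size_bytes: int, max_mb: int = 1024) -> str:
    """Return a sanitized file name, or raise ValueError if the upload is rejected."""
    # Keep only the final path component to block directory traversal.
    safe_name = Path(filename.replace("\\", "/")).name
    if not safe_name or Path(safe_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("unsupported file type")
    if size_bytes <= 0 or size_bytes > max_mb * 1024 * 1024:
        raise ValueError("file is empty or exceeds the size limit")
    return safe_name


def ffmpeg_extract_audio_args(src: str, dst: str) -> List[str]:
    """Build an argument list for subprocess.run(args), with no shell involved."""
    # -y: overwrite output, -vn: drop video, -ac 1: mono, -ar 16000: 16 kHz sample rate
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1", "-ar", "16000", dst]
```

Passing the list to `subprocess.run(args, check=True)` without `shell=True` means a file name containing spaces or shell metacharacters is handed to FFmpeg as a single argument and is never interpreted by a shell.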

Troubleshooting

FFmpeg Not Found

If you see "FFmpeg not installed":

Install FFmpeg using the instructions above
Ensure FFmpeg is in your system PATH
Restart the application

AI Features Not Working

If AI features are unavailable:

Check that GEMINI_API_KEY is set in .env
Verify your API key is valid
Check your internet connection
Review the console for error messages

Transcription Fails

Ensure the file format is supported
Check that the file size is under the limit (MAX_FILE_SIZE_MB, 1024 MB by default)
Verify FFmpeg is working: ffmpeg -version
Check disk space for temporary files
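
The FFmpeg and disk-space checks above can be automated with a small diagnostic script. The function names and the 1024 MB threshold are assumptions for illustration; pass the uploads/ directory as `upload_dir` to check the volume where temporary files are written.

```python
import shutil
import subprocess
from typing import Dict, Optional


def ffmpeg_version() -> Optional[str]:
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is not on PATH."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return None
    result = subprocess.run([exe, "-version"], capture_output=True, text=True, timeout=10)
    lines = result.stdout.splitlines()
    return lines[0] if result.returncode == 0 and lines else None


def free_disk_mb(path: str = ".") -> int:
    """Free space, in MB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def diagnose(upload_dir: str = ".", needed_mb: int = 1024) -> Dict[str, object]:
    """Summarize whether FFmpeg is available and enough disk space is free."""
    version = ffmpeg_version()
    free_mb = free_disk_mb(upload_dir)
    return {
        "ffmpeg": version,
        "free_mb": free_mb,
        "ok": version is not None and free_mb >= needed_mb,
    }
```

Running `print(diagnose("uploads"))` from the project directory reports both checks at once.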

Port Already in Use

If port 8000 is occupied:

Edit .env and change PORT=8000 to another port
Or stop the other application using port 8000
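
To find out whether a port is taken before editing .env, a quick check like the following can help. It is a sketch using the standard `socket` module; `port_in_use` and `first_free_port` are hypothetical helper names, and the check only detects listeners reachable on the given host (127.0.0.1 by default).

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start: int = 8000, attempts: int = 20) -> int:
    """Return the first port at or after `start` with no listener."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")
```

The value returned by `first_free_port()` can then be written to PORT in .env.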

Development

Running in Development Mode

source venv/bin/activate
uvicorn main:app --reload --log-level debug

Installing Additional Dependencies

source venv/bin/activate
pip install package-name
pip freeze > requirements.txt

Performance Tips

Use smaller Whisper models (base/small) for faster transcription
Keep video files under 500MB for optimal performance
The first transcription will be slower as Whisper downloads the model
Subsequent transcriptions are faster as the model is cached
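
Whisper keeps downloaded model weights on disk, which is why only the first run pays the download cost. A long-running server can additionally keep the loaded model in memory so that each request skips the load step. The sketch below shows that idea; the `ModelCache` class is a hypothetical helper and not necessarily how transcription.py handles it. The calls `whisper.load_model` and `model.transcribe` are from the openai-whisper package.

```python
from typing import Any, Callable, Dict


def _load_whisper(name: str) -> Any:
    # Imported lazily so this module can be imported without openai-whisper installed.
    import whisper

    return whisper.load_model(name)


class ModelCache:
    """Load each model size at most once per process and reuse it afterwards."""

    def __init__(self, loader: Callable[[str], Any] = _load_whisper) -> None:
        self._loader = loader
        self._models: Dict[str, Any] = {}

    def get(self, name: str) -> Any:
        if name not in self._models:
            # Slow path: reads the weights from disk, downloading them on first use.
            self._models[name] = self._loader(name)
        return self._models[name]


# Example usage (requires openai-whisper and FFmpeg):
#   cache = ModelCache()
#   text = cache.get("base").transcribe("talk.mp3")["text"]
```

The loader is injectable so the cache can be tested, or pointed at a different backend, without loading a real model.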

Contributing

This tool is designed for personal/educational use. Feel free to fork and modify for your needs.

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Acknowledgments

OpenAI Whisper for speech recognition
Google Gemini for AI-powered Q&A
FastAPI for the web framework
FFmpeg for audio/video processing

Support

For issues, questions, or suggestions:

Check the troubleshooting section above
Review the console output for error messages
Ensure all prerequisites are installed correctly

☀️ Tonalli - Built with respect and reverence for learning, inspired by ancient wisdom
