# TheoremExplainAgent (TEA) 🍵 - Google Colab Setup

This notebook demonstrates how to set up and run TheoremExplainAgent in Google Colab with OpenRouter API support.

## What is TheoremExplainAgent?

TheoremExplainAgent is an AI-powered system that generates educational video content for mathematical theorems using:
- **Manim Community**: Mathematical animations
- **AI Models**: For content generation and planning
- **Text-to-Speech**: For narration
- **Multi-provider Support**: OpenAI, Anthropic, Google, OpenRouter, and more

## Prerequisites

1. **OpenRouter API Key** (recommended for cost-effective access): Get yours at [OpenRouter.ai](https://openrouter.ai/)
2. Alternatively, an API key for any other supported LLM provider (OpenAI, Anthropic, Google, etc.)


## 1. Installation and Setup

### Install System Dependencies

In [None]:
# Install system dependencies for Manim and audio processing
!apt-get update -qq
!apt-get install -y -qq portaudio19-dev libsdl-pango-dev ffmpeg

# Install LaTeX for mathematical typesetting
!apt-get install -y -qq texlive texlive-latex-extra texlive-fonts-extra texlive-latex-recommended texlive-science tipa

print("✅ System dependencies installed successfully!")

### Clone Repository and Install Python Dependencies

In [None]:
# Clone the repository
!git clone https://github.com/dr-data/TheoremExplainAgent.git
%cd TheoremExplainAgent

# Install Python dependencies
!pip install -r requirements.txt

print("✅ Repository cloned and dependencies installed!")

### Download Required Models

In [None]:
# Create models directory and download Kokoro TTS models
!mkdir -p models
!wget -P models https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx
!wget -P models https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.bin

print("✅ Kokoro TTS models downloaded successfully!")
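
A failed or interrupted `wget` can leave a missing or zero-byte file while the success message still prints. The following is a minimal sketch of a sanity check for the two files downloaded above; the helper name `check_downloads` is hypothetical and not part of the repository.

```python
from pathlib import Path


def check_downloads(paths, min_bytes=1):
    """Return {path: status}, where status is 'ok', 'missing', or 'empty'."""
    report = {}
    for p in map(Path, paths):
        if not p.is_file():
            report[str(p)] = "missing"
        elif p.stat().st_size < min_bytes:
            report[str(p)] = "empty"
        else:
            report[str(p)] = "ok"
    return report


# The two files fetched by the wget commands above (paths relative to the repo root).
expected = ["models/kokoro-v0_19.onnx", "models/voices.bin"]
for path, status in check_downloads(expected).items():
    print(f"{path}: {status}")
```

If either file is reported as `missing` or `empty`, re-running the download cell is the simplest fix.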

## 2. Environment Configuration

### Set up API Keys and Environment Variables

In [None]:
import os
from getpass import getpass

# Set up OpenRouter API (recommended for cost-effective access)
print("🔑 Setting up OpenRouter API")
print("Get your API key from: https://openrouter.ai/")
openrouter_key = getpass("Enter your OpenRouter API key: ")

# Set environment variables
os.environ["OPENROUTER_API_KEY"] = openrouter_key
os.environ["OPENROUTER_API_BASE_URL"] = "https://openrouter.ai/api/v1"

# Kokoro TTS Settings
os.environ["KOKORO_MODEL_PATH"] = "models/kokoro-v0_19.onnx"
os.environ["KOKORO_VOICES_PATH"] = "models/voices.bin"
os.environ["KOKORO_DEFAULT_VOICE"] = "af"
os.environ["KOKORO_DEFAULT_SPEED"] = "1.0"
os.environ["KOKORO_DEFAULT_LANG"] = "en-us"

# Set Python path for proper imports
import sys
sys.path.append('/content/TheoremExplainAgent')
os.environ["PYTHONPATH"] = "/content/TheoremExplainAgent"

print("✅ Environment configured successfully!")
print("📝 You can also use other providers like OpenAI, Anthropic, Google, etc.")
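
Because `getpass` accepts an empty entry, the cell above can finish "successfully" with a blank key. A small check like the one below, sketched under the assumption that the four variables listed are the ones the default OpenRouter and Kokoro setup relies on, reports anything unset or blank. The `missing_env` helper is illustrative only.

```python
import os

# Variables set by the configuration cell above.
REQUIRED_VARS = [
    "OPENROUTER_API_KEY",
    "OPENROUTER_API_BASE_URL",
    "KOKORO_MODEL_PATH",
    "KOKORO_VOICES_PATH",
]


def missing_env(names, env=None):
    """Return the names that are unset or contain only whitespace."""
    env = os.environ if env is None else env
    return [n for n in names if not env.get(n, "").strip()]


missing = missing_env(REQUIRED_VARS)
print("All required variables are set" if not missing else f"Missing: {', '.join(missing)}")
```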

### Optional: Configure Alternative Providers

If you prefer to use other LLM providers, uncomment and run the relevant section below:

In [None]:
# Uncomment the provider you want to use:

# # OpenAI
# openai_key = getpass("Enter your OpenAI API key: ")
# os.environ["OPENAI_API_KEY"] = openai_key

# # Anthropic
# anthropic_key = getpass("Enter your Anthropic API key: ")
# os.environ["ANTHROPIC_API_KEY"] = anthropic_key

# # Google Gemini
# gemini_key = getpass("Enter your Google Gemini API key: ")
# os.environ["GEMINI_API_KEY"] = gemini_key

# # Azure OpenAI
# azure_key = getpass("Enter your Azure OpenAI API key: ")
# azure_base = input("Enter your Azure OpenAI base URL: ")
# azure_version = input("Enter your Azure OpenAI API version: ")
# os.environ["AZURE_API_KEY"] = azure_key
# os.environ["AZURE_API_BASE"] = azure_base
# os.environ["AZURE_API_VERSION"] = azure_version

print("ℹ️ Alternative provider setup complete (if configured)")

## 3. Test the Setup

### Verify Installation and API Connectivity

In [None]:
# Test import of key modules
try:
    from mllm_tools.litellm import LiteLLMWrapper
    from src.config.config import Config
    import litellm
    print("✅ Core modules imported successfully")
except ImportError as e:
    print(f"❌ Import error: {e}")
    print("🔧 Try restarting the runtime and running all cells again")

# Test model connectivity with a simple completion
try:
    # Test with OpenRouter model (cost-effective)
    test_model = LiteLLMWrapper(
        model_name="openrouter/openai/gpt-4o-mini",  # Cost-effective model
        temperature=0.7,
        print_cost=True,
        verbose=True,
        use_langfuse=False  # Disable for testing
    )
    
    # Simple test message
    test_messages = [
        {"type": "text", "content": "Hello! Can you briefly explain what a mathematical theorem is?"}
    ]
    
    response = test_model(test_messages)
    print("\n✅ Model test successful!")
    print(f"📝 Response preview: {response[:100]}...")
    
except Exception as e:
    print(f"❌ Model test failed: {e}")
    print("🔧 Check your API key and internet connection")
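
If the connectivity test fails for one model (for example because of rate limiting), it can help to probe a short list of candidates and use the first that answers. The sketch below is one possible approach, not part of the repository: `first_working_model` and `fake_probe` are hypothetical names, and the stub probe stands in for a real call so the example runs without an API key.

```python
def first_working_model(candidates, probe):
    """Try each model name in order.

    `probe` is any callable that takes a model name and returns a reply,
    raising an exception on failure. Returns (name, reply) for the first
    success, or (None, errors) where errors maps each name to its message.
    """
    errors = {}
    for name in candidates:
        try:
            return name, probe(name)
        except Exception as exc:
            errors[name] = str(exc)
    return None, errors


# In the notebook, the probe could wrap the same call used in the test cell:
# def probe(name):
#     model = LiteLLMWrapper(model_name=name, temperature=0.7,
#                            print_cost=True, verbose=True, use_langfuse=False)
#     return model([{"type": "text", "content": "Hello!"}])

def fake_probe(name):
    """Stub for illustration: pretends the gpt-4o-mini route is rate limited."""
    if "gpt-4o-mini" in name:
        raise RuntimeError("rate limited")
    return f"hello from {name}"


candidates = ["openrouter/openai/gpt-4o-mini", "openrouter/anthropic/claude-3-haiku"]
print(first_working_model(candidates, fake_probe))
```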

## 4. Generate Educational Videos

### Example 1: Single Topic Generation

In [None]:
# Generate a video for a single mathematical topic
!python generate_video.py \
    --model "openrouter/openai/gpt-4o-mini" \
    --helper_model "openrouter/openai/gpt-4o-mini" \
    --output_dir "output/pythagorean_theorem" \
    --topic "Pythagorean theorem" \
    --context "fundamental relation in Euclidean geometry among the three sides of a right triangle: a² + b² = c²" \
    --verbose

print("\n✅ Single topic generation complete!")
print("📁 Check the output/pythagorean_theorem directory for results")

### Example 2: Batch Generation from Dataset

In [None]:
# Generate videos for multiple theorems from the provided dataset
!python generate_video.py \
    --model "openrouter/anthropic/claude-3-haiku" \
    --helper_model "openrouter/anthropic/claude-3-haiku" \
    --output_dir "output/batch_experiment" \
    --theorems_path "data/easy_20.json" \
    --sample_size 3 \
    --max_scene_concurrency 2 \
    --max_topic_concurrency 1 \
    --verbose

print("\n✅ Batch generation complete!")
print("📁 Check the output/batch_experiment directory for results")

## 5. Advanced Features

### Enable RAG (Retrieval Augmented Generation)

In [None]:
# Optional: Set up RAG for enhanced content generation
# This requires additional setup - see the main repository documentation

print("ℹ️ RAG setup is optional and requires additional configuration.")
print("📖 See the main repository documentation for RAG setup instructions.")
print("🔗 https://github.com/dr-data/TheoremExplainAgent#generation-with-rag")

# Example with RAG (uncomment if you have set up RAG)
# !python generate_video.py \
#     --model "openrouter/openai/gpt-4o-mini" \
#     --helper_model "openrouter/openai/gpt-4o-mini" \
#     --output_dir "output/rag_example" \
#     --topic "Complex numbers" \
#     --context "numbers that can be expressed in the form a + bi, where a and b are real numbers" \
#     --use_rag \
#     --chroma_db_path "data/rag/chroma_db" \
#     --manim_docs_path "data/rag/manim_docs" \
#     --verbose

### Compare Different Models

In [None]:
# Example: Compare different OpenRouter models for the same topic
models_to_test = [
    "openrouter/openai/gpt-4o-mini",           # Cost-effective, high quality
    "openrouter/anthropic/claude-3-haiku",    # Fast and efficient
    "openrouter/meta-llama/llama-3.1-8b-instruct", # Open source alternative
]

topic = "Quadratic formula"
context = "formula that provides the solution(s) to a quadratic equation: x = (-b ± √(b² - 4ac)) / (2a)"

for i, model in enumerate(models_to_test):
    print(f"\n🧪 Testing model {i+1}/{len(models_to_test)}: {model}")
    
    !python generate_video.py \
        --model "{model}" \
        --helper_model "{model}" \
        --output_dir "output/model_comparison_{i+1}" \
        --topic "{topic}" \
        --context "{context}" \
        --only_plan  # Only generate plans to compare approaches quickly
    
    print(f"✅ Model {i+1} planning complete")

print("\n🎯 Model comparison complete!")
print("📊 Compare the generated plans in the different output directories")
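
The loop above interpolates `{topic}` and `{context}` into a shell command, so a value containing double quotes, `$`, or backticks could be altered by the shell. One alternative, sketched here with only the flags already used in this notebook, is to build an argument list and pass it to `subprocess.run`, which bypasses the shell. The `build_plan_command` helper and the example context string are illustrative.

```python
import shlex
import sys


def build_plan_command(model, topic, context, output_dir):
    """Build an argv list for a plan-only run of generate_video.py."""
    return [
        sys.executable, "generate_video.py",
        "--model", model,
        "--helper_model", model,
        "--output_dir", output_dir,
        "--topic", topic,
        "--context", context,
        "--only_plan",
    ]


cmd = build_plan_command(
    "openrouter/openai/gpt-4o-mini",
    "Quadratic formula",
    'solutions of ax² + bx + c = 0, including the "double root" case',
    "output/model_comparison_1",
)
# Show the command as it would look when safely quoted for a shell.
print(shlex.join(cmd))
# To execute it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```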

## 6. View Results

### List Generated Files

In [None]:
# List all generated output directories
!ls -la output/

print("\n📁 Generated content structure:")
print("Each output directory contains:")
print("  📋 scene_outlines.json - Generated scene plans")
print("  📝 implementation_plans/ - Detailed implementation plans")
print("  🎬 scenes/ - Individual scene videos")
print("  🎥 final_video.mp4 - Combined final video (if generation completed)")
print("  📊 logs/ - Generation logs and metadata")
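
To see at a glance which of these artifacts each run actually produced, a short script can walk the `output/` directory. This is a rough sketch that assumes the file and directory names printed above; the real layout may differ between versions of the repository, and `summarize_outputs` is a hypothetical helper.

```python
from pathlib import Path

# Artifact names taken from the structure described above (assumed, not verified).
EXPECTED = ["scene_outlines.json", "implementation_plans", "scenes", "final_video.mp4", "logs"]


def summarize_outputs(root="output"):
    """Map each run directory under `root` to {artifact_name: exists}."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {
        run.name: {name: (run / name).exists() for name in EXPECTED}
        for run in sorted(root.iterdir())
        if run.is_dir()
    }


for run, found in summarize_outputs().items():
    present = [name for name, ok in found.items() if ok]
    print(f"{run}: {', '.join(present) or 'nothing yet'}")
```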

### Download Generated Videos

In [None]:
from google.colab import files
import os
import glob

# Find all generated final videos
video_files = glob.glob("output/*/final_video.mp4")

if video_files:
    print(f"📥 Found {len(video_files)} video(s) to download:")
    for video in video_files:
        print(f"  🎥 {video}")
        try:
            files.download(video)
            print(f"✅ Downloaded: {video}")
        except Exception as e:
            print(f"❌ Error downloading {video}: {e}")
else:
    print("ℹ️ No final videos found. Videos may still be generating or only plans were created.")
    print("💡 You can also download individual scene videos or other generated files.")

# Alternative: Download specific output directory as zip
print("\n📦 To download entire output directories:")
print("!zip -r output_backup.zip output/")
print("files.download('output_backup.zip')")
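
The zip-and-download step printed above can also be run directly from Python. The sketch below uses the standard library's `shutil.make_archive` and only attempts the download when `google.colab` is importable; the `zip_directory` helper is an illustrative name.

```python
import shutil
from pathlib import Path


def zip_directory(src_dir, archive_base="output_backup"):
    """Zip `src_dir` into `<archive_base>.zip`; return the archive path or None."""
    src = Path(src_dir)
    if not src.is_dir():
        return None
    return shutil.make_archive(
        archive_base, "zip", root_dir=str(src.parent), base_dir=src.name
    )


archive = zip_directory("output")
if archive is None:
    print("No output/ directory to archive yet.")
else:
    print(f"Created {archive}")
    try:
        from google.colab import files  # only available inside Colab
        files.download(archive)
    except ImportError:
        print("Not running in Colab; the archive was left on disk.")
```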

## 7. Troubleshooting & Tips

### Common Issues and Solutions

1. **Import Errors**:
   - Restart the runtime and run all cells again
   - Ensure `PYTHONPATH` is set to `/content/TheoremExplainAgent`

2. **API Errors**:
   - Check that your API key is correct
   - Verify you have sufficient credits/quota
   - Try a different model if rate limited

3. **Memory Issues**:
   - Reduce `max_scene_concurrency` and `max_topic_concurrency`
   - Use smaller models for testing
   - Generate fewer topics at once

4. **Video Generation Fails**:
   - Check that the LaTeX packages installed correctly
   - Ensure the Kokoro model files downloaded correctly
   - Use `--only_plan` to test planning first
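
Several of the checks above can be automated. As a starting point, the sketch below looks for the command-line tools that rendering depends on: Manim Community calls `latex` and `dvisvgm` for typesetting and `ffmpeg` for video encoding. The `find_tools` helper is a hypothetical name.

```python
import shutil


def find_tools(names):
    """Map each executable name to its path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for tool, path in find_tools(["latex", "dvisvgm", "ffmpeg"]).items():
    print(f"{tool}: {path or 'NOT FOUND'}")
```

Any tool reported as `NOT FOUND` suggests re-running the system dependencies cell.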

### Performance Tips

- **Cost Optimization**: Use cost-effective models available through OpenRouter, such as `openrouter/openai/gpt-4o-mini` or `openrouter/anthropic/claude-3-haiku`
- **Speed**: Start with a small `--sample_size` and `--only_plan` for quick testing
- **Quality**: Switch to a more capable model for final production runs
- **Debugging**: Enable `--verbose` for detailed logs

### Additional Resources

- 📖 [Main Repository](https://github.com/dr-data/TheoremExplainAgent)
- 🔧 [LiteLLM Documentation](https://docs.litellm.ai/docs/providers)
- 🎨 [Manim Community Documentation](https://docs.manim.community/)
- 🌐 [OpenRouter Models](https://openrouter.ai/models)
