# Gemini Batch Prediction Framework
## GSoC 2025 - Week 1 Demo

**Goal** Achieve 4-5x reduction in Gemini API calls through intelligent batch processing

### Core Innovation
Traditional approach: Send each question separately, repeating the same content
```
Question 1: [CONTENT] + [Q1] → API Call 1
Question 2: [CONTENT] + [Q2] → API Call 2  
Question 3: [CONTENT] + [Q3] → API Call 3
```

Batch approach: Send all questions together
```
Batch: [CONTENT] + [Q1, Q2, Q3, ...] → Single API Call
```

### Why This Works: Token Economics

**Efficiency = Generated Tokens / Total Tokens**

For large content (like video transcripts), content tokens dominate the calculation:

In [None]:
# Example: Educational video analysis (1 hour video ≈ 946,800 tokens)
def calculate_efficiency_example():
    # Constants for 1-hour video
    video_tokens = 946800  # 1 hour × 263 tokens/second
    question_tokens = 50   # Average question length
    answer_tokens = 100    # Average answer length

    print("TOKEN ECONOMICS COMPARISON")
    print("=" * 50)

    # Individual processing (5 questions)
    questions = 5
    individual_total = questions * (video_tokens + question_tokens + answer_tokens)
    individual_generated = questions * answer_tokens
    individual_efficiency = individual_generated / individual_total

    # Batch processing (5 questions)
    batch_total = (
        video_tokens + (questions * question_tokens) + (questions * answer_tokens)
    )
    batch_generated = questions * answer_tokens
    batch_efficiency = batch_generated / batch_total

    print(f"Individual Processing ({questions} separate calls):")
    print(f"  Total tokens: {individual_total:,}")
    print(f"  Efficiency: {individual_efficiency:.4f} "
          f"({individual_efficiency*100:.2f}%)")

    print("\nBatch Processing (1 call):")
    print(f"  Total tokens: {batch_total:,}")
    print(f"  Efficiency: {batch_efficiency:.4f} ({batch_efficiency*100:.2f}%)")

    improvement = batch_efficiency / individual_efficiency
    token_savings = individual_total - batch_total

    print("\nResult:")
    print(f"  Efficiency improvement: {improvement:.1f}×")
    print(f"  Tokens saved: {token_savings:,} "
          f"({(token_savings/individual_total)*100:.1f}%)")

    return improvement

theoretical_improvement = calculate_efficiency_example()
print(
    f"\n💡 Theoretical max efficiency ≈ "
    f"{theoretical_improvement:.1f}× for this scenario"
)

**Expected efficiency:** Content-heavy scenarios approach N× improvement where N = number of questions

---

## 🔧 Setup and Installation

In [None]:
# Install the package with visualization dependencies
%pip install -e .[viz]

import os
import sys

# Add the project root to Python path (adjust if needed)
sys.path.append("..")

import warnings

# Try to import visualization dependencies
try:
    import matplotlib.pyplot as plt
    import seaborn as sns

    from gemini_batch.visualization import (
        create_efficiency_visualizations,
        run_efficiency_experiment,
        visualize_scaling_results,
    )

    # Configure plotting
    plt.style.use("seaborn-v0_8")
    sns.set_palette("husl")

    VISUALIZATION_AVAILABLE = True
    print("✅ Visualization dependencies loaded successfully")

except ImportError as e:
    print("⚠️  Visualization dependencies not available.")
    print("   Run: pip install -e .[viz] to enable visualizations")
    print(f"   Error: {e}")
    VISUALIZATION_AVAILABLE = False

from gemini_batch import BatchProcessor

warnings.filterwarnings("ignore")

In [None]:
# API Key Setup
from dotenv import load_dotenv

load_dotenv()

# Verify API key is available
api_key = os.getenv('GEMINI_API_KEY')
if not api_key:
    print("⚠️  Please set GEMINI_API_KEY in your .env file")
    print("   You can get a key from: https://ai.dev/")
else:
    print("✅ API key loaded successfully")

# Initialize processor
processor = BatchProcessor()
print("✅ Batch processor initialized")

---

## 📊 Interactive Demo: Content Analysis

### Demo Content: Educational AI Article


In [None]:
# Educational content for demonstration
demo_content = """
Artificial Intelligence (AI) represents one of the most transformative technologies
of the 21st century, fundamentally reshaping how we interact with information,
solve complex problems, and understand the world around us. The field has evolved
dramatically from its early theoretical foundations in the 1950s to today's
sophisticated systems that demonstrate remarkable capabilities across multiple domains.

Modern AI systems excel in natural language processing, enabling machines to
understand, interpret, and generate human language with unprecedented accuracy.
These systems can translate between languages, summarize complex documents,
answer questions, and even engage in creative writing tasks. Computer vision
has similarly advanced, allowing AI to recognize objects, faces, and patterns
in images and videos with superhuman precision in many cases.

Machine learning, the driving force behind most modern AI applications, enables
systems to learn from data without being explicitly programmed for every task.
Deep learning, a subset of machine learning using neural networks with multiple
layers, has been particularly revolutionary. These networks can identify complex
patterns in vast datasets, leading to breakthroughs in image recognition,
speech processing, and predictive analytics.

Key applications span numerous industries. In healthcare, AI assists with medical
diagnosis by analyzing medical images, predicting disease progression, and
accelerating drug discovery processes. The finance sector leverages AI for
algorithmic trading, fraud detection, and risk assessment. Transportation is
being transformed through autonomous vehicles, while entertainment increasingly
relies on AI for content recommendation and computer graphics.

However, the rapid advancement of AI also presents significant challenges that
society must address. Bias in AI systems is a critical concern, as these systems
can perpetuate or amplify existing societal biases present in their training data.
Job displacement concerns arise as AI systems become capable of performing tasks
traditionally done by humans. Privacy implications are substantial, as AI systems
often require vast amounts of personal data to function effectively.
"""

# Educational questions about the content
demo_questions = [
    "What are the main technical capabilities that modern AI systems demonstrate?",
    "How has machine learning, particularly deep learning, revolutionized AI development?",
    "What are the key applications of AI across different industries mentioned?",
    "What are the primary challenges and concerns associated with AI advancement?",
    "How do modern AI capabilities compare to the early theoretical foundations from the 1950s?",
    "What role does data play in machine learning and AI system development?",
    "What specific examples are given for AI applications in healthcare and finance?",
    "How might the societal implications of AI affect future development and adoption?"
]

print(f"📄 Content length: {len(demo_content):,} characters")
print(f"❓ Questions to analyze: {len(demo_questions)}")
print(f"📏 Average question length: "
      f"{sum(len(q) for q in demo_questions) / len(demo_questions):.1f} characters")

---

## 🧪 Experiment 1: Efficiency Comparison

In [None]:
if VISUALIZATION_AVAILABLE:
    # Run the experiment
    experiment_results = run_efficiency_experiment(
        processor, demo_content, demo_questions, "AI Article Analysis"
    )
else:
    print(
        "⚠️  Visualization features not available. Install with: pip install -e .[viz]"
    )

---

## 📈 Visualization: Efficiency Gains

In [None]:
if VISUALIZATION_AVAILABLE:
    # Create visualizations
    create_efficiency_visualizations(experiment_results)
else:
    print(
        "⚠️  Visualization features not available. Install with: pip install -e .[viz]"
    )

---

## 📊 Scaling Analysis

In [None]:
def scaling_experiment():
    """Demonstrate how efficiency improves with more questions"""

    print("🔬 SCALING EXPERIMENT: Efficiency vs Question Count")
    print("=" * 60)

    # Use shorter content for faster experimentation
    short_content = demo_content[:1000]
    question_counts = [2, 3, 4, 5, 6]
    results_data = []

    for q_count in question_counts:
        current_questions = demo_questions[:q_count]
        print(f"Testing with {q_count} questions...")

        try:
            results = processor.process_text_questions(
                short_content, current_questions, compare_methods=True
            )

            efficiency_ratio = results['efficiency']['token_efficiency_ratio']
            meets_target = results['efficiency']['meets_target']
            results_data.append({
                'questions': q_count,
                'efficiency': efficiency_ratio,
                'meets_target': meets_target,
                'individual_tokens': results['metrics']['individual']['tokens'],
                'batch_tokens': results['metrics']['batch']['tokens']
            })

            print(f"   Efficiency: {efficiency_ratio:.1f}×")

        except Exception as e:
            print(f"   Error: {e}")

    return results_data

# Run scaling experiment
scaling_data = scaling_experiment()

In [None]:
if VISUALIZATION_AVAILABLE:
    # Visualize scaling results
    visualize_scaling_results(scaling_data)
else:
    print(
        "⚠️  Visualization features not available. Install with: pip install -e .[viz]"
    )


---

## ✨ Week 1 Summary

In [None]:
print("WEEK 1 ACHIEVEMENTS")
print("=" * 40)
print("✅ 3-6× token efficiency demonstrated")
print("✅ 8:1 API call reduction achieved")
print("✅ Quality maintained across batch processing")
print("✅ Scalable architecture for video integration")
print("\nNOTE: Efficiency variance is expected with short content")
print("      Video processing (Week 2+) will show more stable gains")