## Topic: AI in Personalized Learning

## IIT Ropar Minor in AI Project: Mathexpert

## Project Architecture:

![Project Architecture](data/Manim%20Project%20Architecture.png "")

### Datasets used
1. [Hugging Face Dataset] brando/olympiad-bench-imo-math-boxed-825-v2-21-08-2024 (for Math Olympiad problems)
2. [Hugging Face Dataset] nvidia/OpenMathReasoning (for calculus problems)

## Tech stack
1. ChromaDB (vector database for context retrieval)
2. BeautifulSoup4 (for web scraping the entire Manim documentation)
3. Manim (Python library for elegant animations)
4. Streamlit (for web app development)
5. SQLite3 (database for math problems)
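
As a rough illustration of the SQLite role in this stack, the sketch below stores and fetches problems with Python's built-in `sqlite3` module. The table name `problems`, its columns, and the sample row are hypothetical placeholders, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: the real project may organize its tables differently.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE problems (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        source TEXT NOT NULL,      -- e.g. the dataset the problem came from
        topic TEXT NOT NULL,       -- e.g. 'calculus' or 'olympiad'
        statement TEXT NOT NULL
    )
    """
)


def add_problem(source: str, topic: str, statement: str) -> int:
    """Insert one problem and return its row id."""
    cur = conn.execute(
        "INSERT INTO problems (source, topic, statement) VALUES (?, ?, ?)",
        (source, topic, statement),
    )
    conn.commit()
    return cur.lastrowid


def problems_by_topic(topic: str) -> list:
    """Return (id, statement) pairs for a topic, oldest first."""
    rows = conn.execute(
        "SELECT id, statement FROM problems WHERE topic = ? ORDER BY id",
        (topic,),
    )
    return rows.fetchall()


# Placeholder row for illustration only
add_problem("nvidia/OpenMathReasoning", "calculus", "Differentiate f(x) = x^2.")
print(problems_by_topic("calculus"))
```

Parameterized queries (`?` placeholders) keep problem text containing quotes or LaTeX from breaking the SQL statement.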

### Make sure to install the Python libraries from requirements.txt

In [None]:
%pip install -r requirements.txt

## First, let's set up the video animation generation (Manim) part

### script_generator.py

In [None]:
import json
from google import genai
from dotenv import load_dotenv
import os
import re
from typing import Any, Dict

load_dotenv()
api_key = os.getenv("GEMINI_API_KEY")  # same variable name that manim_scenes_generator.py reads
client = genai.Client(api_key=api_key)

def extract_json_dict(s: str) -> Dict[str, Any]:
    # Try to find a ```json ... ``` fenced block (or ``` ... ```)
    m = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", s, flags=re.DOTALL)
    if m:
        json_text = m.group(1)
    else:
        # Fallback: maybe the whole string is JSON
        json_text = s.strip()

    # Parse into dict
    obj = json.loads(json_text)

    if not isinstance(obj, dict):
        raise ValueError("Extracted JSON is not an object/dictionary")
    return obj

def generate_script_json(USER_PROMPT):
    entire_prompt = """
    You are an expert animation director and Manim engineer.
    Your job is to convert the user's request into a retrieval-optimized JSON script for generating a Manim animation.
    
    You MUST:
    - Output ONLY valid JSON (no markdown, no comments, no trailing commas).
    - Keep the plan executable: each scene should be concrete and animatable.
    - Optimize for retrieval from a documentation RAG system (Chroma) that stores Manim docs as:
      (1) section chunks (explanations),
      (2) code chunks (code blocks + nearby context),
      (3) optional symbol chunks (class/function names and signatures).
    
    CRITICAL RETRIEVAL REQUIREMENTS:
    - Every action MUST contain retrieval fields that make it easy to fetch correct docs/code:
      - exact_symbols: list of likely Manim class/function names (CamelCase or dotted paths if known)
      - queries: short natural-language semantic queries for vector search
      - tags: concise keyword tags (1–3 words) to help reranking and filtering
      - doc_area_hints: likely doc areas to search (reference/tutorials/examples/guides)
      - unknown_api_intent: if you are not sure which Manim symbol(s) to use, set this to true and provide strong fallback keywords (see below)
    
    UNKNOWN API / FALLBACK KEYWORDS (IMPORTANT):
    - If you do NOT know the exact Manim class/function to call for an action:
      - Set "unknown_api_intent": true
      - Put your best guesses (even if unsure) in exact_symbols
      - Add 3–6 targeted "tags" chosen from this controlled list:
        ["camera move","zoom","pan","follow","3D","axes","graph","plot","function graph","parametric curve",
         "vector field","number line","geometry","shapes","transform","morph","highlight","surround","brace",
         "label","text","mathtex","latex","align","arrange","vgroup","layout","fade","write","create",
         "timing","laggedstart","animationgroup","updater","always_redraw","valuetracker","tracker",
         "color gradient","stroke","fill","opacity","path","traced path","motion path","rotate","scale",
         "shift","move to","background","grid","numberplane","coordinate system","table","matrix",
         "code example","best practice","performance","renderer","cairo","opengl"]
      - Add 2–4 "queries" explicitly phrased to discover the right API, e.g.:
        "manim how to do <intent> which class/function"
        "manim <intent> example code"
        "manim reference <intent> camera/mobject/animation"
      - In "constraints", describe the intent precisely (what should happen visually), so retrieval can match.
    
    ACCURACY RULES:
    - Do not invent APIs. If unsure, mark unknown_api_intent true and use fallback keywords.
    - Prefer smaller scenes (15–40 seconds each) but adapt to user needs.
    - Ensure smooth continuity: each scene must specify transition_to_next.
    - Prefer efficient primitives and avoid heavy compute when possible.
    
    JSON schema (must match exactly):
         {
          "title": "string",
          "audience": "beginner|intermediate|advanced",
          "target_duration_seconds": 0,
          "style": {
            "aspect_ratio": "16:9|9:16|1:1",
            "pacing": "slow|medium|fast",
            "tone": "string"
          },
          "constraints": {
            "render_quality": "low|medium|high",
            "avoid_heavy_compute": true,
            "math_typesetting": true
          },
          "scenes": [
            {
              "id": "string",
              "duration_seconds": 0,
              "goal": "string",
              "voiceover": "string",
              "on_screen_text": ["string"],
              "actions": [
                {
                  "id": "string",
                  "description": "string",
                  "code_intent": "string",
                  "retrieval": {
                    "unknown_api_intent": true,
                    "exact_symbols": ["string"],
                    "queries": ["string"],
                    "tags": ["string"],
                    "doc_area_hints": ["reference|tutorial|examples|guides"]
                  }
                }
              ],
              "transition_to_next": {
                "type": "cut|fade|transform|camera_move|wipe",
                "description": "string"
              }
            }
          ],
          "retrieval_plan": {
            "collections": {
              "sections": "string",
              "code": "string",
              "symbols": "string"
            },
            "top_k": {
              "sections": 0,
              "code": 0,
              "symbols": 0
            }
          }
        }
    
    Guidance for exact_symbols (use if relevant, otherwise guesses + unknown_api_intent true):
    - Scenes/camera: Scene, MovingCameraScene, ThreeDScene, ZoomedScene
    - Text/math: Text, MarkupText, Tex, MathTex
    - Geometry/mobjects: VGroup, Group, Dot, Line, Arrow, Circle, Square, Rectangle, Polygon, Brace
    - Coordinate systems: Axes, NumberPlane
    - Animations: Create, Write, FadeIn, FadeOut, Transform, ReplacementTransform, AnimationGroup, Succession, LaggedStart, Indicate, Circumscribe
    - Dynamics: ValueTracker, always_redraw, Updater
    - Motion: Rotate, Scale, Shift, MoveTo (as method intent)
    
    doc_area_hints must be chosen only from:
    ["reference/scene","reference/mobject","reference/animation","reference/camera","reference/utils","tutorials","examples","guides"]
    
    Now, take the user's request below and produce the JSON script.
    
    USER REQUEST: 

    """
    sys_user_prompt = entire_prompt + USER_PROMPT
    response = client.models.generate_content(
        model="gemini-3-flash-preview",
        contents=sys_user_prompt,
    )
    return extract_json_dict(response.text)

# if __name__ =="__main__":
#     USER_PROMPT = """
#     Create a 60–75 second Manim animation (16:9) that teaches the concept of a derivative as “slope of the tangent line”.
    
#     Requirements:
#     1) Scene 1 (intro): Show the function f(x) = x^2 on axes. Put the title text “Derivative = slope of tangent” at the top.
#     2) Scene 2 (secant → tangent): Pick two points on the parabola at x=1 and x=1+h. Draw the secant line between them. Display the slope formula:
#        (f(1+h) - f(1)) / h
#        Animate h shrinking toward 0 and smoothly morph the secant line into the tangent line at x=1.
#     3) Scene 3 (result): Show the derivative of x^2 is 2x and evaluate at x=1 to get slope 2. Visually show the tangent line has slope 2 (a small “rise over run” marker is enough).
#     4) Use clean labels (MathTex). Keep it efficient: avoid heavy particle effects or complex 3D.
#     5) Include smooth transitions between scenes, and keep the narration concise (1–2 sentences per scene).
#     """
#     print(generate_script_json(USER_PROMPT=USER_PROMPT))
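
To see what `extract_json_dict` does with typical model output, the sketch below repeats the function from the cell above and feeds it three hypothetical replies: one wrapped in a Markdown fence, one that is bare JSON, and one that is valid JSON but not an object. The sample strings are made up for illustration; real Gemini responses will differ.

```python
import json
import re
from typing import Any, Dict


def extract_json_dict(s: str) -> Dict[str, Any]:
    """Same logic as above: prefer a fenced block, else parse the whole string."""
    m = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", s, flags=re.DOTALL)
    json_text = m.group(1) if m else s.strip()
    obj = json.loads(json_text)
    if not isinstance(obj, dict):
        raise ValueError("Extracted JSON is not an object/dictionary")
    return obj


# Hypothetical model replies (illustrative only)
fenced_reply = 'Here is the script:\n```json\n{"title": "Derivative", "scenes": []}\n```'
bare_reply = '  {"title": "Derivative", "scenes": []}  '
list_reply = "[1, 2, 3]"

print(extract_json_dict(fenced_reply)["title"])
print(extract_json_dict(bare_reply)["scenes"])

try:
    extract_json_dict(list_reply)
except ValueError as err:
    print(f"Rejected: {err}")
```

Note that a reply mixing prose with unfenced JSON would still fail in `json.loads`, so the prompt's "Output ONLY valid JSON" instruction matters for the fallback path.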

### retrieve_context.py

In [None]:
"""
Manim Animation Script Context Retriever
Efficiently retrieves relevant documentation from ChromaDB based on animation script JSON
"""

import json
import chromadb
from typing import Dict, List, Any, Optional, Set
from dataclasses import dataclass
from pathlib import Path
from collections import defaultdict
import argparse


@dataclass
class RetrievalResult:
    """Structured result from ChromaDB query"""
    chunk_id: str
    text: str
    distance: float
    class_name: str
    qualified_name: str
    category: str
    chunk_type: str
    url: str
    source_query: str  # Which query generated this result
    
    def __hash__(self):
        return hash(self.chunk_id)
    
    def __eq__(self, other):
        if isinstance(other, RetrievalResult):
            return self.chunk_id == other.chunk_id
        return False


class ManimContextRetriever:
    """
    Retrieves relevant Manim documentation context based on animation script JSON.
    Uses intelligent query generation and deduplication for efficient retrieval.
    """
    
    def __init__(self, db_path: str = "./manim_chromadb", 
                 collection_name: str = "manim_docs"):
        """
        Initialize the retriever
        
        Args:
            db_path: Path to ChromaDB storage
            collection_name: Name of the collection to query
        """
        self.db_path = db_path
        self.collection_name = collection_name
        
        # Initialize ChromaDB client
        self.client = chromadb.PersistentClient(path=db_path)
        self.collection = self.client.get_collection(name=collection_name)
        
        print(f"Connected to ChromaDB collection: {collection_name}")
        print(f"Total documents in collection: {self.collection.count()}")
    
    def retrieve_for_script(self, script_json: Dict[str, Any]) -> Dict[str, Any]:
        """
        Main retrieval function that processes entire animation script
        
        Args:
            script_json: Animation script in the specified JSON format
            
        Returns:
            Dictionary containing organized retrieval results
        """
        # Get retrieval configuration
        retrieval_plan = script_json.get('retrieval_plan', {})
        top_k = retrieval_plan.get('top_k', {})
        
        # Default top_k values if not specified
        default_top_k = {
            'sections': top_k.get('sections', 3),
            'code': top_k.get('code', 5),
            'symbols': top_k.get('symbols', 3)
        }
        
        # Collect all retrieval requests
        all_queries = self._extract_all_queries(script_json)
        
        # Execute queries and collect results
        all_results = []
        
        for query_info in all_queries:
            results = self._execute_query(
                query_text=query_info['query'],
                n_results=query_info.get('n_results', default_top_k['code']),
                category_filter=query_info.get('category_filter'),
                chunk_type_filter=query_info.get('chunk_type_filter'),
                source_query=query_info['query']
            )
            all_results.extend(results)
        
        # Deduplicate and organize results
        organized_results = self._organize_results(all_results, script_json)
        
        return organized_results
    
    def _extract_all_queries(self, script_json: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        Extract all retrieval queries from the script JSON
        
        Returns:
            List of query dictionaries with metadata
        """
        queries = []
        
        # Get global context queries
        queries.extend(self._get_global_queries(script_json))
        
        # Get scene-specific queries
        for scene in script_json.get('scenes', []):
            queries.extend(self._get_scene_queries(scene, script_json))
        
        return queries
    
    def _get_global_queries(self, script_json: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Generate queries for global script context"""
        queries = []
        
        title = script_json.get('title', '')
        audience = script_json.get('audience', 'intermediate')
        style = script_json.get('style', {})
        
        # Query for overall animation approach
        if title:
            queries.append({
                'query': f"Create animation about: {title}. Audience: {audience}",
                'n_results': 3,
                'category_filter': None,
                'chunk_type_filter': 'example',
                'context': 'global'
            })
        
        # Query for style-specific techniques
        pacing = style.get('pacing', 'medium')
        if pacing in ['slow', 'fast']:
            queries.append({
                'query': f"Animation timing and {pacing} pacing techniques",
                'n_results': 2,
                'category_filter': 'animation',
                'chunk_type_filter': None,
                'context': 'global'
            })
        
        return queries
    
    def _get_scene_queries(self, scene: Dict[str, Any], 
                          script_json: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Generate queries for a specific scene"""
        queries = []
        
        scene_id = scene.get('id', 'unknown')
        scene_goal = scene.get('goal', '')
        
        # Query based on scene goal
        if scene_goal:
            queries.append({
                'query': scene_goal,
                'n_results': 3,
                'category_filter': None,
                'chunk_type_filter': None,
                'context': f'scene:{scene_id}'
            })
        
        # Process each action in the scene
        for action in scene.get('actions', []):
            queries.extend(self._get_action_queries(action, scene_id))
        
        # Query for transition if specified
        transition = scene.get('transition_to_next', {})
        if transition.get('type'):
            queries.append({
                'query': f"{transition['type']} transition animation",
                'n_results': 2,
                'category_filter': 'animation',
                'chunk_type_filter': 'example',
                'context': f'scene:{scene_id}:transition'
            })
        
        return queries
    
    def _get_action_queries(self, action: Dict[str, Any], 
                           scene_id: str) -> List[Dict[str, Any]]:
        """Generate queries for a specific action"""
        queries = []
        
        action_id = action.get('id', 'unknown')
        retrieval = action.get('retrieval', {})
        
        # Only retrieve docs for actions flagged with unknown_api_intent
        if not retrieval.get('unknown_api_intent', False):
            return queries
        
        # Process exact symbols
        exact_symbols = retrieval.get('exact_symbols', [])
        for symbol in exact_symbols:
            queries.append({
                'query': f"class {symbol} usage and parameters",
                'n_results': 2,
                'category_filter': None,
                'chunk_type_filter': None,
                'context': f'scene:{scene_id}:action:{action_id}'
            })
        
        # Process custom queries
        custom_queries = retrieval.get('queries', [])
        for query in custom_queries:
            queries.append({
                'query': query,
                'n_results': 3,
                'category_filter': None,
                'chunk_type_filter': None,
                'context': f'scene:{scene_id}:action:{action_id}'
            })
        
        # Process tags to generate targeted queries
        tags = retrieval.get('tags', [])
        if tags:
            # Combine tags into a single query
            tag_query = ' '.join(tags)
            queries.append({
                'query': tag_query,
                'n_results': 3,
                'category_filter': None,
                'chunk_type_filter': None,
                'context': f'scene:{scene_id}:action:{action_id}'
            })
        
        # Use code_intent if available
        code_intent = action.get('code_intent', '')
        if code_intent:
            queries.append({
                'query': code_intent,
                'n_results': 4,
                'category_filter': None,
                'chunk_type_filter': 'example',
                'context': f'scene:{scene_id}:action:{action_id}'
            })
        
        return queries
    
    def _execute_query(self, query_text: str, n_results: int = 5,
                      category_filter: Optional[str] = None,
                      chunk_type_filter: Optional[str] = None,
                      source_query: str = "") -> List[RetrievalResult]:
        """
        Execute a single query against ChromaDB
        
        Args:
            query_text: The search query
            n_results: Number of results to return
            category_filter: Filter by category (animation, mobject, etc.)
            chunk_type_filter: Filter by chunk type (overview, example, parameters)
            source_query: Original query for tracking
            
        Returns:
            List of RetrievalResult objects
        """
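        # Build where clause. Chroma expects a single top-level key or operator,
        # so multiple conditions are combined with $and.
        conditions = []
        if category_filter:
            conditions.append({'category': category_filter})
        if chunk_type_filter:
            conditions.append({'type': chunk_type_filter})
        
        if not conditions:
            where_clause = None
        elif len(conditions) == 1:
            where_clause = conditions[0]
        else:
            where_clause = {'$and': conditions}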
        
        try:
            # Query ChromaDB
            results = self.collection.query(
                query_texts=[query_text],
                n_results=n_results,
                where=where_clause
            )
            
            # Convert to RetrievalResult objects
            retrieval_results = []
            for i in range(len(results['ids'][0])):
                metadata = results['metadatas'][0][i]
                retrieval_results.append(RetrievalResult(
                    chunk_id=results['ids'][0][i],
                    text=results['documents'][0][i],
                    distance=results['distances'][0][i],
                    class_name=metadata.get('class_name', 'Unknown'),
                    qualified_name=metadata.get('qualified_name', 'Unknown'),
                    category=metadata.get('category', 'Unknown'),
                    chunk_type=metadata.get('type', 'Unknown'),
                    url=metadata.get('url', ''),
                    source_query=source_query or query_text
                ))
            
            return retrieval_results
            
        except Exception as e:
            print(f"Error executing query '{query_text}': {e}")
            return []
    
    def _organize_results(self, results: List[RetrievalResult], 
                         script_json: Dict[str, Any]) -> Dict[str, Any]:
        """
        Organize and deduplicate results into structured format
        
        Args:
            results: List of all retrieval results
            script_json: Original script JSON for context
            
        Returns:
            Organized dictionary with deduplicated results
        """
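        # Deduplicate by chunk_id, keeping the closest (smallest-distance) hit
        best_by_id: Dict[str, RetrievalResult] = {}
        for r in results:
            current = best_by_id.get(r.chunk_id)
            if current is None or r.distance < current.distance:
                best_by_id[r.chunk_id] = r
        
        # Sort by distance (relevance)
        unique_results = sorted(best_by_id.values(), key=lambda x: x.distance)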
        
        # Organize by category
        by_category = defaultdict(list)
        for result in unique_results:
            by_category[result.category].append(result)
        
        # Organize by chunk type
        by_type = defaultdict(list)
        for result in unique_results:
            by_type[result.chunk_type].append(result)
        
        # Group by class
        by_class = defaultdict(list)
        for result in unique_results:
            by_class[result.qualified_name].append(result)
        
        # Build output structure
        output = {
            'metadata': {
                'total_queries': len(set(r.source_query for r in results)),
                'total_results': len(results),
                'unique_results': len(unique_results),
                'script_title': script_json.get('title', 'Untitled'),
                'audience': script_json.get('audience', 'intermediate')
            },
            'top_results': [self._result_to_dict(r) for r in unique_results[:10]],
            'by_category': {
                cat: [self._result_to_dict(r) for r in results]
                for cat, results in by_category.items()
            },
            'by_type': {
                typ: [self._result_to_dict(r) for r in results]
                for typ, results in by_type.items()
            },
            'unique_classes': list(by_class.keys()),
            'classes_detail': {
                cls: {
                    'results': [self._result_to_dict(r) for r in results],
                    'chunk_types': list(set(r.chunk_type for r in results))
                }
                for cls, results in by_class.items()
            }
        }
        
        return output
    
    def _result_to_dict(self, result: RetrievalResult) -> Dict[str, Any]:
        """Convert RetrievalResult to dictionary"""
        return {
            'chunk_id': result.chunk_id,
            'text': result.text,
            'distance': result.distance,
            'class_name': result.class_name,
            'qualified_name': result.qualified_name,
            'category': result.category,
            'chunk_type': result.chunk_type,
            'url': result.url,
            'source_query': result.source_query
        }
    
    def retrieve_by_symbols(self, symbols: List[str], 
                           n_results_per_symbol: int = 3) -> Dict[str, List[Dict[str, Any]]]:
        """
        Retrieve documentation for specific symbols/class names
        
        Args:
            symbols: List of class/function names to look up
            n_results_per_symbol: Number of results per symbol
            
        Returns:
            Dictionary mapping symbol names to results
        """
        results = {}
        
        for symbol in symbols:
            query_results = self._execute_query(
                query_text=f"class {symbol} methods parameters usage",
                n_results=n_results_per_symbol,
                source_query=f"symbol:{symbol}"
            )
            
            results[symbol] = [self._result_to_dict(r) for r in query_results]
        
        return results
    
    def get_examples_by_category(self, category: str, 
                                 n_results: int = 5) -> List[Dict[str, Any]]:
        """
        Get code examples for a specific category
        
        Args:
            category: Category to filter by (animation, mobject, etc.)
            n_results: Number of examples to retrieve
            
        Returns:
            List of example results
        """
        results = self._execute_query(
            query_text=f"{category} code examples",
            n_results=n_results,
            category_filter=category,
            chunk_type_filter='example',
            source_query=f"category:{category}"
        )
        
        return [self._result_to_dict(r) for r in results]


# def main():
#     """Command-line interface for testing queries"""
#     parser = argparse.ArgumentParser(
#         description='Retrieve Manim documentation context from ChromaDB based on animation script'
#     )
#     parser.add_argument('script_json', type=str, 
#                        help='Path to animation script JSON file')
#     parser.add_argument('--db-path', type=str, default='./manim_chromadb',
#                        help='Path to ChromaDB storage (default: ./manim_chromadb)')
#     parser.add_argument('--collection-name', type=str, default='manim_docs',
#                        help='ChromaDB collection name (default: manim_docs)')
#     parser.add_argument('--output', type=str, default=None,
#                        help='Output file for results (default: print to console)')
#     parser.add_argument('--pretty', action='store_true',
#                        help='Pretty print JSON output')
    
#     args = parser.parse_args()
    
#     # Load script JSON
#     script_path = Path(args.script_json)
#     if not script_path.exists():
#         print(f"ERROR: Script file not found: {args.script_json}")
#         return
    
#     with open(script_path, 'r') as f:
#         script_json = json.load(f)
    
#     print(f"Loaded script: {script_json.get('title', 'Untitled')}")
#     print(f"Scenes: {len(script_json.get('scenes', []))}")
    
#     # Initialize retriever
#     retriever = ManimContextRetriever(
#         db_path=args.db_path,
#         collection_name=args.collection_name
#     )
    
#     # Retrieve context
#     print("\nRetrieving context from ChromaDB...")
#     results = retriever.retrieve_for_script(script_json)
    
#     # Output results
#     if args.output:
#         with open(args.output, 'w') as f:
#             json.dump(results, f, indent=2 if args.pretty else None)
#         print(f"\nResults written to: {args.output}")
#     else:
#         if args.pretty:
#             print("\n" + "="*80)
#             print("RETRIEVAL RESULTS")
#             print("="*80)
#             print(json.dumps(results, indent=2))
#         else:
#             print(json.dumps(results))
    
#     # Print summary
#     print("\n" + "="*80)
#     print("SUMMARY")
#     print("="*80)
#     print(f"Total queries executed: {results['metadata']['total_queries']}")
#     print(f"Total results: {results['metadata']['total_results']}")
#     print(f"Unique results: {results['metadata']['unique_results']}")
#     print(f"Unique classes found: {len(results['unique_classes'])}")
#     print(f"\nClasses: {', '.join(results['unique_classes'][:10])}")
#     if len(results['unique_classes']) > 10:
#         print(f"  ... and {len(results['unique_classes']) - 10} more")
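
To make the query fan-out concrete, the sketch below mirrors `_get_action_queries` as a standalone function (no ChromaDB needed) and runs it on a hypothetical action. The action contents are invented for illustration; the field names follow the JSON schema used by `script_generator.py`.

```python
from typing import Any, Dict, List, Optional


def action_queries(action: Dict[str, Any], scene_id: str) -> List[Dict[str, Any]]:
    """Standalone mirror of ManimContextRetriever._get_action_queries."""
    retrieval = action.get("retrieval", {})
    # As in the class, actions without unknown_api_intent produce no queries
    if not retrieval.get("unknown_api_intent", False):
        return []

    context = f"scene:{scene_id}:action:{action.get('id', 'unknown')}"
    queries: List[Dict[str, Any]] = []

    def add(query: str, n_results: int, chunk_type_filter: Optional[str] = None) -> None:
        queries.append({
            "query": query,
            "n_results": n_results,
            "category_filter": None,
            "chunk_type_filter": chunk_type_filter,
            "context": context,
        })

    # One query per guessed symbol, then custom queries, tags, and code intent
    for symbol in retrieval.get("exact_symbols", []):
        add(f"class {symbol} usage and parameters", 2)
    for query in retrieval.get("queries", []):
        add(query, 3)
    if retrieval.get("tags"):
        add(" ".join(retrieval["tags"]), 3)
    if action.get("code_intent"):
        add(action["code_intent"], 4, chunk_type_filter="example")
    return queries


# Hypothetical action, shaped like the schema in script_generator.py
action = {
    "id": "a1",
    "description": "Draw the tangent line at x=1",
    "code_intent": "draw the tangent line to a parabola on axes",
    "retrieval": {
        "unknown_api_intent": True,
        "exact_symbols": ["Axes", "TangentLine"],
        "queries": ["manim tangent line example code"],
        "tags": ["graph", "plot"],
    },
}

for q in action_queries(action, "s1"):
    print(q["n_results"], q["query"])
```

A single action with two symbols, one custom query, tags, and a code intent expands into five separate vector searches, which is why deduplicating results by `chunk_id` afterwards is worthwhile.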

### manim_scenes_generator.py

In [None]:
"""
Parallel Manim Scene Code Generator
Reads prompts from directory and generates Python code using Gemini API in parallel threads
"""

import os
import json
import argparse
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from concurrent.futures import ThreadPoolExecutor, as_completed
from threading import Lock
import time
from dotenv import load_dotenv
from google import genai

# Thread-safe printing
print_lock = Lock()
load_dotenv()
api_key = os.getenv("GEMINI_API_KEY")
client = genai.Client(api_key=api_key)

def thread_safe_print(message: str):
    """Thread-safe printing function"""
    with print_lock:
        print(message)


def load_prompt_file(prompt_path: Path) -> Tuple[str, str]:
    """
    Load prompt from file
    
    Args:
        prompt_path: Path to prompt text file
        
    Returns:
        Tuple of (scene_id, prompt_text)
    """
    with open(prompt_path, 'r', encoding='utf-8') as f:
        prompt_text = f.read()
    
    # Extract scene_id from filename (remove _prompt.txt)
    scene_id = prompt_path.stem.replace('_prompt', '')
    
    return scene_id, prompt_text


def load_all_prompts(prompts_dir: str) -> List[Tuple[str, str, Path]]:
    """
    Load all prompt files from directory
    
    Args:
        prompts_dir: Directory containing prompt files
        
    Returns:
        List of tuples (scene_id, prompt_text, prompt_path)
    """
    prompts_path = Path(prompts_dir)
    
    if not prompts_path.exists():
        raise FileNotFoundError(f"Prompts directory not found: {prompts_dir}")
    
    # Find all prompt files
    prompt_files = list(prompts_path.glob("*_prompt.txt"))
    
    if not prompt_files:
        raise ValueError(f"No prompt files found in: {prompts_dir}")
    
    # Load metadata if available
    metadata_file = prompts_path / "prompts_metadata.json"
    scene_order = []
    
    if metadata_file.exists():
        with open(metadata_file, 'r') as f:
            metadata = json.load(f)
            scene_order = [s['scene_id'] for s in metadata.get('scenes', [])]
    
    # Load prompts
    prompts = []
    for prompt_file in prompt_files:
        scene_id, prompt_text = load_prompt_file(prompt_file)
        prompts.append((scene_id, prompt_text, prompt_file))
    
    # Sort by scene order if metadata available
    if scene_order:
        prompts.sort(key=lambda x: scene_order.index(x[0]) if x[0] in scene_order else 999)
    else:
        prompts.sort(key=lambda x: x[0])
    
    return prompts


def generate_code_with_gemini(prompt: str) -> str:
    """
    Generate code using Gemini API
    
    Args:
        prompt: The complete prompt for code generation
        
    Returns:
        Generated code text
    """
    
    response = client.models.generate_content(
        model="gemini-3-flash-preview",
        contents=prompt,
    )
    
    return response.text


def extract_code_from_response(response_text: str) -> str:
    """
    Extract Python code from LLM response
    Handles cases where code is wrapped in markdown code blocks
    
    Args:
        response_text: Raw response from LLM
        
    Returns:
        Clean Python code
    """
    # Check if response contains markdown code blocks
    if "```python" in response_text:
        # Extract code between ```python and ```
        parts = response_text.split("```python")
        if len(parts) > 1:
            code_part = parts[1].split("```")[0]
            return code_part.strip()
    
    elif "```" in response_text:
        # Extract code between ``` and ```
        parts = response_text.split("```")
        if len(parts) >= 3:
            return parts[1].strip()
    
    # If no code blocks, return as is (assume entire response is code)
    return response_text.strip()


def save_generated_code(scene_id: str, code: str, output_dir: str) -> Path:
    """
    Save generated code to file
    
    Args:
        scene_id: Scene identifier
        code: Generated Python code
        output_dir: Output directory
        
    Returns:
        Path to saved file
    """
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)
    
    filename = f"{scene_id}.py"
    filepath = output_path / filename
    
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(code)
    
    return filepath


def generate_scene_code(scene_id: str, prompt_text: str, prompt_path: Path,
                       output_dir: str,
                       max_retries: int = 3) -> Dict[str, any]:
    """
    Generate code for a single scene (thread worker function)
    
    Args:
        scene_id: Scene identifier
        prompt_text: The prompt for code generation
        prompt_path: Path to prompt file (for reference)
        output_dir: Output directory for generated code
        max_retries: Maximum retry attempts on failure
        
    Returns:
        Result dictionary with status and details
    """
    result = {
        'scene_id': scene_id,
        'success': False,
        'output_file': None,
        'error': None,
        'attempts': 0,
        'code_length': 0
    }
    
    thread_safe_print(f"[{scene_id}] Starting code generation...")
    
    for attempt in range(1, max_retries + 1):
        result['attempts'] = attempt
        
        try:
            # Generate code using Gemini
            thread_safe_print(f"[{scene_id}] Attempt {attempt}/{max_retries} - Calling Gemini API...")
            
            start_time = time.time()
            response_text = generate_code_with_gemini(prompt_text)
            elapsed_time = time.time() - start_time
            
            thread_safe_print(f"[{scene_id}] API call completed in {elapsed_time:.2f}s")
            
            # Extract code from response
            code = extract_code_from_response(response_text)
            result['code_length'] = len(code)
            
            if not code or len(code) < 100:
                raise ValueError(f"Generated code too short ({len(code)} chars)")
            
            # Validate code contains required imports
            if "from manim import" not in code and "import manim" not in code:
                thread_safe_print(f"[{scene_id}] WARNING: Code missing manim import, adding...")
                code = "from manim import *\n\n" + code
            
            # Save code to file
            output_file = save_generated_code(scene_id, code, output_dir)
            result['output_file'] = str(output_file)
            result['success'] = True
            
            thread_safe_print(f"[{scene_id}] ✓ Code generated successfully ({len(code)} chars) -> {output_file}")
            break
            
        except Exception as e:
            error_msg = f"Attempt {attempt} failed: {str(e)}"
            thread_safe_print(f"[{scene_id}] ✗ {error_msg}")
            result['error'] = error_msg
            
            if attempt < max_retries:
                wait_time = attempt * 2  # Linear backoff: 2s, 4s, ...
                thread_safe_print(f"[{scene_id}] Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                thread_safe_print(f"[{scene_id}] ✗ Failed after {max_retries} attempts")
    
    return result


def generate_all_scenes_parallel(prompts_dir: str, output_dir: str, 
                                 max_workers: int = 4, max_retries: int = 3) -> Dict[str, any]:
    """
    Generate code for all scenes in parallel using thread pool
    
    Args:
        prompts_dir: Directory containing prompt files
        output_dir: Output directory for generated code
        max_workers: Maximum number of parallel threads
        max_retries: Maximum retry attempts per scene
        
    Returns:
        Summary dictionary with results
    """
    # Load all prompts
    print(f"Loading prompts from: {prompts_dir}")
    prompts = load_all_prompts(prompts_dir)
    print(f"Found {len(prompts)} scenes to generate\n")
    
    # Track results
    results = []
    start_time = time.time()
    
    # Create thread pool and submit tasks
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit all tasks
        future_to_scene = {
            executor.submit(
                generate_scene_code,
                scene_id, prompt_text, prompt_path,
                output_dir, max_retries
            ): scene_id
            for scene_id, prompt_text, prompt_path in prompts
        }
        
        # Process completed tasks
        for future in as_completed(future_to_scene):
            result = future.result()
            results.append(result)
    
    elapsed_time = time.time() - start_time
    
    # Compile summary
    successful = [r for r in results if r['success']]
    failed = [r for r in results if not r['success']]
    
    summary = {
        'total_scenes': len(prompts),
        'successful': len(successful),
        'failed': len(failed),
        'elapsed_time': elapsed_time,
        'results': results,
        'output_dir': output_dir
    }
    
    return summary


def print_summary(summary: Dict[str, any]):
    """Print generation summary"""
    print("\n" + "="*80)
    print("CODE GENERATION SUMMARY")
    print("="*80)
    print(f"Total scenes: {summary['total_scenes']}")
    print(f"Successful: {summary['successful']}")
    print(f"Failed: {summary['failed']}")
    print(f"Total time: {summary['elapsed_time']:.2f}s")
    print(f"Output directory: {summary['output_dir']}")
    
    if summary['successful'] > 0:
        print("\n✓ Successful generations:")
        for result in summary['results']:
            if result['success']:
                print(f"  - {result['scene_id']}: {result['output_file']} ({result['code_length']} chars)")
    
    if summary['failed'] > 0:
        print("\n✗ Failed generations:")
        for result in summary['results']:
            if not result['success']:
                print(f"  - {result['scene_id']}: {result['error']}")
    
    print("="*80)


def save_summary(summary: Dict[str, any], output_dir: str):
    """Save generation summary to JSON file"""
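    output_path = Path(output_dir)
    # The directory may not exist yet if every scene failed before anything was saved
    output_path.mkdir(parents=True, exist_ok=True)
    summary_file = output_path / "generation_summary.json"
    
    with open(summary_file, 'w', encoding='utf-8') as f: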
        json.dump(summary, f, indent=2)
    
    print(f"\nSummary saved to: {summary_file}")


# def main():
#     """Command-line interface"""
#     parser = argparse.ArgumentParser(
#         description='Generate Manim scene code from prompts using Gemini API in parallel'
#     )
#     parser.add_argument('prompts_dir', type=str,
#                        help='Directory containing prompt files')
#     parser.add_argument('--output-dir', type=str, default='./generated_scenes',
#                        help='Output directory for generated code (default: ./generated_scenes)')
#     parser.add_argument('--max-workers', type=int, default=4,
#                        help='Maximum parallel threads (default: 4)')
#     parser.add_argument('--max-retries', type=int, default=3,
#                        help='Maximum retry attempts per scene (default: 3)')
    
#     args = parser.parse_args()
    
#     if not api_key:
#         print("ERROR: GEMINI_KEY not found in environment variables")
#         print("Please set GEMINI_KEY in your .env file")
#         return 1
    
#     print(f"Max workers: {args.max_workers}")
#     print(f"Max retries: {args.max_retries}\n")
    
#     # Generate all scenes
#     try:
#         summary = generate_all_scenes_parallel(
#             prompts_dir=args.prompts_dir,
#             output_dir=args.output_dir,
#             max_workers=args.max_workers,
#             max_retries=args.max_retries
#         )
        
#         # Print and save summary
#         print_summary(summary)
#         save_summary(summary, args.output_dir)
        
#         # Return exit code based on success
#         return 0 if summary['failed'] == 0 else 1
        
#     except Exception as e:
#         print(f"\nERROR: {str(e)}")
#         return 1


# if __name__ == "__main__":
#     exit(main())
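
The fan-out pattern above can be tried without an API key. The following is a minimal sketch, not the project's code: `fake_generate` is a hypothetical stand-in for `generate_code_with_gemini`, and `run_scene` and `run_all` are illustrative names. It keeps the same shape as the functions above (one worker per scene, a bounded retry loop, results gathered with `as_completed`) but skips the sleep between retries and writes no files.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Dict, List

_calls: Dict[str, int] = {}
_lock = threading.Lock()


def fake_generate(scene_id: str) -> str:
    """Hypothetical stand-in for the Gemini call: fails once for 'scene_2'."""
    with _lock:
        _calls[scene_id] = _calls.get(scene_id, 0) + 1
        first_call = _calls[scene_id] == 1
    if scene_id == "scene_2" and first_call:
        raise RuntimeError("simulated transient API error")
    return f"from manim import *\n\nclass Demo(Scene):\n    pass  # {scene_id}\n"


def run_scene(scene_id: str, max_retries: int = 3) -> Dict[str, Any]:
    """Bounded retry loop in the style of generate_scene_code (assumed simplification)."""
    result: Dict[str, Any] = {
        "scene_id": scene_id, "success": False, "attempts": 0, "error": None,
    }
    for attempt in range(1, max_retries + 1):
        result["attempts"] = attempt
        try:
            code = fake_generate(scene_id)
            result["code_length"] = len(code)
            result["success"] = True
            break
        except Exception as exc:
            result["error"] = f"Attempt {attempt} failed: {exc}"
    return result


def run_all(scene_ids: List[str], max_workers: int = 4) -> List[Dict[str, Any]]:
    """Submit one task per scene and collect results as they finish."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(run_scene, sid): sid for sid in scene_ids}
        return [future.result() for future in as_completed(futures)]


results = run_all(["scene_1", "scene_2", "scene_3"])
by_id = {r["scene_id"]: r for r in results}
print({sid: r["attempts"] for sid, r in sorted(by_id.items())})
```

Because results arrive in completion order, the sketch indexes them by `scene_id` before reading them, which is also a reasonable way to consume `summary['results']`.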

### parse_and_generate_prompt.py

In [None]:
"""
Manim Scene-by-Scene Prompt Generator
Generates optimized prompts for LLM code generation, one per scene for parallel rendering
"""

import json
import argparse
from typing import Dict, List, Any, Optional
from pathlib import Path
from dataclasses import dataclass
from collections import defaultdict


@dataclass
class ScenePrompt:
    """Structured prompt for a single scene"""
    scene_id: str
    scene_index: int
    prompt_text: str
    metadata: Dict[str, Any]


class ManimPromptGenerator:
    """
    Generates scene-specific prompts combining script JSON and context JSON
    Each scene gets an independent prompt for parallel code generation
    """
    
    def __init__(self, script_json: Dict[str, Any], context_json: Dict[str, Any]):
        """
        Initialize prompt generator
        
        Args:
            script_json: Animation script specification
            context_json: Retrieved documentation context from ChromaDB
        """
        self.script = script_json
        self.context = context_json
        
        # Build context lookup structures for efficient access
        self.context_by_class = self._build_class_lookup()
        self.examples_by_category = self._build_category_lookup()
    
    def _build_class_lookup(self) -> Dict[str, List[Dict[str, Any]]]:
        """Build lookup table for context by class name"""
        lookup = defaultdict(list)
        
        classes_detail = self.context.get('classes_detail', {})
        for class_name, details in classes_detail.items():
            lookup[class_name] = details.get('results', [])
        
        return dict(lookup)
    
    def _build_category_lookup(self) -> Dict[str, List[Dict[str, Any]]]:
        """Build lookup table for examples by category"""
        lookup = defaultdict(list)
        
        by_category = self.context.get('by_category', {})
        for category, results in by_category.items():
            # Filter for examples only
            examples = [r for r in results if r.get('chunk_type') == 'example']
            lookup[category] = examples
        
        return dict(lookup)
    
    def generate_all_prompts(self) -> List[ScenePrompt]:
        """
        Generate prompts for all scenes
        
        Returns:
            List of ScenePrompt objects, one per scene
        """
        prompts = []
        scenes = self.script.get('scenes', [])
        
        for idx, scene in enumerate(scenes):
            prompt = self.generate_scene_prompt(scene, idx, len(scenes))
            prompts.append(prompt)
        
        return prompts
    
    def generate_scene_prompt(self, scene: Dict[str, Any], 
                             scene_index: int, total_scenes: int) -> ScenePrompt:
        """
        Generate a complete prompt for a single scene
        
        Args:
            scene: Scene specification from script JSON
            scene_index: Index of this scene (0-based)
            total_scenes: Total number of scenes
            
        Returns:
            ScenePrompt object containing the formatted prompt
        """
        scene_id = scene.get('id', f'scene_{scene_index}')
        
        # Build prompt sections
        sections = [
            self._build_header(scene, scene_index, total_scenes),
            self._build_global_context(),
            self._build_scene_context(scene),
            self._build_relevant_documentation(scene),
            self._build_code_requirements(scene, scene_index, total_scenes),
            self._build_output_format(scene_id),
        ]
        
        prompt_text = "\n\n".join(sections)
        
        # Build metadata
        metadata = {
            'scene_id': scene_id,
            'scene_index': scene_index,
            'total_scenes': total_scenes,
            'duration': scene.get('duration_seconds', 0),
            'num_actions': len(scene.get('actions', [])),
            'has_transition': bool(scene.get('transition_to_next')),
            # `or {}` also covers an explicit null for transition_to_next in the script JSON
            'transition_type': (scene.get('transition_to_next') or {}).get('type', 'none')
        }
        
        return ScenePrompt(
            scene_id=scene_id,
            scene_index=scene_index,
            prompt_text=prompt_text,
            metadata=metadata
        )
    
    def _build_header(self, scene: Dict[str, Any], 
                     scene_index: int, total_scenes: int) -> str:
        """Build the prompt header with context"""
        title = self.script.get('title', 'Untitled Animation')
        scene_id = scene.get('id', f'scene_{scene_index}')
        
        header = f"""# MANIM SCENE GENERATION REQUEST

## Animation Project: {title}
**Scene {scene_index + 1} of {total_scenes}** (ID: `{scene_id}`)

You are tasked with generating a single, self-contained Manim scene class that will be rendered independently as part of a larger animation project. This scene will be rendered in parallel with other scenes and later composited together.

**CRITICAL**: This scene must be completely independent and not rely on any state from previous scenes."""
        
        return header
    
    def _build_global_context(self) -> str:
        """Build global context section from script"""
        audience = self.script.get('audience', 'intermediate')
        style = self.script.get('style', {})
        constraints = self.script.get('constraints', {})
        
        context = f"""## Global Project Context

### Audience
- **Level**: {audience}
- **Explanation depth**: {"detailed explanations" if audience == "beginner" else "concise" if audience == "advanced" else "moderate explanations"}

### Visual Style
- **Aspect ratio**: {style.get('aspect_ratio', '16:9')}
- **Pacing**: {style.get('pacing', 'medium')}
- **Tone**: {style.get('tone', 'educational')}

### Technical Constraints
- **Render quality**: {constraints.get('render_quality', 'medium')}
- **Avoid heavy compute**: {constraints.get('avoid_heavy_compute', True)}
- **Math typesetting**: {constraints.get('math_typesetting', True)}"""
        
        return context
    
    def _build_scene_context(self, scene: Dict[str, Any]) -> str:
        """Build scene-specific context"""
        duration = scene.get('duration_seconds', 0)
        goal = scene.get('goal', '')
        voiceover = scene.get('voiceover', '')
        on_screen_text = scene.get('on_screen_text', [])
        
        context = f"""## Scene Specification

### Scene Goal
{goal}

### Duration
**Target**: {duration} seconds

### Voiceover Script
```
{voiceover}
```

### On-Screen Text Elements
{self._format_list(on_screen_text) if on_screen_text else "No additional on-screen text"}

### Actions Required
The scene must accomplish the following actions in sequence:
"""
        
        # Add actions
        actions = scene.get('actions', [])
        for idx, action in enumerate(actions, 1):
            action_id = action.get('id', f'action_{idx}')
            description = action.get('description', '')
            code_intent = action.get('code_intent', '')
            
            context += f"\n**Action {idx}** (ID: `{action_id}`)\n"
            context += f"- Description: {description}\n"
            if code_intent:
                context += f"- Code intent: {code_intent}\n"
        
        return context
    
    def _build_relevant_documentation(self, scene: Dict[str, Any]) -> str:
        """Build documentation section with relevant context"""
        doc_section = """## Relevant Manim Documentation

Below is curated documentation relevant to this scene's requirements. You can use these as reference/context for correct API usage, parameters, and patterns.

"""
        
        # Collect all symbols needed for this scene
        symbols_used = set()
        categories_used = set()
        
        for action in scene.get('actions', []):
            retrieval = action.get('retrieval', {})
            exact_symbols = retrieval.get('exact_symbols', [])
            symbols_used.update(exact_symbols)
            
            # Extract categories from tags
            tags = retrieval.get('tags', [])
            for tag in tags:
                if tag in ['animation', 'mobject', 'scene', 'camera', 'utility']:
                    categories_used.add(tag)
        
        # Add documentation for each unique symbol
        if symbols_used:
            doc_section += "### Class Documentation\n\n"
            
            for symbol in sorted(symbols_used):
                symbol_docs = self._get_symbol_documentation(symbol)
                if symbol_docs:
                    doc_section += symbol_docs + "\n\n"
        
        # Add relevant examples by category
        if categories_used:
            doc_section += "### Relevant Code Examples\n\n"
            
            for category in sorted(categories_used):
                examples = self.examples_by_category.get(category, [])
                if examples:
                    doc_section += f"#### {category.capitalize()} Examples\n\n"
                    # Include top 2 examples per category
                    for example in examples[:2]:
                        doc_section += self._format_example(example) + "\n"
        
        # Add top general results if no specific docs found
        if not symbols_used and not categories_used:
            doc_section += "### General Reference\n\n"
            top_results = self.context.get('top_results', [])[:3]
            for result in top_results:
                doc_section += self._format_context_result(result) + "\n"
        
        return doc_section
    
    def _get_symbol_documentation(self, symbol: str) -> Optional[str]:
        """Get documentation for a specific symbol/class"""
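        # Prefer an exact (or unqualified-name) match, then fall back to substring matches
        names = list(self.context_by_class)
        exact = [n for n in names if n == symbol or n.rsplit('.', 1)[-1] == symbol]
        for class_name in exact or names:
            if symbol in class_name:
                results = self.context_by_class[class_name]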
                # Find overview chunk
                overview = next((r for r in results if r.get('chunk_type') == 'overview'), None)
                parameters = next((r for r in results if r.get('chunk_type') == 'parameters'), None)
                example = next((r for r in results if r.get('chunk_type') == 'example'), None)
                
                doc = f"#### `{class_name}`\n\n"
                
                if overview:
                    doc += f"{overview['text']}\n\n"
                
                if parameters:
                    doc += "**Parameters:**\n```\n" + parameters['text'] + "\n```\n\n"
                
                if example:
                    doc += "**Example:**\n```python\n" + example['text'] + "\n```\n\n"
                
                return doc
        
        return None
    
    def _format_example(self, example: Dict[str, Any]) -> str:
        """Format a code example for display"""
        class_name = example.get('class_name', 'Unknown')
        text = example.get('text', '')
        
        return f"**{class_name}**\n```python\n{text}\n```\n"
    
    def _format_context_result(self, result: Dict[str, Any]) -> str:
        """Format a generic context result"""
        qualified_name = result.get('qualified_name', 'Unknown')
        text = result.get('text', '')
        chunk_type = result.get('chunk_type', 'unknown')
        
        return f"**{qualified_name}** ({chunk_type})\n```\n{text}\n```\n"
    
    def _build_code_requirements(self, scene: Dict[str, Any], 
                                scene_index: int, total_scenes: int) -> str:
        """Build code generation requirements section"""
        scene_id = scene.get('id', f'scene_{scene_index}')
        duration = scene.get('duration_seconds', 0)
        transition = scene.get('transition_to_next') or {}
        
        requirements = f"""## Code Generation Requirements

### Class Structure
Generate a single Python class that:
1. **Class name**: `{self._to_class_name(scene_id)}`
2. **Inherits from**: `Scene` (from manim)
3. **Single method**: `construct(self)` containing all animation logic

### Scene Independence
- This scene will be rendered separately and composited later
- Do NOT reference any objects or state from other scenes
- Initialize all objects within this scene's `construct` method
- The scene should be completely self-contained

### Timing Requirements
- **Target duration**: {duration} seconds
- Plan animations to fit within this timeframe
- Use appropriate `run_time` parameters for animations
- Consider using `self.wait()` for pacing"""
        
        # Add transition requirements
        if transition.get('type') and scene_index < total_scenes - 1:
            trans_type = transition.get('type')
            trans_desc = transition.get('description', '')
            
            requirements += f"""

### Transition to Next Scene
This scene should END with a state that enables a `{trans_type}` transition to the next scene.
- **Transition type**: {trans_type}
- **Description**: {trans_desc}

**Requirements for {trans_type} transition:**
"""
            
            if trans_type == 'fade':
                requirements += """- End with objects in their final positions
- The compositor will apply a fade effect between scenes"""
            
            elif trans_type == 'transform':
                requirements += """- Export the final state of key objects
- Objects should be in positions suitable for morphing to next scene
- Use clear naming for objects that will transform"""
            
            elif trans_type == 'camera_move':
                requirements += """- Set camera to final position at end
- The next scene will pick up from this camera state"""
            
            elif trans_type == 'wipe':
                requirements += """- End with a clean composition
- Objects should be arranged for wipe transition"""
            
            else:  # cut or other
                requirements += """- End with a complete visual state
- Ensure clean ending frame"""
        
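        else:
            # Reached for the final scene and for scenes without a transition type
            is_final = scene_index >= total_scenes - 1
            ending_note = ("This is the final scene." if is_final
                           else "No transition is specified for this scene.")
            requirements += f"""

### Scene Ending
{ending_note} End with a clean, complete state."""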
        
        requirements += """

### Best Practices
1. **Import statements**: Include all necessary imports at the top
2. **Comments**: Add clear comments explaining key steps
3. **Error handling**: Use safe animations that won't fail during render
4. **Performance**: Avoid unnecessary complexity given the constraints
5. **Readability**: Use clear variable names and logical structure"""
        
        return requirements
    
    def _build_output_format(self, scene_id: str) -> str:
        """Build output format instructions"""
        class_name = self._to_class_name(scene_id)
        
        output = f"""## Output Format

Provide ONLY the complete Python code for the scene class. No explanations before or after.

### Required Structure:
```python
from manim import *

class {class_name}(Scene):
    def construct(self):
        # Your animation code here
        pass
```

### File naming:
This code will be saved as: `{scene_id}.py`

### Rendering:
The scene will be rendered using:
```bash
manim -qm {scene_id}.py {class_name}
```

Generate the complete, working Manim scene code now."""
        
        return output
    
    def _to_class_name(self, scene_id: str) -> str:
        """Convert scene ID to valid Python class name"""
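        # Treat every non-alphanumeric character as a separator, then convert to PascalCase
        cleaned = ''.join(ch if ch.isalnum() else '_' for ch in scene_id)
        class_name = ''.join(word.capitalize() for word in cleaned.split('_') if word)
        # A Python identifier cannot start with a digit
        if class_name[:1].isdigit():
            class_name = 'Scene' + class_name
        return class_name + 'Scene'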
    
    def _format_list(self, items: List[str]) -> str:
        """Format a list of strings as markdown"""
        return '\n'.join(f"- {item}" for item in items)
    
    def save_prompts(self, prompts: List[ScenePrompt], output_dir: str):
        """
        Save prompts to individual files
        
        Args:
            prompts: List of generated prompts
            output_dir: Directory to save prompt files
        """
        output_path = Path(output_dir)
        output_path.mkdir(parents=True, exist_ok=True)
        
        # Save each prompt
        for prompt in prompts:
            filename = f"{prompt.scene_id}_prompt.txt"
            filepath = output_path / filename
            
            with open(filepath, 'w', encoding='utf-8') as f:
                f.write(prompt.prompt_text)
            
            print(f"Saved prompt: {filepath}")
        
        # Save metadata summary
        metadata_file = output_path / "prompts_metadata.json"
        metadata = {
            'total_prompts': len(prompts),
            'scenes': [
                {
                    'scene_id': p.scene_id,
                    'scene_index': p.scene_index,
                    'prompt_file': f"{p.scene_id}_prompt.txt",
                    **p.metadata
                }
                for p in prompts
            ]
        }
        
        with open(metadata_file, 'w') as f:
            json.dump(metadata, f, indent=2)
        
        print(f"\nSaved metadata: {metadata_file}")
        print(f"Total prompts generated: {len(prompts)}")


# def main():
#     """Command-line interface"""
#     parser = argparse.ArgumentParser(
#         description='Generate scene-specific prompts for Manim code generation'
#     )
#     parser.add_argument('script_json', type=str,
#                        help='Path to animation script JSON file')
#     parser.add_argument('context_json', type=str,
#                        help='Path to retrieved context JSON file')
#     parser.add_argument('--output-dir', type=str, default='./prompts',
#                        help='Directory to save generated prompts (default: ./prompts)')
#     parser.add_argument('--print-prompts', action='store_true',
#                        help='Print prompts to console in addition to saving')
#     parser.add_argument('--scene-index', type=int, default=None,
#                        help='Generate prompt for specific scene index only (0-based)')
    
#     args = parser.parse_args()
    
#     # Load input files
#     print("Loading input files...")
    
#     with open(args.script_json, 'r') as f:
#         script_json = json.load(f)
    
#     with open(args.context_json, 'r') as f:
#         context_json = json.load(f)
    
#     print(f"Script: {script_json.get('title', 'Untitled')}")
#     print(f"Scenes: {len(script_json.get('scenes', []))}")
#     print(f"Context entries: {context_json.get('metadata', {}).get('unique_results', 0)}")
    
#     # Generate prompts
#     print("\nGenerating prompts...")
#     generator = ManimPromptGenerator(script_json, context_json)
    
#     if args.scene_index is not None:
#         # Generate single scene prompt
#         scenes = script_json.get('scenes', [])
#         if 0 <= args.scene_index < len(scenes):
#             scene = scenes[args.scene_index]
#             prompt = generator.generate_scene_prompt(
#                 scene, args.scene_index, len(scenes)
#             )
#             prompts = [prompt]
#             print(f"Generated prompt for scene {args.scene_index}")
#         else:
#             print(f"ERROR: Scene index {args.scene_index} out of range (0-{len(scenes)-1})")
#             return
#     else:
#         # Generate all prompts
#         prompts = generator.generate_all_prompts()
#         print(f"Generated {len(prompts)} prompts")
    
#     # Save prompts
#     generator.save_prompts(prompts, args.output_dir)
    
#     # Optionally print prompts
#     if args.print_prompts:
#         print("\n" + "="*80)
#         for prompt in prompts:
#             print(f"\n{'='*80}")
#             print(f"PROMPT FOR SCENE: {prompt.scene_id}")
#             print(f"{'='*80}\n")
#             print(prompt.prompt_text)
#             print("\n")
    
#     print("\n✓ Prompt generation complete!")
#     print(f"Prompts saved to: {args.output_dir}")


# if __name__ == "__main__":
#     main()
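
For reference, the input shapes that `ManimPromptGenerator` reads can be inferred from its `.get()` calls. The sketch below is illustrative only: every sample value (title, scene text, documentation text) is made up, and `collect_retrieval_targets` is a hypothetical helper that mirrors how `_build_relevant_documentation` gathers symbols and categories from a scene's actions.

```python
from typing import Any, Dict, Set, Tuple

# Keys follow the .get() calls in ManimPromptGenerator; all values are made-up samples.
script_json: Dict[str, Any] = {
    "title": "Derivative as a slope",
    "audience": "beginner",
    "style": {"aspect_ratio": "16:9", "pacing": "medium", "tone": "educational"},
    "constraints": {"render_quality": "medium", "avoid_heavy_compute": True},
    "scenes": [
        {
            "id": "intro_tangent",
            "duration_seconds": 12,
            "goal": "Show a tangent line touching a curve",
            "voiceover": "A tangent line touches the curve at a single point.",
            "on_screen_text": ["Tangent line"],
            "actions": [
                {
                    "id": "a1",
                    "description": "Draw the axes",
                    "code_intent": "Create an Axes object and animate it",
                    "retrieval": {
                        "exact_symbols": ["Axes", "Create"],
                        "tags": ["mobject", "plotting"],
                    },
                }
            ],
            "transition_to_next": {"type": "fade", "description": "Fade out"},
        }
    ],
}

context_json: Dict[str, Any] = {
    "classes_detail": {
        "manim.mobject.graphing.coordinate_systems.Axes": {
            "results": [{"chunk_type": "overview", "text": "Creates a set of axes."}]
        }
    },
    "by_category": {
        "mobject": [{"chunk_type": "example", "class_name": "Axes", "text": "ax = Axes()"}]
    },
    "top_results": [],
}

# Same category whitelist as the tag filter in _build_relevant_documentation
KNOWN_CATEGORIES = {"animation", "mobject", "scene", "camera", "utility"}


def collect_retrieval_targets(scene: Dict[str, Any]) -> Tuple[Set[str], Set[str]]:
    """Gather exact symbols and recognised category tags across a scene's actions."""
    symbols: Set[str] = set()
    categories: Set[str] = set()
    for action in scene.get("actions", []):
        retrieval = action.get("retrieval", {})
        symbols.update(retrieval.get("exact_symbols", []))
        categories.update(t for t in retrieval.get("tags", []) if t in KNOWN_CATEGORIES)
    return symbols, categories


symbols, categories = collect_retrieval_targets(script_json["scenes"][0])
# Class entries whose name contains one of the requested symbols
matched = [name for name in context_json["classes_detail"] if any(s in name for s in symbols)]
print(sorted(symbols), sorted(categories), matched)
```

Tags outside the whitelist (here `plotting`) are ignored, and a symbol with no entry in `classes_detail` (here `Create`) simply contributes no documentation to the prompt.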

## Next, set up the database of Manim documentation

In [None]:
!python -m utils.parse_and_store manim_docs_site/docs.manim.community/en/stable

## Next, build the database of math problems

In [None]:
!python -m utils.generate_calculus_database
!python -m utils.generate_problems_database
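
The schema these two scripts produce is not shown in this notebook. As a rough illustration only, a SQLite problem store could look like the sketch below. The table name `problems`, its columns, and the sample rows are all assumptions, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

# Hypothetical schema: the real scripts may use different tables and columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE problems (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT NOT NULL,
        problem TEXT NOT NULL,
        solution TEXT
    )
    """
)

# Made-up sample rows, one per dataset family mentioned at the top of the notebook
sample_rows = [
    ("calculus", "Differentiate x^2 with respect to x.", "2x"),
    ("olympiad", "Show that n^2 + n is even for every integer n.",
     "n^2 + n = n(n + 1), a product of two consecutive integers."),
]
conn.executemany(
    "INSERT INTO problems (topic, problem, solution) VALUES (?, ?, ?)",
    sample_rows,
)
conn.commit()

# Parameterised query: fetch every problem for one topic
calculus = conn.execute(
    "SELECT problem, solution FROM problems WHERE topic = ?", ("calculus",)
).fetchall()
print(calculus)
conn.close()
```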

## Now moving on to the main application