# 🔶 Multi-Language Code Complexity Annotator

## Why I Built This

Understanding time complexity (Big-O notation) is crucial for writing efficient algorithms, identifying bottlenecks, making informed optimization decisions, and passing technical interviews.

Analyzing complexity manually is tedious and error-prone. This tool **automates** the entire process—detecting loops, recursion, and functions, then annotating code with Big-O estimates and explanations.

---

## What This Does

This app analyzes source code and automatically:
- 📊 Detects loops, recursion, and functions
- 🧮 Estimates Big-O complexity (O(1), O(n), O(n²), etc.)
- 💬 Inserts inline comments explaining the complexity
- 🎨 Generates syntax-highlighted previews
- 🤖 **Optional:** Gets AI-powered code review from LLaMA

**Supports 13 languages:** Python • JavaScript • TypeScript • Java • C/C++ • C# • Go • PHP • Swift • Ruby • Kotlin • Rust

**Tech:** HuggingFace Transformers • LLaMA 3.2 • Gradio UI • Pygments • Regex Analysis

---

**Use Case:** Upload your code → Get instant complexity analysis → Optimize with confidence


## Step 1: Install Dependencies

Installing the complete stack:
- **Transformers** - HuggingFace library for loading LLaMA models
- **Accelerate** - Fast distributed training/inference
- **Gradio** - Beautiful web interface
- **PyTorch** (CPU version) - Deep learning framework
- **BitsAndBytes** - 4/8-bit quantization for large models
- **Pygments** - Syntax highlighting engine
- **Python-dotenv** - Environment variable management

**Note:** This installs the CPU-only version of PyTorch. For GPU support, point `--extra-index-url` at the PyTorch CUDA wheel index that matches your CUDA version.


In [12]:
!uv pip -q install -U pip
!uv pip -q install transformers accelerate gradio torch --extra-index-url https://download.pytorch.org/whl/cpu
!uv pip -q install bitsandbytes pygments python-dotenv

## Step 2: Core Configuration & Imports

Setting up:
- **Environment variables** to suppress progress bars (prevents Jupyter ContextVar issues)
- **Dummy tqdm** class to avoid notebook conflicts
- **Language mappings** for the 13 supported programming languages
- **Complexity constants** for Big-O estimation
- **Comment syntax** for each language (# vs //)

**Key Configurations:**
- Max file size: 2 MB
- Default model: `meta-llama/Llama-3.2-1B`
- Supported file extensions and their language identifiers


In [13]:
import os
import re
import io
import json
import time
import math
from dataclasses import dataclass
from typing import Tuple, List, Dict, Optional, Generator

# Quiet library output; TQDM_DISABLE turns off tqdm progress bars (avoids Jupyter ContextVar issues)
os.environ["TRANSFORMERS_NO_ADVISORY_WARNINGS"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["TQDM_DISABLE"] = "1"  # Completely disable tqdm

# Provide a module-level lock expected by some integrations
class _DummyLock:
    def __enter__(self):
        return self
    def __exit__(self, *args):
        pass

class _DummyTqdm:
    """Dummy tqdm that does nothing - prevents Jupyter notebook ContextVar errors"""
    def __init__(self, *args, **kwargs):
        self.iterable = args[0] if args else None
        self.total = kwargs.get('total', 0)
        self.n = 0
    def __iter__(self):
        return iter(self.iterable) if self.iterable else iter([])
    def __enter__(self):
        return self
    def __exit__(self, *args):
        pass
    def update(self, n=1, *args, **kwargs):
        self.n += n
    def close(self):
        pass
    def set_description(self, *args, **kwargs):
        pass
    def set_postfix(self, *args, **kwargs):
        pass
    def refresh(self, *args, **kwargs):
        pass
    def clear(self, *args, **kwargs):
        pass
    def write(self, *args, **kwargs):
        pass
    def reset(self, total=None):
        self.n = 0
        if total is not None:
            self.total = total
    @staticmethod
    def get_lock():
        """Return a dummy lock to avoid ContextVar issues"""
        return _DummyLock()
    
    @staticmethod
    def set_lock(lock=None):
        """Dummy set_lock method - does nothing"""
        pass

def _dummy_get_lock():
    """Module-level get_lock function"""
    return _DummyLock()

def _dummy_set_lock(lock=None):
    """Module-level set_lock function - does nothing"""
    pass

# Import and immediately patch tqdm before transformers can use it
def _patch_tqdm():
    """Patch tqdm to avoid ContextVar errors in Jupyter"""
    import sys  # Import sys here since it's not available in outer scope
    try:
        import tqdm
        import tqdm.auto
        import tqdm.notebook

        # Patch classes
        tqdm.tqdm = _DummyTqdm
        tqdm.auto.tqdm = _DummyTqdm
        tqdm.notebook.tqdm = _DummyTqdm

        # Patch module-level functions that other code might call directly
        tqdm.get_lock = _dummy_get_lock
        tqdm.auto.get_lock = _dummy_get_lock
        tqdm.notebook.get_lock = _dummy_get_lock
        tqdm.set_lock = _dummy_set_lock
        tqdm.auto.set_lock = _dummy_set_lock
        tqdm.notebook.set_lock = _dummy_set_lock

        # Also patch in sys.modules to catch any dynamic imports
        sys.modules['tqdm'].tqdm = _DummyTqdm
        sys.modules['tqdm.auto'].tqdm = _DummyTqdm
        sys.modules['tqdm.notebook'].tqdm = _DummyTqdm
        sys.modules['tqdm'].get_lock = _dummy_get_lock
        sys.modules['tqdm.auto'].get_lock = _dummy_get_lock
        sys.modules['tqdm.notebook'].get_lock = _dummy_get_lock
        sys.modules['tqdm'].set_lock = _dummy_set_lock
        sys.modules['tqdm.auto'].set_lock = _dummy_set_lock
        sys.modules['tqdm.notebook'].set_lock = _dummy_set_lock

    except ImportError:
        pass

_patch_tqdm()

from dotenv import load_dotenv

SUPPORTED_EXTENSIONS = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".java": "java",
    ".c": "c",
    ".h": "c",
    ".cpp": "cpp",
    ".cc": "cpp",
    ".hpp": "cpp",
    ".cs": "csharp",
    ".go": "go",
    ".php": "php",
    ".swift": "swift",
    ".rb": "ruby",
    ".kt": "kotlin",
    ".rs": "rust",
}

COMMENT_SYNTAX = {
    "python": "#",
    "javascript": "//",
    "typescript": "//",
    "java": "//",
    "c": "//",
    "cpp": "//",
    "csharp": "//",
    "go": "//",
    "php": "//",
    "swift": "//",
    "ruby": "#",
    "kotlin": "//",
    "rust": "//",
}

MAX_FILE_SIZE_MB = 2.0
# Llama 3.2 1B - The actual model name (not -Instruct suffix)
# Requires Meta approval: https://huggingface.co/meta-llama/Llama-3.2-1B
DEFAULT_MODEL_ID = "meta-llama/Llama-3.2-1B"
DEVICE_HINT = "auto"

# Global token storage (set in Step 3 to avoid Jupyter ContextVar issues)
HUGGINGFACE_TOKEN = None

# Complexity estimation constants
LOOP_KEYWORDS = [r"\bfor\b", r"\bwhile\b"]

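FUNCTION_PATTERNS = [
    r"^\s*def\s+([A-Za-z_]\w*)\s*\(",                                            # Python
    # Java/C#/C++; the lookahead keeps statements such as `return foo(` or
    # `raise Error(` from being mistaken for a `<type> <name>(` declaration
    r"^\s*(?:public|private|protected)?\s*(?:static\s+)?(?!(?:return|else|new|throw|raise|assert|case|await|yield|not|del)\b)[A-Za-z_<>\[\]]+\s+([A-Za-z_]\w*)\s*\(",
    r"^\s*function\s+([A-Za-z_]\w*)\s*\(",                                       # JavaScript
    r"^\s*(?:const|let|var)\s+([A-Za-z_]\w*)\s*=\s*\(",                          # JavaScript arrow/function
]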

COMPLEXITY_ORDER = {
    "O(1)": 0,
    "O(log n)": 1,
    "O(n)": 2,
    "O(n log n)": 3,
    "O(n^2)": 4,
    "O(n^3)": 5,
}

RECURSION_PATTERNS = {
    "divide_conquer": r"\b(n/2|n >> 1|n>>1|n\s*//\s*2|mid\b)",
}

# HTML syntax highlighting styles (orange comments)
SYNTAX_HIGHLIGHT_CSS = """<style>
.codehilite .c, .codehilite .c1, .codehilite .cm, .codehilite .cp {
    color: orange !important;
    font-style: italic;
}
.codehilite {
    background: #0b0b0b11;
    padding: 12px;
    border-radius: 10px;
    overflow: auto;
    font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', 'Consolas', monospace;
    font-size: 14px;
    line-height: 1.5;
}
</style>"""

## Step 3: Load HuggingFace Token

Loading authentication token from `.env` file to access gated models like LLaMA.

**Why?** Meta's LLaMA models require:
1. Accepting their license agreement on HuggingFace
2. Using an access token for authentication

**Create a `.env` file with:**
```
HF_TOKEN=hf_your_token_here
```

Get your token at: https://huggingface.co/settings/tokens
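
The token handling described above can be sketched without any third-party package. This is a minimal illustration, not the notebook's loader: `read_hf_token` and `mask_token` are hypothetical helper names, and a plain dictionary stands in for the environment that `python-dotenv` would populate.

```python
import os

def read_hf_token(env=os.environ, key="HF_TOKEN"):
    """Return the stripped token from `env`, or None when it is missing or blank."""
    token = env.get(key, "").strip()
    return token or None

def mask_token(token, visible=4):
    """Show only the last `visible` characters so logs never contain the full secret."""
    if token is None:
        return "<no token>"
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]

# Example with an in-memory mapping instead of the real environment
fake_env = {"HF_TOKEN": "  hf_example_token_1234  "}
token = read_hf_token(fake_env)
print(mask_token(token))
```

Passing the mapping as an argument keeps the helper easy to test and avoids touching `os.environ` when that is not wanted.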


In [14]:
load_dotenv()

# Load token from .env file
HF_TOKEN = os.getenv("HF_TOKEN", "").strip()

# Store in global variable to avoid Jupyter ContextVar issues with os.environ
global HUGGINGFACE_TOKEN

if HF_TOKEN:
    os.environ["HUGGING_FACE_HUB_TOKEN"] = HF_TOKEN
    HUGGINGFACE_TOKEN = HF_TOKEN  # Store in global variable
    print("✅ Hugging Face token loaded successfully from .env file")
    print(f"   Token length: {len(HF_TOKEN)} characters")
else:
    print("⚠️ No HF_TOKEN found in .env file. Gated models may not work.")
    HUGGINGFACE_TOKEN = None

✅ Hugging Face token loaded successfully from .env file
   Token length: 37 characters


## Step 4: Language Detection Functions

Two simple but essential utilities:

1. **`detect_language(filename)`** - Detects programming language from file extension
2. **`comment_prefix_for(lang)`** - Returns the comment symbol for that language (# or //)

These enable the tool to automatically adapt to any supported language.
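
As a rough illustration of the two lookups chained together, the sketch below uses reduced stand-in tables (`EXT_TO_LANG` and `LANG_TO_COMMENT` are hypothetical names; the notebook's own tables are `SUPPORTED_EXTENSIONS` and `COMMENT_SYNTAX`) and a hypothetical helper `sketch_comment_prefix`.

```python
import os

# Reduced stand-in tables for illustration; the notebook's real tables cover more languages.
EXT_TO_LANG = {".py": "python", ".js": "javascript", ".rb": "ruby", ".rs": "rust"}
LANG_TO_COMMENT = {"python": "#", "javascript": "//", "ruby": "#", "rust": "//"}

def sketch_comment_prefix(filename):
    """Map a filename to its comment prefix via the (case-insensitive) extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in EXT_TO_LANG:
        raise ValueError(f"Unsupported file extension '{ext}'")
    return LANG_TO_COMMENT[EXT_TO_LANG[ext]]

for name in ("sort.PY", "app.js", "lib.rs"):
    print(name, "->", sketch_comment_prefix(name))
```

Lower-casing the extension means `sort.PY` and `sort.py` resolve to the same language, and an unknown extension fails loudly instead of silently defaulting to one comment style.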


In [15]:
def detect_language(filename: str) -> str:
    """
    Detect programming language based on file extension.
    
    Args:
        filename: Name of the file (must have a supported extension)
    
    Returns:
        Language identifier string (e.g., 'python', 'javascript', etc.)
    
    Raises:
        ValueError: If file extension is not supported
    """
    ext = os.path.splitext(filename)[1].lower()
    
    if not ext:
        supported = ", ".join(sorted(SUPPORTED_EXTENSIONS.keys()))
        raise ValueError(f"File has no extension. Supported extensions: {supported}")
    
    if ext not in SUPPORTED_EXTENSIONS:
        supported = ", ".join(sorted(SUPPORTED_EXTENSIONS.keys()))
        raise ValueError(f"Unsupported file extension '{ext}'. Supported extensions: {supported}")
    
    return SUPPORTED_EXTENSIONS[ext]

def comment_prefix_for(lang: str) -> str:
    """
    Get the comment prefix for a given language.
    
    Args:
        lang: Language identifier (e.g., 'python', 'javascript')
    
    Returns:
        Comment prefix string (e.g., '#' or '//')
    
    Raises:
        ValueError: If language is not supported
    """
    if lang not in COMMENT_SYNTAX:
        raise ValueError(f"Unsupported language '{lang}'. Supported: {', '.join(sorted(COMMENT_SYNTAX.keys()))}")
    
    return COMMENT_SYNTAX[lang]

## Step 5: Complexity Estimation Engine

The core analysis logic using **heuristic pattern matching**:

**How it works:**
1. **Detect blocks** - Find all functions, loops, and recursion using regex patterns
2. **Analyze loops** - Count nesting depth (1 loop = O(n), 2 nested = O(n²), etc.)
3. **Analyze recursion** - Distinguish divide-and-conquer (O(log n)), single-call linear (O(n)), and multi-call exponential (O(2^n)) recursion
4. **Aggregate** - Functions inherit the worst complexity of their inner operations
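
The loop-counting idea in step 2 can be sketched in isolation. The following is a simplified, indentation-only version for Python-style code, not the notebook's implementation; `max_loop_nesting` and `nesting_to_big_o` are hypothetical names, and the regex will also match `for`/`while` inside comments or strings.

```python
import re

LOOP_RE = re.compile(r"\b(?:for|while)\b")

def max_loop_nesting(source):
    """Return the deepest loop nesting in indentation-structured (Python-style) code."""
    open_loops = []  # indentation of each loop that is still open
    deepest = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines say nothing about block structure
        indent = len(line) - len(line.lstrip(" "))
        # A line at or left of an open loop's indentation closes that loop
        while open_loops and indent <= open_loops[-1]:
            open_loops.pop()
        if LOOP_RE.search(stripped):
            open_loops.append(indent)
            deepest = max(deepest, len(open_loops))
    return deepest

def nesting_to_big_o(depth):
    """Translate a nesting depth into a Big-O label."""
    if depth == 0:
        return "O(1)"
    if depth == 1:
        return "O(n)"
    return f"O(n^{depth})"

sample = """
for i in range(n):
    for j in range(n):
        total += i * j
for k in range(n):
    total += k
"""
print(nesting_to_big_o(max_loop_nesting(sample)))
```

The third loop is a sibling of the first, not a child, so it does not raise the estimate beyond the quadratic pair.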

**Key Functions:**
- `detect_blocks()` - Pattern matching for code structures
- `analyze_recursion()` - Identifies recursive patterns
- `analyze_loop_complexity()` - Counts nested loops
- `estimate_complexity()` - Orchestrates the full analysis
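
To make the recursion heuristic concrete, here is a small standalone sketch under simplifying assumptions: `classify_recursion` is a hypothetical helper, it receives the function body without the `def` line, and it only recognises halving written in terms of a variable named `n`.

```python
import re

def classify_recursion(name, body):
    """Classify a function body by counting self-calls; `body` excludes the def line."""
    calls = len(re.findall(rf"\b{re.escape(name)}\s*\(", body))
    if calls == 0:
        return "not recursive"
    # Halving the input (n // 2, n / 2, n >> 1) hints at logarithmic depth
    if re.search(r"n\s*//\s*2|n\s*/\s*2|n\s*>>\s*1", body):
        return "O(log n)"
    return "O(2^n)" if calls >= 2 else "O(n)"

fib_body = "    if n < 2:\n        return n\n    return fib(n - 1) + fib(n - 2)\n"
print(classify_recursion("fib", fib_body))
```

Because the halving check runs first, a function that halves its input but recurses twice (merge sort, for example) is still labelled O(log n); estimates of this kind are a starting point, not a proof.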


In [16]:
@dataclass
class BlockInfo:
    """Represents a code block (function, loop, or recursion) with complexity information."""
    line_idx: int
    kind: str  # "function" | "loop" | "recursion"
    name: Optional[str] = None
    depth: int = 0
    complexity: str = "O(1)"
    reason: str = ""


def get_indent_level(line: str) -> int:
    """Calculate indentation level of a line (tabs converted to 4 spaces)."""
    normalized = line.replace("\t", "    ")
    return len(normalized) - len(normalized.lstrip(" "))


def find_function_name(line: str) -> Optional[str]:
    """Extract function name from a line if it contains a function declaration."""
    for pattern in FUNCTION_PATTERNS:
        match = re.search(pattern, line)
        if match and match.lastindex:
            return match.group(1)
    return None


def get_block_end(block: BlockInfo, all_blocks: List[BlockInfo], total_lines: int) -> int:
    """Calculate the end line index for a given block."""
    end = total_lines
    for other in all_blocks:
        if other.line_idx > block.line_idx and other.depth <= block.depth:
            end = min(end, other.line_idx)
    return end


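def rank_complexity(complexity: str) -> int:
    """Assign a numeric rank to a complexity string for comparison."""
    # Exponential growth outranks every polynomial; without this branch
    # "O(2^n)" would fall through to the default rank of 0, the same as O(1)
    if complexity == "O(2^n)":
        return 1000
    
    # Check for polynomial complexities O(n^k)
    match = re.match(r"O\(n\^(\d+)\)", complexity)
    if match:
        return 10 + int(match.group(1))
    
    return COMPLEXITY_ORDER.get(complexity, 0)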


def detect_blocks(lines: List[str], lang: str) -> List[BlockInfo]:
    """Detect all code blocks (functions and loops) in the source code."""
    blocks = []
    stack = []
    brace_depth = 0
    
    # Pre-compute indentation for Python
    indents = [get_indent_level(line) for line in lines] if lang == "python" else []
    
    for i, line in enumerate(lines):
        stripped = line.strip()
        
        # Track brace depth for non-Python languages
        if lang != "python":
            brace_depth += line.count("{") - line.count("}")
            brace_depth = max(0, brace_depth)
        
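        # Update stack based on indentation/brace depth
        if lang == "python":
            # Blank lines carry no indentation information, and a line at the
            # same indent as an open block is a sibling that closes that block
            while stripped and stack and indents[i] <= stack[-1]:
                stack.pop()
        else:
            while stack and brace_depth < stack[-1]:
                stack.pop()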
        
        current_depth = len(stack)
        
        # Detect loops
        if any(re.search(pattern, stripped) for pattern in LOOP_KEYWORDS):
            blocks.append(BlockInfo(
                line_idx=i,
                kind="loop",
                depth=current_depth + 1
            ))
            stack.append(indents[i] if lang == "python" else brace_depth)
        
        # Detect functions
        func_name = find_function_name(line)
        if func_name:
            blocks.append(BlockInfo(
                line_idx=i,
                kind="function",
                name=func_name,
                depth=current_depth + 1
            ))
            stack.append(indents[i] if lang == "python" else brace_depth)
    
    return blocks


def analyze_recursion(block: BlockInfo, blocks: List[BlockInfo], lines: List[str]) -> None:
    """Analyze a function block for recursion and update its complexity."""
    if block.kind != "function" or not block.name:
        return
    
    end = get_block_end(block, blocks, len(lines))
    body = "\n".join(lines[block.line_idx:end])
    
    # Count recursive calls (subtract 1 for the function definition itself)
    recursive_calls = len(re.findall(rf"\b{re.escape(block.name)}\s*\(", body)) - 1
    
    if recursive_calls == 0:
        return
    
    # Detect divide-and-conquer pattern
    if re.search(RECURSION_PATTERNS["divide_conquer"], body):
        block.kind = "recursion"
        block.complexity = "O(log n)"
        block.reason = "Divide-and-conquer recursion (problem size halves each call)."
    # Multiple recursive calls suggest exponential
    elif recursive_calls >= 2:
        block.kind = "recursion"
        block.complexity = "O(2^n)"
        block.reason = "Multiple recursive calls per frame suggest exponential growth."
    # Single recursive call is linear
    else:
        block.kind = "recursion"
        block.complexity = "O(n)"
        block.reason = "Single recursive call per frame suggests linear recursion."


def analyze_loop_complexity(block: BlockInfo, all_loops: List[BlockInfo], blocks: List[BlockInfo], total_lines: int) -> None:
    """Analyze loop nesting depth and assign complexity."""
    if block.kind != "loop":
        return
    
    end = get_block_end(block, blocks, total_lines)
    
    # Count nested loops within this loop
    inner_loops = [loop for loop in all_loops 
                   if block.line_idx < loop.line_idx < end]
    
    nesting_depth = 1 + len(inner_loops)
    
    if nesting_depth == 1:
        block.complexity = "O(n)"
        block.reason = "Single loop scales linearly with input size."
    elif nesting_depth == 2:
        block.complexity = "O(n^2)"
        block.reason = "Nested loops indicate quadratic time."
    elif nesting_depth == 3:
        block.complexity = "O(n^3)"
        block.reason = "Three nested loops indicate cubic time."
    else:
        block.complexity = f"O(n^{nesting_depth})"
        block.reason = f"{nesting_depth} nested loops suggest polynomial time."


def analyze_function_complexity(block: BlockInfo, blocks: List[BlockInfo], total_lines: int) -> None:
    """Analyze overall function complexity based on contained blocks."""
    if block.kind != "function":
        return
    
    end = get_block_end(block, blocks, total_lines)
    
    # Get all blocks within this function
    inner_blocks = [b for b in blocks if block.line_idx < b.line_idx < end]
    
    # Find the worst complexity among inner blocks
    worst_complexity = "O(1)"
    for inner in inner_blocks:
        if rank_complexity(inner.complexity) > rank_complexity(worst_complexity):
            worst_complexity = inner.complexity
    
    # Special case: recursion + loop = O(n log n)
    has_recursion = any(b.kind == "recursion" for b in inner_blocks)
    has_loop = any(b.kind == "loop" for b in inner_blocks)
    
    if has_recursion and has_loop:
        block.complexity = "O(n log n)"
        block.reason = "Combines recursion with iteration (e.g., merge sort pattern)."
    else:
        block.complexity = worst_complexity
        block.reason = "Based on worst-case complexity of inner operations."


def estimate_complexity(lines: List[str], lang: str) -> List[BlockInfo]:
    """
    Estimate Big-O complexity for code blocks using heuristic analysis.
    
    Heuristics:
    - Single/nested loops: O(n), O(n^2), O(n^3), etc.
    - Recursion patterns: O(n), O(log n), O(2^n)
    - Function complexity: worst case of internal operations
    
    Args:
        lines: Source code lines
        lang: Programming language identifier
    
    Returns:
        List of BlockInfo objects with complexity estimates
    """
    # Step 1: Detect all blocks
    blocks = detect_blocks(lines, lang)
    
    # Step 2: Analyze recursion in functions
    for block in blocks:
        analyze_recursion(block, blocks, lines)
    
    # Step 3: Analyze loop complexities
    loops = [b for b in blocks if b.kind == "loop"]
    for loop in loops:
        analyze_loop_complexity(loop, loops, blocks, len(lines))
    
    # Step 4: Analyze overall function complexities
    for block in blocks:
        analyze_function_complexity(block, blocks, len(lines))
    
    return blocks

## Step 6: Code Annotation Functions

Takes the complexity estimates and **inserts them as comments** into the source code:

**Process:**
1. `create_annotation_comment()` - Formats Big-O annotations as language-specific comments
2. `insert_annotations()` - Inserts comments below each function/loop
3. `to_markdown()` - Wraps annotated code in Markdown code blocks

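**Example output** (every detected function and loop gets its own annotation, inserted flush-left on the line below it):
```python
def bubble_sort(arr):
# Big-O: O(n^2)
# Explanation: Based on worst-case complexity of inner operations.
    for i in range(len(arr)):
# Big-O: O(n^2)
# Explanation: Nested loops indicate quadratic time.
        for j in range(len(arr) - i - 1):
# Big-O: O(n)
# Explanation: Single loop scales linearly with input size.
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
```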
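
The insertion step itself is a single forward pass. The sketch below is a stripped-down illustration with a hypothetical helper, `insert_after_lines`, that takes precomputed notes keyed by 0-based line index rather than running any analysis.

```python
from typing import Dict, List

def insert_after_lines(lines: List[str], notes: Dict[int, List[str]]) -> List[str]:
    """Return a copy of `lines` with each note list placed after its 0-based line index."""
    result = []
    for idx, line in enumerate(lines):
        result.append(line)
        result.extend(notes.get(idx, []))
    return result

code = ["def total(xs):", "    for x in xs:", "        print(x)"]
notes = {1: ["# Big-O: O(n)", "# Explanation: Single loop scales linearly with input size."]}
print("\n".join(insert_after_lines(code, notes)))
```

Building a new list, rather than calling `list.insert` on the original, means earlier insertions never shift the indices that later annotations refer to.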


In [17]:
def create_annotation_comment(block: BlockInfo, comment_prefix: str) -> List[str]:
    """
    Create annotation comments for a code block.
    
    Args:
        block: BlockInfo object containing complexity information
        comment_prefix: Comment syntax for the language (e.g., '#' or '//')
    
    Returns:
        List of comment lines to insert
    """
    complexity = block.complexity or "O(1)"
    reason = block.reason or "Heuristic estimate based on detected structure."
    
    return [
        f"{comment_prefix} Big-O: {complexity}",
        f"{comment_prefix} Explanation: {reason}"
    ]


def insert_annotations(code: str, lang: str) -> str:
    """
    Insert Big-O complexity annotations into source code.
    
    Analyzes the code for loops, functions, and recursion, then inserts
    orange-colored comment annotations (when syntax highlighted) beneath
    each detected block explaining its time complexity.
    
    Args:
        code: Source code string to annotate
        lang: Programming language identifier
    
    Returns:
        Annotated source code with Big-O comments inserted
    """
    if not code.strip():
        return code
    
    lines = code.splitlines()
    blocks = estimate_complexity(lines, lang)
    
    if not blocks:
        return code
    
    comment_prefix = comment_prefix_for(lang)
    
    # Build a map of line numbers to annotation comments
    annotations: Dict[int, List[str]] = {}
    for block in blocks:
        line_num = block.line_idx + 1  # Convert 0-indexed to 1-indexed
        comments = create_annotation_comment(block, comment_prefix)
        annotations.setdefault(line_num, []).extend(comments)
    
    # Insert annotations after their corresponding lines
    annotated_lines = []
    for line_num, original_line in enumerate(lines, start=1):
        annotated_lines.append(original_line)
        if line_num in annotations:
            annotated_lines.extend(annotations[line_num])
    
    return "\n".join(annotated_lines)


def to_markdown(code: str, lang: str) -> str:
    """
    Format annotated code as Markdown with syntax highlighting.
    
    Args:
        code: Annotated source code
        lang: Programming language identifier for syntax highlighting
    
    Returns:
        Markdown-formatted code block
    """
    lang_display = lang.capitalize()
    
    return f"""### Annotated Code ({lang_display})

```{lang}
{code}
```
"""

## Step 7: Syntax Highlighting with Pygments

Generates beautiful, syntax-highlighted HTML previews with **orange-colored complexity comments**.

**Features:**
- Uses Pygments lexer for accurate language-specific highlighting
- Custom CSS to make Big-O comments stand out in orange
- Fallback to plain HTML if Pygments is unavailable
- HTML escaping for security
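
A compact sketch of the highlight-with-fallback pattern follows. It assumes only the public Pygments entry points (`highlight`, `get_lexer_by_name`, `HtmlFormatter`) and uses the standard library's `html.escape` for the plain path; `render_code_html` is a hypothetical name, not the notebook's function.

```python
import html

def render_code_html(code, lang):
    """Highlight `code` with Pygments when available, else return escaped plain HTML."""
    try:
        from pygments import highlight
        from pygments.formatters import HtmlFormatter
        from pygments.lexers import get_lexer_by_name
        from pygments.util import ClassNotFound
    except ImportError:
        return f"<pre><code>{html.escape(code)}</code></pre>"
    try:
        lexer = get_lexer_by_name(lang)
    except ClassNotFound:
        # Unknown language name: still show the code, just without colours
        return f"<pre><code>{html.escape(code)}</code></pre>"
    return highlight(code, lexer, HtmlFormatter(cssclass="codehilite"))

snippet = "if a < b:  # Big-O: O(1)\n    pass\n"
print(render_code_html(snippet, "python"))
```

Both paths escape `<` and `&`, so user-supplied code cannot inject markup into the preview regardless of whether Pygments is installed.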


In [18]:
def escape_html(text: str) -> str:
    """
    Escape HTML special characters for safe display.
    
    Args:
        text: Raw text to escape
    
    Returns:
        HTML-safe text
    """
    html_escape_table = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        '"': "&quot;",
        "'": "&#x27;",
    }
    return "".join(html_escape_table.get(c, c) for c in text)


def highlighted_html(code: str, lang: str) -> str:
    """
    Generate syntax-highlighted HTML with orange-colored comments.
    
    Uses Pygments for syntax highlighting with custom CSS to make
    comments appear in orange for visual emphasis of Big-O annotations.
    
    Args:
        code: Source code to highlight
        lang: Programming language identifier
    
    Returns:
        HTML string with embedded CSS and syntax highlighting
    """
    if not code.strip():
        return f"<pre><code>{escape_html(code)}</code></pre>"
    
    try:
        from pygments import highlight
        from pygments.lexers import get_lexer_by_name
        from pygments.formatters import HtmlFormatter
        
        # Get appropriate lexer for the language
        lexer = get_lexer_by_name(lang)
        
        # Configure HTML formatter
        formatter = HtmlFormatter(
            nowrap=False,
            full=False,
            cssclass="codehilite",
            linenos=False
        )
        
        # Generate highlighted HTML
        html_code = highlight(code, lexer, formatter)
        
        return SYNTAX_HIGHLIGHT_CSS + html_code
        
    except ImportError:
        # Pygments not available - return plain HTML
        return f"<pre><code>{escape_html(code)}</code></pre>"
    
    except Exception as e:
        # Lexer not found or other error - fallback to plain HTML
        print(f"⚠️ Syntax highlighting failed for '{lang}': {e}")
        return f"<pre><code>{escape_html(code)}</code></pre>"

## Step 8: LLaMA Model Loading & Streaming

Loading HuggingFace LLaMA models for AI-powered code review:

**Key Features:**
- **Quantization support** - 4-bit or 8-bit to reduce memory (requires GPU)
- **Streaming generation** - See tokens appear in real-time
- **Automatic device mapping** - Uses GPU if available, CPU otherwise
- **Background-thread streaming** - Generation runs in a worker thread while `TextIteratorStreamer` hands tokens to the caller as they arrive

**Functions:**
- `load_model()` - Downloads and initializes the LLaMA model
- `stream_generate()` - Generates text token-by-token with streaming
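
The streaming pattern (a worker thread produces tokens while the caller iterates over them) can be illustrated with the standard library alone. This is an analogue, not the Transformers API: `fake_generate` stands in for the model, and a `queue.Queue` plays the role of the streamer.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def fake_generate(prompt, out):
    """Stand-in for model generation: pushes one 'token' at a time, then the sentinel."""
    for word in prompt.split():
        out.put(word.upper() + " ")
    out.put(_DONE)

def stream_tokens(prompt):
    """Yield tokens while a background thread is still producing them."""
    out = queue.Queue()
    worker = threading.Thread(target=fake_generate, args=(prompt, out), daemon=True)
    worker.start()
    while True:
        item = out.get(timeout=5)  # a timeout keeps the consumer from hanging forever
        if item is _DONE:
            break
        yield item
    worker.join()

print("".join(stream_tokens("tokens arrive one by one")))
```

The timeout on `get` is a deliberate choice in this sketch: if the producer dies before sending the sentinel, the consumer raises `queue.Empty` instead of blocking indefinitely.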

**Memory Requirements:**
- **Without quantization:** ~14GB RAM (7B models) or ~26GB (13B models), assuming 16-bit weights
- **With 8-bit:** ~50% reduction (GPU required)
- **With 4-bit:** ~75% reduction (GPU required)
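
These figures follow from simple arithmetic: parameter count times bytes per parameter. The sketch below assumes 16-bit weights as the unquantized baseline and counts weights only; activations, the KV cache, and framework overhead are ignored, so real usage will be higher. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

for size in (7, 13):
    for bits in (16, 8, 4):
        print(f"{size}B @ {bits}-bit: ~{weight_memory_gb(size, bits):.1f} GB")
```

Going from 16 to 8 bits halves the estimate and 4 bits quarters it, which is where the ~50% and ~75% reductions come from.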


In [19]:
# Hugging Face model imports
try:
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
        TextIteratorStreamer,
        pipeline
    )
    import threading
    TRANSFORMERS_AVAILABLE = True
except ImportError:
    TRANSFORMERS_AVAILABLE = False

# Global model state
MODEL_PIPELINE = None
TOKENIZER = None


def get_quantization_config(load_in_4bit: bool, load_in_8bit: bool) -> Optional["BitsAndBytesConfig"]:
    """
    Create quantization configuration for model loading.
    
    Args:
        load_in_4bit: Whether to use 4-bit quantization
        load_in_8bit: Whether to use 8-bit quantization
    
    Returns:
        BitsAndBytesConfig object or None if quantization not requested/available
    
    Raises:
        RuntimeError: If quantization requested but CUDA not available
    """
    if not (load_in_4bit or load_in_8bit):
        return None
    
    # Check if CUDA is available
    try:
        import torch
        if not torch.cuda.is_available():
            raise RuntimeError(
                "Quantization requires CUDA (NVIDIA GPU).\n\n"
                "You are running on CPU/Mac and have requested quantization.\n"
                "Options:\n"
                "  1. Disable both 4-bit and 8-bit quantization to run on CPU\n"
                "     (requires ~26GB RAM for 13B models, ~14GB for 7B models)\n"
                "  2. Use a GPU with CUDA support\n"
                "  3. Try smaller models like gpt2 or microsoft/DialoGPT-medium\n\n"
                "Note: Quantization significantly reduces memory usage but requires GPU."
            )
    except ImportError:
        pass
    
    try:
        return BitsAndBytesConfig(load_in_4bit=load_in_4bit, load_in_8bit=load_in_8bit)
    except Exception as e:
        raise RuntimeError(f"Failed to create quantization config: {e}")


def load_model(
    model_id: str = DEFAULT_MODEL_ID,
    device_map: str = DEVICE_HINT,
    load_in_8bit: bool = False,
    load_in_4bit: bool = False
) -> None:
    """
    Load a Hugging Face LLaMA-family model for text generation.
    
    Supports optional 4-bit or 8-bit quantization to reduce memory usage.
    Model is loaded into global MODEL_PIPELINE and TOKENIZER variables.
    
    Args:
        model_id: Hugging Face model identifier
        device_map: Device mapping strategy ('auto', 'cpu', 'cuda', etc.)
        load_in_8bit: Enable 8-bit quantization
        load_in_4bit: Enable 4-bit quantization
    
    Raises:
        ImportError: If transformers is not installed
        Exception: If model loading fails
    """
    global MODEL_PIPELINE, TOKENIZER
    
    if not TRANSFORMERS_AVAILABLE:
        raise ImportError(
            "Transformers library is not installed. "
            "Please run the installation cell and restart the kernel."
        )
    
    # Use global variable instead of os.environ to avoid Jupyter ContextVar issues
    global HUGGINGFACE_TOKEN
    hf_token = HUGGINGFACE_TOKEN if HUGGINGFACE_TOKEN else None
    
    if hf_token:
        print(f"   Using HuggingFace token ending in ...{hf_token[-4:]}")
    else:
        print("   No HuggingFace token available (may fail for gated models)")
    
    # Configure quantization if requested
    quant_config = get_quantization_config(load_in_4bit, load_in_8bit)
    
    print(f"🔄 Loading model: {model_id}")
    print(f"   Device map: {device_map}")
    print(f"   Quantization: 8-bit={load_in_8bit}, 4-bit={load_in_4bit}")
    print(f"   This may take 2-5 minutes... please wait...")

    # Final tqdm patch before model loading (catches any missed imports)
    _patch_tqdm()

    try:
        # Suppress transformers warnings
        from transformers.utils import logging
        logging.set_verbosity_error()
        
        TOKENIZER = AutoTokenizer.from_pretrained(
            model_id,
            token=hf_token,
            trust_remote_code=False
        )
        
        print("   ✓ Tokenizer loaded")
        
        # Load model
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            device_map=device_map,
            quantization_config=quant_config,
            token=hf_token,
            trust_remote_code=False,
            low_cpu_mem_usage=True
        )
        
        print("   ✓ Model loaded into memory")
        
        # Create pipeline
        MODEL_PIPELINE = pipeline(
            "text-generation",
            model=model,
            tokenizer=TOKENIZER
        )
        
        print("✅ Model loaded successfully")
        
    except Exception as e:
        print(f"❌ Model loading failed: {e}")
        print("\n💡 Troubleshooting:")
        print("  • Gated models require HuggingFace approval and token")
        print("  • Large models (13B+) need quantization OR ~26GB+ RAM")
        print("  • Quantization requires NVIDIA GPU with CUDA")
        print("\n💡 Models that work on CPU/Mac (no GPU needed):")
        print("  • gpt2 (~500MB RAM)")
        print("  • microsoft/DialoGPT-medium (~1GB RAM)")
        print("  • meta-llama/Llama-2-7b-chat-hf (~14GB RAM, needs approval)")
        print("\nBrowse more models: https://huggingface.co/models?pipeline_tag=text-generation")
        MODEL_PIPELINE = None
        TOKENIZER = None
        raise


def stream_generate(
    prompt: str,
    max_new_tokens: int = 256,
    temperature: float = 0.7
) -> Generator[str, None, None]:
    """
    Stream generated tokens from the loaded model.
    
    Uses TextIteratorStreamer for real-time token streaming.
    Falls back to non-streaming generation if streaming is unavailable.
    
    Args:
        prompt: Input text prompt for generation
        max_new_tokens: Maximum number of tokens to generate
        temperature: Sampling temperature; must be > 0 because sampling is always enabled (lower = more deterministic, higher = more random)
    
    Yields:
        Generated text tokens as they are produced
    """
    # Validate model is loaded
    if MODEL_PIPELINE is None:
        yield "⚠️ Model not loaded. Please run load_model() first."
        return
    
    # Validate inputs
    if not prompt.strip():
        yield "⚠️ Empty prompt provided."
        return
    
    try:
        # Create streamer
        streamer = TextIteratorStreamer(
            MODEL_PIPELINE.tokenizer,
            skip_prompt=True,
            skip_special_tokens=True
        )
        
        # Prepare generation arguments
        generation_kwargs = {
            "text_inputs": prompt,
            "streamer": streamer,
            "max_new_tokens": max_new_tokens,
            "do_sample": True,
            "temperature": temperature,
        }
        
        # Run generation in separate thread
        def generate_in_thread():
            try:
                MODEL_PIPELINE(**generation_kwargs)
            except Exception as e:
                print(f"⚠️ Generation error: {e}")
        
        thread = threading.Thread(target=generate_in_thread, daemon=True)
        thread.start()
        
        # Stream tokens as they're generated
        for token in streamer:
            yield token
            
    except Exception as e:
        # Fallback to non-streaming generation
        print(f"⚠️ Streaming failed ({e}), falling back to non-streaming generation")
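        try:
            # Sample only when temperature is positive; otherwise decode greedily
            fallback_kwargs = {
                "max_new_tokens": max_new_tokens,
                "do_sample": temperature > 0,
            }
            if temperature > 0:
                fallback_kwargs["temperature"] = temperature
            # return_full_text=False omits the prompt, matching skip_prompt=True above
            result = MODEL_PIPELINE(prompt, return_full_text=False, **fallback_kwargs)
            yield result[0]["generated_text"]
        except Exception as fallback_error:
            yield f"❌ Generation failed: {fallback_error}"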
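
The streaming pattern used by `stream_generate()` (a worker thread that produces text while the caller iterates over it) can be illustrated without loading a model. The following is a minimal sketch using only the standard library; `stream_from_worker`, `fake_generation`, and the `_STOP` sentinel are hypothetical names for illustration and are not part of Transformers or of this notebook. It also shows one way to make sure the consumer never hangs when the worker fails: the worker always enqueues a stop marker.

```python
import queue
import threading
from typing import Callable, Iterator

_STOP = object()  # sentinel marking the end of the stream


def stream_from_worker(
    produce: Callable[[Callable[[str], None]], None],
    timeout: float = 5.0,
) -> Iterator[str]:
    """Run `produce(emit)` in a background thread and yield what it emits."""
    chunks: queue.Queue = queue.Queue()
    errors: list = []

    def worker() -> None:
        try:
            produce(chunks.put)
        except Exception as exc:
            errors.append(exc)  # remember the failure for the consumer
        finally:
            chunks.put(_STOP)  # always unblock the consumer

    threading.Thread(target=worker, daemon=True).start()

    while True:
        # Raises queue.Empty if the worker produces nothing for `timeout` seconds
        item = chunks.get(timeout=timeout)
        if item is _STOP:
            break
        yield item

    if errors:
        raise errors[0]


def fake_generation(emit: Callable[[str], None]) -> None:
    """Stand-in for a model call: emits three text chunks."""
    for token in ["O(n)", " loop", " detected"]:
        emit(token)


print("".join(stream_from_worker(fake_generation)))  # O(n) loop detected
```

The `finally` clause is the important design choice here: whether the producer succeeds or raises, the consumer loop always receives an end marker.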

## Step 9: File Processing Pipeline

The main orchestration function that ties everything together:

**Workflow:**
1. **Read file** - Validate size (<2MB) and decode to text
2. **Detect language** - From file extension
3. **Analyze code** - Estimate complexity using heuristics
4. **Annotate** - Insert Big-O comments
5. **Generate previews** - Create Markdown and HTML views
6. **Optional AI review** - Send to LLaMA for deeper analysis

**Functions:**
- `read_file_content()` - Loads and validates uploaded files
- `create_review_prompt()` - Formats code for LLM analysis
- `generate_model_analysis()` - Gets AI-powered insights
- `process_code_file()` - Main orchestrator
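
Step 2 of the workflow relies on `detect_language()`, which is defined in an earlier cell. As a rough illustration of extension-based detection, here is a minimal sketch; the `EXTENSION_TO_LANGUAGE` table and `guess_language()` are hypothetical stand-ins, and the notebook's actual mapping may differ.

```python
from pathlib import Path

# Hypothetical subset of an extension-to-language table; the notebook's real
# mapping lives in an earlier cell and may contain different entries.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".cpp": "cpp",
    ".rs": "rust",
}


def guess_language(filename: str) -> str:
    """Return the language for a filename, or raise ValueError if unsupported."""
    suffix = Path(filename).suffix.lower()
    try:
        return EXTENSION_TO_LANGUAGE[suffix]
    except KeyError:
        raise ValueError(
            f"Unsupported file extension: {suffix or '(none)'}"
        ) from None


print(guess_language("/tmp/upload/Python Function.py"))  # python
```

Raising `ValueError` for unknown extensions matches how `process_code_file()` documents its failure modes, so the UI handler can show the message to the user.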


In [20]:
def read_file_content(fileobj) -> Tuple[str, str, float]:
    """
    Read and decode file content from a file-like object.
    
    Args:
        fileobj: File-like object (from Gradio upload or file handle)
    
    Returns:
        Tuple of (filename, content_text, size_in_mb)
    
    Raises:
        ValueError: If file is too large
    """
    # Get filename, ensuring we have a valid name
    filename = getattr(fileobj, "name", None)
    if not filename:
        raise ValueError("Uploaded file must have a valid filename with extension")
    
    # Read raw content
    raw = fileobj.read()
    
    # Decode to text and calculate size
    if isinstance(raw, bytes):
        text = raw.decode("utf-8", errors="replace")
        size_mb = len(raw) / (1024 * 1024)
    else:
        text = str(raw)
        size_mb = len(text.encode("utf-8")) / (1024 * 1024)
    
    # Validate file size
    if size_mb > MAX_FILE_SIZE_MB:
        raise ValueError(
            f"File too large: {size_mb:.2f} MB. "
            f"Maximum allowed size is {MAX_FILE_SIZE_MB} MB."
        )
    
    return filename, text, size_mb


def create_review_prompt(code: str, lang: str, max_code_chars: int = 4000) -> str:
    """
    Create a prompt for LLM code review.
    
    Args:
        code: Annotated source code
        lang: Programming language
        max_code_chars: Maximum characters to include in prompt
    
    Returns:
        Formatted prompt string
    """
    # Truncate code if necessary to fit token limits
    code_snippet = code[:max_code_chars]
    if len(code) > max_code_chars:
        code_snippet += "\n... (code truncated for analysis)"
    
    return f"""You are a senior code reviewer specializing in performance analysis.

Language: {lang}

Task: Analyze the following annotated code and provide:
1. Validation of the Big-O annotations
2. Identification of performance bottlenecks
3. Specific optimization suggestions (max 8 bullet points)
4. Any algorithmic improvements

--- CODE START ---
{code_snippet}
--- CODE END ---

Provide a concise, actionable analysis:"""


def generate_model_analysis(code: str, lang: str, model_params: Dict) -> str:
    """
    Generate LLM-powered code complexity analysis.
    
    Args:
        code: Annotated source code
        lang: Programming language
        model_params: Parameters for model generation (max_new_tokens, temperature)
    
    Returns:
        Generated analysis text or error message
    """
    # Check if model is loaded
    if MODEL_PIPELINE is None:
        return "⚠️ **Model not loaded.** Please click '🔄 Load Model' button first before requesting AI analysis."
    
    try:
        prompt = create_review_prompt(code, lang)
        
        # Stream and collect tokens
        tokens = []
        for token in stream_generate(prompt, **model_params):
            tokens.append(token)
        
        result = "".join(tokens)
        return result if result.strip() else "_(No analysis generated)_"
        
    except Exception as e:
        return f"⚠️ Model analysis failed: {e}"


def process_code_file(
    fileobj,
    ask_model: bool,
    model_params: Dict
) -> Tuple[str, str, str, str, str]:
    """
    Process uploaded code file: detect language, annotate complexity, generate HTML preview.
    
    This is the main orchestration function that:
    1. Reads and validates the uploaded file
    2. Detects programming language from extension
    3. Analyzes and annotates code with Big-O complexity
    4. Generates Markdown and HTML previews
    5. Optionally generates LLM-powered code review
    
    Args:
        fileobj: File-like object from Gradio file upload
        ask_model: Whether to generate LLM analysis
        model_params: Dict with 'max_new_tokens' and 'temperature' for generation
    
    Returns:
        Tuple of (language, annotated_code, markdown_preview, html_preview, model_commentary)
    
    Raises:
        ValueError: If file is invalid, too large, or has unsupported extension
    """
    # Step 1: Read and validate file
    filename, code_text, file_size_mb = read_file_content(fileobj)
    
    print(f"📄 Processing: {filename} ({file_size_mb:.2f} MB)")
    
    # Step 2: Detect language from file extension
    lang = detect_language(filename)
    
    print(f"🔍 Detected language: {lang}")
    
    # Step 3: Analyze and annotate code
    annotated_code = insert_annotations(code_text, lang)
    
    # Step 4: Generate preview formats
    markdown_preview = to_markdown(annotated_code, lang)
    html_preview = highlighted_html(annotated_code, lang)
    
    # Step 5: Optionally generate model analysis
    model_commentary = ""
    if ask_model:
        print("🤖 Generating model analysis...")
        model_commentary = generate_model_analysis(annotated_code, lang, model_params)
    
    return lang, annotated_code, markdown_preview, html_preview, model_commentary

## Step 10: Build the Gradio Interface

Creating a professional two-column UI with:

**Left Column (Input):**
- File uploader (filters to code files only)
- AI review toggle
- Model configuration (ID, quantization options)
- Temperature and max tokens sliders
- Load model & process buttons

**Right Column (Output):**
- Detected language display
- Syntax-highlighted code preview (orange comments!)
- AI code review (if enabled)
- Download buttons for annotated code + Markdown

**Event Handlers:**
- `handle_model_loading()` - Streams elapsed-time status updates while the model loads (the stage labels are time-based estimates, not measured download progress)
- `handle_file_processing()` - Processes uploaded files and updates all outputs
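
When a Gradio event handler is a generator, each yielded value replaces the output component's content, which is how a status line can be refreshed during a long-running task. The sketch below shows that polling pattern in plain Python, with no Gradio dependency. `run_with_status` is a hypothetical helper, and it uses a `threading.Event` rather than a shared boolean flag as one possible way to signal completion.

```python
import threading
import time
from typing import Callable, Iterator


def run_with_status(
    task: Callable[[], None],
    poll_seconds: float = 0.05,
) -> Iterator[str]:
    """Run `task` in a thread and yield status strings until it finishes."""
    done = threading.Event()
    failure: list = []

    def worker() -> None:
        try:
            task()
        except Exception as exc:
            failure.append(exc)
        finally:
            done.set()  # signal completion on success and on failure

    start = time.monotonic()
    threading.Thread(target=worker, daemon=True).start()

    # Event.wait returns True as soon as the worker signals completion,
    # so the loop exits promptly instead of sleeping a full interval.
    while not done.wait(timeout=poll_seconds):
        yield f"working... ({time.monotonic() - start:.1f}s elapsed)"

    if failure:
        yield f"failed: {failure[0]}"
    else:
        yield "done"


for status in run_with_status(lambda: time.sleep(0.2)):
    print(status)
```

The last yielded value is the final state, so a UI that keeps only the most recent update ends on either `done` or the failure message.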


In [21]:
# Gradio UI imports
try:
    import gradio as gr
    GRADIO_AVAILABLE = True
except ImportError:
    GRADIO_AVAILABLE = False

import tempfile
from pathlib import Path


def get_file_extension_for_language(lang: str) -> str:
    """
    Get the primary file extension for a given language.
    
    Args:
        lang: Language identifier
    
    Returns:
        File extension with dot (e.g., '.py', '.js')
    """
    # Create reverse mapping from language to primary extension
    lang_to_ext = {
        "python": ".py",
        "javascript": ".js",
        "typescript": ".ts",
        "java": ".java",
        "c": ".c",
        "cpp": ".cpp",
        "csharp": ".cs",
        "go": ".go",
        "php": ".php",
        "swift": ".swift",
        "ruby": ".rb",
        "kotlin": ".kt",
        "rust": ".rs",
    }
    return lang_to_ext.get(lang, ".txt")


def save_outputs_to_temp(annotated_code: str, markdown: str, lang: str) -> Tuple[str, str]:
    """
    Save annotated code and markdown to temporary files for download.
    
    Args:
        annotated_code: Annotated source code
        markdown: Markdown preview
        lang: Programming language
    
    Returns:
        Tuple of (source_file_path, markdown_file_path)
    """
    # Get appropriate extension
    ext = get_file_extension_for_language(lang)
    
    # Write each output to its own temporary file. delete=False keeps the
    # files on disk after they are closed so Gradio can serve them for download.
    with tempfile.NamedTemporaryFile(
        mode='w',
        suffix=ext,
        prefix='annotated_',
        delete=False,
        encoding='utf-8'
    ) as source_file:
        source_file.write(annotated_code)
    
    with tempfile.NamedTemporaryFile(
        mode='w',
        suffix='.md',
        prefix='annotated_',
        delete=False,
        encoding='utf-8'
    ) as markdown_file:
        markdown_file.write(markdown)
    
    return source_file.name, markdown_file.name


def handle_model_loading(model_id: str, load_in_8bit: bool, load_in_4bit: bool):
    """
    Handle model loading with error handling and live progress updates for Gradio UI.
    Yields status updates with elapsed time.
    
    Args:
        model_id: Hugging Face model identifier
        load_in_8bit: Whether to use 8-bit quantization
        load_in_4bit: Whether to use 4-bit quantization
    
    Yields:
        Status message updates with progress
    """
    import threading
    import time
    
    # Immediate status update - clears old text
    yield "🔄 **Step 1/4:** Initializing... (0s elapsed)"
    
    print(f"\n{'='*60}")
    print(f"🔄 Starting model load: {model_id.strip()}")
    print(f"{'='*60}\n")
    
    start_time = time.time()
    loading_complete = False
    error_message = None
    
    # Function to load model in background thread
    def load_in_background():
        nonlocal loading_complete, error_message
        try:
            load_model(
                model_id.strip(),
                device_map=DEVICE_HINT,
                load_in_8bit=load_in_8bit,
                load_in_4bit=load_in_4bit
            )
            loading_complete = True
        except Exception as e:
            error_message = str(e)
            loading_complete = True
    
    # Start loading in background thread
    thread = threading.Thread(target=load_in_background, daemon=True)
    thread.start()
    
    # Progress stages with approximate timing
    stages = [
        (0, "🔄 **Step 1/4:** Connecting to HuggingFace..."),
        (5, "🔄 **Step 2/4:** Downloading tokenizer..."),
        (15, "🔄 **Step 3/4:** Loading model weights..."),
        (30, "🔄 **Step 4/4:** Finalizing model setup..."),
    ]
    
    stage_idx = 0
    last_update = time.time()
    
    # Show progress updates while loading
    while not loading_complete:
        elapsed = int(time.time() - start_time)
        
        # Move to next stage if enough time passed
        if stage_idx < len(stages) - 1 and elapsed >= stages[stage_idx + 1][0]:
            stage_idx += 1
        
        # Update every 2 seconds
        if time.time() - last_update >= 2:
            current_stage = stages[stage_idx][1]
            yield f"{current_stage} ({elapsed}s elapsed)"
            last_update = time.time()
        
        time.sleep(0.5)  # Check every 0.5 seconds
    
    # Final result
    elapsed = int(time.time() - start_time)
    if error_message:
        yield f"❌ **Model loading failed** ({elapsed}s elapsed)\n\n{error_message}"
    else:
        yield f"✅ **Model loaded successfully!** ({elapsed}s total)"


def handle_file_processing(
    file,
    ask_model_flag: bool,
    temperature: float,
    max_new_tokens: int
) -> Tuple[str, str, str, Optional[str], Optional[str]]:
    """
    Handle file processing workflow for Gradio UI.
    
    Args:
        file: Gradio file upload object
        ask_model_flag: Whether to generate model commentary
        temperature: Generation temperature
        max_new_tokens: Max tokens to generate
    
    Returns:
        Tuple of (language, html_preview, model_commentary, source_path, markdown_path)
    """
    # Validate file upload
    if file is None:
        return "", "<i>⚠️ Please upload a code file.</i>", "", None, None
    
    # Check if model is required but not loaded
    if ask_model_flag and MODEL_PIPELINE is None:
        return "", "", "⚠️ **Model not loaded.** Please click '🔄 Load Model' button first before requesting AI analysis.", None, None
    
    try:
        # Gradio provides file as a path string or file object
        if isinstance(file, str):
            file_path = file
        elif hasattr(file, 'name'):
            file_path = file.name
        else:
            return "", "<pre>❌ Invalid file upload format</pre>", "", None, None
        
        # Open and process the file
        with open(file_path, 'rb') as f:
            # Prepare model parameters
            model_params = {
                "max_new_tokens": int(max_new_tokens),
                "temperature": float(temperature)
            }
            
            # Process the code file
            lang, annotated_code, markdown_preview, html_preview, model_commentary = process_code_file(
                f,
                ask_model_flag,
                model_params
            )
        
        # Save outputs to temporary files for download
        source_path, markdown_path = save_outputs_to_temp(annotated_code, markdown_preview, lang)
        
        # Format model commentary
        commentary_display = model_commentary if model_commentary else "_(No model analysis generated)_"
        
        return lang, html_preview, commentary_display, source_path, markdown_path
        
    except ValueError as e:
        # User-facing errors (file too large, unsupported extension, etc.)
        # Escape the message because it is rendered as HTML
        import html
        return "", f"<pre>⚠️ {html.escape(str(e))}</pre>", "", None, None
    except Exception as e:
        # Unexpected errors
        import html
        import traceback
        error_detail = traceback.format_exc()
        print(f"Error processing file: {error_detail}")
        return "", f"<pre>❌ Processing failed: {html.escape(str(e))}</pre>", "", None, None


def build_ui():
    """
    Build the Gradio user interface for the Code Complexity Annotator.
    
    Returns:
        Gradio Blocks interface
    """
    if not GRADIO_AVAILABLE:
        raise ImportError(
            "Gradio is not installed. Please run the installation cell "
            "and restart the kernel."
        )
    
    # Custom CSS for better UI
    custom_css = """
    footer {visibility: hidden}
    .gradio-container {font-family: 'Inter', sans-serif}
    """
    
    with gr.Blocks(css=custom_css, title="Code Complexity Annotator") as demo:
        # Header
        gr.Markdown("# 🔶 Multi-Language Code Complexity Annotator")
        gr.Markdown(
            "Upload code → Detect language → Auto-annotate with Big-O complexity → "
            "Preview with syntax highlighting → Download results. "
            "Optional: Get AI-powered code review from LLaMA."
        )
        
        with gr.Row():
            # Left column: Input controls
            with gr.Column(scale=2):
                gr.Markdown("### 📤 Upload & Settings")
                
                file_upload = gr.File(
                    label="Upload Code File",
                    file_count="single",
                    file_types=list(SUPPORTED_EXTENSIONS)
                )
                
                ask_model = gr.Checkbox(
                    label="🤖 Generate AI Code Review",
                    value=True,
                    info="⚠️ Requires model to be loaded first using the button below"
                )
                
                gr.Markdown("### 🧠 Model Configuration")
                
                model_id = gr.Textbox(
                    label="Hugging Face Model ID",
                    value=DEFAULT_MODEL_ID,
                    placeholder="meta-llama/Llama-3.2-1B"
                )
                
                with gr.Row():
                    load_8bit = gr.Checkbox(
                        label="8-bit Quantization",
                        value=False,
                        info="⚠️ Requires CUDA/GPU (reduces memory by ~50%)"
                    )
                    load_4bit = gr.Checkbox(
                        label="4-bit Quantization",
                        value=False,
                        info="⚠️ Requires CUDA/GPU (reduces memory by ~75%, lower quality)"
                    )
                
                temperature = gr.Slider(
                    label="Temperature",
                    minimum=0.0,
                    maximum=1.5,
                    value=0.7,
                    step=0.05,
                    info="Lower = more deterministic, Higher = more creative"
                )
                
                max_tokens = gr.Slider(
                    label="Max New Tokens",
                    minimum=16,
                    maximum=1024,
                    value=256,
                    step=16,
                    info="Maximum length of generated review"
                )
                
                with gr.Row():
                    load_model_btn = gr.Button("🔄 Load Model", variant="secondary")
                    process_btn = gr.Button("🚀 Process & Annotate", variant="primary")
                
                model_status = gr.Markdown("⚪ **Status:** Model not loaded")
            
            # Right column: Output displays
            with gr.Column(scale=3):
                gr.Markdown("### 📊 Results")
                
                detected_lang = gr.Textbox(
                    label="Detected Language",
                    interactive=False,
                    placeholder="Upload a file to detect language"
                )
                
                html_preview = gr.HTML(
                    label="Code Preview (Orange = Complexity Annotations)",
                    value="<i>Upload and process a file to see preview...</i>"
                )
                
                model_output = gr.Markdown(
                    label="🤖 AI Code Review",
                    value="*Enable 'Generate AI Code Review' and process a file to see analysis...*"
                )
                
                gr.Markdown("### 💾 Downloads")
                
                with gr.Row():
                    download_source = gr.File(
                        label="Annotated Source Code",
                        interactive=False
                    )
                    download_markdown = gr.File(
                        label="Markdown Preview",
                        interactive=False
                    )
        
        # Event handlers
        load_model_btn.click(
            fn=handle_model_loading,
            inputs=[model_id, load_8bit, load_4bit],
            outputs=[model_status],
            show_progress="full"  # Show clear loading indicator
        )
        
        process_btn.click(
            fn=handle_file_processing,
            inputs=[file_upload, ask_model, temperature, max_tokens],
            outputs=[detected_lang, html_preview, model_output, download_source, download_markdown]
        )
    
    return demo


# Build and display the interface
demo = build_ui()
demo

Gradio Blocks instance: 2 backend functions
-------------------------------------------
fn_index=0
 inputs:
 |-<gradio.components.textbox.Textbox object at 0x14292bbf0>
 |-<gradio.components.checkbox.Checkbox object at 0x142d06990>
 |-<gradio.components.checkbox.Checkbox object at 0x142913380>
 outputs:
 |-<gradio.components.markdown.Markdown object at 0x142d5b4d0>
fn_index=1
 inputs:
 |-<gradio.components.file.File object at 0x142913d70>
 |-<gradio.components.checkbox.Checkbox object at 0x1400ab080>
 |-<gradio.components.slider.Slider object at 0x14264f3e0>
 |-<gradio.components.slider.Slider object at 0x112bd9c40>
 outputs:
 |-<gradio.components.textbox.Textbox object at 0x1425ea420>
 |-<gradio.components.html.HTML object at 0x1426dcec0>
 |-<gradio.components.markdown.Markdown object at 0x142d4bc80>
 |-<gradio.components.file.File object at 0x1423232c0>
 |-<gradio.components.file.File object at 0x1422f2750>

## Step 11: Launch the App

Starting the Gradio server with auto-browser launch.

**Options:**
- `share=False` - Local only (set to True for public Gradio link)
- `inbrowser=True` - Automatically opens in your default browser
- `show_error=True` - Displays detailed error messages in the UI

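Gradio prints the local URL when the server starts. In this run it was `http://127.0.0.1:7861`; the default port is 7860, and Gradio moves to the next free port if that one is already in use.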

---

## 💡 How to Use

### Without AI Review (No Model Needed):
1. **Upload** a code file (.py, .js, .java, etc.)
2. **Uncheck** "Generate AI Code Review"
3. **Click** "🚀 Process & Annotate"
4. **View** syntax-highlighted code with Big-O annotations
5. **Download** the annotated source + Markdown

### With AI Review (Requires Model):
1. **Click** "🔄 Load Model" (wait 2-5 minutes for first download)
2. **Upload** your code file
3. **Check** "Generate AI Code Review"
4. **Adjust** temperature/tokens if needed
5. **Click** "🚀 Process & Annotate"
6. **Read** AI-generated optimization suggestions

---

## 🎯 Supported Languages

Python • JavaScript • TypeScript • Java • C • C++ • C# • Go • PHP • Swift • Ruby • Kotlin • Rust

---

## 🧠 Model Options

**Recommended for CPU/Mac:**
- `meta-llama/Llama-3.2-1B` (Default, ~2.5GB of weights in 16-bit precision, gated: requires accepting the license on Hugging Face)
- `gpt2` (No approval needed, ~500MB)
- `microsoft/DialoGPT-medium` (~1GB)

**For GPU users:**
- Any model with 8-bit or 4-bit quantization enabled
- `meta-llama/Llama-2-7b-chat-hf` (requires approval)

---

**Note:** The first model load downloads the weights (roughly 0.5-14GB depending on the model). Subsequent runs load from the local cache.


In [22]:
# Launch the Gradio interface
demo.launch(
    share=False,        # Set to True to create a public shareable link
    inbrowser=True,     # Automatically open in browser
    show_error=True     # Show detailed errors in UI
)

* Running on local URL:  http://127.0.0.1:7861
* To create a public link, set `share=True` in `launch()`.




📄 Processing: /private/var/folders/jq/vdvn5cg53sj2xsq1w_0wjjc80000gn/T/gradio/d8fe7d241f82ae93c8cf07e99823e6db91d20185c411ded7454eb7a0d89174a4/3 Simple Python Functions with Different Time Complexities.py (0.00 MB)
🔍 Detected language: python
📄 Processing: /private/var/folders/jq/vdvn5cg53sj2xsq1w_0wjjc80000gn/T/gradio/a2b7a4fdfb5e5f657878a74459fd8d68e30fc0afdfb6e5627aab99cf8552011d/Simple Python Functions with Different Time Complexities.py (0.00 MB)
🔍 Detected language: python
📄 Processing: /private/var/folders/jq/vdvn5cg53sj2xsq1w_0wjjc80000gn/T/gradio/4dad1dc092f0232b348a683e42414de456c388b3e21d93ee820b8e7bc4a2aa47/Python Function.py (0.00 MB)
🔍 Detected language: python
