# 🧪 Lab 2: Code Review Assistant (Google Colab)

In this lab you'll build a **Code Review Assistant** with an LLM.

**What you'll learn**
1. Set up SDK & helper for reliable LLM calls
2. Review a single Python file for bugs, smells, and readability
3. Generate **diff-style fixes** and **unit tests**
4. Produce a **structured JSON report** with severities
5. Run a **batch review** for multiple files (simulating a PR)

> **Why are we doing this?**
> Engineers use LLMs as *review bots* that flag issues and propose safe patches. You’ll learn to write stable prompts and request JSON outputs you can wire into CI.

## ✅ Step 0 — Colab Runtime Check

In [None]:
import sys
print('Python', sys.version)
try:
    import google.colab  # type: ignore
    print('✅ Running in Google Colab')
except ImportError:
    print('ℹ️ Not in Colab (that is okay for local runs).')

## 🔐 Step 1 — Install SDK & Set API Key

We'll use the official **OpenAI Python SDK (>=1.40)**. It needs an API key.

**How to use in Colab**
1. Create a key at: https://platform.openai.com/account/api-keys
2. Run the cell below (you’ll be prompted to paste the key).

In [None]:
# Quote the requirement so the shell does not treat ">" as output redirection
!pip -q install --upgrade "openai>=1.40"

import os
from getpass import getpass

if 'OPENAI_API_KEY' not in os.environ or not os.environ['OPENAI_API_KEY']:
    print('Enter your OpenAI API key (hidden):')
    os.environ['OPENAI_API_KEY'] = getpass()
print('✅ API key set in environment.')

## 🧰 Step 2 — Minimal Client & Helpers

We’ll create:
- `call_llm` for general chat completions with retries
- `call_llm_json` to enforce **strict JSON** outputs (for CI pipelines)
- Prompt templates for **review**, **fix diff**, and **unit tests** (these follow in Steps 4 to 6)

In [None]:
from openai import OpenAI
import json
import time
from typing import Any, Dict, List, Optional

client = OpenAI()  # reads OPENAI_API_KEY from the environment
DEFAULT_MODEL = 'gpt-4o-mini'  # fast & cost-efficient for labs

class LLMError(Exception):
    """Raised when an LLM call still fails after all retries."""

def call_llm(messages: List[Dict[str, str]], model: str = DEFAULT_MODEL,
             temperature: float = 0.2, max_tokens: int = 800,
             retries: int = 2, **kwargs: Any) -> str:
    """Send a chat completion request, retrying with a linearly growing delay."""
    last_err = None
    for attempt in range(retries + 1):
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=messages,
                temperature=temperature,
                max_tokens=max_tokens,
                **kwargs
            )
            # message.content can be None, so normalize to a string
            return resp.choices[0].message.content or ''
        except Exception as e:
            last_err = e
            if attempt < retries:
                time.sleep(0.6 * (attempt + 1))  # wait 0.6s, 1.2s, ... between attempts
    raise LLMError(f'LLM call failed after retries: {last_err}')

def call_llm_json(user_prompt: str, schema_hint: str, system_hint: Optional[str] = None,
                  temperature: float = 0.1, max_tokens: int = 1200) -> dict:
    """Ask the model for a JSON object and parse it into a dict."""
    sys_msg = (system_hint or "Return strictly valid JSON only. No markdown, no commentary.")
    text = call_llm([
        {"role": "system", "content": sys_msg + " Respond ONLY with JSON."},
        {"role": "user", "content": f"Schema: {schema_hint}\nInput: {user_prompt}"}
    ], temperature=temperature, max_tokens=max_tokens,
        response_format={"type": "json_object"})  # JSON mode: ask the API for a JSON object
    # Raises json.JSONDecodeError if the reply is still not valid JSON (e.g. truncated output)
    return json.loads(text)
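
Models sometimes wrap JSON in a Markdown code fence even when told not to, which makes a bare `json.loads` fail. The sketch below shows one way to tolerate that before parsing. The name `parse_json_reply` is hypothetical and not part of the lab's helpers; it assumes the reply is either plain JSON or a single fenced block.

```python
import json

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker


def parse_json_reply(text: str) -> dict:
    """Parse a model reply as a JSON object, tolerating one surrounding code fence.

    Hypothetical helper for illustration; not part of the OpenAI SDK.
    """
    cleaned = text.strip()
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) and the closing fence.
        first_newline = cleaned.find("\n")
        if first_newline == -1:
            cleaned = ""
        else:
            cleaned = cleaned[first_newline + 1:-len(FENCE)].strip()
    data = json.loads(cleaned)  # raises json.JSONDecodeError on malformed input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


plain = '{"file": "accumulate.py", "issues": []}'
fenced = FENCE + "json\n" + plain + "\n" + FENCE
print(parse_json_reply(plain) == parse_json_reply(fenced))
```

If you adopt something like this, it could replace the final `json.loads(text)` call inside `call_llm_json`.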

## 🧪 Step 3 — A Buggy Python File (Demo Input)

We’ll start with a small function that has multiple issues:
- Arithmetic bug
- Mutable default argument
- Poor error handling
- Missing docstring and type hints

In [None]:
buggy_code = '''\
def accumulate(values=[]):
    total = 0
    for v in values:
        total -= v  # BUG: should add
    try:
        avg = total / len(values)
    except:
        avg = 0
    return total, avg
'''
print(buggy_code)

## 🧑‍⚖️ Step 4 — Code Review Prompt (Findings + Severity)

We’ll ask the model for a structured JSON report with severity for each issue. This is CI-friendly.

In [None]:
review_schema = '{"file":"str","summary":"str","issues":[{"title":"str","severity":"oneof:LOW,MEDIUM,HIGH,CRITICAL","category":"oneof:BUG,SECURITY,STYLE,PERF,MAINTAINABILITY","line":"int|null","explanation":"str","suggestion":"str"}]}'

review_prompt = f"""
Review the following Python source code. Identify bugs, security risks, style issues, performance concerns, and maintainability problems.
Provide a JSON report that matches the given schema. Estimate the line number if possible.

CODE:\n{buggy_code}
"""

review = call_llm_json(user_prompt=review_prompt, schema_hint=review_schema, system_hint=(
    "You are a precise code review bot for Python.\n"
    "Explain issues clearly and suggest concrete changes."
))
review
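
The schema string is only a hint to the model, so nothing guarantees the reply follows it. Before wiring the report into CI, it may be worth checking the structure yourself. The following is a minimal sketch; `validate_review` is a hypothetical helper, and the allowed values are copied from `review_schema` above. The sample reports are made up for illustration.

```python
ALLOWED_SEVERITIES = ("LOW", "MEDIUM", "HIGH", "CRITICAL")
ALLOWED_CATEGORIES = ("BUG", "SECURITY", "STYLE", "PERF", "MAINTAINABILITY")


def validate_review(report: dict) -> list:
    """Return a list of human-readable problems; an empty list means the report looks valid."""
    problems = []
    for key in ("file", "summary"):
        if not isinstance(report.get(key), str):
            problems.append(f"'{key}' should be a string")
    issues = report.get("issues")
    if not isinstance(issues, list):
        return problems + ["'issues' should be a list"]
    for i, issue in enumerate(issues):
        if not isinstance(issue, dict):
            problems.append(f"issues[{i}] should be an object")
            continue
        for key in ("title", "explanation", "suggestion"):
            if not isinstance(issue.get(key), str):
                problems.append(f"issues[{i}].{key} should be a string")
        if issue.get("severity") not in ALLOWED_SEVERITIES:
            problems.append(f"issues[{i}].severity is not an allowed value")
        if issue.get("category") not in ALLOWED_CATEGORIES:
            problems.append(f"issues[{i}].category is not an allowed value")
        line = issue.get("line")
        if line is not None and not isinstance(line, int):
            problems.append(f"issues[{i}].line should be an int or null")
    return problems


# Made-up sample data, shaped like the schema above.
sample_ok = {
    "file": "accumulate.py",
    "summary": "One arithmetic bug.",
    "issues": [{
        "title": "Subtracts instead of adding", "severity": "HIGH", "category": "BUG",
        "line": 4, "explanation": "total -= v", "suggestion": "use total += v",
    }],
}
print(validate_review(sample_ok))
```

Calling `validate_review(review)` on the real output and failing fast on a non-empty result keeps malformed reports out of later steps.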

## 🩹 Step 5 — Generate a Unified Diff Patch

Ask for a minimal patch in **unified diff** format (compatible with `git apply`).

In [None]:
diff_prompt = f"""
Given this Python file, produce a minimal **unified diff** patch that fixes the issues while preserving the function's intended behavior.
Only output the diff. Use filename `accumulate.py`.

CODE:\n{buggy_code}
"""
patch_text = call_llm([
    {"role": "system", "content": "You generate small, correct unified diffs. Output ONLY the diff."},
    {"role": "user", "content": diff_prompt}
], temperature=0.1, max_tokens=600)
print(patch_text)
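
The model's output is not guaranteed to be a well-formed patch, so a cheap structural check before trying `git apply` can save a confusing failure. Here is a rough sketch, assuming a patch needs a `---`/`+++` header pair and at least one `@@` hunk header. `looks_like_unified_diff` is a hypothetical name, and the check is deliberately shallow: it does not verify that the hunks actually apply.

```python
import difflib
import re

# Matches hunk headers such as "@@ -1,9 +1,12 @@" or "@@ -4 +4 @@"
HUNK_HEADER = re.compile(r"^@@ -\d+(,\d+)? \+\d+(,\d+)? @@")


def looks_like_unified_diff(text: str) -> bool:
    """Shallow check: a '---' line directly followed by '+++', plus at least one hunk header."""
    lines = text.strip().splitlines()
    has_headers = any(
        a.startswith("--- ") and b.startswith("+++ ")
        for a, b in zip(lines, lines[1:])
    )
    has_hunk = any(HUNK_HEADER.match(line) for line in lines)
    return has_headers and has_hunk


# Build a known-good diff with the standard library to exercise the check.
old = ["        total -= v\n"]
new = ["        total += v\n"]
sample_diff = "".join(
    difflib.unified_diff(old, new, "a/accumulate.py", "b/accumulate.py")
)
print(sample_diff)
print(looks_like_unified_diff(sample_diff))
```

Running `looks_like_unified_diff(patch_text)` on the model's reply gives a quick yes/no before you hand the text to any patch tool.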

## 🧪 Step 6 — Generate Unit Tests (pytest)

We’ll create minimal **pytest** tests using the intended behavior inferred from the review.

In [None]:
tests_prompt = f"""
Write **pytest** unit tests for the fixed version of the function in `accumulate.py`.
The current, buggy source is shown below; test the intended behavior, not the bugs.
Cover: empty list, positive numbers, negative numbers, and mixed values. Use clear asserts.
Return only Python test code (no explanations).

CODE:\n{buggy_code}
"""
tests_code = call_llm([
    {"role": "system", "content": "You write concise, correct pytest tests."},
    {"role": "user", "content": tests_prompt}
], temperature=0.2, max_tokens=700)
print(tests_code)
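
To judge whether the generated tests are sensible, it helps to have a hand-written reference for the intended behavior. The version below is one plausible fix, not the model's output. It assumes that an empty or missing input should return `(0, 0)`, which mirrors the original fallback of `avg = 0`.

```python
from typing import Optional, Sequence, Tuple


def accumulate(values: Optional[Sequence[float]] = None) -> Tuple[float, float]:
    """Return (total, average) of values; an empty or missing input gives (0, 0)."""
    if not values:
        # Assumption: keep the original fallback instead of raising on empty input.
        return 0, 0
    total = sum(values)  # fixes the original 'total -= v' bug
    return total, total / len(values)


# The four cases the test prompt asks for.
print(accumulate([]))
print(accumulate([1, 2, 3]))
print(accumulate([-1, -2, -3]))
print(accumulate([-2, 2, 4]))
```

Using `None` as the default avoids the shared mutable list, and the explicit empty check replaces the bare `except`.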

## 🧰 Step 7 — Wrap Into Reusable Functions

These helpers let you plug in any source string and get a **report**, **patch**, and **tests** back.

In [None]:
def review_source(filename: str, code_text: str) -> dict:
    schema = '{"file":"str","summary":"str","issues":[{"title":"str","severity":"oneof:LOW,MEDIUM,HIGH,CRITICAL","category":"oneof:BUG,SECURITY,STYLE,PERF,MAINTAINABILITY","line":"int|null","explanation":"str","suggestion":"str"}]}'
    prompt = f"Review file `{filename}` and return a JSON report matching schema.\nCODE:\n{code_text}"
    return call_llm_json(prompt, schema)

def suggest_patch(filename: str, code_text: str) -> str:
    prompt = f"Produce a minimal unified diff to fix issues in `{filename}`. Output ONLY the diff.\nCODE:\n{code_text}"
    return call_llm([
        {"role": "system", "content": "You generate minimal, correct unified diffs."},
        {"role": "user", "content": prompt}
    ], temperature=0.1, max_tokens=700)

def generate_tests(filename: str, code_text: str) -> str:
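    prompt = (
        f"Write pytest tests for the fixed version of `{filename}` covering typical and edge cases. "
        "Test the intended behavior, not the current bugs. Return only code.\n"
        f"CODE:\n{code_text}"  # include the source so the model knows what to test
    )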
    return call_llm([
        {"role": "system", "content": "You write concise, correct pytest tests."},
        {"role": "user", "content": prompt}
    ], temperature=0.2, max_tokens=800)

# Quick sanity check on helpers
report = review_source('accumulate.py', buggy_code)
patch  = suggest_patch('accumulate.py', buggy_code)
tests  = generate_tests('accumulate.py', buggy_code)
print(report['summary'])
print('\n--- DIFF ---\n', patch[:400], '...')
print('\n--- TESTS ---\n', tests[:400], '...')

## 📦 Step 8 — Batch Review (Simulated PR)

Feed a dict of `{filename: code}` and get a combined review with per-file issues and a summary. This simulates a **pull request** code review workflow.

In [None]:
sample_pr = {
    'accumulate.py': buggy_code,
    'utils.py': '''\
def is_even(n):
    # style issue: no typing, no docstring
    return n % 2 == 0
'''
}

def review_pr(files: Dict[str, str]) -> dict:
    """Review several files in one call and return a combined JSON report."""
    items = [{"file": fname, "code": code} for fname, code in files.items()]
    schema = '{"summary":"str","files":[{"file":"str","issues":[{"title":"str","severity":"oneof:LOW,MEDIUM,HIGH,CRITICAL","category":"oneof:BUG,SECURITY,STYLE,PERF,MAINTAINABILITY","line":"int|null","explanation":"str","suggestion":"str"}]}]}'
    user_prompt = json.dumps({"pr": items})
    return call_llm_json(user_prompt, schema, system_hint=(
        "You are a senior code reviewer. Aggregate issues across files and produce a concise summary."
    ))

pr_report = review_pr(sample_pr)
pr_report
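
A nested report is awkward to scan, so a reviewer-facing view usually lists the worst issues first. The sketch below flattens a report shaped like the `review_pr` schema and sorts it by severity. `flatten_issues` and the sample report are hypothetical; the real `pr_report` may be missing keys, which is why the code uses `.get` with defaults.

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def flatten_issues(pr_report: dict) -> list:
    """Return (severity, file, title) tuples, most severe first."""
    rows = []
    for entry in pr_report.get("files", []):
        for issue in entry.get("issues", []):
            rows.append((
                issue.get("severity", "LOW"),
                entry.get("file", "?"),
                issue.get("title", ""),
            ))
    # Unknown severities sort last; the sort is stable, so file order is kept within a level.
    rows.sort(key=lambda row: SEVERITY_ORDER.get(row[0], len(SEVERITY_ORDER)))
    return rows


# Made-up report in the same shape as the review_pr schema.
sample_report = {
    "summary": "Two files reviewed.",
    "files": [
        {"file": "utils.py", "issues": [
            {"title": "Missing type hints", "severity": "LOW"},
        ]},
        {"file": "accumulate.py", "issues": [
            {"title": "Subtracts instead of adding", "severity": "HIGH"},
            {"title": "Mutable default argument", "severity": "MEDIUM"},
        ]},
    ],
}
for severity, fname, title in flatten_issues(sample_report):
    print(f"{severity:<8} {fname}: {title}")
```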

## 🧭 What to try next

1. Swap `buggy_code` with your own file(s) — paste code or read from GitHub.
2. Tighten prompts for your org’s **style guide** and **security baselines**.
3. Have the bot return **SARIF** for integration with code scanning.
4. Post results to a PR via CI (e.g., GitHub Actions) and gate on `CRITICAL` issues.
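
For the gating idea in item 4, a CI job typically only needs an exit code. A minimal sketch of that decision follows, assuming a report shaped like the `review_pr` schema. `exit_code_for` and its `blocking` parameter are hypothetical names, and the sample reports are made up.

```python
def exit_code_for(pr_report: dict, blocking=("CRITICAL",)) -> int:
    """Return 1 if any issue has a blocking severity, otherwise 0."""
    for entry in pr_report.get("files", []):
        for issue in entry.get("issues", []):
            if issue.get("severity") in blocking:
                return 1
    return 0


# Made-up reports for illustration.
clean_report = {"files": [{"file": "utils.py", "issues": [{"severity": "LOW"}]}]}
blocked_report = {"files": [{"file": "auth.py", "issues": [{"severity": "CRITICAL"}]}]}

print(exit_code_for(clean_report))
print(exit_code_for(blocked_report))
print(exit_code_for(clean_report, blocking=("CRITICAL", "LOW")))
```

In a script run by CI, passing the result to `sys.exit(...)` would make the job fail whenever a blocking issue is present. Since model output varies between runs, treating the gate as advisory at first is a reasonable precaution.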

**You’ve completed Lab 2.** 🎉