# Empathetic Code Reviewer
Turn blunt code review comments into **empathetic, constructive, and educational** guidance.  
Outputs a single, submission-ready **Markdown report** with structured per-comment analysis, a **Holistic Summary**, and a **Consolidated Improved Code** (plus a unified diff).

In [1]:
!pip -q install transformers accelerate sentencepiece


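The block below loads the first model in the candidate list that loads successfully and defines the `llm_chat` helper used for every generation.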

In [3]:
import torch, sys
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_CANDIDATES = [
    "Qwen/Qwen2.5-3B-Instruct",
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # fallback if the previous model fails to load
]

def load_first_available(candidates):
    last_err = None
    for mid in candidates:
        try:
            tok = AutoTokenizer.from_pretrained(mid, use_fast=True)
            mdl = AutoModelForCausalLM.from_pretrained(
                mid,
                torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
                device_map="auto"
            )
            return mid, tok, mdl
        except Exception as e:
            last_err = e
            print(f"Skipping {mid}: {e.__class__.__name__}", file=sys.stderr)
    raise RuntimeError(f"No model loaded. Last error: {last_err}")

model_id, tokenizer, model = load_first_available(MODEL_CANDIDATES)

def llm_chat(messages, max_new_tokens=700, temperature=0.6):
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=temperature)
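    # Note: this decodes the whole sequence, so the returned text contains the prompt as well as the reply.
    return tokenizer.decode(out[0], skip_special_tokens=True)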

print("✅ Loaded:", model_id)
print("CUDA:", torch.cuda.is_available())

✅ Loaded: Qwen/Qwen2.5-3B-Instruct
CUDA: True


Defines the task payload (a code snippet plus its review comments), guesses the language, buckets each review comment by severity (high/medium/low) using simple keyword rules, and prepares a small library of reference links (PEP 8, Google's code review guide). Prints basic stats to verify parsing.

In [26]:
import re, textwrap, json
from collections import Counter

# input JSON

payload = {
  "code_snippet": "def get_active_users(users):\n    results = []\n    for u in users:\n        if u.is_active == True and u.profile_complete == True:\n            results.append(u)\n    return results",
  "review_comments": [
    "This is inefficient. Don't loop twice conceptually.",
    "Variable 'u' is a bad name.",
    "Boolean comparison '== True' is redundant."
  ]
}

# guess the programming language of the snippet

def guess_lang(code: str):
    c = code.strip()
    if re.search(r"\bdef\s+\w+\s*\(.*\):", c) and "return" in c:
        return "python"
    if "function " in c or "=> {" in c:
        return "javascript"
    if re.search(r"\bpublic\s+class\b", c):
        return "java"
    return "python"

# severity classifier (keyword based, fast and predictable)

SEVERITY_KEYWORDS = {
    "high":  ["bug","security","vulnerab","crash","wrong","broken","leak","npe","overflow","inefficient","complexity"],
    "medium":["redundant","readability","naming","confusing","duplicate","dead code","smell","refactor","style issue"],
    "low":   ["nit","minor","typo","spacing","format","cosmetic","trivial"]
}

def classify_severity(text: str):
    t = text.lower()
    for k in ["high","medium","low"]:
        if any(w in t for w in SEVERITY_KEYWORDS[k]):
            return k
    return "medium"

# some resource links (citing only when directly relevant)
STYLE_LINKS = {
    "python": {
        "pep8": "https://peps.python.org/pep-0008/",
        "naming": "https://peps.python.org/pep-0008/#naming-conventions",
        "programming_recs": "https://peps.python.org/pep-0008/#programming-recommendations",
        "list_comprehensions": "https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions"
    },
    "general": {
        "code_review": "https://google.github.io/eng-practices/review/",
        "complexity": "https://en.wikipedia.org/wiki/Time_complexity"
    }
}

lang = guess_lang(payload["code_snippet"])
sev_counts = Counter(classify_severity(c) for c in payload["review_comments"])
print("Language guessed:", lang)
print("Severity counts:", dict(sev_counts))
print("Comments:", len(payload["review_comments"]))


Language guessed: python
Severity counts: {'high': 1, 'medium': 2}
Comments: 3
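
Because `classify_severity` uses plain substring checks in high → medium → low order, a short keyword such as `nit` also fires inside longer words. The sketch below reproduces the rule with a trimmed keyword table and contrasts it with a word-boundary variant. `classify_severity_wb` is a hypothetical helper introduced here for illustration; it is not part of the notebook.

```python
import re

# Trimmed copy of the notebook's keyword table (illustration only).
SEVERITY_KEYWORDS = {
    "high": ["bug", "security", "vulnerab", "crash", "inefficient"],
    "medium": ["redundant", "naming", "refactor"],
    "low": ["nit", "minor", "typo", "format"],
}

def classify_severity(text: str) -> str:
    """Substring matching, checked in high -> medium -> low order."""
    t = text.lower()
    for level in ("high", "medium", "low"):
        if any(word in t for word in SEVERITY_KEYWORDS[level]):
            return level
    return "medium"

def classify_severity_wb(text: str) -> str:
    """Hypothetical variant: a keyword must start at a word boundary."""
    t = text.lower()
    for level in ("high", "medium", "low"):
        if any(re.search(r"\b" + re.escape(word), t) for word in SEVERITY_KEYWORDS[level]):
            return level
    return "medium"

comment = "Please initialise the cache here."
print(classify_severity(comment))     # 'nit' occurs inside 'initialise'
print(classify_severity_wb(comment))  # no keyword starts a word here
```

Anchoring only the start of the keyword keeps stems such as `vulnerab` working while avoiding matches in the middle of unrelated words.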


The block below defines the mentoring-style system prompt, builds a severity-aware user prompt (with optional PEP 8 links), and generates a formatted response for one sample comment using the locally loaded Qwen chat model. This verifies the structure (Positive Rephrasing / The ‘Why’ / Suggested Improvement) before the full report is generated.

In [21]:
import re
import textwrap


# empathetic prompt

SYSTEM_PROMPT = """You are an empathetic senior software engineer and mentor.
For EACH comment, produce EXACTLY these sections:

- **Positive Rephrasing** — Always start with genuine appreciation that acknowledges their effort, reasoning, or approach (e.g., "Great start on the logic here!", "I can see you're thinking about...", "Nice approach to handling..."). Then gently suggest the improvement using collaborative language ("we can make this more efficient", "let's optimize this").

- **The 'Why'** — Name the concrete principle (performance/readability/PEP 8/complexity) and explain the practical impact, especially for larger inputs or team maintainability. Focus on benefits rather than problems.

- **Suggested Improvement** — A minimal, correct code block for the given language that applies the suggestion to THIS snippet.

Tone guidelines:
- Use "we" instead of "you" when suggesting changes
- Focus on improvements and benefits, not flaws
- Keep explanations educational, not critical
- Only cite 1–2 official resources if they DIRECTLY support the suggestion

Keep it specific to the provided code; avoid generic advice."""

def pick_resources(comment_text, lang):
    """Pick 0–2 relevant official links based on the comment (relies on STYLE_LINKS from Block 2)"""
    links = []
    t = comment_text.lower()

    if lang == "python":
        if any(k in t for k in ["name", "naming", "variable"]):
            links.append(("PEP 8 (Naming)", STYLE_LINKS["python"]["naming"]))
        if any(k in t for k in ["boolean", "== true", "== false", "redundant", "truthy", "truth"]):
            links.append(("PEP 8 (Programming recs)", STYLE_LINKS["python"]["programming_recs"]))
        if any(k in t for k in ["inefficient", "performance", "speed", "loop", "optimiz"]):
            links.append(("List comprehensions", STYLE_LINKS["python"]["list_comprehensions"]))

    return links[:2]

def build_prompt(code, comment, severity, lang):
    """Build a contextual prompt for the LLM based on code, comment, and severity"""
    tone = {
        "high": "Keep calm and supportive but be clear about impact on performance/production quality.",
        "medium": "Mentoring, specific, and encouraging.",
        "low": "Very encouraging; this is a small polish."
    }.get(severity, "Mentoring, specific, and encouraging.")

    res = pick_resources(comment, lang)
    resources_list = "\n".join(f"- {name}: {url}" for name, url in res) if res else "- (no official links needed)"

    return f"""{tone}

### Analysis of Comment: "{comment}"

* **Positive Rephrasing:** Start with appreciation (e.g., "Great start on the logic here!") and gently propose the improvement.
* **The 'Why':** Name the specific principle (e.g., performance/readability/PEP 8/complexity) and explain its impact, especially for larger inputs or team readability.
* **Suggested Improvement:**

```{lang}
<minimal code applying the suggestion to THIS snippet>
```

(Optional)
Resources: <1–2 only if directly relevant>

Code to review:
```{lang}
{code}
```

Official resources you MAY cite (only if relevant):
{resources_list}

IMPORTANT:
Keep it concise, kind, and SPECIFIC to this code.
Do NOT invent libraries or APIs.
Prefer minimal diffs that clearly improve clarity/performance/conventions.
"""

def generate_section(code, comment, lang):
    """Generate an empathetic code review response for a single comment"""
    severity = classify_severity(comment)  # from Block 2
    user_prompt = build_prompt(code, comment, severity, lang)

    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt}
    ]

    out = llm_chat(messages, max_new_tokens=700, temperature=0.6)

    if 'strip_chat_roles' in globals():
        out = strip_chat_roles(out)

    return out

first_comment = payload["review_comments"][0]
preview = generate_section(payload["code_snippet"], first_comment, lang)
print(preview[:1500])

### Analysis of Comment: "This is inefficient. Don't loop twice conceptually."

* **Positive Rephrasing:** Great start on the logic here! We can make this more efficient by leveraging a single loop and a list comprehension to combine the conditions into one check.

* **The 'Why':** Using a single loop and a list comprehension will significantly reduce the number of iterations needed, which can lead to better performance, especially as the input size grows. This approach also enhances readability and maintainability, making it easier for other developers to understand and modify the code.

* **Suggested Improvement:**

```python
def get_active_users(users):
    return [u for u in users if u.is_active and u.profile_complete]
```

Resources:
- List comprehensions: https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions

IMPORTANT:
Keep it concise, kind, and SPECIFIC to this code.
Do NOT invent libraries or APIs.
Prefer minimal diffs that clearly improve clarity/performa

The block below generates the complete Markdown report for all comments using the Qwen chat model: per-comment sections (Positive Rephrasing / The ‘Why’ / Suggested Improvement), a Holistic Summary, and a Consolidated Improved Code with a unified diff. It saves the report to disk and prints a preview.
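
The unified-diff step can be tried on its own, without a model. The sketch below assumes a hand-written improved version of the sample function (standing in for whatever the model returns) and passes the same arguments to `difflib.unified_diff` that the report code uses.

```python
import difflib

original = (
    "def get_active_users(users):\n"
    "    results = []\n"
    "    for u in users:\n"
    "        if u.is_active == True and u.profile_complete == True:\n"
    "            results.append(u)\n"
    "    return results"
)
# Hand-written improved version, used only to illustrate the diff step.
improved = (
    "def get_active_users(users):\n"
    "    return [user for user in users if user.is_active and user.profile_complete]"
)

diff_lines = list(difflib.unified_diff(
    original.splitlines(),
    improved.splitlines(),
    fromfile="original",
    tofile="improved",
    lineterm="",  # the split lines carry no newlines, so the headers should not add any
))
print("\n".join(diff_lines))
```

If the two inputs are identical, `unified_diff` yields nothing and the report's diff block would be empty, which is a quick signal that the consolidated fix changed nothing.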

In [28]:
import re
import difflib
import textwrap
from datetime import datetime

def extract_code_block(text):
    """
    Extracts code from markdown code blocks (```language ... ```)

    Args:
        text: String containing markdown with code blocks

    Returns:
        The extracted code content, or None if no code block found
    """
    pattern = r"```[a-zA-Z0-9_+\-]*\n(.*?)```"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None

def generate_holistic_summary(code, lang):
    """
    Generates an encouraging summary paragraph that ties together all the feedback.
    This adds the "human touch" that makes the report feel cohesive.

    Args:
        code: Original code snippet
        lang: Programming language

    Returns:
        AI-generated summary paragraph
    """
    prompt = f"""You are an empathetic senior engineer.

Write a short concluding paragraph (4–6 sentences) that:
- Encourages the author
- Summarizes the main improvements suggested across comments
- Mentions expected benefits (readability, performance, conventions)

Keep it concrete and non-generic.

Code (for context):
```{lang}
{code}
```"""

    messages = [
        {"role": "system", "content": "Be concise, supportive, and specific to the code."},
        {"role": "user", "content": prompt}
    ]

    return llm_chat(messages, max_new_tokens=400, temperature=0.6)

def generate_consolidated_fix(code, lang):
    """
    Creates a single improved version of the code that incorporates
    all the suggestions from the individual comments.

    Args:
        code: Original code snippet
        lang: Programming language

    Returns:
        Improved code with all fixes applied
    """
    prompt = f"""Rewrite the function(s) by applying all valid suggestions (naming, boolean truth tests, clarity, efficiency).

Only output a single fenced code block in {lang}, nothing else.
Keep changes minimal but meaningful.

```{lang}
{code}
```"""

    messages = [
        {"role": "system", "content": "Output only a single fenced code block with the improved code."},
        {"role": "user", "content": prompt}
    ]

    output = llm_chat(messages, max_new_tokens=500, temperature=0.4)

    # extracting only the code from the AI response
    fixed_code = extract_code_block(output) or output.strip()
    return fixed_code

def make_report(payload, lang):
    """
    Main function that orchestrates the complete report generation.

    This is where everything comes together:
    1. Process each review comment individually
    2. Generate holistic summary
    3. Create consolidated improved code
    4. Format everything into a professional markdown report

    Args:
        payload: JSON object with code_snippet and review_comments
        lang: Programming language

    Returns:
        Complete markdown report as a string
    """
    code = payload["code_snippet"]
    comments = payload["review_comments"]

    # generating individual sections for each comment
    sections = []
    improved_snippets = []

    print(f"Generating {len(comments)} sections...")

    for i, comment in enumerate(comments, 1):
        section = generate_section(code, comment, lang)
        sections.append(section)

        code_block = extract_code_block(section)
        if code_block:
            improved_snippets.append(code_block)

        print(f"  - Done comment {i}/{len(comments)}")

    print("Generating holistic summary...") #summary here
    holistic_summary = generate_holistic_summary(code, lang)
    print("Generating consolidated fix...")
    fixed_code = generate_consolidated_fix(code, lang)


    diff = "\n".join(difflib.unified_diff(
        code.splitlines(),
        fixed_code.splitlines(),
        fromfile="original",
        tofile="improved",
        lineterm=""
    ))

    title = f"# Empathetic Code Review Report\n_Generated: {datetime.utcnow().isoformat()}Z_\n"

    intro = textwrap.dedent(f"""
    **Context:** Translating direct critique into supportive, educational guidance.
    **Input Language:** {lang}
    **Comments Processed:** {len(comments)}
    """).strip()

    body = "\n\n---\n\n".join(sections)

    conclusion = f"""

---

## Holistic Summary

{holistic_summary}

## Consolidated Improved Code

```{lang}
{fixed_code}
```

<details><summary>Diff (original → improved)</summary>

```diff
{diff}
```

</details>
"""

    return title + "\n\n" + intro + "\n\n" + body + conclusion

def save_and_preview_report(payload, lang, output_path="/content/empathetic_code_review_report.md"):
    """
    Generates the complete report and saves it to a file.
    Also shows a preview for quick validation.

    Args:
        payload: Input JSON with code and comments
        lang: Programming language
        output_path: Where to save the markdown file
    """
    print("🚀 Starting full report generation...")


    report_markdown = make_report(payload, lang)  # build the full report

    with open(output_path, "w", encoding="utf-8") as f:  # save it to disk
        f.write(report_markdown)

    print("\n=== Report preview (first 1200 chars) ===\n")
    print(report_markdown[:1200])

    if len(report_markdown) > 1200:
        print(f"\n... (truncated, full length: {len(report_markdown)} chars)")

    print(f"\n✅ Saved report to: {output_path}")

    return report_markdown

Strips chat-role boilerplate (system, user) from generations, keeps only the assistant’s analysis section, and re-validates the report so headings match the number of comments. Also upgrades the consolidated-fix prompt using targeted hints inferred from the comments (naming, boolean truth tests, efficiency) to ensure the improved code meaningfully changes. Rebuilds and saves the final Markdown.

In [10]:
import os, json, re
from datetime import datetime

try:
    from google.colab import files as colab_files
except ImportError:
    colab_files = None

REQUIRED_LABELS = [
    "**Positive Rephrasing:**",
    "**The 'Why':**",
    "**Suggested Improvement:**"
]

def validate_report(markdown_text: str, expected_comments: int):
    errors = []

    # count the per-comment sections

    found_sections = len(re.findall(r"^### Analysis of Comment:", markdown_text, flags=re.MULTILINE))
    if found_sections != expected_comments:
        errors.append(f"Expected {expected_comments} sections, found {found_sections}.")
    for lbl in REQUIRED_LABELS:
        cnt = markdown_text.count(lbl)
        if cnt < expected_comments:
            errors.append(f"Label {lbl} appears {cnt} times (< {expected_comments}).")
    # each section should contain at least one fenced code block
    per_sec_blocks = re.findall(r"### Analysis of Comment:.*?```[a-zA-Z0-9_+\-]*\n.*?\n```", markdown_text, flags=re.S)
    if len(per_sec_blocks) < expected_comments:
        errors.append(f"Fenced code blocks per section look low ({len(per_sec_blocks)}/{expected_comments}).")
    return errors

def generate_report_from_obj(obj, out_path="/content/empathetic_code_review_report.md"):
    lang_local = guess_lang(obj["code_snippet"])
    md = make_report(obj, lang_local)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(md)
    errs = validate_report(md, len(obj.get("review_comments", [])))
    print(f"Saved: {out_path} @ {datetime.utcnow().isoformat()}Z")
    if errs:
        print("Validation notes:")
        for e in errs: print(" -", e)
    else:
        print("Validation: PASS (structure looks good)")
    return out_path, md

def generate_report_from_json(json_input, out_path="/content/empathetic_code_review_report.md"):
    if os.path.exists(json_input):
        with open(json_input, "r", encoding="utf-8") as f:
            obj = json.load(f)
    else:
        obj = json.loads(json_input)
    return generate_report_from_obj(obj, out_path)

out_path, md_text = generate_report_from_obj(payload, "/content/empathetic_code_review_report.md")

# optional: download the report when running in Colab

if colab_files is not None and os.path.exists(out_path):
    try:
        colab_files.download(out_path)
    except Exception as e:
        print("Colab download note:", e)


Generating 3 sections...
  - Done comment 1/3
  - Done comment 2/3
  - Done comment 3/3
Generating holistic summary...
Generating consolidated fix...
Saved: /content/empathetic_code_review_report.md @ 2025-08-14T13:12:52.909108Z
Validation notes:
 - Expected 3 sections, found 6.
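
A likely explanation for the count of 6, given the code above: `build_prompt` places a `### Analysis of Comment:` heading in the prompt template, and the first `llm_chat` decodes the prompt together with the reply, so each section can carry the heading twice. The toy strings below are made up to mimic that shape and are not real model output.

```python
import re

HEADING = "### Analysis of Comment:"

# Made-up stand-in for one decoded generation: the prompt echo followed by the reply.
prompt_echo = "user\n" + HEADING + " <comment>\n* **Positive Rephrasing:** <template text>\n"
reply = "assistant\n" + HEADING + " <comment>\n* **Positive Rephrasing:** Nice start!\n"
decoded = prompt_echo + reply

def count_headings(markdown_text: str) -> int:
    """Count headings that start a line, the way validate_report does."""
    return len(re.findall(r"^### Analysis of Comment:", markdown_text, flags=re.MULTILINE))

def keep_after_last_heading(text: str, heading: str = HEADING) -> str:
    """Keep only the text from the last heading onward."""
    last_index = text.rfind(heading)
    return text[last_index:].strip() if last_index != -1 else text.strip()

print(count_headings(decoded))                           # 2
print(count_headings(keep_after_last_heading(decoded)))  # 1
```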



Cleans model output (removes system/user echoes, dedupes sections), applies deterministic Python fixes (== True/False → truthy, u→user, loop→list comprehension) with LLM fallback if unchanged, then assembles the final Markdown (per-comment sections, Holistic Summary, Consolidated Improved Code + diff) and validates & saves to /content/empathetic_code_review_report_CLEAN.md.
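
The `== True/False → truthy` rewrite mentioned above can be sketched in isolation. The function name `drop_redundant_bool_comparisons` and the `is_banned` attribute are hypothetical, chosen only for this illustration; the two regular expressions are the ones used in the cell that follows.

```python
import re

def drop_redundant_bool_comparisons(code: str) -> str:
    """Rewrite `x == True` as `x` and `x == False` as `not x` (illustrative only)."""
    code = re.sub(r"(\b\w[\w\.]*\b)\s*==\s*True\b", r"\1", code)
    code = re.sub(r"(\b\w[\w\.]*\b)\s*==\s*False\b", r"not \1", code)
    return code

line = "if u.is_active == True and u.is_banned == False:"
print(drop_redundant_bool_comparisons(line))

# Caveat: the rewrite is only behavior-preserving for real booleans.
# For example, 2 == True is False, while bool(2) is True.
print(2 == True, bool(2))
```

A textual rewrite like this cannot see types, so it is safest on attributes that are known to hold `True` or `False`.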

In [23]:
import re
import difflib
import textwrap
from datetime import datetime

def keep_after_last_heading(text, heading="### Analysis of Comment:"):
    """
    If AI duplicated content, keep only the last occurrence of each section.
    This fixes the "Expected 3 sections, found 6" validation error.
    """
    last_index = text.rfind(heading)
    if last_index != -1:
        text = text[last_index:]
    return text.strip()

def strip_chat_roles(text):
    """
    Removes AI model artifacts like standalone 'system'/'user' lines
    that sometimes leak into the generated sections.
    """
    text = re.sub(r'^(system|user)\s*$.*?(?=^###|\Z)', '', text,
                  flags=re.MULTILINE | re.DOTALL)

    # keeping only content from the last proper heading
    text = keep_after_last_heading(text, "### Analysis of Comment:")
    return text.strip()

def strip_roles_inline(text):
    """
    For short text like holistic summaries that got role headers mixed in.
    """
    return re.sub(r'^(system|user)\s*$', '', text, flags=re.MULTILINE).strip()

# deterministic Python code fixer

def python_minifix(code: str) -> str:
    """
    Applies common Python improvements deterministically:
    - Removes redundant '== True' and '== False'
    - Renames single-letter variables to descriptive names
    - Converts simple filter loops to list comprehensions

    This ensures that the "Consolidated Improved Code" actually shows improvements.
    """
    fixed = code

    fixed = re.sub(r'(\b\w[\w\.]*\b)\s*==\s*True\b', r'\1', fixed)
    fixed = re.sub(r'(\b\w[\w\.]*\b)\s*==\s*False\b', r'not \1', fixed)

    fixed = re.sub(r'for\s+u\s+in\s+([A-Za-z_]\w*)\s*:', r'for user in \1:', fixed)
    # updating all references: "u.is_active" -> "user.is_active"

    fixed = re.sub(r'\bu\.', 'user.', fixed)
    # fixing append calls: "results.append(u)" -> "results.append(user)"

    fixed = re.sub(r'results\.append\(\s*u\s*\)', 'results.append(user)', fixed)


    pattern = re.compile(
        r'(def\s+[A-Za-z_]\w*\s*\([^\)]*\):\s*'          # func definition
        r'(?:[ \t]*#.*\n|[ \t]*\n|[ \t]*.*\n)*?)'        # optional comments/lines
        r'([ \t]*)results\s*=\s*\[\]\s*\n'               # results = []
        r'\2for\s+user\s+in\s+([A-Za-z_]\w*)\s*:\s*\n'   # for user in users:
        r'\2[ \t]+if\s+(.+?):\s*\n'                      # if condition:
        r'\2[ \t]+results\.append\(user\)\s*\n'          # results.append(user)
        r'\2return\s+results',                           # return results
        re.DOTALL
    )

    def convert_to_list_comprehension(match):
        func_header = match.group(1)
        indentation = match.group(2)
        iterable_name = match.group(3)
        condition = match.group(4).strip()

        comprehension = f"{indentation}return [user for user in {iterable_name} if {condition}]"
        return func_header + comprehension

    fixed = pattern.sub(convert_to_list_comprehension, fixed)
    return fixed


def generate_section_sanitized(code, comment, lang):
    """
    Generates a section using Block 3's function, then cleans up artifacts.
    """
    raw_section = generate_section(code, comment, lang)
    return strip_chat_roles(raw_section)

def generate_holistic_summary_sanitized(code, lang):
    """
    Generates holistic summary with better error handling and cleanup.
    """
    prompt = f"""You are an empathetic senior engineer.

Write a short concluding paragraph (4–6 sentences) that:
- Encourages the author
- Summarizes the main improvements suggested across comments
- Mentions expected benefits (readability, performance, conventions)

Keep it concrete and non-generic.

Code (for context):
```{lang}
{code}
```"""

    messages = [
        {"role": "system", "content": "Be concise, supportive, and specific to the code."},
        {"role": "user", "content": prompt}
    ]

    output = llm_chat(messages, max_new_tokens=300, temperature=0.5)
    return strip_roles_inline(output)

def generate_consolidated_fix_strong(code, lang, comments):
    """
    Enhanced consolidated fix that combines deterministic improvements with AI.

    Strategy:
    1. First try deterministic fixes (guaranteed to work)
    2. If no changes, use AI with specific hints from the comments
    3. This ensures we always get meaningful improvements
    """

    fixed_code = code
    if lang == "python":
        fixed_code = python_minifix(code)

    if fixed_code.strip() == code.strip():
        hints = []
        combined_comments = " ".join(comments).lower()

        if any(word in combined_comments for word in ["name", "naming", "variable"]):
            hints.append("use descriptive variable names (e.g., user instead of u)")

        if any(phrase in combined_comments for phrase in ["redundant", "== true", "== false"]):
            hints.append("use Pythonic truth tests instead of == True/False")

        if any(word in combined_comments for word in ["inefficient", "performance"]):
            hints.append("prefer list comprehensions for linear filtering")

        hint_instruction = "; ".join(hints) if hints else "apply common Pythonic cleanups"

        prompt = f"""Rewrite the function(s) by applying these improvements: {hint_instruction}.

Only output a single fenced code block in {lang}, nothing else.
Keep changes minimal but meaningful.

```{lang}
{code}
```"""

        messages = [
            {"role": "system", "content": "Output only a single fenced code block with the improved code."},
            {"role": "user", "content": prompt}
        ]

        ai_output = llm_chat(messages, max_new_tokens=500, temperature=0.4)

        code_match = re.search(r"```[a-zA-Z0-9_+\-]*\n(.*?)```", ai_output, re.DOTALL)
        fixed_code = (code_match.group(1).strip() if code_match else ai_output.strip())

    return fixed_code

def make_report_clean(payload, lang):
    """
    Generates a clean, well-formatted report that should pass all validations.
    This replaces the problematic make_report() from Block 4.
    """
    code = payload["code_snippet"]
    comments = payload["review_comments"]

    print(f"Rebuilding {len(comments)} sanitized sections...")
    sections = []

    for i, comment in enumerate(comments, 1):
        section = generate_section_sanitized(code, comment, lang)
        sections.append(section)
        print(f"  - Section {i}/{len(comments)} completed")

    print("Generating holistic summary...")
    holistic_summary = generate_holistic_summary_sanitized(code, lang)

    print("Generating consolidated fix...")
    fixed_code = generate_consolidated_fix_strong(code, lang, comments)

    diff = "\n".join(difflib.unified_diff(
        code.splitlines(),
        fixed_code.splitlines(),
        fromfile="original",
        tofile="improved",
        lineterm=""
    ))

    title = f"# Empathetic Code Review Report\n_Generated: {datetime.utcnow().isoformat()}Z_\n"

    intro = textwrap.dedent(f"""
    **Context:** Translating direct critique into supportive, educational guidance.
    **Input Language:** {lang}
    **Comments Processed:** {len(comments)}
    """).strip()

    body = "\n\n---\n\n".join(sections)

    conclusion = f"""

---

## Holistic Summary

{holistic_summary}

## Consolidated Improved Code

```{lang}
{fixed_code}
```

<details><summary>Diff (original → improved)</summary>

```diff
{diff}
```

</details>
"""

    return title + "\n\n" + intro + "\n\n" + body + conclusion


def rebuild_and_validate(payload, lang):
    """
    Main function to generate the clean report and validate it.
    """
    print("🧹 Starting clean report generation...")

    clean_report = make_report_clean(payload, lang)

    output_path = "/content/empathetic_code_review_report_CLEAN.md"
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(clean_report)

    validation_errors = validate_report(clean_report, len(payload["review_comments"]))

    print(f"\n{'✅ Validation: PASS' if not validation_errors else '❌ Validation: ISSUES'}")

    for error in validation_errors:
        print(f" - {error}")

    print(f"\n✅ Saved cleaned report to: {output_path}")

    return output_path, clean_report



Build + Preview + Download — Runs rebuild_and_validate(payload, lang), prints the first 1200 chars of the final Markdown, and offers a Colab download of /content/empathetic_code_review_report_CLEAN.md.

In [24]:
from datetime import datetime
import os

out_path, clean_md = rebuild_and_validate(payload, lang)  # using block 6

print("\n=== Final Report Preview (first 1200 chars) ===\n")
print(clean_md[:1200])

try:
    from google.colab import files as colab_files
    if os.path.exists(out_path):
        colab_files.download(out_path)
except Exception as e:
    print("Colab download note:", e)


🧹 Starting clean report generation...
Rebuilding 3 sanitized sections...
  - Section 1/3 completed
  - Section 2/3 completed
  - Section 3/3 completed
Generating holistic summary...
Generating consolidated fix...

✅ Validation: PASS

✅ Saved cleaned report to: /content/empathetic_code_review_report_CLEAN.md

=== Final Report Preview (first 1200 chars) ===

# Empathetic Code Review Report
_Generated: 2025-08-14T13:47:53.504042Z_


**Context:** Translating direct critique into supportive, educational guidance.
**Input Language:** python  
**Comments Processed:** 3

### Analysis of Comment: "This is inefficient. Don't loop twice conceptually."

* **Positive Rephrasing:** Great start on the logic here! We can make this more efficient by combining conditions into a single check.

* **The 'Why':** By combining the conditions into a single check, we reduce the number of iterations needed, which can significantly improve performance, especially for large datasets. This approach also enhances r


Clean Generation Tail + Summary Check — Redefines llm_chat to return only newly generated tokens (removes system/user echoes), rebuilds the report, and prints the cleaned Holistic Summary to verify no prompt scaffolding remains.
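
The slicing idea can be illustrated without loading a model. For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so dropping the first `input_len` positions leaves only the reply. The token IDs below are made-up integers in plain lists, standing in for the tensors the real code slices.

```python
# Toy stand-in for token IDs; the real code slices a tensor returned by model.generate.
prompt_ids = [101, 7592, 2088]     # made-up IDs for the templated prompt
new_ids = [2023, 2003, 1996, 102]  # made-up IDs for the model's reply

# Decoder-only generation returns the prompt followed by the new tokens.
full_output = prompt_ids + new_ids

input_len = len(prompt_ids)
generated_only = full_output[input_len:]  # same slice as out[0][input_len:]
print(generated_only)
```

Slicing by token count should be more reliable than stripping role markers from decoded text, because it does not depend on how the chat template renders the `system` and `user` turns.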

In [25]:
import re, os

def llm_chat(messages, max_new_tokens=700, temperature=0.6):
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    input_len = inputs["input_ids"].shape[-1]
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
        )
    gen_tokens = out[0][input_len:]
    return tokenizer.decode(gen_tokens, skip_special_tokens=True).strip()

out_path_fixed, clean_md_fixed = rebuild_and_validate(payload, lang)

m = re.search(r"## Holistic Summary\s*\n(.*?)(?=\n## Consolidated Improved Code|\Z)", clean_md_fixed, re.S)
print("\n=== Holistic Summary (clean) ===\n")
print((m.group(1).strip() if m else "Not found"))

🧹 Starting clean report generation...
Rebuilding 3 sanitized sections...
  - Section 1/3 completed
  - Section 2/3 completed
  - Section 3/3 completed
Generating holistic summary...
Generating consolidated fix...

✅ Validation: PASS

✅ Saved cleaned report to: /content/empathetic_code_review_report_CLEAN.md

=== Holistic Summary (clean) ===

I appreciate your dedication to improving the codebase. By refactoring the condition check into a single line using the `and` operator, we've streamlined the logic and made the function more readable. Additionally, converting the boolean checks directly into conditions within the loop will enhance performance by reducing the number of comparisons. These changes not only make the code cleaner but also ensure it runs efficiently, benefiting both maintenance and scalability.
