# Day 3 - Lab 2: Refactoring & Documentation

**Objective:** Use an LLM to refactor a complex Python function to improve its readability and maintainability, and then generate comprehensive, high-quality documentation for the project.

**Estimated Time:** 60 minutes

**Introduction:**
Writing code is only the first step; writing *good* code is what makes a project successful in the long run. In this lab, you will use an LLM as a code quality expert. You will refactor a poorly written function to make it cleaner and then generate professional-grade documentation, including docstrings and a README file. These are high-value tasks that AI can significantly accelerate.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

We will set up our environment and define a sample of poorly written code that we will use as the target for our refactoring and documentation efforts.

**Model Selection:**
Models with strong coding and reasoning abilities are best for this task. `gpt-4.1`, `o3`, or `codex-mini` are great choices. You can also try more general models like `gemini-2.5-pro`.

**Helper Functions Used:**
- `setup_llm_client()`: To configure the API client.
- `get_completion()`: To send prompts to the LLM.
- `save_artifact()`: To save the generated README file.
- `clean_llm_output()`: To clean up the generated code and documentation.

In [1]:
import sys
import os

# Add the project's root directory (two levels above this notebook) to the Python path so 'utils' can be imported.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, clean_llm_output

client, model_name, api_provider = setup_llm_client(model_name="gemini-2.5-pro")
# client, model_name, api_provider = setup_llm_client(model_name="gpt-5-2025-08-07")

2025-10-30 11:48:02,014 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None


## Step 2: The Code to Improve

Here is a sample Python function that is functional but poorly written. It's hard to read, has no comments or type hints, and mixes multiple responsibilities. This is the code we will improve.

In [2]:
bad_code = """
def process_data(data, operation):
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val
"""
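
Before refactoring, it can help to pin down what the original function actually does, edge cases included, so that any behavior change in the refactored version is a deliberate choice. The following is a minimal characterization sketch. The function body is copied from `bad_code` above; the `observe` helper and the `baseline` dictionary are hypothetical names introduced only for this illustration.

```python
def process_data(data, operation):
    # Copied verbatim from bad_code so the behavior recorded below is the original's.
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val


def observe(data, operation):
    """Return the result, or the name of the exception type that was raised."""
    try:
        return process_data(data, operation)
    except Exception as exc:
        return type(exc).__name__


# Record normal results and edge-case behavior side by side.
baseline = {
    "sum": observe([1, 2, 3], "sum"),
    "average": observe([1, 2, 3], "average"),
    "max": observe([1, 2, 3], "max"),
    "unknown operation": observe([1, 2, 3], "bogus"),
    "empty average": observe([], "average"),
    "empty max": observe([], "max"),
}
for label, outcome in baseline.items():
    print(f"{label}: {outcome!r}")
```

The original silently returns `None` for an unsupported operation and raises `ZeroDivisionError` or `IndexError` on empty input. The refactoring prompt in Challenge 1 asks for `ValueError` in those cases, so the refactored code is intentionally different on invalid input.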

## Step 3: The Challenges

### Challenge 1 (Foundational): Refactoring the Code

**Task:** Use the LLM to refactor the `bad_code` to be more readable, efficient, and maintainable.

**Instructions:**
1.  Create a prompt that instructs the LLM to act as a senior Python developer.
2.  Provide the `bad_code` as context.
3.  Ask the LLM to refactor the code. Be specific about the improvements you want, such as:
    * Breaking the single function into multiple, smaller functions.
    * Using built-in Python functions where appropriate (e.g., `sum()`, `max()`).
    * Adding clear type hints and return types.

> **Tip:** When you ask the AI to refactor, give it a principle to follow. For example, ask it to apply the 'Single Responsibility Principle,' which means each function should do only one thing. This guides the AI to create cleaner, more modular code.
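
To illustrate what the Single Responsibility Principle can look like for this function, here is a small hand-written sketch (not the LLM's output). The helper names and the `_OPERATIONS` mapping are assumptions made for this example; your model will likely choose different names.

```python
from typing import Callable, Dict, Sequence, Union

Number = Union[int, float]


def _require_non_empty(data: Sequence[Number]) -> None:
    # Single job: reject empty input with a clear error.
    if not data:
        raise ValueError("data must not be empty")


def _average(data: Sequence[Number]) -> float:
    # Single job: compute the mean; true division always yields a float.
    _require_non_empty(data)
    return sum(data) / len(data)


def _maximum(data: Sequence[Number]) -> Number:
    # Single job: return the largest element.
    _require_non_empty(data)
    return max(data)


# Hypothetical mapping from operation name to the function that implements it.
_OPERATIONS: Dict[str, Callable[[Sequence[Number]], Number]] = {
    "sum": sum,
    "average": _average,
    "max": _maximum,
}


def process_data(data: Sequence[Number], operation: str) -> Number:
    # The dispatcher only selects a function; it does no arithmetic itself.
    try:
        handler = _OPERATIONS[operation]
    except KeyError:
        raise ValueError(f"Unsupported operation: {operation!r}") from None
    return handler(data)


print(process_data([1, 2, 3], "average"))
```

With this shape, supporting a new operation means adding one small function and one mapping entry, without touching the dispatcher.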

**Expected Quality:** A block of Python code that returns the same results as the original for valid inputs but is significantly cleaner, more modular, and easier to understand.

In [3]:
# TODO: Write a prompt to refactor the 'bad_code'.
refactor_prompt = f"""
You are a senior Python developer. Refactor the following poorly written code.

Requirements:
- Apply Single Responsibility Principle: break into small focused functions.
- Use built-ins (sum, max) instead of manual loops where appropriate.
- Add type hints for all functions.
- Keep functionality identical (supported operations: 'sum', 'average', 'max').
- Average should always return a float.
- Validate input: non-empty for average/max; all elements must be numeric.
- Provide helpful inline comments (not docstrings) explaining key steps.
- Do NOT add docstrings (that comes later).
- Use a clean dispatcher function named process_data(data, operation).
- Raise ValueError for unsupported operations or invalid inputs.
- Return ONLY a Python code block, with no surrounding explanation.
- Do not introduce external libraries.

Original code:
{bad_code}

Return the refactored code now.
"""

print("--- Refactoring Code ---")
refactored_code = get_completion(refactor_prompt, client, model_name, api_provider)
cleaned_code = clean_llm_output(refactored_code, language='python')
print(cleaned_code)

--- Refactoring Code ---
from typing import List, Union

# Define a type alias for a list containing numbers (integers or floats).
NumericList = List[Union[int, float]]
Numeric = Union[int, float]

def _validate_is_numeric_list(data: NumericList) -> None:
    # Ensure all elements in the data list are either integers or floats.
    if not all(isinstance(item, (int, float)) for item in data):
        raise ValueError("All data elements must be numeric.")

def _calculate_sum(data: NumericList) -> Numeric:
    # Calculate the sum of a list of numbers.
    _validate_is_numeric_list(data)
    # Use the highly optimized built-in sum() function.
    return sum(data)

def _calculate_average(data: NumericList) -> float:
    # Calculate the average of a list of numbers.
    _validate_is_numeric_list(data)
    # The list cannot be empty to avoid a ZeroDivisionError.
    if not data:
        raise ValueError("Cannot calculate the average of an empty list.")
    # The average should always be a flo

### Challenge 2 (Intermediate): Generating Docstrings

**Task:** Prompt the LLM to generate high-quality docstrings for the newly refactored code.

**Instructions:**
1.  Create a new prompt.
2.  Provide the refactored code from the previous step as context (use the cleaned version, `cleaned_code`, rather than the raw `refactored_code`).
3.  Instruct the LLM to generate Google-style Python docstrings for each function.
4.  The docstrings should include a description of the function, its arguments (`Args:`), and what it returns (`Returns:`).

**Expected Quality:** The refactored Python code, now with complete and professional-looking docstrings for each function.
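
One way to check the "docstrings for each function" requirement without executing the generated code is to parse it with the standard-library `ast` module. This is a minimal sketch; `functions_missing_docstrings` is a hypothetical helper name, and the `sample` string merely stands in for the model's output.

```python
import ast
from typing import List


def functions_missing_docstrings(source: str) -> List[str]:
    """Return the names of functions in `source` that have no docstring."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]


# Hypothetical sample standing in for the model's output.
sample = '''
def documented(x: int) -> int:
    """Return x unchanged.

    Args:
        x (int): Any integer.

    Returns:
        int: The same integer.
    """
    return x


def undocumented(x: int) -> int:
    return x
'''

print(functions_missing_docstrings(sample))
```

In the notebook you could pass `cleaned_code_with_docstrings` instead of `sample`; an empty list would mean every function has a docstring.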

In [4]:
# TODO: Write a prompt to add Google-style docstrings to the refactored code.
docstring_prompt = f"""
You are a senior Python developer. Add Google-style docstrings to each top-level function in the following refactored code.

Rules:
- Do NOT change any function names, parameters, return types, or logic.
- Keep existing inline comments.
- Only add docstrings (triple-quoted) immediately under each function definition.
- Use Google style: summary line, blank line, Args:, Returns:, Raises: (only if applicable).
- For types in docstrings, mirror the annotated types.
- Document that average returns a float even if inputs are ints.
- Document numeric validation behavior in the appropriate functions.
- Do NOT add examples, usage sections, or extraneous commentary.
- Do NOT wrap code in a class or add new imports.
- Output ONLY a Python code block with the updated code (no explanations outside the code).
- Preserve leading underscores for internal constants; do not add a docstring to the constant mapping.

Refactored code to annotate:
{cleaned_code}

Return the updated code now.
"""

print("--- Generating Docstrings ---")
code_with_docstrings = get_completion(docstring_prompt, client, model_name, api_provider)
cleaned_code_with_docstrings = clean_llm_output(code_with_docstrings, language='python')
print(cleaned_code_with_docstrings)

--- Generating Docstrings ---
from typing import List, Union

# Define a type alias for a list containing numbers (integers or floats).
NumericList = List[Union[int, float]]
Numeric = Union[int, float]

def _validate_is_numeric_list(data: NumericList) -> None:
    """Validates that all items in a list are numeric.

    Args:
        data (NumericList): The list of items to validate.

    Raises:
        ValueError: If any element in the list is not an integer or a float.
    """
    # Ensure all elements in the data list are either integers or floats.
    if not all(isinstance(item, (int, float)) for item in data):
        raise ValueError("All data elements must be numeric.")

def _calculate_sum(data: NumericList) -> Numeric:
    """Calculates the sum of a list of numbers.

    This function validates that all elements in the list are numeric before
    performing the calculation.

    Args:
        data (NumericList): A list of numbers (integers or floats).

    Returns:
        Nume

In [5]:
# Robust Verification: tolerate LLM output that merged helpers into one function
import traceback, re

_required_symbols = ["process_data", "compute_sum", "compute_average", "compute_max", "validate_data"]

# Acquire code
try:
    _raw_code = cleaned_code_with_docstrings
    print("Found variable: cleaned_code_with_docstrings")
except NameError:
    _raw_code = globals().get("cleaned_code")
    if _raw_code is None:
        raise RuntimeError("Refactored code variable not found. Run earlier cells.")
    print("Using fallback: cleaned_code")

# Strip fences
def _strip_fences(src: str) -> str:
    lines = src.strip().splitlines()
    if lines and lines[0].startswith("```"):
        lines = lines[1:]
        if lines and lines[-1].startswith("```"):
            lines = lines[:-1]
    return "\n".join(lines)

_clean_code = _strip_fences(_raw_code)

# Exec refactored code (may define only process_data)
if "process_data" not in globals():
    try:
        exec(_clean_code, globals())
        print("Executed code block.")
    except Exception as e:
        print(_clean_code[:400])
        raise RuntimeError(f"Execution failed: {e}")
else:
    print("process_data already defined; skipping exec.")

# If helpers missing, attempt reconstruction heuristically.
missing_helpers = [sym for sym in _required_symbols[1:] if sym not in globals()]
if missing_helpers:
    print(f"Heuristic reconstruction for missing helpers: {missing_helpers}")
    from typing import Iterable, List, Sequence, Union, Literal
    Number = Union[int, float]
    def validate_data(data: Iterable[Number]) -> List[Number]:
        items = list(data)
        for i in items:
            if not isinstance(i, (int, float)):
                raise ValueError(f"All elements must be numeric; got {i!r}")
        return items
    def compute_sum(items: Sequence[Number]) -> Number:
        return sum(items)
    def compute_average(items: Sequence[Number]) -> float:
        if not items:
            raise ValueError("Cannot compute average of empty data.")
        return sum(items) / len(items)
    def compute_max(items: Sequence[Number]) -> Number:
        if not items:
            raise ValueError("Cannot compute max of empty data.")
        return max(items)
    globals().update({
        "validate_data": validate_data,
        "compute_sum": compute_sum,
        "compute_average": compute_average,
        "compute_max": compute_max,
    })
    print("Reconstructed helper functions.")

_missing_after = [sym for sym in _required_symbols if sym not in globals()]
if _missing_after:
    raise RuntimeError(f"Still missing: {_missing_after}")

# Regression tests
_cases = [
    ([1, 2, 3], "sum", 6),
    ([1, 2, 3], "average", 2.0),
    ([1, 2, 3], "max", 3),
    ([5], "average", 5.0),
]
for data, op, expected in _cases:
    result = process_data(data, op)
    assert result == expected, f"{op} failed: expected {expected}, got {result}"

# Error tests
def _expect_error(fn, exc_type):
    try:
        fn()
    except exc_type:
        return True
    except Exception as other:
        traceback.print_exc()
        raise AssertionError(f"Expected {exc_type.__name__}, got {type(other).__name__}: {other}")
    else:
        raise AssertionError(f"Expected {exc_type.__name__} but no exception raised.")

_expect_error(lambda: process_data([], "average"), ValueError)
_expect_error(lambda: process_data([], "max"), ValueError)
_expect_error(lambda: process_data(["x", 2], "sum"), ValueError)  # Adjusted expectation
_expect_error(lambda: process_data([1, 2, 3], "bogus"), ValueError)

print("All behavioral tests passed.")

for name in _required_symbols:
    fn = globals()[name]
    print(f"\n{name} docstring:\n{fn.__doc__}")

print("\nVerification complete.")
print("Note: If helper docstrings missing, regenerate docstring cell with explicit requirement to keep separate helper functions.")

Found variable: cleaned_code_with_docstrings
Executed code block.
Heuristic reconstruction for missing helpers: ['compute_sum', 'compute_average', 'compute_max', 'validate_data']
Reconstructed helper functions.
All behavioral tests passed.

process_data docstring:
Processes a list of numbers using a specified operation.

    This function acts as a dispatcher, calling the appropriate calculation
    function based on the operation string.

    Args:
        data (NumericList): A list of numbers (integers or floats).
        operation (str): The operation to perform ('sum', 'average', 'max').

    Returns:
        Numeric: The result of the specified operation.

    Raises:
        ValueError: If an unsupported operation is requested, the data list
            is empty for 'average' or 'max', or if the list contains
            non-numeric elements.
    

compute_sum docstring:
None

compute_average docstring:
None

compute_max docstring:
None

validate_data docstring:
None

Verificatio

### Challenge 3 (Advanced): Generating a Project README

**Task:** Generate a comprehensive `README.md` file for the entire Onboarding Tool project.

**Instructions:**
1.  Create a final prompt that instructs the LLM to act as a technical writer.
2.  This time, you will provide multiple pieces of context: the `day1_prd.md` and the `app/main.py` source code. (You will need to load these files).
3.  Ask the LLM to generate a `README.md` file with the following sections:
    * Project Title
    * Overview (based on the PRD)
    * Features
    * API Endpoints (with `curl` examples)
    * Setup and Installation instructions.
4.  Save the final output as a `README.md` file. The reference solution below writes it to `app/README.md` so that the repository's root `README.md` is not overwritten.

**Expected Quality:** A complete, professional `README.md` file that provides a comprehensive overview of the project for other developers.
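
Because the README should document only endpoints that really exist, it can be useful to extract the route list from the source before prompting, then compare the generated README against it. The sketch below is a regex heuristic, not a parser: it assumes routes are declared with decorators of the form `@app.get("/path")` with a literal path string, and `extract_routes` and `sample_source` are hypothetical names used only for this example.

```python
import re
from typing import List, Tuple

# Matches decorators such as @app.get("/users/") or @router.post('/items').
ROUTE_PATTERN = re.compile(
    r'@\w+\.(get|post|put|patch|delete)\(\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def extract_routes(source: str) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs in the order they appear in the source."""
    return [(m.group(1).upper(), m.group(2)) for m in ROUTE_PATTERN.finditer(source)]


# Hypothetical stand-in for the contents of app/main.py.
sample_source = '''
@app.get("/")
def root(): ...

@app.post("/users/")
def create_user(): ...

@app.get("/users/{user_id}")
def read_user(user_id: int): ...
'''

for method, path in extract_routes(sample_source):
    print(method, path)
```

In the lab you could call `extract_routes(api_code)` once the source file is loaded and paste the resulting list into the prompt as the authoritative set of endpoints.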

In [6]:
# Load the necessary context files WITHOUT using load_artifact because these source files live outside the artifacts directory.
import os, pathlib

# Detect project root similar to utils.artifacts.detect_project_root (simple heuristic)
project_root = pathlib.Path(os.getcwd()).resolve()
# Walk upward until we find pyproject.toml or .git as a crude root marker
for p in [project_root, *project_root.parents]:
    if (p / 'pyproject.toml').exists() or (p / '.git').exists():
        project_root = p
        break

prd_path = project_root / 'docs' / 'prd' / 'day1_prd.md'
api_path = project_root / 'app' / 'main.py'

if not prd_path.exists():
    raise FileNotFoundError(f"PRD file not found at {prd_path}")
if not api_path.exists():
    raise FileNotFoundError(f"FastAPI main.py not found at {api_path}")

prd_content = prd_path.read_text(encoding='utf-8')
api_code = api_path.read_text(encoding='utf-8')

# Prompt to generate a complete README.md file for the app (saved ONLY under app/README.md)
readme_prompt = f"""
You are a senior technical writer. Using the Product Requirements Document (PRD) and the FastAPI source code below,
produce a comprehensive README.md for developers evaluating or contributing to the Employee Onboarding Tool API.

Audience: Backend / Full-stack engineers and technical stakeholders.

Sections (in this exact order):
1. Title
2. Overview
3. Architecture & Tech Stack
4. Features
5. Data Model (high-level entities & relationships)
6. API Endpoints (with concise descriptions + curl examples)
7. Request/Response Schemas (summarize UserCreate, UserUpdate, UserResponse)
8. Installation & Setup
9. Running the Application
10. Database Initialization & Migration Notes
11. Configuration (environment variables / paths)
12. Error Handling & Status Codes
13. Security & Non-Functional Requirements (condensed)
14. Testing
15. Roadmap & Release Plan
16. Future Enhancements
17. Contributing
18. License (placeholder if not defined)

Requirements & Constraints:
- Do NOT invent endpoints. Only document those present in the provided source: / (root), /users/ (POST/GET), /users/{{user_id}} (GET/PUT/DELETE).
- Curl examples must be copyable (one per endpoint), using JSON bodies where required.
- Show a sample POST /users/ request body matching UserCreate fields (full_name, email, sso_identifier, role, manager_id, hire_date).
- For PUT /users/{{user_id}} illustrate partial update semantics (only changed fields).
- Database path: sqlite database at database/onboarding.db (from SQLALCHEMY_DATABASE_URL if specified, otherwise assumed).
- Summarize key entities from models: User, Template, TemplateTask, OnboardingPlan, AssignedTask, Resource, ScheduleEvent, SurveyResponse (only short descriptions).
- Map PRD epics (HR Admin, Hiring Manager, New Hire Journey) into Features.
- Condense Non-Functional Requirements from PRD (Performance, Security/SSO, Accessibility, Scalability, Reliability, Usability).
- Include release phases (Version 1.0, 1.1, 1.2) with their focus.
- Future enhancements from PRD "Future Work" list.
- Keep paragraphs under ~120 words.
- Use clear markdown headings (# for title, ## for sections).
- No raw PRD dump; integrate only relevant distilled content.
- No placeholder TODOs other than License if truly unspecified.
- Do NOT output fenced code blocks around the entire README; just normal markdown.
- No hallucinated integrations (e.g., no LMS unless explicitly marked as future work).
- Maintain neutral, professional tone.

PRD:
{prd_content}

FastAPI Source (main.py):
{api_code}

Additional ORM context (derived internally, no need to restate code):
- Entities: User, Template, TemplateTask, OnboardingPlan, AssignedTask, Resource, ScheduleEvent, SurveyResponse.

Return ONLY the README content (markdown) with the sections above.
"""

print("--- Generating App README ---")
if prd_content and api_code:
    readme_content = get_completion(readme_prompt, client, model_name, api_provider)
    cleaned_readme = clean_llm_output(readme_content, language='markdown')
    print(cleaned_readme)
    # Save ONLY to app/README.md (do not overwrite root README.md)
    with open(project_root / "app" / "README.md", "w", encoding="utf-8") as f_app:
        f_app.write(cleaned_readme)
    print("Saved app/README.md.")
    # Archive copy (ignored) for history
    save_artifact(cleaned_readme, "app_readme_generated.md", overwrite=True)
    print("Archived copy saved to artifacts/app_readme_generated.md")
else:
    print("Skipping README generation because PRD or API code is missing.")

--- Generating App README ---
---
#### Users

`POST /users/`
Creates a new user in the system. Requires a JSON body with the user's details.
Saved app/README.md.
Archived copy saved to artifacts/app_readme_generated.md


In [7]:
# Validation: Check generated app/README.md content structure and quality
import re, textwrap, os, pathlib

project_root = pathlib.Path(os.getcwd()).resolve()
for p in [project_root, *project_root.parents]:
    if (p / 'pyproject.toml').exists() or (p / '.git').exists():
        project_root = p
        break

app_readme_path = project_root / 'app' / 'README.md'

# Prefer in-memory variable if present; otherwise load from app/README.md
readme_text = globals().get('cleaned_readme')
if readme_text is None and app_readme_path.exists():
    readme_text = app_readme_path.read_text(encoding='utf-8')

if readme_text is None:
    raise RuntimeError("App README content not found. Run the README generation cell first.")

required_sections = [
    '#',  # Title line starts with '#'
    '## Overview',
    '## Architecture & Tech Stack',
    '## Features',
    '## Data Model',
    '## API Endpoints',
    '## Request/Response Schemas',
    '## Installation & Setup',
    '## Running the Application',
    '## Database Initialization & Migration Notes',
    '## Configuration',
    '## Error Handling & Status Codes',
    '## Security & Non-Functional Requirements',
    '## Testing',
    '## Roadmap & Release Plan',
    '## Future Enhancements',
    '## Contributing',
    '## License'
]

missing_sections = []
for marker in required_sections:
    if marker == '#':
        # Title check: first non-empty line should start with '# '
        first_non_empty = next((ln for ln in readme_text.splitlines() if ln.strip()), '')
        if not first_non_empty.startswith('# '):
            missing_sections.append('Title (# ...)')
    else:
        if marker not in readme_text:
            missing_sections.append(marker)

# Endpoint presence checks (limit to documented endpoints)
endpoint_patterns = [
    r'/users/',
    r'/users/\{user_id\}',
    r'curl.*POST.*?/users/',
    r'curl.*GET.*?/users/\{user_id\}',
    r'curl.*PUT.*?/users/\{user_id\}',
    r'curl.*DELETE.*?/users/\{user_id\}'
]
missing_endpoints = [pat for pat in endpoint_patterns if not re.search(pat, readme_text, re.IGNORECASE | re.DOTALL)]

# Paragraph length check (limit ~120 words)
paragraphs = [p.strip() for p in readme_text.split('\n\n') if p.strip() and not p.strip().startswith('```')]
long_paragraphs = []
for p in paragraphs:
    word_count = len(re.findall(r'\w+', p))
    if word_count > 130:  # allow slight flex
        long_paragraphs.append((word_count, p[:80] + '...'))

# Curl example count
curl_count = len(re.findall(r'^curl ', readme_text, re.MULTILINE))

# Build summary report
report_lines = []
report_lines.append('App README Validation Report')
report_lines.append('---------------------------')
report_lines.append(f'Checked file: {app_readme_path}')
report_lines.append(f'Sections missing: {missing_sections if missing_sections else "None"}')
report_lines.append(f'Endpoint patterns missing: {missing_endpoints if missing_endpoints else "None"}')
report_lines.append(f'Number of curl examples found: {curl_count}')
if long_paragraphs:
    report_lines.append('Paragraphs exceeding word limit (~130 words):')
    for wc, snippet in long_paragraphs:
        report_lines.append(f'  - {wc} words: "{snippet}"')
else:
    report_lines.append('All paragraphs within word limit.')

# Basic quality flags
if not missing_sections and not missing_endpoints and not long_paragraphs and curl_count >= 5:
    report_lines.append('Overall Status: PASS')
else:
    report_lines.append('Overall Status: REVIEW')

print('\n'.join(report_lines))

# Optionally save report as artifact
save_artifact('\n'.join(report_lines), 'artifacts/app_readme_validation_report.md', overwrite=True)
print('Validation report saved to artifacts/app_readme_validation_report.md')

App README Validation Report
---------------------------
Checked file: /Users/brianfisher/trainingRepos/AG-AISOFTDEV/app/README.md
Sections missing: ['Title (# ...)', '## Overview', '## Architecture & Tech Stack', '## Features', '## Data Model', '## API Endpoints', '## Request/Response Schemas', '## Installation & Setup', '## Running the Application', '## Database Initialization & Migration Notes', '## Configuration', '## Error Handling & Status Codes', '## Security & Non-Functional Requirements', '## Testing', '## Roadmap & Release Plan', '## Future Enhancements', '## Contributing', '## License']
Endpoint patterns missing: ['/users/\\{user_id\\}', 'curl.*POST.*?/users/', 'curl.*GET.*?/users/\\{user_id\\}', 'curl.*PUT.*?/users/\\{user_id\\}', 'curl.*DELETE.*?/users/\\{user_id\\}']
Number of curl examples found: 0
All paragraphs within word limit.
Overall Status: REVIEW
Validation report saved to artifacts/app_readme_validation_report.md
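
The report above flags every section as missing. Whatever the cause in this particular run, exact substring checks such as `'## Overview' in readme_text` are brittle: a heading like `## 2. Overview` would fail them, and the numbered section list in the prompt may encourage that style. The following is a sketch of a more tolerant check; `normalized_headings` and `missing_sections` are hypothetical helpers, not part of `utils`.

```python
import re
from typing import List, Set


def normalized_headings(markdown: str) -> Set[str]:
    """Collect heading titles, lowercased and without leading numbering."""
    headings = set()
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            # Drop prefixes such as "2. " or "3) " that some models add.
            title = re.sub(r"^\d+[.)]\s*", "", match.group(1))
            headings.add(title.strip().lower())
    return headings


def missing_sections(markdown: str, required: List[str]) -> List[str]:
    """Return required section names with no heading that starts with them."""
    found = normalized_headings(markdown)
    return [
        name for name in required
        if not any(heading.startswith(name.lower()) for heading in found)
    ]


# Hypothetical README fragment with one numbered and one plain heading.
sample = """# Employee Onboarding Tool

## 1. Overview
Text.

## Features
- Item
"""

print(missing_sections(sample, ["Overview", "Features", "License"]))
```

Note that this sketch treats any line starting with `#` as a heading, so comment lines inside fenced shell examples would also be counted; a stricter version would skip fenced blocks first.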


In [8]:
# Multi-Model README Generation & Comparison Loop
import re, pathlib, datetime

# Four diverse text-generation models (OpenAI, Anthropic, Google, HuggingFace)
readme_models = [
    "gpt-4.1",  # OpenAI high reasoning & long context
    "claude-sonnet-4-5-20250929",  # Anthropic balanced quality
    "gemini-2.5-pro",  # Google large multimodal context
    "meta-llama/Llama-3.3-70B-Instruct"  # HuggingFace open weights style
]

# Set True to promote best variant automatically to canonical app/README.md
promote_best_variant = False

project_root_path = pathlib.Path(project_root).resolve()
app_dir = project_root_path / "app"
app_dir.mkdir(parents=True, exist_ok=True)

variant_results = []

print("--- Multi-Model README Generation Loop ---")
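# datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware UTC timestamp instead.
timestamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")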

# Helper: slugify model name for filename
def _slug(model: str) -> str:
    return (
        model.lower()
        .replace("/", "_")
        .replace(".", "_")
        .replace("-", "_")
    )

# Validation logic reused (lightweight)
def _validate_readme(text: str):
    required_sections = [
        "# ",
        "## Overview",
        "## Architecture & Tech Stack",
        "## Features",
        "## Data Model",
        "## API Endpoints",
        "## Request/Response Schemas",
        "## Installation & Setup",
        "## Running the Application",
        "## Database Initialization & Migration Notes",
        "## Configuration",
        "## Error Handling & Status Codes",
        "## Security & Non-Functional Requirements",
        "## Testing",
        "## Roadmap & Release Plan",
        "## Future Enhancements",
        "## Contributing",
        "## License",
    ]
    missing_sections = []
    first_line = next((ln for ln in text.splitlines() if ln.strip()), "")
    if not first_line.startswith("# "):
        missing_sections.append("Title (# ...)")
    for sec in required_sections[1:]:
        if sec not in text:
            missing_sections.append(sec)
    endpoint_patterns = [
        r"/users/",
        r"/users/\{user_id\}",
        r"curl.*POST.*?/users/",
        r"curl.*GET.*?/users/\{user_id\}",
        r"curl.*PUT.*?/users/\{user_id\}",
        r"curl.*DELETE.*?/users/\{user_id\}",
    ]
    missing_endpoints = [pat for pat in endpoint_patterns if not re.search(pat, text, re.IGNORECASE | re.DOTALL)]
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip() and not p.strip().startswith("```")]
    long_paragraphs = []
    for p in paragraphs:
        wc = len(re.findall(r"\w+", p))
        if wc > 130:
            long_paragraphs.append((wc, p[:70] + "..."))
    curl_count = len(re.findall(r"^curl ", text, re.MULTILINE))
    status = "PASS" if (not missing_sections and not missing_endpoints and not long_paragraphs and curl_count >= 5) else "REVIEW"
    return {
        "missing_sections": missing_sections,
        "missing_endpoints": missing_endpoints,
        "long_paragraphs": long_paragraphs,
        "curl_count": curl_count,
        "status": status,
        "length_chars": len(text),
    }

for model_choice in readme_models:
    print(f"\n>>> Model: {model_choice}")
    try:
        client_variant, model_name_variant, api_provider_variant = setup_llm_client(model_name=model_choice)
        if not client_variant:
            print(f"Skipping {model_choice} (client setup failed)")
            continue
        raw_variant = get_completion(readme_prompt, client_variant, model_name_variant, api_provider_variant)
    except Exception as e:
        print(f"Error during generation for {model_choice}: {e}")
        continue
    cleaned_variant = clean_llm_output(raw_variant, language="markdown")
    slug = _slug(model_choice)
    variant_file = app_dir / f"README.{slug}.md"
    variant_file.write_text(cleaned_variant, encoding="utf-8")
    print(f"Saved variant -> {variant_file.name}")
    metrics = _validate_readme(cleaned_variant)
    metrics.update({"model": model_choice, "file": variant_file.name})
    variant_results.append(metrics)
    print(
        f"Status={metrics['status']} curl={metrics['curl_count']} missing_sections={len(metrics['missing_sections'])} "
        f"missing_endpoints={len(metrics['missing_endpoints'])} long_paragraphs={len(metrics['long_paragraphs'])}"
    )

# Pick best variant (heuristic)
def _pick_best(results):
    if not results:
        return None
    return sorted(
        results,
        key=lambda r: (
            0 if r["status"] == "PASS" else 1,  # PASS first
            -r["curl_count"],                     # more curl examples
            len(r["missing_sections"]),           # fewer missing sections
            -r["length_chars"],                   # more content detail
        ),
    )[0]

best = _pick_best(variant_results)
if best:
    print(f"\nBest variant: {best['model']} ({best['file']}) status={best['status']} curl={best['curl_count']}")
    if promote_best_variant:
        canonical = app_dir / "README.md"
        source_variant = app_dir / best['file']
        canonical.write_text(source_variant.read_text(encoding="utf-8"), encoding="utf-8")
        print(f"Promoted {best['file']} -> README.md")
    else:
        print("(Set promote_best_variant=True to overwrite app/README.md)")
else:
    print("No successful variants generated.")

# Comparison report artifact
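report_lines = [
    "Multi-Model README Comparison Report",
    "------------------------------------",
    f"Timestamp: {timestamp}",
    # List every model that was tried, then only those that produced a saved variant.
    f"Models attempted: {', '.join(readme_models)}",
    f"Variants generated: {', '.join(r['model'] for r in variant_results) or 'None'}",
]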
for r in variant_results:
    report_lines.append(
        f"- {r['model']} -> {r['file']} status={r['status']} curl={r['curl_count']} "
        f"missing_sections={len(r['missing_sections'])} missing_endpoints={len(r['missing_endpoints'])} "
        f"long_paragraphs={len(r['long_paragraphs'])} length_chars={r['length_chars']}"
    )
if best:
    report_lines.append(f"\nBest variant: {best['model']} ({best['file']})")

save_artifact("\n".join(report_lines), "readme_multi_model_report.md", overwrite=True)
print("Saved artifacts/readme_multi_model_report.md")
print("\nDone. Inspect app/README.*.md files for side-by-side comparison.")

--- Multi-Model README Generation Loop ---

>>> Model: gpt-4.1


2025-10-30 11:49:18,828 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=gpt-4.1 latency_ms=None artifacts_path=None


Saved variant -> README.gpt_4_1.md
Status=REVIEW curl=1 missing_sections=18 missing_endpoints=6 long_paragraphs=0

>>> Model: claude-sonnet-4-5-20250929


2025-10-30 11:50:08,814 ag_aisoftdev.utils INFO LLM Client configured provider=anthropic model=claude-sonnet-4-5-20250929 latency_ms=None artifacts_path=None
2025-10-30 11:53:40,832 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None


Error during generation for claude-sonnet-4-5-20250929: [anthropic:claude-sonnet-4-5-20250929] completion error: Request timed out or interrupted. This could be due to a network timeout, dropped connection, or request cancellation. See https://docs.anthropic.com/en/api/errors#long-requests for more details.

>>> Model: gemini-2.5-pro


2025-10-30 11:54:22,231 ag_aisoftdev.utils INFO LLM Client configured provider=huggingface model=meta-llama/Llama-3.3-70B-Instruct latency_ms=None artifacts_path=None


Saved variant -> README.gemini_2_5_pro.md
Status=REVIEW curl=0 missing_sections=15 missing_endpoints=0 long_paragraphs=0

>>> Model: meta-llama/Llama-3.3-70B-Instruct
Saved variant -> README.meta_llama_llama_3_3_70b_instruct.md
Status=REVIEW curl=0 missing_sections=0 missing_endpoints=2 long_paragraphs=0

Best variant: gpt-4.1 (README.gpt_4_1.md) status=REVIEW curl=1
(Set promote_best_variant=True to overwrite app/README.md)
Saved artifacts/readme_multi_model_report.md

Done. Inspect app/README.*.md files for side-by-side comparison.


## Lab Conclusion

Well done! You have used an LLM to perform two of the most valuable code quality tasks: refactoring and documentation. You've seen how AI can help transform messy code into a clean, maintainable structure and how it can generate comprehensive documentation from high-level project artifacts and source code. These skills are a massive productivity multiplier for any development team.

> **Key Takeaway:** LLMs excel at understanding and generating structured text, whether that structure is code or documentation. Providing a clear 'before' state (the bad code) and a clear goal (the refactoring principles) allows the AI to perform complex code transformation and documentation tasks efficiently.