# Day 3 - Lab 2: Refactoring & Documentation

**Objective:** Use an LLM to refactor a complex Python function to improve its readability and maintainability, and then generate comprehensive, high-quality documentation for the project.

**Estimated Time:** 60 minutes

**Introduction:**
Writing code is only the first step; writing *good* code is what makes a project successful in the long run. In this lab, you will use an LLM as a code quality expert. You will refactor a poorly written function to make it cleaner and then generate professional-grade documentation, including docstrings and a README file. These are high-value tasks that AI can significantly accelerate.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

We will set up our environment and define a sample of poorly written code that we will use as the target for our refactoring and documentation efforts.

**Model Selection:**
Models with strong coding and reasoning abilities are best for this task. `gpt-4.1`, `o3`, or `codex-mini` are great choices. You can also try more general models like `gemini-2.5-pro`.

**Helper Functions Used:**
- `setup_llm_client()`: To configure the API client.
- `get_completion()`: To send prompts to the LLM.
- `load_artifact()`: To load the PRD and API source code used as context in Challenge 3.
- `save_artifact()`: To save the generated README file.
- `clean_llm_output()`: To clean up the generated code and documentation.
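
To make the cleaning step concrete, here is a minimal sketch of the kind of post-processing a helper like `clean_llm_output()` might perform. The function `strip_code_fences` is a hypothetical stand-in written for illustration. It is not the actual implementation in `utils`, which may behave differently.

```python
import re


def strip_code_fences(text: str) -> str:
    """Hypothetical stand-in for a cleaning helper.

    Returns the body of the first fenced block in `text`, or the stripped
    text unchanged if no fence is present.
    """
    match = re.search(r"```[\w+-]*\n(.*?)```", text, flags=re.DOTALL)
    return (match.group(1) if match else text).strip()


raw = "Here is the code:\n```python\nprint('hi')\n```\nHope that helps!"
print(strip_code_fences(raw))
```

A sketch this simple only keeps the first fenced block, so it would not suit a README that legitimately contains several fences of its own.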

In [5]:
import sys
import os

# Add the project's root directory (two levels above this notebook) to the
# Python path so that 'utils' can be imported.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, clean_llm_output, load_artifact

client, model_name, api_provider = setup_llm_client(model_name="gemini-2.5-pro")

2025-10-28 16:07:19,675 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None


## Step 2: The Code to Improve

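Here is a sample Python function that works but is poorly written. It is hard to read, has no comments or type hints, and mixes multiple responsibilities. It also handles edge cases badly: empty input raises `ZeroDivisionError` for `'average'` and `IndexError` for `'max'`, and an unrecognized operation silently returns `None`. This is the code we will improve.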

In [2]:
bad_code = """
def process_data(data, operation):
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val
"""
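
Before refactoring, it can help to record what the original function actually does, including its edge cases. The sketch below is one possible way to do that: it executes the `bad_code` source and captures either the result or the exception name for a few inputs. The `describe` helper and the `baseline` dictionary are illustrative names, not part of the lab's `utils` module.

```python
# Same source as the `bad_code` string above, repeated so this sketch runs on its own.
bad_code = """
def process_data(data, operation):
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val
"""

namespace = {}
exec(bad_code, namespace)  # defines process_data inside `namespace`
process_data = namespace["process_data"]


def describe(data, operation):
    """Return the result, or the name of the exception the call raises."""
    try:
        return process_data(data, operation)
    except Exception as exc:
        return type(exc).__name__


baseline = {
    "sum": describe([1, 2, 3], "sum"),
    "average": describe([1, 2, 3], "average"),
    "max": describe([1, 2, 3], "max"),
    "sum of empty": describe([], "sum"),
    "average of empty": describe([], "average"),
    "max of empty": describe([], "max"),
    "unknown operation": describe([1, 2, 3], "median"),
}
for case, outcome in baseline.items():
    print(f"{case}: {outcome!r}")
```

A baseline like this gives you something concrete to compare the LLM's refactored version against.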


## Step 3: The Challenges

### Challenge 1 (Foundational): Refactoring the Code

**Task:** Use the LLM to refactor the `bad_code` to be more readable, efficient, and maintainable.

**Instructions:**
1.  Create a prompt that instructs the LLM to act as a senior Python developer.
2.  Provide the `bad_code` as context.
3.  Ask the LLM to refactor the code. Be specific about the improvements you want, such as:
    * Breaking the single function into multiple, smaller functions.
    * Using built-in Python functions where appropriate (e.g., `sum()`, `max()`).
    * Adding clear type hints and return types.

> **Tip:** When you ask the AI to refactor, give it a principle to follow. For example, ask it to apply the 'Single Responsibility Principle,' which means each function should do only one thing. This guides the AI to create cleaner, more modular code.

**Expected Quality:** A block of Python code that is functionally identical to the original but is significantly cleaner, more modular, and easier to understand.
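
One way to check the "functionally identical" requirement is to run the original and the refactored code side by side on the same inputs. The sketch below assumes a refactor that uses one function per operation plus a dispatch dictionary. The names `original_process_data`, `OPERATIONS`, and `same` are hypothetical, and the LLM's actual output will likely differ in structure.

```python
import math
from typing import Callable, Dict, Optional, Sequence, Union

Numeric = Union[int, float]


def original_process_data(data, operation):
    """The logic from `bad_code`, copied here for comparison."""
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val


# One small function per responsibility (hypothetical names).
def calculate_sum(numbers: Sequence[Numeric]) -> Numeric:
    return sum(numbers)


def calculate_average(numbers: Sequence[Numeric]) -> float:
    return sum(numbers) / len(numbers)


def find_maximum(numbers: Sequence[Numeric]) -> Numeric:
    return max(numbers)


OPERATIONS: Dict[str, Callable[[Sequence[Numeric]], Numeric]] = {
    "sum": calculate_sum,
    "average": calculate_average,
    "max": find_maximum,
}


def process_data(data: Sequence[Numeric], operation: str) -> Optional[Numeric]:
    """Dispatch to the matching function; unknown operations return None,
    mirroring the original's fall-through behavior."""
    handler = OPERATIONS.get(operation)
    return handler(data) if handler is not None else None


def same(a, b) -> bool:
    """Compare two results, tolerating tiny floating-point differences."""
    if isinstance(a, float) or isinstance(b, float):
        return math.isclose(a, b)
    return a == b


samples = [[1, 2, 3], [10.5, 20.0, 5.5], [-4, 0, 4], [7]]
mismatches = [
    (data, op)
    for data in samples
    for op in OPERATIONS
    if not same(original_process_data(data, op), process_data(data, op))
]
print("mismatches:", mismatches)
```

The comparison uses `math.isclose` for floats because the built-in `sum()` on Python 3.12 and later may use a more accurate summation method than a plain loop, so the last bits can differ. The samples are all non-empty: on an empty list the original raises `IndexError` for `'max'` while `max()` raises `ValueError`, so a strict equivalence check would need to decide whether that difference matters.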

In [3]:
# TODO: Write a prompt to refactor the 'bad_code'.
refactor_prompt = f"""
You are a senior Python developer with expertise in clean code principles and best practices. Your task is to refactor the following poorly written Python function to make it more readable, efficient, and maintainable.

**Original Code:**
```python
{bad_code}
```

**Refactoring Requirements:**
1. Apply the Single Responsibility Principle - break the monolithic function into smaller, focused functions
2. Use built-in Python functions where appropriate (sum(), max(), etc.) instead of manual loops
3. Add comprehensive type hints for all parameters and return types
4. Improve variable naming for clarity
5. Add meaningful comments where necessary
6. Ensure the refactored code is functionally identical to the original
7. Make the code more Pythonic and follow PEP 8 style guidelines

**Expected Output:**
Provide only the refactored Python code with proper imports, type hints, and clear function separation. Do not include any explanatory text or markdown formatting - just clean, production-ready Python code.

Focus on creating modular, reusable functions that each handle a single responsibility. The refactored code should be significantly more maintainable and easier to understand than the original.
"""

print("--- Refactoring Code ---")
refactored_code = get_completion(refactor_prompt, client, model_name, api_provider)
cleaned_code = clean_llm_output(refactored_code, language='python')
print(cleaned_code)

--- Refactoring Code ---
from typing import Sequence, Callable

# A type alias for numeric types to improve readability and maintainability.
Numeric = int | float


def calculate_sum(numbers: Sequence[Numeric]) -> Numeric:
    """
    Calculates the sum of a sequence of numbers using the built-in sum() function.

    Args:
        numbers: A sequence of integers or floats.

    Returns:
        The total sum of the numbers.
    """
    return sum(numbers)


def calculate_average(numbers: Sequence[Numeric]) -> float:
    """
    Calculates the average of a sequence of numbers.

    Args:
        numbers: A sequence of integers or floats.

    Returns:
        The average of the numbers as a float.

    Raises:
        ValueError: If the input sequence is empty.
    """
    if not numbers:
        raise ValueError("Cannot calculate the average of an empty sequence.")
    return sum(numbers) / len(numbers)


def find_maximum(numbers: Sequence[Numeric]) -> Numeric:
    """
    Finds the ma

### Challenge 2 (Intermediate): Generating Docstrings

**Task:** Prompt the LLM to generate high-quality docstrings for the newly refactored code.

**Instructions:**
1.  Create a new prompt.
2.  Provide the refactored code from the previous step (the cleaned output stored in `cleaned_code`) as context.
3.  Instruct the LLM to generate Google-style Python docstrings for each function.
4.  The docstrings should include a description of the function, its arguments (`Args:`), and what it returns (`Returns:`).

**Expected Quality:** The refactored Python code, now with complete and professional-looking docstrings for each function.
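
Since LLM output can silently skip a function, a quick automated check of the result is worthwhile. The following sketch uses the standard-library `ast` module to list functions whose docstrings lack an `Args:` or `Returns:` section. The helper name `docstring_gaps` and the `sample` source are made up for illustration.

```python
import ast


def docstring_gaps(source: str, required=("Args:", "Returns:")) -> dict:
    """Map each function name to the required sections missing from its docstring."""
    gaps = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node) or ""
            missing = [section for section in required if section not in doc]
            if missing:
                gaps[node.name] = missing
    return gaps


sample = '''
def calculate_sum(numbers):
    """Calculates the sum of a sequence of numbers.

    Args:
        numbers: A sequence of integers or floats.

    Returns:
        The total sum of the numbers.
    """
    return sum(numbers)


def find_maximum(numbers):
    return max(numbers)
'''

print(docstring_gaps(sample))
```

In the lab you could pass `cleaned_code_with_docstrings` to a check like this. An empty dictionary would indicate that every function has both sections, though it says nothing about whether the descriptions are accurate.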

In [4]:
# TODO: Write a prompt to add Google-style docstrings to the refactored code.
docstring_prompt = f"""
You are a senior Python developer and technical documentation expert. Your task is to add comprehensive Google-style docstrings to the following refactored Python code.

**Refactored Code:**
```python
{cleaned_code}
```

**Docstring Requirements:**
1. Use Google-style docstring format for all functions
2. Each docstring must include:
   - A clear, concise description of what the function does
   - Args section listing all parameters with their types and descriptions
   - Returns section describing the return value type and what it represents
   - Any relevant examples or usage notes if applicable

**Google Docstring Format Example:**
```python
def function_name(param1: int, param2: str) -> bool:
    \"\"\"Brief description of what the function does.
    
    More detailed description if needed, explaining the function's
    purpose, behavior, and any important implementation details.
    
    Args:
        param1 (int): Description of the first parameter.
        param2 (str): Description of the second parameter.
    
    Returns:
        bool: Description of what the function returns.
    
    Example:
        >>> result = function_name(42, "example")
        >>> print(result)
        True
    \"\"\"
```

**Expected Output:**
Provide the complete Python code with all functions properly documented using Google-style docstrings. Ensure each docstring is:
- Professional and clear
- Accurately describes the function's behavior
- Includes proper type information
- Follows Google docstring conventions exactly

Do not include any explanatory text or markdown formatting - just the complete Python code with docstrings.
"""

print("--- Generating Docstrings ---")
code_with_docstrings = get_completion(docstring_prompt, client, model_name, api_provider)
cleaned_code_with_docstrings = clean_llm_output(code_with_docstrings, language='python')
print(cleaned_code_with_docstrings)

--- Generating Docstrings ---
from typing import Sequence, Callable

# A type alias for numeric types to improve readability and maintainability.
Numeric = int | float


def calculate_sum(numbers: Sequence[Numeric]) -> Numeric:
    """Calculates the sum of a sequence of numbers.

    This function takes a sequence of numeric types (integers or floats) and
    computes their sum using the built-in `sum()` function.

    Args:
        numbers (Sequence[Numeric]): A sequence of integers or floats.

    Returns:
        Numeric: The total sum of the numbers in the sequence.

    Example:
        >>> calculate_sum([1, 2, 3, 4, 5])
        15
        >>> calculate_sum([10.5, 20.0, 5.5])
        36.0
    """
    return sum(numbers)


def calculate_average(numbers: Sequence[Numeric]) -> float:
    """Calculates the arithmetic mean (average) of a sequence of numbers.

    This function returns a float, even if the input sequence contains only
    integers. It will raise an error if the input se

### Challenge 3 (Advanced): Generating a Project README

**Task:** Generate a comprehensive `README.md` file for the entire Onboarding Tool project.

**Instructions:**
1.  Create a final prompt that instructs the LLM to act as a technical writer.
2.  This time, you will provide multiple pieces of context: the `day1_prd.md` and the `app/main.py` source code. (You will need to load these files).
3.  Ask the LLM to generate a `README.md` file with the following sections:
    * Project Title
    * Overview (based on the PRD)
    * Features
    * API Endpoints (with `curl` examples)
    * Setup and Installation instructions.
4.  Save the final output to `README.md` in the project's root directory.

**Expected Quality:** A complete, professional `README.md` file that provides a comprehensive overview of the project for other developers.
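
Models sometimes wrap the requested file in conversational text or leave out a requested section, so it is worth validating the response before saving it. The sketch below shows one possible check. `strip_preamble`, `missing_sections`, and `REQUIRED_SECTIONS` are hypothetical names, and the `raw` string is a shortened, made-up response rather than real model output.

```python
import re

REQUIRED_SECTIONS = ["Overview", "Features", "API Endpoints", "Setup and Installation"]


def strip_preamble(markdown: str) -> str:
    """Drop any text that precedes the first level-1 heading."""
    match = re.search(r"^# .+$", markdown, flags=re.MULTILINE)
    return markdown[match.start():] if match else markdown


def missing_sections(markdown: str, required=REQUIRED_SECTIONS) -> list:
    """Return the required section names that appear in no heading."""
    headings = [
        heading.lower()
        for heading in re.findall(r"^#{1,6}[ \t]+(.+)$", markdown, flags=re.MULTILINE)
    ]
    return [name for name in required if not any(name.lower() in h for h in headings)]


raw = (
    "Of course. Here is the complete file:\n\n---\n\n"
    "# New Hire Experience Platform API\n\n"
    "## Overview\n...\n\n## Key Features\n...\n\n## API Endpoints\n...\n"
)
readme = strip_preamble(raw)
print(readme.splitlines()[0])
print(missing_sections(readme))
```

The heading match is a plain regular expression, so a `#` comment inside a fenced shell example would also count as a heading. For a quick sanity check that is usually acceptable.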

In [None]:
# Load the necessary context files
prd_content = load_artifact("artifacts/day1_prd.md")
api_code = load_artifact("app/main.py")

# TODO: Write a prompt to generate a complete README.md file.
readme_prompt = f"""
You are a technical writer. Generate a comprehensive README.md file for the New Hire Experience Platform.

**Product Requirements Document:**
{prd_content}

**API Source Code:**
{api_code}

**Required README Sections:**
Create a README.md file with these 5 sections only:

1. **Project Title** - Clear, professional title for the project

2. **Overview** - Based on the PRD executive summary and problem statement. Include:
   - What the project does
   - The problem it solves
   - Key benefits and value proposition

3. **Features** - List all major features based on:
   - User stories from the PRD
   - Technical capabilities from the API code
   - Core functionality available

4. **API Endpoints** - Complete documentation with curl examples:
   - List all available endpoints from the FastAPI code
   - Include HTTP methods (GET, POST, PUT, DELETE)
   - Provide working curl examples for each endpoint
   - Show request/response formats

5. **Setup and Installation** - Step-by-step instructions:
   - Prerequisites (Python version, dependencies)
   - Installation commands
   - Database setup
   - How to run the application
   - How to test the API

**Output Requirements:**
- Generate ONLY the README.md content in Markdown format
- Start directly with "# Project Title"
- Include practical, working curl examples
- Make each section comprehensive and detailed
- Ensure all sections are complete and professional

Generate the README.md content now:
"""

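print("--- Generating Project README ---")
if prd_content and api_code:
    readme_content = get_completion(readme_prompt, client, model_name, api_provider)
    # Clean the raw response before printing and saving, as in Challenges 1 and 2.
    cleaned_readme = clean_llm_output(readme_content, language='markdown')
    print(cleaned_readme)
    save_artifact(cleaned_readme, "README.md")
else:
    print("Skipping README generation because PRD or API code is missing.")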

--- Generating Project README ---
Of course. As a senior technical writer and developer advocate, I will create a comprehensive, professional `README.md` file based on the provided product requirements and API source code. Here is the complete file:

---

# New Hire Experience Platform API

![Status](https://img.shields.io/badge/status-in%20development-orange)
![Python Version](https://img.shields.io/badge/python-3.8+-blue.svg)
![Framework](https://img.shields.io/badge/Framework-FastAPI-05998b)
![License](https://img.shields.io/badge/License-MIT-yellow.svg)

A robust backend service designed to revolutionize the employee onboarding process, providing a centralized, engaging, and efficient experience for new hires, HR administrators, and managers.

## Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
  - [Product Features](#product-features)
  - [Technical Features](#technical-features)
- [API Endpoints](#api-endpoints)
  - [Users](#users)
- [Setup and Installat

## Lab Conclusion

Well done! You have used an LLM to perform two of the most valuable code quality tasks: refactoring and documentation. You've seen how AI can help transform messy code into a clean, maintainable structure and how it can generate comprehensive documentation from high-level project artifacts and source code. These skills are a massive productivity multiplier for any development team.

> **Key Takeaway:** LLMs excel at understanding and generating structured text, whether that structure is code or documentation. Providing a clear 'before' state (the bad code) and a clear goal (the refactoring principles) allows the AI to perform complex code transformation and documentation tasks efficiently.