# Day 3 - Lab 2: Refactoring & Documentation

**Objective:** Use an LLM to refactor a complex Python function to improve its readability and maintainability, and then generate comprehensive, high-quality documentation for the project.

**Estimated Time:** 60 minutes

**Introduction:**
Writing code is only the first step; writing *good* code is what makes a project successful in the long run. In this lab, you will use an LLM as a code quality expert. You will refactor a poorly written function to make it cleaner and then generate professional-grade documentation, including docstrings and a README file. These are high-value tasks that AI can significantly accelerate.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

First, set up the environment and import the helper functions. The poorly written sample code that serves as the target for our refactoring and documentation efforts is defined in Step 2.

**Model Selection:**
Models with strong coding and reasoning abilities are best for this task. `gpt-4.1`, `o3`, or `codex-mini` are great choices. You can also try more general models like `gemini-2.5-pro`. The setup cell below uses `gpt-4o`; change the `model_name` argument to try a different model.

**Helper Functions Used:**
- `setup_llm_client()`: To configure the API client.
- `get_completion()`: To send prompts to the LLM.
- `save_artifact()`: To save the generated README file.
- `clean_llm_output()`: To clean up the generated code and documentation.
- `load_artifact()`: To load the PRD and the API source code used as context for the README.
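
The lab does not show how `clean_llm_output()` is implemented. As a rough illustration of the kind of cleanup involved, the sketch below extracts the first fenced block from a chat-style reply. The name `strip_code_fences` is hypothetical and used only here; the course's actual helper may behave differently.

```python
import re

# Build the fence marker programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def strip_code_fences(text: str) -> str:
    """Return the body of the first fenced block in text, or the stripped text if there is none.

    Hypothetical helper for illustration only; this is not the course's clean_llm_output().
    """
    pattern = re.escape(FENCE) + r"[\w+-]*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else text.strip()


# A made-up reply in the shape chat models often use: prose, a fenced block, more prose.
raw_reply = (
    f"Here is the refactored code:\n{FENCE}python\nprint('hi')\n{FENCE}\n"
    "Let me know if you need changes."
)
cleaned = strip_code_fences(raw_reply)
print(cleaned)
```

Only the code inside the fence survives, which is what you want before saving the result to a file or pasting it into a follow-up prompt.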

In [5]:
import sys
import os

# Add the project's root directory (two levels up from this notebook) to the Python path
# so that 'utils' can be imported. These path functions never raise IndexError, so no try/except is needed.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, clean_llm_output, load_artifact

client, model_name, api_provider = setup_llm_client(model_name="gpt-4o")

✅ LLM Client configured: Using 'openai' with model 'gpt-4o'


## Step 2: The Code to Improve

Here is a sample Python function that is functional but poorly written. It's hard to read, has no comments or type hints, and mixes multiple responsibilities. This is the code we will improve.

In [2]:
bad_code = """
def process_data(data, operation):
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val
"""

## Step 3: The Challenges

### Challenge 1 (Foundational): Refactoring the Code

**Task:** Use the LLM to refactor the `bad_code` to be more readable, efficient, and maintainable.

**Instructions:**
1.  Create a prompt that instructs the LLM to act as a senior Python developer.
2.  Provide the `bad_code` as context.
3.  Ask the LLM to refactor the code. Be specific about the improvements you want, such as:
    * Breaking the single function into multiple, smaller functions.
    * Using built-in Python functions where appropriate (e.g., `sum()`, `max()`).
    * Adding clear type hints and return types.

> **Tip:** When you ask the AI to refactor, give it a principle to follow. For example, ask it to apply the 'Single Responsibility Principle,' which means each function should do only one thing. This guides the AI to create cleaner, more modular code.

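**Expected Quality:** A block of Python code that returns the same results as the original for the supported operations but is significantly cleaner, more modular, and easier to understand. Check the output for behavior changes: in the sample output below, for example, an unsupported operation raises `ValueError`, whereas the original silently returns `None`.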

In [3]:
# TODO: Write a prompt to refactor the 'bad_code'.
refactor_prompt = f"""
Act as a senior Python developer. Refactor the following code by breaking the single function
into smaller, more manageable functions. Ensure that the code adheres to best practices,
uses built-in Python functions where applicable, and is more readable. Add type hints and return types to the function signatures.

```python
{bad_code}
```
"""

print("--- Refactoring Code ---")
refactored_code = get_completion(refactor_prompt, client, model_name, api_provider)
cleaned_code = clean_llm_output(refactored_code, language='python')
print(cleaned_code)

--- Refactoring Code ---
from typing import List, Union

def calculate_sum(data: List[Union[int, float]]) -> Union[int, float]:
    """Calculates the sum of the data."""
    return sum(data)

def calculate_average(data: List[Union[int, float]]) -> float:
    """Calculates the average of the data."""
    return sum(data) / len(data)

def find_max(data: List[Union[int, float]]) -> Union[int, float]:
    """Finds the maximum value in the data."""
    return max(data)

def process_data(data: List[Union[int, float]], operation: str) -> Union[int, float]:
    """Processes the data based on the specified operation."""
    if operation == 'sum':
        return calculate_sum(data)
    elif operation == 'average':
        return calculate_average(data)
    elif operation == 'max':
        return find_max(data)
    else:
        raise ValueError(f"Unsupported operation: {operation}")
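
Because a refactoring is supposed to preserve behavior, it is worth comparing the old and new versions on a few inputs before moving on. The sketch below is one way to do that. The names `process_data_original` and `process_data_refactored` are labels chosen for this example, and the refactored version is a condensed form of the sample output above; your own LLM output may differ.

```python
import math
from typing import List, Union

Number = Union[int, float]


def process_data_original(data, operation):
    """The function from bad_code, copied unchanged apart from its name."""
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val


def process_data_refactored(data: List[Number], operation: str) -> Number:
    """Condensed form of the refactored sample output (assumed, for illustration)."""
    if operation == 'sum':
        return sum(data)
    elif operation == 'average':
        return sum(data) / len(data)
    elif operation == 'max':
        return max(data)
    raise ValueError(f"Unsupported operation: {operation}")


# Compare both versions on a few small inputs for every supported operation.
samples = [[1, 2, 3], [2.5, -1.0, 4], [7]]
all_match = all(
    math.isclose(process_data_original(d, op), process_data_refactored(d, op))
    for d in samples
    for op in ("sum", "average", "max")
)
print("Supported operations match:", all_match)

# The two versions differ for an unsupported operation.
original_result = process_data_original([1, 2, 3], "median")  # falls through, returns None
try:
    process_data_refactored([1, 2, 3], "median")
    raised_value_error = False
except ValueError:
    raised_value_error = True
print("Original returns:", original_result, "| Refactored raises ValueError:", raised_value_error)
```

Raising an error for an unknown operation is usually an improvement, but it is a behavior change, so confirm that callers of the original function do not rely on the silent `None`.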


### Challenge 2 (Intermediate): Generating Docstrings

**Task:** Prompt the LLM to generate high-quality docstrings for the newly refactored code.

**Instructions:**
1.  Create a new prompt.
2.  Provide the refactored code from the previous step as context.
3.  Instruct the LLM to generate Google-style Python docstrings for each function.
4.  The docstrings should include a description of the function, its arguments (`Args:`), and what it returns (`Returns:`).

**Expected Quality:** The refactored Python code, now with complete and professional-looking docstrings for each function.
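
LLM output can look complete while missing a section in one function, so a quick automated check helps. The sketch below uses the standard library `ast` module to list functions whose docstrings lack the `Args:` or `Returns:` sections. The function name `find_incomplete_docstrings` and the sample source are made up for this example.

```python
import ast
from typing import Dict, List, Sequence


def find_incomplete_docstrings(
    source: str, required: Sequence[str] = ("Args:", "Returns:")
) -> Dict[str, List[str]]:
    """Map each function name to the required docstring sections it is missing."""
    problems: Dict[str, List[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node) or ""
            missing = [section for section in required if section not in doc]
            if missing:
                problems[node.name] = missing
    return problems


# Sample input: one fully documented function and one with only a summary line.
sample_source = '''
def calculate_sum(data):
    """Calculates the sum of the data.

    Args:
        data (List[Union[int, float]]): A list of numerical values.

    Returns:
        Union[int, float]: The sum of the provided data.
    """
    return sum(data)


def find_max(data):
    """Finds the maximum value in the data."""
    return max(data)
'''

problems = find_incomplete_docstrings(sample_source)
print(problems)
```

This is a plain substring check, not a full Google-style validator; it will not notice, for instance, an `Args:` section that omits one of the parameters.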

In [4]:
# TODO: Write a prompt to add Google-style docstrings to the refactored code.
# Pass cleaned_code rather than the raw reply, in case the raw reply contains its own code fences or commentary.
docstring_prompt = f"""
Act as a senior Python developer. Add Google-style docstrings to the following refactored code.
```python
{cleaned_code}
```
Ensure that the docstrings are clear and concise and that they describe each function's purpose, arguments (`Args:`), and return value (`Returns:`).
"""

print("--- Generating Docstrings ---")
code_with_docstrings = get_completion(docstring_prompt, client, model_name, api_provider)
cleaned_code_with_docstrings = clean_llm_output(code_with_docstrings, language='python')
print(cleaned_code_with_docstrings)

--- Generating Docstrings ---
from typing import List, Union

def calculate_sum(data: List[Union[int, float]]) -> Union[int, float]:
    """Calculates the sum of the data.

    Args:
        data (List[Union[int, float]]): A list of numerical values (integers or floats).

    Returns:
        Union[int, float]: The sum of the provided data.
    """
    return sum(data)

def calculate_average(data: List[Union[int, float]]) -> float:
    """Calculates the average of the data.

    Args:
        data (List[Union[int, float]]): A list of numerical values (integers or floats).

    Returns:
        float: The average of the provided data.
    """
    return sum(data) / len(data)

def find_max(data: List[Union[int, float]]) -> Union[int, float]:
    """Finds the maximum value in the data.

    Args:
        data (List[Union[int, float]]): A list of numerical values (integers or floats).

    Returns:
        Union[int, float]: The maximum value in the provided data.
    """
    return max(data)

[Output truncated: the remainder, including `process_data`, is not shown.]

### Challenge 3 (Advanced): Generating a Project README

**Task:** Generate a comprehensive `README.md` file for the entire Onboarding Tool project.

**Instructions:**
1.  Create a final prompt that instructs the LLM to act as a technical writer.
2.  This time, you will provide multiple pieces of context: the `day1_prd.md` and the `app/main.py` source code. (You will need to load these files).
3.  Ask the LLM to generate a `README.md` file with the following sections:
    * Project Title
    * Overview (based on the PRD)
    * Features
    * API Endpoints (with `curl` examples)
    * Setup and Installation instructions.
4.  Save the final output to `README.md` in the project's root directory.

**Expected Quality:** A complete, professional `README.md` file that provides a comprehensive overview of the project for other developers.
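
Before saving the generated file, you may want to confirm that the requested sections are actually present. The sketch below scans Markdown headings for the section names from the instructions above. `missing_sections`, `REQUIRED_SECTIONS`, and the sample draft are hypothetical names and data used only for this illustration.

```python
import re
from typing import List, Sequence

# Section names taken from the instructions; "Project Title" is skipped because
# the title heading will carry the project's own name instead of those words.
REQUIRED_SECTIONS = ["Overview", "Features", "API Endpoints", "Setup and Installation"]


def missing_sections(readme: str, required: Sequence[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return the required section names that do not appear in any Markdown heading."""
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}[ \t]+(.+)$", readme, flags=re.MULTILINE)
    ]
    return [name for name in required if not any(name.lower() in h for h in headings)]


# A made-up, incomplete draft used to exercise the check.
draft = "# Onboarding Tool\n\n## Overview\nA short summary.\n\n## Features\n- Task checklist\n"
missing = missing_sections(draft)
print("Missing sections:", missing)
```

One limitation of this simple approach: lines starting with `#` inside code blocks of the README (such as shell comments) are also treated as headings. If the check reports missing sections, refine the prompt and regenerate instead of saving the file.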

In [10]:
# Load the necessary context files
prd_content = load_artifact("artifacts/day1_prd.md")
api_code = load_artifact("app/main.py")

# TODO: Write a prompt to generate a complete README.md file.
readme_prompt = f"""Act as a technical writer. Generate a comprehensive README.md file for the project based on the following PRD and API code.

PRD:
{prd_content}

API code:
{api_code}

The README.md should include the following sections:
- Project Title
- Overview (based on the PRD)
- Features
- API Endpoints (with curl examples that use the base URL http://127.0.0.1:8000)
- Setup and Installation instructions
"""

print("--- Generating Project README ---")
if prd_content and api_code:
    readme_content = get_completion(readme_prompt, client, model_name, api_provider)
    cleaned_readme = clean_llm_output(readme_content, language='markdown')
    print(cleaned_readme)
    save_artifact(cleaned_readme, "README.md")
else:
    print("Skipping README generation because PRD or API code is missing.")

--- Generating Project README ---
# Product Requirements Document: Example Product

## Overview

This project aims to create a comprehensive onboarding platform that streamlines the onboarding process for new hires. Our platform provides a centralized, user-friendly interface designed to improve productivity and engagement by minimizing the friction typically associated with onboarding. 

### Vision
To enhance productivity and engagement for new hires by reducing onboarding friction.

### Problem Statement
New hires currently face a fragmented and overwhelming onboarding experience, which leads to confusion and inefficiency.

### Key Goals
1. **Improve New Hire Efficiency**: Reduce the time-to-first-contribution by 20% in Q1.
2. **Reduce Support Load**: Decrease repetitive questions to HR by 30%.
3. **Increase Engagement**: Achieve a 95% onboarding completion rate.

## Features

- **User Authentication**: Secure login via company credentials.
- **Task Checklist**: Personalized onboardi

[Output truncated.]

## Lab Conclusion

Well done! You have used an LLM to perform two high-value code quality tasks: refactoring and documentation. You've seen how AI can help transform messy code into a clean, maintainable structure and how it can generate comprehensive documentation from high-level project artifacts and source code. Both tasks are often postponed under deadline pressure, so speeding them up pays off for any development team, provided you review the generated output before relying on it.

> **Key Takeaway:** LLMs excel at understanding and generating structured text, whether that structure is code or documentation. Providing a clear 'before' state (the bad code) and a clear goal (the refactoring principles) allows the AI to perform complex code transformation and documentation tasks efficiently.