# Day 3 - Lab 2: Refactoring & Documentation (Solution)

**Objective:** Use an LLM to refactor a complex Python function to improve its readability and maintainability, and then generate comprehensive, high-quality documentation for the project.

**Introduction:**
This solution notebook provides the complete prompts for the refactoring and documentation lab. It demonstrates how to guide an LLM to perform specific code quality improvements and generate structured documentation from multiple sources.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

In [1]:
import sys
import os

# Add the project's root directory (two levels above the current working directory)
# to the Python path so 'utils' can be imported.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, load_artifact, clean_llm_output, prompt_enhancer

# Initialize separate LLM clients for each task so we can pick recent models from different providers.
# - Refactoring: strong instruction-following model
# - Docstrings: model tuned for clarity/style
# - README generation: high-capacity synthesis model
refactor_client, refactor_model_name, refactor_api_provider = setup_llm_client(model_name="deepseek-ai/DeepSeek-V3.1")
doc_client, doc_model_name, doc_api_provider = setup_llm_client(model_name="gemini-2.5-pro")
readme_client, readme_model_name, readme_api_provider = setup_llm_client(model_name="gpt-5-2025-08-07")

2025-09-21 21:28:52,257 ag_aisoftdev.utils INFO LLM Client configured provider=huggingface model=deepseek-ai/DeepSeek-V3.1 latency_ms=None artifacts_path=None
2025-09-21 21:28:52,502 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None
2025-09-21 21:28:52,795 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=gpt-5-2025-08-07 latency_ms=None artifacts_path=None


## Step 2: The Code to Improve

In [2]:
bad_code = """
def process_data(data, operation):
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val
"""

## Step 3: The Challenges - Solutions

### Challenge 1 (Foundational): Refactoring the Code

**Explanation:**
This prompt is highly specific about the desired outcome. Instead of just saying "improve this code," we give the LLM concrete principles to follow: apply the 'Single Responsibility Principle,' use built-in functions, and add type hints. This guidance transforms a vague request into a precise engineering task, leading to a much higher-quality output.

In [3]:
refactor_prompt = f"""
You are a senior Python developer who writes clean, efficient, and maintainable code.

Please refactor the following Python code. Apply the 'Single Responsibility Principle' by breaking the main function into smaller, more focused functions. Also, use modern Python features like built-in functions and add type hints.

**Code to Refactor:**
```python
{bad_code}
```

Output only the refactored Python code.
"""

# Enhance the prompt for better results and consistency with other labs
enhanced_refactor_prompt = prompt_enhancer(refactor_prompt)

print("--- Refactoring Code ---")
refactored_code = get_completion(enhanced_refactor_prompt, refactor_client, refactor_model_name, refactor_api_provider)
cleaned_code = clean_llm_output(refactored_code, language='python')
print(cleaned_code)

2025-09-21 21:28:52,818 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=o3 latency_ms=None artifacts_path=None


--- Refactoring Code ---
from typing import List, Union, Optional


def process_data(data: List[Union[int, float]], operation: str) -> Optional[Union[int, float]]:
    operation_handlers = {
        'sum': _calculate_sum,
        'average': _calculate_average,
        'max': _calculate_max
    }
    
    if operation not in operation_handlers:
        return None
    
    return operation_handlers[operation](data)


def _calculate_sum(data: List[Union[int, float]]) -> Union[int, float]:
    return sum(data)


def _calculate_average(data: List[Union[int, float]]) -> float:
    return sum(data) / len(data) if data else 0.0


def _calculate_max(data: List[Union[int, float]]) -> Optional[Union[int, float]]:
    return max(data) if data else None
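
A refactor is normally expected to preserve behavior, so it is worth comparing the generated code against the original before accepting it. The sketch below is an addition to the lab, not part of the model's output. It copies the original function under the hypothetical name `process_data_original`, uses a condensed stand-in for the refactor (`process_data_refactored`, also a hypothetical name, written to keep the same empty-input guards as the output above), and lists the inputs on which the two disagree. The test cases are illustrative, not exhaustive.

```python
from typing import List, Optional, Union

Number = Union[int, float]


def process_data_original(data, operation):
    """Copy of the lab's original function, renamed for the comparison."""
    if operation == 'sum':
        total = 0
        for i in data:
            total += i
        return total
    elif operation == 'average':
        total = 0
        for i in data:
            total += i
        return total / len(data)
    elif operation == 'max':
        max_val = data[0]
        for i in data:
            if i > max_val:
                max_val = i
        return max_val


def process_data_refactored(data: List[Number], operation: str) -> Optional[Number]:
    """Condensed stand-in for the generated refactor (same guards, fewer lines)."""
    handlers = {
        'sum': sum,
        'average': lambda d: sum(d) / len(d) if d else 0.0,
        'max': lambda d: max(d) if d else None,
    }
    handler = handlers.get(operation)
    return handler(data) if handler else None


def observe(func, data, operation):
    """Return the call's result, or the exception class name if it raises."""
    try:
        return func(data, operation)
    except Exception as exc:
        return type(exc).__name__


cases = [
    ([1, 2, 3], 'sum'), ([1, 2, 3], 'average'), ([1.5, -2], 'max'),
    ([], 'sum'), ([], 'average'), ([], 'max'), ([1, 2, 3], 'median'),
]

# Collect every case where the two versions produce different outcomes.
differences = []
for data, op in cases:
    before = observe(process_data_original, data, op)
    after = observe(process_data_refactored, data, op)
    if before != after:
        differences.append((data, op, before, after))

for data, op, before, after in differences:
    print(f"{op}({data}): original={before!r}, refactored={after!r}")
```

On these cases the two versions differ only for empty input: the original raises `ZeroDivisionError` for `'average'` and `IndexError` for `'max'`, while the refactor returns `0.0` and `None`. Whether that change is acceptable is a design decision to make explicitly rather than inherit from the model.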


### Challenge 2 (Intermediate): Generating Docstrings

**Explanation:**
This prompt builds on the previous step. We provide the newly refactored code and ask for another specific, structured output: Google-style docstrings. LLMs are exceptionally good at this type of structured text generation. They can parse the function signature to identify the arguments and return types and generate well-formatted, descriptive documentation.
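
Generated docstrings can still drift from the signature, so a quick mechanical check is a reasonable complement to reading them. The following is a minimal sketch, not part of the lab's `utils` module: `undocumented_params` is a hypothetical helper that uses the standard library's `inspect` module to compare a function's parameters with the entries under `Args:`. It assumes Google-style formatting with one `name (type): description` entry per parameter, and the two sample functions are invented for illustration.

```python
import inspect
import re
from typing import Callable, List, Set


def documented_args(func: Callable) -> Set[str]:
    """Collect parameter names listed under the docstring's 'Args:' section."""
    doc = inspect.getdoc(func) or ""
    names: Set[str] = set()
    in_args = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped == "Args:":
            in_args = True
        elif in_args and line and not line[0].isspace():
            in_args = False  # reached the next unindented section header
        elif in_args:
            # Matches entries such as "data (List[int]): ..." or "data: ...".
            match = re.match(r"(\w+)\s*(\(.*?\))?:", stripped)
            if match:
                names.add(match.group(1))
    return names


def undocumented_params(func: Callable) -> List[str]:
    """Return parameters from the signature that have no 'Args:' entry."""
    documented = documented_args(func)
    return [p for p in inspect.signature(func).parameters if p not in documented]


def scale(values, factor):
    """Multiplies each value by a factor.

    Args:
        values (List[float]): Numbers to scale.

    Returns:
        List[float]: The scaled numbers.
    """
    return [v * factor for v in values]


def clamp(value, low, high):
    """Limits a value to the closed interval [low, high].

    Args:
        value (float): The number to limit.
        low (float): The lower bound.
        high (float): The upper bound.

    Returns:
        float: The clamped value.
    """
    return max(low, min(value, high))


print(undocumented_params(scale))  # ['factor']
print(undocumented_params(clamp))  # []
```

The parser is deliberately simple. A wrapped description line that happens to start with `word:` would be counted as a parameter entry, so treat the result as a hint rather than a guarantee.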

In [4]:
docstring_prompt = f"""
You are a Python developer who writes excellent documentation.

Add Google-style docstrings to the following Python code. Each docstring should include a description of the function, its arguments (Args:), and what it returns (Returns:).

**Python Code:**
```python
{cleaned_code}
```

Output the complete Python code with the added docstrings.
"""

# Enhance the prompt to improve clarity and formatting
enhanced_docstring_prompt = prompt_enhancer(docstring_prompt)

print("--- Generating Docstrings ---")
code_with_docstrings = get_completion(enhanced_docstring_prompt, doc_client, doc_model_name, doc_api_provider)
cleaned_code_with_docstrings = clean_llm_output(code_with_docstrings, language='python')
print(cleaned_code_with_docstrings)

2025-09-21 21:29:08,299 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=o3 latency_ms=None artifacts_path=None


--- Generating Docstrings ---
from typing import List, Union, Optional


def process_data(data: List[Union[int, float]], operation: str) -> Optional[Union[int, float]]:
    """Processes a list of numbers using a specified operation.

    Args:
        data (List[Union[int, float]]): A list of integers or floats to process.
        operation (str): The name of the operation to perform ('sum', 'average',
            'max').

    Returns:
        Optional[Union[int, float]]: The result of the operation, or None if the
        operation is not supported.
    """
    operation_handlers = {
        'sum': _calculate_sum,
        'average': _calculate_average,
        'max': _calculate_max
    }
    
    if operation not in operation_handlers:
        return None
    
    return operation_handlers[operation](data)


def _calculate_sum(data: List[Union[int, float]]) -> Union[int, float]:
    """Calculates the sum of a list of numbers.

    Args:
        data (List[Union[int, float]]): A list o

### Challenge 3 (Advanced): Generating a Project README

**Explanation:**
This prompt is a synthesis task. We give the LLM both high-level requirements (the PRD) and low-level implementation details (the API source code), and its job is to merge the two into a single, comprehensive `README.md` file with an overview, a feature list, and practical `curl` examples derived from the code. Delegating this kind of structured writing frees the developer to focus on complex logic while still keeping the project well documented and easy for others to understand.
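
Because the README is requested as a fixed list of sections, a cheap structural check on the response can flag a section the model left out. The following is a minimal sketch under stated assumptions: it assumes each requested section appears as a Markdown heading whose text contains the section name (the prompt does not guarantee this), and `missing_sections`, `REQUIRED_SECTIONS`, and the sample text are hypothetical names and data introduced here. "Project Title" is left out of the check because the model would normally render it as the project's actual name.

```python
import re
from typing import List, Sequence

# Section names taken from the prompt's bullet list.
REQUIRED_SECTIONS = ("Overview", "Features", "API Endpoints", "Setup and Installation")


def missing_sections(readme_text: str, required: Sequence[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return the required section names that no Markdown heading mentions."""
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+)$", readme_text, flags=re.MULTILINE)
    ]
    return [
        name for name in required
        if not any(name.lower() in heading for heading in headings)
    ]


# Invented sample standing in for a model response.
sample_readme = """# Example Project

## Overview
Short summary.

## API Endpoints
Endpoint list.

## Setup and Installation
Install steps.
"""

print(missing_sections(sample_readme))  # ['Features']
```

In the notebook, a check like this could run on `cleaned_readme` before `save_artifact` is called, so an incomplete response is caught before it overwrites an existing file.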

In [None]:
# Load the necessary context files
prd_content = load_artifact("artifacts/day1_prd.md")
api_code = load_artifact("app/main.py")

readme_prompt = f"""
You are a technical writer creating a README.md file for a new open-source project.

Use the provided Product Requirements Document (PRD) for high-level context and the FastAPI source code for technical details.

**PRD Context:**
<prd>
{prd_content}
</prd>

**API Source Code:**
<code>
{api_code}
</code>

Generate a complete README.md file with the following sections:
- Project Title
- Overview (Summarize the project's purpose from the PRD)
- Features
- API Endpoints (List the available endpoints and provide a `curl` example for each one, including the POST request with a JSON body)
- Setup and Installation (Provide basic instructions on how to run the FastAPI app with uvicorn)
"""

# Use prompt enhancer and the readme-specific client for synthesis
enhanced_readme_prompt = prompt_enhancer(readme_prompt)

print("--- Generating Project README ---")
if prd_content and api_code:
    readme_content = get_completion(enhanced_readme_prompt, readme_client, readme_model_name, readme_api_provider)
    cleaned_readme = clean_llm_output(readme_content, language='markdown')
    print(cleaned_readme)
    save_artifact(cleaned_readme, "README.md")
else:
    print("Skipping README generation because PRD or API code is missing.")

2025-09-21 21:29:29,952 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=o3 latency_ms=None artifacts_path=None


--- Generating Project README ---
Example template for documenting each endpoint (replace with real routes once the FastAPI code is available):
- Method and path: GET /health
  - Description: Liveness check.
  - cURL:
    ```sh
    curl -i http://localhost:8000/health
    ```

- Method and path: POST /onboarding
  - Description: Create a new onboarding workflow.
  - Body: JSON payload as defined by the Pydantic model in code.
  - cURL:
    ```sh
    curl -i -X POST http://localhost:8000/onboarding \
      -H "Content-Type: application/json" \
      -d '{ "example": "payload" }'
    ```

Replace the above with the actual endpoints found in the FastAPI app.

## Setup and Installation
Prerequisites:
- Python 3.11+ recommended
- pip
- (Optional) virtualenv or venv

Steps:


## Lab Conclusion

Well done! You have used an LLM to perform two of the most valuable code quality tasks: refactoring and documentation. You've seen how AI can help transform messy code into a clean, maintainable structure and how it can generate comprehensive documentation from high-level project artifacts and source code. Applied routinely, these skills can save a development team a significant amount of time.

> **Key Takeaway:** LLMs excel at understanding and generating structured text, whether that structure is code or documentation. Providing a clear 'before' state (the bad code) and a clear goal (the refactoring principles) allows the AI to perform complex code transformation and documentation tasks efficiently.