<a id='section5'></a>

# Working with LLM-Generated Code

Now let's explore how large language models (LLMs) can assist with coding tasks. We'll generate some code using an LLM, test it, and identify common pitfalls and best practices along the way.

## Generating Code with ChatGPT

Let's ask ChatGPT to help us create a function that analyzes sentiment patterns in our AITA dataset. Here's the prompt we'll use:

**Prompt to ChatGPT:**
*"Write a Python function that takes a pandas DataFrame with a 'selftext' column and creates a simple sentiment analysis. The function should count positive and negative words using predefined word lists, calculate a sentiment score for each post, and return a new DataFrame with sentiment columns added."*

## 🤖 AI-Generated Code

Below is the code generated by ChatGPT. Let's run it and see what happens:

In [5]:
# Generated by ChatGPT - Let's test this code!
def analyze_sentiment(df):
    """
    Analyze sentiment of posts in a DataFrame.
    
    Args:
        df: pandas DataFrame with 'selftext' column
    
    Returns:
        DataFrame with added sentiment columns
    """
    # Define positive and negative words
    positive_words = ['good', 'great', 'awesome', 'excellent', 'fantastic', 'wonderful', 
                     'amazing', 'perfect', 'best', 'love', 'happy', 'joy', 'pleased']
    
    negative_words = ['bad', 'terrible', 'awful', 'horrible', 'worst', 'hate', 'angry', 
                     'sad', 'upset', 'mad', 'furious', 'disgusting', 'annoying']
    
    # Create a copy to avoid modifying original
    result_df = df.copy()
    
    # Initialize sentiment columns
    result_df['positive_count'] = 0
    result_df['negative_count'] = 0
    result_df['sentiment_score'] = 0.0
    
    # Process each row
    for idx, row in result_df.iterrows():
        text = row['selftext'].lower()
        
        # Count positive words
        pos_count = sum(1 for word in positive_words if word in text)
        neg_count = sum(1 for word in negative_words if word in text)
        
        # Calculate sentiment score
        total_words = len(text.split())
        sentiment_score = (pos_count - neg_count) / total_words
        
        # Update DataFrame
        result_df.loc[idx, 'positive_count'] = pos_count
        result_df.loc[idx, 'negative_count'] = neg_count
        result_df.loc[idx, 'sentiment_score'] = sentiment_score
    
    return result_df

# Test the function
print("Testing AI-generated sentiment analysis function...")
sentiment_df = analyze_sentiment(df)
print("Function completed successfully!")

Testing AI-generated sentiment analysis function...


AttributeError: 'float' object has no attribute 'lower'

🔔 **Question**: Did the code run successfully? If you got an error, what do you think went wrong?

## Debugging the AI Code

Let's investigate what went wrong and fix the issues step by step:

In [7]:
# Let's check what's in our dataset first
print("Checking for missing selftext values:")
print(f"Missing values: {df['selftext'].isna().sum()}")
print(f"Total rows: {len(df)}")

Checking for missing selftext values:
Missing values: 12
Total rows: 20000
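
The missing values explain the error message. As a minimal sketch (using a hypothetical three-row `toy` DataFrame rather than the real AITA data), the cell below shows that a missing entry in a text column is a float `NaN`, so calling a string method on it fails in the same way as above:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the AITA data; the middle post has no body text.
# object dtype keeps Python strings and NaN side by side in one column.
toy = pd.DataFrame({"selftext": ["I love this", np.nan, "so annoying"]}, dtype=object)

missing = toy.loc[1, "selftext"]
print(type(missing))     # the missing entry is a float, not a string
print(pd.isna(missing))  # pd.isna() is how to detect it

try:
    missing.lower()      # the same call the generated function makes
except AttributeError as err:
    error_message = str(err)
    print(f"AttributeError: {error_message}")
```

Only 12 of the 20,000 rows are missing, but a single one is enough to stop the loop.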


## 🥊 Challenge 7: Fix the AI Code

The AI-generated code has several issues. Can you identify and fix them?

**Issues to look for:**
1. What happens if `selftext` contains missing values (NaN)?
2. What happens if `selftext` is not a string?
3. Are there performance issues with this approach?
4. Are there edge cases in the sentiment calculation?
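
As a hint for the edge-case question, here is a small sketch of one behavior that is easy to miss: the generated code tests `word in text`, which matches substrings rather than whole words. The example sentence and the shortened word list are made up for illustration, and the regular-expression tokenizer is only one possible way to split text into words.

```python
import re

text = "i made a sad face at the goodbye party"
words = ["good", "sad", "mad"]  # a few entries taken from the generated word lists

# Substring test, as in the generated code:
# "good" is found inside "goodbye" and "mad" inside "made".
substring_hits = [w for w in words if w in text]

# Whole-word test: split the text into tokens first, then compare tokens.
tokens = set(re.findall(r"[a-z']+", text))
token_hits = [w for w in words if w in tokens]

print(substring_hits)  # ['good', 'sad', 'mad']
print(token_hits)      # ['sad']
```

Note also that the generated function adds 1 for each list word that is present, so a post that says "hate" five times gets the same count as a post that says it once. Whether that is what you want is a design decision to make explicitly.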

Write an improved version of the function below:

In [None]:
# YOUR CODE HERE - Fix the AI-generated function


## LLM Coding Guidelines

Based on our experience with the AI-generated code, let's establish some guidelines for working with LLMs:

### ✅ **DO:**
- **Provide clear context** and specify desired output format
- **Test the code immediately** after generation  
- **Check for edge cases** like missing values, empty strings, wrong data types
- **Verify performance** - AI-generated code often loops over rows one at a time (as `iterrows()` does above) where a vectorized pandas operation would be faster
- **Document AI assistance** in comments (e.g., `# Generated with ChatGPT assistance`)
- **Understand the code** before using it in your projects
- **Ask for explanations** if you don't understand parts of the generated code

### ❌ **DON'T:**
- **Ask for too much at once** - break complex tasks into smaller parts
- **Blindly copy-paste** without understanding the code
- **Skip testing** - always run and verify the output
- **Ignore error handling** - AI often misses edge cases
- **Forget to document** AI usage (academic integrity requirement)
- **Use AI output** that leads to plagiarism or incorrect work
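
One way to put "test immediately" and "check for edge cases" into practice is to write a few assertions before trusting a function. The sketch below uses a hypothetical helper, `score_text`, with tiny made-up word lists. It is not the function from this lesson, and returning `0.0` for unusable input is just one possible choice (returning `NaN` would be another).

```python
import math

def score_text(text, positive=("good", "love"), negative=("bad", "hate")):
    """Hypothetical per-post scorer, used only to illustrate edge-case checks."""
    if not isinstance(text, str):   # NaN, None, numbers
        return 0.0
    tokens = text.lower().split()
    if not tokens:                  # empty or whitespace-only text
        return 0.0
    pos = sum(t in positive for t in tokens)
    neg = sum(t in negative for t in tokens)
    return (pos - neg) / len(tokens)

# Inputs that commonly break generated code
edge_cases = {"missing": math.nan, "none": None, "empty": "", "spaces": "   ", "number": 42}
for name, value in edge_cases.items():
    assert score_text(value) == 0.0, name

# One ordinary input with a result that is easy to check by hand
assert score_text("love love hate") == (2 - 1) / 3
print("All edge-case checks passed")
```

If any assertion fails, the name of the failing case appears in the error, which points directly at the input the function cannot handle.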

## Course Policy on AI Use

### 📋 **Academic Integrity Guidelines**

**You MAY use LLMs as coding assistants IF:**
- You **document their use** clearly (in code comments or assignment submissions)
- You **personally verify and understand** the solution
- You can **explain how the code works** when asked
- You **test the code thoroughly** and fix any bugs

**Example of acceptable documentation:**
```python
# Used ChatGPT to help write this sentiment analysis function
# Modified the original output to handle edge cases
def my_function():
    pass
```

**You MAY NOT:**
- Use AI assistance **without acknowledgment** 
- Submit AI-generated code that you **don't understand**
- Use AI output that leads to **plagiarism or incorrect work**
- Claim AI-generated work as **entirely your own**

⚠️ **Remember**: Understanding the code is more important than having perfect code. Using LLMs can speed up development, but only if you comprehend what they produce!

## 🥊 Challenge 8: Evaluate AI Output

Now it's your turn! Ask ChatGPT (or another LLM) to generate code for one of these tasks:

1. **Create a function** that finds the most common words in AITA post titles
2. **Generate code** to create a simple visualization of post scores over time  
3. **Write a function** that categorizes posts by topic based on keywords

**Instructions:**
1. Copy your prompt and the AI's response into the cells below
2. Test the generated code
3. Document any bugs or improvements needed
4. Fix the issues and explain what you learned

**Your prompt to the LLM:**

In [None]:
# YOUR AI-GENERATED CODE HERE
# Remember to add a comment acknowledging AI assistance!


**Issues found and fixes made:**

In [None]:
# YOUR IMPROVED VERSION HERE

<div class="alert alert-success">

## ❗ Key Points

* **LLMs can accelerate coding but require careful testing and understanding of generated code.**
* **Always document AI assistance and verify that generated code handles edge cases properly.**

</div>