# Lab 11 - Module 1: Generate (Stage 1)

**Time:** ~15-20 minutes

## Stage 1: Using AI to Generate Content

In this module, you'll give your prompt to a real AI system and see what it creates.

### Learning Objectives

- Experience AI's content generation capabilities firsthand
- Form initial impressions about quality
- Identify obvious strengths and weaknesses
- Prepare for systematic evaluation in Module 2

### What You'll Do

1. Load your group's prompt from Module 0
2. Paste the prompt into an AI system (ChatGPT, Claude, Gemini, etc.)
3. Record the AI's complete output
4. Give a "first impression" rating
5. Identify 2 strengths and 2 weaknesses

## Setup: Load Libraries and Your Scenario

In [None]:
import numpy as np
import pandas as pd
import json
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML, Markdown
import os

print("✓ Libraries loaded!")

### 📁 Google Drive Setup

**IMPORTANT:** This module loads data from your Google Drive (saved in previous modules).

Run the cell below to mount Google Drive. You may need to authorize access again.

In [None]:
from google.colab import drive
print("Mounting Google Drive...")
drive.mount('/content/drive')
LAB_DIR = '/content/drive/MyDrive/DATA1010/Lab11'
print("✓ Google Drive mounted successfully!")
print(f"✓ Lab directory: {LAB_DIR}")
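
If you want to confirm that the files from earlier modules are really in your lab folder before continuing, a quick check along these lines can help. This is only a sketch: `list_lab_files` is a hypothetical helper (not part of the lab code), and it assumes your files follow the `lab11_group_<code>_*.json` naming used in this notebook.

```python
import fnmatch
import os

def list_lab_files(lab_dir, group_code=None):
    """Return sorted names of Lab 11 JSON files in lab_dir.

    Hypothetical helper: returns an empty list if the folder does not exist.
    """
    if not os.path.isdir(lab_dir):
        return []
    # Match one group's files if a code is given, otherwise every group's files
    if group_code is None:
        pattern = "lab11_group_*.json"
    else:
        pattern = f"lab11_group_{group_code}_*.json"
    return sorted(name for name in os.listdir(lab_dir)
                  if fnmatch.fnmatch(name, pattern))
```

After mounting Drive, calling `list_lab_files(LAB_DIR)` should show at least one `..._scenario.json` file. An empty list suggests that Module 0 has not been run yet or that the folder path is different on your Drive.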

In [None]:
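# Load your group's scenario (saved to Google Drive by Module 0)
group_code = int(input("Enter your group code: "))
scenario = None  # reset first so a failed load never leaves an old scenario in memory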

try:
    with open(f'{LAB_DIR}/lab11_group_{group_code}_scenario.json', 'r') as f:
        scenario = json.load(f)
    
    print(f"✓ Loaded scenario for Group {group_code}")
    print(f"\nPrompt Family: {scenario['family']}")
    print(f"Recommended Rubric: {scenario['default_rubric']}")
    
except FileNotFoundError:
    print(f"\n❌ ERROR: Could not find scenario file for group {group_code}.")
    print("Please run Module 0 first to generate your scenario.")
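
The rest of this notebook reads three fields from the scenario: `prompt`, `family`, and `default_rubric`. If a later cell fails with a `KeyError`, a small check like the one below can tell you which field is missing. It is a minimal sketch; `missing_scenario_keys` and `REQUIRED_KEYS` are hypothetical names, and the scenario file may contain other fields that this notebook does not use.

```python
# The three fields this notebook reads from the scenario file
REQUIRED_KEYS = ("prompt", "family", "default_rubric")

def missing_scenario_keys(scenario):
    """Return the required keys that are absent, None, or blank in the scenario dict."""
    missing = []
    for key in REQUIRED_KEYS:
        value = scenario.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing
```

Calling `missing_scenario_keys(scenario)` on a correctly loaded scenario should return an empty list.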

## Step 1: Review Your Prompt

Here's the prompt you'll give to the AI:

In [None]:
print("="*70)
print("YOUR PROMPT (Copy this exactly to the AI)")
print("="*70)
print()
print(scenario['prompt'])
print()
print("="*70)

## Step 2: Run the Prompt with an AI System

### Instructions:

1. **Open an AI system** in a new browser window:
   - [ChatGPT](https://chat.openai.com) (free version works)
   - [Claude](https://claude.ai) (free version works)
   - [Google Gemini](https://gemini.google.com) (free version works)
   - Or any other LLM you have access to

2. **Start a fresh conversation** so earlier chats don't influence the response (use incognito/private browsing if possible)

3. **Copy the prompt exactly** from the cell above

4. **Paste it into the AI** and press Enter

5. **Wait for the complete response** (don't interrupt mid-generation)

6. **Copy the ENTIRE output** from the AI (select all, copy)

7. **Paste it in the text area below**

### Important Notes:

- ✓ Use the AI's **first response** - don't regenerate or ask for revisions yet
- ✓ Copy the **complete output** - don't edit or summarize
- ✓ If the AI asks clarifying questions instead of generating, note that in the "Observations" field
- ✓ Record which AI model you used (ChatGPT, Claude, Gemini, etc.)

## Step 3: Record the AI's Output

In [None]:
# Create widgets for data collection
model_used = widgets.Dropdown(
    options=['ChatGPT (GPT-3.5)', 'ChatGPT (GPT-4)', 'Claude (free)', 'Claude (Pro)', 
             'Google Gemini', 'Other (specify in notes)'],
    description='AI Model:',
    style={'description_width': '120px'},
    layout=widgets.Layout(width='400px')
)

ai_output = widgets.Textarea(
    value='',
    placeholder='Paste the complete AI output here...',
    description='AI Output:',
    layout=widgets.Layout(width='100%', height='300px'),
    style={'description_width': '120px'}
)

observations = widgets.Textarea(
    value='',
    placeholder='Any observations about the generation process? Did AI ask questions? Any errors?',
    description='Observations:',
    layout=widgets.Layout(width='100%', height='80px'),
    style={'description_width': '120px'}
)

print("RECORD AI OUTPUT")
print("="*70)
display(model_used)
display(ai_output)
display(observations)
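
Because the instructions ask for the complete output, it can be worth checking that the paste was not cut off. The sketch below counts characters, words, and lines in a piece of text so you can compare against what you saw in the AI's window. `summarize_output` is a hypothetical helper, not part of the lab code.

```python
def summarize_output(text):
    """Return basic size statistics for pasted text, ignoring leading/trailing whitespace."""
    stripped = text.strip()
    return {
        "characters": len(stripped),
        "words": len(stripped.split()),
        "lines": len(stripped.splitlines()),
    }
```

In the notebook you could run `summarize_output(ai_output.value)` after pasting. A very small word count usually means that only part of the response was copied.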

## Step 4: First Impressions

Now that you've read the AI's output, give your immediate reaction **before** the systematic evaluation in Module 2.

In [None]:
# First impression widgets
first_impression_rating = widgets.Dropdown(
    options=[
        # Placeholder with a falsy value so the "not selected" check in Step 6 can fire
        ('-- Select a rating --', 0),
        ('1 - Poor (major problems)', 1),
        ('2 - Below Average (significant issues)', 2),
        ('3 - Average (acceptable but unremarkable)', 3),
        ('4 - Good (solid with minor issues)', 4),
        ('5 - Excellent (impressive, ready to use)', 5)
    ],
    description='Overall Rating:',
    style={'description_width': '120px'},
    layout=widgets.Layout(width='500px')
)

rating_justification = widgets.Textarea(
    value='',
    placeholder='Why did you give this rating? What stands out most?',
    description='Justification:',
    layout=widgets.Layout(width='100%', height='100px'),
    style={'description_width': '120px'}
)

print("\nFIRST IMPRESSION RATING")
print("="*70)
print("Give an immediate, gut-level rating of the AI's output:")
print()
display(first_impression_rating)
display(rating_justification)

## Step 5: Identify Strengths and Weaknesses

Without using a formal rubric yet, what are the **2 biggest strengths** and **2 biggest weaknesses** you notice?

In [None]:
# Strengths and weaknesses widgets
strength_1 = widgets.Textarea(
    value='',
    placeholder='What did the AI do particularly well?',
    description='Strength 1:',
    layout=widgets.Layout(width='100%', height='80px'),
    style={'description_width': '120px'}
)

strength_2 = widgets.Textarea(
    value='',
    placeholder='Another thing the AI did well?',
    description='Strength 2:',
    layout=widgets.Layout(width='100%', height='80px'),
    style={'description_width': '120px'}
)

weakness_1 = widgets.Textarea(
    value='',
    placeholder='What could be improved? Be specific.',
    description='Weakness 1:',
    layout=widgets.Layout(width='100%', height='80px'),
    style={'description_width': '120px'}
)

weakness_2 = widgets.Textarea(
    value='',
    placeholder='Another area needing improvement?',
    description='Weakness 2:',
    layout=widgets.Layout(width='100%', height='80px'),
    style={'description_width': '120px'}
)

print("\nSTRENGTHS (What did AI do well?)")
print("="*70)
display(strength_1)
display(strength_2)

print("\nWEAKNESSES (What needs improvement?)")
print("="*70)
display(weakness_1)
display(weakness_2)

## Step 6: Save Your Data

In [None]:
# Validate that data was entered
def validate_module1_data():
    issues = []
    
    if not ai_output.value or len(ai_output.value.strip()) < 50:
        issues.append("AI output seems too short or empty")
    
    if not first_impression_rating.value:
        issues.append("First impression rating not selected")
    
    if not rating_justification.value or len(rating_justification.value.strip()) < 10:
        issues.append("Rating justification is too brief or empty")
    
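    # Strip whitespace so entries containing only spaces or newlines are rejected
    if not strength_1.value.strip() or not strength_2.value.strip():
        issues.append("Both strengths must be identified")
    
    if not weakness_1.value.strip() or not weakness_2.value.strip():
        issues.append("Both weaknesses must be identified")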
    
    return issues

# Save button
save_button = widgets.Button(
    description='💾 Save Data',
    button_style='success',
    layout=widgets.Layout(width='200px', height='40px')
)

output_area = widgets.Output()

def on_save_clicked(b):
    with output_area:
        clear_output()
        
        # Validate
        issues = validate_module1_data()
        if issues:
            print("❌ Cannot save - please complete the following:")
            for issue in issues:
                print(f"   • {issue}")
            return
        
        # Compile data
        module1_data = {
            'group_code': group_code,
            'model_used': model_used.value,
            'ai_output': ai_output.value,
            'observations': observations.value,
            'first_impression_rating': first_impression_rating.value,
            'rating_justification': rating_justification.value,
            'strengths': [
                strength_1.value,
                strength_2.value
            ],
            'weaknesses': [
                weakness_1.value,
                weakness_2.value
            ]
        }
        
        # Save to JSON
        with open(f'{LAB_DIR}/lab11_group_{group_code}_module1.json', 'w') as f:
            json.dump(module1_data, f, indent=2)
        
        print("✓ Data saved successfully!")
        print(f"\nSaved to: lab11_group_{group_code}_module1.json")
        print("\nYou're ready for Module 2 (Evaluation)!")

save_button.on_click(on_save_clicked)

print("\n" + "="*70)
print("SAVE YOUR WORK")
print("="*70)
display(save_button)
display(output_area)
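
Module 2 will read the file you just saved, so you may want to confirm that it loads back correctly. The sketch below reopens a saved file and reports anything that looks off. `check_saved_module1` and `EXPECTED_KEYS` are hypothetical names; the keys listed are the ones written by the save cell above.

```python
import json

# Keys written by the save cell in this module
EXPECTED_KEYS = {
    "group_code", "model_used", "ai_output", "observations",
    "first_impression_rating", "rating_justification",
    "strengths", "weaknesses",
}

def check_saved_module1(path):
    """Reload a saved Module 1 file and return a list of problems (empty if none)."""
    with open(path, "r") as f:
        data = json.load(f)
    problems = [f"missing key: {key}" for key in sorted(EXPECTED_KEYS - set(data))]
    # The save cell stores exactly two strengths and two weaknesses
    for field in ("strengths", "weaknesses"):
        if field in data and len(data[field]) != 2:
            problems.append(f"{field} should contain 2 entries")
    return problems
```

For example, `check_saved_module1(f'{LAB_DIR}/lab11_group_{group_code}_module1.json')` should return an empty list after a successful save.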

## Module 1 Questions

Answer these on your Lab 11 Answer Sheet.

### Q3: First Impression Rating

What was your first impression rating (1-5) and why? What made you give this specific score?

*(Answer on your answer sheet)*

### Q4: Strengths

What are the 2 biggest strengths of the AI's output? Be specific - quote or reference particular parts if helpful.

*(Answer on your answer sheet)*

### Q5: Weaknesses

What are the 2 biggest weaknesses or areas for improvement? Be specific about what's missing or could be better.

*(Answer on your answer sheet)*

### Q6: Verification Challenge

If you had to verify whether this output is actually "good," how would you do it? What would you check? What expertise or resources would you need?

*(Answer on your answer sheet)*

## Summary: What You've Accomplished

✓ Loaded your group's unique prompt from Module 0

✓ Ran the prompt on a real AI system

✓ Recorded the complete AI output

✓ Gave a first impression rating (1-5)

✓ Identified 2 strengths and 2 weaknesses

✓ Saved your data for Module 2

### What's Next?

In **Module 2 (Evaluate)**, you will:
- Select an explicit rubric with clear criteria
- Score the AI's output systematically (human judgment)
- Ask the AI to score its OWN work using the same rubric
- Compare: Where do you and the AI agree? Where do you disagree?

**This is where things get interesting...**

You'll discover whether AI can reliably evaluate its own quality. Spoiler: the results might surprise you!