<a href="https://colab.research.google.com/github/sbvevo2025/smart-finance-assistant/blob/main/Module%2007%20-%20Directing%20Pandas/week08_workshop.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🎯 Welcome to Your Programming Evolution

**Today marks a pivotal moment**: You're transitioning from code writer to code director, and you're starting your final assignment! By the end of today's workshop, you'll have made genuine progress on your Smart Finance Assistant project.

**Workshop Goals:**
- Practice AI-assisted data processing with business data
- Start your actual Smart Finance Assistant assignment
- Make your first meaningful GitHub commit
- Build confidence with the new AI collaboration workflow

---

# Part 1: Mindset Shift Demo (20 minutes)

## 🔄 The Workflow Revolution

Let's demonstrate the difference between old-school programming and AI collaboration.

### Old Way Demo: Manual Coding

Imagine trying to analyze spending data the traditional way:

```python
# Old approach - lots of manual work and potential errors
import pandas as pd

# Load data with potential issues
df = pd.read_csv('transactions.csv')

# Oh no! Amounts have dollar signs - need to clean manually
df['Amount'] = df['Amount'].str.replace('$', '', regex=False)
df['Amount'] = pd.to_numeric(df['Amount'])

# Calculate totals by category - hoping syntax is right
category_totals = {}
for index, row in df.iterrows():
    if row['Category'] in category_totals:
        category_totals[row['Category']] += row['Amount']
    else:
        category_totals[row['Category']] = row['Amount']

# Sort manually
sorted_categories = sorted(category_totals.items(), key=lambda x: x[1], reverse=True)
print(sorted_categories)
```

**Problems with this approach:**
- Lots of syntax to remember
- Easy to make mistakes
- Repetitive patterns
- Hard to modify or extend
- Time-consuming debugging

### New Way Demo: AI Collaboration

Now let's see the AI-assisted approach:

::: {.callout-note}
## AI Collaboration Example

**Your Prompt to AI:**
*"I have a CSV file with columns Date, Amount, Category, Description. The Amount column has dollar signs that need cleaning. I want to calculate total spending by category and show results sorted from highest to lowest. Please write pandas code with comments explaining each step."*

**AI Response:**
:::

```python
import pandas as pd

# Load the transaction data
df = pd.read_csv('transactions.csv')

# Clean the Amount column by removing dollar signs and converting to numeric
df['Amount'] = pd.to_numeric(df['Amount'].str.replace('$', '', regex=False), errors='coerce')

# Remove any rows where Amount couldn't be converted (invalid data)
df = df.dropna(subset=['Amount'])

# Calculate total spending by category
category_spending = df.groupby('Category')['Amount'].sum()

# Sort from highest to lowest spending
category_spending = category_spending.sort_values(ascending=False)

# Display results in a business-friendly format
print("💰 Spending Summary by Category")
print("=" * 35)
for category, amount in category_spending.items():
    print(f"{category:<15}: ${amount:>8.2f}")

print(f"\n📊 Total Spending: ${category_spending.sum():.2f}")
```

**Benefits of AI collaboration:**
- ✅ Clear, commented code
- ✅ Handles data cleaning automatically
- ✅ Business-friendly output formatting
- ✅ Easy to understand and modify
- ✅ Professional code structure

## 🤔 Your Assignment Isn't Scary!

**Here's the secret**: Your Smart Finance Assistant assignment is just an extension of what we're doing today. It uses:

- **CSV data processing** (what we're learning now)
- **AI assistance** (what we're practicing now)
- **Gradio interface** (simple web UI - we'll cover this later)
- **Business insights** (your analysis skills)

**By starting today, you're already 10% done with your final project!**

---

# Part 2: Guided Practice with Sample Data (30 minutes)

Let's practice AI collaboration with sample transaction data before moving to your assignment.

## 📁 Loading Sample Data

First, let's load our sample transaction data:

In [None]:
import pandas as pd

# Load the sample transactions
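# parse_dates makes the date range below compare real dates rather than strings
df = pd.read_csv('../data/transactions.csv', parse_dates=['Date'])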

# Display the first few rows to understand our data
print("📋 Sample Transaction Data:")
print("=" * 40)
print(df.head())

print(f"\n📊 Dataset Info:")
print(f"Total transactions: {len(df)}")
print(f"Date range: {df['Date'].min()} to {df['Date'].max()}")
print(f"Categories: {df['Category'].unique()}")

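If `transactions.csv` is not available in your environment, a small in-memory dataset lets you keep practising. The sketch below is only an illustration: the rows are hypothetical, not taken from the course dataset, and `SAMPLE_CSV` is a name introduced here. It keeps the same four columns and the dollar signs in `Amount`, so the later cleaning steps still apply.

```python
import io

import pandas as pd

# Hypothetical rows for practice only; they are NOT from the course dataset.
SAMPLE_CSV = """Date,Amount,Category,Description
2024-01-03,$45.20,Groceries,Weekly shop
2024-01-05,$12.50,Transport,Bus pass top-up
2024-01-09,$60.00,Dining,Dinner with friends
2024-01-12,$30.80,Groceries,Fruit and vegetables
2024-01-15,$9.99,Entertainment,Streaming subscription
"""

# read_csv accepts any file-like object, so StringIO stands in for a file on disk
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.head())
print(f"Total transactions: {len(df)}")
print(f"Categories: {sorted(df['Category'].unique())}")
```

Once the real file is in place, swap the `io.StringIO(...)` argument for the file path and the rest of the workshop code stays the same.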

## 🤖 AI Collaboration Exercise 1: Basic Analysis

**Your Turn**: Practice prompting AI for basic data analysis.

::: {.callout-tip}
## Practice Prompt Template

Try asking AI:
*"Using this transaction DataFrame, I want to understand my spending patterns. Please create code that shows:
1. Total spending by category
2. Average transaction amount by category  
3. Number of transactions per category
Format the output to be easy to read for a business presentation."*
:::

**Expected AI Response** (try this yourself first!):

In [None]:
# AI-generated solution for spending analysis

# Calculate comprehensive spending summary
# Named aggregation: each keyword becomes an output column
spending_analysis = df.groupby('Category')['Amount'].agg(
    Total_Spent=lambda x: pd.to_numeric(x.str.replace('$', '', regex=False), errors='coerce').sum(),
    Avg_Amount=lambda x: pd.to_numeric(x.str.replace('$', '', regex=False), errors='coerce').mean(),
    Transaction_Count='count'
).round(2)

# Sort by total spending
spending_analysis = spending_analysis.sort_values('Total_Spent', ascending=False)

print("💼 Business Spending Analysis")
print("=" * 50)
print(f"{'Category':<15} {'Total':<10} {'Average':<10} {'Count':<8}")
print("-" * 50)

for category, row in spending_analysis.iterrows():
    total = row['Total_Spent']
    avg = row['Avg_Amount']
    count = row['Transaction_Count']
    print(f"{category:<15} ${total:<9.2f} ${avg:<9.2f} {count:<8.0f}")

print(f"\nGrand Total: ${spending_analysis['Total_Spent'].sum():.2f}")

## 🔍 Practice Your Critique Skills

Look at the AI-generated code above. Apply your critique toolkit:

1. **Do you understand each line?**
2. **Is it handling the dollar signs correctly?**
3. **Is the output business-appropriate?**
4. **What would you ask AI to improve?**

::: {.callout-warning}
## Critique Practice

The code above strips the dollar signs inside each aggregation function, so the same cleaning is repeated for every group and every statistic. How would you prompt AI to fix this?

**Sample critique prompt:**
*"The code is trying to clean dollar signs inside the groupby operation, which is inefficient. Can you first clean the Amount column once, then do the groupby analysis? Also add error handling for invalid amounts."*
:::
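One possible response to that critique prompt is sketched below. It is an illustration rather than the definitive answer: the small DataFrame is hypothetical stand-in data, and the choice to report and drop unparseable amounts is an assumption about what "error handling" should mean here.

```python
import pandas as pd

# Hypothetical data standing in for the workshop DataFrame
df = pd.DataFrame({
    "Category": ["Groceries", "Dining", "Groceries", "Transport", "Dining"],
    "Amount": ["$45.20", "$60.00", "$30.80", "$12.50", "n/a"],
})

# Step 1: clean the Amount column once, up front.
# regex=False treats "$" as a literal character rather than a regex anchor.
df["Amount"] = pd.to_numeric(
    df["Amount"].str.replace("$", "", regex=False),
    errors="coerce",  # values that cannot be parsed become NaN
)

# Step 2: report and drop rows whose amount could not be parsed
invalid_rows = int(df["Amount"].isna().sum())
if invalid_rows:
    print(f"Dropped {invalid_rows} row(s) with invalid amounts")
df = df.dropna(subset=["Amount"])

# Step 3: one groupby over the already-numeric column
spending_analysis = (
    df.groupby("Category")["Amount"]
    .agg(Total_Spent="sum", Avg_Amount="mean", Transaction_Count="count")
    .round(2)
    .sort_values("Total_Spent", ascending=False)
)
print(spending_analysis)
```

Cleaning first means the aggregation only needs built-in functions such as `"sum"` and `"mean"`, which is easier to read and to critique than repeated lambdas.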

## 🤖 AI Collaboration Exercise 2: Business Insights

**Challenge**: Ask AI to help you identify spending patterns that would be useful for financial planning.

**Sample prompt to try:**
*"Based on this spending data, help me identify potential areas where I could reduce expenses. Look for categories with high spending or frequent transactions that might represent opportunities for savings."*

In [None]:
# Your AI collaboration code goes here
# Try the prompt above and see what AI suggests!
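For comparison with whatever your AI suggests, here is a minimal sketch of one way to look for savings opportunities. The data is hypothetical and already cleaned, and the two thresholds (`SHARE_THRESHOLD_PCT` and `COUNT_THRESHOLD`) are arbitrary values chosen for illustration, not recommendations from the course.

```python
import pandas as pd

# Hypothetical, already-cleaned data (Amount is numeric)
df = pd.DataFrame({
    "Category": ["Groceries", "Dining", "Groceries", "Transport", "Dining", "Dining"],
    "Amount": [45.20, 60.00, 30.80, 12.50, 25.00, 15.00],
})

# Total and number of transactions per category
summary = df.groupby("Category")["Amount"].agg(Total="sum", Count="count")

# Each category's share of overall spending, as a percentage
summary["Share_Pct"] = (summary["Total"] / summary["Total"].sum() * 100).round(1)

# Assumed thresholds for illustration only
SHARE_THRESHOLD_PCT = 30.0  # flag categories with at least 30% of spending
COUNT_THRESHOLD = 3         # flag categories with at least 3 transactions

summary["Review"] = (
    (summary["Share_Pct"] >= SHARE_THRESHOLD_PCT)
    | (summary["Count"] >= COUNT_THRESHOLD)
)

print(summary.sort_values("Total", ascending=False))
```

Categories marked `True` in the `Review` column are candidates to look at more closely; whether they are real savings opportunities is a judgment you still have to make.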

---

# Part 3: Assignment Setup (20 minutes)

**🎉 Now for the exciting part - starting your actual assignment!**

## 📋 Assignment Template Overview

Your Smart Finance Assistant assignment has this structure:

```
smart-finance-assistant/
├── smart_finance_assistant.ipynb    # Main notebook
├── README.md                        # Project description
├── diary.md                         # Developer's diary
├── data/
│   └── sample_transactions.csv      # Your data files
└── requirements.txt                 # Python dependencies
```

## 🔨 Setting Up Your Assignment Repository

### Step 1: Fork the Template Repository

1. Go to: https://github.com/michael-borck/smart-finance-assistant
2. Click "Fork" to create your own copy
3. Clone your fork to your computer or open in Google Colab

### Step 2: Understand the Project Structure

Let's explore what you're building:

**Your Smart Finance Assistant will include:**
- 📊 **CSV Data Processing** (what we're learning today!)
- 💬 **Chat Interface** (AI personality for financial advice)
- 🔍 **RAG System** (retrieval from your transaction data)
- 🛠️ **Custom Tool** (like a savings calculator)
- 🌐 **Gradio UI** (simple web interface)

**Don't worry** - you'll build this incrementally over the next 5 weeks!

### Step 3: Locate Your Developer's Diary

Open the `diary.md` file in your repository. This is where you'll document your AI collaborations each week.

::: {.callout-important}
## Week 8 Diary Template

Your diary should include weekly "AI Evidence Packages" like this:

```markdown
## Week 8: Initial CSV Processing

### AI Collaboration #1
**Context**: Setting up basic CSV loading and analysis for transaction data
**Prompt**: "Help me load and analyze transaction CSV data for my finance assistant"
**Result**: Working pandas code with data cleaning and summary statistics
**Reflection**: AI provided good starting code, but I had to ask for better error handling

### AI Collaboration #2
**Context**: Creating spending summary by category
**Prompt**: "Calculate total spending by category and format for business presentation"
**Result**: Professional-looking output with proper formatting
**Reflection**: Learned to be specific about output formatting requirements
```
:::

---

# Part 4: Start Your Assignment! (40 minutes)

**This is it!** Time to make real progress on your Smart Finance Assistant.

## 🎯 Your Week 8 Assignment Goals

By the end of today, you should have:
1. ✅ Forked the assignment repository
2. ✅ Loaded transaction data successfully
3. ✅ Created basic spending summaries using AI assistance
4. ✅ Documented your AI interactions in the diary
5. ✅ Made your first meaningful commit

## 📊 Step 1: Set Up Your Data

**Open your assignment notebook** (`smart_finance_assistant.ipynb`) and start with this section:

### Data Loading and Exploration

::: {.callout-tip}
## Prompt AI for this section:

*"I'm building a Smart Finance Assistant for my programming assignment. I need to load a CSV file with transaction data (columns: Date, Amount, Category, Description). The Amount column has dollar signs that need cleaning. Please write professional Python code with detailed comments that:
1. Loads the CSV data
2. Cleans and validates the data
3. Shows basic information about the dataset
4. Handles any common data issues"*
:::

In [16]:
# AI-assisted data loading section
# (Copy the code AI provides here)
import pandas as pd
import numpy as np

def load_and_clean_transactions(file_path: str) -> pd.DataFrame:
    """
    Load, clean, and validate a CSV file containing transaction data.

    Parameters
    ----------
    file_path : str
        The path to the CSV file to load (e.g., 'sample_transactions.csv').

    Returns
    -------
    pd.DataFrame
        A cleaned and validated pandas DataFrame.
    """

    # ======================
    # 1. LOAD CSV DATA
    # ======================
    try:
        df = pd.read_csv(file_path)
        print(f"✅ Successfully loaded '{file_path}'")
    except FileNotFoundError:
        raise FileNotFoundError(f"❌ File not found: {file_path}")
    except pd.errors.EmptyDataError:
        raise ValueError("❌ The CSV file is empty.")
    except pd.errors.ParserError as e:
        raise ValueError(f"❌ Error parsing CSV: {e}")

    # ======================
    # 2. VALIDATE REQUIRED COLUMNS
    # ======================
    required_columns = {"Date", "Amount", "Category", "Description"}
    missing_columns = required_columns - set(df.columns)
    if missing_columns:
        raise ValueError(f"❌ Missing required columns: {missing_columns}")

    # ======================
    # 3. CLEAN DATA
    # ======================
    # --- Clean Amount column: remove $ and commas, convert to float
    df["Amount"] = pd.to_numeric(
        df["Amount"]
        .astype(str)
        .str.replace(r"[\$,]", "", regex=True)  # remove $ and ,
        .str.strip(),
        errors="coerce",  # empty or invalid values become NaN instead of raising
    )

    # --- Convert Date to datetime format
    df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

    # ======================
    # 4. HANDLE COMMON DATA ISSUES
    # ======================
    # Drop rows missing critical data
    before_drop = len(df)
    df = df.dropna(subset=["Date", "Amount"])
    after_drop = len(df)
    if before_drop != after_drop:
        print(f"⚠️ Dropped {before_drop - after_drop} rows with missing Date or Amount.")

    # Fill missing optional fields with placeholders
    df["Category"] = df["Category"].fillna("Uncategorized")
    df["Description"] = df["Description"].fillna("No description provided")

    # Remove duplicates (if any)
    df = df.drop_duplicates()

    # ======================
    # 5. SHOW BASIC INFO
    # ======================
    print("\n📊 === Dataset Overview ===")
    df.info()  # info() prints directly and returns None, so no print() is needed
    print("\n🔎 === First 5 Transactions ===")
    print(df.head())
    print("\n💰 === Amount Summary Statistics ===")
    print(df["Amount"].describe())

    return df


# ======================
# MAIN EXECUTION BLOCK
# ======================
if __name__ == "__main__":
    try:
        # Path follows the repository layout shown earlier (data/ folder)
        transactions_df = load_and_clean_transactions("data/sample_transactions.csv")
    except Exception as e:
        print(e)
# Add your own comments about what you learned


## 📈 Step 2: Create Core Analysis Functions

**Prompt AI** to help you create the foundation of your finance assistant:

::: {.callout-tip}
## Analysis Functions Prompt:

*"For my Smart Finance Assistant, I need functions to analyze spending data. Please create well-documented functions that:
1. Calculate total spending by category
2. Find the top 5 spending categories
3. Calculate average spending per transaction
4. Identify any unusual transactions (very high or negative amounts)
Each function should have clear business explanations and error handling."*
:::
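To help you judge what the AI returns, here is a hedged sketch of the kind of functions that prompt might produce. All function names are hypothetical, the demo data is made up, and the "unusual" rule (negative amounts, or amounts above Q3 + 1.5 × IQR) is one common convention chosen as an assumption; your AI may reasonably pick a different rule.

```python
import pandas as pd


def total_by_category(df: pd.DataFrame) -> pd.Series:
    """Total spending per category, highest first."""
    return df.groupby("Category")["Amount"].sum().sort_values(ascending=False)


def top_categories(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """The n categories with the highest total spending."""
    return total_by_category(df).head(n)


def average_transaction(df: pd.DataFrame) -> float:
    """Mean amount per transaction; raises ValueError on empty data."""
    if df.empty:
        raise ValueError("No transactions to average.")
    return float(df["Amount"].mean())


def unusual_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Rows with negative amounts or amounts above Q3 + 1.5 * IQR."""
    q1 = df["Amount"].quantile(0.25)
    q3 = df["Amount"].quantile(0.75)
    upper_limit = q3 + 1.5 * (q3 - q1)
    return df[(df["Amount"] < 0) | (df["Amount"] > upper_limit)]


# Hypothetical, already-cleaned demo data
demo = pd.DataFrame({
    "Category": ["Groceries", "Dining", "Groceries", "Transport",
                 "Dining", "Electronics", "Refund"],
    "Amount": [45.20, 60.00, 30.80, 12.50, 25.00, 900.00, -20.00],
})

print(total_by_category(demo))
print(top_categories(demo, n=2))
print(f"Average transaction: ${average_transaction(demo):.2f}")
print(unusual_transactions(demo))
```

These functions assume the `Amount` column is already numeric, so run your cleaning step from Step 1 before calling them.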

In [None]:
# AI-generated analysis functions
# (Your AI collaboration code goes here)

## 💰 Step 3: Generate Business Insights

**Challenge AI** to help you create business-relevant insights:

::: {.callout-tip}
## Business Insights Prompt:

*"Using the transaction analysis functions, help me create a comprehensive spending report that would be useful for personal financial planning. Include recommendations for potential savings and spending pattern observations. Format it like a professional financial summary."*
:::
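As a rough reference point, the sketch below shows one way a plain-text spending report could be assembled. `build_spending_report` is a hypothetical name, the data is made up, and the single observation line is a placeholder; real recommendations should come from your own analysis and your AI collaboration.

```python
import pandas as pd


def build_spending_report(df: pd.DataFrame) -> str:
    """Return a plain-text summary of spending by category."""
    totals = df.groupby("Category")["Amount"].sum().sort_values(ascending=False)
    grand_total = totals.sum()
    if df.empty or grand_total == 0:
        return "No spending data available."

    lines = ["Spending Summary", "=" * 40]
    for category, amount in totals.items():
        share = amount / grand_total * 100
        lines.append(f"{category:<15} ${amount:>9.2f}  ({share:4.1f}%)")
    lines.append("-" * 40)
    lines.append(f"{'Total':<15} ${grand_total:>9.2f}")

    # Placeholder observation: only points at the largest category
    top_category = totals.index[0]
    lines.append("")
    lines.append(f"Observation: {top_category} is the largest spending category.")
    return "\n".join(lines)


# Hypothetical, already-cleaned demo data
demo = pd.DataFrame({
    "Category": ["Groceries", "Dining", "Groceries", "Transport", "Dining", "Dining"],
    "Amount": [45.20, 60.00, 30.80, 12.50, 25.00, 15.00],
})

report = build_spending_report(demo)
print(report)
```

Returning the report as a string, rather than printing inside the function, makes it easier to reuse later in a chat or web interface.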

In [None]:
# AI-generated business insights
# (Your AI collaboration code goes here)

## 📝 Step 4: Document Your AI Interactions

**Important**: As you work, document each AI collaboration in your `diary.md` file.

**Template for each interaction:**

```markdown
### AI Collaboration #[number]
**Context**: [One sentence about what you were trying to achieve]
**Prompt**: "[Your exact prompt to AI]"
**Result**: [Brief description of what AI provided]
**Reflection**: [What worked well? What didn't? What did you learn?]
```

---

# Part 5: Commit and Reflect (10 minutes)

## 💾 Making Your First Commit

You've made real progress! Time to commit your work:

```bash
# In your terminal/command line:
git add .
git commit -m "Week 8: Initial CSV processing and analysis implemented

- Set up data loading with error handling
- Created spending analysis functions  
- Generated business insights report
- Documented AI collaboration in diary

🤖 Generated with AI assistance - see diary.md for details"

git push origin main
```

## 🎯 Exit Ticket Success Criteria

**Congratulations! If you've completed the following, you've successfully completed your Week 8 exit ticket:**

- ☐ **Repository Setup**: Forked and cloned assignment template
- ☐ **Data Processing**: Successfully loaded and analyzed transaction CSV data
- ☐ **AI Collaboration**: Used AI to generate spending summary code
- ☐ **Documentation**: Recorded at least 3 AI interactions in Developer's Diary
- ☐ **Git Commit**: Made meaningful commit with progress description
- ☐ **Business Focus**: Created analysis relevant to personal finance

## 🏆 What You've Accomplished Today

**Technical Skills:**
- Transitioned to AI-assisted programming workflow
- Mastered basic pandas operations for business data
- Learned effective AI prompting techniques
- Practiced code critique and improvement skills

**Assignment Progress:**
- Your Smart Finance Assistant is already 10% complete!
- You have working CSV processing - the foundation of your project
- Your repository structure is set up professionally
- You've established a good AI collaboration workflow

**Professional Development:**
- Experienced modern programming practices
- Built confidence with real business data
- Learned to direct AI like a junior developer
- Created documentation habits that employers value

---

# 🚀 Next Steps and Week 9 Preview

**For Week 9**, you'll build on today's foundation by:
- Adding data visualization to your finance assistant
- Creating interactive filtering and analysis features
- Expanding your business insights with trend analysis
- Continuing to document your AI collaboration journey

**Assignment Momentum**: By completing Week 8's exit ticket today, you've given yourself a significant head start on your final project. Keep this momentum going!

**Remember**: You only need 6 out of 8 exit tickets to pass. By starting here, you're setting yourself up for success.

---

## 🤝 Need Help?

- **Stuck with AI prompts?** Review Module 3 for better prompt templates
- **Code not working?** Use Module 4's critique toolkit to debug with AI
- **Repository issues?** Ask your facilitator for Git help
- **Behind on progress?** Focus on getting one function working well rather than making everything perfect

**Most importantly**: You're now a code director, not a code writer. Embrace the collaboration with AI and focus on understanding the business logic rather than memorizing syntax.

**Great work today! You've officially started your journey toward building a professional Smart Finance Assistant! 🎉**