<a href="https://colab.research.google.com/github/jiuwong/sfu_AppliedAI_DataAnalytics/blob/main/2_1_asking_questions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://sfudial.ca/wp-content/uploads/SFU-DIAL-Logo.png" width=40%>&nbsp;&nbsp;&nbsp;&nbsp;<img src="https://www.sfu.ca/content/dam/sfu/images/brand_extension/SFU-Big-Data_Logo.png" width=40%>

# Lab 2.1: Asking Questions with AI

Learn how to effectively prompt AI assistants to get better answers for data analysis questions. Practice crafting specific, contextual prompts that help you understand your data and make better decisions.

**Use the TODOs and prompt your AI like a teammate. Think critically, experiment often, and document your process.**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/gist/git-steb/5e2f439060ed346cb6390bc6f63d2f8d/2_1_Asking_Questions.ipynb)

## Lab Outline

- **Part 1:** Learn effective prompting strategies
- **Part 2:** Practice with real data questions
- **Part 3:** Compare different prompt approaches
- **Part 4:** Reflect and build your prompting toolkit
- **Deliverable:** Reflection on effective AI prompting

## Getting Help from Your AI Assistant

**Why AI assistance matters:** AI tools can help you formulate better questions, understand data patterns, and guide your analysis. They're particularly valuable for learning how to ask the right questions in data analysis.

**Good prompts:**
- "Help me understand what this data is telling me"
- "What questions should I ask about this dataset?"
- "How can I frame this analysis question better?"
- "What would an expert look for in this type of data?"
- "Help me interpret these results in business terms"
- "What follow-up questions should I ask?"

**Avoid vague prompts like "analyze this data."** They give the assistant no goal, no context, and no specific columns to work with.

**Pro Tip:** Ask "what would an expert ask" and "how should I frame this question" to get more targeted assistance.

## Learning Objectives
- [ ] Objective 1: Learn effective prompting strategies
- [ ] Objective 2: Practice asking better data questions
- [ ] Objective 3: Compare different prompt approaches
- [ ] Objective 4: Reflect on effective AI communication

## Part 1: Understanding Effective Prompting

### Step 1: The Art of Asking Questions

Effective AI prompting is about being specific, contextual, and clear about what you want to achieve.

### Types of Effective Prompts

**Key prompt types to master:**
1. **Specific**: "Help me understand the relationship between X and Y"
2. **Contextual**: "Given this business scenario, what should I look for?"
3. **Actionable**: "What steps should I take to analyze this data?"
4. **Comparative**: "How does this approach compare to alternative methods?"
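
The four prompt types above can be sketched as fill-in templates. This is only an illustration: the `PROMPT_TEMPLATES` name, the placeholder names, and the example values are assumptions, not part of the lab.

```python
# Hypothetical templates, one per prompt type from the list above.
PROMPT_TEMPLATES = {
    "Specific": "Help me understand the relationship between {x} and {y}.",
    "Contextual": "Given this business scenario: {scenario}, what should I look for?",
    "Actionable": "What steps should I take to analyze {dataset}?",
    "Comparative": "How does {approach} compare to {alternative}?",
}

# Example values, assumed for illustration only.
values = {
    "x": "income",
    "y": "purchase_amount",
    "scenario": "a retailer wants to increase repeat purchases",
    "dataset": "a table of 100 customers",
    "approach": "a correlation analysis",
    "alternative": "a simple scatter plot",
}

# str.format ignores unused keyword arguments, so one dict can fill every template.
prompts = {
    name: template.format(**values)
    for name, template in PROMPT_TEMPLATES.items()
}

for name, prompt in prompts.items():
    print(f"{name}: {prompt}")
```

Swapping the entries in `values` is a quick way to produce the same four prompt types for a different dataset.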

In [None]:
# TODO: Practice different types of prompts

### Step 2: Prompting Strategies

In [None]:
# TODO: Learn different prompting strategies
strategies = {
    "Chain of Thought": "Break down complex questions into steps",
    "Context Setting": "Provide background information and constraints",
    "Role Playing": "Ask AI to think like a specific expert",
    "Iterative Refinement": "Start broad, then narrow down based on responses"
}

for strategy, description in strategies.items():
    print(f"{strategy}: {description}")
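
Several of these strategies can be combined in a single prompt. The sketch below is one possible way to do that; `build_prompt` and its parameters are hypothetical helpers introduced for illustration, and the example text is made up.

```python
def build_prompt(role, context, question, steps=None):
    """Combine role playing, context setting, and step-by-step reasoning."""
    parts = [
        f"Act as {role}.",        # Role Playing
        f"Context: {context}",    # Context Setting
        f"Question: {question}",
    ]
    if steps:
        # Chain of Thought: ask for the reasoning as explicit, ordered steps.
        numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))
        parts.append("Work through these steps in order:\n" + numbered)
    return "\n".join(parts)


prompt = build_prompt(
    role="a marketing analyst",
    context="100 customers with age, income, satisfaction (1-10), and purchase amount.",
    question="Which customer groups should we study first?",
    steps=["Describe each column", "Suggest comparisons", "List caveats"],
)
print(prompt)
```

For iterative refinement, the assistant's answer can be summarized and passed back in as the `context` of the next, narrower question.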

## Part 2: Practice with Real Data Questions

### Step 1A: Sample Dataset Analysis (Built-in Sample Data)

In [None]:
import pandas as pd
import numpy as np

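# Create sample dataset for practice
np.random.seed(42)  # fixed seed so everyone gets the same values
# Each column is drawn independently at random, so any relationship an
# analysis finds between columns is due to chance.
data = {
    'customer_id': range(1, 101),                       # IDs 1 to 100
    'age': np.random.normal(35, 10, 100),               # float, mean 35, sd 10
    'income': np.random.normal(50000, 15000, 100),      # float, mean 50,000, sd 15,000
    'satisfaction': np.random.randint(1, 11, 100),      # integer score from 1 to 10
    'purchase_amount': np.random.exponential(100, 100)  # right-skewed, mean about 100
}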

df = pd.DataFrame(data)
print("Sample dataset created for practice")
print(f"Shape: {df.shape}")
df.head()
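
An AI assistant cannot see your DataFrame unless you describe it, so it can help to paste a short text summary into the prompt. The sketch below shows one possible way to build such a summary. `describe_for_prompt` is a hypothetical helper, and the small `demo` frame is a stand-in so the example runs on its own.

```python
import numpy as np
import pandas as pd


def describe_for_prompt(df: pd.DataFrame) -> str:
    """Return a short plain-text summary of a DataFrame to paste into a prompt."""
    lines = [f"The dataset has {df.shape[0]} rows and {df.shape[1]} columns."]
    for column in df.columns:
        series = df[column]
        if pd.api.types.is_numeric_dtype(series):
            # Numeric columns: report the range and the mean.
            lines.append(
                f"- {column} ({series.dtype}): min={series.min():.1f}, "
                f"mean={series.mean():.1f}, max={series.max():.1f}"
            )
        else:
            # Other columns: report how many distinct values there are.
            lines.append(f"- {column} ({series.dtype}): {series.nunique()} unique values")
    return "\n".join(lines)


# Small stand-in frame (assumed for illustration only).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.normal(35, 10, 20),
    "satisfaction": rng.integers(1, 11, 20),
})

summary = describe_for_prompt(demo)
print(summary)
```

Calling `describe_for_prompt(df)` on the practice dataset and placing the result before your question gives the assistant concrete column names and ranges to refer to.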

### Step 1B (Optional): Alternative with Vancouver Bike Volumes Data

Instead of using the built-in sample dataset above, you can try asking questions
about a **real open data dataset** from the City of Vancouver:

- **Overview page (bike volumes explanation):**
  - https://vancouver.ca/streets-transportation/how-we-collect-bike-volumes.aspx
- **2021–2024 daily bike volume data (Excel file):**
  - https://vancouver.ca/files/cov/bike-volume-2021-2024.xlsx

This dataset contains daily two-way bike counts on key routes in Vancouver.

> If you switch to this dataset (or bring your **own** dataset),
> the *follow-up questions you ask your AI assistant should also change*
> to match the new context (bike volumes over time instead of customer data).

In [None]:
# OPTIONAL: Use real Vancouver Bike Volumes data instead of the sample dataset.
#
# Uncomment this block if you want to work with the City of Vancouver
# bike volume data for this lab.
#
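# import pandas as pd
#
# # read_excel needs the openpyxl package for .xlsx files;
# # if it is missing, install it with: pip install openpyxl
# bike_url = "https://vancouver.ca/files/cov/bike-volume-2021-2024.xlsx"
# bike_df = pd.read_excel(bike_url)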
#
# print("Vancouver bike volumes data loaded")
# print(f"Shape: {bike_df.shape}")
# bike_df.head()
#
# # When you switch to this dataset, update your AI prompts and follow-up questions.
# # For example:
# # - "What patterns do you see in bike volumes across different routes?"
# # - "How do bike volumes change by month or season?"
# # - "What follow-up questions should I ask about factors that might affect bike traffic?"
# # - "If I bring my own dataset, what kinds of questions should I ask about *that* data?"

### Step 2: Practice Different Question Types


> If you switch to a different dataset (for example, the Vancouver Bike Volumes
> data in Step 1B or your own dataset), adapt the questions you ask so they
> match that context (e.g., bike traffic over time, routes, seasons, etc.).


In [None]:
# TODO: Practice asking different types of questions about this data
#
# Example questions to practice:
questions = [
    "What patterns do you see in customer satisfaction?",
    "How does income relate to purchase behavior?",
    "What would a marketing expert focus on in this data?",
    "What follow-up questions should I ask about customer segments?",
    "How can I validate these insights with additional data?"
]

print("Practice asking these types of questions:")
for i, question in enumerate(questions, 1):
    print(f"{i}. {question}")

## Part 3: Comparing Prompt Approaches

### Step 1: Vague vs Specific Prompts


**Vague prompts (avoid these):**
- "Analyze this data"
- "What do you think?"
- "Help me understand"

**Specific prompts (use these):**
- "What customer segments can I identify based on income and satisfaction?"
- "How can I improve customer retention using this data?"
- "What statistical tests should I run to validate this hypothesis?"
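
One rough way to see the difference between the two lists is to check whether a prompt names any column of the dataset. The sketch below is a hypothetical heuristic, not a measure of prompt quality. `mentioned_columns` is an illustrative helper, and the column names are those of the sample dataset from Part 2.

```python
import re

# Column names of the sample dataset from Part 2.
COLUMNS = ["customer_id", "age", "income", "satisfaction", "purchase_amount"]


def mentioned_columns(prompt, columns=COLUMNS):
    """Return the column names that appear in the prompt as whole words."""
    text = prompt.lower()
    found = []
    for column in columns:
        phrase = column.replace("_", " ")  # "purchase_amount" -> "purchase amount"
        pattern = rf"\b({re.escape(column)}|{re.escape(phrase)})\b"
        if re.search(pattern, text):
            found.append(column)
    return found


examples = [
    "Analyze this data",
    "What customer segments can I identify based on income and satisfaction?",
    "How can I improve customer retention using this data?",
]
for prompt in examples:
    print(f"{mentioned_columns(prompt)} <- {prompt}")
```

The third example names no column even though it appears in the specific list, which shows the limit of the check: a prompt can also be specific about a goal rather than about a column.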

In [None]:
# TODO: Compare different prompt styles

### Step 2: Context Matters

### Good Context Includes

**Essential context elements:**
- Business objective
- Data constraints
- Target audience
- Success criteria
- Available resources
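
The context elements above can be kept together and rendered in front of any question. The sketch below is one way to do this. The `PromptContext` class, its field names, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PromptContext:
    """The five context elements listed above (names are illustrative)."""
    business_objective: str
    data_constraints: str
    target_audience: str
    success_criteria: str
    available_resources: str

    def render(self, question: str) -> str:
        """Prefix a question with one labeled line per context element."""
        lines = [
            f"Business objective: {self.business_objective}",
            f"Data constraints: {self.data_constraints}",
            f"Target audience: {self.target_audience}",
            f"Success criteria: {self.success_criteria}",
            f"Available resources: {self.available_resources}",
            "",
            f"Question: {question}",
        ]
        return "\n".join(lines)


# Example values, made up for illustration.
context = PromptContext(
    business_objective="Increase repeat purchases",
    data_constraints="100 customers, no purchase dates",
    target_audience="Marketing managers without a statistics background",
    success_criteria="Two or three customer groups we can act on",
    available_resources="pandas in a Colab notebook",
)
prompt = context.render("Which customer segments should we look at first?")
print(prompt)
```

Keeping the context in one object makes it easy to ask several questions against the same background and compare the answers.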

In [None]:
# TODO: Practice adding context to your prompts

## Part 4: Reflection and Best Practices

### Step 1: Document Your Learning

### Reflection Questions

**Consider these questions about your prompting experience:**
1. Which prompts gave you the most useful responses?
2. How did context affect the quality of AI responses?
3. What patterns did you notice in effective prompts?
4. How will you apply this to future data analysis?
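
To answer these questions from evidence rather than memory, you could keep a small log of the prompts you try. The sketch below is one possible format. `log_prompt`, the 1 to 5 rating scale, and the two entries are placeholders, not real results.

```python
import pandas as pd

prompt_log = []


def log_prompt(prompt, usefulness, note=""):
    """Record one prompt with a 1-5 usefulness rating and an optional note."""
    if not 1 <= usefulness <= 5:
        raise ValueError("usefulness must be between 1 and 5")
    prompt_log.append({"prompt": prompt, "usefulness": usefulness, "note": note})


# Placeholder entries; replace them with your own prompts and ratings.
log_prompt("Analyze this data", 1, "placeholder note")
log_prompt(
    "What customer segments can I identify based on income and satisfaction?",
    4,
    "placeholder note",
)

# Most useful prompts first.
log_df = pd.DataFrame(prompt_log).sort_values("usefulness", ascending=False)
print(log_df.to_string(index=False))
```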

In [None]:
# TODO: Reflect on your prompting experience

### Step 2: Create Your Prompting Toolkit

In [None]:
# TODO: Build your personal prompting toolkit
toolkit = {
    "Data Exploration": "What patterns should I look for in this dataset?",
    "Statistical Analysis": "What tests would be appropriate for this hypothesis?",
    "Business Insights": "How would a business analyst interpret these results?",
    "Technical Implementation": "What's the best way to implement this analysis?",
    "Validation": "How can I verify these findings?"
}

print("Your AI Prompting Toolkit:")
for category, example in toolkit.items():
    print(f"{category}: {example}")

## Metacognitive Learning Prompts

### Reflection Questions
- **What did you learn about your own learning process?**
- **How would you apply this to a different domain?**
- **What connections do you see to other concepts?**
- **What questions do you still have?**

### Transfer Applications
- **How would this work in healthcare?**
- **What would change if you had 10 times more data?**
- **How would you explain this to business executives?**

### Expert Thinking
- **What would an expert do differently?**
- **What assumptions are you making?**
- **How would you validate your approach?**