# IS4010: AI-Enhanced Application Development
## Week 4: Python Data Structures - Interactive Companion Notebook

**Instructor:** Brandon M. Greenwell  
**Course:** IS4010: AI-Enhanced Application Development  
**Week:** 4 - Python Data Structures  

---

### 🎯 Learning Objectives

By the end of this interactive session, you will be able to:

- **Master all four core Python data structures**: lists, tuples, dictionaries, and sets
- **Make performance-driven decisions** about data structure choice with measurable results
- **Collaborate effectively with AI tools** for code optimization and design
- **Apply list comprehensions** for concise and readable data processing
- **Design optimal data structures** for real-world scenarios
- **Critically evaluate AI recommendations** and document your reasoning

### 🛠️ Prerequisites

- Python 3.10+ installed
- Access to a GenAI tool (Claude, ChatGPT, Gemini, etc.)
- Basic understanding of Python variables, loops, and functions
- Completed Labs 1-3

### 🔗 Resources

- [Python Data Structures Documentation](https://docs.python.org/3/tutorial/datastructures.html)
- [List Comprehensions Guide](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions)
- [Lab 04 Instructions](https://github.com/bgreenwell/is4010-course-template/blob/main/labs/lab04/README.md)

---

# 📦 Welcome to Python Data Structures

## The Foundation of Organized Programming

**Data structures** are the backbone of efficient programming. They determine how we organize, store, and access information. Every app you use relies on sophisticated data organization:

- **Instagram feeds**: Lists of posts ordered by time
- **User profiles**: Dictionaries mapping usernames to account information  
- **Unique hashtags**: Sets ensuring no duplicates
- **GPS coordinates**: Tuples of immutable latitude/longitude pairs

Understanding data structures is essential for:
- ✅ **Technical interviews** - Common coding questions
- ✅ **Performance optimization** - Right structure = faster apps
- ✅ **Career relevance** - Industry standard knowledge
- ✅ **AI collaboration** - Better prompts, better results

### 🤔 Reflection Question

Before we dive in, think about an app you use daily. What types of data does it organize? How might different data structures be used?

*Write your thoughts in the cell below:*

**Your reflection here:**

_Double-click this cell to edit and write your thoughts about data organization in apps you use._

---

# Session 1: Sequences & Collections

## What is a Data Structure? 🤔

A **data structure** is a way to organize and store multiple pieces of data in a single variable. Think of them as **specialized containers** for your data.

**Python's built-in arsenal:**
- **[Lists](https://docs.python.org/3/tutorial/introduction.html#lists)** - Ordered, changeable sequences
- **[Tuples](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences)** - Ordered, unchangeable sequences  
- **[Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)** - Key-value pair mappings
- **[Sets](https://docs.python.org/3/tutorial/datastructures.html#sets)** - Unique item collections

**Why choosing wisely matters:**
- The right data structure makes your code faster, cleaner, and more maintainable
- Different structures excel at different operations
- The same operation can take milliseconds with one structure and well under a microsecond with another
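
As a quick preview, the sketch below stores similar data in each of the four built-in structures. The variable names and values are made up for illustration and do not come from the course materials.

```python
# Illustrative values only.
topics_list = ["lists", "tuples", "lists"]   # ordered, changeable, duplicates allowed
topic_pair = ("dictionaries", "sets")        # ordered, unchangeable
topic_weeks = {"lists": 4, "sets": 4}        # maps each key to a value
topics_set = set(topics_list)                # duplicates collapse to unique items

print(type(topics_list).__name__, len(topics_list))      # list 3
print(type(topic_pair).__name__, len(topic_pair))        # tuple 2
print(type(topic_weeks).__name__, topic_weeks["lists"])  # dict 4
print(type(topics_set).__name__, len(topics_set))        # set 2
```

Notice that converting the list to a set drops the repeated `"lists"` entry.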

## The List: A Mutable Sequence 📝

A **[list](https://docs.python.org/3/tutorial/introduction.html#lists)** is an ordered, changeable collection of items - the workhorse of Python programming.

**Key characteristics:**
- **Ordered**: Items maintain their position
- **Mutable**: Add, remove, or modify items after creation
- **Dynamic size**: Grow or shrink as needed
- **Zero-based indexing**: First item is at index 0
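
Because lists are ordered and zero-indexed, you can also take a slice of several items at once. This notebook later uses slices such as `customers[:1000]`, so here is a minimal sketch with hypothetical data.

```python
# Hypothetical data for illustration.
languages = ["Python", "SQL", "Rust", "Go", "Java"]

first_two = languages[:2]   # items at index 0 and 1
last_two = languages[-2:]   # the final two items
middle = languages[1:4]     # index 1 up to, but not including, index 4

print(first_two)  # ['Python', 'SQL']
print(last_two)   # ['Go', 'Java']
print(middle)     # ['SQL', 'Rust', 'Go']

# Slicing returns a new list, so the original is unchanged
print(len(languages))  # 5
```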

In [None]:
# Creating and working with lists - Computer Science Pioneers
pioneers = ["Grace Hopper", "Ada Lovelace", "Katherine Johnson"]

print(f"Original list: {pioneers}")
print(f"Number of pioneers: {len(pioneers)}")
print(f"First pioneer: {pioneers[0]}")
print(f"Last pioneer: {pioneers[-1]}")

In [None]:
# Modifying lists - Adding new pioneers
pioneers.append("Dorothy Vaughan")  # Add to end
pioneers.insert(1, "Hedy Lamarr")   # Insert at specific position

# Update an item
pioneers[0] = "Rear Admiral Grace Hopper"

print(f"Updated list: {pioneers}")
print(f"Now we have {len(pioneers)} pioneers!")

In [None]:
# Useful list methods
print(f"Index of Ada Lovelace: {pioneers.index('Ada Lovelace')}")
print(f"Count of 'Ada Lovelace': {pioneers.count('Ada Lovelace')}")

# Remove items
removed = pioneers.pop()  # Remove and return last item
print(f"Removed: {removed}")
print(f"Final list: {pioneers}")

### 🛠️ Your Turn: Practice with Lists

Create your own list and practice the operations:

In [None]:
# TODO: Create a list of your favorite programming languages
my_languages = []

# TODO: Add at least 3 languages to your list

# TODO: Print the list and its length

# TODO: Access the first and last items

# TODO: Insert a new language at position 1

# TODO: Remove the last language and print what was removed

## The Tuple: An Immutable Sequence 🔒

A **[tuple](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences)** is an ordered, unchangeable collection of items.

**Key characteristics:**
- **Ordered**: Like lists, items maintain their position
- **Immutable**: Cannot add, remove, or change items after creation
- **Performance advantage**: In CPython, tuples are slightly cheaper to create than lists; element access speed is about the same
- **Memory efficient**: A fixed size means less overhead than an equivalent list

**Perfect for:**
- Coordinates (latitude, longitude)
- RGB color values
- Database records
- Function return values
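
The short sketch below illustrates what immutability means in practice. The `city_by_location` dictionary is a hypothetical name added for this example; it shows one consequence of immutability, namely that a tuple of hashable items can be used as a dictionary key.

```python
point = (39.1031, -84.5120)  # a (latitude, longitude) pair

# Attempting to change an item raises TypeError because tuples are immutable
try:
    point[0] = 40.0
except TypeError as error:
    print(f"Cannot modify a tuple: {error}")

# A tuple of hashable items can serve as a dictionary key; a list cannot
city_by_location = {point: "Cincinnati"}
print(city_by_location[(39.1031, -84.5120)])  # Cincinnati
```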

In [None]:
# Real-world tuple examples

# Cincinnati coordinates (latitude, longitude)
cincinnati_location = (39.1031, -84.5120)
denver_location = (39.7391, -104.9847)

print(f"Cincinnati: {cincinnati_location}")
print(f"Denver: {denver_location}")

# UC Brand Colors (RGB values)
uc_blue = (0, 123, 191)
uc_red = (224, 18, 34)
black = (0, 0, 0)

brand_colors = (uc_blue, uc_red, black)
print(f"UC Brand Colors: {brand_colors}")

In [None]:
# Tuple unpacking - A very common Python pattern!
lat, lon = cincinnati_location
print(f"Cincinnati is at latitude {lat}, longitude {lon}")

# Multiple return values from functions
def get_student_info():
    return "Alice Johnson", 95.5, "IS4010"  # Returns a tuple!

name, grade, course = get_student_info()  # Tuple unpacking
print(f"Student: {name}, Grade: {grade}, Course: {course}")

### 🛠️ Your Turn: Practice with Tuples

In [None]:
# TODO: Create a tuple representing your favorite restaurant's location (name, latitude, longitude)

# TODO: Unpack the tuple into separate variables

# TODO: Print a formatted message about the restaurant

# TODO: Create a function that returns multiple pieces of information about yourself
# (name, major, year, favorite_color) and unpack the result

# TODO: Call the function and unpack the results

## Quick Introduction: List Comprehensions 📝

Python has a concise way to create lists called **[list comprehensions](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions)**.

**Pattern:** `[expression for item in iterable]`

**Why learn this now?**
- Perfect for generating test data (which we'll need for performance testing!)
- More readable and Pythonic than traditional loops
- We'll use this pattern extensively in data processing
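
The pattern also accepts an optional trailing condition, `[expression for item in iterable if condition]`, which keeps only the items that pass the test. A minimal sketch with made-up data:

```python
# Illustrative data: lengths of the words that have more than three letters
words = ["set", "list", "tuple", "dict"]
long_word_lengths = [len(word) for word in words if len(word) > 3]
print(long_word_lengths)  # [4, 5, 4]

# Equivalent loop, for comparison
lengths_loop = []
for word in words:
    if len(word) > 3:
        lengths_loop.append(len(word))
print(lengths_loop == long_word_lengths)  # True
```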

In [None]:
# Traditional approach with loops
numbers = []
for i in range(5):
    numbers.append(i * 2)
print(f"Traditional loop: {numbers}")

# List comprehension - more concise!
numbers_comp = [i * 2 for i in range(5)]
print(f"List comprehension: {numbers_comp}")

# Pattern: [expression for item in iterable]
emails = [f"user_{i}@email.com" for i in range(3)]
print(f"Generated emails: {emails}")

### 🛠️ Your Turn: Practice List Comprehensions

In [None]:
# TODO: Create a list of squares for numbers 1-10 using a list comprehension

# TODO: Create a list of even numbers from 0-20 using a list comprehension

# TODO: Create a list of customer emails for customer IDs 100-105

---

# 🔬 Interactive Exercise: AI-Assisted Performance Detective

**Challenge:** Help optimize this slow code using GenAI tools!

## Step 1: The Slow Code

First, let's create some test data and implement a slow customer VIP lookup system.

In [None]:
import time

# Generate large customer database using list comprehensions!
customers = [f"customer_{i}@email.com" for i in range(50000)]
vip_customers = [f"customer_{i}@email.com" for i in range(0, 50000, 100)]  # Every 100th customer

print(f"Total customers: {len(customers):,}")
print(f"VIP customers: {len(vip_customers):,}")
print(f"First few VIPs: {vip_customers[:5]}")

In [None]:
# Slow approach - checking VIP status using list membership
def is_vip_slow(email, vip_list):
    """Check if customer is VIP - using list membership"""
    return email in vip_list

# Let's time this approach (perf_counter is a high-resolution timer)
start_time = time.perf_counter()
vip_count = 0

for customer in customers[:1000]:  # Check first 1000 customers
    if is_vip_slow(customer, vip_customers):
        vip_count += 1

slow_time = time.perf_counter() - start_time
print(f"Slow approach: {slow_time:.4f} seconds, found {vip_count} VIPs")

## Step 2: Consult Your AI Partner 🤖

**Your Mission:**
1. Open your preferred GenAI tool (Claude, ChatGPT, Gemini, etc.)
2. Use this prompt (copy and paste):

```
Analyze this Python code for performance bottlenecks and suggest optimizations using different data structures:

[Copy the slow code from above]

Focus on:
- What data structure would be better for VIP lookups?
- Why is the current approach slow?
- What's the time complexity difference?
```

**Document your AI's response here:**

**AI Response Documentation:**

_Double-click this cell and paste your AI's recommendations here. Include the key insights about data structure choice and performance._

**Key insights from AI:**
- 
- 
- 

**Recommended data structure:**

**Reasoning:**

## Step 3: Implement the AI's Recommendation

In [None]:
# Optimized approach - After consulting with GenAI
# Convert VIP list to set for O(1) lookups (AI recommendation!)
vip_customers_set = set(vip_customers)

def is_vip_fast(email, vip_set):
    """Check if customer is VIP - using set membership"""
    return email in vip_set

# Time the optimized approach (perf_counter is a high-resolution timer,
# which avoids a zero elapsed time and a division-by-zero error below)
start_time = time.perf_counter()
vip_count_fast = 0

for customer in customers[:1000]:
    if is_vip_fast(customer, vip_customers_set):
        vip_count_fast += 1

fast_time = time.perf_counter() - start_time

print(f"Fast approach: {fast_time:.4f} seconds, found {vip_count_fast} VIPs")
print(f"Speedup: {slow_time/fast_time:.1f}x faster!")
print(f"Time saved: {(slow_time - fast_time)*1000:.1f} milliseconds")

## Step 4: Reflection and Learning

**Discussion Questions:**
1. How accurate was your AI's recommendation?
2. What surprised you about the performance difference?
3. When might you still prefer a list over a set?
4. What other optimizations did your AI suggest?

**Your Reflection:**

_Double-click to edit and write your thoughts about the AI collaboration and performance results._

### 🎯 Key Takeaways

- **List membership (`in`)**: O(n) - each lookup may scan the whole list, so it slows down as the data grows
- **Set membership (`in`)**: O(1) on average - hash-based lookup stays fast as the set grows
- **AI tools excel** at identifying common optimization patterns
- **Real impact**: This optimization matters in production systems
- **Learning loop**: AI suggestions → implementation → measurement → iteration
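
A single stopwatch reading can be noisy. One way to get a steadier comparison is the standard library's `timeit` module, which repeats a statement many times. The sizes and the `sample_emails` name below are assumptions chosen for illustration, and the exact timings will differ from machine to machine.

```python
import timeit

# Hypothetical sizes chosen for illustration; adjust to taste.
sample_emails = [f"customer_{i}@email.com" for i in range(5000)]
sample_email_set = set(sample_emails)
target = sample_emails[-1]  # worst case for the list: the last item

# timeit.timeit runs the callable `number` times and returns total seconds
list_seconds = timeit.timeit(lambda: target in sample_emails, number=200)
set_seconds = timeit.timeit(lambda: target in sample_email_set, number=200)

print(f"List: {list_seconds:.6f} s for 200 lookups")
print(f"Set:  {set_seconds:.6f} s for 200 lookups")
```

Try increasing the list size and watch how the list time grows while the set time stays roughly flat.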

---

# Session 2: Mappings & Collections

## The Dictionary: Key-Value Pairs 🗝️

A **[dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)** is a collection of `key: value` pairs optimized for lightning-fast lookups.

**Key characteristics:**
- **Fast lookups**: O(1) average time to find values by key
- **Unique keys**: Each key appears only once
- **Mutable**: Add, remove, or change items after creation
- **Ordered** (Python 3.7+): Maintains insertion order
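
Two of these characteristics, unique keys and insertion order, are easy to check directly. The `stock` dictionary below is a hypothetical example, not part of the course data.

```python
# Hypothetical inventory counts for illustration.
stock = {"laptop": 15, "phone": 0}
stock["book"] = 25     # new key: added at the end
stock["laptop"] = 14   # existing key: value is replaced, position is kept

print(list(stock.keys()))  # ['laptop', 'phone', 'book']
print(stock["laptop"])     # 14
print(len(stock))          # 3
```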

In [1]:
# Real-world dictionary example - User profile system
user_profile = {
    "username": "ada_lovelace",
    "first_name": "Ada",
    "last_name": "Lovelace",
    "birth_year": 1815,
    "occupation": "Mathematician",
    "famous_for": "First computer programmer",
    "followers": 125000,
    "verified": True,
    "programming_languages": ["Analytical Engine", "Mathematics"]
}

# Accessing values (multiple ways)
print(f"User: {user_profile['username']}")
print(f"Born: {user_profile.get('birth_year', 'Unknown')}")  # Safe access
print(f"Followers: {user_profile['followers']:,}")

# Adding and updating
user_profile["location"] = "London, England"
user_profile["followers"] += 1000  # New followers!

print(f"Updated: {user_profile['username']} now has {user_profile['followers']:,} followers")

User: ada_lovelace
Born: 1815
Followers: 125,000
Updated: ada_lovelace now has 126,000 followers


### 🔍 Dictionary Deep Dive: Indexing, Methods & Best Practices

Now let's explore the full power of Python dictionaries with advanced indexing techniques and essential methods.

In [4]:
# Dictionary indexing and safe access methods
print("=== Dictionary Indexing Examples ===")

# Direct indexing - fast but can raise KeyError
print(f"Username: {user_profile['username']}")

# Safe access with .get() - returns None or default if key missing
print(f"Middle name: {user_profile.get('middle_name', 'Not provided')}")
print(f"Age: {user_profile.get('age', 'Unknown')}")

# Check if key exists before accessing
if 'birth_year' in user_profile:
    current_year = 2024
    age = current_year - user_profile['birth_year']
    print(f"Calculated age: {age} years old")

# Multiple ways to handle missing keys
try:
    print(user_profile['nonexistent_key'])
except KeyError:
    print("KeyError handled gracefully")

print(f"Followers with default: {user_profile.get('followers', 0):,}")

=== Dictionary Indexing Examples ===
Username: ada_lovelace
Middle name: Not provided
Age: Unknown
Calculated age: 209 years old
KeyError handled gracefully
Followers with default: 126,000


In [None]:
# Essential dictionary methods - the complete toolkit
print("=== Dictionary Methods in Action ===")

# View all available methods
print("Dictionary methods:", [method for method in dir(user_profile) if not method.startswith('_')][:10])

# Keys, values, and items
print(f"All keys: {list(user_profile.keys())}")
print(f"All values: {list(user_profile.values())[:3]}...")  # First 3 values
print(f"All items: {list(user_profile.items())[:2]}...")    # First 2 key-value pairs

# Update with another dictionary
social_media = {
    "twitter": "@ada_lovelace_codes",
    "github": "ada-codes",
    "linkedin": "ada-lovelace-dev"
}

user_profile.update(social_media)
print(f"After update: {list(user_profile.keys())[-3:]}")  # Last 3 keys added

In [None]:
# List comprehension: the type of every value in the profile
[type(value) for value in user_profile.values()]

# The same logic written as a for loop
types = []
for value in user_profile.values():
    types.append(type(value))
print(types)

[str, str, str, int, str, str, int, bool, list, str]

In [13]:
# Advanced dictionary operations and patterns
print("=== Advanced Dictionary Patterns ===")

# Dictionary comprehensions - like list comprehensions but for dicts!
numbers = [1, 2, 3, 4, 5]
squared_dict = {num: num**2 for num in numbers}
print(f"Squared dictionary: {squared_dict}")

# Filtering with dictionary comprehensions
active_users = {
    "alice": {"last_login": "2024-01-15", "status": "active"},
    "bob": {"last_login": "2023-12-20", "status": "inactive"},
    "charlie": {"last_login": "2024-01-16", "status": "active"}
}

active_only = {name: data for name, data in active_users.items() if data["status"] == "active"}
print(f"Active users only: {active_only}")

# Nested dictionary access
print(f"Alice's last login: {active_users['alice']['last_login']}")

# Safe nested access with get()
print(f"Bob's email: {active_users.get('bob', {}).get('email', 'Not provided')}")

# setdefault - add key only if it doesn't exist
user_profile.setdefault('achievements', [])
user_profile['achievements'].append('First programmer')
print(f"Achievements: {user_profile['achievements']}")

=== Advanced Dictionary Patterns ===
Squared dictionary: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
Active users only: {'alice': {'last_login': '2024-01-15', 'status': 'active'}, 'charlie': {'last_login': '2024-01-16', 'status': 'active'}}
Alice's last login: 2024-01-15
Bob's email: Not provided
Achievements: ['First programmer']


In [10]:
numbers = [1, 2, 3, 4, 5]
squared_dict = {str(num): num**2 for num in numbers}
print(squared_dict)

{'1': 1, '2': 4, '3': 9, '4': 16, '5': 25}


In [12]:
squared_dict['2']

4

### 🛠️ Your Turn: Dictionary Mastery Practice

Practice the dictionary concepts with these hands-on exercises:

In [21]:
# TODO: Create a product inventory dictionary
# Include: product_id, name, price, stock_quantity, category
inventory = {
    "LAPTOP001": {"name": "Gaming Laptop", "price": 1299.99, "stock": 15, "category": "Electronics"},
    "PHONE001": {"name": "Smartphone", "price": 699.99, "stock": 0, "category": "Electronics"},
    "BOOK001": {"name": "Python Programming", "price": 39.99, "stock": 25, "category": "Books"}
}

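# TODO: Practice safe access - get the price of "LAPTOP001"
laptop_price = inventory.get("LAPTOP001", {}).get("price")

# Direct indexing with a missing key ("type") raises KeyError
try:
    inventory["LAPTOP001"]["type"]
except KeyError:
    print("KeyError handled gracefully")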

# TODO: Check if "PHONE001" is in stock (stock_quantity > 0)
is_in_stock = inventory.get("PHONE001", {}).get("stock", 0) > 0
print(f"Is PHONE001 in stock? {'Yes' if is_in_stock else 'No'}")


# TODO: Add a new product using setdefault
inventory.setdefault("TABLET001", {"name": "Tablet", "price": 499.99, "stock": 30, "category": "Electronics"})

# Add to dict using []
inventory["HEADPHONES001"] = {"name": "Wireless Headphones", "price": 199.99, "stock": 50, "category": "Electronics"}

# TODO: Create a dictionary comprehension to get all product names
product_names = {pid: details["name"] for pid, details in inventory.items()}
print(f"Product names: {product_names}")

# Do the same using a list comprehension
product_names_list = [details["name"] for details in inventory.values()]

# Now use a for loop to print each product name
for details in inventory.values():
    print(details["name"])

# TODO: Filter to get only products with stock > 10

# TODO: Use .pop() to remove PHONE001 and print what was removed
inventory.pop("PHONE001", None)  # Remove PHONE001 if it exists

KeyError handled gracefully
Is PHONE001 in stock? No
Product names: {'LAPTOP001': 'Gaming Laptop', 'PHONE001': 'Smartphone', 'BOOK001': 'Python Programming', 'TABLET001': 'Tablet', 'HEADPHONES001': 'Wireless Headphones'}
Gaming Laptop
Smartphone
Python Programming
Tablet
Wireless Headphones


{'name': 'Smartphone', 'price': 699.99, 'stock': 0, 'category': 'Electronics'}

In [22]:
inventory

{'LAPTOP001': {'name': 'Gaming Laptop',
  'price': 1299.99,
  'stock': 15,
  'category': 'Electronics'},
 'BOOK001': {'name': 'Python Programming',
  'price': 39.99,
  'stock': 25,
  'category': 'Books'},
 'TABLET001': {'name': 'Tablet',
  'price': 499.99,
  'stock': 30,
  'category': 'Electronics'},
 'HEADPHONES001': {'name': 'Wireless Headphones',
  'price': 199.99,
  'stock': 50,
  'category': 'Electronics'}}

## The Set: Unique Items Only 🎯

A **[set](https://docs.python.org/3/tutorial/datastructures.html#sets)** is an unordered collection of **unique** items.

**Key characteristics:**
- **No duplicates**: Automatically removes duplicate entries
- **Fast membership testing**: `item in set` is extremely efficient
- **Mathematical operations**: Union, intersection, difference
- **Mutable**: Add and remove items after creation
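
The sketch below shows the mutating methods `add()`, `discard()`, and `remove()` on a small made-up set. The name `topic_tags` is an illustrative assumption.

```python
topic_tags = {"python", "data"}  # illustrative values
topic_tags.add("sets")           # adds a new item
topic_tags.add("python")         # already present, so nothing changes
print(len(topic_tags))           # 3

topic_tags.discard("missing")    # discard() ignores items that are absent
try:
    topic_tags.remove("missing") # remove() raises KeyError for absent items
except KeyError:
    print("remove() raised KeyError")

print(sorted(topic_tags))        # ['data', 'python', 'sets']
```

Prefer `discard()` when a missing item is not an error, and `remove()` when it should be.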

In [23]:
# Website analytics: tracking unique visitors
daily_visitors = ["alice", "bob", "charlie", "alice", "david", "alice", "eve"]
monthly_visitors = ["alice", "frank", "george", "bob", "helen"]

# Remove duplicates and get unique visitors
unique_daily = set(daily_visitors)
unique_monthly = set(monthly_visitors)

print(f"Daily visitors (with duplicates): {daily_visitors}")
print(f"Unique daily visitors: {unique_daily}")
print(f"Unique daily count: {len(unique_daily)}")

# Set operations (mathematical set theory)
all_visitors = unique_daily | unique_monthly        # Union
returning_visitors = unique_daily & unique_monthly  # Intersection
new_visitors = unique_monthly - unique_daily        # Difference

print(f"All unique visitors: {all_visitors}")
print(f"Returning visitors: {returning_visitors}")
print(f"New visitors this month: {new_visitors}")

Daily visitors (with duplicates): ['alice', 'bob', 'charlie', 'alice', 'david', 'alice', 'eve']
Unique daily visitors: {'david', 'bob', 'charlie', 'eve', 'alice'}
Unique daily count: 5
All unique visitors: {'david', 'bob', 'helen', 'charlie', 'george', 'frank', 'eve', 'alice'}
Returning visitors: {'bob', 'alice'}
New visitors this month: {'frank', 'george', 'helen'}


### 🎯 Set Deep Dive: Advanced Operations & Mathematical Power

Sets are much more powerful than just removing duplicates. Let's explore their full mathematical capabilities and practical applications.

In [None]:
# Advanced set operations - the mathematical toolkit
print("=== Set Operations in Business Analytics ===")

# Customer segments from different marketing campaigns
email_campaign = {"alice", "bob", "charlie", "diana", "eve"}
social_media = {"bob", "diana", "frank", "grace", "henry"}
referral_program = {"alice", "charlie", "frank", "ivan", "jane"}

print(f"Email campaign: {email_campaign}")
print(f"Social media: {social_media}")
print(f"Referral program: {referral_program}")

# Union (|) - All customers reached by ANY campaign
all_customers = email_campaign | social_media | referral_program
print(f"Total unique customers reached: {len(all_customers)} -> {all_customers}")

# Intersection (&) - Customers reached by ALL campaigns
super_engaged = email_campaign & social_media & referral_program
print(f"Super engaged (all 3 campaigns): {super_engaged}")

# Difference (-) - Customers only in email, not in social media
email_only = email_campaign - social_media
print(f"Email-only customers: {email_only}")

# Symmetric difference (^) - Customers in email OR social media, but not both
exclusive_reach = email_campaign ^ social_media
print(f"Exclusive reach (email XOR social): {exclusive_reach}")

In [None]:
# Set methods and membership testing
print("=== Set Methods and Testing ===")

# Available set methods
sample_set = {1, 2, 3}
print("Set methods:", [method for method in dir(sample_set) if not method.startswith('_')][:10])

# Create a skills tracking system
required_skills = {"python", "sql", "git", "statistics"}
applicant_a = {"python", "java", "git", "docker"}
applicant_b = {"python", "sql", "git", "statistics", "machine-learning"}

print(f"Required skills: {required_skills}")
print(f"Applicant A skills: {applicant_a}")
print(f"Applicant B skills: {applicant_b}")

# Check qualifications using set methods
a_qualified = required_skills.issubset(applicant_a)
b_qualified = required_skills.issubset(applicant_b)

print(f"Applicant A qualified: {a_qualified}")
print(f"Applicant B qualified: {b_qualified}")

# Find missing and extra skills
a_missing = required_skills - applicant_a
a_extra = applicant_a - required_skills
b_extra = applicant_b - required_skills

print(f"Applicant A missing: {a_missing}")
print(f"Applicant A extra skills: {a_extra}")
print(f"Applicant B extra skills: {b_extra}")

# Fast membership testing
print(f"Does applicant A know Python? {'python' in applicant_a}")
print(f"Does applicant A know Rust? {'rust' in applicant_a}")

In [24]:
# Set comprehensions and advanced patterns
print("=== Set Comprehensions and Performance ===")

# Set comprehension - like list comprehensions but creates sets
even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(f"Even squares: {even_squares}")

# Performance comparison: list vs set for membership testing
import time

# Large datasets
large_list = list(range(100000))
large_set = set(range(100000))

# Test membership in list (perf_counter is a high-resolution timer)
start = time.perf_counter()
result1 = 99999 in large_list
list_time = time.perf_counter() - start

# Test membership in set
start = time.perf_counter()
result2 = 99999 in large_set
set_time = time.perf_counter() - start

print(f"List membership test: {list_time:.6f} seconds")
print(f"Set membership test: {set_time:.6f} seconds")
print(f"Set is {list_time/set_time:.1f}x faster for membership testing!")

# Remove duplicates from large list
data_with_duplicates = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4] * 1000
unique_data = list(set(data_with_duplicates))
print(f"Original list size: {len(data_with_duplicates)}")
print(f"After removing duplicates: {len(unique_data)}")

# Working with immutable sets
frozen_permissions = frozenset(["read", "write", "execute"])
print(f"Frozen set (immutable): {frozen_permissions}")
# frozen_permissions.add("admin")  # This would raise an AttributeError!

=== Set Comprehensions and Performance ===
Even squares: {0, 64, 4, 36, 16}
List membership test: 0.001083 seconds
Set membership test: 0.000030 seconds
Set is 36.0x faster for membership testing!
Original list size: 10000
After removing duplicates: 4
Frozen set (immutable): frozenset({'read', 'write', 'execute'})


### 🛠️ Your Turn: Set Mastery Practice

Apply set operations to solve real-world problems:

In [None]:
# TODO: Social media platform analysis
instagram_followers = {"alice", "bob", "charlie", "diana", "eve", "frank"}
twitter_followers = {"bob", "diana", "grace", "henry", "ivan", "frank"}
tiktok_followers = {"alice", "charlie", "grace", "jane", "diana"}

# TODO: Find followers who follow you on ALL platforms
all_platforms = instagram_followers & twitter_followers & tiktok_followers

# TODO: Find your total unique audience across all platforms
total_audience = instagram_followers | twitter_followers | tiktok_followers
# Do the same using set.union()
total_audience_union = set().union(instagram_followers, twitter_followers, tiktok_followers)


# TODO: Find followers who are ONLY on Instagram
instagram_only = instagram_followers - (twitter_followers | tiktok_followers)

# TODO: Find followers who are on Instagram OR Twitter, but not both
instagram_twitter_xor = instagram_followers ^ twitter_followers

# TODO: Use set comprehension to create tags from a blog post
blog_post = "Python programming data science machine learning AI artificial intelligence"
tags = {word.lower() for word in blog_post.split()}

# TODO: Check if required skills are a subset of candidate skills
required_job_skills = {"python", "sql", "excel"}
candidate_skills = {"python", "sql", "excel", "tableau", "r"}

# TODO: Find what additional skills the candidate brings

### 🔄 Combining Dictionaries and Sets: Real-World Applications

The most powerful solutions often combine multiple data structures. Let's see dictionaries and sets working together:

In [None]:
# Real-world example: E-commerce recommendation system
print("=== E-commerce Recommendation System ===")

# User purchase history (dictionary with sets as values)
user_purchases = {
    "alice": {"laptop", "mouse", "keyboard", "monitor"},
    "bob": {"phone", "case", "charger"},
    "charlie": {"laptop", "mouse", "headphones"},
    "diana": {"tablet", "stylus", "case", "headphones"}
}

# Product categories (dictionary mapping products to their categories)
product_categories = {
    "laptop": "computers", "mouse": "accessories", "keyboard": "accessories",
    "monitor": "displays", "phone": "mobile", "case": "accessories",
    "charger": "accessories", "headphones": "audio", "tablet": "mobile",
    "stylus": "accessories"
}

print("User purchase history:")
for user, products in user_purchases.items():
    print(f"  {user}: {products}")

# Find users who bought similar products
alice_products = user_purchases["alice"]
charlie_products = user_purchases["charlie"]

common_products = alice_products & charlie_products
print(f"Alice and Charlie both bought: {common_products}")

# Recommendation: products Alice bought that Charlie didn't
recommendations_for_charlie = alice_products - charlie_products
print(f"Recommend to Charlie (based on Alice): {recommendations_for_charlie}")

# Find all users who bought accessories
accessory_buyers = set()
for user, products in user_purchases.items():
    user_categories = {product_categories[product] for product in products}
    if "accessories" in user_categories:
        accessory_buyers.add(user)

print(f"Users who bought accessories: {accessory_buyers}")

# Category analysis using dictionary and set comprehensions
category_buyers = {}
for category in set(product_categories.values()):
    buyers = {user for user, products in user_purchases.items() 
              if any(product_categories[product] == category for product in products)}
    category_buyers[category] = buyers

print("Buyers by category:")
for category, buyers in category_buyers.items():
    print(f"  {category}: {buyers}")

---

# 🎯 Putting It All Together: Lab 04 Integration

You've experienced the power of AI-assisted data structure optimization. Now apply those skills in [Lab 04](https://github.com/bgreenwell/is4010-course-template/blob/main/labs/lab04/README.md)!

## Your AI Pair Programming Toolkit 🤖

Based on today's exercises, here are proven prompt templates for Lab 04 success:

### **For Performance Analysis:**
```
Analyze this Python code for performance bottlenecks and suggest optimizations using different data structures:

[Your code here]

Focus on:
- What data structure would be better for [specific operation]?
- Why is the current approach slow?
- What's the time complexity difference?
```

### **For Data Structure Selection:**
```
I need to design data structures for this scenario:

[Describe your problem]

Requirements:
- Fast lookup of [items] by [key]
- Quick filtering by [criteria]  
- Efficient [specific operation]

What data structures should I use and why?
```

### **For Code Review:**
```
Review this Python implementation and suggest improvements for:
- Data structure choice
- Code readability  
- Performance optimization

[Your code here]

What improvements can you suggest?
```

## Success Strategy for Lab 04

1. 🤖 **Collaborate with AI** for data structure recommendations
2. 💡 **Test and validate** AI suggestions with real code  
3. 📝 **Document your reasoning** in `lab04_prompts.md`
4. ✅ **Verify with automation** using GitHub Actions

## Key Concepts to Apply

- **List vs Set performance**: When uniqueness and fast membership matter
- **Dictionary optimization**: Fast key-based lookups for complex data
- **List comprehensions**: Concise data generation and processing
- **Critical evaluation**: Balance AI suggestions with your understanding
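
As one possible illustration of the dictionary point above, the sketch below builds a lookup index from a list of records. The `students` data and field names are hypothetical and are not taken from Lab 04.

```python
# Hypothetical records for illustration.
students = [
    {"id": "S001", "name": "Alice", "major": "IS"},
    {"id": "S002", "name": "Bob", "major": "Finance"},
    {"id": "S003", "name": "Charlie", "major": "IS"},
]

# Build the index once with a dictionary comprehension
students_by_id = {student["id"]: student for student in students}

# Later lookups go straight to the record by key
print(students_by_id["S002"]["name"])              # Bob
print(students_by_id.get("S999", {}).get("name"))  # None

# A set comprehension collects the distinct majors
majors = {student["major"] for student in students}
print(sorted(majors))  # ['Finance', 'IS']
```

Building the index costs one pass over the list, so it pays off when you look up records by ID more than a handful of times.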

**Remember:** The goal isn't just to get the right answer, but to understand *why* it's right and *when* to use each approach.