# Chapter 4: Data Structures
**From: Zero to AI Agent**

## Overview
In this chapter, you'll learn about:
- Lists: creation, indexing, and slicing
- List methods and operations
- Tuples and their immutability
- Dictionaries: key-value pairs
- Sets and their operations
- Choosing the right data structure
- List comprehensions (gentle introduction)


---
## Section 4.1: Lists: creation, indexing, and slicing

In [None]:
# From: first_list.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# first_list.py - Creating your first lists

# Let's organize our data!
# Navigate to your folder first
# ~/Desktop/ai_agents_complete/part_1_python/chapter_04_data_structures/

# Creating your first list
favorite_numbers = [7, 42, 13, 99, 3.14]
print("My favorite numbers:", favorite_numbers)

# Lists can hold different types of data
mixed_bag = [42, "hello", 3.14, True, "Python"]
print("A list with different types:", mixed_bag)

# Even an empty list (like an empty toy box, ready to be filled!)
empty_list = []
print("An empty list:", empty_list)


In [None]:
# From: creating_lists.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# creating_lists.py - All the ways to create lists

# Method 1: Direct creation (what we just did)
colors = ["red", "blue", "green", "yellow"]
print("Method 1 - Direct creation:", colors)

# Method 2: Creating from a string
sentence = "Python is amazing"
words = sentence.split()  # Splits the string into a list of words
print("Method 2 - From string:", words)

# Method 3: Using the list() function
numbers_string = "12345"
digits = list(numbers_string)  # Converts each character to a list item
print("Method 3 - Using list():", digits)

# Method 4: Creating with range() - remember this from loops?
counting = list(range(1, 11))  # Numbers 1 through 10
print("Method 4 - Using range():", counting)

# Method 5: Repeating elements
lots_of_zeros = [0] * 5  # Creates [0, 0, 0, 0, 0]
print("Method 5 - Repetition:", lots_of_zeros)


In [None]:
# From: list_indexing.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# list_indexing.py - Accessing list elements with indexing

# Let's create a list of AI terms we'll be using later
ai_terms = ["neural", "network", "training", "model", "agent", "prompt"]

# Accessing items by their index (position)
first_term = ai_terms[0]   # Gets "neural" (index 0)
third_term = ai_terms[2]   # Gets "training" (index 2)

print(f"First term: {first_term}")
print(f"Third term: {third_term}")

# You can also modify items using their index
ai_terms[1] = "NETWORK"    # Changes "network" to "NETWORK"
print("After modification:", ai_terms)

# What happens if we try to access an index that doesn't exist?
# Uncomment the next line to see the error:
# bad_access = ai_terms[10]  # IndexError: list index out of range
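
If you would rather not crash on a bad index, there are two common ways to guard the access. The sketch below is a minimal illustration, assuming the same `ai_terms` list; the names `checked_result` and `caught_result`, and the use of `None` as a "nothing there" placeholder, are choices made for this example only.

```python
# Same example list as in the cell above
ai_terms = ["neural", "network", "training", "model", "agent", "prompt"]

index = 10  # Deliberately out of range (valid indexes are 0 to 5)

# Option 1: compare the index with the list length before accessing
if 0 <= index < len(ai_terms):
    checked_result = ai_terms[index]
else:
    checked_result = None  # Placeholder meaning "nothing at that position"

# Option 2: attempt the access and handle the IndexError if it happens
try:
    caught_result = ai_terms[index]
except IndexError:
    caught_result = None

print(f"Checked first: {checked_result}")
print(f"Caught error: {caught_result}")
```

Both approaches end with `None` here because index 10 does not exist. Which one reads better depends on the situation; neither is the "official" way.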


In [None]:
# From: negative_indexing.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# negative_indexing.py - Counting from the end with negative indices

# Using the same AI terms list
ai_terms = ["neural", "network", "training", "model", "agent", "prompt"]

# Negative indexing starts from the end
last_term = ai_terms[-1]       # Gets "prompt"
second_to_last = ai_terms[-2]  # Gets "agent"
third_from_end = ai_terms[-3]  # Gets "model"

print(f"Last term: {last_term}")
print(f"Second to last: {second_to_last}")
print(f"Third from end: {third_from_end}")

# This is incredibly useful! Imagine you're building a chatbot
chat_history = ["Hello", "How are you?", "I'm fine", "What's the weather?", "It's sunny"]
last_message = chat_history[-1]  # Always gets the most recent message
print(f"Most recent message: {last_message}")


In [None]:
# From: list_slicing.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# list_slicing.py - Extracting portions of lists with slicing

# Let's work with a dataset of temperatures (in Celsius)
temperatures = [20, 22, 25, 23, 26, 28, 30, 29, 27, 24, 21, 19]
print("All temperatures:", temperatures)

# Basic slicing
first_three = temperatures[0:3]   # Items at index 0, 1, 2 (not 3!)
print("First three temps:", first_three)

# If you omit the start, it defaults to 0
first_four = temperatures[:4]     # Same as [0:4]
print("First four temps:", first_four)

# If you omit the end, it goes to the end of the list
from_index_6 = temperatures[6:]   # From index 6 to the end
print("From index 6 onward:", from_index_6)

# Get everything (make a copy)
all_temps = temperatures[:]       # Copies the entire list
print("Copy of all temps:", all_temps)

# Using step to skip items
every_other = temperatures[::2]   # Every 2nd item
print("Every other temp:", every_other)

# Reverse the list using step -1
reversed_temps = temperatures[::-1]
print("Reversed:", reversed_temps)

# Combine start, end, and step
morning_temps = temperatures[1:7:2]  # Indexes 1, 3, 5 (the end index 7 is excluded)
print("Select morning temps:", morning_temps)
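
Unlike single-item indexing, slicing does not raise an `IndexError` when the bounds go past the ends of the list. It simply returns whatever falls inside the range. The short sketch below illustrates this with a smaller, made-up temperature list, and also confirms that a slice is a new list rather than a view of the original.

```python
temperatures = [20, 22, 25, 23, 26]

# Bounds past the end are clipped instead of causing an error
beyond_end = temperatures[3:100]   # Only indexes 3 and 4 exist
empty_slice = temperatures[10:20]  # Nothing in range, so an empty list
last_ten = temperatures[-10:]      # Fewer than 10 items, so the whole list

print("Beyond the end:", beyond_end)
print("Completely out of range:", empty_slice)
print("Last ten (only five exist):", last_ten)

# A slice is a separate list: changing it leaves the original alone
copy_of_temps = temperatures[:]
copy_of_temps[0] = -999
print("Original first item:", temperatures[0])
print("Copy first item:", copy_of_temps[0])
```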


In [None]:
# From: advanced_slicing.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# advanced_slicing.py - Advanced slicing patterns for AI/ML

# Simulating a dataset for machine learning
dataset = list(range(100))  # 0 to 99, imagine these are data samples

# Common AI/ML slicing patterns

# 1. Getting batches of data
batch_size = 10
first_batch = dataset[:batch_size]
second_batch = dataset[batch_size:batch_size*2]
print(f"First batch: {first_batch}")
print(f"Second batch: {second_batch}")

# 2. Train/test split (very common in ML!)
split_point = int(len(dataset) * 0.8)  # 80% for training
training_data = dataset[:split_point]   # First 80%
test_data = dataset[split_point:]       # Last 20%
print(f"Training samples: {len(training_data)}")
print(f"Test samples: {len(test_data)}")

# 3. Getting the last n items (like recent chat history)
recent_history = dataset[-5:]  # Last 5 items
print(f"Recent items: {recent_history}")

# 4. Skipping header/footer (common with data files)
data_with_header = ["HEADER", 10, 20, 30, 40, "FOOTER"]
clean_data = data_with_header[1:-1]  # Skip first and last
print(f"Clean data: {clean_data}")

# 5. Reverse order (e.g., backpropagation visits a network's layers from last to first)
forwards = [1, 2, 3, 4, 5]
backwards = forwards[::-1]
print(f"Forward: {forwards}")
print(f"Backward: {backwards}")
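
The train/test split in pattern 2 takes the first 80% of the samples in their original order. If the data happens to be sorted, the test set may end up looking very different from the training set, so data is commonly shuffled before splitting. Here is a minimal sketch, assuming the standard library's `random` module; the seed value 42 is arbitrary and only there to make the shuffle repeatable.

```python
import random

dataset = list(range(100))  # Stand-in for 100 data samples

# Shuffle a copy so the original order is preserved
shuffled = dataset[:]
rng = random.Random(42)  # Fixed, arbitrary seed for repeatable results
rng.shuffle(shuffled)

# Same slicing pattern as before, applied to the shuffled copy
split_point = int(len(shuffled) * 0.8)
training_data = shuffled[:split_point]
test_data = shuffled[split_point:]

print(f"Training samples: {len(training_data)}")
print(f"Test samples: {len(test_data)}")
print(f"First few training samples: {training_data[:5]}")
```

Real ML libraries provide their own splitting helpers; this sketch only shows how the idea works with plain lists and slices.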


In [None]:
# From: lists_and_loops.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# lists_and_loops.py - Combining lists with loops from Chapter 3

# Using loops with lists (building on Chapter 3!)
scores = [85, 92, 78, 95, 88, 73, 91]

# For loop through a list (remember this pattern?)
print("All scores:")
for score in scores:
    print(f"  Score: {score}")

# Using enumerate to get both index and value
print("\nScores with position:")
for position, score in enumerate(scores):
    print(f"  Position {position}: {score}")

# Using conditions with lists (Chapter 3 skills!)
print("\nHigh scores (90+):")
for score in scores:
    if score >= 90:  # Our if statement from Chapter 3!
        print(f"  Excellent: {score}")

# Accessing by index with a loop
print("\nFirst half of scores:")
for i in range(len(scores) // 2):  # Using range from Chapter 3
    print(f"  scores[{i}] = {scores[i]}")


In [None]:
# From: ai_conversation.py

# From: Zero to AI Agent, Chapter 4, Section 4.1
# ai_conversation.py - Simulating chatbot conversation history

# Simulating a simple chatbot conversation history
conversation = []  # Start with empty list

# Adding user messages (we'll learn append in the next section)
conversation = conversation + ["User: Hello!"]
conversation = conversation + ["Bot: Hi there! How can I help?"]
conversation = conversation + ["User: What's the weather?"]
conversation = conversation + ["Bot: I'll check that for you."]

# Get the last exchange
last_exchange = conversation[-2:]  # Last user message and bot response
print("Last exchange:")
for message in last_exchange:
    print(f"  {message}")

# Prepare context for AI (like preparing a prompt)
context_window = 3  # How many previous messages to include
# Slicing is forgiving: with fewer messages than context_window, you simply get them all
context = conversation[-context_window:]
print(f"\nContext for next response ({len(context)} messages):")
for msg in context:
    print(f"  {msg}")

# This is exactly how AI agents maintain conversation context!


---
### Section 4.1 Exercises

### Exercise 4.1.1: Shopping Cart Manager

Create a shopping cart system that:
1. Starts with an empty cart (list)
2. Adds these items: "apples", "bread", "milk", "eggs", "cheese"
3. Displays the first and last items using indexing
4. Removes the third item
5. Checks if "milk" is in the cart
6. Displays the total number of items

In [None]:
# Your code here


### Exercise 4.1.2: Grade Analyzer

Build a grade tracking system that:
1. Creates a list of 10 test scores: [85, 92, 78, 95, 88, 73, 91, 82, 79, 96]
2. Finds and displays the highest score (hint: use max())
3. Finds and displays the lowest score (hint: use min())
4. Calculates the average score
5. Extracts the top 3 scores using slicing after sorting
6. Counts how many scores are above 85

In [None]:
# Your code here


### Exercise 4.1.3: Matrix Operations

Work with a 3x3 matrix (nested list):
1. Create a 3x3 matrix: [[1,2,3], [4,5,6], [7,8,9]]
2. Access and print the center element (5)
3. Extract and print the first row
4. Extract and print the last column [3, 6, 9]
5. Calculate the sum of diagonal elements [1, 5, 9]
6. Create a flattened list containing all elements in order

In [None]:
# Your code here


---
## Section 4.2: List methods and operations

In [None]:
# From: list_methods_adding.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# list_methods_adding.py - Methods for adding items to lists

# Starting with a simple list
topics = ["Python", "Lists", "Loops"]
print("Starting topics:", topics)

# append() - Adds ONE item to the end
topics.append("Functions")
print("After append('Functions'):", topics)

# Be careful - append adds the whole item as ONE element
topics.append(["AI", "ML"])  # This adds the entire list as one item!
print("After appending a list:", topics)
# Notice the nested list: ['Python', 'Lists', 'Loops', 'Functions', ['AI', 'ML']]

# Remove the nested list to clean up
topics.pop()  # Remove last item

# extend() - Adds EACH item from another list
more_topics = ["Machine Learning", "Neural Networks"]
topics.extend(more_topics)
print("After extend:", topics)

# The difference is clear
list_a = [1, 2, 3]
list_b = [4, 5]

# Using append
test_append = list_a.copy()
test_append.append(list_b)
print("\nAppend result:", test_append)  # [1, 2, 3, [4, 5]]

# Using extend
test_extend = list_a.copy()
test_extend.extend(list_b)
print("Extend result:", test_extend)  # [1, 2, 3, 4, 5]

# insert() - Adds an item at a SPECIFIC position
topics = ["Python", "Lists", "Functions", "Classes"]
topics.insert(0, "Introduction")  # Insert at the beginning
print("\nAfter insert at position 0:", topics)

topics.insert(3, "Control Flow")  # Insert at position 3
print("After insert at position 3:", topics)

# Real-world example: Managing a conversation history
chat_history = []
chat_history.append("User: Hello!")
chat_history.append("Bot: Hi there!")
chat_history.insert(0, "System: Conversation started")  # Add system message at beginning
print("\nChat history:")
for message in chat_history:
    print(f"  {message}")
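
One more thing to watch out for with `extend()`: it loops over whatever you give it, and a string is a sequence of characters. The sketch below, using a small made-up `topics` list, shows what happens when a string is passed to `extend()` by mistake, and how `+=` with a list behaves like `extend()`.

```python
# extend() with a string adds each CHARACTER as a separate item
split_up = ["Python", "Lists"]
split_up.extend("AI")
print("extend('AI'):", split_up)

# append() adds the string as one item, which is usually what you want
topics = ["Python", "Lists"]
topics.append("AI")
print("append('AI'):", topics)

# += with a list works like extend()
topics += ["ML", "Agents"]
print("After +=:", topics)
```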


In [None]:
# From: list_methods_removing.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# list_methods_removing.py - Methods for removing items from lists

# Let's work with a list of tasks
tasks = ["email client", "fix bug", "write tests", "fix bug", "deploy", "fix bug"]
print("Original tasks:", tasks)

# Method 1: remove() - Removes the FIRST occurrence of a value
tasks.remove("fix bug")  # Only removes the first "fix bug"
print("After remove('fix bug'):", tasks)

# Method 2: pop() - Removes and RETURNS an item at a specific index
completed_task = tasks.pop()  # No index = removes last item
print(f"Completed: {completed_task}")
print("Tasks after pop():", tasks)

first_task = tasks.pop(0)  # Remove and return first item
print(f"Did first: {first_task}")
print("Tasks after pop(0):", tasks)

# Method 3: clear() - Removes ALL items
old_tasks = ["outdated task 1", "outdated task 2"]
print(f"Old tasks before clear: {old_tasks}")
old_tasks.clear()
print(f"Old tasks after clear: {old_tasks}")

# Method 4: del - Not a method, but a statement (be careful with this one!)
numbers = [1, 2, 3, 4, 5]
del numbers[2]  # Removes the item at index 2
print("After del numbers[2]:", numbers)

# You can also delete slices!
del numbers[1:3]  # Removes items at index 1 and 2
print("After del numbers[1:3]:", numbers)
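
Since `remove()` only deletes the first match, clearing out every occurrence takes a little more work. The sketch below shows two possible approaches using the same task list; the variable names `remaining` and `filtered` are just for this example.

```python
tasks = ["email client", "fix bug", "write tests", "fix bug", "deploy", "fix bug"]

# Approach 1: keep calling remove() while the value is still present
remaining = tasks.copy()
while "fix bug" in remaining:
    remaining.remove("fix bug")
print("After removing every 'fix bug':", remaining)

# Approach 2: build a new list that skips the unwanted value
filtered = []
for task in tasks:
    if task != "fix bug":
        filtered.append(task)
print("Filtered copy:", filtered)

# The original list is untouched in both cases
print("Original still has", len(tasks), "tasks")
```

Both approaches avoid removing items from a list while looping over that same list, which can cause items to be skipped.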


In [None]:
# From: list_finding_counting.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# list_finding_counting.py - Finding and counting items in lists

# Sample data: user feedback scores
scores = [8, 9, 7, 9, 10, 8, 9, 7, 8, 9, 10, 9]
print("Feedback scores:", scores)

# count() - How many times does a value appear?
nines = scores.count(9)
tens = scores.count(10)
print(f"Number of 9s: {nines}")
print(f"Number of 10s: {tens}")

# index() - Where is a value located?
first_ten_position = scores.index(10)
print(f"First 10 is at position: {first_ten_position}")

# Be careful - index() raises an error if item doesn't exist!
# Safe way to use index():
search_value = 6
if search_value in scores:
    position = scores.index(search_value)
    print(f"Found {search_value} at position {position}")
else:
    print(f"{search_value} not found in scores")

# in operator - Check if item exists (returns True/False)
has_perfect_score = 10 in scores
has_failing_score = 5 in scores
print(f"Has perfect score (10)? {has_perfect_score}")
print(f"Has failing score (5)? {has_failing_score}")

# Real-world AI example: Checking for keywords in user input
user_message = "I want to cancel my subscription"
keywords = user_message.lower().split()
if "cancel" in keywords or "unsubscribe" in keywords:
    print("User wants to cancel - routing to retention team")
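
Checking with `in` before calling `index()` works, but there are a few other ways to search a list. The sketch below is one possible illustration using the same scores: it catches the `ValueError` that `index()` raises, uses the optional start argument of `index()`, and collects every matching position with `enumerate()`. Using `None` to mean "not found" is an assumption made for this example.

```python
scores = [8, 9, 7, 9, 10, 8, 9, 7, 8, 9, 10, 9]

# index() raises ValueError when the value is missing, so it can be caught
try:
    position = scores.index(6)
except ValueError:
    position = None  # Placeholder meaning "not found"
print(f"Position of 6: {position}")

# An optional second argument tells index() where to start searching
first_nine = scores.index(9)
next_nine = scores.index(9, first_nine + 1)
print(f"First 9 at {first_nine}, next 9 at {next_nine}")

# enumerate() lets you collect EVERY position, not just the first
all_nines = []
for i, score in enumerate(scores):
    if score == 9:
        all_nines.append(i)
print(f"All positions of 9: {all_nines}")
```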


In [None]:
# From: list_sorting.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# list_sorting.py - Sorting and reversing lists

# Sorting numbers
prices = [29.99, 14.50, 39.99, 24.99, 19.99]
print("Original prices:", prices)

# sort() - Sorts the list IN PLACE (modifies the original)
prices.sort()
print("After sort() - ascending:", prices)

prices.sort(reverse=True)  # Sort in descending order
print("After sort(reverse=True):", prices)

# sorted() - Returns a NEW sorted list (doesn't modify original)
original = [5, 2, 8, 1, 9]
sorted_copy = sorted(original)
print(f"Original: {original}")  # Unchanged!
print(f"Sorted copy: {sorted_copy}")

# Sorting strings
words = ["python", "agent", "neural", "bot", "ai"]
words.sort()
print("Alphabetically sorted:", words)

# reverse() - Reverses the list IN PLACE
countdown = [1, 2, 3, 4, 5]
countdown.reverse()
print("Reversed countdown:", countdown)

# Advanced: Sorting with a key function
# Sort by length of string
names = ["Jo", "Alexander", "Bob", "Christina"]
names.sort(key=len)  # Sort by length
print("Sorted by length:", names)

# Sort ignoring case
mixed_case = ["apple", "Banana", "cherry", "Date"]
mixed_case.sort()  # Capital letters come first!
print("Default sort:", mixed_case)

mixed_case = ["apple", "Banana", "cherry", "Date"]
mixed_case.sort(key=str.lower)  # Ignore case
print("Case-insensitive sort:", mixed_case)
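
A common slip is to assign the result of `sort()` to a variable. Because `sort()` works in place, it returns `None`, so the variable ends up empty-handed. A short sketch with made-up prices:

```python
prices = [29.99, 14.50, 39.99]

# sort() changes the list itself and returns None
result = prices.sort()
print("Return value of sort():", result)
print("The list itself is sorted:", prices)

# When you need a new list, use sorted() instead
original = [29.99, 14.50, 39.99]
cheapest_first = sorted(original)
print("New sorted list:", cheapest_first)
print("Original order kept:", original)
```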


In [None]:
# From: list_copying.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# list_copying.py - The critical importance of proper list copying

# The WRONG way (creates a reference, not a copy)
original_list = [1, 2, 3, 4, 5]
not_a_copy = original_list  # This is NOT a copy!

not_a_copy.append(6)
print("Original list:", original_list)  # [1, 2, 3, 4, 5, 6] - CHANGED!
print("Not a copy:", not_a_copy)        # [1, 2, 3, 4, 5, 6]
# They're the same list!

# The RIGHT ways to copy a list

# Method 1: Using copy()
list1 = [1, 2, 3, 4, 5]
list2 = list1.copy()  # Creates an actual copy
list2.append(6)
print("\nUsing copy():")
print("list1:", list1)  # [1, 2, 3, 4, 5] - unchanged!
print("list2:", list2)  # [1, 2, 3, 4, 5, 6]

# Method 2: Using slicing
list3 = [7, 8, 9]
list4 = list3[:]  # The [:] creates a copy
list4.append(10)
print("\nUsing slicing [:]:")
print("list3:", list3)  # [7, 8, 9] - unchanged!
print("list4:", list4)  # [7, 8, 9, 10]

# Method 3: Using list()
list5 = [11, 12, 13]
list6 = list(list5)  # Creates a new list
list6.append(14)
print("\nUsing list():")
print("list5:", list5)  # [11, 12, 13] - unchanged!
print("list6:", list6)  # [11, 12, 13, 14]
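
All three methods above make a shallow copy: the outer list is new, but any lists nested inside it are still shared with the original. For nested data, the standard library's `copy.deepcopy()` copies the inner lists too. The sketch below uses a small made-up 2x2 `board` to show the difference.

```python
import copy

board = [[0, 0], [0, 0]]

shallow = board.copy()       # New outer list, but the rows are shared
deep = copy.deepcopy(board)  # New outer list AND new rows

# Changing a row through the shallow copy also changes the original
shallow[0][0] = 1
print("board after editing shallow copy:", board)

# Changing the deep copy leaves the original alone
deep[1][1] = 9
print("board after editing deep copy:", board)
print("deep copy:", deep)
```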


In [None]:
# From: list_operations.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# list_operations.py - Mathematical operations with lists

# Concatenation with +
list_a = [1, 2, 3]
list_b = [4, 5, 6]
combined = list_a + list_b
print(f"{list_a} + {list_b} = {combined}")

# Repetition with *
pattern = [0, 1]
repeated = pattern * 3
print(f"{pattern} * 3 = {repeated}")

# This is great for initialization!
# Creating a game board
row = [0] * 5  # Five zeros
board = []
for i in range(5):
    board.append(row.copy())  # Important: copy each row!
print("Empty board:")
for row in board:
    print(row)

# Membership testing (we saw this earlier)
inventory = ["sword", "shield", "potion", "map"]
has_potion = "potion" in inventory
has_armor = "armor" in inventory
print(f"Has potion? {has_potion}")
print(f"Has armor? {has_armor}")

# Length, min, max, sum (for numeric lists)
numbers = [10, 5, 8, 3, 15, 12]
print(f"Length: {len(numbers)}")
print(f"Minimum: {min(numbers)}")
print(f"Maximum: {max(numbers)}")
print(f"Sum: {sum(numbers)}")
print(f"Average: {sum(numbers) / len(numbers)}")
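
The game board above copies each row for a reason. It is tempting to build a grid with repetition alone, but repeating a list that contains a list repeats the same inner list, not copies of it. The sketch below illustrates the problem on a made-up 3x3 grid and one way to avoid it.

```python
# Looks like a 3x3 grid, but all three rows are the SAME list
shared_rows = [[0] * 3] * 3
shared_rows[0][0] = 1
print("Shared rows:", shared_rows)  # The 1 shows up in every row

# Building each row separately gives independent rows
independent_rows = []
for _ in range(3):
    independent_rows.append([0] * 3)
independent_rows[0][0] = 1
print("Independent rows:", independent_rows)
```

Repetition with `*` is safe for flat lists of numbers or strings; the surprise only appears when the repeated items are themselves mutable, like lists.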


In [None]:
# From: memory_system.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# memory_system.py - Simple conversation memory system for AI agents

# Simple conversation memory system using dictionaries and lists
# No classes or functions - just direct manipulation

# Initialize our memory system
memory = {
    "conversations": [],
    "max_size": 5,
    "important_messages": [],
    "message_count": 0
}

print("Memory system initialized")
print(f"Max conversation size: {memory['max_size']}")

# Simulate adding messages
messages_to_add = [
    "User: Hello AI!",
    "AI: Hello! How can I help you?",
    "User: Tell me about Python lists",
    "AI: Lists are ordered collections in Python",
    "User: How do I add items?",
    "AI: Use append() to add items to a list",
    "User: Thanks, that's helpful!"
]

print("\nAdding messages to memory:")
for msg in messages_to_add:
    # Add message to conversations
    memory["conversations"].append(msg)
    memory["message_count"] += 1
    
    # Check if we exceeded max size (sliding window)
    if len(memory["conversations"]) > memory["max_size"]:
        removed = memory["conversations"].pop(0)  # Remove oldest
        print(f"  Memory full, removed: {removed}")
    
    print(f"  Added: {msg}")

print(f"\nCurrent memory state:")
print(f"  Total messages processed: {memory['message_count']}")
print(f"  Messages in memory: {len(memory['conversations'])}")

# Get recent context (last 3 messages)
context_size = 3
if len(memory["conversations"]) >= context_size:
    recent_context = memory["conversations"][-context_size:]
else:
    recent_context = memory["conversations"].copy()

print(f"\nRecent context ({len(recent_context)} messages):")
for msg in recent_context:
    print(f"  {msg}")

# Mark important messages
important_keywords = ["thanks", "helpful", "great"]
for msg in memory["conversations"]:
    msg_lower = msg.lower()
    for keyword in important_keywords:
        if keyword in msg_lower and msg not in memory["important_messages"]:
            memory["important_messages"].append(msg)
            print(f"\nMarked as important: {msg}")
            break

# Search for specific keywords
search_term = "list"
print(f"\nSearching for messages containing '{search_term}':")
found_messages = []
for msg in memory["conversations"]:
    if search_term.lower() in msg.lower():
        found_messages.append(msg)
        print(f"  Found: {msg}")

print(f"\nTotal matches: {len(found_messages)}")

# Summary statistics
print("\n=== Memory Summary ===")
print(f"Messages in memory: {len(memory['conversations'])}")
print(f"Important messages: {len(memory['important_messages'])}")
print(f"Total processed: {memory['message_count']}")
if memory["conversations"]:
    print(f"Oldest message: {memory['conversations'][0]}")
    print(f"Latest message: {memory['conversations'][-1]}")


In [None]:
# From: common_patterns.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# common_patterns.py - Common list patterns for AI applications

# Pattern 1: Building a list conditionally
responses = ["good", "bad", "excellent", "poor", "great", "terrible", "okay"]
positive = []
for response in responses:
    if response in ["good", "excellent", "great"]:
        positive.append(response)
print(f"Positive responses: {positive}")

# Pattern 2: Removing duplicates while preserving order
messages = ["hello", "world", "hello", "python", "world", "ai"]
unique_messages = []
for msg in messages:
    if msg not in unique_messages:  # Only keep the first occurrence
        unique_messages.append(msg)
print(f"Unique messages: {unique_messages}")

# Pattern 3: Batch processing
data = list(range(1, 16))  # 1 to 15
batch_size = 5
print("Processing in batches:")
for i in range(0, len(data), batch_size):
    batch = data[i:i+batch_size]
    print(f"  Processing batch: {batch}")

# Pattern 4: Maintaining a fixed-size history (sliding window)
history = []
max_history = 3

new_items = ["event1", "event2", "event3", "event4", "event5"]
print("Building history with sliding window:")
for item in new_items:
    history.append(item)
    if len(history) > max_history:
        history.pop(0)  # Remove oldest
    print(f"  Current history: {history}")

# Pattern 5: Working with nested lists
# Student scores: [name, [quiz1, quiz2, quiz3]]
students = [
    ["Alice", [85, 90, 92]],
    ["Bob", [78, 82, 88]],
    ["Charlie", [92, 95, 89]]
]

print("\nStudent Score Analysis:")
for student in students:
    name = student[0]
    scores = student[1]
    average = sum(scores) / len(scores)
    highest = max(scores)
    print(f"  {name}: Average = {average:.1f}, Highest = {highest}")
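
Pattern 4 trims the history by hand with `pop(0)`, which has to shift every remaining item one position to the left. For small lists that is fine. As an alternative, the standard library's `collections.deque` accepts a `maxlen` argument and drops the oldest item automatically. A minimal sketch using the same events:

```python
from collections import deque

# A deque with maxlen keeps only the most recent items
history = deque(maxlen=3)

new_items = ["event1", "event2", "event3", "event4", "event5"]
for item in new_items:
    history.append(item)  # The oldest item is discarded once the deque is full
    print(f"  Current history: {list(history)}")

# Convert to a regular list when you need list features such as slicing
recent = list(history)
print("Final history:", recent)
```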


In [None]:
# From: recommendation_tracker.py

# From: Zero to AI Agent, Chapter 4, Section 4.2
# recommendation_tracker.py - Simple recommendation system

# Recommendation system using dictionaries and lists
# Track what items users have viewed and liked

# Our data storage
users_data = {
    "alice": {
        "viewed": ["item1", "item2", "item3"],
        "liked": ["item1", "item3"],
        "recommendations": []
    },
    "bob": {
        "viewed": ["item2", "item4"],
        "liked": ["item4"],
        "recommendations": []
    }
}

# All available items in our system
all_items = ["item1", "item2", "item3", "item4", "item5", "item6"]

# Generate recommendations for each user
for username in users_data:
    user = users_data[username]
    
    # Find items they haven't viewed yet
    not_viewed = []
    for item in all_items:
        if item not in user["viewed"]:
            not_viewed.append(item)
    
    # Simple recommendation: items not viewed yet
    user["recommendations"] = not_viewed[:3]  # Top 3 recommendations
    
    print(f"\n{username}'s profile:")
    print(f"  Viewed: {user['viewed']}")
    print(f"  Liked: {user['liked']}")
    print(f"  Recommendations: {user['recommendations']}")

# Find popular items (liked by multiple users)
all_liked_items = []
for username in users_data:
    all_liked_items.extend(users_data[username]["liked"])

print("\n=== Popular Items ===")
for item in all_items:
    like_count = all_liked_items.count(item)
    if like_count > 0:
        print(f"  {item}: {like_count} likes")

# Find users with similar interests
print("\n=== Similar Users ===")
user_list = list(users_data.keys())
for i in range(len(user_list)):
    for j in range(i + 1, len(user_list)):
        user1 = user_list[i]
        user2 = user_list[j]
        
        # Find common liked items
        liked1 = users_data[user1]["liked"]
        liked2 = users_data[user2]["liked"]
        
        common_likes = []
        for item in liked1:
            if item in liked2:
                common_likes.append(item)
        
        if common_likes:
            print(f"  {user1} and {user2} both like: {common_likes}")


---
### Section 4.2 Exercises

### Exercise 4.2.1: Shopping Cart Manager

Create a shopping cart system that:
1. Starts with an empty cart
2. Adds items: "apple", "banana", "apple", "orange", "banana", "grape"
3. Counts how many apples and bananas are in the cart
4. Removes one banana using remove()
5. Sorts the cart alphabetically
6. Displays the final cart and total number of items

In [None]:
# Your code here


### Exercise 4.2.2: Score Tracker

Build a score tracking system that:
1. Starts with scores: [75, 82, 90, 68, 95, 78]
2. Adds three more scores: 88, 92, 79 using extend()
3. Finds the highest and lowest scores
4. Calculates the average score
5. Removes the lowest score using remove()
6. Sorts scores from highest to lowest
7. Displays the top 3 scores

In [None]:
# Your code here


### Exercise 4.2.3: Message Queue

Build a message queue that:
1. Maintains a maximum of 5 messages
2. Processes these messages in order: "msg1", "msg2", "msg3", "msg4", "msg5", "msg6", "msg7"
3. When full, removes the oldest message using pop(0) before adding a new one
4. Shows the final queue after processing all messages
5. Finds the position of the message containing "5" ("msg5") using index()
6. Counts how many messages in the final queue contain "msg"

In [None]:
# Your code here


---
## Section 4.3: Tuples and their immutability

In [None]:
# From: tuple_creation.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# tuple_creation.py - Creating tuples in Python

# Creating tuples - notice the parentheses!
coordinates = (10, 20)
print(f"Coordinates: {coordinates}")
print(f"Type: {type(coordinates)}")

# Tuple with different data types
person = ("Alice", 25, "Engineer", True)
print(f"Person data: {person}")

# Empty tuple
empty = ()
print(f"Empty tuple: {empty}")

# Here's where it gets interesting - parentheses are often optional!
colors = "red", "green", "blue"  # This is a tuple!
print(f"Colors: {colors}")
print(f"Type: {type(colors)}")

# Parentheses are optional here too, but they make the tuple much easier to spot
# Without them, an expression like this is harder to read:
result = (1 + 2, 3 + 4)  # Tuple of (3, 7)
print(f"Result: {result}")

# Single element tuple - this is tricky!
not_a_tuple = (42)  # This is just the number 42 with parentheses
actual_tuple = (42,)  # The comma makes it a tuple!
print(f"not_a_tuple: {not_a_tuple}, type: {type(not_a_tuple)}")
print(f"actual_tuple: {actual_tuple}, type: {type(actual_tuple)}")

# Converting a list to a tuple
my_list = [1, 2, 3, 4, 5]
my_tuple = tuple(my_list)
print(f"List converted to tuple: {my_tuple}")


In [None]:
# From: tuple_accessing.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# tuple_accessing.py - Accessing and unpacking tuple elements

# AI model configuration tuple
model_config = ("gpt-3", 175, "billion", 96, 12288, 2048)
# (name, size, unit, layers, hidden_size, context_length)

# Indexing works exactly like lists
model_name = model_config[0]
num_layers = model_config[3]
context = model_config[-1]  # Negative indexing works too!

print(f"Model: {model_name}")
print(f"Layers: {num_layers}")
print(f"Context length: {context}")

# Slicing works the same way
size_info = model_config[1:3]  # Get size and unit
print(f"Size info: {size_info}")

# You can loop through tuples
print("Configuration details:")
for item in model_config:
    print(f"  - {item}")

# Unpacking - this is SUPER useful with tuples!
point = (3, 7)
x, y = point  # Unpacks the tuple into separate variables
print(f"x = {x}, y = {y}")

# Unpacking with AI example
response = ("success", "Hello! How can I help?", 0.92)
status, message, confidence = response
print(f"Status: {status}")
print(f"Message: {message}")
print(f"Confidence: {confidence}")

# You can even use * to grab multiple elements
numbers = (1, 2, 3, 4, 5, 6, 7)
first, *middle, last = numbers
print(f"First: {first}")
print(f"Middle: {middle}")  # This becomes a list!
print(f"Last: {last}")


In [None]:
# From: tuple_immutability.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# tuple_immutability.py - Understanding tuple immutability

# Lists are mutable (changeable)
list_scores = [85, 90, 78]
list_scores[1] = 95  # This works fine
print(f"Modified list: {list_scores}")

# Tuples are immutable (unchangeable)
tuple_scores = (85, 90, 78)
# tuple_scores[1] = 95  # This would cause an error!
# Uncomment the line above to see: TypeError: 'tuple' object does not support item assignment

# But wait - you CAN "modify" a tuple by creating a new one
original = (1, 2, 3)
# To "add" an element, create a new tuple
modified = original + (4,)  # Note the comma for single element!
print(f"Original: {original}")  # Still (1, 2, 3)
print(f"Modified: {modified}")  # New tuple (1, 2, 3, 4)

# To "change" an element, convert to list, modify, convert back
config = ("model_v1", 100, "active")
print(f"Original config: {config}")

# Need to update? Create a new tuple
temp_list = list(config)
temp_list[0] = "model_v2"
new_config = tuple(temp_list)
print(f"New config: {new_config}")
print(f"Original still unchanged: {config}")
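
One subtlety: immutability applies to the tuple itself, not to the objects inside it. A tuple cannot be made to point at different items, but if one of its items is a list, that list can still change. The sketch below uses a made-up `record` tuple to illustrate this.

```python
# A tuple holding a name and a (mutable) list of scores
record = ("alice", [85, 90])

# The list inside the tuple can still be modified
record[1].append(78)
print(f"Record after append: {record}")

# Replacing an item of the tuple is still not allowed
try:
    record[0] = "bob"
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# A tuple that contains a list cannot be hashed (so it can't be a dictionary key)
try:
    hash(record)
    hashable = True
except TypeError:
    hashable = False
print(f"Can be used as a dictionary key? {hashable}")
```

If you want a record that truly cannot change, keep its contents immutable too (numbers, strings, or other tuples).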


In [None]:
# From: tuple_superpowers.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# tuple_superpowers.py - Why immutability is actually useful

# 1. SAFETY - Protecting important data
# Imagine these are critical system settings
SYSTEM_SETTINGS = ("production", "api.company.com", 443, True)
# No one can accidentally modify these!
# If someone tries: SYSTEM_SETTINGS[0] = "development"  # ERROR!

print(f"System settings are protected: {SYSTEM_SETTINGS}")

# 2. DICTIONARY KEYS - Keys must be hashable; tuples of immutable values qualify, lists don't
# This is useful for coordinate systems, caching, etc.
location_names = {
    (40.7128, -74.0060): "New York City",
    (51.5074, -0.1278): "London",
    (35.6762, 139.6503): "Tokyo"
}

coordinates = (40.7128, -74.0060)
print(f"Location at {coordinates}: {location_names[coordinates]}")

# You CAN'T use lists as dictionary keys
# city_data = {[40.7, -74.0]: "NYC"}  # TypeError: unhashable type: 'list'

# 3. MULTIPLE RETURN VALUES - Tuples bundle several results into a single value
# Calculate rectangle properties
width = 10
height = 5
area = width * height
perimeter = 2 * (width + height)

# Store multiple results in a tuple
rectangle_info = (area, perimeter)
print(f"Rectangle info (area, perimeter): {rectangle_info}")

# Unpack when using
calc_area, calc_perimeter = rectangle_info
print(f"Area: {calc_area}, Perimeter: {calc_perimeter}")

# 4. MEMORY EFFICIENCY - Tuples typically use less memory than lists (exact sizes vary by Python version)
import sys

my_list = [1, 2, 3, 4, 5]
my_tuple = (1, 2, 3, 4, 5)

print(f"List size: {sys.getsizeof(my_list)} bytes")
print(f"Tuple size: {sys.getsizeof(my_tuple)} bytes")
# Tuples are smaller!
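
The "multiple return values" idea from point 3 is easiest to see with a function. The built-in `divmod()` returns a tuple of two results, and your own functions can do the same. The sketch below is a minimal illustration; the `rectangle_info` function is a hypothetical helper written for this example.

```python
# divmod() is a built-in that returns (quotient, remainder) as a tuple
quotient, remainder = divmod(17, 5)
print(f"17 divided by 5: quotient {quotient}, remainder {remainder}")


def rectangle_info(width, height):
    """Return the area and perimeter of a rectangle as a tuple."""
    area = width * height
    perimeter = 2 * (width + height)
    return area, perimeter  # The comma makes this a tuple


# Unpack the returned tuple straight into two variables
area, perimeter = rectangle_info(10, 5)
print(f"Area: {area}, Perimeter: {perimeter}")
```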


In [None]:
# From: named_tuples.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# named_tuples.py - Named tuples for clearer code

from collections import namedtuple

# Create a named tuple class for AI model info
ModelInfo = namedtuple('ModelInfo', ['name', 'parameters', 'accuracy', 'trained_on'])

# Create instances (the accuracy and trained_on values are illustrative, not benchmark results)
gpt3 = ModelInfo("GPT-3", 175_000_000_000, 0.92, "CommonCrawl")
bert = ModelInfo("BERT", 340_000_000, 0.89, "Wikipedia")

# Access by name (much clearer than index!)
print(f"{gpt3.name} has {gpt3.parameters:,} parameters")
print(f"Accuracy: {gpt3.accuracy}")

# You can still use indexing
print(f"First field: {gpt3[0]}")

# Named tuples are still immutable
# gpt3.accuracy = 0.95  # This would raise an AttributeError!

# Create another named tuple for evaluation results
Result = namedtuple('Result', ['model', 'precision', 'recall', 'f1_score'])

# Simulated evaluation
eval_result = Result("MyAgent", 0.89, 0.91, 0.90)
print("\nEvaluation Results:")
print(f"Model: {eval_result.model}")
print(f"Precision: {eval_result.precision}")
print(f"Recall: {eval_result.recall}")
print(f"F1 Score: {eval_result.f1_score}")
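
Because named tuples are immutable, "changing" a field means building a new tuple. The sketch below shows three standard helpers that `namedtuple` provides for this: `_replace()`, `_asdict()`, and `_fields`. The `bert_tuned` name and the 0.91 value are made up for illustration.

```python
from collections import namedtuple

# Same record type as in the cell above
ModelInfo = namedtuple("ModelInfo", ["name", "parameters", "accuracy", "trained_on"])
bert = ModelInfo("BERT", 340_000_000, 0.89, "Wikipedia")

# _replace() returns a NEW tuple; the original is untouched
bert_tuned = bert._replace(accuracy=0.91)
print(bert.accuracy, bert_tuned.accuracy)

# _asdict() converts the record to a dictionary (handy for JSON or logging)
bert_dict = bert._asdict()
print(bert_dict["name"])

# _fields lists the field names in order
print(ModelInfo._fields)
```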


In [None]:
# From: tuples_vs_lists.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# tuples_vs_lists.py - When to use tuples vs lists

# Use TUPLES when:
# 1. Data shouldn't change (coordinates, configuration, constants)
rgb_red = (255, 0, 0)  # Color values should be fixed
db_config = ("localhost", 5432, "mydb", "readonly")  # Database settings

# 2. You need dictionary keys
cache = {
    ("user", 123): "cached_data_1",
    ("post", 456): "cached_data_2"
}

# 3. Representing a single record/entity
person = ("Bob", 30, "Engineer")  # One person's data
point = (10, 20)  # One point in space

# Use LISTS when:
# 1. Data needs to change
shopping_cart = ["apples", "bread"]  # Will add/remove items
shopping_cart.append("milk")

# 2. You have a collection of similar items
temperatures = [22, 24, 23, 25, 21]  # Collection of readings
messages = []  # Will accumulate messages

# 3. You need list methods (sort, append, etc.)
scores = [85, 92, 78, 95]
scores.sort()  # Need to sort

# 4. Size will change
active_users = []  # Will grow and shrink
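
One caveat is worth checking before you rely on tuples as "fixed" data: a tuple stops you from reassigning its items, but it does not freeze mutable objects stored inside it. The following sketch illustrates this (the variable names are only for demonstration):

```python
point = (10, 20)

# Tuples reject item assignment
try:
    point[0] = 99
except TypeError as error:
    print(f"Cannot modify tuple: {error}")

# A tuple of immutable values is hashable, so it works as a dictionary key
labels = {point: "start"}

# A tuple that CONTAINS a list is not hashable...
mixed = (1, [2, 3])
try:
    hash(mixed)
    tuple_with_list_hashable = True
except TypeError:
    tuple_with_list_hashable = False

# ...and the inner list can still change
mixed[1].append(4)
print(mixed)
```

If a record must be usable as a dictionary key, keep every item in it immutable (numbers, strings, other tuples).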


In [None]:
# From: ai_agent_tuples.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# ai_agent_tuples.py - Using tuples in AI agent development

# AI Agent configuration and state management using tuples and dictionaries

# Agent configuration (immutable - use tuple)
agent_config = ("Assistant", "gpt-3.5", 2048, 0.7)  # (name, model, max_tokens, temperature)
print(f"Agent Configuration: {agent_config}")
print(f"Agent name: {agent_config[0]}")
print(f"Model: {agent_config[1]}")

# Conversation state (mutable - use dictionary with lists)
conversation_state = {
    "config": agent_config,  # Store the immutable config
    "history": [],  # Mutable conversation history
    "message_count": 0,
    "context_window": 5
}

# Simulate conversation
messages = [
    ("user", "Hello!", "2024-01-15 10:30:00"),
    ("assistant", "Hi there! How can I help?", "2024-01-15 10:30:01"),
    ("user", "What's the weather?", "2024-01-15 10:30:15"),
    ("assistant", "I'll check that for you.", "2024-01-15 10:30:16")
]

print("\nProcessing messages:")
for role, content, timestamp in messages:  # Unpacking tuple
    # Each message is stored as a tuple (immutable record)
    message_record = (role, content, timestamp)
    conversation_state["history"].append(message_record)
    conversation_state["message_count"] += 1
    print(f"  Added: {role} - {content}")

# Get recent context
context_size = conversation_state["context_window"]
recent = conversation_state["history"][-context_size:]

print(f"\nRecent context ({len(recent)} messages):")
for role, content, timestamp in recent:
    print(f"  [{timestamp}] {role}: {content}")

# Statistics as a tuple (immutable snapshot)
stats = (
    conversation_state["message_count"],
    len(conversation_state["history"]),
    conversation_state["config"][0],  # Agent name
    conversation_state["config"][1]   # Model
)

messages_processed, history_length, agent_name, model_used = stats
print("\nStatistics Snapshot:")
print(f"  Agent: {agent_name}")
print(f"  Model: {model_used}")
print(f"  Messages processed: {messages_processed}")
print(f"  History length: {history_length}")


In [None]:
# From: tuple_patterns.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# tuple_patterns.py - Common tuple patterns in AI development

# Pattern 1: Batch processing with coordinates
training_data = [
    ((0, 0), "origin"),
    ((1, 0), "right"),
    ((0, 1), "up"),
    ((-1, 0), "left")
]

print("Training data:")
for coordinates, label in training_data:
    x, y = coordinates  # Unpack the tuple
    print(f"  Point at ({x}, {y}) is labeled '{label}'")

# Pattern 2: Multiple return values for model evaluation
# Simulate model evaluation
accuracy = 0.92
loss = 0.08
epochs_completed = 100
training_time = 45.3

# Return as tuple
evaluation_results = (accuracy, loss, epochs_completed, training_time)

# Clean unpacking
acc, loss_val, epochs, time_taken = evaluation_results
print(f"\nTraining complete: {acc:.2%} accuracy in {time_taken:.1f} seconds")

# Pattern 3: Configuration management
MODEL_CONFIGS = {
    "small": (32, 4, 512, 0.1),    # (batch_size, layers, hidden_dim, dropout)
    "medium": (64, 8, 1024, 0.2),
    "large": (128, 12, 2048, 0.3)
}

selected_config = MODEL_CONFIGS["medium"]
batch, layers, hidden, dropout = selected_config
print(f"\nMedium model: {layers} layers, {hidden} hidden dimensions")

# Pattern 4: Storing immutable state snapshots
# Training history - each snapshot is immutable
training_history = []

# Simulate training epochs
epoch_data = [
    (1, 0.5, 0.75),  # (epoch, loss, accuracy)
    (2, 0.3, 0.85),
    (3, 0.2, 0.90)
]

for epoch, loss, acc in epoch_data:
    # Each snapshot is an immutable tuple
    snapshot = (epoch, loss, acc)
    training_history.append(snapshot)

print("\nTraining History:")
for epoch, loss, acc in training_history:
    print(f"  Epoch {epoch}: loss={loss:.2f}, accuracy={acc:.2%}")

# Find best epoch (highest accuracy)
if training_history:
    best_epoch = max(training_history, key=lambda x: x[2])  # x[2] is accuracy
    epoch, loss, acc = best_epoch
    print(f"\nBest epoch: {epoch} with {acc:.2%} accuracy")


In [None]:
# From: game_state_manager.py

# From: Zero to AI Agent, Chapter 4, Section 4.3
# game_state_manager.py - Using tuples for immutable game snapshots

# Game State Management System
# Using tuples for immutable snapshots and lists for history

# Initialize game
game_data = {
    "player_name": "Hero",
    "current_position": (0, 0),  # Starting position as tuple
    "current_stats": {
        "health": 100,
        "score": 0,
        "level": 1
    },
    "snapshots": [],  # Will store immutable snapshots
    "move_history": []  # Will store move records
}

print("Game initialized!")
print(f"Player: {game_data['player_name']}")
print(f"Starting position: {game_data['current_position']}")

# Simulate game moves
moves = [
    ("right", 1, 0, 10),   # (direction, dx, dy, points)
    ("up", 0, 1, 15),
    ("right", 1, 0, 20),
    ("down", 0, -1, -5),   # Lost points!
    ("left", -1, 0, 25)
]

print("\nPlaying game:")
for move_num, move_data in enumerate(moves, 1):
    direction, dx, dy, points = move_data
    
    # Update position (create new tuple)
    old_x, old_y = game_data["current_position"]
    new_position = (old_x + dx, old_y + dy)
    old_position = game_data["current_position"]
    game_data["current_position"] = new_position
    
    # Update score
    game_data["current_stats"]["score"] += points
    
    # Create immutable snapshot of this moment
    snapshot = (
        move_num,
        new_position,
        game_data["current_stats"]["score"],
        game_data["current_stats"]["health"],
        direction
    )
    game_data["snapshots"].append(snapshot)
    
    # Record the move
    move_record = (direction, old_position, new_position, points)
    game_data["move_history"].append(move_record)
    
    print(f"  Move {move_num}: {direction} to {new_position}, Score: {game_data['current_stats']['score']}")

# Analyze game history
print("\n=== Game Analysis ===")
print(f"Total moves: {len(game_data['snapshots'])}")
print(f"Final position: {game_data['current_position']}")
print(f"Final score: {game_data['current_stats']['score']}")

# Find best scoring move
if game_data["move_history"]:
    best_move = max(game_data["move_history"], key=lambda x: x[3])  # x[3] is points
    direction, from_pos, to_pos, points = best_move
    print(f"Best move: {direction} from {from_pos} to {to_pos} (+{points} points)")

# Show all snapshots
print("\n=== Game Snapshots ===")
for snapshot in game_data["snapshots"]:
    move, pos, score, health, direction = snapshot
    print(f"  After move {move}: Position {pos}, Score {score}, Health {health}")


---
### Section 4.3 Exercises

### Exercise 4.3.1: Color Palette Manager

Create a color palette system that can:
1. Store RGB colors as tuples: red=(255,0,0), green=(0,255,0), blue=(0,0,255)
2. Create a mixed color by averaging two color tuples
3. Store colors in a dictionary keyed by coordinate tuples, e.g. (x, y)
4. Return color information as a named tuple with fields: name, rgb, hex

In [None]:
# Your code here


### Exercise 4.3.2: Game State Snapshots

Build a simple game state system that can:
1. Store player position as a tuple (x, y)
2. Save game snapshots as tuples: (turn_number, position, score, health)
3. Maintain a list of these immutable snapshots
4. Find the snapshot with the highest score
5. Return game statistics as a tuple

In [None]:
# Your code here


### Exercise 4.3.3: Model Configuration Validator

Create a configuration system that can:
1. Define valid model configs as tuples: (name, layers, parameters, learning_rate)
2. Store multiple configurations
3. Validate that configurations haven't been modified
4. Return the smallest and largest models based on parameters
5. Unpack and display configuration details

In [None]:
# Your code here


---
## Section 4.4: Dictionaries: key-value pairs

In [None]:
# From: dict_creation.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# dict_creation.py - Creating dictionaries in Python

# Your first dictionary - a user profile
user = {
    "name": "Alice",
    "age": 28,
    "email": "alice@example.com",
    "is_premium": True
}

print("User dictionary:", user)
print(f"Type: {type(user)}")

# Keys can be strings, numbers, or any other hashable (immutable) type - remember tuples?
mixed_keys = {
    "string_key": "I'm a string value",
    42: "I'm accessed with the number 42",
    (1, 2): "I'm accessed with the tuple (1, 2)",
    3.14: "I'm accessed with 3.14"
}

print("\nMixed keys dictionary:")
for key, value in mixed_keys.items():
    print(f"  {key} -> {value}")

# Empty dictionary (ready to fill!)
empty_dict = {}
also_empty = dict()  # Alternative way
print(f"\nEmpty dict: {empty_dict}")
print(f"Also empty: {also_empty}")

# Creating from pairs
pairs = [("red", "#FF0000"), ("green", "#00FF00"), ("blue", "#0000FF")]
color_codes = dict(pairs)
print(f"\nColor codes from pairs: {color_codes}")

# Using dict() with keyword arguments
person = dict(name="Bob", age=30, city="New York")
print(f"\nPerson created with dict(): {person}")
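
Two more creation patterns come up often: pairing two parallel lists with `zip()`, and giving several keys the same starting value with `dict.fromkeys()`. A short sketch with made-up setting names:

```python
# Two parallel lists: pair them up with zip()
setting_names = ["model", "temperature", "max_tokens"]
setting_values = ["gpt-3.5-turbo", 0.7, 2048]
settings = dict(zip(setting_names, setting_values))
print(settings)

# Same starting value for every key: dict.fromkeys()
counters = dict.fromkeys(["user", "assistant", "system"], 0)
print(counters)
```

Use `dict.fromkeys()` only with immutable defaults such as `0` or `None`. With a list as the default, every key would share the same list object.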


In [None]:
# From: dict_accessing.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# dict_accessing.py - Accessing dictionary values safely

# AI model configuration
model_config = {
    "model_name": "GPT-3",
    "temperature": 0.7,
    "max_tokens": 2048,
    "top_p": 0.95,
    "frequency_penalty": 0.5,
    "presence_penalty": 0.0
}

# Access values using keys
model = model_config["model_name"]
temp = model_config["temperature"]
print(f"Model: {model} with temperature: {temp}")

# Safer access with get() method
tokens = model_config.get("max_tokens")
print(f"Max tokens: {tokens}")

# get() with default value if key doesn't exist
stream = model_config.get("stream", False)  # Default to False if not found
print(f"Stream enabled: {stream}")

# What happens when key doesn't exist?
# bad_access = model_config["non_existent"]  # KeyError!

# Safe pattern for checking if key exists
if "top_p" in model_config:
    print(f"Top-p sampling: {model_config['top_p']}")

# Check if key doesn't exist
if "api_key" not in model_config:
    print("No API key in config (good for security!)")
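
The commented-out `KeyError` line above can also be handled with `try`/`except`. The sketch below compares that approach with `get()`. The `api_key` and `timeout` keys are hypothetical and are deliberately missing from the dictionary.

```python
model_config = {"model_name": "GPT-3", "temperature": 0.7}

# Option 1: catch the KeyError when a missing key is truly exceptional
try:
    api_key = model_config["api_key"]
except KeyError as error:
    api_key = None
    print(f"Missing key: {error}")

# Option 2: get() returns your default instead of raising
timeout = model_config.get("timeout", 30)
print(f"Timeout: {timeout}")

# get() never adds the key to the dictionary
print("timeout" in model_config)
```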


In [None]:
# From: dict_modifying.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# dict_modifying.py - Adding, updating, and removing dictionary items

# Starting with a basic AI agent state
agent_state = {
    "status": "idle",
    "messages_processed": 0,
    "last_active": None
}

print("Initial state:", agent_state)

# Adding new key-value pairs
agent_state["model"] = "gpt-3.5-turbo"
agent_state["context"] = []
print("\nAfter adding keys:", agent_state)

# Updating existing values
agent_state["status"] = "active"
agent_state["messages_processed"] += 1
agent_state["last_active"] = "2024-01-15 10:30:00"
print("\nAfter updates:", agent_state)

# Update multiple values at once
updates = {
    "status": "processing",
    "messages_processed": 5,
    "error_count": 0  # This adds a new key too!
}
agent_state.update(updates)
print("\nAfter batch update:", agent_state)

# Removing items
del agent_state["error_count"]  # Remove using del
removed = agent_state.pop("last_active", None)  # Remove and return value
print(f"\nRemoved last_active: {removed}")
print("State after removals:", agent_state)

# Clear everything
# agent_state.clear()  # Empties the dictionary


In [None]:
# From: dict_methods.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# dict_methods.py - Essential dictionary methods

# Sample data: User preferences for an AI assistant
preferences = {
    "language": "en",
    "voice": "neutral",
    "speed": "normal",
    "personality": "helpful",
    "memory": True
}

# keys() - Get all keys
all_keys = preferences.keys()
print("All preference keys:", list(all_keys))

# values() - Get all values
all_values = preferences.values()
print("All preference values:", list(all_values))

# items() - Get key-value pairs
print("\nAll preferences:")
for key, value in preferences.items():
    print(f"  {key}: {value}")

# pop() with default
removed = preferences.pop("non_existent", "default_value")
print(f"\nPopped non-existent key: {removed}")

# setdefault() - Get value or set it if missing
theme = preferences.setdefault("theme", "dark")
print(f"Theme (set to default): {theme}")
print(f"Preferences now include theme: {preferences}")

# Copy dictionary with copy() - a shallow copy (remember the list copying lesson?)
backup = preferences.copy()
backup["language"] = "es"
print(f"\nOriginal language: {preferences['language']}")
print(f"Backup language: {backup['language']}")
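
`copy()` makes a shallow copy: the outer dictionary is new, but nested lists and dictionaries are still shared. When the values themselves are mutable, `copy.deepcopy()` from the standard library gives a fully independent copy. A small sketch, using a made-up `topics` list to show the difference:

```python
import copy

preferences = {"language": "en", "topics": ["python", "ai"]}

shallow = preferences.copy()       # New outer dict, SAME inner list
deep = copy.deepcopy(preferences)  # Fully independent copy

shallow["topics"].append("sets")   # Also changes preferences["topics"]
deep["topics"].append("tuples")    # Only changes the deep copy

print(preferences["topics"])
print(deep["topics"])
```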


In [None]:
# From: nested_dictionaries.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# nested_dictionaries.py - Working with complex nested data structures

# Complex AI conversation data
conversation = {
    "id": "conv_123",
    "user": {
        "name": "Alice",
        "id": "user_456",
        "preferences": {
            "language": "en",
            "style": "concise"
        }
    },
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there!"},
        {"role": "user", "content": "What's the weather?"}
    ],
    "metadata": {
        "created": "2024-01-15",
        "model": "gpt-3.5-turbo",
        "token_count": 45
    }
}

# Accessing nested data
user_name = conversation["user"]["name"]
language = conversation["user"]["preferences"]["language"]
first_message = conversation["messages"][0]["content"]
token_count = conversation["metadata"]["token_count"]

print(f"User: {user_name} (language: {language})")
print(f"First message: {first_message}")
print(f"Tokens used: {token_count}")

# Safely navigating nested structures
# Use get() chains for safety
style = conversation.get("user", {}).get("preferences", {}).get("style", "default")
print(f"Style preference: {style}")

# Modifying nested data
conversation["metadata"]["token_count"] += 10
conversation["user"]["preferences"]["style"] = "detailed"
print(f"\nUpdated token count: {conversation['metadata']['token_count']}")
print(f"Updated style: {conversation['user']['preferences']['style']}")
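
Long `get()` chains get hard to read as nesting grows. One option is a small helper that walks a list of keys. This is a sketch only: `get_nested` is a hypothetical name, not a built-in, and it handles dictionaries only (not list indexes).

```python
def get_nested(data, keys, default=None):
    """Walk nested dictionaries; return default if any key is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


conversation = {"user": {"preferences": {"style": "concise"}}}

print(get_nested(conversation, ["user", "preferences", "style"], "default"))
print(get_nested(conversation, ["user", "settings", "theme"], "default"))
```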


In [None]:
# From: json_and_apis.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# json_and_apis.py - Working with JSON and API responses

# Simulating an API response (this is what you'll get from OpenAI, etc.)
api_response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-3.5-turbo",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The weather today is sunny with a high of 72°F."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 15,
        "total_tokens": 27
    }
}

# Extracting the actual response
assistant_message = api_response["choices"][0]["message"]["content"]
total_tokens = api_response["usage"]["total_tokens"]
model_used = api_response["model"]

print(f"Model: {model_used}")
print(f"Response: {assistant_message}")
print(f"Tokens used: {total_tokens}")

# Converting to/from JSON (you'll use this constantly!)
import json

# Dictionary to JSON string
json_string = json.dumps({"name": "Alice", "age": 30}, indent=2)
print(f"\nJSON string:\n{json_string}")

# JSON string back to dictionary
parsed = json.loads(json_string)
print(f"\nParsed back to dict: {parsed}")
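
Two details of the JSON round trip tend to surprise people: tuples come back as lists, and malformed input raises `json.JSONDecodeError`. The sketch below demonstrates both with a made-up record.

```python
import json

record = {"role": "user", "position": (10, 20), "active": True, "note": None}

# Serialize, then parse back
round_trip = json.loads(json.dumps(record))
print(round_trip)

# JSON has no tuple type, so the tuple comes back as a list
print(type(round_trip["position"]))

# Malformed JSON raises json.JSONDecodeError
try:
    json.loads("{'name': 'Alice'}")  # Single quotes are not valid JSON
    parsed_ok = True
except json.JSONDecodeError as error:
    parsed_ok = False
    print(f"Could not parse: {error}")
```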


In [None]:
# From: dict_patterns.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# dict_patterns.py - Essential dictionary patterns for AI development

# Pattern 1: Configuration Management
agent_config = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
    "max_history": 10,
    "system_prompt": "You are a helpful assistant.",
    "features": {
        "memory": True,
        "web_search": False,
        "code_execution": False
    }
}

print("Agent Configuration:")
print(f"  Model: {agent_config['model']}")
print(f"  Temperature: {agent_config['temperature']}")

# Update configuration
agent_config["temperature"] = 0.9
agent_config["max_history"] = 20
print("\nUpdated config:")
print(f"  Temperature: {agent_config['temperature']}")
print(f"  Max history: {agent_config['max_history']}")

# Check if feature is enabled
feature_to_check = "memory"
if feature_to_check in agent_config["features"]:
    is_enabled = agent_config["features"][feature_to_check]
    print(f"  {feature_to_check} enabled: {is_enabled}")

# Pattern 2: Response Cache
response_cache = {}

# Create cache key (using tuple as key!)
prompt1 = "What is Python?"
model1 = "gpt-3.5"
cache_key1 = (prompt1.lower().strip(), model1)

# Store response in cache
response_cache[cache_key1] = {
    "response": "Python is a programming language...",
    "timestamp": "2024-01-15 10:30:00",
    "hits": 0
}

# Check cache
prompt2 = "what is python?"  # Different case
model2 = "gpt-3.5"
cache_key2 = (prompt2.lower().strip(), model2)

if cache_key2 in response_cache:
    cached_data = response_cache[cache_key2]
    cached_data["hits"] += 1
    print(f"\nCache hit! Response: {cached_data['response'][:30]}...")
    print(f"Cache hits: {cached_data['hits']}")

# Pattern 3: Entity Tracking
entities = {}

# Track entities mentioned in conversation
entity_mentions = [
    ("Alice", "person", {"age": 30, "role": "developer"}),
    ("Python", "technology", {"version": "3.11"}),
    ("Alice", "person", {"city": "New York"}),  # Update Alice
    ("Bob", "person", {"age": 25})
]

for name, entity_type, attributes in entity_mentions:
    if name not in entities:
        entities[name] = {
            "type": entity_type,
            "mentions": 0,
            "attributes": {}
        }
    
    entities[name]["mentions"] += 1
    entities[name]["attributes"].update(attributes)

print("\nEntity Tracking:")
for name, data in entities.items():
    print(f"  {name} ({data['type']}): {data['mentions']} mentions")
    print(f"    Attributes: {data['attributes']}")


In [None]:
# From: conversation_memory.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# conversation_memory.py - Building a memory system with dictionaries

# Conversation memory system using dictionaries
memory_system = {
    "conversations": {},  # Will store conversations by user_id
    "user_profiles": {},  # Will store user information
    "context_cache": {},  # Recent context by user
    "max_size": 10       # Maximum messages per conversation
}

# Initialize users
user_ids = ["user_123", "user_456"]
for user_id in user_ids:
    memory_system["conversations"][user_id] = []
    memory_system["user_profiles"][user_id] = {
        "name": f"User_{user_id[-3:]}",
        "first_seen": "2024-01-15",
        "message_count": 0,
        "topics": []  # Topics discussed
    }

print("Memory system initialized for users:", user_ids)

# Add messages for user_123
messages = [
    ("user", "Hello AI!", "2024-01-15 10:30:00"),
    ("assistant", "Hello! How can I help you?", "2024-01-15 10:30:01"),
    ("user", "Tell me about Python lists", "2024-01-15 10:30:15"),
    ("assistant", "Lists are ordered collections in Python", "2024-01-15 10:30:16"),
    ("user", "How do I add items?", "2024-01-15 10:30:30"),
    ("assistant", "Use append() to add items to a list", "2024-01-15 10:30:31"),
]

current_user = "user_123"
print(f"\nAdding messages for {current_user}:")

for role, content, timestamp in messages:
    # Create message record
    message = {
        "role": role,
        "content": content,
        "timestamp": timestamp,
        "tokens": len(content.split()) * 2  # Rough estimate
    }
    
    # Add to conversation
    memory_system["conversations"][current_user].append(message)
    
    # Update user profile
    if role == "user":
        memory_system["user_profiles"][current_user]["message_count"] += 1
        
        # Simple topic extraction
        if "python" in content.lower():
            if "programming" not in memory_system["user_profiles"][current_user]["topics"]:
                memory_system["user_profiles"][current_user]["topics"].append("programming")
        if "list" in content.lower():
            if "data structures" not in memory_system["user_profiles"][current_user]["topics"]:
                memory_system["user_profiles"][current_user]["topics"].append("data structures")
    
    print(f"  Added: {role} - {content[:30]}...")
    
    # Check if exceeded max size
    if len(memory_system["conversations"][current_user]) > memory_system["max_size"]:
        removed = memory_system["conversations"][current_user].pop(0)
        print("  Memory full! Removed oldest message")

# Get context for user
context_size = 3
user_conversation = memory_system["conversations"][current_user]
# Slicing handles short lists too: with fewer than context_size messages it returns a copy of them all
recent_context = user_conversation[-context_size:]

# Cache the context
memory_system["context_cache"][current_user] = {
    "messages": recent_context,
    "summary": f"{len(recent_context)} recent messages",
    "total_tokens": sum(m["tokens"] for m in recent_context)
}

print(f"\nContext for {current_user}:")
for msg in recent_context:
    print(f"  [{msg['timestamp']}] {msg['role']}: {msg['content'][:40]}...")

# Display user profile
profile = memory_system["user_profiles"][current_user]
print(f"\nUser Profile for {current_user}:")
print(f"  Name: {profile['name']}")
print(f"  Messages sent: {profile['message_count']}")
print(f"  Topics: {profile['topics']}")
print(f"  First seen: {profile['first_seen']}")

# Search for keywords
search_term = "list"
print(f"\nSearching for '{search_term}' in conversations:")
found_messages = []
for msg in memory_system["conversations"][current_user]:
    if search_term.lower() in msg["content"].lower():
        found_messages.append(msg)
        print(f"  Found in {msg['role']} message: {msg['content'][:50]}...")

print(f"Total matches: {len(found_messages)}")
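
The memory system above trims old messages by hand with `pop(0)`. As an alternative sketch (not the approach used in the book's file), `collections.deque` with `maxlen` drops the oldest entry automatically. The limit of 3 is chosen only to make the trimming visible.

```python
from collections import deque

max_size = 3  # Small limit so the trimming is easy to see
conversation = deque(maxlen=max_size)

for number in range(1, 6):
    # Once the deque is full, appending discards the oldest entry
    conversation.append(f"message {number}")

print(list(conversation))
```

A deque does not support slicing, so convert it with `list()` first if you need something like `[-3:]`.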


In [None]:
# From: advanced_techniques.py

# From: Zero to AI Agent, Chapter 4, Section 4.4
# advanced_techniques.py - Advanced dictionary techniques

# Merging dictionaries
defaults = {"color": "blue", "size": "medium", "quantity": 1}
user_choices = {"color": "red", "quantity": 3}

# Merge (user choices override defaults)
final = {**defaults, **user_choices}
print(f"Final options: {final}")

# Dictionary comprehensions
# Square numbers
squares = {n: n**2 for n in range(1, 6)}
print(f"\nSquares: {squares}")

# Filter a dictionary
scores = {"Alice": 85, "Bob": 92, "Charlie": 78, "Diana": 95}
high_scores = {name: score for name, score in scores.items() if score >= 90}
print(f"High scores: {high_scores}")

# Invert a dictionary (swap keys and values; assumes the values are unique and hashable)
color_codes = {"red": "#FF0000", "green": "#00FF00", "blue": "#0000FF"}
code_to_color = {code: color for color, code in color_codes.items()}
print(f"Inverted: {code_to_color}")

# Grouping data
students = [
    {"name": "Alice", "grade": "A"},
    {"name": "Bob", "grade": "B"},
    {"name": "Charlie", "grade": "A"},
    {"name": "Diana", "grade": "B"},
    {"name": "Eve", "grade": "A"}
]

# Group by grade
by_grade = {}
for student in students:
    grade = student["grade"]
    if grade not in by_grade:
        by_grade[grade] = []
    by_grade[grade].append(student["name"])

print("\nStudents by grade:")
for grade, names in by_grade.items():
    print(f"  Grade {grade}: {names}")

# Word frequency counter
text = "the quick brown fox jumps over the lazy dog the fox"
word_counter = {}

for word in text.split():
    if word in word_counter:
        word_counter[word] += 1
    else:
        word_counter[word] = 1

print("\nWord frequencies:")
for word, count in word_counter.items():
    print(f"  {word}: {count}")
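
The grouping and counting loops above are worth writing by hand once. The standard library also has ready-made tools for both: `collections.Counter` and `collections.defaultdict`. Here is a sketch of the same two tasks, with the student data simplified to tuples for brevity:

```python
from collections import Counter, defaultdict

# Word frequencies with Counter
text = "the quick brown fox jumps over the lazy dog the fox"
word_counter = Counter(text.split())
print(word_counter.most_common(2))

# Grouping with defaultdict: missing keys start as an empty list
students = [("Alice", "A"), ("Bob", "B"), ("Charlie", "A"), ("Diana", "B"), ("Eve", "A")]
by_grade = defaultdict(list)
for name, grade in students:
    by_grade[grade].append(name)

print(dict(by_grade))
```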


---
### Section 4.4 Exercises

### Exercise 4.4.1: Product Inventory System

Create an inventory system that can:
1. Store products with name, price, and quantity
2. Add new products
3. Update quantities
4. Calculate total inventory value
5. Find products below a certain stock level
6. Generate a restock report

In [None]:
# Your code here


### Exercise 4.4.2: Student Grade Manager

Build a grade management system that can:
1. Store students with their grades in multiple subjects
2. Add new students and grades
3. Calculate average grade per student
4. Find the top performer
5. Generate a report card for a specific student
6. List all students failing any subject (grade < 60)

In [None]:
# Your code here


### Exercise 4.4.3: API Response Handler

Create a system that can:
1. Process mock API responses (nested dictionaries)
2. Extract specific fields safely
3. Handle missing keys gracefully
4. Count total API calls by endpoint
5. Cache responses to avoid duplicate calls
6. Generate usage statistics

In [None]:
# Your code here


---
## Section 4.5: Sets and their operations

In [None]:
# From: set_creation.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_creation.py - Creating sets in Python

# Creating sets - notice just values, no key:value pairs
fruits = {"apple", "banana", "orange", "grape"}
print(f"Fruits set: {fruits}")
print(f"Type: {type(fruits)}")

# Sets automatically remove duplicates!
numbers = {1, 2, 3, 2, 1, 4, 3, 5}  # Duplicates: 1, 2, 3
print(f"Numbers set: {numbers}")  # Only unique values remain

# Creating from a list (removes duplicates)
temperatures = [22, 24, 22, 23, 24, 25, 23, 22]
unique_temps = set(temperatures)
print(f"Original list: {temperatures}")
print(f"Unique temperatures: {unique_temps}")

# Empty set - CAREFUL with this one!
# wrong_way = {}  # This creates an empty DICTIONARY, not a set!
right_way = set()  # This creates an empty set
also_right = {1, 2, 3}
also_right.clear()  # Now it's empty
print(f"Empty set: {right_way}")
print(f"Type of {{}}: {type({})}")  # It's a dict!
print(f"Type of set(): {type(set())}")  # It's a set!

# Creating from a string (gets unique characters)
word = "programming"
unique_letters = set(word)
print(f"Unique letters in '{word}': {unique_letters}")

# Creating from range
evens = set(range(0, 10, 2))
print(f"Even numbers: {evens}")
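
Converting a list to a set removes duplicates, but a set does not promise any particular order. If the original order matters, one common approach is `dict.fromkeys()`, which relies on dictionaries keeping insertion order (Python 3.7+). A sketch using the same temperature readings:

```python
temperatures = [22, 24, 22, 23, 24, 25, 23, 22]

# set() removes duplicates but makes no promise about order
unique_unordered = set(temperatures)

# dict keys keep insertion order, so this preserves first-seen order
unique_ordered = list(dict.fromkeys(temperatures))
print(unique_ordered)

# sorted() is another option when you want the unique values in sorted order
unique_sorted = sorted(set(temperatures))
print(unique_sorted)
```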


In [None]:
# From: set_operations.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_operations.py - Basic set operations: add, remove, check

# Working with a set of skills for an AI developer
skills = {"Python", "Machine Learning", "Data Analysis"}
print(f"Initial skills: {skills}")

# Adding elements
skills.add("Deep Learning")
skills.add("Python")  # Try to add duplicate - nothing happens!
print(f"After adding: {skills}")

# Adding multiple elements
new_skills = ["Statistics", "SQL", "Cloud Computing", "SQL"]  # Note: SQL appears twice
skills.update(new_skills)  # Adds all unique elements
print(f"After update: {skills}")

# Removing elements - different methods
skills.remove("SQL")  # Removes SQL (raises error if not found)
print(f"After remove: {skills}")

# Safe removal with discard (no error if not found)
skills.discard("JavaScript")  # Not in set, but no error
skills.discard("Statistics")  # Removes if present
print(f"After discard: {skills}")

# Pop removes and returns an arbitrary element
if skills:  # Check if not empty
    popped = skills.pop()
    print(f"Popped: {popped}")
    print(f"Remaining: {skills}")

# Checking membership (fast: average O(1) lookup thanks to hashing)
ai_skills = {"Python", "TensorFlow", "PyTorch", "Scikit-learn", "Pandas"}

# Lookup time stays roughly constant even with huge sets
if "Python" in ai_skills:
    print("Python is in the skill set")

if "Java" not in ai_skills:
    print("Java is not in the skill set")

# Length and clearing
print(f"Number of skills: {len(ai_skills)}")
# ai_skills.clear()  # Removes all elements
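
The difference between `remove()` and `discard()` shows up only when the element is missing. This short sketch triggers that case on purpose:

```python
skills = {"Python", "SQL"}

# remove() raises KeyError for a missing element
try:
    skills.remove("JavaScript")
    removal_failed = False
except KeyError as error:
    removal_failed = True
    print(f"remove() failed: {error}")

# discard() silently does nothing
skills.discard("JavaScript")
print(len(skills))
```

Use `remove()` when a missing element indicates a bug you want to hear about, and `discard()` when it is harmless.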


In [None]:
# From: set_mathematics.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_mathematics.py - Mathematical set operations

# Two teams and their programming languages
team_a = {"Python", "JavaScript", "Go", "Rust"}
team_b = {"Python", "Java", "JavaScript", "C++"}

print(f"Team A knows: {team_a}")
print(f"Team B knows: {team_b}")

# UNION - All languages known by either team (OR)
all_languages = team_a | team_b  # Using | operator
# OR
all_languages = team_a.union(team_b)  # Using method
print(f"\nAll languages (union): {all_languages}")

# INTERSECTION - Languages known by both teams (AND)
common_languages = team_a & team_b  # Using & operator
# OR
common_languages = team_a.intersection(team_b)  # Using method
print(f"Common languages (intersection): {common_languages}")

# DIFFERENCE - Languages only Team A knows
team_a_exclusive = team_a - team_b  # Using - operator
# OR
team_a_exclusive = team_a.difference(team_b)  # Using method
print(f"Only Team A knows (difference): {team_a_exclusive}")

# SYMMETRIC DIFFERENCE - Languages known by one team but not both (XOR)
unique_to_one_team = team_a ^ team_b  # Using ^ operator
# OR
unique_to_one_team = team_a.symmetric_difference(team_b)  # Using method
print(f"Known by only one team (symmetric difference): {unique_to_one_team}")

# Real-world example: Finding common interests
alice_interests = {"AI", "Python", "Reading", "Hiking", "Photography"}
bob_interests = {"Python", "Gaming", "AI", "Cooking", "Photography"}

common = alice_interests & bob_interests
print(f"\nAlice and Bob both like: {common}")

alice_unique = alice_interests - bob_interests
print(f"Only Alice likes: {alice_unique}")

all_interests = alice_interests | bob_interests
print(f"All interests combined: {all_interests}")
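
Intersection and union can be combined into a single overlap score. One common measure is Jaccard similarity: the size of the intersection divided by the size of the union. The sketch below uses a hypothetical helper, `jaccard_similarity`. Returning 0.0 for two empty sets is a convention chosen here, not a universal rule.

```python
def jaccard_similarity(set_a, set_b):
    """Return intersection size divided by union size (0.0 to 1.0)."""
    union = set_a | set_b
    if not union:
        return 0.0  # Convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


alice_interests = {"AI", "Python", "Reading", "Hiking", "Photography"}
bob_interests = {"Python", "Gaming", "AI", "Cooking", "Photography"}

similarity = jaccard_similarity(alice_interests, bob_interests)
print(f"Interest overlap: {similarity:.2f}")
```

Alice and Bob share 3 interests out of 7 distinct ones, so the score is 3/7.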


In [None]:
# From: set_comparisons.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_comparisons.py - Subset and superset operations

# AI technology hierarchy
ml_basics = {"Python", "Statistics", "Linear Algebra"}
ml_advanced = {"Python", "Statistics", "Linear Algebra", "Deep Learning", "NLP"}
data_science = {"Python", "Statistics", "SQL", "Visualization"}

# Is ml_basics a subset of ml_advanced?
print(f"ML basics ⊆ ML advanced? {ml_basics.issubset(ml_advanced)}")
print(f"ML basics ⊆ ML advanced? {ml_basics <= ml_advanced}")  # Alternative

# Is ml_advanced a superset of ml_basics?
print(f"ML advanced ⊇ ML basics? {ml_advanced.issuperset(ml_basics)}")
print(f"ML advanced ⊇ ML basics? {ml_advanced >= ml_basics}")  # Alternative

# Are ml_basics and data_science disjoint (no common elements)?
print(f"ML basics ∩ Data Science = ∅? {ml_basics.isdisjoint(data_science)}")
# False, because they share Python and Statistics

# Proper subset (subset but not equal)
print(f"ML basics ⊂ ML advanced? {ml_basics < ml_advanced}")
print(f"ML basics = ML basics? {ml_basics == ml_basics}")

# Practical example: Permission checking
user_permissions = {"read", "write", "execute"}
required_permissions = {"read", "write"}
admin_permissions = {"read", "write", "execute", "delete", "admin"}

# Check if user has all required permissions
has_access = required_permissions.issubset(user_permissions)
print(f"\nUser has required permissions? {has_access}")

# Check if user is admin (has all admin permissions)
is_admin = admin_permissions.issubset(user_permissions)
print(f"User is admin? {is_admin}")

# Check if user has ANY admin permission
has_some_admin = not user_permissions.isdisjoint(admin_permissions)
print(f"User has some admin permissions? {has_some_admin}")
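
Subset checks answer "does the user have everything?", but a yes/no answer is often not enough. As a small sketch, set difference can also report which permissions are missing. The `required` and `granted` sets below are hypothetical values chosen for illustration.

```python
# Hypothetical permission sets for illustration
required = {"read", "write", "delete"}
granted = {"read", "write", "execute"}

missing = required - granted   # what the user still needs
has_access = not missing       # an empty set counts as False

print(f"Has access? {has_access}")  # Has access? False
print(f"Missing: {missing}")        # Missing: {'delete'}
```

When `missing` is empty, the result matches `required.issubset(granted)`, so this is the same check with a more helpful error message available.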


In [None]:
# From: frozen_sets.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# frozen_sets.py - Working with immutable sets

# Creating frozen sets
constants = frozenset([3.14159, 2.71828, 1.41421])
print(f"Mathematical constants: {constants}")

# Frozen sets can be dictionary keys (regular sets cannot!)
user_groups = {
    frozenset(["admin", "user"]): "Full Access",
    frozenset(["user"]): "Limited Access",
    frozenset(["guest"]): "Read Only"
}

current_user_groups = frozenset(["user"])
access_level = user_groups.get(current_user_groups, "No Access")
print(f"User access level: {access_level}")

# Frozen sets support all non-mutating operations
set1 = frozenset([1, 2, 3])
set2 = frozenset([2, 3, 4])

print(f"Union: {set1 | set2}")
print(f"Intersection: {set1 & set2}")
print(f"Difference: {set1 - set2}")

# But you can't modify them
# set1.add(4)  # This would raise an AttributeError
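
To see both claims from this cell in action without crashing the notebook, the errors can be caught and inspected. This is a minimal sketch: the variable names (`frozen`, `add_error`, `key_failure`) are made up here, and `try`/`except` is used only to keep the cell running.

```python
frozen = frozenset([1, 2, 3])

# frozenset has no add() method at all, so the failure is an AttributeError
try:
    frozen.add(4)
except AttributeError as error:
    add_error = type(error).__name__
    print(f"Cannot add: {add_error}")

# A regular set is unhashable, so using it as a dictionary key raises TypeError
try:
    lookup = {{"admin", "user"}: "Full Access"}
except TypeError as error:
    key_failure = type(error).__name__
    print(f"Cannot use a set as a key: {key_failure}")

# "Changing" a frozenset really means building a new one
bigger = frozen | {4}
print(len(bigger))  # 4
```

The original `frozen` still holds three items afterwards; only `bigger` contains the extra element.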


In [None]:
# From: text_processing.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# text_processing.py - Using sets for natural language processing

# Text analysis using sets
text1 = "The quick brown fox jumps over the lazy dog and the dog runs away with the fox"
text2 = "Machine learning is a subset of artificial intelligence and artificial intelligence is the future"

# Process text 1
words1 = text1.lower().split()
vocabulary1 = set(words1)  # Unique words

# Process text 2
words2 = text2.lower().split()
vocabulary2 = set(words2)

# Common English stop words
stop_words = {"the", "is", "at", "which", "on", "a", "an", "and", "or", "but", "in", "with", "to", "for", "of"}

# Content words (excluding stop words)
content_words1 = vocabulary1 - stop_words
content_words2 = vocabulary2 - stop_words

# Analysis
print("Text 1 Analysis:")
print(f"  Total words: {len(words1)}")
print(f"  Unique words: {len(vocabulary1)}")
print(f"  Content words: {content_words1}")
print(f"  Lexical diversity: {len(vocabulary1) / len(words1):.2f}")

print("\nText 2 Analysis:")
print(f"  Total words: {len(words2)}")
print(f"  Unique words: {len(vocabulary2)}")
print(f"  Content words: {content_words2}")
print(f"  Lexical diversity: {len(vocabulary2) / len(words2):.2f}")

# Compare vocabularies
print(f"\nCommon content words: {content_words1 & content_words2}")
print(f"Words only in text 1: {content_words1 - content_words2}")
print(f"Words only in text 2: {content_words2 - content_words1}")
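
The sample texts above contain no punctuation, so `split()` alone is enough. With real text, `"fox"` and `"fox,"` would count as different words. One possible fix, sketched here with a made-up sentence, is to strip punctuation from each word using `string.punctuation` from the standard library before adding it to the set.

```python
import string

# Made-up sentence with punctuation attached to some words
text = "The fox jumps. The dog sees the fox, and the dog runs!"

raw_vocabulary = set(text.lower().split())

# Strip punctuation from both ends of each word before adding it
clean_vocabulary = set()
for word in text.lower().split():
    clean_vocabulary.add(word.strip(string.punctuation))

print(f"Raw tokens: {len(raw_vocabulary)}")      # Raw tokens: 8
print(f"Clean words: {len(clean_vocabulary)}")   # Clean words: 7
```

Note that `strip()` only removes characters at the ends of a word, so something like `"don't"` keeps its apostrophe.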


In [None]:
# From: set_patterns.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_patterns.py - Essential patterns for AI development

# Pattern 1: Duplicate Detection in Datasets
dataset = {
    "seen_ids": set(),
    "duplicates": [],
    "unique_data": []
}

# Sample data with duplicates
samples = [
    ("id_001", "data1"),
    ("id_002", "data2"),
    ("id_001", "data1_duplicate"),  # Duplicate!
    ("id_003", "data3"),
    ("id_002", "data2_duplicate"),  # Duplicate!
]

print("Processing dataset:")
for sample_id, data in samples:
    if sample_id in dataset["seen_ids"]:
        dataset["duplicates"].append((sample_id, data))
        print(f"  Duplicate detected: {sample_id}")
    else:
        dataset["seen_ids"].add(sample_id)
        dataset["unique_data"].append((sample_id, data))
        print(f"  Added unique sample: {sample_id}")

print(f"\nStatistics:")
print(f"  Total unique: {len(dataset['seen_ids'])}")
print(f"  Duplicates found: {len(dataset['duplicates'])}")

# Pattern 2: Feature Selection for ML
all_features = {"age", "income", "education", "location", "gender", "occupation"}
selected_features = set()
excluded_features = set()

# Select features
features_to_add = ["age", "income", "education", "invalid_feature"]
valid_features = set(features_to_add) & all_features
invalid_features = set(features_to_add) - all_features

if invalid_features:
    print(f"\nWarning: Invalid features ignored: {invalid_features}")

selected_features.update(valid_features)
print(f"Selected features: {selected_features}")

# Exclude features
excluded_features.add("gender")  # Remove for privacy
selected_features -= excluded_features

# Get unused features
unused_features = all_features - selected_features - excluded_features
print(f"Final features: {selected_features}")
print(f"Unused features: {unused_features}")

# Pattern 3: User Session Tracking
session_tracker = {
    "active_sessions": set(),
    "completed_sessions": set(),
    "all_users": set()
}

# Simulate user activity
actions = [
    ("start", "user1"),
    ("start", "user2"),
    ("start", "user3"),
    ("end", "user1"),
    ("start", "user1"),  # Returning user
    ("start", "user4"),
    ("end", "user2"),
]

print("\nSession tracking:")
for action, user_id in actions:
    if action == "start":
        if user_id in session_tracker["active_sessions"]:
            print(f"  {user_id} already has active session")
        else:
            session_tracker["active_sessions"].add(user_id)
            session_tracker["all_users"].add(user_id)
            print(f"  Session started for {user_id}")
    else:  # end
        if user_id not in session_tracker["active_sessions"]:
            print(f"  No active session for {user_id}")
        else:
            session_tracker["active_sessions"].remove(user_id)
            session_tracker["completed_sessions"].add(user_id)
            print(f"  Session ended for {user_id}")

# Calculate metrics
active_count = len(session_tracker["active_sessions"])
total_users = len(session_tracker["all_users"])
# Users who have finished at least one session
completed_users = len(session_tracker["completed_sessions"])
# Users who started a session but have not finished one yet
first_session_users = len(session_tracker["all_users"] - session_tracker["completed_sessions"])

print(f"\nSession Metrics:")
print(f"  Active now: {active_count}")
print(f"  Total users: {total_users}")
print(f"  Users with a completed session: {completed_users}")
print(f"  Users still in their first session: {first_session_users}")
print(f"  Active user IDs: {session_tracker['active_sessions']}")


In [None]:
# From: set_comprehensions.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_comprehensions.py - Creating sets elegantly

# Basic set comprehension
squares = {x**2 for x in range(10)}
print(f"Squares: {squares}")

# With condition
even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(f"Even squares: {even_squares}")

# From string - unique words longer than 3 characters
text = "the quick brown fox jumps over the lazy fox"
long_words = {word for word in text.split() if len(word) > 3}
print(f"Long words: {long_words}")

# Normalizing data
emails = ["Alice@EXAMPLE.com", "bob@example.com", "ALICE@Example.com", "charlie@test.com"]
unique_emails = {email.lower() for email in emails}
print(f"Unique emails (normalized): {unique_emails}")

# Extracting from nested data
users = [
    {"name": "Alice", "tags": ["python", "ai", "ml"]},
    {"name": "Bob", "tags": ["python", "web", "api"]},
    {"name": "Charlie", "tags": ["ai", "data", "python"]}
]

all_tags = {tag for user in users for tag in user["tags"]}
print(f"All unique tags: {all_tags}")


In [None]:
# From: set_performance.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# set_performance.py - Why sets are incredibly fast

import time

# Create large collections
large_list = list(range(1000000))
large_set = set(range(1000000))

# Test membership with list (slow for large lists)
start = time.perf_counter()  # perf_counter() is designed for timing short operations
result = 999999 in large_list  # Has to check each element until found
list_time = time.perf_counter() - start

# Test membership with set (fast on average, regardless of size)
start = time.perf_counter()
result = 999999 in large_set  # Hash-based lookup
set_time = time.perf_counter() - start

print(f"List lookup time: {list_time:.6f} seconds")
print(f"Set lookup time: {set_time:.6f} seconds")
if set_time > 0:
    print(f"Set is roughly {list_time/set_time:.0f}x faster!")
else:
    # Avoid dividing by zero when the lookup is too quick to measure
    print("Set lookup was too fast to measure!")

# Practical example: Checking many items
words_to_check = ["python", "programming", "ai", "machine", "learning"] * 1000
valid_words = {"python", "programming", "ai", "computer", "science", "data"}

# Using set for validation (fast)
start = time.time()
valid_count = sum(1 for word in words_to_check if word in valid_words)
print(f"\nValidation with set: {time.time() - start:.4f} seconds")
print(f"Found {valid_count} valid words")
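
A single timed lookup can finish faster than the clock can resolve, which makes one-off measurements noisy. A more stable approach, sketched below, is the standard library's `timeit` module, which repeats the operation many times. The collection size and repeat count are arbitrary choices for this example.

```python
import timeit

size = 100_000
numbers_list = list(range(size))
numbers_set = set(range(size))
target = size - 1  # worst case for the list: the last element

# Run each lookup 100 times so the total is large enough to measure
list_seconds = timeit.timeit(lambda: target in numbers_list, number=100)
set_seconds = timeit.timeit(lambda: target in numbers_set, number=100)

print(f"List: {list_seconds:.6f} s for 100 lookups")
print(f"Set:  {set_seconds:.6f} s for 100 lookups")
```

The exact numbers depend on your machine, but the gap between the two should be large and should grow as `size` grows.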


In [None]:
# From: recommendation_system.py

# From: Zero to AI Agent, Chapter 4, Section 4.5
# recommendation_system.py - Building a recommendation engine with sets

# Simple recommendation system using set operations

# User interests database
user_interests = {
    "alice": {"python", "ai", "machine-learning", "data-science"},
    "bob": {"javascript", "web", "react", "node"},
    "charlie": {"python", "ai", "deep-learning", "nlp"},
    "diana": {"python", "web", "django", "api"}
}

# Content tags database
content_items = {
    "article_1": {"python", "tutorial", "beginners"},
    "article_2": {"ai", "machine-learning", "python"},
    "article_3": {"javascript", "react", "tutorial"},
    "article_4": {"python", "django", "web"},
    "article_5": {"deep-learning", "nlp", "ai"},
    "video_1": {"python", "data-science", "pandas"},
    "video_2": {"node", "javascript", "api"}
}

# User viewing history
user_history = {
    "alice": {"article_1", "article_2"},
    "bob": {"article_3"},
    "charlie": set(),
    "diana": {"article_4"}
}

# Generate recommendations for each user
print("Generating recommendations:\n")
for username in user_interests:
    user_int = user_interests[username]
    viewed = user_history.get(username, set())
    
    # Find unviewed content
    all_content = set(content_items.keys())
    unviewed = all_content - viewed
    
    # Score each unviewed content
    recommendations = []
    for content_id in unviewed:
        content_tags = content_items[content_id]
        
        # Calculate relevance (number of matching tags)
        relevance = len(user_int & content_tags)
        
        if relevance > 0:
            recommendations.append((content_id, relevance))
    
    # Sort by relevance
    recommendations.sort(key=lambda x: x[1], reverse=True)
    
    # Display top 3 recommendations
    print(f"{username}'s recommendations:")
    top_recs = recommendations[:3]
    if top_recs:
        for content_id, score in top_recs:
            print(f"  {content_id}: relevance score {score}")
    else:
        print(f"  No recommendations found")
    print()

# Find similar users (Jaccard similarity)
print("Finding similar users:\n")
user_list = list(user_interests.keys())
for i in range(len(user_list)):
    for j in range(i + 1, len(user_list)):
        user1 = user_list[i]
        user2 = user_list[j]
        
        interests1 = user_interests[user1]
        interests2 = user_interests[user2]
        
        # Calculate Jaccard similarity
        intersection = len(interests1 & interests2)
        union = len(interests1 | interests2)
        
        if union > 0:
            similarity = intersection / union
            if similarity > 0.3:  # Threshold
                print(f"{user1} and {user2}: {similarity:.2f} similarity")
                print(f"  Common interests: {interests1 & interests2}")

# Popular content analysis
print("\nPopular content (viewed by multiple users):")
all_viewed = set()
for viewed_set in user_history.values():
    all_viewed |= viewed_set

view_counts = {}
for content_id in all_viewed:
    count = sum(1 for viewed in user_history.values() if content_id in viewed)
    if count > 1:
        view_counts[content_id] = count

if view_counts:
    for content_id, count in view_counts.items():
        print(f"  {content_id}: {count} views")
else:
    print("  No content has been viewed by more than one user yet")
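
The Jaccard calculation inside the nested loop above can be pulled out into a small helper so it is easier to test on its own. This is only a sketch: `jaccard` is a name introduced here, and returning `0.0` for two empty sets is an assumption (the loop above simply skips that case).

```python
def jaccard(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

alice = {"python", "ai", "machine-learning", "data-science"}
charlie = {"python", "ai", "deep-learning", "nlp"}
bob = {"javascript", "web", "react", "node"}

print(f"{jaccard(alice, charlie):.2f}")  # 0.33 (2 shared out of 6 total)
print(f"{jaccard(alice, bob):.2f}")      # 0.00 (nothing shared)
```

A score of 1.0 means the two sets are identical, and 0.0 means they share nothing, which is why the 0.3 threshold above only lets through fairly close matches.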


---
### Section 4.5 Exercises

### Exercise 4.5.1: Email List Manager

Create a system that can:
1. Manage email lists for different campaigns
2. Add emails to lists (no duplicates allowed)
3. Find common subscribers between campaigns
4. Identify exclusive subscribers for each campaign
5. Merge campaign lists without duplicates
6. Generate statistics about overlap between campaigns

In [None]:
# Your code here


### Exercise 4.5.2: Skill Matcher

Build a job matching system that can:
1. Store job requirements as sets of skills
2. Store candidate skills as sets
3. Find perfect matches (candidate has all required skills)
4. Find partial matches with match percentage
5. Identify missing skills for each candidate
6. Recommend training based on skill gaps

In [None]:
# Your code here


### Exercise 4.5.3: Document Similarity Analyzer

Create a text analysis system that can:
1. Extract unique words from documents
2. Calculate vocabulary overlap between documents
3. Find common themes (shared important words)
4. Identify unique vocabulary per document
5. Calculate similarity scores
6. Group similar documents together

In [None]:
# Your code here


---
## Section 4.6: Choosing the right data structure

In [None]:
# From: decision_guide.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# decision_guide.py - Quick decision guide for choosing data structures

# THE GOLDEN QUESTIONS TO ASK YOURSELF:

# 1. Do I need to store key-value pairs or look things up by name?
#    → Use a DICTIONARY
user = {"name": "Alice", "age": 30, "email": "alice@example.com"}

# 2. Do I need to ensure uniqueness or perform set operations?
#    → Use a SET
unique_visitors = {"user123", "user456", "user789"}

# 3. Will this data never change once created?
#    → Use a TUPLE
coordinates = (40.7128, -74.0060)  # NYC coordinates won't change

# 4. Do I need an ordered, changeable collection?
#    → Use a LIST
shopping_cart = ["apples", "bread", "milk"]


In [None]:
# From: when_to_use_lists.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# when_to_use_lists.py - When to choose lists for your data

# 1. ORDER MATTERS and items might change
timeline_events = [
    "User logged in",
    "Viewed product",
    "Added to cart",
    "Completed purchase"
]
# The sequence of events is crucial!

# 2. You need to access items by position
temperatures = [22, 24, 23, 25, 21, 26, 22]
monday_temp = temperatures[0]  # First day
friday_temp = temperatures[4]  # Fifth day

# 3. You'll be adding/removing items frequently
task_queue = []
task_queue.append("Process payment")
task_queue.append("Send email")
current_task = task_queue.pop(0)  # Process first task

# 4. Duplicates are allowed and meaningful
dice_rolls = [6, 3, 6, 2, 6, 1, 4, 6]  # Multiple 6s are valid

# 5. You need to sort or reverse
scores = [85, 92, 78, 95, 88]
scores.sort()  # In-place sorting

# Real-world LIST examples in AI:

# Conversation history (order matters, can grow)
chat_messages = [
    "User: Hello",
    "Bot: Hi there!",
    "User: How are you?"
]
chat_messages.append("Bot: I'm doing great!")

# Training data batches (need specific order)
training_batches = [
    [1, 2, 3, 4],  # Batch 1
    [5, 6, 7, 8],  # Batch 2
    [9, 10, 11, 12]  # Batch 3
]

# Sequential predictions
predictions = []
for i in range(5):
    predictions.append(f"Prediction {i}")

print("List examples:")
print(f"Timeline has {len(timeline_events)} events")
print(f"Task queue: {task_queue}")
print(f"Chat messages: {len(chat_messages)} messages")
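
One caveat about the task queue above: `pop(0)` works, but removing from the front of a list shifts every remaining item, which gets slow for long queues. A common alternative is `collections.deque` from the standard library. The sketch below repeats the same two tasks with a deque.

```python
from collections import deque

# Same two tasks as above; popleft() avoids shifting the remaining items
task_queue = deque()
task_queue.append("Process payment")
task_queue.append("Send email")

current_task = task_queue.popleft()  # removes from the front
print(current_task)       # Process payment
print(list(task_queue))   # ['Send email']
```

For a handful of items a plain list is perfectly fine; the deque only starts to matter when the queue holds thousands of entries.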


In [None]:
# From: when_to_use_tuples.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# when_to_use_tuples.py - When to choose tuples for your data

# 1. Data represents a single, unchangeable entity
rgb_red = (255, 0, 0)  # Color definition
database_config = ("localhost", 5432, "mydb", "readonly")

# 2. You need to use it as a dictionary key
location_data = {
    (40.7128, -74.0060): "New York",
    (51.5074, -0.1278): "London"
}
# Lists can't be dictionary keys, but tuples can!

# 3. Returning multiple values (conceptually)
# Calculate both area and perimeter
width, height = 10, 5
result = (width * height, 2 * (width + height))  # (area, perimeter)
area, perimeter = result

# 4. Protecting data from accidental changes
SYSTEM_CONSTANTS = (3.14159, 2.71828, 1.41421)
# No one can accidentally modify these

# 5. Representing fixed records
user_record = ("Alice", 30, "alice@example.com", True)  # Fixed structure

# Real-world TUPLE examples in AI:

# Model configuration (shouldn't change during runtime)
MODEL_CONFIG = ("gpt-3.5", 2048, 0.7, "production")
model, max_tokens, temperature, environment = MODEL_CONFIG

# Training metrics snapshot (immutable record)
epoch_results = (1, 0.85, 0.15, 42.3)  # (epoch, accuracy, loss, time)

# Fixed coordinate pairs for computer vision
bounding_box = ((100, 100), (200, 200))  # ((x1, y1), (x2, y2))

print("Tuple examples:")
print(f"RGB Red: {rgb_red}")
print(f"Database config: {database_config}")
print(f"Model: {model}, Max tokens: {max_tokens}")
print(f"Area: {area}, Perimeter: {perimeter}")
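
Plain tuples like `epoch_results` rely on a comment to explain what each position means. If that becomes hard to follow, `collections.namedtuple` offers a tuple with named fields. This is a sketch only: `EpochResult` and its field names are assumptions chosen to mirror the `(epoch, accuracy, loss, time)` comment above.

```python
from collections import namedtuple

# Field names are assumptions that mirror the comment on epoch_results
EpochResult = namedtuple("EpochResult", ["epoch", "accuracy", "loss", "seconds"])

result = EpochResult(epoch=1, accuracy=0.85, loss=0.15, seconds=42.3)

print(result.accuracy)  # 0.85, read by name
print(result[1])        # 0.85, still works by position

# Unpacking works exactly like a regular tuple
epoch, accuracy, loss, seconds = result
```

A named tuple is still immutable, so it keeps the protection against accidental changes while making the code easier to read.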


In [None]:
# From: when_to_use_dictionaries.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# when_to_use_dictionaries.py - When to choose dictionaries for your data

# 1. You need to look up values by a meaningful key
phone_book = {
    "Alice": "555-1234",
    "Bob": "555-5678",
    "Charlie": "555-9012"
}
alice_phone = phone_book["Alice"]

# 2. You're mapping relationships
word_counts = {
    "python": 15,
    "code": 8,
    "function": 12
}
python_count = word_counts["python"]

# 3. Storing structured data (like JSON)
user_profile = {
    "username": "alice123",
    "settings": {
        "theme": "dark",
        "notifications": True
    }
}

# 4. Building caches or lookup tables
fibonacci_cache = {
    0: 0,
    1: 1,
    2: 1,
    3: 2,
    4: 3,
    5: 5
}

# 5. Grouping related data
product = {
    "id": "PROD-123",
    "name": "Laptop",
    "price": 999.99,
    "in_stock": True
}

# Real-world DICTIONARY examples in AI:

# API responses (usually arrive as JSON, which Python parses into dictionaries)
ai_response = {
    "text": "Hello! How can I help?",
    "confidence": 0.95,
    "tokens_used": 15,
    "model": "gpt-3.5-turbo"
}

# Feature engineering (mapping features to values)
user_features = {
    "age": 25,
    "purchase_count": 10,
    "is_premium": True,
    "last_activity": "2024-01-15"
}

# Entity tracking in NLP
entities = {
    "persons": ["Alice", "Bob"],
    "locations": ["New York", "Boston"],
    "organizations": ["OpenAI", "Google"]
}

print("Dictionary examples:")
print(f"Alice's phone: {alice_phone}")
print(f"Python word count: {python_count}")
print(f"AI response confidence: {ai_response['confidence']}")
print(f"Entities found: {len(entities)} types")
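
The `word_counts` dictionary above is typed in by hand, but in practice such a mapping is usually built from data. Here is a minimal sketch of that, using a made-up sentence and `dict.get()` with a default of `0`.

```python
text = "python code python function code python"

word_counts = {}
for word in text.split():
    # get() returns 0 the first time a word is seen, instead of raising KeyError
    word_counts[word] = word_counts.get(word, 0) + 1

print(word_counts)  # {'python': 3, 'code': 2, 'function': 1}
```

The same "look up, fall back to a default, then update" pattern works for any kind of tally.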


In [None]:
# From: when_to_use_sets.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# when_to_use_sets.py - When to choose sets for your data

# 1. You need only unique items
unique_visitors = set()
unique_visitors.add("user123")
unique_visitors.add("user456")
unique_visitors.add("user123")  # Won't be added again
print(unique_visitors)  # {'user123', 'user456'} (order may vary)

# 2. You need FAST membership testing
valid_commands = {"start", "stop", "pause", "resume"}
user_input = "start"
if user_input in valid_commands:  # Super fast!
    print("Valid command")

# 3. You need set operations (union, intersection, difference)
skills_required = {"Python", "SQL", "ML"}
skills_candidate = {"Python", "SQL", "Java", "ML"}
has_all_required = skills_required.issubset(skills_candidate)  # True

# 4. Removing duplicates from data
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_numbers = list(set(numbers))  # Duplicates removed, e.g. [1, 2, 3, 4] (set order is not guaranteed)

# 5. Finding commonalities or differences
team_a = {"Python", "JavaScript", "Go"}
team_b = {"Python", "Java", "C++"}
common_languages = team_a & team_b  # {'Python'}

# Real-world SET examples in AI:

# Tracking unique entities in conversations
mentioned_topics = set()
mentioned_topics.add("weather")
mentioned_topics.add("sports")
mentioned_topics.add("weather")  # Duplicate ignored

# Vocabulary in NLP
text = "the cat sat on the mat"
vocabulary = set(text.split())  # Unique words only

# Feature selection
all_features = {"f1", "f2", "f3", "f4", "f5"}
selected_features = {"f1", "f3", "f5"}
excluded_features = all_features - selected_features

print("Set examples:")
print(f"Unique visitors: {unique_visitors}")
print(f"Has all required skills: {has_all_required}")
print(f"Common languages: {common_languages}")
print(f"Vocabulary size: {len(vocabulary)}")
print(f"Excluded features: {excluded_features}")
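
One thing to keep in mind with `list(set(...))`: a set does not promise any particular order, so the original sequence can be lost. When order matters, one option is `dict.fromkeys()`, because dictionary keys are unique and keep insertion order (Python 3.7+). The `events` list below is made up for illustration.

```python
events = ["login", "view", "login", "cart", "view", "purchase"]

# set() removes duplicates but does not promise any particular order
unique_unordered = set(events)

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(events))
print(unique_in_order)  # ['login', 'view', 'cart', 'purchase']
```

Both results contain the same four items; only the second one is guaranteed to list them in first-seen order.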


In [None]:
# From: combining_structures.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# combining_structures.py - Combining data structures for maximum power

# 1. LIST OF DICTIONARIES - Perfect for records
users = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Charlie", "age": 35}
]

# Easy to iterate
for user in users:
    print(f"{user['name']} is {user['age']} years old")

# 2. DICTIONARY OF LISTS - Perfect for grouping
students_by_grade = {
    "A": ["Alice", "Amy", "Anna"],
    "B": ["Bob", "Bill", "Betty"],
    "C": ["Charlie", "Carl"]
}

# Easy to access groups
a_students = students_by_grade["A"]

# 3. DICTIONARY OF SETS - Perfect for unique groupings
user_permissions = {
    "admin": {"read", "write", "delete", "modify"},
    "editor": {"read", "write", "modify"},
    "viewer": {"read"}
}

# Check permissions
if "delete" in user_permissions.get("editor", set()):
    print("Editor can delete")
else:
    print("Editor cannot delete")

# 4. LIST OF TUPLES - Perfect for paired data
coordinates = [
    (10, 20),
    (30, 40),
    (50, 60)
]
for x, y in coordinates:
    print(f"Point at ({x}, {y})")

# 5. DICTIONARY OF DICTIONARIES - Perfect for nested data
company_data = {
    "employees": {
        "alice": {"role": "developer", "salary": 70000},
        "bob": {"role": "designer", "salary": 65000}
    },
    "departments": {
        "engineering": {"head": "alice", "budget": 500000},
        "design": {"head": "bob", "budget": 200000}
    }
}

# 6. SET OF TUPLES - Perfect for unique pairs
edges = {
    ("A", "B"),
    ("B", "C"),
    ("A", "C")
}
print(f"Graph has {len(edges)} edges")

print("\nCombining structures examples:")
print(f"Number of users: {len(users)}")
print(f"A students: {a_students}")
print(f"Admin permissions: {user_permissions['admin']}")
print(f"Alice's role: {company_data['employees']['alice']['role']}")
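
A dictionary of lists such as `students_by_grade` is often built from flat records rather than typed by hand. The sketch below shows one way to do that with `dict.setdefault()`. The `records` list is made-up sample data.

```python
# Made-up (name, grade) pairs for illustration
records = [("Alice", "A"), ("Bob", "B"), ("Amy", "A"), ("Carl", "C"), ("Bill", "B")]

students_by_grade = {}
for name, grade in records:
    # setdefault() creates the empty list the first time a grade appears
    students_by_grade.setdefault(grade, []).append(name)

print(students_by_grade)
# {'A': ['Alice', 'Amy'], 'B': ['Bob', 'Bill'], 'C': ['Carl']}
```

Swapping `[]` for `set()` and `append` for `add` gives a dictionary of sets built the same way.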


In [None]:
# From: performance_comparison.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# performance_comparison.py - Comparing performance of different data structures

import time

# Setup
n = 100000
test_list = list(range(n))
test_set = set(range(n))
test_dict = {}
for i in range(n):
    test_dict[i] = i

# MEMBERSHIP TESTING
search_value = n - 1

# List (slow for large collections)
start = time.time()
result = search_value in test_list
list_time = time.time() - start

# Set (always fast)
start = time.time()
result = search_value in test_set
set_time = time.time() - start

# Dictionary (always fast)
start = time.time()
result = search_value in test_dict
dict_time = time.time() - start

print(f"Membership testing for {n} items:")
print(f"  List: {list_time:.6f} seconds")
print(f"  Set: {set_time:.6f} seconds")
print(f"  Dict: {dict_time:.6f} seconds")

# ADDING ITEMS
# List append (fast)
start = time.time()
test_list.append(n)
print(f"\nAdding one item:")
print(f"  List append: {time.time() - start:.6f} seconds")

# Set add (fast)
start = time.time()
test_set.add(n)
print(f"  Set add: {time.time() - start:.6f} seconds")

# Dictionary assignment (fast)
start = time.time()
test_dict[n] = n
print(f"  Dict assign: {time.time() - start:.6f} seconds")



In [None]:
# From: decision_flowchart.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# decision_flowchart.py - Step-by-step decision process for choosing data structures
# Demonstrates the decision process without functions

print("=" * 40)
print("DATA STRUCTURE DECISION FLOWCHART")
print("=" * 40)

# The decision process:
# 1. Do you need key-value pairs? -> DICTIONARY
# 2. Do you need uniqueness? -> SET
# 3. Will the data change? -> LIST (if yes) or TUPLE (if no)

# Scenario 1: User profiles with lookup by ID
print("\nScenario 1: User profiles with lookup by ID")
print("  Question: Do you need key-value pairs? YES")
print("  Answer: Use DICTIONARY")
example_1 = {"user_123": {"name": "Alice", "age": 25}}
print(f"  Example: {example_1}")

# Scenario 2: Unique visitor tracking
print("\nScenario 2: Unique visitor tracking")
print("  Question: Do you need key-value pairs? NO")
print("  Question: Do you need uniqueness? YES")
print("  Answer: Use SET")
example_2 = {"visitor_1", "visitor_2", "visitor_3"}
print(f"  Example: {example_2}")

# Scenario 3: Shopping cart that changes
print("\nScenario 3: Shopping cart that changes")
print("  Question: Do you need key-value pairs? NO")
print("  Question: Do you need uniqueness? NO")
print("  Question: Will the data change? YES")
print("  Answer: Use LIST")
example_3 = ["apple", "banana", "milk"]
print(f"  Example: {example_3}")

# Scenario 4: GPS coordinates that never change
print("\nScenario 4: GPS coordinates that never change")
print("  Question: Do you need key-value pairs? NO")
print("  Question: Do you need uniqueness? NO")
print("  Question: Will the data change? NO")
print("  Answer: Use TUPLE")
example_4 = (37.7749, -122.4194)
print(f"  Example: {example_4}")

# Interactive decision helper
print("\n" + "=" * 40)
print("INTERACTIVE DECISION HELPER")
print("=" * 40)

needs_key_value = input("Do you need key-value pairs? (yes/no): ").lower().strip() == "yes"

if needs_key_value:
    print("\nRecommendation: Use DICTIONARY")
    print("Example: my_dict = {'key': 'value'}")
else:
    needs_unique = input("Do you need uniqueness (no duplicates)? (yes/no): ").lower().strip() == "yes"

    if needs_unique:
        print("\nRecommendation: Use SET")
        print("Example: my_set = {1, 2, 3}")
    else:
        will_change = input("Will the data change after creation? (yes/no): ").lower().strip() == "yes"

        if will_change:
            print("\nRecommendation: Use LIST")
            print("Example: my_list = [1, 2, 3]")
        else:
            print("\nRecommendation: Use TUPLE")
            print("Example: my_tuple = (1, 2, 3)")

print("=" * 40)


In [None]:
# From: common_mistakes.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# common_mistakes.py - Common mistakes and how to avoid them

# MISTAKE 1: Using lists for lookups (slow!)
# Bad:
users_list = ["alice", "bob", "charlie"]
if "charlie" in users_list:  # Has to check each item
    print("Found (slow way)")

# Good:
users_set = {"alice", "bob", "charlie"}
if "charlie" in users_set:  # Instant lookup
    print("Found (fast way)")

# MISTAKE 2: Using dictionaries when order matters
# Bad (fragile before Python 3.7):
steps_dict = {
    1: "Wash vegetables",
    2: "Cut vegetables",
    3: "Cook vegetables"
}
# Dictionaries only guarantee insertion order since Python 3.7,
# and numbered keys just imitate what a list already gives you.

# Good:
steps_list = [
    "Wash vegetables",
    "Cut vegetables",
    "Cook vegetables"
]
print(f"Step 1: {steps_list[0]}")

# MISTAKE 3: Modifying a list while iterating
# Bad:
numbers = [1, 2, 3, 4, 5]
# This would cause problems:
# for n in numbers:
#     if n % 2 == 0:
#         numbers.remove(n)  # Dangerous!

# Good:
numbers = [1, 2, 3, 4, 5]
odd_numbers = []
for n in numbers:
    if n % 2 != 0:
        odd_numbers.append(n)
print(f"Odd numbers: {odd_numbers}")

# MISTAKE 4: Not using sets for uniqueness
# Bad:
seen = []
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
for item in data:
    if item not in seen:  # Slow for large lists
        seen.append(item)

# Good:
seen_set = set()
for item in data:
    seen_set.add(item)  # Automatically handles uniqueness
print(f"Unique items: {seen_set}")

# MISTAKE 5: Forgetting tuples can't be modified
# Bad:
config = ("localhost", 5432)
# config[0] = "remotehost"  # This would cause an error!

# Good:
config_list = ["localhost", 5432]  # If you need to modify
config_list[0] = "remotehost"
print(f"Modified config: {config_list}")

print("\nRemember: Choose the right tool for the job!")
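
Mistake 3 is worth seeing in action, because the loop does not crash; it quietly gives a wrong answer. The sketch below uses a slightly different list (with two 2s in a row) to make the problem visible, then shows one common fix: looping over a copy made with `[:]`. The names `buggy` and `safe` are chosen for this example.

```python
# Removing while iterating: the loop skips the item after each removal
buggy = [1, 2, 2, 3, 4]
for n in buggy:
    if n % 2 == 0:
        buggy.remove(n)  # later items shift left while the loop keeps advancing
print(buggy)  # [1, 2, 3] - one even number survived!

# Iterating over a copy keeps the loop stable while the original changes
safe = [1, 2, 2, 3, 4]
for n in safe[:]:
    if n % 2 == 0:
        safe.remove(n)
print(safe)  # [1, 3]
```

Building a new list, as in the "Good" version above, is usually the clearest option; the copy trick is handy when the original list itself must be changed.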


In [None]:
# From: recommendation_system_design.py

# From: Zero to AI Agent, Chapter 4, Section 4.6
# recommendation_system_design.py - Choosing the right data structures for a recommendation system

# CHOOSING DATA STRUCTURES FOR A RECOMMENDATION SYSTEM

# 1. User profiles - Need key-value lookup by user_id
#    → DICTIONARY
user_profiles = {
    "user123": {
        "name": "Alice",
        "joined": "2024-01-01"
    },
    "user456": {
        "name": "Bob",
        "joined": "2024-01-15"
    }
}

# 2. User interests - Need unique items per user
#    → DICTIONARY OF SETS
user_interests = {
    "user123": {"python", "ai", "web"},
    "user456": {"javascript", "web", "mobile"}
}

# 3. Content items - Need to iterate in order
#    → LIST OF DICTIONARIES
content = [
    {"id": "c1", "title": "Learn Python", "tags": {"python", "programming"}},
    {"id": "c2", "title": "Web Development", "tags": {"web", "javascript"}},
    {"id": "c3", "title": "AI Basics", "tags": {"ai", "python"}}
]

# 4. View history - Order matters for recency
#    → DICTIONARY OF LISTS
view_history = {
    "user123": ["c1", "c3"],  # Ordered by time
    "user456": ["c2"]
}

# 5. Cached recommendations - Fast lookup, immutable
#    → DICTIONARY WITH TUPLE VALUES
cached_recommendations = {
    "user123": ("c2",),  # Tuple of recommendations
    "user456": ("c1", "c3")
}

# Using our chosen structures effectively
print("Generating recommendations for user123:")

# Get user interests (DICTIONARY lookup that returns a SET)
interests = user_interests["user123"]

# Convert the view history LIST to a SET for fast membership checks
viewed = set(view_history["user123"])

# Score each content item
recommendations = []
for item in content:
    content_id = item["id"]
    if content_id not in viewed:  # Fast SET lookup
        # Calculate relevance (SET intersection)
        relevance = len(interests & item["tags"])
        if relevance > 0:
            recommendations.append((content_id, relevance))

# Sort and display
recommendations.sort(key=lambda x: x[1], reverse=True)
print(f"Recommendations: {recommendations}")

print("\nData structure choices:")
print("- User profiles: Dictionary (fast lookup by ID)")
print("- User interests: Dict of sets (unique interests per user)")
print("- Content: List of dicts (ordered, detailed items)")
print("- View history: Dict of lists (ordered history per user)")
print("- Cached: Dict of tuples (immutable recommendations)")


---
### Section 4.6 Exercises

### Exercise 4.6.1: Task Management System

Design data structures for a task management system that needs to:
1. Store tasks with title, priority, and status
2. Track unique tags across all tasks
3. Maintain task order by creation time
4. Quickly look up tasks by ID
5. Group tasks by status

In [None]:
# Your code here


### Exercise 4.6.2: Quiz Application

Choose data structures for a quiz app that needs to:
1. Store questions with multiple choice answers
2. Track which questions a user has answered
3. Maintain question order
4. Store correct answers securely
5. Calculate scores quickly

In [None]:
# Your code here


### Exercise 4.6.3: Social Network Features

Pick data structures for social features that need to:
1. Store user friendships (bidirectional)
2. Track unique hashtags in posts
3. Maintain a feed in chronological order
4. Store user metadata
5. Find mutual friends efficiently

In [None]:
# Your code here


---
## Section 4.7: List comprehensions (gentle introduction)

In [None]:
# From: first_comprehension.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# first_comprehension.py - Your first list comprehension

# The way you know - using a loop
squares_loop = []
for x in range(5):
    squares_loop.append(x ** 2)
print("Using loop:", squares_loop)

# The SAME thing using a list comprehension
squares_comp = [x ** 2 for x in range(5)]
print("Using comprehension:", squares_comp)

# They produce the exact same result!
# [0, 1, 4, 9, 16]

# Anatomy of a list comprehension:
# [expression for item in iterable]
#  ↑          ↑        ↑
#  │          │        └── Where the items come from (range, list, etc.)
#  │          └──────────── Variable name for each item
#  └──────────────────────── What to do with each item

# More examples to build intuition
numbers = [1, 2, 3, 4, 5]

# Double each number
doubled = [n * 2 for n in numbers]
print("Doubled:", doubled)  # [2, 4, 6, 8, 10]

# Convert to strings
strings = [str(n) for n in numbers]
print("As strings:", strings)  # ['1', '2', '3', '4', '5']

# Create a list of lengths
words = ["hi", "hello", "python", "ai"]
lengths = [len(word) for word in words]
print("Word lengths:", lengths)  # [2, 5, 6, 2]


In [None]:
# From: comprehension_with_conditions.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# comprehension_with_conditions.py - Adding conditions to filter lists

# The loop way you know
evens_loop = []
for x in range(10):
    if x % 2 == 0:
        evens_loop.append(x)
print("Evens (loop):", evens_loop)

# The comprehension way with a condition
evens_comp = [x for x in range(10) if x % 2 == 0]
print("Evens (comprehension):", evens_comp)

# Both give: [0, 2, 4, 6, 8]

# The pattern with conditions:
# [expression for item in iterable if condition]
#  ↑          ↑        ↑           ↑
#  │          │        │           └── Only include if this is True
#  │          │        └──────────────── Where items come from
#  │          └────────────────────────── Variable for each item
#  └────────────────────────────────────── What to do with item

# More examples with conditions
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Only positive numbers (every number in this list qualifies, so nothing is removed)
positives = [n for n in numbers if n > 0]
print("Positives:", positives)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Only numbers greater than 5
big_numbers = [n for n in numbers if n > 5]
print("Greater than 5:", big_numbers)  # [6, 7, 8, 9, 10]

# Squares of only even numbers
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print("Squares of evens:", even_squares)  # [4, 16, 36, 64, 100]

# Words that are longer than 3 characters
words = ["hi", "hello", "python", "ai", "code"]
long_words = [word for word in words if len(word) > 3]
print("Long words:", long_words)  # ['hello', 'python', 'code']
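
A condition can appear in two places in a comprehension, and its position changes the meaning. An `if` at the end filters items out, while an `if ... else` at the front keeps every item and chooses a value for it. A short sketch with made-up numbers:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Filter: an `if` at the end drops items that fail the test
only_evens = [n for n in numbers if n % 2 == 0]
print("Filtered:", only_evens)  # [2, 4, 6]

# Choose: an `if ... else` at the front keeps every item
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]
print("Labeled:", labels)  # ['odd', 'even', 'odd', 'even', 'odd', 'even']
```

The filtered list is shorter than the input; the labeled list always has the same length as the input.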


In [None]:
# From: transforming_data.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# transforming_data.py - Real-world examples of data transformation

# Example 1: Processing user input
user_inputs = ["  Hello  ", "WORLD ", " python", "  AI  "]

# Clean and lowercase all inputs
cleaned = [text.strip().lower() for text in user_inputs]
print("Cleaned inputs:", cleaned)  # ['hello', 'world', 'python', 'ai']

# Example 2: Extracting data
users = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35}
]

# Get all names
names = [user["name"] for user in users]
print("Names:", names)  # ['Alice', 'Bob', 'Charlie']

# Get names of users over 30 (Alice is exactly 30, so she is not included)
over_30 = [user["name"] for user in users if user["age"] > 30]
print("Over 30:", over_30)  # ['Charlie']

# Example 3: File extensions
files = ["photo.jpg", "document.pdf", "image.png", "script.py", "data.csv"]

# Get only image files
images = [f for f in files if f.endswith((".jpg", ".png"))]
print("Images:", images)  # ['photo.jpg', 'image.png']

# Example 4: Converting temperatures
celsius = [0, 20, 30, 100]
fahrenheit = [c * 9/5 + 32 for c in celsius]
print("Fahrenheit:", fahrenheit)  # [32.0, 68.0, 86.0, 212.0]


In [None]:
# From: nested_comprehensions.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# nested_comprehensions.py - Nested loops in comprehensions

# The loop way
pairs_loop = []
for x in [1, 2, 3]:
    for y in ['a', 'b']:
        pairs_loop.append((x, y))
print("Pairs (loop):", pairs_loop)

# The comprehension way
pairs_comp = [(x, y) for x in [1, 2, 3] for y in ['a', 'b']]
print("Pairs (comprehension):", pairs_comp)

# Both give: [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]

# Practical example: Multiplication table
table = [i * j for i in range(1, 4) for j in range(1, 4)]
print("Multiplication:", table)  # [1, 2, 3, 2, 4, 6, 3, 6, 9]

# With better formatting
table_formatted = [(i, j, i*j) for i in range(1, 4) for j in range(1, 4)]
print("\nMultiplication table:")
for i, j, result in table_formatted:
    print(f"{i} × {j} = {result}")
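
The `for` clauses in a comprehension run in the same order as the nested loops they replace, left to right, which is why the results above are flat lists. Putting a second comprehension inside the expression gives one inner list per row instead. A small sketch over the same 1 to 3 range:

```python
# Flat: two `for` clauses, outer loop first
flat = [i * j for i in range(1, 4) for j in range(1, 4)]
print("Flat:", flat)  # [1, 2, 3, 2, 4, 6, 3, 6, 9]

# Grid: the inner comprehension builds one row for each value of i
grid = [[i * j for j in range(1, 4)] for i in range(1, 4)]
print("Grid:")
for row in grid:
    print(row)
# [1, 2, 3]
# [2, 4, 6]
# [3, 6, 9]
```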


In [None]:
# From: when_not_to_use.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# when_not_to_use.py - When NOT to use list comprehensions

# TOO COMPLEX - Hard to read!
# Don't do this:
# result = [x if x > 0 else -x if x < 0 else 0 for x in numbers if x != 5]

# Better as a regular loop:
numbers = [-3, -1, 0, 1, 5, 7]
result = []
for x in numbers:
    if x != 5:
        if x > 0:
            result.append(x)
        elif x < 0:
            result.append(-x)
        else:
            result.append(0)
print("Result:", result)

# SIDE EFFECTS - Don't use comprehensions just for side effects
# Bad - using comprehension just to print (creates useless list):
# [print(x) for x in range(5)]  # Don't do this!

# Good - use a regular loop for side effects:
print("Printing with loop (correct way):")
for x in range(5):
    print(x, end=" ")
print()

# TOO LONG - If it doesn't fit on one line, use a loop
# Bad - too long to read easily:
# data = [complicated_expression(x) * another_function(x) / some_calculation(x) for x in very_long_iterable_name if complex_condition(x) and another_condition(x)]

# Good - break it up:
items = [1, 2, 3, 4, 5]
data = []
for x in items:
    if x > 2 and x < 5:  # condition1 and condition2
        value = x * 2  # expression
        data.append(value)
print("Clean loop result:", data)

print("\nRemember: Readability counts! If it's hard to read, use a regular loop.")


In [None]:
# From: ai_data_science.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# ai_data_science.py - List comprehensions for AI and data science

# Preparing text data for NLP
raw_texts = ["Hello World!", "  Python AI  ", "Machine Learning"]

# Clean and tokenize
processed = [text.strip().lower().split() for text in raw_texts]
print("Tokenized:", processed)
# [['hello', 'world!'], ['python', 'ai'], ['machine', 'learning']]

# Filter out short words
filtered = [[word for word in text if len(word) > 2] for text in processed]
print("Filtered:", filtered)
# [['hello', 'world!'], ['python'], ['machine', 'learning']]

# Creating feature vectors
words = ["python", "ai", "machine", "learning"]
vocabulary = ["python", "java", "ai", "machine", "learning", "code"]

# Binary features (1 if word in vocabulary, 0 if not)
features = [1 if word in words else 0 for word in vocabulary]
print("Features:", features)  # [1, 0, 1, 1, 1, 0]

# Batch processing
data = list(range(20))
batch_size = 5

# Create batches
batches = [data[i:i+batch_size] for i in range(0, len(data), batch_size)]
print("Batches:", batches)
# [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19]]

# Normalizing scores
scores = [85, 92, 78, 95, 88]
max_score = max(scores)
normalized = [score / max_score for score in scores]
print("Normalized:", normalized)  # each score divided by 95; the top score becomes 1.0
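
Slicing past the end of a list is safe in Python, so the batching pattern above also works when the data doesn't divide evenly: the last batch is simply shorter. A quick check with a hypothetical 7-item list:

```python
data = list(range(7))
batch_size = 3

# The final slice, data[6:9], returns only the one remaining item
batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
print("Batches:", batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```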


In [None]:
# From: other_comprehensions.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# other_comprehensions.py - Dictionary and set comprehensions

# Dictionary comprehension
numbers = [1, 2, 3, 4, 5]

# Create a dictionary of squares
squares_dict = {n: n**2 for n in numbers}
print("Squares dict:", squares_dict)
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Word lengths dictionary
words = ["hello", "world", "python"]
word_lengths = {word: len(word) for word in words}
print("Word lengths:", word_lengths)
# {'hello': 5, 'world': 5, 'python': 6}

# Filtering in dictionary comprehension
scores = {"Alice": 85, "Bob": 92, "Charlie": 78, "Diana": 95}
high_scores = {name: score for name, score in scores.items() if score > 90}
print("High scores:", high_scores)
# {'Bob': 92, 'Diana': 95}

# Set comprehension
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]

# Create a set of unique squares
unique_squares = {n**2 for n in numbers}
print("Unique squares:", unique_squares)
# {1, 4, 9, 16}  (sets are unordered, so the display order may differ)

# Words containing 'a'
words = ["apple", "banana", "cherry", "date"]
words_with_a = {word for word in words if 'a' in word}
print("Words with 'a':", words_with_a)
# {'apple', 'banana', 'date'}  (sets are unordered, so the display order may differ)
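
In a dictionary comprehension, both the key and the value are expressions, so either side can be transformed. A minimal sketch reusing the scores from above, with a pass mark of 80 chosen purely for illustration:

```python
scores = {"Alice": 85, "Bob": 92, "Charlie": 78, "Diana": 95}

# Transform the values: True if the score meets the (hypothetical) pass mark
passed = {name: score >= 80 for name, score in scores.items()}
print("Passed:", passed)
# {'Alice': True, 'Bob': True, 'Charlie': False, 'Diana': True}

# Transform the keys: lowercase every name
lowercase_scores = {name.lower(): score for name, score in scores.items()}
print("Lowercase keys:", lowercase_scores)
# {'alice': 85, 'bob': 92, 'charlie': 78, 'diana': 95}
```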


In [None]:
# From: performance_test.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# performance_test.py - Testing if comprehensions are faster than loops

import time

# Large dataset
n = 1000000
data = list(range(n))

# Using a loop
start = time.perf_counter()  # perf_counter() is the clock intended for timing short intervals
squares_loop = []
for x in data:
    squares_loop.append(x ** 2)
loop_time = time.perf_counter() - start

# Using comprehension
start = time.perf_counter()
squares_comp = [x ** 2 for x in data]
comp_time = time.perf_counter() - start

print(f"Loop time: {loop_time:.3f} seconds")
print(f"Comprehension time: {comp_time:.3f} seconds")
print(f"Loop time / comprehension time: {loop_time/comp_time:.1f}x")

# In CPython, comprehensions are usually somewhat faster
# (the exact ratio varies by machine and Python version) because:
# 1. They compile to a dedicated bytecode instruction for adding each item
# 2. There is no lookup of the append attribute on every iteration
# 3. There is no method-call overhead on every iteration
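
A single timed run is noisy, because other programs on the machine can slow either measurement. The standard library's `timeit` module repeats a measurement several times so the best run can be taken. The sketch below is one way to use it; the smaller list size and the repeat counts are arbitrary choices to keep the run short, and the timings will differ from machine to machine.

```python
import timeit

data = list(range(10_000))

# The loop version, written as a string so timeit can run it repeatedly
loop_stmt = """
result = []
for x in data:
    result.append(x ** 2)
"""
comp_stmt = "result = [x ** 2 for x in data]"

# Each measurement runs the statement 20 times; keep the best of 5 measurements
loop_time = min(timeit.repeat(loop_stmt, globals={"data": data}, number=20, repeat=5))
comp_time = min(timeit.repeat(comp_stmt, globals={"data": data}, number=20, repeat=5))

print(f"Loop:          {loop_time:.4f} seconds")
print(f"Comprehension: {comp_time:.4f} seconds")
```

Taking the minimum rather than the average is a common convention: the fastest run is the one least disturbed by other activity on the machine.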


In [None]:
# From: common_patterns_lc.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# common_patterns_lc.py - Common patterns and recipes

# Pattern 1: Flatten a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = [item for sublist in nested for item in sublist]
print("Flattened:", flat)  # [1, 2, 3, 4, 5, 6]

# Pattern 2: Remove None values
data = [1, None, 2, None, 3, 4, None, 5]
clean = [x for x in data if x is not None]
print("Without None:", clean)  # [1, 2, 3, 4, 5]

# Pattern 3: Extract from nested structures
data = [
    {"name": "Alice", "score": 85},
    {"name": "Bob", "score": 92},
    {"name": "Charlie", "score": 78}
]
passing = [d["name"] for d in data if d["score"] >= 80]
print("Passing students:", passing)  # ['Alice', 'Bob']

# Pattern 4: Create enumerated pairs
items = ["a", "b", "c"]
enumerated = [(i, item) for i, item in enumerate(items)]
print("Enumerated:", enumerated)  # [(0, 'a'), (1, 'b'), (2, 'c')]

# Pattern 5: Zip multiple lists
names = ["Alice", "Bob", "Charlie"]
ages = [30, 25, 35]
combined = [(name, age) for name, age in zip(names, ages)]
print("Combined:", combined)  # [('Alice', 30), ('Bob', 25), ('Charlie', 35)]

# Pattern 6: String operations
sentences = ["Hello world", "Python programming", "AI is cool"]
word_counts = [len(s.split()) for s in sentences]
print("Word counts:", word_counts)  # [2, 2, 3]

# Pattern 7: Working with ranges
# Even numbers from 0 to 20
evens = [x for x in range(21) if x % 2 == 0]
print("Evens:", evens)

# Squares of numbers from 1 to 10
squares = [x**2 for x in range(1, 11)]
print("Squares:", squares)
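
Patterns 4 and 5 are written as comprehensions to show the unpacking, but when nothing is transformed or filtered, passing `enumerate()` or `zip()` straight to `list()` gives the same result. A short comparison using the same sample lists:

```python
items = ["a", "b", "c"]
names = ["Alice", "Bob", "Charlie"]
ages = [30, 25, 35]

# Same results as Patterns 4 and 5, without a comprehension
enumerated = list(enumerate(items))
combined = list(zip(names, ages))
print("Enumerated:", enumerated)  # [(0, 'a'), (1, 'b'), (2, 'c')]
print("Combined:", combined)      # [('Alice', 30), ('Bob', 25), ('Charlie', 35)]

# A comprehension becomes useful once each pair is transformed or filtered
numbered = [f"{i + 1}. {item}" for i, item in enumerate(items)]
print("Numbered:", numbered)  # ['1. a', '2. b', '3. c']
```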


In [None]:
# From: writing_tips.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# writing_tips.py - Tips for writing good comprehensions

# Good examples:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
words = ["hello", "world", "python", "programming", "ai"]
people = [
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 17},
    {"name": "Charlie", "age": 30}
]

# 1. Keep them simple
squares = [x**2 for x in range(10)]
print("Simple squares:", squares[:5])

# 2. Clear variable names
uppercase = [word.upper() for word in words]
print("Uppercase words:", uppercase[:3])

# 3. Readable conditions
adults = [person for person in people if person["age"] >= 18]
print("Adults:", [p["name"] for p in adults])

# Bad examples (too complex):
# Don't do this - too hard to read:
# result = [[y for y in row if y > 0] for row in matrix if sum(row) > 10]

# When in doubt, write the loop first, then convert:
print("\nLoop to comprehension conversion:")

# Step 1: Write the loop
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result_loop = []
for x in data:
    if x % 2 == 0:
        result_loop.append(x * 2)
print("Loop result:", result_loop)

# Step 2: Convert to comprehension
result_comp = [x * 2 for x in data if x % 2 == 0]
print("Comprehension result:", result_comp)

print("\nRemember:")
print("1. Keep them simple - If you need to read it twice, it's too complex")
print("2. One line only - If it wraps to multiple lines, use a regular loop")
print("3. Clear variable names - Use 'word' not 'w', 'number' not 'n'")
print("4. Don't sacrifice readability - Shorter isn't always better")
print("5. Use them for building, not printing - They create lists, not side effects")


In [None]:
# From: text_analysis.py

# From: Zero to AI Agent, Chapter 4, Section 4.7
# text_analysis.py - Real-world text analysis with comprehensions

# Sample text data
documents = [
    "Python is great for AI and machine learning",
    "Data science requires Python skills",
    "Machine learning is part of artificial intelligence",
    "Python makes data analysis easy"
]

# Tokenize all documents (split into words)
tokenized = [doc.lower().split() for doc in documents]
print("Tokenized documents:")
for i, tokens in enumerate(tokenized):
    print(f"  Doc {i}: {tokens[:5]}...")  # Show first 5 words

# Extract all unique words (vocabulary)
all_words = [word for doc in tokenized for word in doc]
vocabulary = list(set(all_words))
print(f"\nVocabulary size: {len(vocabulary)}")

# Count word frequencies
word_freq = {}
for word in all_words:
    if word not in word_freq:
        word_freq[word] = 0
    word_freq[word] += 1

# Find common words (appear more than once)
common_words = [word for word, freq in word_freq.items() if freq > 1]
print(f"Common words: {common_words}")

# Create document vectors (1 if word appears, 0 if not)
target_words = ["python", "ai", "machine", "learning", "data"]

doc_vectors = []
for doc in tokenized:
    vector = [1 if word in doc else 0 for word in target_words]
    doc_vectors.append(vector)

print("\nDocument vectors:")
print("Words:", target_words)
for i, vector in enumerate(doc_vectors):
    print(f"Doc {i}: {vector}")

# Find documents containing "python"
python_docs = [i for i, doc in enumerate(tokenized) if "python" in doc]
print(f"\nDocuments containing 'python': {python_docs}")
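
The counting loop above is worth writing by hand once. The standard library also offers `collections.Counter`, which produces the same kind of tally in one call. This sketch uses a shortened, hypothetical document list rather than the full sample above:

```python
from collections import Counter

documents = [
    "Python is great for AI",
    "Data science requires Python skills",
]

# Flatten every document into one list of lowercase words
all_words = [word for doc in documents for word in doc.lower().split()]

# Counter behaves like a dictionary of word -> count
word_freq = Counter(all_words)

common_words = [word for word, freq in word_freq.items() if freq > 1]
print("Common words:", common_words)          # ['python']
print("Top word:", word_freq.most_common(1))  # [('python', 2)]
```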


In [None]:
# From: datamaster.py

# From: Zero to AI Agent, Chapter 4 Challenge Project
# datamaster.py - Your Personal Data Analysis System
# A data analysis challenge using all Python data structures (without classes or functions)

print("=" * 50)
print("DATAMASTER - Personal Data Analysis System")
print("=" * 50)
print("")
print("This is a CHALLENGE PROJECT!")
print("Try to implement these features using what you've learned.")
print("")

# Initialize your data structures
datasets = {}  # Dictionary to store named datasets
unique_values = set()  # Track all unique values seen
history = []  # List of operations performed
metadata = {}  # Dictionary for dataset information
cache = {}  # Dictionary for cached results

# Sample data to work with
print("Loading sample sales data...")
sales_data = [
    {"id": 1, "product": "Widget A", "price": 29.99, "quantity": 10, "region": "North"},
    {"id": 2, "product": "Widget B", "price": 49.99, "quantity": 5, "region": "South"},
    {"id": 3, "product": "Widget A", "price": 29.99, "quantity": 8, "region": "East"},
    {"id": 4, "product": "Widget C", "price": 19.99, "quantity": 20, "region": "North"},
    {"id": 5, "product": "Widget B", "price": 49.99, "quantity": 3, "region": "West"},
]

datasets["sales"] = sales_data
metadata["sales"] = {"rows": len(sales_data), "source": "sample"}
history.append(("import", "sales", f"{len(sales_data)} records"))
print(f"Loaded {len(sales_data)} sales records")
print("")

# CHALLENGE 1: Get unique products
print("CHALLENGE 1: Find unique products")
print("-" * 40)
# Use a set to find unique products
unique_products = {record["product"] for record in sales_data}
print(f"Unique products: {unique_products}")
print("")

# CHALLENGE 2: Calculate total revenue by product
print("CHALLENGE 2: Total revenue by product")
print("-" * 40)
# Use a dictionary to accumulate totals
revenue_by_product = {}
for record in sales_data:
    product = record["product"]
    revenue = record["price"] * record["quantity"]
    revenue_by_product[product] = revenue_by_product.get(product, 0) + revenue

for product, revenue in revenue_by_product.items():
    print(f"  {product}: ${revenue:.2f}")
print("")

# CHALLENGE 3: Group sales by region
print("CHALLENGE 3: Sales by region")
print("-" * 40)
# Use dictionary with lists as values
sales_by_region = {}
for record in sales_data:
    region = record["region"]
    if region not in sales_by_region:
        sales_by_region[region] = []
    sales_by_region[region].append(record)

for region, records in sales_by_region.items():
    print(f"  {region}: {len(records)} sales")
print("")

# CHALLENGE 4: Find records with high quantities using list comprehension
print("CHALLENGE 4: High quantity sales (quantity > 5)")
print("-" * 40)
high_quantity_sales = [record for record in sales_data if record["quantity"] > 5]
for sale in high_quantity_sales:
    print(f"  {sale['product']}: {sale['quantity']} units")
print("")

# CHALLENGE 5: Calculate statistics
print("CHALLENGE 5: Price statistics")
print("-" * 40)
prices = [record["price"] for record in sales_data]
prices_sorted = sorted(prices)

total_price = 0
for p in prices:
    total_price = total_price + p

min_price = prices_sorted[0]
max_price = prices_sorted[-1]
avg_price = total_price / len(prices)

print(f"  Minimum price: ${min_price:.2f}")
print(f"  Maximum price: ${max_price:.2f}")
print(f"  Average price: ${avg_price:.2f}")
print("")

# Show operation history
print("=" * 50)
print("OPERATION HISTORY")
print("=" * 50)
for operation, dataset, details in history:
    print(f"  [{operation}] {dataset}: {details}")

print("")
print("=" * 50)
print("YOUR CHALLENGES:")
print("=" * 50)
print("1. Add more data to the datasets dictionary")
print("2. Create a query system using list comprehensions")
print("3. Implement a caching system for expensive operations")
print("4. Add customer data and find correlations with sales")
print("5. Generate a comprehensive report using all data structures")
print("=" * 50)


---
### Section 4.7 Exercises

### Exercise 4.7.1: Data Cleaning

You have messy data that needs cleaning:
- A list of strings with extra spaces: ["  hello  ", " WORLD ", "  Python  "]
- Create a cleaned list with stripped, lowercase strings
- Filter out any strings shorter than 4 characters

In [None]:
# Your code here


### Exercise 4.7.2: Grade Processing

Given a list of student records:
```python
students = [
    {"name": "Alice", "grade": 85},
    {"name": "Bob", "grade": 92},
    {"name": "Charlie", "grade": 78},
    {"name": "Diana", "grade": 95},
    {"name": "Eve", "grade": 88}
]
```
Use comprehensions to:
- Extract all names
- Get names of students with grades `>= 90`
- Create a dictionary of name: grade pairs
- Calculate letter grades (A if `>= 90`, B if `>= 80`, C otherwise)

In [None]:
# Your code here


### Exercise 4.7.3: Number Processing

Given numbers = range(1, 21):
- Create a list of all even numbers
- Create a list of squares of odd numbers
- Create a dictionary where keys are numbers and values are "even" or "odd"
- Create a list of tuples (number, square, cube) for numbers 1-10

In [None]:
# Your code here


---
## Next Steps

- Check your answers in **chapter_04_data_structures_solutions.ipynb**
- Proceed to **Chapter 5**