# 🧪 Week 3 In-Class Python Lab
## Data Structures + `for` Loop Patterns

### Lab Goals
In this lab you will practice:
- Core Python data structures: **lists, tuples, sets, dictionaries**
- Unique traits & use cases (including **mutability vs. immutability**)
- `for` loop patterns: **Count, Sum, Accumulate, Map**
- Advanced patterns: **Filter, Find, Take, Min/Max**

> ✅ Tip: Work top-to-bottom. Keep a "scratch" cell open for experimenting.


## Setup: Sample Data
Run this cell first. We'll reuse these values throughout the lab.

In [1]:
# Sample data
transactions = [
    {"id": 101, "user": "Ava",   "amount": 19.99, "category": "books",   "city": "Boston"},
    {"id": 102, "user": "Liam",  "amount": 5.49,  "category": "coffee",  "city": "New York"},
    {"id": 103, "user": "Noah",  "amount": 120.00,"category": "tech",    "city": "Boston"},
    {"id": 104, "user": "Emma",  "amount": 5.49,  "category": "coffee",  "city": "New York"},
    {"id": 105, "user": "Olivia","amount": 45.00, "category": "books",   "city": "Chicago"},
    {"id": 106, "user": "Ava",   "amount": 9.99,  "category": "music",   "city": "Boston"},
    {"id": 107, "user": "Sophia","amount": 250.00,"category": "tech",    "city": "Seattle"},
    {"id": 108, "user": "Liam",  "amount": 15.00, "category": "books",   "city": "New York"},
]

raw_tags = ["  DATA ", "python", "SQL", "python", " Stats ", "DATA", "viz ", "sql", "PYTHON "]

product_skus = ("B-100", "C-200", "T-300", "B-101")  # tuple (immutable sequence)

print("transactions:", len(transactions))
print("raw_tags:", raw_tags)
print("product_skus:", product_skus)

transactions: 8
raw_tags: ['  DATA ', 'python', 'SQL', 'python', ' Stats ', 'DATA', 'viz ', 'sql', 'PYTHON ']
product_skus: ('B-100', 'C-200', 'T-300', 'B-101')


---
## 1) Data Structures: Traits & Use Cases

### Quick Reference
- **List**: ordered, allows duplicates, **mutable** (can change)
- **Tuple**: ordered, allows duplicates, **immutable** (cannot change)
- **Set**: unordered, **no duplicates**, mutable container (items must be hashable)
- **Dict**: key → value mapping, keys unique, **mutable**

You will see these properties show up in the problems below.
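As a rough illustration of the quick reference above, the sketch below exercises one trait of each structure. All names and values here (`scores_list`, `point`, `ages`) are hypothetical and not part of the lab data.

```python
# Hypothetical values, separate from the lab's sample data.
scores_list = [3, 1, 3]            # list: ordered, keeps duplicates
scores_list.append(7)              # mutable: changed in place

point = (4, 5)                     # tuple: ordered, immutable
try:
    point[0] = 9                   # item assignment is not allowed on a tuple
except TypeError as err:
    tuple_error = str(err)

unique_scores = set(scores_list)   # set: duplicates collapse to one item

ages = {"Ann": 30}                 # dict: key -> value
ages["Ann"] = 31                   # assigning to an existing key overwrites its value

print(scores_list)    # [3, 1, 3, 7]
print(tuple_error)
print(unique_scores == {1, 3, 7})  # True
print(ages)           # {'Ann': 31}
```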

### Problem 1A: Lists (Mutable, Ordered)

**Task:**
1. Create a list called `amounts` containing all transaction amounts.
2. Append a new amount `3.50`.
3. Replace the first value in the list with `0.0`.

This problem should demonstrate **mutability**.

In [18]:
transactions[0]

{'id': 101,
 'user': 'Ava',
 'amount': 19.99,
 'category': 'books',
 'city': 'Boston'}

In [22]:
transactions[1].items()

dict_items([('id', 102), ('user', 'Liam'), ('amount', 5.49), ('category', 'coffee'), ('city', 'New York')])

In [21]:
t = transactions[1]  # inspect one transaction; the key is 'amount', not 'amounts'
t['amount']

5.49

In [24]:
# TODO: Build amounts from transactions
amounts = []

for t in transactions:
    amounts.append(t['amount'])

# TODO: Append 3.50


# TODO: Replace the first value with 0.0

print(amounts)

[19.99, 5.49, 120.0, 5.49, 45.0, 9.99, 250.0, 15.0]


In [26]:
amounts.append(3.50)

In [27]:
amounts

[19.99, 5.49, 120.0, 5.49, 45.0, 9.99, 250.0, 15.0, 3.5]

In [29]:
amounts[0] = 0.0

In [30]:
amounts

[0.0, 5.49, 120.0, 5.49, 45.0, 9.99, 250.0, 15.0, 3.5]

In [31]:
# Check (Problem 1A)
try:
    assert isinstance(amounts, list)
    assert amounts[-1] == 3.50
    assert amounts[0] == 0.0
    print("✅ Problem 1A looks good.")
except NameError:
    print("❌ amounts is not defined yet.")
except AssertionError:
    print("❌ Check failed. Make sure you appended 3.50 and replaced the first element with 0.0.")

✅ Problem 1A looks good.


### Problem 1B: Tuples (Immutable, Ordered)

**Task:**
1. Print the first SKU in `product_skus`.
2. Try (on purpose) to change the first SKU.
3. Then do it the "right way": create a **new** tuple with an updated first SKU (e.g., change to `"B-999"`).

This should demonstrate **immutability**.
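One possible way to "update" a tuple is to build a new one from pieces of the old one. The sketch below uses a hypothetical `rgb` tuple rather than the lab's SKUs, so treat it as an illustration of the idea, not the expected answer.

```python
# Hypothetical tuple, unrelated to product_skus.
rgb = (255, 128, 0)

# A one-item tuple needs a trailing comma; slicing a tuple returns a new tuple.
darker = (200,) + rgb[1:]

print(rgb)     # (255, 128, 0)  original is untouched
print(darker)  # (200, 128, 0)
```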

In [None]:
# TODO 1: Print the first SKU

# TODO 2: Try to modify the tuple (expect an error). Uncomment to test.
# product_skus[0] = "B-999"

# TODO 3: Create a new tuple with the first SKU replaced
# updated_skus = ...

# print("original:", product_skus)
# print("updated :", updated_skus)

In [None]:
# Check (Problem 1B)
try:
    assert isinstance(updated_skus, tuple)
    assert updated_skus[0] == "B-999"
    assert product_skus[0] == "B-100"  # original unchanged
    print("✅ Problem 1B looks good.")
except NameError:
    print("❌ updated_skus is not defined yet.")
except AssertionError:
    print("❌ Check failed. Make sure you created updated_skus as a new tuple and kept product_skus unchanged.")

### Problem 1C: Sets (Unique Items)

**Task:** Clean and deduplicate the `raw_tags` list.

Requirements:
- Strip whitespace (`.strip()`)
- Standardize to lowercase (`.lower()`)
- Produce a **set** called `unique_tags`

Use case: sets are great for **membership tests** and **deduplication**.
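A minimal sketch of both use cases, on a hypothetical `emails` list (no cleaning step is shown here; that part is left to the exercise):

```python
# Hypothetical data, not the lab's raw_tags.
emails = ["a@x.com", "b@x.com", "a@x.com"]

unique_emails = set(emails)            # deduplication: repeated items collapse

print(len(emails), len(unique_emails)) # 3 2
print("b@x.com" in unique_emails)      # True  (membership test)
print("z@x.com" in unique_emails)      # False
```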

In [None]:
# TODO: Build unique_tags as a set of cleaned tags
# unique_tags = ...

# print(unique_tags)
# print("contains 'python'?", 'python' in unique_tags)

In [None]:
# Check (Problem 1C)
try:
    assert isinstance(unique_tags, set)
    expected = {"data", "python", "sql", "stats", "viz"}
    assert unique_tags == expected
    print("✅ Problem 1C looks good:", unique_tags)
except NameError:
    print("❌ unique_tags is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you stripped + lowercased and used a set.")

### Problem 1D: Dictionaries (Key → Value)

**Task:** Build a dictionary `spend_by_user` that maps each user to their **total spend**.

Example shape:
```python
{
  'Ava': 29.98,
  'Liam': 20.49,
  ...
}
```

Use case: dictionaries are ideal for **grouping** / **counting** / **totals by category**.
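One common way to total values per key is `dict.get` with a default of `0`. The sketch below assumes a hypothetical `orders` list of `(item, quantity)` pairs; it is one approach among several (an explicit `if key in d` test works too).

```python
# Hypothetical data: (item, quantity) pairs.
orders = [("pens", 2), ("ink", 1), ("pens", 3)]

units_by_item = {}
for item, qty in orders:
    # .get returns 0 the first time an item is seen, so the sum can start there.
    units_by_item[item] = units_by_item.get(item, 0) + qty

print(units_by_item)  # {'pens': 5, 'ink': 1}
```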

In [None]:
# TODO: Build spend_by_user
# spend_by_user = {}
# for t in transactions:
#     ...

# print(spend_by_user)

In [None]:
# Check (Problem 1D)
try:
    assert isinstance(spend_by_user, dict)
    # expected totals
    expected = {
        "Ava": 19.99 + 9.99,
        "Liam": 5.49 + 15.00,
        "Noah": 120.00,
        "Emma": 5.49,
        "Olivia": 45.00,
        "Sophia": 250.00
    }
    # float-safe comparison
    for k, v in expected.items():
        assert abs(spend_by_user.get(k, 0) - v) < 1e-9
    print("✅ Problem 1D looks good.")
except NameError:
    print("❌ spend_by_user is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you're summing amounts per user correctly.")

---
## 2) `for` Loop Patterns (Core)

We'll practice classic patterns you will use constantly in data analytics.


### Problem 2A: Count Pattern

**Task:** Count how many transactions happened in **Boston**.

Create an integer variable `boston_count`.
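For reference, the count pattern usually has three parts: start a counter at zero, test each item, and add one on a match. A small sketch on a hypothetical list of dice rolls:

```python
# Hypothetical data: dice rolls.
rolls = [6, 2, 6, 4, 6]

six_count = 0              # 1) start at zero
for r in rolls:
    if r == 6:             # 2) test each item
        six_count += 1     # 3) add one on a match

print(six_count)  # 3
```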

In [None]:
# TODO: Count Boston transactions
# boston_count = 0
# for t in transactions:
#     ...

# print(boston_count)

In [None]:
# Check (Problem 2A)
try:
    assert boston_count == 3
    print("✅ Problem 2A correct. boston_count =", boston_count)
except NameError:
    print("❌ boston_count is not defined yet.")
except AssertionError:
    print("❌ Check failed. Recount Boston transactions.")

### Problem 2B: Sum Pattern

**Task:** Compute the total spend across **all** transactions.

Create a float variable `total_spend`.

In [None]:
# TODO: Sum all amounts
# total_spend = 0.0
# for t in transactions:
#     ...

# print(total_spend)

In [None]:
# ✅ Check (Problem 2B)
try:
    expected_total = sum(t["amount"] for t in transactions)
    assert abs(total_spend - expected_total) < 1e-9
    print("✅ Problem 2B correct. total_spend =", total_spend)
except NameError:
    print("❌ total_spend is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you're adding each transaction amount.")

### Problem 2C: Accumulate Pattern (Running Total)

**Task:** Build a list `running_totals` where each element is the running total of amounts.

Example (if amounts were `[2, 5, 1]`):
`running_totals` would be `[2, 7, 8]`
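The small example above can be traced in code. This sketch works on the plain list `[2, 5, 1]` (the names `values`, `running`, and `current` are illustrative), so adapting it to the transaction dicts is still up to you.

```python
values = [2, 5, 1]

running = []
current = 0
for v in values:
    current += v             # update the running total first
    running.append(current)  # then record a snapshot of it

print(running)  # [2, 7, 8]
```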

In [None]:
# TODO: Running totals
# running_totals = []
# current = 0.0
# for t in transactions:
#     ...

# print(running_totals)

In [None]:
# Check (Problem 2C)
try:
    assert isinstance(running_totals, list)
    expected = []
    cur = 0.0
    for t in transactions:
        cur += t["amount"]
        expected.append(cur)
    assert len(running_totals) == len(expected)
    for a, b in zip(running_totals, expected):
        assert abs(a - b) < 1e-9
    print("✅ Problem 2C correct.")
except NameError:
    print("❌ running_totals is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you update a running variable and append each step.")

### Problem 2D: Map Pattern (Transform)

**Task:** Create a new list `amounts_cents` that converts each transaction amount to **integer cents**.

Example: `19.99 → 1999`

✅ Tip: `int(round(amount * 100))` helps avoid float artifacts.
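To see why the rounding step matters, compare truncating with rounding for a single value. On standard double-precision floats, `19.99 * 100` lands just below 1999, so `int()` alone drops a cent. The variable names here are illustrative.

```python
amount = 19.99

truncated = int(amount * 100)        # int() cuts off the fractional part
rounded = int(round(amount * 100))   # round() goes to the nearest whole number first

print(amount * 100)        # slightly less than 1999
print(truncated, rounded)  # 1998 1999
```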

In [None]:
# TODO: Map amounts to cents
# amounts_cents = []
# for t in transactions:
#     ...

# print(amounts_cents[:5])

In [None]:
# ✅ Check (Problem 2D)
try:
    expected = [int(round(t["amount"] * 100)) for t in transactions]
    assert amounts_cents == expected
    print("✅ Problem 2D correct.")
except NameError:
    print("❌ amounts_cents is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you're converting each amount to integer cents correctly.")

---
## 3) `for` Loop Patterns (Advanced)

These patterns are extremely common when scanning data.


### Problem 3A: Filter Pattern

**Task:** Build a list `large_txns` containing only transactions with amount **>= 50**.

Then print their IDs.
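The filter pattern keeps matching items and preserves their original order. A sketch on a hypothetical list of sensor readings:

```python
# Hypothetical data: sensor readings.
readings = [12, 48, 7, 95, 50]

high_readings = []
for r in readings:
    if r >= 50:                  # keep only items that pass the test
        high_readings.append(r)

print(high_readings)  # [95, 50]
```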

In [None]:
# TODO: Filter large transactions
# large_txns = []
# for t in transactions:
#     ...

# print([t['id'] for t in large_txns])

In [None]:
# ✅ Check (Problem 3A)
try:
    ids = [t["id"] for t in large_txns]
    assert ids == [103, 107]
    print("✅ Problem 3A correct. large_txns IDs =", ids)
except NameError:
    print("❌ large_txns is not defined yet.")
except AssertionError:
    print("❌ Check failed. Filter should keep only amounts >= 50.")

### Problem 3B: Find Pattern (First Match)

**Task:** Find the **first** transaction in New York with category `"books"`.

Store the result in `first_ny_books` (or `None` if not found).

✅ Hint: use a loop + `break` once found.
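A sketch of the loop-plus-`break` idea on a hypothetical word list. Starting from `None` means the result stays `None` when nothing matches.

```python
# Hypothetical data: a list of words.
words = ["kiwi", "plum", "apple", "apricot"]

first_a_word = None
for w in words:
    if w.startswith("a"):
        first_a_word = w
        break                 # stop scanning: later matches are ignored

print(first_a_word)  # apple
```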

In [None]:
# TODO: Find first NY books transaction
# first_ny_books = None
# for t in transactions:
#     ...

# print(first_ny_books)

In [None]:
# ✅ Check (Problem 3B)
try:
    assert first_ny_books is not None
    assert first_ny_books["id"] == 108
    print("✅ Problem 3B correct. Found ID:", first_ny_books["id"])
except NameError:
    print("❌ first_ny_books is not defined yet.")
except AssertionError:
    print("❌ Check failed. The first NY books transaction should be ID 108.")

### Problem 3C: Take Pattern (First N Matches)

**Task:** Collect the **first 2** transactions in category `"coffee"`.

Store them in a list called `first_two_coffee`.

✅ Hint: append matches and stop when length reaches 2.
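The same early-exit idea extends from one match to N matches. A sketch on a hypothetical list of integers, taking the first two even numbers:

```python
# Hypothetical data: integers.
numbers = [3, 8, 5, 12, 7, 10]

first_two_even = []
for n in numbers:
    if n % 2 == 0:
        first_two_even.append(n)
        if len(first_two_even) == 2:
            break             # enough matches collected; 10 is never examined

print(first_two_even)  # [8, 12]
```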

In [None]:
# TODO: Take first 2 coffee transactions
# first_two_coffee = []
# for t in transactions:
#     ...

# print([t['id'] for t in first_two_coffee])

In [None]:
# ✅ Check (Problem 3C)
try:
    ids = [t["id"] for t in first_two_coffee]
    assert ids == [102, 104]
    print("✅ Problem 3C correct. IDs:", ids)
except NameError:
    print("❌ first_two_coffee is not defined yet.")
except AssertionError:
    print("❌ Check failed. The first two coffee IDs should be 102 and 104.")

### Problem 3D: Min/Max Pattern

**Task:** Find:
- `min_txn`: the transaction with the smallest amount
- `max_txn`: the transaction with the largest amount

Do this with a loop (no `min()` / `max()` yet). Two transactions tie for the smallest amount; keep the first one you encounter.
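A sketch of a loop-based min/max on hypothetical `(day, temperature)` pairs. It assumes the convention that a strict comparison (`<` or `>`) keeps the first item seen when values tie.

```python
# Hypothetical data: (day, temperature) pairs; Tue and Thu tie for the lowest.
temps = [("Mon", 12), ("Tue", 9), ("Wed", 15), ("Thu", 9)]

coldest = None
warmest = None
for day in temps:
    # The None check handles the first item; strict < keeps the earlier tie.
    if coldest is None or day[1] < coldest[1]:
        coldest = day
    if warmest is None or day[1] > warmest[1]:
        warmest = day

print(coldest)  # ('Tue', 9)
print(warmest)  # ('Wed', 15)
```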

In [None]:
# TODO: Min/Max transaction by amount
# min_txn = None
# max_txn = None
# for t in transactions:
#     ...

# print("min:", min_txn)
# print("max:", max_txn)

In [None]:
# Check (Problem 3D)
try:
    assert min_txn["id"] == 102
    assert max_txn["id"] == 107
    print("✅ Problem 3D correct. min_id=", min_txn["id"], "max_id=", max_txn["id"])
except NameError:
    print("❌ min_txn / max_txn not defined yet.")
except Exception:
    print("❌ Check failed. Ensure min_txn and max_txn are transactions (dicts) with correct IDs.")

---
## 4) Data Structures + Loop Patterns Together

Now we combine structures and loop patterns in realistic analytics tasks.


### Problem 4A: Count by Category (Dictionary)

**Task:** Build a dictionary `count_by_category` mapping each category to the number of transactions.

Expected keys include: `books`, `coffee`, `tech`, `music`.

In [None]:
# TODO: Count by category
# count_by_category = {}
# for t in transactions:
#     ...

# print(count_by_category)

In [None]:
# Check (Problem 4A)
try:
    expected = {"books": 3, "coffee": 2, "tech": 2, "music": 1}
    assert count_by_category == expected
    print("✅ Problem 4A correct.")
except NameError:
    print("❌ count_by_category is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you're counting each category correctly.")

### Problem 4B: Unique Users per City (Dict of Sets)

**Task:** Build a dictionary `users_by_city` where:
- key = city
- value = set of unique users in that city

This showcases sets (uniqueness) + dictionaries (grouping).
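A sketch of the dict-of-sets shape on hypothetical `(course, student)` pairs. Creating the empty set the first time a key appears is one way to do it; `dict.setdefault` is another.

```python
# Hypothetical data: (course, student) pairs, with one repeated enrollment.
enrollments = [("math", "Ann"), ("art", "Bo"), ("math", "Cy"), ("math", "Ann")]

students_by_course = {}
for course, student in enrollments:
    if course not in students_by_course:
        students_by_course[course] = set()    # first time this key is seen
    students_by_course[course].add(student)   # adding a duplicate has no effect

print(students_by_course["math"] == {"Ann", "Cy"})  # True
print(students_by_course["art"])                    # {'Bo'}
```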

In [None]:
# TODO: users_by_city as dict[str, set[str]]
# users_by_city = {}
# for t in transactions:
#     ...

# print(users_by_city)

In [None]:
# Check (Problem 4B)
try:
    assert isinstance(users_by_city, dict)
    assert users_by_city["Boston"] == {"Ava", "Noah"}
    assert users_by_city["New York"] == {"Liam", "Emma"}
    assert users_by_city["Chicago"] == {"Olivia"}
    assert users_by_city["Seattle"] == {"Sophia"}
    print("✅ Problem 4B correct.")
except NameError:
    print("❌ users_by_city is not defined yet.")
except Exception:
    print("❌ Check failed. Ensure values are sets and users are unique per city.")

### Problem 4C: Map + Filter Combined

**Task:** Create a list `boston_amounts_cents` containing amounts in **cents** for transactions in **Boston** only.

This is a common real-world pattern: filter then transform.
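Filter and transform can share one loop: test first, then convert only the items that pass. A sketch on hypothetical lengths in metres, converted to whole centimetres:

```python
# Hypothetical data: lengths in metres.
lengths_m = [1.5, 0.2, 3.0, 0.8]

long_lengths_cm = []
for m in lengths_m:
    if m >= 1.0:                                      # filter
        long_lengths_cm.append(int(round(m * 100)))   # then transform

print(long_lengths_cm)  # [150, 300]
```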

In [None]:
# TODO: Filter to Boston, then map to cents
# boston_amounts_cents = []
# for t in transactions:
#     ...

# print(boston_amounts_cents)

In [None]:
# Check (Problem 4C)
try:
    expected = [int(round(19.99*100)), int(round(120.00*100)), int(round(9.99*100))]
    assert boston_amounts_cents == expected
    print("✅ Problem 4C correct.")
except NameError:
    print("❌ boston_amounts_cents is not defined yet.")
except AssertionError:
    print("❌ Check failed. Ensure you're filtering Boston and converting to cents.")

---
## 5) Mutability vs. Immutability (Mini Investigations)

These quick experiments reinforce what can/can't be changed in place.
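As a warm-up, the sketch below contrasts an in-place change on a list with a method on an immutable string that returns a new object. The names are hypothetical, and it deliberately avoids the aliasing question so the investigations stay open.

```python
# Hypothetical values.
names = ["bo", "al"]
result = names.sort()          # list.sort() reorders the list in place and returns None

label = "draft"
upper_label = label.upper()    # str is immutable: .upper() returns a new string

print(names, result)           # ['al', 'bo'] None
print(label, upper_label)      # draft DRAFT
```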


### 🧪 Investigation 5A: List aliasing (mutable gotcha)

**Task:**
1. Create `a = [1, 2, 3]`
2. Set `b = a`
3. Modify `b[0] = 999`
4. Print `a` and `b`

**Question:** Why did both change?

In [None]:
# TODO: Run the aliasing experiment
# a = ...
# b = ...
# ...
# print("a:", a)
# print("b:", b)

# Write a 1-line explanation as a comment:
# Because ...

### 🧪 Investigation 5B: Safe copy (avoid aliasing)

**Task:**
1. Create `a = [1, 2, 3]`
2. Set `b = a.copy()` (or `b = a[:]`)
3. Modify `b[0] = 999`
4. Print `a` and `b`

**Question:** Why is `a` unchanged now?

In [None]:
# TODO: Run the safe copy experiment
# a = ...
# b = ...
# ...
# print("a:", a)
# print("b:", b)

# Write a 1-line explanation as a comment:
# Because ...

---
## 6) Challenge (Optional)

### Challenge: "Top Category by City"

**Task:** Build a dictionary `top_category_by_city` where each city maps to the category with the highest total spend.

Example output shape:
```python
{
  'Boston': 'tech',
  'New York': 'books',
  ...
}
```

Suggested approach:
1. Build `spend_by_city_category` as a nested dict: `{city: {category: total_spend}}`
2. For each city, scan the inner dict to find the max category.

This uses dicts + loops + min/max patterns.
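If the nested-dict shape is unfamiliar, here is a sketch of the two-step approach on hypothetical `(region, item, quantity)` sales data. The names `sales`, `totals`, and `best_item_by_region` are made up for illustration, and `dict.setdefault` is just one way to create the inner dict.

```python
# Hypothetical data: (region, item, quantity).
sales = [("north", "tea", 4), ("north", "cake", 6),
         ("south", "tea", 5), ("north", "tea", 3)]

# Step 1: nested totals, shaped like {region: {item: total_qty}}.
totals = {}
for region, item, qty in sales:
    inner = totals.setdefault(region, {})   # get the inner dict, creating it if missing
    inner[item] = inner.get(item, 0) + qty

# Step 2: scan each inner dict with the max pattern.
best_item_by_region = {}
for region, item_totals in totals.items():
    best_item = None
    for item, total in item_totals.items():
        if best_item is None or total > item_totals[best_item]:
            best_item = item
    best_item_by_region[region] = best_item

print(totals)               # {'north': {'tea': 7, 'cake': 6}, 'south': {'tea': 5}}
print(best_item_by_region)  # {'north': 'tea', 'south': 'tea'}
```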


In [None]:
# OPTIONAL CHALLENGE
# top_category_by_city = {}
# spend_by_city_category = {}
# for t in transactions:
#     ...

# for city, cat_totals in spend_by_city_category.items():
#     ...

# print(top_category_by_city)

---
## Wrap-Up Reflection
Answer briefly (1–2 sentences each):

1. When would you choose a **set** over a **list**?
2. When is a **tuple** a better choice than a **list**?
3. Which loop pattern (Count / Sum / Accumulate / Map / Filter / Find / Take / Min/Max) felt most useful today, and why?

### Takeaway
Most data work is: **scan → filter → transform → summarize**. Today you practiced the exact building blocks.


In [None]:
# Reflection (optional)
# 1)
# 2)
# 3)


Uh oh! I needed to add this line of markdown to the lab! It's super important!