# Advanced Python concepts (Demonstration)

_This notebook provides a very basic introduction to a selection of advanced Python concepts._

Note: This Jupyter Notebook was originally compiled by Alex Reppel (AR) based on conversations with [ClaudeAI](https://claude.ai/) *(version 3.5 Sonnet)*. For this year's materials, further revisions were made using [Claude Code](https://www.anthropic.com/claude-code) *(Opus 4.1)*, including updated documentation and git commit messages.

---

## 🎯 CORE CONTENT (Essential for Exercises)

**Estimated time**: 60-75 minutes

All sections in this notebook are essential for completing the Week 02 exercises. The content introduces advanced Python techniques that make your code more concise and powerful:
- List comprehensions (concise list creation)
- Lambda functions (anonymous functions)
- Map and filter (functional programming)
- Error handling (try/except blocks)
- Advanced file I/O (CSV and JSON)

Work through each section carefully. These techniques will be used extensively in later weeks.

---

## List comprehensions

**From Week 01**: You created lists using loops and `.append()`:

In [None]:
squares = []
for x in range(5):
    squares.append(x**2)
# Result: [0, 1, 4, 9, 16]

**New technique**: List comprehensions provide a concise, one-line alternative:


In [None]:
squares = [x**2 for x in range(5)]
# Same result: [0, 1, 4, 9, 16]

For simple transformations, list comprehensions are usually easier to read than the equivalent loop, and in CPython they are often somewhat faster. Let's explore their power:

In [None]:
# List comprehension with condition
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print("Even squares:", even_squares)
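
A comprehension with a condition is shorthand for a loop that contains an `if`. The sketch below builds the same list both ways so you can compare them side by side; the variable names `loop_result` and `comprehension_result` are illustrative only.

```python
# Loop version: test each value, then append its square
loop_result = []
for x in range(10):
    if x % 2 == 0:
        loop_result.append(x**2)

# Comprehension version: expression first, then the loop, then the condition
comprehension_result = [x**2 for x in range(10) if x % 2 == 0]

print(loop_result)                          # [0, 4, 16, 36, 64]
print(loop_result == comprehension_result)  # True
```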

## Lambda functions

Lambda functions, also known as anonymous functions, are small functions defined with the `lambda` keyword instead of `def`. They consist of a single expression and are typically used once, right where they are needed ('throwaway' functions).

**Building on Week 01 functions**: In Week 01, you learned to define functions with `def`:

In [None]:
def square(x):
    return x ** 2

# Display the result (a notebook shows the value of the last expression in a cell)
square(5)

**New shorthand**: Lambda lets you write tiny, one-line functions:

In [None]:
square = lambda x: x ** 2

# Display the result (a notebook shows the value of the last expression in a cell)
square(5)

**When to use lambda:**
- For very simple operations (one expression)
- As a quick function argument (you'll see this next)

**When NOT to use lambda:**
- For anything complex (use regular `def` functions)
- When you need multiple lines

**Bottom line**: Regular `def` functions are clearer and preferred. Lambda is just handy occasionally.

### Simple lambda examples

In [None]:
square = lambda x: x ** 2
print("Square of 5:", square(5))

add = lambda x, y: x + y
print("3 + 7 =", add(3, 7))

### Using lambda as a sort key

In [None]:
students = [{"name": "Alice", "grade": 85}, {"name": "Bob", "grade": 92}, {"name": "Carol", "grade": 78}]
sorted_students = sorted(students, key=lambda s: s["grade"], reverse=True)
print("Sorted by grade:", [s["name"] for s in sorted_students])
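
A lambda key can also return a tuple, which lets you sort by more than one field. The following is a minimal sketch with made-up data (the student `Dave` is a hypothetical entry added here to create a tie): it sorts by grade from high to low and breaks ties alphabetically by name.

```python
students = [
    {"name": "Carol", "grade": 78},
    {"name": "Bob", "grade": 92},
    {"name": "Alice", "grade": 85},
    {"name": "Dave", "grade": 85},  # hypothetical extra entry to create a tie
]

# Negating the grade sorts it in descending order; the name breaks ties
ranked = sorted(students, key=lambda s: (-s["grade"], s["name"]))
ranked_names = [s["name"] for s in ranked]
print(ranked_names)  # ['Bob', 'Alice', 'Dave', 'Carol']
```

Negating the value only works for numeric fields; for text fields you would typically fall back on `reverse=True` or sort in two passes.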

## Map and filter: built-in functions for list processing

**Building on Week 01 loops**: In Week 01, you learned to process lists with for loops:

In [None]:
# Square all numbers
numbers = [1, 2, 3, 4, 5]
squared = []
for x in numbers:
    squared.append(x ** 2)

# Display the result (a notebook shows the value of the last expression in a cell)
squared

**Alternative approaches**: Python provides built-in functions for common list operations:
- **`map()`** - apply a function to every item
- **`filter()`** - keep only items that pass a test

**Important**: List comprehensions (which you just learned) are usually clearer and preferred in Python. These functions are shown for completeness, but you'll mostly use list comprehensions.

Let's see quick examples:

In [None]:
numbers = [1, 2, 3, 4, 5]

# Map
squared = list(map(lambda x: x**2, numbers))
print("Squared numbers:", squared)

# Filter
even = list(filter(lambda x: x % 2 == 0, numbers))
print("Even numbers:", even)

# Combining map and filter
odd_squares = list(map(lambda x: x**2, filter(lambda x: x % 2 != 0, numbers)))
print("Squares of odd numbers:", odd_squares)

### Explanation:
- `map()` applies a function to every item (like a for loop with `.append()`)
- `filter()` keeps only the items that pass a test (like a for loop with an `if` condition)
- In Python 3, both return lazy iterators rather than lists, which is why the examples wrap them in `list()`
- **But**: List comprehensions do the same thing more clearly
- **Best practice**: Use list comprehensions in your code

It is worth knowing that `map()` and `filter()` exist, because you may meet them in other people's code, but list comprehensions usually make your own code clearer.
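
To make the comparison concrete, here is a small sketch that rewrites the three `map()`/`filter()` examples as list comprehensions and shows that `map()` on its own returns an iterator, not a list. The variable names are chosen for illustration only.

```python
numbers = [1, 2, 3, 4, 5]

# map() and filter() return lazy iterators, not lists
lazy_squares = map(lambda x: x**2, numbers)
print(type(lazy_squares).__name__)  # map

# Wrapping the iterator in list() produces the actual values
squares_via_map = list(lazy_squares)

# Equivalent list comprehensions
squares = [x**2 for x in numbers]
evens = [x for x in numbers if x % 2 == 0]
odd_squares = [x**2 for x in numbers if x % 2 != 0]

print(squares == squares_via_map)  # True
print(evens)                       # [2, 4]
print(odd_squares)                 # [1, 9, 25]
```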

## Working with dictionaries - advanced patterns

**Building on Week 01**: In Week 01, you learned dictionary basics:
```python
person = {"name": "Alice", "age": 30}
print(person["name"])
```

This week, we'll explore more sophisticated dictionary patterns you'll use constantly in data analysis:
- Nested dictionaries (complex data structures)
- Dictionary comprehensions (creating dictionaries efficiently)
- Practical patterns (grouping data, lookup tables)
- Working with JSON-like structures

In [None]:
# Nested dictionaries - customer database
customers = {
    "C001": {
        "name": "Alice Johnson",
        "email": "alice@example.com",
        "orders": [{"item": "Laptop", "price": 999}, {"item": "Mouse", "price": 25}],
        "total_spent": 1024
    },
    "C002": {
        "name": "Bob Smith",
        "email": "bob@example.com",
        "orders": [{"item": "Keyboard", "price": 75}],
        "total_spent": 75
    }
}

# Accessing nested data
print(f"Customer C001: {customers['C001']['name']}")
print(f"First order item: {customers['C001']['orders'][0]['item']}")
print(f"C001 total spent: £{customers['C001']['total_spent']}")

In [None]:
# Dictionary comprehensions - like list comprehensions, but for dictionaries
products = ["Laptop", "Mouse", "Keyboard", "Monitor"]
prices = [999, 25, 75, 250]

# Create a price lookup dictionary
price_lookup = {product: price for product, price in zip(products, prices)}
print("Price lookup:", price_lookup)
print(f"Mouse costs: £{price_lookup['Mouse']}")

# Filter while creating dictionary - only expensive items
expensive_items = {product: price for product, price in zip(products, prices) if price > 100}
print("\nExpensive items:", expensive_items)

In [None]:
# Practical pattern: Grouping data by category
sales = [
    {"product": "Laptop", "category": "Electronics", "amount": 999},
    {"product": "Mouse", "category": "Electronics", "amount": 25},
    {"product": "Desk", "category": "Furniture", "amount": 250},
    {"product": "Chair", "category": "Furniture", "amount": 150},
    {"product": "Keyboard", "category": "Electronics", "amount": 75}
]

# Group by category
by_category = {}
for sale in sales:
    category = sale["category"]
    if category not in by_category:
        by_category[category] = []
    by_category[category].append(sale)

print("Sales by category:")
for category, items in by_category.items():
    total = sum(item["amount"] for item in items)
    print(f"{category}: {len(items)} items, £{total} total")

## String processing for data cleaning

**Building on Week 01**: In Week 01, you learned basic string methods like `.lower()`, `.upper()`, `.strip()`, and `.split()`.

Real-world data is messy! Common cleaning tasks include:
- Removing extra whitespace
- Fixing inconsistent capitalization
- Parsing names, addresses, phone numbers
- Extracting information from text

Let's learn practical string cleaning techniques:

In [None]:
# Cleaning messy customer names
messy_names = [
    "  alice  johnson  ",
    "BOB SMITH",
    "  Carol   Lee  ",
    "david NGUYEN"
]

cleaned_names = []
for name in messy_names:
    # 1. Remove leading and trailing whitespace
    name = name.strip()
    # 2. Replace multiple spaces with single space
    name = " ".join(name.split())
    # 3. Title case (capitalize first letter of each word)
    name = name.title()
    cleaned_names.append(name)

print("Cleaned names:")
for name in cleaned_names:
    print(f"  '{name}'")

In [None]:
# Parsing email addresses
email = "alice.johnson@example.co.uk"

# Extract username and domain
username, domain = email.split("@")
print(f"Username: {username}")
print(f"Domain: {domain}")

# Extract first and last name from username
first, last = username.split(".")
print(f"First name: {first.title()}")
print(f"Last name: {last.title()}")

# Parsing phone numbers
phone = "  (020) 7946-0958  "
# Remove spaces, parentheses, and hyphens
cleaned_phone = phone.strip().replace(" ", "").replace("(", "").replace(")", "").replace("-", "")
print(f"\nCleaned phone: {cleaned_phone}")

## Data validation

When processing data, you need to check if it's valid before using it. This prevents errors and catches data quality issues early.

Common validation checks:
- **Type checking**: Is this a number or text?
- **Range validation**: Is age between 0 and 120?
- **Required fields**: Is this value missing?
- **Format validation**: Does this look like an email?

Let's learn defensive programming techniques:

In [None]:
# Validating customer data
def validate_customer(customer_data):
    """
    Validate customer data and return list of errors
    """
    errors = []

    # Check required fields
    if "name" not in customer_data or not customer_data["name"]:
        errors.append("Name is required")

    # Check age is a number and in valid range
    age = customer_data.get("age")
    if age is None:
        errors.append("Age is required")
    elif not isinstance(age, (int, float)):
        errors.append(f"Age must be a number, got {type(age).__name__}")
    elif age < 0 or age > 120:
        errors.append(f"Age {age} is out of valid range (0-120)")

    # Check total spent is a non-negative number
    if "total_spent" in customer_data:
        spent = customer_data["total_spent"]
        if not isinstance(spent, (int, float)):
            errors.append("Total spent must be a number")
        elif spent < 0:
            errors.append(f"Total spent cannot be negative: £{spent}")

    return errors

# Test with valid data
valid_customer = {"name": "Alice Johnson", "age": 30, "total_spent": 150}
errors = validate_customer(valid_customer)
print(f"Valid customer errors: {errors if errors else 'None - data is valid!'}")

# Test with invalid data
invalid_customer = {"name": "", "age": 150, "total_spent": -50}
errors = validate_customer(invalid_customer)
print("\nInvalid customer errors:")
for error in errors:
    print(f"  - {error}")

## Error handling

Error handling in Python is done using `try`/`except` blocks, which allow you to gracefully handle exceptions that might occur during program execution.

In [None]:
def divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        print("Error: Cannot divide by zero!")
        result = None
    except TypeError:
        print("Error: Invalid input types!")
        result = None
    else:
        print("Division successful!")
    finally:
        print("Division operation completed.")
    return result

print(divide(10, 2))
print(divide(10, 0))
print(divide("10", 2))

### Explanation:
- The `try` block contains code that might raise an exception.
- `except` blocks handle specific exceptions (`ZeroDivisionError`, `TypeError`).
- The `else` block executes if no exception occurs.
- The `finally` block always executes, regardless of whether an exception occurred.
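
The same pattern applies when converting text to numbers, a common source of exceptions in data work. The following is a minimal sketch rather than part of the course materials: the helper name `to_float`, its `default` parameter, and the sample values are all hypothetical.

```python
def to_float(text, default=None):
    """Convert text to a float, returning `default` if conversion fails."""
    try:
        value = float(text)
    except (ValueError, TypeError):
        # ValueError: text such as "abc"; TypeError: non-text input such as None
        return default
    else:
        # Runs only when float() succeeded
        return value

raw_values = ["19.99", "abc", None, "  42 "]
converted = [to_float(v) for v in raw_values]
print(converted)  # [19.99, None, None, 42.0]
```

Catching only the exceptions you expect keeps unrelated bugs visible instead of silently hiding them behind a default value.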

## File I/O

File Input/Output operations in Python allow you to read from and write to files on your computer.

Note: We'll use the pre-existing example files in the `assets/data/` directory.

In [None]:
# First, let's see what example files we have
import os
print("Files in assets/data/:")
for file in os.listdir("assets/data/"):
    print(f"  - {file}")

In [None]:
# Reading from an existing file
print("Reading entire file:")
with open("assets/data/example.txt", "r") as f:
    content = f.read()
    print(content)

print("\nReading line by line:")
with open("assets/data/example.txt", "r") as f:
    for line in f:
        print(f"Line: {line.strip()}")

# Writing to a new file in the output directory
os.makedirs("output", exist_ok=True)
with open("output/my_output.txt", "w") as f:
    f.write("This is my analysis output\n")
    f.write("Processing complete!\n")
    print("Created output/my_output.txt")

# Appending to our output file
with open("output/my_output.txt", "a") as f:
    f.write("Additional results added.\n")
    print("Appended to output/my_output.txt")

print("\nFinal file contents:")
with open("output/my_output.txt", "r") as f:
    print(f.read())

### Explanation:
- The `with` statement ensures the file is properly closed after operations.
- `"w"` mode opens the file for writing, overwriting existing content.
- `"r"` mode opens the file for reading.
- `"a"` mode opens the file for appending, adding new content to the end.
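
The difference between the modes is easiest to see on a throwaway file. The sketch below assumes nothing about the course's `assets/data/` directory: it works in a temporary directory, and the file name `demo.txt` is hypothetical.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    with open(path, "w") as f:   # "w" creates the file (or overwrites it)
        f.write("first line\n")

    with open(path, "a") as f:   # "a" adds to the end
        f.write("second line\n")

    with open(path, "w") as f:   # "w" again discards the earlier content
        f.write("replaced\n")

    with open(path, "a") as f:
        f.write("appended\n")

    with open(path, "r") as f:   # "r" reads the existing content
        lines = f.read().splitlines()

    was_closed = f.closed        # True: `with` closed the file on exit

print(lines)       # ['replaced', 'appended']
print(was_closed)  # True
```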