# File Location: docs/notebooks/02_data_structures.ipynb

# Python Data Structures - Interactive Learning Notebook

Welcome to Python Data Structures! This notebook explores Python's built-in data structures and their practical applications.

## Learning Objectives

After completing this notebook, you will understand:

- Lists: creation, manipulation, and methods
- Tuples: immutable sequences and their use cases
- Dictionaries: key-value pairs and efficient lookups
- Sets: unique collections and set operations
- Strings: advanced string manipulation techniques
- When to use each data structure type
- Performance characteristics of different structures

## Table of Contents

1. [Lists - Dynamic Arrays](#lists)
2. [Tuples - Immutable Sequences](#tuples)
3. [Dictionaries - Key-Value Pairs](#dictionaries)
4. [Sets - Unique Collections](#sets)
5. [Strings - Text Processing](#strings)
6. [Data Structure Comparison](#comparison)
7. [Nested Data Structures](#nested-structures)
8. [Performance and Best Practices](#performance)
9. [Real-World Applications](#applications)

---

## 1. Lists - Dynamic Arrays

### List Basics

```python
print("Lists in Python:")
print("=" * 16)

# Creating lists
print("1. Creating Lists:")
empty_list = []
numbers = [1, 2, 3, 4, 5]
mixed_list = [1, "hello", 3.14, True, [1, 2, 3]]
fruits = ["apple", "banana", "cherry", "date"]

print(f"   Empty list: {empty_list}")
print(f"   Numbers: {numbers}")
print(f"   Mixed types: {mixed_list}")
print(f"   Fruits: {fruits}")

# List properties
print(f"\n2. List Properties:")
print(f"   Length of fruits: {len(fruits)}")
print(f"   Type: {type(fruits)}")
print(f"   First fruit: {fruits[0]}")
print(f"   Last fruit: {fruits[-1]}")

# List indexing and slicing
print(f"\n3. Indexing and Slicing:")
sample_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(f"   List: {sample_list}")
print(f"   Element at index 3: {sample_list[3]}")
print(f"   Elements 2-5: {sample_list[2:6]}")
print(f"   First 4 elements: {sample_list[:4]}")
print(f"   Last 3 elements: {sample_list[-3:]}")
print(f"   Every 2nd element: {sample_list[::2]}")
print(f"   Reversed list: {sample_list[::-1]}")
```

### List Methods and Operations

```python
print("\nList Methods and Operations:")
print("=" * 28)

# Starting with a sample list
shopping_cart = ["apples", "bread", "milk"]
print(f"Initial cart: {shopping_cart}")

# Adding elements
print("\n1. Adding Elements:")
shopping_cart.append("eggs")  # Add to end
print(f"   After append('eggs'): {shopping_cart}")

shopping_cart.insert(1, "bananas")  # Insert at specific position
print(f"   After insert(1, 'bananas'): {shopping_cart}")

shopping_cart.extend(["cheese", "yogurt"])  # Add multiple items
print(f"   After extend(['cheese', 'yogurt']): {shopping_cart}")

# Removing elements
print("\n2. Removing Elements:")
removed_item = shopping_cart.pop()  # Remove and return last item
print(f"   Popped item: {removed_item}")
print(f"   Cart after pop(): {shopping_cart}")

shopping_cart.remove("bread")  # Remove specific item
print(f"   After remove('bread'): {shopping_cart}")

# Finding elements
print("\n3. Finding Elements:")
if "milk" in shopping_cart:
    milk_index = shopping_cart.index("milk")
    print(f"   'milk' found at index: {milk_index}")

print(f"   Count of 'apples': {shopping_cart.count('apples')}")

# List modification
print("\n4. List Modification:")
numbers = [3, 1, 4, 1, 5, 9, 2, 6]
print(f"   Original: {numbers}")

numbers.sort()  # Sort in place
print(f"   After sort(): {numbers}")

numbers.reverse()  # Reverse in place
print(f"   After reverse(): {numbers}")

# List copying
print("\n5. List Copying:")
original = [1, 2, 3]
shallow_copy = original.copy()  # New list object; equivalent to original[:]
alias = original  # Not a copy: a second name for the same list object

original.append(4)
print(f"   Original: {original}")
print(f"   Shallow copy: {shallow_copy}")
print(f"   Reference: {alias}")
```
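
The copying example above uses a flat list of integers, where a shallow copy behaves as a fully independent list. With nested lists the difference between an alias, a shallow copy, and a deep copy becomes visible. The following is a minimal sketch using the standard-library `copy.deepcopy`; the variable names are illustrative and not part of the notebook.

```python
import copy

nested = [[1, 2], [3, 4]]
alias = nested                 # same object, no copy at all
shallow = nested.copy()        # new outer list, inner lists still shared
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0].append(99)           # mutate an inner list
nested.append([5, 6])          # mutate the outer list

print(f"Original: {nested}")
print(f"Alias:    {alias}")
print(f"Shallow:  {shallow}")
print(f"Deep:     {deep}")
```

The shallow copy sees the change to the shared inner list but not the new outer element, while the deep copy sees neither.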

### List Comprehensions

```python
print("\nList Comprehensions:")
print("=" * 20)

# Basic list comprehension
print("1. Basic Syntax:")
squares = [x**2 for x in range(1, 6)]
print(f"   Squares: {squares}")

# Traditional approach vs list comprehension
print("\n2. Traditional vs Comprehension:")
# Traditional way
traditional_evens = []
for i in range(10):
    if i % 2 == 0:
        traditional_evens.append(i)

# List comprehension way
comprehension_evens = [i for i in range(10) if i % 2 == 0]

print(f"   Traditional: {traditional_evens}")
print(f"   Comprehension: {comprehension_evens}")

# More complex examples
print("\n3. Complex Examples:")
words = ["hello", "world", "python", "programming"]

# Convert to uppercase
uppercase_words = [word.upper() for word in words]
print(f"   Uppercase: {uppercase_words}")

# Get word lengths
word_lengths = [len(word) for word in words]
print(f"   Lengths: {word_lengths}")

# Filter words with more than 5 characters
long_words = [word for word in words if len(word) > 5]
print(f"   Long words: {long_words}")

# Nested list comprehension
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [num for row in matrix for num in row]
print(f"   Matrix: {matrix}")
print(f"   Flattened: {flattened}")

# Conditional expression in comprehension
numbers = range(-5, 6)
abs_values = [x if x >= 0 else -x for x in numbers]
print(f"   Numbers: {list(numbers)}")
print(f"   Absolute values: {abs_values}")
```

---

## 2. Tuples - Immutable Sequences

### Tuple Basics

```python
print("Tuples in Python:")
print("=" * 17)

# Creating tuples
print("1. Creating Tuples:")
empty_tuple = ()
single_item = (42,)  # Note the comma for single item
coordinates = (3, 4)
person_info = ("Alice", 25, "Engineer", True)
mixed_tuple = (1, "hello", [1, 2, 3], {"key": "value"})

print(f"   Empty tuple: {empty_tuple}")
print(f"   Single item: {single_item}")
print(f"   Coordinates: {coordinates}")
print(f"   Person info: {person_info}")
print(f"   Mixed tuple: {mixed_tuple}")

# Tuple properties
print(f"\n2. Tuple Properties:")
print(f"   Length: {len(person_info)}")
print(f"   Type: {type(person_info)}")
print(f"   Immutable: Cannot modify after creation")

# Tuple indexing and slicing
print(f"\n3. Indexing and Slicing:")
sample_tuple = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
print(f"   Tuple: {sample_tuple}")
print(f"   First element: {sample_tuple[0]}")
print(f"   Last element: {sample_tuple[-1]}")
print(f"   Slice [2:5]: {sample_tuple[2:5]}")
print(f"   Every 2nd element: {sample_tuple[::2]}")
```
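
The "Immutable" property printed above can be checked directly. This short sketch, with illustrative variable names, shows that item assignment on a tuple raises `TypeError`, and that immutability is shallow: a mutable object stored inside a tuple can still change, which also makes that tuple unhashable.

```python
point = (3, 4)
try:
    point[0] = 10              # tuples do not support item assignment
    assignment_failed = False
except TypeError as exc:
    assignment_failed = True
    print(f"TypeError: {exc}")

# Immutability is shallow: the list inside the tuple can still be modified
record = ("Alice", [90, 85])
record[1].append(77)
print(record)

# A tuple is hashable only if every element is hashable
try:
    hash(record)
    record_is_hashable = True
except TypeError:
    record_is_hashable = False
print(f"Hashable: {record_is_hashable}")
```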

### Tuple Operations and Methods

```python
print("\nTuple Operations:")
print("=" * 17)

# Tuple unpacking
print("1. Tuple Unpacking:")
point = (10, 20)
x, y = point
print(f"   Point: {point}")
print(f"   x = {x}, y = {y}")

# Multiple assignment
a, b, c = 1, 2, 3  # The right-hand side is the tuple (1, 2, 3), unpacked into a, b, c
print(f"   Multiple assignment: a={a}, b={b}, c={c}")

# Swapping variables using tuples
print("\n2. Variable Swapping:")
before_a, before_b = 100, 200
print(f"   Before swap: a={before_a}, b={before_b}")
before_a, before_b = before_b, before_a  # Tuple packing/unpacking
print(f"   After swap: a={before_a}, b={before_b}")

# Tuple methods
print("\n3. Tuple Methods:")
sample_tuple = (1, 2, 3, 2, 4, 2, 5)
print(f"   Tuple: {sample_tuple}")
print(f"   Count of 2: {sample_tuple.count(2)}")
print(f"   Index of first 3: {sample_tuple.index(3)}")

# Tuple concatenation and repetition
print("\n4. Concatenation and Repetition:")
tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
combined = tuple1 + tuple2
repeated = tuple1 * 3

print(f"   Tuple1: {tuple1}")
print(f"   Tuple2: {tuple2}")
print(f"   Combined: {combined}")
print(f"   Repeated: {repeated}")

# Named tuples
print("\n5. Named Tuples:")
from collections import namedtuple

# Fields can be read by name as well as by position
Person = namedtuple("Person", ["name", "age", "job"])
person = Person("Alice", 30, "Engineer")

# A named tuple still unpacks like a regular tuple
name, age, job = person
print(f"   Person tuple: {person}")
print(f"   Name: {person.name}, Age: {person.age}, Job: {person.job}")
```

### When to Use Tuples

```python
print("\nWhen to Use Tuples:")
print("=" * 19)

use_cases = {
    "Coordinates/Points": ("(x, y, z) coordinates", (10, 20, 30)),
    "RGB Colors": ("Color values", (255, 128, 0)),
    "Database Records": ("Fixed structure data", ("John", 25, "Engineer")),
    "Function Return Values": ("Multiple return values", (True, "Success", 42)),
    "Dictionary Keys": ("Immutable keys", {(0, 0): "origin", (1, 1): "point"}),
    "Configuration": ("Read-only settings", ("localhost", 8080, False))
}

for use_case, (description, example) in use_cases.items():
    print(f"   {use_case}:")
    print(f"     {description}")
    print(f"     Example: {example}")
    print()
```

---

## 3. Dictionaries - Key-Value Pairs

### Dictionary Basics

```python
print("Dictionaries in Python:")
print("=" * 23)

# Creating dictionaries
print("1. Creating Dictionaries:")
empty_dict = {}
student_grades = {"Alice": 85, "Bob": 92, "Charlie": 78}
person = {
    "name": "John Doe",
    "age": 30,
    "city": "New York",
    "is_employed": True
}
mixed_keys = {1: "one", "two": 2, (3, 4): "tuple_key"}

print(f"   Empty dict: {empty_dict}")
print(f"   Student grades: {student_grades}")
print(f"   Person info: {person}")
print(f"   Mixed keys: {mixed_keys}")

# Dictionary properties
print(f"\n2. Dictionary Properties:")
print(f"   Length: {len(person)}")
print(f"   Type: {type(person)}")
print(f"   Keys are unique and must be hashable")
print(f"   Values can be any type")

# Accessing values
print(f"\n3. Accessing Values:")
print(f"   person['name']: {person['name']}")
print(f"   person.get('age'): {person.get('age')}")
print(f"   person.get('salary', 'Not found'): {person.get('salary', 'Not found')}")

# Dictionary keys, values, and items
print(f"\n4. Keys, Values, and Items:")
print(f"   Keys: {list(person.keys())}")
print(f"   Values: {list(person.values())}")
print(f"   Items: {list(person.items())}")
```

### Dictionary Operations

```python
print("\nDictionary Operations:")
print("=" * 22)

# Starting with a sample dictionary
inventory = {"apples": 50, "bananas": 30, "oranges": 25}
print(f"Initial inventory: {inventory}")

# Adding and updating
print("\n1. Adding and Updating:")
inventory["grapes"] = 40  # Add new item
print(f"   After adding grapes: {inventory}")

inventory["apples"] = 60  # Update existing item
print(f"   After updating apples: {inventory}")

inventory.update({"mangoes": 15, "bananas": 35})  # Update multiple
print(f"   After update(): {inventory}")

# Removing items
print("\n2. Removing Items:")
removed_value = inventory.pop("oranges")  # Remove and return value
print(f"   Popped oranges: {removed_value}")
print(f"   After pop: {inventory}")

# Remove and return the most recently inserted item (LIFO order since Python 3.7)
if inventory:
    key, value = inventory.popitem()
    print(f"   Popped item: {key} = {value}")
    print(f"   After popitem: {inventory}")

# Dictionary comprehension
print("\n3. Dictionary Comprehension:")
numbers = [1, 2, 3, 4, 5]
squares_dict = {num: num**2 for num in numbers}
print(f"   Numbers: {numbers}")
print(f"   Squares dict: {squares_dict}")

# Filtering with comprehension
even_squares = {num: num**2 for num in numbers if num % 2 == 0}
print(f"   Even squares: {even_squares}")

# Merging dictionaries
print("\n4. Merging Dictionaries:")
dict1 = {"a": 1, "b": 2}
dict2 = {"c": 3, "d": 4}
dict3 = {"b": 20, "e": 5}  # Note: 'b' will be overwritten

# Python 3.9+ syntax
merged = dict1 | dict2 | dict3
print(f"   Dict1: {dict1}")
print(f"   Dict2: {dict2}")
print(f"   Dict3: {dict3}")
print(f"   Merged: {merged}")

# Alternative merging method
merged_alt = {**dict1, **dict2, **dict3}
print(f"   Alternative merge: {merged_alt}")
```

### Advanced Dictionary Usage

```python
print("\nAdvanced Dictionary Usage:")
print("=" * 26)

# Nested dictionaries
print("1. Nested Dictionaries:")
company = {
    "employees": {
        "engineering": ["Alice", "Bob", "Charlie"],
        "marketing": ["David", "Eve"],
        "sales": ["Frank", "Grace", "Henry"]
    },
    "locations": {
        "headquarters": "New York",
        "branch_offices": ["London", "Tokyo", "Sydney"]
    },
    "founded": 2010
}

print(f"   Company structure:")
for dept, employees in company["employees"].items():
    print(f"     {dept.title()}: {employees}")

# Counting with dict.get() and a default of 0 (simulates collections.defaultdict)
print("\n2. Counting with Dictionaries:")
text = "hello world hello python world"
word_count = {}

for word in text.split():
    word_count[word] = word_count.get(word, 0) + 1

print(f"   Text: '{text}'")
print(f"   Word count: {word_count}")

# Dictionary as a switch/case alternative
print("\n3. Dictionary as Switch Statement:")
def get_day_info(day_number):
    day_info = {
        1: ("Monday", "Start of work week"),
        2: ("Tuesday", "Getting into rhythm"),
        3: ("Wednesday", "Hump day"),
        4: ("Thursday", "Almost there"),
        5: ("Friday", "TGIF!"),
        6: ("Saturday", "Weekend begins"),
        7: ("Sunday", "Rest day")
    }
    return day_info.get(day_number, ("Unknown", "Invalid day"))

for day_num in [1, 5, 7, 8]:
    day_name, description = get_day_info(day_num)
    print(f"   Day {day_num}: {day_name} - {description}")

# Dictionary methods summary
print("\n4. Key Dictionary Methods:")
sample_dict = {"a": 1, "b": 2, "c": 3}
methods_demo = {
    "keys()": list(sample_dict.keys()),
    "values()": list(sample_dict.values()),
    "items()": list(sample_dict.items()),
    "get('a', 0)": sample_dict.get('a', 0),
    "pop('b')": "Removes and returns value",
    "update()": "Merges another dict",
    "clear()": "Removes all items"
}

print(f"   Sample dict: {sample_dict}")
for method, result in methods_demo.items():
    print(f"   {method}: {result}")
```
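
The word-counting loop above builds its counts with `dict.get()`. The standard library's `collections` module provides `Counter` and `defaultdict` for the same kind of task. The sketch below reuses the sample text; `words_by_length` is an illustrative name introduced here.

```python
from collections import Counter, defaultdict

text = "hello world hello python world"

# Counter counts hashable items directly
word_count = Counter(text.split())
print(word_count.most_common(2))

# Missing keys report a count of 0 instead of raising KeyError
print(word_count["missing"])

# defaultdict creates a default value (here an empty list) for new keys
words_by_length = defaultdict(list)
for word in text.split():
    words_by_length[len(word)].append(word)
print(dict(words_by_length))
```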

---

## 4. Sets - Unique Collections

### Set Basics

```python
print("Sets in Python:")
print("=" * 15)

# Creating sets
print("1. Creating Sets:")
empty_set = set()  # Note: {} creates empty dict, not set
numbers_set = {1, 2, 3, 4, 5}
mixed_set = {1, "hello", 3.14, (2, 3)}  # Avoid mixing 1 and True: they compare equal and collapse into one element
from_list = set([1, 2, 2, 3, 3, 4])  # Duplicates removed

print(f"   Empty set: {empty_set}")
print(f"   Numbers set: {numbers_set}")
print(f"   Mixed set: {mixed_set}")
print(f"   From list [1,2,2,3,3,4]: {from_list}")

# Set properties
print(f"\n2. Set Properties:")
print(f"   Length: {len(numbers_set)}")
print(f"   Type: {type(numbers_set)}")
print(f"   Unordered collection")
print(f"   No duplicate elements")
print(f"   Elements must be hashable")

# Set membership
print(f"\n3. Membership Testing:")
fruits_set = {"apple", "banana", "cherry"}
print(f"   Fruits: {fruits_set}")
print(f"   'apple' in fruits: {'apple' in fruits_set}")
print(f"   'grape' in fruits: {'grape' in fruits_set}")
```
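
The requirement on set elements is hashability, which in practice rules out mutable built-ins such as lists. A small sketch of what that means, with illustrative names:

```python
# Lists are unhashable, so they cannot be set elements
try:
    bad_set = {[1, 2], [3, 4]}
    list_allowed = True
except TypeError:
    list_allowed = False
print(f"List as set element allowed: {list_allowed}")

# Tuples of hashable items and frozensets are hashable, so they work
pairs = {(1, 2), (3, 4), (1, 2)}
groups = {frozenset({1, 2}), frozenset({2, 1})}
print(f"Distinct pairs: {len(pairs)}")
print(f"Distinct groups: {len(groups)}")

# Values that compare equal and hash equal collapse into one element
collapsed = {1, True, 1.0}
print(f"Size of {{1, True, 1.0}}: {len(collapsed)}")
```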

### Set Operations

```python
print("\nSet Operations:")
print("=" * 15)

# Basic set operations
print("1. Adding and Removing:")
colors = {"red", "green", "blue"}
print(f"   Initial colors: {colors}")

colors.add("yellow")
print(f"   After add('yellow'): {colors}")

colors.update(["purple", "orange"])
print(f"   After update(['purple', 'orange']): {colors}")

removed_color = colors.pop()  # Removes arbitrary element
print(f"   Popped color: {removed_color}")
print(f"   After pop(): {colors}")

colors.discard("green")  # Safe removal (no error if not present)
print(f"   After discard('green'): {colors}")

# Set mathematical operations
print("\n2. Mathematical Operations:")
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

print(f"   Set A: {set_a}")
print(f"   Set B: {set_b}")

# Union (all elements from both sets)
union_result = set_a | set_b  # or set_a.union(set_b)
print(f"   Union (A | B): {union_result}")

# Intersection (common elements)
intersection_result = set_a & set_b  # or set_a.intersection(set_b)
print(f"   Intersection (A & B): {intersection_result}")

# Difference (elements in A but not in B)
difference_result = set_a - set_b  # or set_a.difference(set_b)
print(f"   Difference (A - B): {difference_result}")

# Symmetric difference (elements in either A or B, but not both)
sym_diff_result = set_a ^ set_b  # or set_a.symmetric_difference(set_b)
print(f"   Symmetric Difference (A ^ B): {sym_diff_result}")

# Set relationships
print("\n3. Set Relationships:")
small_set = {2, 3}
large_set = {1, 2, 3, 4, 5}

print(f"   Small set: {small_set}")
print(f"   Large set: {large_set}")
print(f"   Is subset: {small_set.issubset(large_set)}")
print(f"   Is superset: {large_set.issuperset(small_set)}")
print(f"   Are disjoint: {small_set.isdisjoint({6, 7, 8})}")
```

### Set Comprehensions and Practical Uses

```python
print("\nSet Comprehensions and Uses:")
print("=" * 28)

# Set comprehensions
print("1. Set Comprehensions:")
numbers = [1, 2, 3, 4, 5, 1, 2, 3]  # List with duplicates
unique_squares = {x**2 for x in numbers}
print(f"   Numbers: {numbers}")
print(f"   Unique squares: {unique_squares}")

# Even numbers from 1 to 20
even_set = {x for x in range(1, 21) if x % 2 == 0}
print(f"   Even numbers 1-20: {even_set}")

# Practical applications
print("\n2. Practical Applications:")

# Remove duplicates from list (a set does not preserve the original order)
original_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = list(set(original_list))
print(f"   Original: {original_list}")
print(f"   Unique: {unique_list}")

# Find common elements between lists (the order of the result is arbitrary)
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]
common_elements = list(set(list1) & set(list2))
print(f"   List 1: {list1}")
print(f"   List 2: {list2}")
print(f"   Common: {common_elements}")

# Check for any common elements
has_common = bool(set(list1) & set(list2))
print(f"   Has common elements: {has_common}")

# Membership testing performance
print("\n3. Membership Testing Performance:")
large_list = list(range(10000))
large_set = set(large_list)

print(f"   Testing membership in list vs set:")
print(f"   List size: {len(large_list)}")
print(f"   Set size: {len(large_set)}")
print(f"   Set membership testing is O(1) on average vs list's O(n)")

# Vowel detection example
print("\n4. Vowel Detection Example:")
def count_vowels(text):
    vowels = {'a', 'e', 'i', 'o', 'u'}
    text_lower = text.lower()
    vowel_count = sum(1 for char in text_lower if char in vowels)
    unique_vowels = {char for char in text_lower if char in vowels}
    return vowel_count, unique_vowels

sample_text = "Hello World Python Programming"
vowel_count, unique_vowels = count_vowels(sample_text)
print(f"   Text: '{sample_text}'")
print(f"   Vowel count: {vowel_count}")
print(f"   Unique vowels: {unique_vowels}")
```
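
The membership-testing cell above states the complexity difference without measuring it. The following sketch times both lookups with the standard-library `timeit` module. The target value and repetition count are arbitrary choices, and absolute timings will vary by machine.

```python
import timeit

large_list = list(range(10000))
large_set = set(large_list)
target = 9999  # worst case for the list: the last element

# Each call repeats the membership test 1000 times
list_time = timeit.timeit(lambda: target in large_list, number=1000)
set_time = timeit.timeit(lambda: target in large_set, number=1000)

print(f"List lookup: {list_time:.6f} s for 1000 checks")
print(f"Set lookup:  {set_time:.6f} s for 1000 checks")
```

The list has to scan its elements one by one, while the set hashes the target and checks a single bucket on average.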

---

## 5. Strings - Text Processing

### Advanced String Operations

```python
print("Advanced String Operations:")
print("=" * 27)

# String creation and basic properties
print("1. String Basics:")
text = "Hello, World! Welcome to Python."
print(f"   Text: '{text}'")
print(f"   Length: {len(text)}")
print(f"   Type: {type(text)}")

# String indexing and slicing
print(f"\n2. Indexing and Slicing:")
print(f"   First character: '{text[0]}'")
print(f"   Last character: '{text[-1]}'")
print(f"   First 5 characters: '{text[:5]}'")
print(f"   Characters at indices 7-11: '{text[7:12]}'")
print(f"   Every 2nd character: '{text[::2]}'")
print(f"   Reversed: '{text[::-1]}'")

# String methods - case manipulation
print(f"\n3. Case Manipulation:")
sample = "Hello World Python"
print(f"   Original: '{sample}'")
print(f"   lower(): '{sample.lower()}'")
print(f"   upper(): '{sample.upper()}'")
print(f"   title(): '{sample.title()}'")
print(f"   capitalize(): '{sample.capitalize()}'")
print(f"   swapcase(): '{sample.swapcase()}'")

# String methods - testing
print(f"\n4. String Testing Methods:")
test_strings = ["hello", "WORLD", "Python123", "123", "   ", "Hello World"]
methods = ['isalpha', 'isdigit', 'isalnum', 'isspace', 'istitle', 'islower', 'isupper']

for test_str in test_strings[:3]:  # Test first 3 strings
    print(f"   '{test_str}':")
    for method in methods:
        result = getattr(test_str, method)()
        print(f"     {method}(): {result}")
    print()
```

### String Formatting and Manipulation

```python
print("String Formatting and Manipulation:")
print("=" * 35)

# String formatting methods
print("1. String Formatting Methods:")
name = "Alice"
age = 30
salary = 75000.50

# f-strings (Python 3.6+)
print(f"   f-string: Hello {name}, you are {age} years old.")
print(f"   f-string with formatting: Salary: ${salary:,.2f}")

# .format() method
print("   .format(): Hello {}, you are {} years old.".format(name, age))
print("   .format() with names: Hello {name}, age {age}".format(name=name, age=age))

# % formatting (old style)
print("   %% formatting: Hello %s, you are %d years old." % (name, age))

# String manipulation
print("\n2. String Manipulation:")
sentence = "  Hello, World! Welcome to Python programming.  "
print(f"   Original: '{sentence}'")
print(f"   strip(): '{sentence.strip()}'")
print(f"   replace(): '{sentence.replace('Python', 'Java')}'")
print(f"   split(): {sentence.split()}")

# String joining
print("\n3. String Joining:")
words = ["Python", "is", "awesome", "for", "programming"]
joined = " ".join(words)
joined_with_dash = "-".join(words)
print(f"   Words: {words}")
print(f"   Joined with space: '{joined}'")
print(f"   Joined with dash: '{joined_with_dash}'")

# String alignment and padding
print("\n4. Alignment and Padding:")
text = "Python"
print(f"   Original: '{text}'")
print(f"   ljust(10): '{text.ljust(10)}'")
print(f"   rjust(10): '{text.rjust(10)}'")
print(f"   center(10): '{text.center(10)}'")
print(f"   zfill(8): '{text.zfill(8)}'")

# Advanced string operations
print("\n5. Advanced Operations:")
data = "apple,banana,cherry;date|elderberry"
print(f"   Data: '{data}'")

# Split on any of several delimiters with a regular expression
import re
fruits = re.split(r'[,;|]', data)
print(f"   Split on multiple delimiters: {fruits}")

# String translation
translation_table = str.maketrans('aeiou', '12345')
translated = "hello world".translate(translation_table)
print(f"   Translation (vowels to numbers): '{translated}'")
```

### Basic Pattern Matching (Without Regular Expressions)

```python
print("\nBasic Pattern Matching:")
print("=" * 23)

# Simple pattern matching without regex
print("1. Simple Pattern Matching:")
email = "user@example.com"
print(f"   Email: '{email}'")
print(f"   Contains '@': {'@' in email}")
print(f"   Ends with '.com': {email.endswith('.com')}")
print(f"   Starts with 'user': {email.startswith('user')}")

# Find and count patterns
text = "The quick brown fox jumps over the lazy dog"
print(f"\n2. Pattern Counting:")
print(f"   Text: '{text}'")
print(f"   Count of substring 'the' (case-insensitive): {text.lower().count('the')}")
print(f"   Find 'fox': {text.find('fox')}")
print(f"   Find 'cat': {text.find('cat')}")  # Returns -1 if not found

# Basic validation functions
print("\n3. Basic Validation:")

def is_valid_email(email):
    """Basic email validation."""
    return '@' in email and '.' in email.split('@')[1]

def is_valid_phone(phone):
    """Basic phone validation (digits and dashes)."""
    cleaned = phone.replace('-', '').replace(' ', '')
    return cleaned.isdigit() and len(cleaned) >= 10

def extract_numbers(text):
    """Extract numbers from text."""
    numbers = []
    current_number = ""
    
    for char in text:
        if char.isdigit():
            current_number += char
        else:
            if current_number:
                numbers.append(int(current_number))
                current_number = ""
    
    if current_number:  # Don't forget the last number
        numbers.append(int(current_number))
    
    return numbers

# Test validation functions
test_emails = ["user@example.com", "invalid.email", "test@domain.org"]
test_phones = ["123-456-7890", "555 123 4567", "abc-def-ghij"]
test_text = "I have 5 apples and 10 oranges, cost $25 total."

print(f"   Email validation:")
for email in test_emails:
    print(f"     '{email}': {is_valid_email(email)}")

print(f"   Phone validation:")
for phone in test_phones:
    print(f"     '{phone}': {is_valid_phone(phone)}")

print(f"   Number extraction:")
print(f"     Text: '{test_text}'")
print(f"     Numbers: {extract_numbers(test_text)}")
```
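
The helpers above work with plain string methods. For comparison, here is a sketch of the same two tasks using the standard-library `re` module. `EMAIL_PATTERN` and `looks_like_email` are hypothetical names, and the pattern is deliberately loose: it is not a complete email validator.

```python
import re

test_text = "I have 5 apples and 10 oranges, cost $25 total."

# \d+ matches each run of one or more digits
numbers = [int(match) for match in re.findall(r"\d+", test_text)]
print(numbers)

# Loose pattern: some text, "@", some text, ".", some text (no spaces or extra "@")
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def looks_like_email(candidate):
    """Return True if the whole string matches the loose pattern above."""
    return EMAIL_PATTERN.fullmatch(candidate) is not None

for email in ["user@example.com", "invalid.email", "test@domain.org"]:
    print(f"{email!r}: {looks_like_email(email)}")
```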

---

## 6. Data Structure Comparison

### Performance Characteristics

```python
print("Data Structure Performance:")
print("=" * 27)

# Time complexity comparison
print("1. Time Complexity Comparison:")
complexity_table = {
    "Operation": ["Access", "Search", "Insert", "Delete"],
    "List": ["O(1)", "O(n)", "O(n)*", "O(n)*"],
    "Tuple": ["O(1)", "O(n)", "N/A", "N/A"],
    "Dict": ["O(1)", "O(1)", "O(1)", "O(1)"],
    "Set": ["N/A", "O(1)", "O(1)", "O(1)"]
}

# Column width 9 fits the longest header ("Operation") so rows line up
print("   " + " | ".join(f"{key:>9}" for key in complexity_table))
print("   " + "-" * 57)
for i in range(4):
    row = [f"{column[i]:>9}" for column in complexity_table.values()]
    print("   " + " | ".join(row))

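print("\n   * Insert/Delete at the end is amortized O(1); at an arbitrary position it is O(n)")
print("   Dict and Set figures are average case; Dict 'Access' means lookup by key")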

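# Memory usage comparison
# Note: sys.getsizeof() reports only the container's own size,
# not the size of the objects it references
print("\n2. Memory Usage:")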
import sys

sample_data = list(range(1000))
list_size = sys.getsizeof(sample_data)
tuple_size = sys.getsizeof(tuple(sample_data))
set_size = sys.getsizeof(set(sample_data))
dict_size = sys.getsizeof({i: i for i in sample_data})

print(f"   1000 integers:")
print(f"     List:  {list_size:,} bytes")
print(f"     Tuple: {tuple_size:,} bytes")
print(f"     Set:   {set_size:,} bytes")
print(f"     Dict:  {dict_size:,} bytes")

# When to use each data structure
print("\n3. When to Use Each:")
use_cases = {
    "List": [
        "Ordered collection with duplicates",
        "Need to modify elements frequently",
        "Index-based access required",
        "Stack operations (prefer collections.deque for queues)"
    ],
    "Tuple": [
        "Immutable ordered collection",
        "Fixed data that won't change",
        "Dictionary keys",
        "Function return values"
    ],
    "Dict": [
        "Key-value associations",
        "Fast lookups by key",
        "Mapping relationships",
        "Caching/memoization"
    ],
    "Set": [
        "Unique elements only",
        "Mathematical set operations",
        "Fast membership testing",
        "Remove duplicates"
    ]
}

for data_type, cases in use_cases.items():
    print(f"\n   {data_type}:")
    for case in cases:
        print(f"     • {case}")
```
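
One qualification to the list use cases above: a list works well as a stack, but removing from the front with `pop(0)` is O(n) because every remaining element shifts. For queues, the standard library offers `collections.deque`. A minimal sketch with illustrative values:

```python
from collections import deque

# Stack (last in, first out): a list is a good fit
stack = [1, 2, 3]
stack.append(4)
top = stack.pop()            # removes and returns 4

# Queue (first in, first out): deque removes from the left in O(1)
queue = deque(["a", "b", "c"])
queue.append("d")
first = queue.popleft()      # removes and returns "a"

print(f"Stack: {stack}, popped {top}")
print(f"Queue: {list(queue)}, popped {first}")
```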

### Conversion Between Data Structures

```python
print("\nData Structure Conversions:")
print("=" * 27)

# Sample data for conversion
original_list = [1, 2, 3, 2, 4, 3, 5]
print(f"Original list: {original_list}")

# Convert to different types
converted_tuple = tuple(original_list)
converted_set = set(original_list)  # Removes duplicates
converted_dict_enumerate = dict(enumerate(original_list))
converted_dict_zip = dict(zip(original_list, original_list))

print(f"\n1. Conversions from List:")
print(f"   To tuple: {converted_tuple}")
print(f"   To set: {converted_set}")
print(f"   To dict (enumerate): {converted_dict_enumerate}")
print(f"   To dict (zip with self): {converted_dict_zip}")

# Convert back to list
print(f"\n2. Converting Back to List:")
print(f"   From tuple: {list(converted_tuple)}")
print(f"   From set: {list(converted_set)}")
print(f"   From dict keys: {list(converted_dict_enumerate.keys())}")
print(f"   From dict values: {list(converted_dict_enumerate.values())}")

# Practical conversion examples
print(f"\n3. Practical Examples:")

# Remove duplicates while preserving order
def remove_duplicates_ordered(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# Count items using dict
def count_items(lst):
    counts = {}
    for item in lst:
        counts[item] = counts.get(item, 0) + 1
    return counts

# Find unique items (order is not preserved)
def find_unique_items(lst):
    return list(set(lst))

test_list = ['a', 'b', 'c', 'a', 'b', 'd']
print(f"   Test list: {test_list}")
print(f"   Remove duplicates (ordered): {remove_duplicates_ordered(test_list)}")
print(f"   Count items: {count_items(test_list)}")
print(f"   Unique items: {find_unique_items(test_list)}")
```
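
Because dictionaries preserve insertion order (guaranteed since Python 3.7), `dict.fromkeys()` offers a compact alternative to the loop-based `remove_duplicates_ordered` above. This sketch uses an illustrative list and contrasts it with sorting a set, which yields a predictable but different order.

```python
test_list = ['b', 'a', 'c', 'a', 'b', 'd']

# Keys are unique and keep first-seen order
unique_ordered = list(dict.fromkeys(test_list))
print(f"First-seen order: {unique_ordered}")

# Sorting a set gives a deterministic order, but not the original one
unique_sorted = sorted(set(test_list))
print(f"Sorted order:     {unique_sorted}")
```

Both approaches require the items to be hashable.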

---

## 7. Nested Data Structures

### Complex Data Structures

```python
print("Nested Data Structures:")
print("=" * 23)

# Complex nested structure - Student Management System
print("1. Student Management System:")
school_data = {
    "school_name": "Python Academy",
    "students": [
        {
            "id": 1,
            "name": "Alice Johnson",
            "grades": {"math": 95, "science": 87, "english": 92},
            "attendance": [True, True, False, True, True],
            "extracurricular": ["debate", "chess"]
        },
        {
            "id": 2,
            "name": "Bob Smith",
            "grades": {"math": 78, "science": 92, "english": 85},
            "attendance": [True, True, True, False, True],
            "extracurricular": ["basketball", "drama"]
        },
        {
            "id": 3,
            "name": "Charlie Brown",
            "grades": {"math": 88, "science": 79, "english": 91},
            "attendance": [False, True, True, True, True],
            "extracurricular": ["music", "debate", "chess"]
        }
    ],
    "teachers": {
        "math": "Dr. Adams",
        "science": "Prof. Baker",
        "english": "Ms. Clark"
    }
}

print(f"   School: {school_data['school_name']}")
print(f"   Number of students: {len(school_data['students'])}")

# Accessing nested data
print(f"\n2. Accessing Nested Data:")
first_student = school_data["students"][0]
print(f"   First student: {first_student['name']}")
print(f"   Math grade: {first_student['grades']['math']}")
print(f"   Extracurricular count: {len(first_student['extracurricular'])}")

# Processing nested data
print(f"\n3. Processing Nested Data:")

def calculate_average_grade(student):
    """Calculate average grade for a student."""
    grades = student["grades"]
    return sum(grades.values()) / len(grades)

def calculate_attendance_rate(student):
    """Calculate attendance rate for a student."""
    attendance = student["attendance"]
    return sum(attendance) / len(attendance) * 100

# Process all students
print(f"   Student Performance Report:")
for student in school_data["students"]:
    avg_grade = calculate_average_grade(student)
    attendance_rate = calculate_attendance_rate(student)
    print(f"     {student['name']}:")
    print(f"       Average Grade: {avg_grade:.1f}")
    print(f"       Attendance: {attendance_rate:.1f}%")
    print(f"       Activities: {', '.join(student['extracurricular'])}")

# Advanced data manipulation
print(f"\n4. Advanced Data Analysis:")

# Find students by subject performance
def find_top_performers(data, subject, threshold=90):
    """Find students performing above threshold in a subject."""
    top_performers = []
    for student in data["students"]:
        if student["grades"].get(subject, 0) >= threshold:
            top_performers.append(student["name"])
    return top_performers

# Get activity participation
def get_activity_participation(data):
    """Count participation in each activity."""
    activity_count = {}
    for student in data["students"]:
        for activity in student["extracurricular"]:
            activity_count[activity] = activity_count.get(activity, 0) + 1
    return activity_count

math_top_performers = find_top_performers(school_data, "math", 90)
activity_stats = get_activity_participation(school_data)

print(f"   Top Math Performers (90+): {math_top_performers}")
print(f"   Activity Participation: {activity_stats}")

# Class statistics
total_students = len(school_data["students"])
all_math_grades = [s["grades"]["math"] for s in school_data["students"]]
class_math_average = sum(all_math_grades) / len(all_math_grades)

print(f"   Class Statistics:")
print(f"     Total Students: {total_students}")
print(f"     Math Class Average: {class_math_average:.1f}")
print(f"     Math Grade Range: {min(all_math_grades)} - {max(all_math_grades)}")
```
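
The class statistics above cover math only. The same nested structure can be regrouped to get an average for every subject and a ranking of students. The sketch below uses a trimmed copy of the student records (names and grades only) so that it runs on its own; `grades_by_subject`, `subject_averages`, and `ranking` are names introduced here.

```python
# Trimmed copy of the student records: names and grades only
students = [
    {"name": "Alice Johnson", "grades": {"math": 95, "science": 87, "english": 92}},
    {"name": "Bob Smith", "grades": {"math": 78, "science": 92, "english": 85}},
    {"name": "Charlie Brown", "grades": {"math": 88, "science": 79, "english": 91}},
]

# Collect every grade under its subject
grades_by_subject = {}
for student in students:
    for subject, grade in student["grades"].items():
        grades_by_subject.setdefault(subject, []).append(grade)

subject_averages = {
    subject: sum(grades) / len(grades)
    for subject, grades in grades_by_subject.items()
}

# Rank students by their own average grade, highest first
ranking = sorted(
    students,
    key=lambda s: sum(s["grades"].values()) / len(s["grades"]),
    reverse=True,
)

for subject, average in subject_averages.items():
    print(f"{subject}: {average:.1f}")
print([s["name"] for s in ranking])
```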

### Working with JSON-like Structures

```python
print("\nJSON-like Data Structures:")
print("=" * 26)

# API response simulation
print("1. API Response Structure:")
api_response = {
    "status": "success",
    "data": {
        "users": [
            {
                "id": 101,
                "profile": {
                    "username": "alice123",
                    "email": "alice@example.com",
                    "preferences": {
                        "theme": "dark",
                        "notifications": True,
                        "language": "en"
                    }
                },
                "posts": [
                    {"id": 1, "title": "Hello World", "likes": 15},
                    {"id": 2, "title": "Python Tips", "likes": 42}
                ]
            },
            {
                "id": 102,
                "profile": {
                    "username": "bob456",
                    "email": "bob@example.com",
                    "preferences": {
                        "theme": "light",
                        "notifications": False,
                        "language": "es"
                    }
                },
                "posts": [
                    {"id": 3, "title": "Data Science", "likes": 28}
                ]
            }
        ]
    },
    "pagination": {
        "page": 1,
        "per_page": 10,
        "total": 2
    }
}

print(f"   Status: {api_response['status']}")
print(f"   Total users: {api_response['pagination']['total']}")

# Safe data access with nested dictionaries
print(f"\n2. Safe Data Access:")

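def safe_get(data, *keys, default=None):
    """Safely get a value from nested dictionaries and lists."""
    for key in keys:
        if isinstance(data, dict) and key in data:
            data = data[key]
        elif isinstance(data, (list, tuple)) and isinstance(key, int) and -len(data) <= key < len(data):
            # Integer keys index into sequences, e.g. the 0 after "users"
            data = data[key]
        else:
            return default
    return data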

# Examples of safe access
user_theme = safe_get(api_response, "data", "users", 0, "profile", "preferences", "theme")
invalid_path = safe_get(api_response, "data", "users", 0, "invalid", "path", default="Not found")

print(f"   First user theme: {user_theme}")
print(f"   Invalid path result: {invalid_path}")

# Data transformation
print(f"\n3. Data Transformation:")

def extract_user_summaries(response):
    """Extract user summaries from API response."""
    summaries = []
    users = safe_get(response, "data", "users", default=[])
    
    for user in users:
        summary = {
            "username": safe_get(user, "profile", "username"),
            "email": safe_get(user, "profile", "email"),
            "post_count": len(safe_get(user, "posts", default=[])),
            "total_likes": sum(post.get("likes", 0) for post in safe_get(user, "posts", default=[])),
            "theme": safe_get(user, "profile", "preferences", "theme")
        }
        summaries.append(summary)
    
    return summaries

user_summaries = extract_user_summaries(api_response)
print(f"   User Summaries:")
for summary in user_summaries:
    print(f"     {summary['username']}: {summary['post_count']} posts, {summary['total_likes']} likes")

# Flattening nested structures
print(f"\n4. Flattening Nested Data:")

def flatten_dict(d, parent_key='', sep='_'):
    """Flatten a nested dictionary."""
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

# Flatten first user's profile
first_user_profile = api_response["data"]["users"][0]["profile"]
flattened_profile = flatten_dict(first_user_profile)

print(f"   Original profile structure:")
print(f"     {first_user_profile}")
print(f"   Flattened profile:")
for key, value in flattened_profile.items():
    print(f"     {key}: {value}")
```
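
Nested dictionaries and lists like these map directly onto JSON text through the standard `json` module. The sketch below assumes a small hypothetical `payload` (much smaller than `api_response`) and shows one round trip, plus two cases where the round trip changes the data.

```python
import json

# Hypothetical payload; the field names are illustrative only
payload = {
    "status": "success",
    "data": {"users": [{"id": 101, "active": True, "nickname": None}]},
}

text = json.dumps(payload, indent=2)  # Python objects -> JSON string
restored = json.loads(text)           # JSON string -> Python objects

print(text)
print(f"Round trip preserved the data: {restored == payload}")

# Tuples become lists and non-string keys become strings
lossy = json.loads(json.dumps({"point": (1, 2), 7: "seven"}))
print(f"Lossy round trip: {lossy}")
```

JSON has no tuple type and only allows string keys, so structures that rely on either need explicit conversion after loading.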

---

## 8. Performance and Best Practices

### Performance Tips

```python
print("Performance Tips and Best Practices:")
print("=" * 37)

import time

# List vs Set membership testing
print("1. Membership Testing Performance:")

def time_membership_test(container, search_items):
    """Time membership testing for a container."""
    # perf_counter() is a high-resolution clock meant for measuring short intervals
    start_time = time.perf_counter()
    for item in search_items:
        _ = item in container
    end_time = time.perf_counter()
    return end_time - start_time

# Create test data
large_data = list(range(10000))
search_items = [100, 5000, 9999, 15000]  # Some exist, some don't

list_time = time_membership_test(large_data, search_items)
set_time = time_membership_test(set(large_data), search_items)

print(f"   Testing membership for {len(search_items)} items in {len(large_data)} elements:")
print(f"   List: {list_time:.6f} seconds")
print(f"   Set:  {set_time:.6f} seconds")
print(f"   Set is ~{list_time/set_time:.1f}x faster" if set_time > 0 else "   Set is much faster")

# Dictionary lookup vs list search
print(f"\n2. Lookup Performance:")

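# Create lookup structures (synthetic, unique names so the dictionary keeps every entry)
student_list = [(f"student_{i}", 50 + i % 50) for i in range(3000)]
student_dict = dict(student_list)

def find_in_list(lst, target_name):
    """Linear search: checks pairs one by one until the name matches."""
    for name, grade in lst:
        if name == target_name:
            return grade
    return None

def find_in_dict(dct, target_name):
    """Hash lookup: does not scan the other entries."""
    return dct.get(target_name)

target = "student_2999"  # Last entry, the slowest case for the list search

start_time = time.perf_counter()
list_result = find_in_list(student_list, target)
list_lookup_time = time.perf_counter() - start_time

start_time = time.perf_counter()
dict_result = find_in_dict(student_dict, target)
dict_lookup_time = time.perf_counter() - start_time

print(f"   List search:       {list_lookup_time:.6f} seconds (grade {list_result})")
print(f"   Dictionary lookup: {dict_lookup_time:.6f} seconds (grade {dict_result})")
print(f"   Use dictionaries for frequent key-based lookups")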

# Memory-efficient practices
print(f"\n3. Memory-Efficient Practices:")

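# Generator vs list
import sys

def squares_list(n):
    return [x**2 for x in range(n)]

def squares_generator(n):
    return (x**2 for x in range(n))

# getsizeof() reports the size of the container object only, not the items it refers to
print(f"   List of 10,000 squares:      {sys.getsizeof(squares_list(10_000)):,} bytes")
print(f"   Generator of 10,000 squares: {sys.getsizeof(squares_generator(10_000)):,} bytes")
print(f"   List comprehension creates all items in memory")
print(f"   Generator expression creates items on demand")
print(f"   Use generators for large datasets or when you don't need all items at once")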

# String concatenation
print(f"\n4. String Concatenation Best Practices:")

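# Inefficient way: repeated += may copy the growing string on each iteration
def inefficient_concat(words):
    result = ""
    for word in words:
        result += word + " "
    return result.strip()

# Efficient way: join() builds the result in a single pass
def efficient_concat(words):
    return " ".join(words)

test_words = ["Python", "is", "awesome", "for", "data", "processing"] * 100

same_result = inefficient_concat(test_words) == efficient_concat(test_words)
print(f"   Both functions return the same string: {same_result}")
print(f"   Use join() instead of += for multiple string concatenations")
print(f"   join() is typically more efficient when combining many strings")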

# Best practices summary
print(f"\n5. General Best Practices:")
best_practices = [
    "Use sets for membership testing and removing duplicates",
    "Use dictionaries for key-value mappings and fast lookups",
    "Use list comprehensions for simple transformations",
    "Use generators for large datasets or streaming data",
    "Choose tuples for immutable sequences",
    "Use join() for string concatenation",
    "Consider memory usage for large datasets",
    "Profile your code to identify bottlenecks"
]

for i, practice in enumerate(best_practices, 1):
    print(f"   {i}. {practice}")
```
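
Single-shot timings like the ones above can be noisy, because each measured interval is very short. One common alternative is the standard `timeit` module, which runs a statement many times. The sketch below is an illustration under assumed sizes (`n` and `number` are arbitrary choices), not a rigorous benchmark.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # Not present, so the list search must compare every element

# Each call to timeit() runs the membership test 1,000 times
list_time = timeit.timeit(lambda: missing in as_list, number=1_000)
set_time = timeit.timeit(lambda: missing in as_set, number=1_000)

print(f"List membership: {list_time:.4f} seconds for 1,000 checks")
print(f"Set membership:  {set_time:.4f} seconds for 1,000 checks")
```

The absolute numbers depend on the machine and Python version. The gap between the two lines is the part that matters.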

---

## 9. Real-World Applications

### Data Processing Examples

```python
print("Real-World Data Processing:")
print("=" * 28)

# Example 1: Sales Data Analysis
print("1. Sales Data Analysis:")
sales_data = [
    {"date": "2023-01-01", "product": "laptop", "quantity": 2, "price": 1200.00, "region": "north"},
    {"date": "2023-01-01", "product": "mouse", "quantity": 5, "price": 25.00, "region": "south"},
    {"date": "2023-01-02", "product": "laptop", "quantity": 1, "price": 1200.00, "region": "north"},
    {"date": "2023-01-02", "product": "keyboard", "quantity": 3, "price": 75.00, "region": "east"},
    {"date": "2023-01-03", "product": "mouse", "quantity": 8, "price": 25.00, "region": "west"},
    {"date": "2023-01-03", "product": "laptop", "quantity": 1, "price": 1200.00, "region": "south"}
]

# Calculate total revenue by product
product_revenue = {}
for sale in sales_data:
    product = sale["product"]
    revenue = sale["quantity"] * sale["price"]
    product_revenue[product] = product_revenue.get(product, 0) + revenue

print(f"   Product Revenue:")
for product, revenue in sorted(product_revenue.items(), key=lambda x: x[1], reverse=True):
    print(f"     {product.title()}: ${revenue:,.2f}")

# Sales by region
region_sales = {}
for sale in sales_data:
    region = sale["region"]
    revenue = sale["quantity"] * sale["price"]
    if region not in region_sales:
        region_sales[region] = {"revenue": 0, "units": 0}
    region_sales[region]["revenue"] += revenue
    region_sales[region]["units"] += sale["quantity"]

print(f"   Sales by Region:")
for region, data in region_sales.items():
    print(f"     {region.title()}: ${data['revenue']:,.2f} ({data['units']} units)")

# Example 2: Text Analysis
print(f"\n2. Text Analysis:")
text_corpus = [
    "Python is a powerful programming language",
    "Data science with Python is amazing",
    "Machine learning and Python go hand in hand",
    "Python programming is fun and productive",
    "Web development with Python is efficient"
]

# Word frequency analysis
word_freq = {}
all_words = []

for sentence in text_corpus:
    words = sentence.lower().split()
    for word in words:
        # Remove basic punctuation
        clean_word = word.strip('.,!?')
        if clean_word:
            all_words.append(clean_word)
            word_freq[clean_word] = word_freq.get(clean_word, 0) + 1

# Most common words
common_words = sorted(word_freq.items(), key=lambda x: x[1], reverse=True)
print(f"   Most common words:")
for word, count in common_words[:5]:
    print(f"     '{word}': {count} times")

# Unique words per sentence
unique_words_per_sentence = []
for sentence in text_corpus:
    words = {word.lower().strip('.,!?') for word in sentence.split()}
    unique_words_per_sentence.append(len(words))

print(f"   Unique words per sentence: {unique_words_per_sentence}")
print(f"   Average unique words: {sum(unique_words_per_sentence) / len(unique_words_per_sentence):.1f}")

# Example 3: Inventory Management
print(f"\n3. Inventory Management:")
inventory = {
    "electronics": {
        "laptop": {"stock": 15, "price": 1200, "reorder_level": 5},
        "mouse": {"stock": 50, "price": 25, "reorder_level": 10},
        "keyboard": {"stock": 3, "price": 75, "reorder_level": 8}
    },
    "books": {
        "python_guide": {"stock": 25, "price": 45, "reorder_level": 10},
        "data_science": {"stock": 8, "price": 60, "reorder_level": 5}
    }
}

# Find items that need reordering
def find_reorder_items(inventory):
    reorder_needed = []
    for category, items in inventory.items():
        for item, details in items.items():
            if details["stock"] <= details["reorder_level"]:
                reorder_needed.append({
                    "category": category,
                    "item": item,
                    "current_stock": details["stock"],
                    "reorder_level": details["reorder_level"]
                })
    return reorder_needed

# Calculate total inventory value
def calculate_inventory_value(inventory):
    total_value = 0
    category_values = {}
    
    for category, items in inventory.items():
        category_value = 0
        for item, details in items.items():
            item_value = details["stock"] * details["price"]
            category_value += item_value
            total_value += item_value
        category_values[category] = category_value
    
    return total_value, category_values

reorder_items = find_reorder_items(inventory)
total_value, category_values = calculate_inventory_value(inventory)

print(f"   Items needing reorder:")
for item in reorder_items:
    print(f"     {item['category']}/{item['item']}: {item['current_stock']} (reorder at {item['reorder_level']})")

print(f"   Inventory Value:")
print(f"     Total: ${total_value:,}")
for category, value in category_values.items():
    print(f"     {category.title()}: ${value:,}")

print(f"\n" + "="*60)
print("Data Structures Course Complete!")
print("="*60)

print("\nKey Skills Acquired:")
skills = [
    "Working with Lists, Tuples, Dictionaries, and Sets",
    "Understanding when to use each data structure",
    "Performance optimization techniques",
    "Nested data structure manipulation",
    "Real-world data processing applications",
    "Memory management best practices"
]

for i, skill in enumerate(skills, 1):
    print(f"{i}. {skill}")

print(f"\nNext Steps:")
next_steps = [
    "Practice with larger datasets",
    "Learn about advanced data structures (heaps, trees)",
    "Explore libraries like pandas for data analysis",
    "Study algorithms and their complexity",
    "Build real projects using these data structures"
]

for i, step in enumerate(next_steps, 1):
    print(f"{i}. {step}")

print(f"\nRecommended Practice:")
print("• Build a contact management system using dictionaries")
print("• Create a text analyzer using string methods and sets")
print("• Implement a simple inventory system")
print("• Process CSV-like data using lists and dictionaries")
print("• Create a simple grade calculator")

print("\nCongratulations on mastering Python Data Structures!")
```
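
The aggregation loops in the sales example follow a "group and accumulate" pattern that `collections.defaultdict` and `collections.Counter` express without explicit key checks. This is a minimal sketch, assuming a hypothetical `sample_sales` list in the same shape as `sales_data`; the records are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records shaped like sales_data; values are illustrative only
sample_sales = [
    {"product": "laptop", "quantity": 2, "price": 1200.0, "region": "north"},
    {"product": "mouse", "quantity": 5, "price": 25.0, "region": "south"},
    {"product": "laptop", "quantity": 1, "price": 1200.0, "region": "south"},
]

revenue_by_product = defaultdict(float)  # Missing keys start at 0.0
units_by_region = Counter()              # Missing keys start at 0

for sale in sample_sales:
    revenue_by_product[sale["product"]] += sale["quantity"] * sale["price"]
    units_by_region[sale["region"]] += sale["quantity"]

print(f"Revenue by product: {dict(revenue_by_product)}")
print(f"Busiest region: {units_by_region.most_common(1)}")
```

`most_common(1)` replaces the `sorted(..., key=lambda x: x[1], reverse=True)` idiom when only the top entries are needed.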

---

## Summary

This comprehensive notebook covered Python's core data structures:

### **Lists**
- Dynamic, ordered collections
- Support indexing, slicing, and modification
- Rich set of methods (append, extend, remove, etc.)
- List comprehensions for efficient creation

### **Tuples**
- Immutable, ordered collections
- Well suited to fixed records and returning multiple values from a function
- Support unpacking and multiple assignment
- Can be used as dictionary keys when all of their elements are hashable
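
As a brief recap of the tuple points above, the sketch below uses a hypothetical `grid` dictionary keyed by `(row, column)` tuples and a hypothetical `min_max` helper that returns two values at once.

```python
# Hypothetical grid keyed by (row, column) tuples
grid = {(0, 0): "start", (2, 3): "goal"}
print(f"Cell (2, 3): {grid[(2, 3)]}")

def min_max(values):
    """Return two results at once as a tuple."""
    return min(values), max(values)

low, high = min_max([4, 9, 1])  # Tuple unpacking
print(f"Low: {low}, High: {high}")

# Lists are mutable and therefore unhashable, so they cannot be keys
try:
    bad_grid = {[0, 0]: "start"}
except TypeError as exc:
    print(f"Lists cannot be dictionary keys: {exc}")
```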

### **Dictionaries**
- Key-value pairs with fast, average constant-time lookups by key
- Highly efficient for data mapping
- Support comprehensions and various methods
- Essential for JSON-like data structures

### **Sets**
- Unordered collections of unique elements
- Support mathematical set operations (union, intersection, difference)
- Fast membership testing
- Convenient for removing duplicates (the original order is not preserved)
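
As a brief recap of the set points above, the sketch below uses hypothetical enrollment sets. It also shows `dict.fromkeys()` as one way to remove duplicates when the original order matters, since a plain `set()` does not keep it.

```python
# Hypothetical enrollment data
enrolled_math = {"Ana", "Ben", "Cy"}
enrolled_art = {"Ben", "Dee"}

both = enrolled_math & enrolled_art       # Intersection
either = enrolled_math | enrolled_art     # Union
only_math = enrolled_math - enrolled_art  # Difference

print(f"Both: {both}")
print(f"Either: {sorted(either)}")
print(f"Only math: {sorted(only_math)}")

# Removing duplicates while keeping first-seen order
names = ["Ben", "Ana", "Ben", "Cy", "Ana"]
unique_in_order = list(dict.fromkeys(names))
print(f"Unique, in order: {unique_in_order}")
```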

### **Strings**
- Immutable text sequences
- Rich methods for manipulation
- Support various formatting techniques
- Essential for text processing

### **Key Takeaways:**
1. Choose the right data structure for your use case
2. Consider performance implications
3. Use comprehensions for cleaner code
4. Understand memory usage patterns
5. Practice with real-world examples

Continue practicing these concepts and explore more advanced topics like object-oriented programming!