# Episode 4: Lists and Data Structures

Lists are one of Python's most versatile data structures. In this notebook, we'll learn how to store, access, and manipulate collections of inflammation data using lists and other data structures.

## Learning Objectives
- Create and manipulate Python lists
- Understand list indexing and slicing
- Work with nested lists for multi-dimensional data
- Use list methods and operations
- Explore tuples, sets, and dictionaries

## Introduction

NumPy arrays are excellent for numerical computation, but Python's built-in data structures remain essential for general data manipulation and for working with mixed data types.

## 1. Creating Lists

A list stores an ordered collection of items. The items can all share one type or mix several types:

In [None]:
# Simple lists
patient_ids = ['P001', 'P002', 'P003', 'P004', 'P005']
ages = [25, 34, 45, 56, 23]
inflammation_day1 = [0.0, 1.2, 2.1, 0.5, 1.8]

print("Patient IDs:", patient_ids)
print("Ages:", ages)
print("Day 1 inflammation:", inflammation_day1)
print("\nList properties:")
print("patient_ids type:", type(patient_ids))
print("Number of patients:", len(patient_ids))

In [None]:
# Mixed data types in lists
patient_info = ['P001', 25, 70.5, True, 'Alice Johnson']
print("Mixed patient info:", patient_info)
print("ID:", patient_info[0])
print("Age:", patient_info[1])
print("Weight:", patient_info[2])
print("Has inflammation:", patient_info[3])
print("Name:", patient_info[4])
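
Lists do not have to be typed out item by item. The sketch below shows a few common ways to build one programmatically; the variable names (`empty_readings`, `day_numbers`, `baseline`) are illustrative assumptions and not part of the lesson's data.

```python
# Illustrative names only; none of these appear elsewhere in the lesson
empty_readings = []                  # start empty, fill later with append()
day_numbers = list(range(1, 8))      # days 1 through 7
baseline = [0.0] * 5                 # five placeholder readings

print("Empty list:", empty_readings)
print("Day numbers:", day_numbers)
print("Baseline:", baseline)
```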

## 2. List Indexing and Slicing

Access individual elements or ranges of elements:

In [None]:
# Indexing (remember: 0-based!)
inflammation_readings = [0.0, 1.5, 3.2, 4.1, 2.8, 1.9, 0.8, 0.0]

print("All readings:", inflammation_readings)
print("First reading (day 0):", inflammation_readings[0])
print("Fourth reading (day 3):", inflammation_readings[3])
print("Last reading:", inflammation_readings[-1])
print("Second to last:", inflammation_readings[-2])

In [None]:
# Slicing
print("First 3 readings:", inflammation_readings[:3])
print("Last 3 readings:", inflammation_readings[-3:])
print("Middle readings (days 2-5):", inflammation_readings[2:6])
print("Every other reading:", inflammation_readings[::2])
print("Readings in reverse:", inflammation_readings[::-1])
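
Indexing and slicing behave differently at the edges of a list, which is easy to miss. A minimal sketch, using a shortened sample list (`readings`, assumed here for illustration):

```python
readings = [0.0, 1.5, 3.2, 4.1]   # shortened sample, assumed for illustration

# A slice that runs past the end is simply truncated
print("Slice past the end:", readings[2:10])

# Indexing past the end raises IndexError
try:
    readings[10]
except IndexError as e:
    print("Indexing past the end fails:", e)

# A slice is a new list, so editing it leaves the original alone
first_two = readings[:2]
first_two[0] = 99.9
print("Slice after edit:", first_two)
print("Original unchanged:", readings)
```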

### Exercise 4.1
Given the patient ages list, extract:
1. Ages of the first three patients
2. Ages of patients older than 30
3. The youngest and oldest patient ages

In [None]:
# Exercise 4.1 - Your code here
ages = [25, 34, 45, 56, 23]
# 1. First three patients

# 2. Ages > 30 (hint: use a list comprehension or loop)

# 3. Youngest and oldest

## 3. Modifying Lists

Lists are mutable, so you can change their contents in place:

In [None]:
# Modifying individual elements
inflammation_copy = inflammation_readings.copy()  # Make a copy to preserve original
print("Original:", inflammation_copy)

inflammation_copy[0] = 0.1  # Correct a measurement error
inflammation_copy[-1] = 0.05  # Update final reading
print("After corrections:", inflammation_copy)
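
The call to `.copy()` matters because plain assignment does not duplicate a list; it only gives the same list a second name. The following sketch contrasts the two, with sample values and variable names assumed for illustration:

```python
original = [0.0, 1.5, 3.2]      # assumed sample values

alias = original                # same list, second name
independent = original.copy()   # new list with the same items

alias[0] = 0.1                  # also changes `original`
independent[1] = 9.9            # does not change `original`

print("original:", original)
print("alias is original:", alias is original)
print("independent:", independent)
```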

In [None]:
# Adding elements
new_readings = [0.0, 1.2, 2.5]
print("Starting with:", new_readings)

new_readings.append(3.1)  # Add one element
print("After append:", new_readings)

new_readings.extend([2.8, 1.5, 0.5])  # Add multiple elements
print("After extend:", new_readings)

new_readings.insert(2, 1.8)  # Insert at specific position
print("After insert at position 2:", new_readings)

In [None]:
# Removing elements
readings_to_modify = [0.0, 1.2, 999.0, 2.5, 3.1, 2.8]  # 999.0 is an error
print("With error:", readings_to_modify)

readings_to_modify.remove(999.0)  # Remove specific value
print("Error removed:", readings_to_modify)

last_reading = readings_to_modify.pop()  # Remove and return last element
print("Last reading removed:", last_reading)
print("Remaining:", readings_to_modify)

del readings_to_modify[1]  # Delete by index
print("After deleting index 1:", readings_to_modify)
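
Two details of `remove()` are worth knowing: it deletes only the first matching value, and it raises `ValueError` when the value is not present. A small sketch, assuming a sample list with two error codes:

```python
readings = [0.0, 999.0, 1.2, 999.0, 2.5]   # assumed sample with two error codes

readings_once = readings.copy()
readings_once.remove(999.0)     # removes only the first match
print("After one remove():", readings_once)

# Repeat until no match is left
cleaned = readings.copy()
while 999.0 in cleaned:
    cleaned.remove(999.0)
print("All errors removed:", cleaned)

# remove() raises ValueError when the value is absent
try:
    cleaned.remove(999.0)
except ValueError as e:
    print("Nothing left to remove:", e)
```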

## 4. List Methods and Operations

Useful built-in functions and methods for working with lists:

In [None]:
# Summary statistics with built-in functions
daily_readings = [2.1, 3.5, 1.8, 4.2, 5.1, 2.8, 1.9]

print("Daily readings:", daily_readings)
print("Length:", len(daily_readings))
print("Sum:", sum(daily_readings))
print("Average:", sum(daily_readings) / len(daily_readings))
print("Minimum:", min(daily_readings))
print("Maximum:", max(daily_readings))
print("Sorted:", sorted(daily_readings))
print("Original (unchanged):", daily_readings)

In [None]:
# List methods
patient_list = ['Alice', 'Bob', 'Carol', 'David', 'Alice']

print("Patient list:", patient_list)
print("Count of 'Alice':", patient_list.count('Alice'))
print("Index of 'Carol':", patient_list.index('Carol'))

# Reverse in place
patient_list.reverse()
print("Reversed:", patient_list)

# Sort in place
patient_list.sort()
print("Sorted:", patient_list)
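
The difference between `list.sort()` and `sorted()` is a common source of bugs: `sort()` changes the list in place and returns `None`, while `sorted()` returns a new list. The sketch below also shows the `key` and `reverse` arguments; the sample names are assumed for illustration.

```python
names = ['carol', 'Bob', 'alice']   # assumed sample names

result = names.sort()               # sorts in place and returns None
print("Return value of sort():", result)
print("Default order (uppercase first):", names)

# sorted() returns a new list; key and reverse customize the order
print("Case-insensitive:", sorted(names, key=str.lower))
print("Descending:", sorted(names, reverse=True))
```

Writing `names = names.sort()` would therefore replace the list with `None`.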

## 5. List Comprehensions

List comprehensions build a new list from an existing iterable in a single expression, optionally filtering items along the way:

In [None]:
# Basic list comprehensions
celsius_temps = [36.5, 37.2, 38.1, 36.8, 37.5]

# Convert to Fahrenheit
fahrenheit_temps = [(temp * 9/5) + 32 for temp in celsius_temps]
print("Celsius:", celsius_temps)
print("Fahrenheit:", fahrenheit_temps)

# Square all inflammation readings
squared_readings = [reading**2 for reading in daily_readings]
print("Original readings:", daily_readings)
print("Squared readings:", squared_readings)

In [None]:
# Conditional list comprehensions
all_ages = [25, 34, 45, 56, 23, 67, 42, 38, 51]

# Filter adults only (every age in this sample is ≥ 18, so nothing is dropped)
adults = [age for age in all_ages if age >= 18]
print("All ages:", all_ages)
print("Adults (≥18):", adults)

# Filter high inflammation readings
all_inflammation = [0.5, 3.2, 1.8, 7.5, 2.1, 8.8, 1.2, 9.3, 4.1]
high_inflammation = [reading for reading in all_inflammation if reading > 5.0]
print("All inflammation:", all_inflammation)
print("High inflammation (>5.0):", high_inflammation)

# Transform and filter
severe_cases = [reading * 2 for reading in all_inflammation if reading > 7.0]
print("Severe cases (>7.0) doubled:", severe_cases)
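
It can help to read a comprehension as a compact `for` loop. As a sketch, here is the "transform and filter" comprehension written out as an explicit loop; `severe_loop` and `severe_comp` are names introduced only for this comparison.

```python
all_inflammation = [0.5, 3.2, 1.8, 7.5, 2.1, 8.8, 1.2, 9.3, 4.1]

# Loop version of: [reading * 2 for reading in all_inflammation if reading > 7.0]
severe_loop = []
for reading in all_inflammation:
    if reading > 7.0:                     # the trailing `if` of the comprehension
        severe_loop.append(reading * 2)   # the leading expression

severe_comp = [reading * 2 for reading in all_inflammation if reading > 7.0]
print("Loop result:", severe_loop)
print("Same as comprehension:", severe_loop == severe_comp)
```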

### Exercise 4.2
Create list comprehensions for:
1. Patient IDs that start with 'P' (from a mixed list)
2. BMI values from weights and heights
3. Categorize inflammation as 'Low', 'Medium', or 'High'

In [None]:
# Exercise 4.2 - Your code here
mixed_ids = ['P001', 'X002', 'P003', 'Y004', 'P005']
weights = [70.5, 65.2, 80.1, 55.8, 75.3]  # kg
heights = [1.75, 1.62, 1.80, 1.55, 1.78]  # meters
inflammation_levels = [1.2, 4.5, 8.2, 2.1, 6.7]

# 1. Patient IDs starting with 'P'

# 2. BMI values (weight / height²)

# 3. Categorize inflammation: <3=Low, 3-6=Medium, >6=High

## 6. Nested Lists

Lists can contain other lists, which is useful for multi-dimensional data:

In [None]:
# Patient data: each inner list is [id, age, weight, height]
patients_data = [
    ['P001', 25, 70.5, 1.75],
    ['P002', 34, 65.2, 1.62],
    ['P003', 45, 80.1, 1.80],
    ['P004', 56, 55.8, 1.55],
    ['P005', 23, 75.3, 1.78]
]

print("All patient data:")
for patient in patients_data:
    print(f"  {patient}")

print(f"\nFirst patient: {patients_data[0]}")
print(f"First patient ID: {patients_data[0][0]}")
print(f"Third patient age: {patients_data[2][1]}")

In [None]:
# Inflammation data matrix: patients × days
inflammation_matrix = [
    [0.0, 1.5, 3.2, 4.1, 2.8, 1.9, 0.8],  # Patient 1
    [0.5, 2.1, 3.8, 5.2, 3.1, 2.0, 0.5],  # Patient 2
    [0.0, 1.0, 2.5, 3.5, 2.2, 1.5, 0.3]   # Patient 3
]

print("Inflammation matrix:")
for i, patient_data in enumerate(inflammation_matrix):
    print(f"  Patient {i+1}: {patient_data}")

# Calculate statistics
print(f"\nPatient 1 average: {sum(inflammation_matrix[0]) / len(inflammation_matrix[0]):.2f}")

# Daily averages across all patients
num_days = len(inflammation_matrix[0])
daily_averages = []
for day in range(num_days):
    day_sum = sum(patient_data[day] for patient_data in inflammation_matrix)
    daily_averages.append(day_sum / len(inflammation_matrix))

print(f"Daily averages: {[f'{avg:.2f}' for avg in daily_averages]}")
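
The per-day loop can also be written with `zip(*inflammation_matrix)`, which regroups the rows by column. This is offered as an alternative sketch, not a replacement for the loop, and it assumes every patient has the same number of readings.

```python
inflammation_matrix = [
    [0.0, 1.5, 3.2, 4.1, 2.8, 1.9, 0.8],  # Patient 1
    [0.5, 2.1, 3.8, 5.2, 3.1, 2.0, 0.5],  # Patient 2
    [0.0, 1.0, 2.5, 3.5, 2.2, 1.5, 0.3],  # Patient 3
]

# zip(*rows) yields one tuple per day, holding that day's value for every patient
daily_averages = [sum(day) / len(day) for day in zip(*inflammation_matrix)]
print("Daily averages:", [f"{avg:.2f}" for avg in daily_averages])
```

If the rows had different lengths, `zip` would stop at the shortest one without raising an error.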

## 7. Other Data Structures

Python has other useful built-in data structures:

In [None]:
# Tuples - immutable sequences
patient_record = ('P001', 'Alice Johnson', 25, 70.5)  # Can't be changed
coordinates = (12.5, 34.8)  # Often used for coordinates

print("Patient record (tuple):", patient_record)
print("Patient ID:", patient_record[0])
print("Coordinates:", coordinates)

# Tuple unpacking
patient_id, name, age, weight = patient_record
print(f"Unpacked: {name} (ID: {patient_id}) is {age} years old")

# Try to modify tuple (this will fail)
try:
    patient_record[1] = 'Bob Smith'
except TypeError as e:
    print(f"Cannot modify tuple: {e}")
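
Because tuples are immutable, a tuple of hashable items can serve as a dictionary key, which a list cannot. A minimal sketch, using a hypothetical lookup table (`readings_by_visit`) keyed by patient ID and day number:

```python
# Hypothetical lookup table keyed by (patient ID, day number)
readings_by_visit = {
    ('P001', 1): 0.0,
    ('P001', 2): 1.5,
    ('P002', 1): 0.5,
}
print("P001 on day 2:", readings_by_visit[('P001', 2)])

# A list cannot be used the same way because it is mutable
try:
    readings_by_visit[['P002', 2]] = 2.1
except TypeError as e:
    print("Lists cannot be keys:", e)
```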

In [None]:
# Sets - collections of unique elements
all_patient_ids = ['P001', 'P002', 'P001', 'P003', 'P002', 'P004', 'P001']
unique_patients = set(all_patient_ids)

print("All patient IDs:", all_patient_ids)
print("Unique patients:", unique_patients)  # sets are unordered, so the printed order may vary
print("Number of unique patients:", len(unique_patients))

# Set operations
group_a = {'P001', 'P002', 'P003'}
group_b = {'P003', 'P004', 'P005'}

print(f"Group A: {group_a}")
print(f"Group B: {group_b}")
print(f"Intersection (both groups): {group_a & group_b}")
print(f"Union (either group): {group_a | group_b}")
print(f"Difference (A but not B): {group_a - group_b}")
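
Since a set has no defined order, it is often converted back to a list once duplicates are gone. The sketch below shows two options with a sample log (`visit_log`, assumed for illustration): sorting the set, or using `dict.fromkeys()` to keep first-seen order.

```python
visit_log = ['P003', 'P001', 'P003', 'P002', 'P001']   # assumed sample data

# sorted() turns the set into a list with a predictable order
unique_sorted = sorted(set(visit_log))
print("Sorted unique IDs:", unique_sorted)

# dict.fromkeys() keeps first-seen order (dicts preserve insertion order in Python 3.7+)
unique_in_order = list(dict.fromkeys(visit_log))
print("First-seen order:", unique_in_order)

# Membership tests use the same `in` syntax as lists
print("P002 visited?", 'P002' in set(visit_log))
```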

In [None]:
# Dictionaries - key-value pairs
patient_data = {
    'id': 'P001',
    'name': 'Alice Johnson',
    'age': 25,
    'weight': 70.5,
    'height': 1.75,
    'inflammation_readings': [0.0, 1.5, 3.2, 4.1, 2.8]
}

print("Patient data dictionary:")
for key, value in patient_data.items():
    print(f"  {key}: {value}")

print(f"\nPatient name: {patient_data['name']}")
print(f"Average inflammation: {sum(patient_data['inflammation_readings']) / len(patient_data['inflammation_readings']):.2f}")

# Calculate BMI
bmi = patient_data['weight'] / (patient_data['height'] ** 2)
patient_data['bmi'] = round(bmi, 1)  # Add new key-value pair
print(f"BMI added: {patient_data['bmi']}")
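
Looking up a key that does not exist raises `KeyError`. The sketch below shows two ways to guard against that, `get()` and the `in` operator; the `record` dictionary and the `'blood_type'` key are hypothetical and used only to demonstrate a missing key.

```python
record = {'id': 'P001', 'name': 'Alice Johnson', 'age': 25}

# Square-bracket access raises KeyError for a missing key
try:
    record['blood_type']
except KeyError as e:
    print("Missing key:", e)

# get() returns None, or a default you supply, instead of raising
print("Blood type:", record.get('blood_type', 'unknown'))

# `in` checks keys, not values
print("Has 'age' key:", 'age' in record)
print("Has 25 as a key:", 25 in record)
```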

In [None]:
# Multiple patients as a dictionary of dictionaries, keyed by patient ID
study_data = {
    'P001': {'name': 'Alice', 'age': 25, 'readings': [1.5, 2.0, 1.8]},
    'P002': {'name': 'Bob', 'age': 34, 'readings': [2.1, 3.5, 2.8]},
    'P003': {'name': 'Carol', 'age': 45, 'readings': [1.0, 1.5, 1.2]}
}

print("Study data:")
for patient_id, data in study_data.items():
    avg_inflammation = sum(data['readings']) / len(data['readings'])
    print(f"{patient_id}: {data['name']} (age {data['age']}) - avg inflammation: {avg_inflammation:.2f}")

# Access specific patient
print(f"\nPatient P002 readings: {study_data['P002']['readings']}")
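
Nested dictionaries combine well with the built-in functions from earlier sections. As one possible sketch, the code below finds the patient with the highest average reading; the `average` helper and the `averages` dictionary are introduced here for illustration, and the dictionary comprehension follows the same pattern as a list comprehension.

```python
study_data = {
    'P001': {'name': 'Alice', 'age': 25, 'readings': [1.5, 2.0, 1.8]},
    'P002': {'name': 'Bob', 'age': 34, 'readings': [2.1, 3.5, 2.8]},
    'P003': {'name': 'Carol', 'age': 45, 'readings': [1.0, 1.5, 1.2]},
}

def average(values):
    """Arithmetic mean of a non-empty list (helper defined for this sketch)."""
    return sum(values) / len(values)

# Build a patient ID -> average reading dictionary
averages = {pid: average(data['readings']) for pid, data in study_data.items()}

# max() with key= picks the ID whose average is largest
highest = max(averages, key=averages.get)
print(f"Highest average: {highest} ({averages[highest]:.2f})")
```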

### Exercise 4.3
Create a comprehensive data structure for a small study:
1. Use a dictionary to store patient information
2. Include lists for daily measurements
3. Calculate summary statistics for each patient

In [None]:
# Exercise 4.3 - Your code here
# Create a study with 3 patients, each with:
# - Patient info (name, age, group)
# - 7 days of inflammation readings
# - Calculate: average, min, max, trend (increasing/decreasing)

## 8. When to Use Which Data Structure

Guidelines for choosing the right data structure:

In [None]:
# Comparison of data structures
print("Data Structure Usage Guidelines:")
print("")
print("📝 LISTS - Use when:")
print("  • You need ordered, mutable sequences")
print("  • You need to modify elements frequently")
print("  • You need to append/remove elements")
print("  • Example: daily measurements, patient records")
print("")
print("🔒 TUPLES - Use when:")
print("  • You need immutable sequences")
print("  • You want to prevent accidental modification")
print("  • You need hashable objects (for dict keys or set members)")
print("  • Example: coordinates, patient ID + name")
print("")
print("🎯 SETS - Use when:")
print("  • You need unique elements only")
print("  • You need fast membership testing")
print("  • You need set operations (union, intersection)")
print("  • Example: unique patient IDs, drug allergies")
print("")
print("🗂️ DICTIONARIES - Use when:")
print("  • You need key-value associations")
print("  • You need fast lookups by key")
print("  • You need structured data")
print("  • Example: patient records, configuration settings")

## Summary

In this episode, we learned:
- **Lists**: Mutable, ordered sequences for collections of data
- **Indexing and slicing**: Accessing specific elements and ranges
- **List methods**: append, extend, remove, sort, etc.
- **List comprehensions**: Elegant way to create and transform lists
- **Nested lists**: Multi-dimensional data structures
- **Other structures**: Tuples, sets, and dictionaries
- **When to use what**: Guidelines for choosing data structures

Understanding these data structures is fundamental to effective Python programming!