# Session 1.2: Modules and Packages

## **Supports Modularity, Used in PySpark Imports**

### **Learning Objectives**
By the end of this session, you will:
- Import and use Python modules
- Understand package structure
- Work with healthcare data libraries
- Prepare for PySpark module imports

---

### **Relevance to PySpark**
Understanding modules and packages is essential for PySpark development: nearly every PySpark script begins by importing names from packages such as `pyspark.sql`, `pyspark.sql.functions`, and `pyspark.sql.types`, often alongside external libraries.

---

## 1. Standard Library Imports

In [None]:
# Essential imports for healthcare data processing
import datetime
import math
import random
from datetime import date, timedelta

# Current date for patient records
today = date.today()
print(f"Today's date: {today}")

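# Calculate patient age from birth date
birth_date = date(1978, 5, 15)
# Subtract one year if this year's birthday has not happened yet
had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
age = today.year - birth_date.year - (0 if had_birthday else 1)
print(f"Patient age: {age} years")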

## 2. Mathematical Operations for Clinical Calculations

In [None]:
# Using math module for clinical calculations
import math

# Dosage calculations
patient_weight = 70.5  # kg
dosage_per_kg = 2.5    # mg per kg

total_dosage = patient_weight * dosage_per_kg
rounded_dosage = math.ceil(total_dosage)  # Round up to the next whole mg (illustrative only, not clinical guidance)

print(f"Calculated dosage: {total_dosage:.2f} mg")
print(f"Rounded dosage: {rounded_dosage} mg")

# Statistical calculations
heart_rates = [72, 75, 78, 73, 76]
mean_hr = sum(heart_rates) / len(heart_rates)
# Population variance: divides by n (use n - 1 for a sample estimate)
variance = sum((hr - mean_hr) ** 2 for hr in heart_rates) / len(heart_rates)
std_dev = math.sqrt(variance)

print(f"Mean heart rate: {mean_hr:.1f} bpm")
print(f"Standard deviation: {std_dev:.2f} bpm")
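
The manual calculation above can be cross-checked against the standard library's `statistics` module. This is a minimal sketch that reuses the same heart-rate list: `pstdev` divides by n (matching the manual formula), while `stdev` divides by n - 1.

```python
import statistics

heart_rates = [72, 75, 78, 73, 76]

# Mean of the readings
mean_hr = statistics.mean(heart_rates)

# Population standard deviation: divides by n, like the manual calculation
population_sd = statistics.pstdev(heart_rates)

# Sample standard deviation: divides by n - 1, so it is slightly larger
sample_sd = statistics.stdev(heart_rates)

print(f"Mean heart rate: {mean_hr:.1f} bpm")
print(f"Population std dev: {population_sd:.2f} bpm")
print(f"Sample std dev: {sample_sd:.2f} bpm")
```

Which of the two is appropriate depends on whether the readings are the whole population of interest or a sample from it.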

## 3. Creating Custom Modules

Let's create a simple healthcare utilities module.

In [None]:
# Healthcare utility functions (in a real project these would live in their own .py file)
import random  # needed by generate_patient_id below
def calculate_bmi(weight_kg, height_m):
    """Calculate BMI from weight and height."""
    return weight_kg / (height_m ** 2)

def bmi_category(bmi):
    """Determine BMI category."""
    if bmi < 18.5:
        return "Underweight"
    elif bmi < 25:
        return "Normal"
    elif bmi < 30:
        return "Overweight"
    else:
        return "Obese"

def generate_patient_id():
    """Generate a random patient ID (not guaranteed unique; collisions are possible)."""
    return f"PT{random.randint(10000, 99999)}"

# Using our custom functions
patient_bmi = calculate_bmi(75.0, 1.80)
category = bmi_category(patient_bmi)
new_id = generate_patient_id()

print(f"BMI: {patient_bmi:.2f} - Category: {category}")
print(f"Generated Patient ID: {new_id}")
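
To see what "in a separate file" looks like in practice, the sketch below writes one of the functions to a file and imports it as a module. The file name `healthcare_utils.py` and the temporary directory are assumptions made for illustration. In a real project the file would simply sit next to your notebook or script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code for a hypothetical module file, healthcare_utils.py
module_source = '''
"""Hypothetical healthcare utilities module."""

def calculate_bmi(weight_kg, height_m):
    """Calculate BMI from weight (kg) and height (m)."""
    return weight_kg / (height_m ** 2)
'''

# Write the module into a temporary directory
module_dir = Path(tempfile.mkdtemp())
(module_dir / "healthcare_utils.py").write_text(module_source)

# Python searches the directories listed in sys.path when importing
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import healthcare_utils

bmi = healthcare_utils.calculate_bmi(75.0, 1.80)
print(f"BMI: {bmi:.2f}")
```

The module name used in the `import` statement is the file name without the `.py` extension.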

## 4. Package Structure Understanding

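A package is a directory of modules. PySpark itself is organized this way: the `pyspark` package contains nested subpackages and modules (for example `pyspark.sql.functions`), and each dotted import path mirrors that hierarchy.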

In [None]:
# PySpark-style imports (structure example)
# In real PySpark, you would have:
# from pyspark.sql import SparkSession
# from pyspark.sql import functions as F  # F.col, F.sum, F.avg; avoids shadowing the built-in sum
# from pyspark.sql.types import StructType, StructField, StringType

# For now, let's simulate with the standard library
from collections import namedtuple

# Define a Patient structure (like a DataFrame schema)
Patient = namedtuple('Patient', ['id', 'name', 'age', 'diagnosis'])

# Create sample patients
patients = [
    Patient('PT001', 'John Doe', 45, 'Hypertension'),
    Patient('PT002', 'Jane Smith', 32, 'Diabetes'),
    Patient('PT003', 'Bob Johnson', 58, 'Heart Disease')
]

# Iterate over the records (PySpark would use DataFrame operations instead of a Python loop)
for patient in patients:
    print(f"ID: {patient.id}, Name: {patient.name}, Age: {patient.age}")
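
The dotted paths in the PySpark imports above follow a directory layout. As a rough sketch, the code below builds a small package on disk and imports from it with the same kind of dotted path. The package name `clinic_tools` and its `metrics.bmi` module are hypothetical and exist only for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, built in a temporary directory:
#   clinic_tools/
#       __init__.py
#       metrics/
#           __init__.py
#           bmi.py
project_root = Path(tempfile.mkdtemp())
metrics_dir = project_root / "clinic_tools" / "metrics"
metrics_dir.mkdir(parents=True)

# An __init__.py file marks each directory as a regular package
(project_root / "clinic_tools" / "__init__.py").write_text("")
(metrics_dir / "__init__.py").write_text("")

# A module inside the subpackage
(metrics_dir / "bmi.py").write_text(
    "def bmi_category(bmi):\n"
    "    if bmi < 18.5:\n"
    "        return 'Underweight'\n"
    "    elif bmi < 25:\n"
    "        return 'Normal'\n"
    "    elif bmi < 30:\n"
    "        return 'Overweight'\n"
    "    return 'Obese'\n"
)

# Make the project root importable, then use a dotted import path
sys.path.insert(0, str(project_root))
importlib.invalidate_caches()

from clinic_tools.metrics.bmi import bmi_category

print(bmi_category(23.1))
```

Reading `from clinic_tools.metrics.bmi import bmi_category` left to right gives package, subpackage, module, then name, which is the same pattern as `from pyspark.sql.types import StructType`.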

## 5. Import Best Practices for PySpark Preparation

In [None]:
# Different import styles and when to use them

# 1. Full module import
import datetime
current_time = datetime.datetime.now()

# 2. Import specific names (note: this rebinds `datetime` from the module to the class)
from datetime import datetime, timedelta
tomorrow = datetime.now() + timedelta(days=1)

# 3. Alias for shorter names
import datetime as dt
next_week = dt.datetime.now() + dt.timedelta(weeks=1)

# 4. Multiple imports (PySpark style)
from math import sqrt, ceil, floor

dosage = 125.7
print(f"Rounded up: {ceil(dosage)} mg")
print(f"Rounded down: {floor(dosage)} mg")

print(f"Current time: {current_time}")
print(f"Tomorrow: {tomorrow}")
print(f"Next week: {next_week}")
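
One pitfall in the cell above is worth isolating: a `from ... import name` statement can silently replace a name that was already bound. The sketch below shows this with `datetime`. The same thing happens in PySpark when `from pyspark.sql.functions import sum` replaces Python's built-in `sum`, which is why the alias style `from pyspark.sql import functions as F` is a common convention.

```python
import datetime

# At this point `datetime` is the module
kind_before = type(datetime).__name__

from datetime import datetime  # rebinds the name to the class

# Now `datetime` is the class, not the module
kind_after = type(datetime).__name__

# Module-style access no longer works through this name
try:
    datetime.datetime.now()
    module_style_works = True
except AttributeError:
    module_style_works = False

# An alias keeps the module reachable without ambiguity
import datetime as dt
now = dt.datetime.now()

print(f"Before: {kind_before}, after: {kind_after}")
print(f"datetime.datetime.now() still works: {module_style_works}")
```

Picking one import style per module and using it consistently avoids this kind of confusion.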

## 6. Practice Exercise

Create a patient data processing module.

In [None]:
# Exercise: Create functions for patient data processing
# TODO: Import necessary modules and create functions for:
# 1. Age calculation from birth date
# 2. Days since last visit calculation
# 3. Generate random patient temperatures in a normal range (e.g. roughly 36.1-37.2 °C)
# 4. Format patient information display

# Your code here

---

## Summary

In this session, you learned:
- ✅ Importing standard library modules
- ✅ Creating and using custom functions
- ✅ Understanding package structure
- ✅ Import best practices for PySpark preparation
- ✅ Healthcare data processing examples

**Next:** Session 1.3 - Data Structures