<a href="https://colab.research.google.com/github/maradeben/oop-tutorial-mit/blob/main/intro_to_python_course_mit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![welcome](https://imgs.search.brave.com/WsgHGrKZNjfyegblUrCaB4gxfnv_lDWaCwlR0PzYh2Y/rs:fit:500:0:1:0/g:ce/aHR0cHM6Ly9pLnBp/bmltZy5jb20vb3Jp/Z2luYWxzLzM4LzRi/LzRmLzM4NGI0Zjk2/YzFhMjExMDMxNzM3/MWEzNTI0NTgxMGNl/LmpwZw)


# 🚀 Welcome to Python for Healthcare Professionals!

![welcome](https://imgs.search.brave.com/6AUCT3KxHZPH8bXlACEFDmV5kIHmrWvE5uBt4H4GtSA/rs:fit:500:0:1:0/g:ce/aHR0cHM6Ly90NC5m/dGNkbi5uZXQvanBn/LzEyLzQ3LzU1LzU5/LzM2MF9GXzEyNDc1/NTU5MDNfMTN6dW4y/Nmt5d0dsR1Q2SmFY/UVUwTHB1NFJGRDhu/YksuanBn)


Hello and welcome to the **Healthcare AI & Data Literacy Challenge**!

You are already an expert in your medical or healthcare field. This challenge isn't about replacing that expertise; it's about adding a powerful new tool to your skillset: **data literacy**.

## Contents
1. [**Introduction and How to Use**](#introduction)
2. [**Part 1: The Building Blocks**](#part-1)
3. [**Part 2: Making Your Code Smarter & More Powerful**](#part-2)
4. [**Part 3: From One Patient to a Thousand with NumPy**](#part-3)
5. [**Optional Module: Python in Your Specialty**](#optional)
6. [**You've Completed the Foundation! What's Next?**](#congrats)


<a id='introduction'></a>
### What is this Notebook?

This Colab notebook is your interactive companion to the DataCamp "Introduction to Python" course. While DataCamp teaches the fundamental concepts, this guide will **translate every lesson into a practical, real-world healthcare scenario**. Our goal is to bridge the gap between abstract code and the clinical data you work with every day.

**How to use it:**
1.  Watch the video lessons on DataCamp.
2.  Follow along with the corresponding sections in this notebook.
3.  Click the "play" button on the code cells to run the examples yourself and see the concepts in action!

Let's begin your journey into the future of data-driven medicine. 🩺

-----



# 🐍 Python for Healthcare: From Clinical Data to Code
**A Practical Companion for the Healthcare AI & Data Literacy Challenge**


## You Already "Speak" Data. Let's Learn Its Language.

As a healthcare professional, you work with data every single day:
* Patient Vitals (Blood Pressure, Heart Rate, Temp)
* Lab Results (Blood Glucose, CBC, Lipid Panels)
* Medication Dosages & Schedules
* Patient Demographics & Histories

**The Problem:** This data often lives in static charts, disconnected files, or complex EMRs.

**The Solution:** Python is a tool that lets you interact with, analyze, and find patterns in this data at scale. This notebook connects every Python concept you learn on DataCamp to a real-world healthcare scenario. 🩺

-----

<a id='part-1'></a>

## Part 1: The Building Blocks
### `Variables`: Storing a Single Piece of Patient Information

Think of a variable as a single field on a patient's chart. Each variable holds one piece of information and has a specific data type.

* `patient_id = "HN-00451"` is a **string** (text).
* `patient_age = 42` is an **integer** (a whole number).
* `patient_temp_celsius = 37.5` is a **float** (a number with a decimal).
* `is_on_medication = True` is a **boolean** (True or False).

Let's see this in action.


-----

In [None]:
# A single patient's data
patient_id = "HN-00451"
patient_age = 42
patient_temp_celsius = 37.5
is_on_medication = True

# We can print the values
print("Patient ID:", patient_id)
print("Patient Age:", patient_age)

# And we can check their types
print("Type of patient_temp_celsius:", type(patient_temp_celsius))
print("Type of is_on_medication:", type(is_on_medication))

Patient ID: HN-00451
Patient Age: 42
Type of patient_temp_celsius: <class 'float'>
Type of is_on_medication: <class 'bool'>
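
Variables become more useful once you start calculating with them. The short sketch below is an illustrative addition: it converts the temperature above to Fahrenheit with the standard conversion formula and builds a one-line summary. The names `patient_temp_fahrenheit` and `summary` are introduced here only for this example.

```python
# Values repeated from the cell above so this example runs on its own
patient_id = "HN-00451"
patient_temp_celsius = 37.5

# Standard Celsius-to-Fahrenheit conversion
patient_temp_fahrenheit = patient_temp_celsius * 9 / 5 + 32

# str() turns the number into text so it can be joined to other strings
summary = "Patient " + patient_id + " temperature: " + str(patient_temp_fahrenheit) + " F"
print(summary)
```

Notice that the number has to be converted with `str()` before it can be joined to text; Python will not mix the two types silently.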


-----
### `Lists`: A Collection of Clinical Readings

A single variable isn't enough for continuous monitoring. A **list** is perfect for storing a sequence of measurements for a single patient, like hourly vitals or a list of their current medications.

-----

In [None]:
# A patient's heart rate (bpm) recorded over 5 minutes
heart_rates = [78, 81, 80, 77, 84]

# A patient's current medications
medications = ["Lisinopril", "Metformin", "Atorvastatin"]

print("Heart Rates (bpm):", heart_rates)
print("Current Medications:", medications)

Heart Rates (bpm): [78, 81, 80, 77, 84]
Current Medications: ['Lisinopril', 'Metformin', 'Atorvastatin']


-----
### Subsetting & Changing Lists: Accessing and Updating Clinical Data

We can easily interact with our lists to get specific information or update them as a patient's condition changes.

* **Accessing an item:** "What was the patient's first recorded heart rate?" (Remember: Python starts counting from 0!)
* **Getting a range (slicing):** "Show me the heart rates from the 2nd to the 4th minute."
* **Changing an item:** "The 2nd reading was entered incorrectly. Let's update it."
* **Adding an item:** "The doctor just prescribed a new medication."
-----

In [None]:
# Original heart rates list
heart_rates = [78, 81, 80, 77, 84]
medications = ["Lisinopril", "Metformin", "Atorvastatin"]

# Get the first heart rate reading (at index 0)
first_reading = heart_rates[0]
print("First Reading:", first_reading)

# Get readings from the 2nd to the 4th minute (index 1 up to, but not including, index 4)
middle_readings = heart_rates[1:4]
print("Middle Readings:", middle_readings)

# The second reading (at index 1) was wrong, update it to 82
heart_rates[1] = 82
print("Corrected Heart Rates:", heart_rates)

# Add a new medication to the end of the list
medications.append("Aspirin")
print("Updated Medications:", medications)

First Reading: 78
Middle Readings: [81, 80, 77]
Corrected Heart Rates: [78, 82, 80, 77, 84]
Updated Medications: ['Lisinopril', 'Metformin', 'Atorvastatin', 'Aspirin']
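
A few more list tools are worth knowing when reviewing a run of readings. The sketch below is an illustrative addition using the corrected heart rates from above; the variable names are chosen for this example only.

```python
# Corrected heart rates from the example above
heart_rates = [78, 82, 80, 77, 84]

# len() reports how many readings were recorded
number_of_readings = len(heart_rates)

# A negative index counts from the end: -1 is the most recent reading
latest_reading = heart_rates[-1]

# Leaving out the start of a slice means "from the beginning"
first_three = heart_rates[:3]

print("Number of readings:", number_of_readings)
print("Latest reading:", latest_reading)
print("First three readings:", first_three)
```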



---
<a id='part-2'></a>
## Part 2: Making Your Code Smarter & More Powerful

### `Functions`: Turning a Clinical Protocol into Reusable Code

You follow protocols every day. A function is just a clinical protocol written in code. For example, instead of manually calculating a patient's BMI every time, you can write a `calculate_bmi` function.

* **Protocol:** Calculating Body Mass Index (BMI).
* **Inputs:** Weight (kg), Height (m).
* **Process:** weight / (height²)
* **Output:** BMI value.

-----


In [None]:
# Define the BMI "protocol" as a function
def calculate_bmi(weight_kg, height_m):
  """Calculates BMI from weight in kg and height in meters."""
  bmi = weight_kg / (height_m ** 2)
  return round(bmi, 1) # Round to one decimal place

# Use the "protocol" for two different patients
patient1_bmi = calculate_bmi(75, 1.8)
patient2_bmi = calculate_bmi(62, 1.65)

print("Patient 1 BMI:", patient1_bmi)
print("Patient 2 BMI:", patient2_bmi)

Patient 1 BMI: 23.1
Patient 2 BMI: 22.8
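
You do not always have to write the protocol yourself. Python ships with built-in functions such as `len()`, `sum()`, `max()`, `min()`, and `round()`. The sketch below is an illustrative addition that uses them to summarize a set of heart rate readings; the variable names are assumptions made for this example.

```python
# Heart rate readings (bpm) for one patient
heart_rates = [78, 82, 80, 77, 84]

# Built-in functions take an input and return a result, just like calculate_bmi
highest_rate = max(heart_rates)
lowest_rate = min(heart_rates)

# The average is the total divided by the number of readings
average_rate = round(sum(heart_rates) / len(heart_rates), 1)

print("Highest heart rate:", highest_rate)
print("Lowest heart rate:", lowest_rate)
print("Average heart rate:", average_rate)
```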


-----
<a id='part-3'></a>
## Part 3: From One Patient to a Thousand with NumPy

### `NumPy`: Your New Superpower for Research 📈

Python lists are great for one patient. But what about a clinical trial with 500 patients? For that, we need **NumPy**.

NumPy gives us a new data structure called an **array**. It's like a list on steroids: it's faster and allows you to perform mathematical operations on the entire collection of values at once.

This is the big shift from individual patient care to **population-level data analysis**.

-----

In [None]:
import numpy as np

# Patient data in regular Python lists
heights_list = [1.75, 1.80, 1.62, 1.91]
weights_list = [78.5, 85.1, 66.7, 92.2]

# Now, convert them to NumPy arrays
np_heights = np.array(heights_list)
np_weights = np.array(weights_list)

# Calculate BMI for ALL patients at once! Plain lists don't support this kind of
# element-wise arithmetic (weights_list / heights_list raises a TypeError).
bmi_array = np_weights / (np_heights ** 2)

print("NumPy Heights Array:", np_heights)
print("NumPy Weights Array:", np_weights)
print("Resulting BMI Array:", np.round(bmi_array, 1)) # Rounding for readability

NumPy Heights Array: [1.75 1.8  1.62 1.91]
NumPy Weights Array: [78.5 85.1 66.7 92.2]
Resulting BMI Array: [25.6 26.3 25.4 25.3]
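
To see why arrays matter, it helps to try the same calculation with plain lists. The sketch below is an illustrative addition: it shows that dividing two lists fails, what the list-based workaround looks like (a loop, which goes slightly beyond the concepts covered so far), and that NumPy gives the same values in a single expression. The names `list_division_worked`, `bmi_from_loop`, and `bmi_from_numpy` are introduced only for this example.

```python
import numpy as np

heights_list = [1.75, 1.80, 1.62, 1.91]
weights_list = [78.5, 85.1, 66.7, 92.2]

# Dividing one list by another is not defined, so Python raises a TypeError
try:
    weights_list / heights_list
    list_division_worked = True
except TypeError:
    list_division_worked = False
print("Can lists be divided directly?", list_division_worked)

# With plain lists, each patient has to be handled one at a time in a loop
bmi_from_loop = []
for weight, height in zip(weights_list, heights_list):
    bmi_from_loop.append(round(weight / height ** 2, 1))
print("BMI from a loop:", bmi_from_loop)

# NumPy performs the same calculation on every element in one expression
bmi_from_numpy = np.round(np.array(weights_list) / np.array(heights_list) ** 2, 1)
print("BMI from NumPy: ", bmi_from_numpy)
```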


-----
### 2D NumPy Arrays: The Digital Patient Cohort

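A 2D NumPy array is much like a spreadsheet or a table of clinical trial data, with one difference: every value in the array shares a single data type. (That is why the whole-number ages in the example print as `45.` once they sit alongside decimal glucose values.)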

* **Rows** = Individual patients
* **Columns** = Different measurements (e.g., Age, Systolic BP, Glucose Level)
-----

In [None]:
# Data for 4 patients: [Age, Systolic BP, Blood Glucose]
clinical_data = np.array([
    [45, 130, 5.6],
    [52, 142, 6.1],
    [38, 125, 5.1],
    [45, 155, 7.3]
])

print("Clinical Dataset (2D Array):")
print(clinical_data)

# We can check the dimensions (rows, columns) with .shape
print("\nShape of the data (patients, measurements):", clinical_data.shape)

Clinical Dataset (2D Array):
[[ 45.  130.    5.6]
 [ 52.  142.    6.1]
 [ 38.  125.    5.1]
 [ 45.  155.    7.3]]

Shape of the data (patients, measurements): (4, 3)
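
Once the data is in a 2D array, you pick out rows, columns, and single values with square brackets, using `[row, column]` and counting from 0. The sketch below is an illustrative addition using the same four patients; the variable names are chosen for this example only.

```python
import numpy as np

# Columns are: [Age, Systolic BP, Blood Glucose]
clinical_data = np.array([
    [45, 130, 5.6],
    [52, 142, 6.1],
    [38, 125, 5.1],
    [45, 155, 7.3]
])

# One row = every measurement for one patient (here, the second patient)
second_patient = clinical_data[1]

# One column = one measurement for every patient (here, systolic BP)
# The colon means "all rows"
all_systolic_bp = clinical_data[:, 1]

# Row and column together = a single value (third patient's blood glucose)
third_patient_glucose = clinical_data[2, 2]

print("Second patient:", second_patient)
print("All systolic BP readings:", all_systolic_bp)
print("Third patient's glucose:", third_patient_glucose)
```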


-----
### NumPy Statistics: From Data to Clinical Insights

Now you can answer critical research and public health questions from your data! Using NumPy's statistical functions, you can analyze an entire column (a specific measurement for all patients) with one line of code.

In [None]:
# Our data again for reference
# Columns are: [Age, Systolic BP, Blood Glucose]
clinical_data = np.array([
    [45, 130, 5.6],
    [52, 142, 6.1],
    [38, 125, 5.1],
    [45, 155, 7.3]
])

# "What is the average blood pressure in our patient group?"
# We select all rows (:) and the second column (1)
avg_bp = np.mean(clinical_data[:, 1])
print("Average Systolic BP:", round(avg_bp, 1))

# "What is the median age of our participants?"
# We select all rows (:) and the first column (0)
median_age = np.median(clinical_data[:, 0])
print("Median Age:", median_age)

# "Find all patients with blood pressure over 140."
# This creates a True/False condition to filter the data
high_bp_mask = clinical_data[:, 1] > 140
high_bp_patients = clinical_data[high_bp_mask]
print("\nPatients with BP > 140:")
print(high_bp_patients)

Average Systolic BP: 138.0
Median Age: 45.0

Patients with BP > 140:
[[ 52.  142.    6.1]
 [ 45.  155.    7.3]]
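
The same pattern extends to other summary questions. The sketch below is an illustrative addition that uses `np.max`, `np.std`, and `np.sum` on the systolic BP column; the variable names are assumptions made for this example.

```python
import numpy as np

# Columns are: [Age, Systolic BP, Blood Glucose]
clinical_data = np.array([
    [45, 130, 5.6],
    [52, 142, 6.1],
    [38, 125, 5.1],
    [45, 155, 7.3]
])

systolic_bp = clinical_data[:, 1]

# "What is the highest blood pressure in the group?"
highest_bp = np.max(systolic_bp)

# "How spread out are the readings?" (standard deviation)
bp_std = np.std(systolic_bp)

# "How many patients are over 140?"
# True counts as 1 and False as 0, so summing the mask counts the matches
high_bp_count = np.sum(systolic_bp > 140)

print("Highest Systolic BP:", highest_bp)
print("Standard deviation of Systolic BP:", round(bp_std, 1))
print("Number of patients with BP > 140:", high_bp_count)
```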


## ✅ Congratulations on Finishing the Core Concepts!

You've successfully translated the fundamental Python concepts into the language of healthcare!

* **Variables** -> A single data point on a chart.
* **Lists** -> Vitals for one patient.
* **NumPy Arrays** -> An entire patient cohort.
* **Functions & Stats** -> Asking clinical questions and getting answers.

The next section is an **optional module** with more examples tailored to a wide range of healthcare specialties.

<a id='optional'></a>
# 🔬 Optional Module: Python in Your Specialty

The concepts we've covered are the universal building blocks of data analysis in any field. This optional section provides specific examples of how these tools apply to your unique area of healthcare. Find your specialty below to see how you can start thinking about your daily data in a new way.

-----
## Variables in Your Specialty

A **variable** holds a single piece of data. Think of it as one entry in a logbook, one measurement, or one specific label.

* **Nursing:** `patient_fall_risk_score = 12`
* **General Medicine:** `chief_complaint = "Persistent Cough"`
* **Surgery:** `estimated_blood_loss_ml = 250`
* **Obstetrics and Gynecology:** `gestational_age_weeks = 38.5`
* **Pediatrics:** `vaccine_name = "MMR"`
* **Anesthesiology:** `sevoflurane_concentration_percent = 2.0`
* **Pharmacy:** `drug_half_life_hours = 8.5`
* **Medical Laboratory Science:** `hemoglobin_g_dL = 14.2`
* **Dentistry:** `caries_risk_level = "High"`
* **Radiology/Medical Imaging:** `ct_slice_thickness_mm = 2.5`
* **Oncology:** `tumor_size_cm = 4.1`
* **Cardiology:** `ejection_fraction_percent = 55`
* **Public Health:** `case_fatality_rate = 0.02`
* **Community Health:** `household_food_security_status = "Secure"`
* **Mental Health:** `gad7_anxiety_score = 15`
* **Biomedical Engineering:** `pacemaker_battery_voltage = 2.8`
* **Physiotherapy:** `knee_flexion_rom_degrees = 110`
* **Nutrition and Dietetics:** `daily_calorie_target = 1800`
* **Radiography:** `xray_exposure_mAs = 15`
* **Physiology:** `maximal_oxygen_uptake_vo2max = 45.5`
* **Optometry:** `intraocular_pressure_mmHg = 18`
* **Environmental Health Science:** `lead_in_water_ppb = 3.2`
* **Health Information Management:** `patient_record_retention_years = 7`
* **Occupational Therapy:** `grip_strength_kg = 25`
* **Medical Physics:** `linear_accelerator_dose_rate_cGy_min = 600`
-----


In [None]:
# Let's declare and print a few specialty-specific variables

# Physiotherapy Example
knee_flexion_rom_degrees = 110
print("Physiotherapy - Knee Range of Motion:", knee_flexion_rom_degrees, "degrees")

# Pharmacy Example
drug_half_life_hours = 8.5
print("Pharmacy - Drug Half-Life:", drug_half_life_hours, "hours")

# Medical Laboratory Science Example
hemoglobin_g_dL = 14.2
print("Lab Science - Hemoglobin Level:", hemoglobin_g_dL, "g/dL")

Physiotherapy - Knee Range of Motion: 110 degrees
Pharmacy - Drug Half-Life: 8.5 hours
Lab Science - Hemoglobin Level: 14.2 g/dL


-----

## Lists in Your Specialty

A **list** stores an ordered sequence of items. It's perfect for tracking things over time, listing steps in a protocol, or grouping related items.

* **Nursing:** `wound_care_steps = ['cleanse', 'apply_ointment', 'dress_wound']`
* **General Medicine:** `differential_diagnoses = ["Bronchitis", "Pneumonia", "Asthma"]`
* **Surgery:** `surgical_instrument_count = [25, 25]` (pre and post-op)
* **Obstetrics and Gynecology:** `apgar_scores = [8, 9]` (at 1 and 5 minutes)
* **Pediatrics:** `monthly_weight_kg = [3.5, 4.2, 5.1, 5.8]`
* **Anesthesiology:** `pre_op_checklist = ["Consent Signed", "NPO Status Confirmed", "Airway Assessed"]`
* **Pharmacy:** `patient_allergies = ["Penicillin", "Sulfa"]`
* **Medical Laboratory Science:** `quality_control_results = [98.5, 101.2, 99.8]`
* **Dentistry:** `perio_chart_pockets_mm = [3, 2, 3, 4, 3, 2]`
* **Radiology/Medical Imaging:** `imaging_series_list = ["Axial T1", "Sagittal T2", "Coronal FLAIR"]`
* **Oncology:** `chemo_cycle_dates = ["2025-09-20", "2025-10-11", "2025-11-01"]`
* **Cardiology:** `ecg_lead_placements = ["V1", "V2", "V3", "V4", "V5", "V6"]`
* **Public Health:** `outbreak_contact_trace_ids = ["P001", "P007", "P012"]`
* **Mental Health:** `patient_reported_moods = ["Anxious", "Calm", "Anxious", "Tired"]`
* **Biomedical Engineering:** `device_calibration_timestamps = [1663333200, 1663419600]`
* **Physiotherapy:** `rehab_exercise_plan = ["Quad Sets", "Heel Slides", "Leg Raises"]`
* **Nutrition and Dietetics:** `sample_meal_plan_foods = ["Oatmeal", "Chicken Breast", "Broccoli", "Quinoa"]`
* **Health Information Management:** `required_document_scans = ["Consent Form", "Insurance Card", "Referral Letter"]`
* **Occupational Therapy:** `activities_of_daily_living = ["Dressing", "Bathing", "Eating"]`

-----

In [None]:
# Example of creating and modifying lists for different specialties

# Pediatrics: Tracking a baby's weight over 4 months
monthly_weight_kg = [3.5, 4.2, 5.1, 5.8]
print("Pediatrics - Initial weights:", monthly_weight_kg)

# A new measurement is taken for the 5th month
monthly_weight_kg.append(6.4)
print("Pediatrics - Updated weights:", monthly_weight_kg)


# Oncology: A list of chemotherapy drugs for a patient's regimen
chemo_regimen = ["Cisplatin", "Etoposide"]
print("\nOncology - Original regimen:", chemo_regimen)

# The first drug (at index 0) is switched to Carboplatin
chemo_regimen[0] = "Carboplatin"
print("Oncology - Revised regimen:", chemo_regimen)

Pediatrics - Initial weights: [3.5, 4.2, 5.1, 5.8]
Pediatrics - Updated weights: [3.5, 4.2, 5.1, 5.8, 6.4]

Oncology - Original regimen: ['Cisplatin', 'Etoposide']
Oncology - Revised regimen: ['Carboplatin', 'Etoposide']



---
## Functions in Your Specialty

A **function** is a reusable block of code that performs a specific task—just like a clinical protocol or calculation you do frequently.

* **Nursing:** A function to calculate a patient's Glasgow Coma Scale (GCS) score based on eye, verbal, and motor responses.
* **General Medicine:** A function to calculate a CHA₂DS₂-VASc score for stroke risk.
* **Anesthesiology:** A function to determine the correct dosage of Propofol based on patient weight and age.
* **Pharmacy:** A function to check for potential drug-drug interactions from a list of medications.
* **Medical Laboratory Science:** A function to flag lab results that are outside the normal range.
* **Radiology/Medical Imaging:** A function to calculate the Standardized Uptake Value (SUV) in PET scans.
* **Oncology:** A function to calculate Body Surface Area (BSA) for chemotherapy dosing.
* **Cardiology:** A function to calculate a patient's 10-year cardiovascular disease risk score.
* **Nutrition and Dietetics:** A function to calculate Basal Metabolic Rate (BMR) using the Mifflin-St Jeor equation.
* **Medical Physics:** A function to calculate the radioactive decay of a specific isotope over time.


In [None]:
# Example: A function for Nutrition and Dietetics to calculate BMR

def calculate_bmr_mifflin(weight_kg, height_cm, age_years, is_male):
  """
  Calculates Basal Metabolic Rate using the Mifflin-St Jeor equation.
  """
  if is_male:
    bmr = (10 * weight_kg) + (6.25 * height_cm) - (5 * age_years) + 5
  else:
    bmr = (10 * weight_kg) + (6.25 * height_cm) - (5 * age_years) - 161

  return int(bmr) # Return as a whole number

# Use the function for a 35-year-old male, 80kg, 180cm
male_patient_bmr = calculate_bmr_mifflin(80, 180, 35, True)
print("Nutrition - BMR for male patient:", male_patient_bmr, "kcal/day")

# Use the function for a 42-year-old female, 65kg, 165cm
female_patient_bmr = calculate_bmr_mifflin(65, 165, 42, False)
print("Nutrition - BMR for female patient:", female_patient_bmr, "kcal/day")

Nutrition - BMR for male patient: 1755 kcal/day
Nutrition - BMR for female patient: 1310 kcal/day
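
Another item from the list above, flagging lab results that fall outside a reference range, could be sketched as follows. This is a minimal illustration, not a validated clinical tool: the function name `flag_lab_result` and the hemoglobin limits are placeholders invented for this example, and real reference ranges depend on the laboratory and the patient.

```python
def flag_lab_result(value, lower_limit, upper_limit):
    """Returns 'LOW', 'HIGH', or 'NORMAL' for a result and a reference range."""
    if value < lower_limit:
        return "LOW"
    elif value > upper_limit:
        return "HIGH"
    return "NORMAL"

# Placeholder limits for illustration only (not a clinical reference range)
hemoglobin_lower_g_dL = 12.0
hemoglobin_upper_g_dL = 16.0

# Apply the same "protocol" to three different results
flag_a = flag_lab_result(14.2, hemoglobin_lower_g_dL, hemoglobin_upper_g_dL)
flag_b = flag_lab_result(10.8, hemoglobin_lower_g_dL, hemoglobin_upper_g_dL)
flag_c = flag_lab_result(17.1, hemoglobin_lower_g_dL, hemoglobin_upper_g_dL)

print("Lab Science - Hemoglobin 14.2 g/dL:", flag_a)
print("Lab Science - Hemoglobin 10.8 g/dL:", flag_b)
print("Lab Science - Hemoglobin 17.1 g/dL:", flag_c)
```

Passing the limits in as arguments, rather than fixing them inside the function, lets the same function serve any test with a lower and upper bound.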


-----

## NumPy Arrays in Your Specialty

A **NumPy array** is how you handle data for a whole group of patients, an entire clinic's records, or a research cohort. It's the digital equivalent of a spreadsheet or a research dataset.

* **Public Health:** A 2D array where rows are individuals and columns are `[Age, Zip_Code, Vaccination_Status (0 or 1)]`.
* **Cardiology:** A 2D array of clinical trial data: `[Patient_ID, LDL_Cholesterol_Before, LDL_Cholesterol_After]`.
* **Health Information Management:** A 2D array of hospital data: `[Department_ID, Average_Patient_Wait_Time_Mins, Patient_Satisfaction_Score]`.
* **Oncology:** A 3D array representing tumor response, where the dimensions are `[Patient, Timepoint, Measurement]` and tumor volume is one of the measurements.
* **Radiology/Medical Imaging:** A 2D array of image quality scores from different MRI machines.
* **Physiotherapy:** A 2D array tracking patient progress: `[Patient_ID, Session_Number, Pain_Score_1_to_10]`.
* **Environmental Health Science:** A 2D array from a community health study: `[Household_ID, Air_Quality_Index, Asthma_Incidence]`.
* **Biomedical Engineering:** An array of sensor readings from a wearable device over time.
----

In [None]:
import numpy as np

# Example: Data for a Public Health vaccination study
# Each row is a person: [Age, Zip Code, Vaccinated (1=yes, 0=no)]
community_data = np.array([
    [68, 90210, 1],
    [25, 90211, 0],
    [45, 90210, 1],
    [33, 90212, 1],
    [76, 90211, 0],
    [22, 90210, 1]
])

# Let's ask some public health questions!

# What is the average age of participants in our study?
average_age = np.mean(community_data[:, 0])
print("Public Health - Average participant age:", round(average_age, 1), "years")

# How many people in this sample have been vaccinated?
# The sum of the 'Vaccinated' column (index 2) gives us the total
total_vaccinated = np.sum(community_data[:, 2])
print("Public Health - Total vaccinated in sample:", total_vaccinated)

# What is the average age of the UNVACCINATED participants?
unvaccinated_mask = community_data[:, 2] == 0  # Find where vaccinated column is 0
unvaccinated_group = community_data[unvaccinated_mask]
avg_age_unvaccinated = np.mean(unvaccinated_group[:, 0])
print("Public Health - Average age of unvaccinated:", round(avg_age_unvaccinated, 1), "years")

Public Health - Average participant age: 44.8 years
Public Health - Total vaccinated in sample: 4
Public Health - Average age of unvaccinated: 50.5 years
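
Because vaccination status is stored as 0 or 1, the mean of that column is the fraction of people vaccinated. The sketch below is an illustrative addition that computes the overall vaccination rate and then repeats the filtering idea for a single zip code; the variable names are chosen for this example only.

```python
import numpy as np

# Each row is a person: [Age, Zip Code, Vaccinated (1=yes, 0=no)]
community_data = np.array([
    [68, 90210, 1],
    [25, 90211, 0],
    [45, 90210, 1],
    [33, 90212, 1],
    [76, 90211, 0],
    [22, 90210, 1]
])

# The mean of a 0/1 column is the proportion of 1s
vaccination_rate = np.mean(community_data[:, 2])
print("Public Health - Vaccination rate:", round(vaccination_rate * 100, 1), "%")

# Filter to one zip code, then count the vaccinated people in that group
zip_mask = community_data[:, 1] == 90210
zip_group = community_data[zip_mask]
vaccinated_in_zip = np.sum(zip_group[:, 2])
print("Public Health - People sampled in 90210:", len(zip_group))
print("Public Health - Vaccinated in 90210:", vaccinated_in_zip)
```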


-----
<a id='congrats'></a>
# 🎉 You've Completed the Foundation! What's Next?

Congratulations on working through this entire notebook! You have successfully taken the first, most important step: learning the fundamental language of data.

You've seen how to represent everything from a single patient's temperature to an entire public health dataset using Python's core tools.

### Your Journey Continues

This is just the beginning. To succeed in the **Healthcare AI & Data Literacy Challenge**, here are your next steps:

* **Complete All Courses:** Make sure you finish the other required courses on DataCamp to build on this foundation.
* **Start Thinking About Your Project:** Begin brainstorming for your "Learning Showcase." Think about a simple healthcare problem and how the skills you're learning could help you ask a question about it.
* **Engage with the Community:** The best way to learn is together. Ask questions, share your progress, and help others when you can.

We are incredibly excited to see what you will learn and build. Stay curious, keep learning, and good luck with the rest of the challenge! 🎓