<a href="https://colab.research.google.com/github/richardogoma/health-clinic-data-summary-richardogoma/blob/main/Health_Data_Summary.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Mini Project: Health Clinic Data Summary & Outlier Analysis**

LifeCare Clinics, a rapidly growing healthcare provider, collects patient vital signs (heart rate, blood pressure, etc.) daily. However, the data is currently stored in raw CSV files, and doctors struggle to manually identify “high risk” patients (outliers).

**The goal of this project is to build a data analysis pipeline** to ingest the raw data, clean it, calculate statistical summaries for the doctors, and automatically flag patients with irregular vitals (outliers) for immediate medical review.

## Module 1: Data Ingestion & Cleaning

import csv

def load_data(filepath: str) -> list[dict]:
    """
    Loads patient vital signs data from a CSV file.

    Args:
        filepath (str): The path to the CSV file.

    Returns:
        list[dict]: A list of dictionaries, where each dictionary represents one patient record.
                    Values that cannot be converted to a number are stored as None.
                    Returns an empty list if the file is not found.
    """
    patient_data = []
    try:
        with open(filepath, mode='r', newline='', encoding='utf-8') as file:
            reader = csv.DictReader(file)
            for row in reader:
                processed_row = {}
                for key, value in row.items():
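                    try:
                        # Integer-valued columns.
                        if key in ['patient_id', 'age', 'heart_rate', 'systolic_bp', 'oxygen_saturation']:
                            processed_row[key] = int(value)
                        # Floating-point columns.
                        elif key in ['temperature']:
                            processed_row[key] = float(value)
                        # Columns outside these two groups are not copied into processed_row.
                    except (ValueError, TypeError):
                        # TypeError covers a missing field, which csv.DictReader reports as None.
                        print(f"Warning: Could not convert '{value}' for column '{key}' to a numeric type in row: {row}")
                        processed_row[key] = None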
                patient_data.append(processed_row)
        print(f"Successfully loaded data from {filepath}.")
    except FileNotFoundError:
        print(f"Error: The file '{filepath}' was not found.")

    return patient_data
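
The per-column conversion in `load_data` can also be written with a lookup table that maps each column name to its converter. The following is a minimal sketch, not the notebook's implementation: `COLUMN_TYPES`, `convert_row`, and `SAMPLE_CSV` are hypothetical names, and the sample rows are made-up values used only to show how an unparseable entry becomes `None`.

```python
import csv
import io

# Hypothetical lookup table: column name -> converter.
# The column names are the ones load_data handles.
COLUMN_TYPES = {
    "patient_id": int,
    "age": int,
    "heart_rate": int,
    "systolic_bp": int,
    "oxygen_saturation": int,
    "temperature": float,
}


def convert_row(row: dict) -> dict:
    """Convert one csv.DictReader row; unparseable values become None."""
    converted = {}
    for key, value in row.items():
        converter = COLUMN_TYPES.get(key)
        if converter is None:
            # Mirror load_data: skip columns that have no numeric type.
            continue
        try:
            converted[key] = converter(value)
        except (ValueError, TypeError):
            # ValueError: bad text such as "abc"; TypeError: missing field (None).
            converted[key] = None
    return converted


# Illustrative in-memory sample (assumed values, not real clinic data).
SAMPLE_CSV = (
    "patient_id,age,heart_rate,systolic_bp,oxygen_saturation,temperature\n"
    "1,34,72,118,98,36.6\n"
    "2,51,abc,135,95,37.2\n"
)

rows = [convert_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(rows[0]["temperature"])  # 36.6
print(rows[1]["heart_rate"])   # None, because "abc" is not an integer
```

With a table like this, adding another numeric column only requires one new entry in the dictionary rather than another branch in the conversion logic.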
