### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE improves on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and average BODE survival rate for each area hospital.
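
The survival-rate half of the task can be handled with a small lookup. The sketch below assumes the approximate 4-year survival figures commonly quoted for the BODE index (80%, 67%, 57%, and 18% for scores 0-2, 3-4, 5-6, and 7-10); check them against the MDCalc page linked above before relying on them. The function name `bode_survival_pct` is hypothetical and is not part of the assignment.

```python
def bode_survival_pct(bode_score):
    """Return the approximate 4-year survival percentage for a BODE score.

    The percentages are assumed values; verify them against the reference.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(bode_score, bool) or not isinstance(bode_score, int):
        raise ValueError("BODE score must be an integer from 0 to 10.")
    if not 0 <= bode_score <= 10:
        raise ValueError("BODE score must be an integer from 0 to 10.")

    if bode_score <= 2:
        return 80
    if bode_score <= 4:
        return 67
    if bode_score <= 6:
        return 57
    return 18
```

Returning a number rather than a label keeps the value easy to average per hospital later.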

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

Patient output will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL
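
One way to produce a file with exactly these columns is `csv.DictWriter`. This is a minimal sketch, not the required solution: the helper name `write_patient_rows` and the sample row are made up for illustration, and the output goes to an in-memory buffer so the example runs without touching disk.

```python
import csv
import io

PATIENT_FIELDS = ["NAME", "BODE_SCORE", "BODE_RISK", "HOSPITAL"]

def write_patient_rows(rows, out_file):
    """Write patient result dictionaries to an open text file as CSV."""
    writer = csv.DictWriter(out_file, fieldnames=PATIENT_FIELDS, lineterminator="\n")
    writer.writeheader()
    # DictWriter raises ValueError if a row has a key outside PATIENT_FIELDS.
    writer.writerows(rows)

# Hypothetical sample row, for illustration only.
sample_rows = [
    {"NAME": "Jane Doe", "BODE_SCORE": 4, "BODE_RISK": 67, "HOSPITAL": "Example General"},
]

buffer = io.StringIO()
write_patient_rows(sample_rows, buffer)
print(buffer.getvalue())
```

To write the real file, pass a handle opened with `open("patient_output.csv", "w", newline="")` in place of the buffer.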

Hospital output will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK
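
The assignment text does not spell out how `PCT_OF_COPD_CASES_OVER_BEDS` is computed. The sketch below assumes it means the number of COPD cases divided by the hospital's bed count, expressed as a percentage, and that the bed count has already been read from the hospital data. Both the interpretation and the function name `pct_cases_over_beds` are assumptions.

```python
def pct_cases_over_beds(copd_count, beds):
    """Return COPD cases as a percentage of hospital beds (assumed definition)."""
    if copd_count < 0:
        raise ValueError("COPD count cannot be negative.")
    if beds <= 0:
        raise ValueError("Bed count must be positive.")
    # Multiply before dividing to limit floating-point rounding error.
    return round(100 * copd_count / beds, 2)

print(pct_cases_over_beds(25, 500))
```

Rejecting a zero bed count up front avoids a `ZeroDivisionError` and matches the assignment's rule that bad input should raise `ValueError`.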

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a `ValueError`.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
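
As a sketch of how the doctest requirement can be checked, the example below runs the doctests of a single function and reports how many failed. The names `double` and `run_doctests` are hypothetical and only serve the illustration; in a notebook, `doctest.run_docstring_examples(func, globals())` is a shorter alternative that prints failures instead of returning a count.

```python
import doctest

def double(x):
    """
    Return twice the input.

    >>> double(2)
    4
    >>> double("a")
    Traceback (most recent call last):
        ...
    ValueError: x must be a number.
    """
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise ValueError("x must be a number.")
    return x * 2

def run_doctests(func):
    """Run the doctests in func's docstring and return the result counts."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Supply the function itself as a global so the examples can call it.
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)

results = run_doctests(double)
print(results.failed, results.attempted)
```

A `Traceback` example like the second one is how doctest verifies that invalid input raises `ValueError` with the expected message.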

In [11]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [24]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate the Body Mass Index (BMI).

    Parameters:
    weight_kg (float): The weight of the individual in kilograms.
    height_m (float): The height of the individual in meters.

    Returns:
    float: The calculated BMI.

    Raises:
    ValueError: If weight or height is not positive.

    >>> calculate_bmi(70, 1.75)
    22.857142857142858
    >>> calculate_bmi(0, 1.75)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    >>> calculate_bmi(70, 0)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")

    bmi = weight_kg / (height_m ** 2)
    return bmi


### Step 2: Calculate BODE Score

In [13]:
def calculate_bode_score(fev_pct, dyspnea_description, distance_meters, weight_kg, height_m):
    """
    Calculate the BODE score based on BMI, FEV1 percentage, dyspnea level, and distance walked.

    Parameters:
    fev_pct (float): FEV1 as a percentage of the predicted value.
    dyspnea_description (str): Description of dyspnea (None, Mild, Moderate, Severe).
    distance_meters (float): Distance walked in 6 minutes, in meters.
    weight_kg (float): The weight of the individual in kilograms.
    height_m (float): The height of the individual in meters.

    Returns:
    tuple: BODE score and risk category.

    Raises:
    ValueError: If inputs are not valid.

    >>> calculate_bode_score(50, 'Moderate', 300, 70, 1.75)
    (4, 'Moderate')
    >>> calculate_bode_score(30, 'Severe', 200, 80, 1.8)
    (8, 'Very High')
    >>> calculate_bode_score(80, 'None', 600, 60, 1.6)
    (0, 'Low')
    >>> calculate_bode_score(80, 'Unknown', 600, 60, 1.6)
    Traceback (most recent call last):
        ...
    ValueError: Unrecognized dyspnea description: 'Unknown'
    """
    if fev_pct < 0 or distance_meters < 0:
        raise ValueError("FEV1 percentage and distance must not be negative.")

    # Calculate BMI (raises ValueError for non-positive weight or height)
    bmi = calculate_bmi(weight_kg, height_m)

    # BODE components
    # B: BMI points (1 point for BMI <= 21, otherwise 0; a high BMI adds no points)
    b_points = 1 if bmi <= 21 else 0

    # O: FEV1 points (<= 35 scores 3, below 50 scores 2, below 65 scores 1)
    if fev_pct <= 35:
        o_points = 3
    elif fev_pct < 50:
        o_points = 2
    elif fev_pct < 65:
        o_points = 1
    else:
        o_points = 0

    # D: Dyspnea points (unknown descriptions are rejected, not scored as Severe)
    dyspnea_points = {'None': 0, 'Mild': 1, 'Moderate': 2, 'Severe': 3}
    if dyspnea_description not in dyspnea_points:
        raise ValueError(f"Unrecognized dyspnea description: {dyspnea_description!r}")
    d_points = dyspnea_points[dyspnea_description]

    # E: Distance points
    if distance_meters < 150:
        e_points = 3
    elif distance_meters < 250:
        e_points = 2
    elif distance_meters < 350:
        e_points = 1
    else:
        e_points = 0

    # Total BODE score (0 to 10)
    total_bode_score = b_points + o_points + d_points + e_points

    # Risk categories based on total BODE score
    if total_bode_score <= 2:
        risk_category = 'Low'
    elif total_bode_score <= 4:
        risk_category = 'Moderate'
    elif total_bode_score <= 6:
        risk_category = 'High'
    else:
        risk_category = 'Very High'

    return total_bode_score, risk_category


### Step 3: Calculate BODE Risk

In [15]:
def determine_bode_risk(bode_score):
    """
    Determine BODE risk category based on the BODE score.

    Parameters:
    bode_score (int): The total BODE score.

    Returns:
    str: Risk category.

    Raises:
    ValueError: If BODE score is negative.

    >>> determine_bode_risk(1)
    'Low'
    >>> determine_bode_risk(4)
    'Moderate'
    >>> determine_bode_risk(6)
    'High'
    >>> determine_bode_risk(7)
    'Very High'
    >>> determine_bode_risk(2.5)
    'Moderate'
    >>> determine_bode_risk(-1)
    Traceback (most recent call last):
        ...
    ValueError: BODE score cannot be negative.
    """
    if bode_score < 0:
        raise ValueError("BODE score cannot be negative.")

    # Use upper bounds only, so fractional averages such as 2.5 or 4.5
    # land in the next category instead of falling through to 'Very High'.
    if bode_score <= 2:
        return 'Low'
    elif bode_score <= 4:
        return 'Moderate'
    elif bode_score <= 6:
        return 'High'
    else:
        return 'Very High'


### Step 4: Load Hospital Data

In [17]:
import csv

def load_patient_data(input_file):
    """
    Load patient data from a CSV file and calculate BODE scores and risks.

    Parameters:
    input_file (str): The path to the input CSV file.

    Returns:
    list: A list of dictionaries containing patient results.
    dict: A dictionary containing hospital summaries.

    Raises:
    ValueError: If the input data is invalid.

    >>> load_patient_data('patients.csv')  # doctest: +SKIP
    """
    patient_results = []
    hospital_summary = {}

    with open(input_file, mode='r', newline='') as file:
        reader = csv.DictReader(file)

        for row in reader:
            try:
                # Extract patient data
                name = row['NAME']
                height_m = float(row['HEIGHT_M'])
                weight_kg = float(row['WEIGHT_KG'])
                fev_pct = float(row['fev_pct'])
                dyspnea_description = row['dyspnea_description']
                distance_meters = float(row['distance_in_meters'])
                hospital = row['hospital']

                # Calculate BODE score and risk
                bode_score, risk_category = calculate_bode_score(
                    fev_pct, dyspnea_description, distance_meters, weight_kg, height_m
                )

                # Append patient result
                patient_results.append({
                    'NAME': name,
                    'BODE_SCORE': bode_score,
                    'BODE_RISK': risk_category,
                    'HOSPITAL': hospital
                })

                # Update hospital summary
                if hospital not in hospital_summary:
                    hospital_summary[hospital] = {
                        'COPD_COUNT': 0,
                        'TOTAL_SCORE': 0
                    }
                hospital_summary[hospital]['COPD_COUNT'] += 1
                hospital_summary[hospital]['TOTAL_SCORE'] += bode_score

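            except (ValueError, KeyError) as e:
                # Read the name from the row so a missing NAME column cannot
                # trigger a NameError or report the previous patient's name.
                patient = row.get('NAME', '<unknown>')
                raise ValueError(f"Invalid data for patient {patient}: {e}") from e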

    # Calculate average scores for each hospital
    for hospital, data in hospital_summary.items():
        data['AVG_SCORE'] = data['TOTAL_SCORE'] / data['COPD_COUNT']
        data['AVG_RISK'] = determine_bode_risk(data['AVG_SCORE'])

    return patient_results, hospital_summary


### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.
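
The per-hospital step can be sketched as a group-then-average pass over the patient results. This sketch assumes each patient row carries a numeric `BODE_SCORE` and a numeric `BODE_RISK` (for example a survival percentage); if the risk is stored as a category label, only the score can be averaged this way. The function name `summarize_by_hospital` and the sample rows are hypothetical.

```python
from collections import defaultdict

def summarize_by_hospital(patient_results):
    """Group patient rows by hospital and average their scores and risks."""
    groups = defaultdict(list)
    for patient in patient_results:
        groups[patient["HOSPITAL"]].append(patient)

    summary = []
    for hospital in sorted(groups):
        patients = groups[hospital]
        count = len(patients)  # always >= 1, so the divisions below are safe
        summary.append({
            "HOSPITAL_NAME": hospital,
            "COPD_COUNT": count,
            "AVG_SCORE": sum(p["BODE_SCORE"] for p in patients) / count,
            "AVG_RISK": sum(p["BODE_RISK"] for p in patients) / count,
        })
    return summary

# Hypothetical sample data, for illustration only.
sample_patients = [
    {"NAME": "A", "BODE_SCORE": 2, "BODE_RISK": 80, "HOSPITAL": "North"},
    {"NAME": "B", "BODE_SCORE": 4, "BODE_RISK": 67, "HOSPITAL": "North"},
    {"NAME": "C", "BODE_SCORE": 7, "BODE_RISK": 18, "HOSPITAL": "South"},
]

hospital_rows = summarize_by_hospital(sample_patients)
for row in hospital_rows:
    print(row)
```

Sorting the hospital names keeps the output order stable between runs, which makes the resulting CSV easier to compare.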