### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE is an improvement on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation, use it to calculate BODE scores and BODE survival rates for a group of patients. Then we want to evaluate the average BODE scores and BODE survival rates for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

Patient_output will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

Hospital output will have the following columns:
HOSPITAL_NAME, COPD_COUNT, PCT_OF_COPD_CASES_OVER_BEDS, AVG_SCORE, AVG_RISK

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a ValueError.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
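
The documentation and testing requirement above can be met with a docstring that doubles as a doctest. The following is a minimal sketch, not part of the assignment solution: `checked_ratio` is a hypothetical helper invented for illustration, and it shows one way to document a function, test it, and raise a ValueError on bad input.

```python
import doctest


def checked_ratio(numerator, denominator):
    """
    Divide numerator by denominator (hypothetical helper, for illustration only).

    >>> checked_ratio(10, 4)
    2.5
    >>> checked_ratio(10, 0)
    Traceback (most recent call last):
        ...
    ValueError: denominator must be greater than zero
    """
    # Reject input that cannot produce a meaningful result
    if denominator <= 0:
        raise ValueError("denominator must be greater than zero")
    return numerator / denominator


# Run only the examples in this one docstring and collect the outcome
parser = doctest.DocTestParser()
test = parser.get_doctest(
    checked_ratio.__doc__, {"checked_ratio": checked_ratio}, "checked_ratio", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

In a notebook, `doctest.run_docstring_examples(checked_ratio, globals(), verbose=True)` is a shorter way to run the same examples and see each one reported.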

In [43]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [44]:
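def calc_bmi(weight, height):
    """
    Calculate the body mass index (BMI) of a patient.

    Args:
    weight (float): Weight of the patient in kilograms.
    height (float): Height of the patient in meters.

    Returns:
    float: The BMI of the patient, rounded to 2 decimal places.

    Raises:
    ValueError: If weight or height is not greater than zero.

    Examples:
    >>> calc_bmi(70, 1.75)
    22.86
    >>> calc_bmi(70, 0)
    Traceback (most recent call last):
        ...
    ValueError: weight and height must be greater than zero
    """

    # Reject values that cannot be real measurements (this also avoids dividing by zero)
    if weight <= 0 or height <= 0:
        raise ValueError("weight and height must be greater than zero")

    # BMI formula: weight in kilograms divided by height in meters squared
    bmi_value = weight / (height ** 2)

    # Round to two decimal places for readability
    return round(bmi_value, 2)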


### Step 2: Calculate BODE Score

In [45]:
def calc_bode(bmi, fev1, dyspnea, walk_distance):
    """
    Calculate the BODE score for a COPD patient. A higher score indicates a more severe condition.

    Args:
    bmi (float): The BMI of the patient.
    fev1 (float): FEV1 as a percentage of the predicted value.
    dyspnea (int): Dyspnea grade on the mMRC scale, from 0 to 4.
    walk_distance (int): Distance walked during the 6-minute walk test, in meters.

    Returns:
    int: Total BODE score, from 0 to 10 (sum of the BMI, FEV1, dyspnea, and walk distance points).

    Raises:
    ValueError: If dyspnea is not an integer grade from 0 to 4.

    Examples:
    >>> calc_bode(25, 70, 1, 400)
    0
    >>> calc_bode(20, 40, 3, 200)
    7
    """

    # The dyspnea grade must be on the 0-4 scale before it can be converted to points
    if dyspnea not in (0, 1, 2, 3, 4):
        raise ValueError("dyspnea must be an integer from 0 to 4")

    # Start at 0 and add points for each component
    score = 0

    # A BMI of 21 or less adds 1 point because low body weight worsens the prognosis
    if bmi <= 21:
        score += 1

    # FEV1 percent of predicted: the worse the lung function, the more points
    if fev1 < 36:
        score += 3
    elif fev1 < 50:
        score += 2
    elif fev1 < 65:
        score += 1

    # Dyspnea grades 0 and 1 add no points; grades 2, 3, and 4 add 1, 2, and 3 points
    score += max(dyspnea - 1, 0)

    # Walk distance: a shorter distance indicates a weaker condition and adds more points
    if walk_distance < 150:
        score += 3
    elif walk_distance < 250:
        score += 2
    elif walk_distance < 350:
        score += 1

    # Return the total of all component points
    return score
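
To make the scoring rules easier to audit, the thresholds can be laid out as data and the score reported per component. The sketch below is an illustration under stated assumptions: `bode_breakdown`, `points_from_bands`, and the lookup tables are hypothetical names, and the cut-offs are the commonly cited BODE index thresholds (BMI of 21 or less, FEV1 bands at 65/50/36 percent, mMRC grades 0 to 4, walk bands at 350/250/150 meters), which should be checked against the MDCalc page linked above. It converts the mMRC grade to points (grades 0 and 1 score 0) instead of adding the raw grade.

```python
# Hypothetical lookup tables: (lower bound, points), checked from the highest bound down.
# Cut-offs are assumed from the commonly cited BODE index; verify before relying on them.
FEV1_BANDS = [(65, 0), (50, 1), (36, 2), (0, 3)]
WALK_BANDS = [(350, 0), (250, 1), (150, 2), (0, 3)]
MMRC_POINTS = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}


def points_from_bands(value, bands):
    """Return the points of the first band whose lower bound the value reaches."""
    if value < 0:
        raise ValueError("measurement cannot be negative")
    for lower_bound, points in bands:
        if value >= lower_bound:
            return points


def bode_breakdown(bmi, fev1, mmrc, walk_distance):
    """Return the points contributed by each BODE component plus the total."""
    if mmrc not in MMRC_POINTS:
        raise ValueError("mMRC grade must be an integer from 0 to 4")
    parts = {
        "bmi": 1 if bmi <= 21 else 0,
        "fev1": points_from_bands(fev1, FEV1_BANDS),
        "dyspnea": MMRC_POINTS[mmrc],
        "walk": points_from_bands(walk_distance, WALK_BANDS),
    }
    # The total is computed before it is stored, so it only sums the four components
    parts["total"] = sum(parts.values())
    return parts


print(bode_breakdown(bmi=20.0, fev1=40, mmrc=3, walk_distance=200))
```

Keeping the bands in lists means a threshold can be corrected in one place without touching the scoring logic.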


### Step 3: Calculate BODE Risk

In [46]:
def calc_bode_risk(bode_score):
    """
    This function calculates the BODE risk category based on the BODE score.

    Args:
    bode_score (int): The total BODE score of the patient.

    Returns:
    str: The risk category ("Low", "Moderate", "High", "Very High").

    Example:
    >>> calc_bode_risk(3)
    'Moderate'
    """

    # Map the BODE score from step 2 to a risk category
    if bode_score <= 2:
        return "Low"
    elif bode_score <= 4:
        return "Moderate"
    elif bode_score <= 6:
        return "High"
    else:
        return "Very High"
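
The assignment asks for survival rates as well as a category. As a hedged sketch, a score band can be mapped to an approximate 4-year survival percentage. `bode_survival_pct` and `SURVIVAL_BANDS` are hypothetical names, and the percentages (80, 67, 57, 18) are the commonly cited estimates for the four BODE bands; they are an assumption here and should be confirmed against the MDCalc page linked above.

```python
# Assumed 4-year survival estimates per score band: (highest score in band, percent).
# These figures are not taken from this notebook; confirm them against the MDCalc page.
SURVIVAL_BANDS = [(2, 80), (4, 67), (6, 57), (10, 18)]


def bode_survival_pct(bode_score):
    """Return the assumed 4-year survival percentage for a BODE score of 0 to 10."""
    # bool is a subclass of int, so it is rejected explicitly
    if isinstance(bode_score, bool) or not isinstance(bode_score, int):
        raise ValueError("BODE score must be an integer")
    if not 0 <= bode_score <= 10:
        raise ValueError("BODE score must be between 0 and 10")
    for highest_score, survival_pct in SURVIVAL_BANDS:
        if bode_score <= highest_score:
            return survival_pct


print(bode_survival_pct(3))
```

Averaging such a percentage per hospital would give a numeric risk figure, as an alternative to counting patients in the higher risk categories.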


### Step 4: Load Hospital Data

In [47]:
def load_hospital_data(hospital_file):
    """
    This function loads hospital data from a JSON file.

    Args:
    hospital_file (str): The path to the hospital JSON file.

    Returns:
    dict: The hospital data.

    Example (skipped by doctest because it depends on a local hospitals.json file):
    >>> load_hospital_data("hospitals.json")  # doctest: +SKIP
    {'Hospital A': 100, 'Hospital B': 50}
    """

    # Open the JSON file and load its contents into a Python dictionary
    with open(hospital_file, 'r') as file:
        data = json.load(file)

    return data
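
Because the layout of the hospital file is not shown here, the sketch below writes a small made-up JSON file to a temporary directory and reads it back with `json.load`. The hospital names, the name-to-bed-count layout, and the `load_json_checked` helper are all hypothetical; the real file may be structured differently.

```python
import json
import os
import tempfile


def load_json_checked(path):
    """Load a JSON file and raise ValueError if it is missing or malformed."""
    try:
        with open(path, "r") as file:
            return json.load(file)
    except FileNotFoundError as error:
        raise ValueError(f"hospital file not found: {path}") from error
    except json.JSONDecodeError as error:
        raise ValueError(f"hospital file is not valid JSON: {path}") from error


# Made-up sample data: hospital name mapped to bed count (an assumed layout)
sample = {"Hospital A": 100, "Hospital B": 50}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "hospitals.json")
    with open(path, "w") as file:
        json.dump(sample, file)
    loaded = load_json_checked(path)

print(loaded)
```

Converting a missing or malformed file into a ValueError keeps the error handling consistent with the rest of the assignment's input checks.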


### Step 5: Main business logic

Call the BODE score and BODE risk functions for each patient.

For each hospital, calculate the average BODE score and average BODE risk, and count the number of cases.

In [48]:
def process_patients(patient_file, hospital_file, patient_output, hospital_output):
    """
    This function processes the patient data to calculate BODE scores and risk levels.
    It also generates hospital-level summaries.

    Args:
    patient_file (str): Path to the CSV file with patient data.
    hospital_file (str): Path to the JSON file with hospital data.
    patient_output (str): Path to save the patient output CSV file.
    hospital_output (str): Path to save the hospital output CSV file.

    Returns:
    None
    """

    # Open the patient CSV file and read its rows
    with open(patient_file, 'r') as file:
        reader = csv.DictReader(file)

        # Next we will create a list to store patient results
        patient_results = []

        # Then, create a dictionary to store hospital summary data
        hospital_summary = {}

        for row in reader:
            # Extract patient details
            name = row['NAME']
            height = float(row['HEIGHT_M'])
            weight = float(row['WEIGHT_KG'])
            fev1 = float(row['fev_pct'])
            dyspnea = int(row['dyspnea_description'])
            walk_distance = int(row['distance_in_meters'])
            hospital = row['hospital']

            # Calculate BMI
            bmi = calc_bmi(weight, height)

            # Calculate BODE score
            bode_score = calc_bode(bmi, fev1, dyspnea, walk_distance)

            # Calculate BODE risk
            bode_risk = calc_bode_risk(bode_score)

            # Append the patient results to the list
            patient_results.append({
                'NAME': name,
                'BODE_SCORE': bode_score,
                'BODE_RISK': bode_risk,
                'HOSPITAL': hospital
            })

            # Update the hospital summary data
            if hospital not in hospital_summary:
                hospital_summary[hospital] = {
                    'COPD_COUNT': 0,
                    'TOTAL_SCORE': 0,
                    'TOTAL_RISK': 0
                }

            hospital_summary[hospital]['COPD_COUNT'] += 1
            hospital_summary[hospital]['TOTAL_SCORE'] += bode_score
            # Count patients in the two highest risk categories; AVG_RISK reports them as a percentage
            hospital_summary[hospital]['TOTAL_RISK'] += 1 if bode_risk in ('High', 'Very High') else 0

        # Write patient output to CSV file
        with open(patient_output, 'w', newline='') as file:
            writer = csv.DictWriter(file, fieldnames=['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
            writer.writeheader()
            writer.writerows(patient_results)

        # Write hospital output to CSV file
        with open(hospital_output, 'w', newline='') as file:
            writer = csv.DictWriter(file, fieldnames=['HOSPITAL_NAME', 'COPD_COUNT', 'AVG_SCORE', 'AVG_RISK'])
            writer.writeheader()
            for hospital, data in hospital_summary.items():
                writer.writerow({
                    'HOSPITAL_NAME': hospital,
                    'COPD_COUNT': data['COPD_COUNT'],
                    'AVG_SCORE': round(data['TOTAL_SCORE'] / data['COPD_COUNT'], 2),
                    'AVG_RISK': round(data['TOTAL_RISK'] / data['COPD_COUNT'] * 100, 2)
                })
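
The hospital output described in the assignment also lists PCT_OF_COPD_CASES_OVER_BEDS, which needs a bed count per hospital. The sketch below shows one way the per-hospital aggregation could be written with `csv` and in-memory text. The sample rows, the `beds` mapping (hospital name to bed count), and the `summarize_hospitals` helper are hypothetical, and AVG_RISK is taken here to be the percentage of patients rated High or Very High.

```python
import csv
import io

FIELDS = ["HOSPITAL_NAME", "COPD_COUNT", "PCT_OF_COPD_CASES_OVER_BEDS", "AVG_SCORE", "AVG_RISK"]


def summarize_hospitals(patient_rows, beds):
    """Aggregate scored patient rows into one summary row per hospital."""
    totals = {}
    for row in patient_rows:
        entry = totals.setdefault(row["HOSPITAL"], {"count": 0, "score": 0, "high": 0})
        entry["count"] += 1
        entry["score"] += int(row["BODE_SCORE"])
        # A boolean adds 1 when the patient is in a high-risk category, else 0
        entry["high"] += row["BODE_RISK"] in ("High", "Very High")

    summary = []
    for hospital, entry in totals.items():
        if hospital not in beds or beds[hospital] <= 0:
            raise ValueError(f"no usable bed count for {hospital}")
        summary.append({
            "HOSPITAL_NAME": hospital,
            "COPD_COUNT": entry["count"],
            "PCT_OF_COPD_CASES_OVER_BEDS": round(entry["count"] / beds[hospital] * 100, 2),
            "AVG_SCORE": round(entry["score"] / entry["count"], 2),
            "AVG_RISK": round(entry["high"] / entry["count"] * 100, 2),
        })
    return summary


# Made-up patient output, read back the same way a real CSV file would be
patient_csv = io.StringIO(
    "NAME,BODE_SCORE,BODE_RISK,HOSPITAL\n"
    "Patient 1,2,Low,Hospital A\n"
    "Patient 2,7,Very High,Hospital A\n"
    "Patient 3,5,High,Hospital B\n"
)
rows = summarize_hospitals(csv.DictReader(patient_csv), beds={"Hospital A": 100, "Hospital B": 50})

output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```

Separating the aggregation from the file handling also makes the summary logic testable without creating files on disk.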
