### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to run an analysis using a new metric, BODE. BODE improves on a previous metric and promises to provide insight into survival risks.

BODE is defined here. https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and average BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME, COPD_COUNT, PCT_OF_COPD_CASES_OVER_BEDS, AVG_SCORE, AVG_RISK
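
As a minimal sketch of the two output layouts, the snippet below renders a header and one row with `csv.DictWriter`. It writes to an in-memory `io.StringIO` buffer instead of a real file so it runs anywhere. The helper `rows_to_csv` and the row values are placeholders invented for illustration, not results from the patient data.

```python
import csv
import io

# Column layouts as listed in the assignment.
PATIENT_COLUMNS = ["NAME", "BODE_SCORE", "BODE_RISK", "HOSPITAL"]
HOSPITAL_COLUMNS = ["HOSPITAL_NAME", "COPD_COUNT", "PCT_OF_COPD_CASES_OVER_BEDS",
                    "AVG_SCORE", "AVG_RISK"]


def rows_to_csv(columns, rows):
    """Render a list of dict rows as CSV text with a header line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values for illustration only.
patient_text = rows_to_csv(PATIENT_COLUMNS, [
    {"NAME": "Example Patient", "BODE_SCORE": 3, "BODE_RISK": 67,
     "HOSPITAL": "Example Hospital"},
])
print(patient_text)
```

Because `DictWriter` is given the column list up front, a row with a misspelled key raises a `ValueError` instead of silently writing a misaligned file.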

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a ValueError.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
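
A minimal sketch of how one docstring can test both a normal result and a `ValueError`. The function `percent_of` is a hypothetical helper invented for this illustration and is not part of the assignment. The doctest is run through `DocTestParser` and `DocTestRunner` so the pass/fail counts are available as a value; in a notebook, `doctest.testmod()` is the more common way to run every docstring at once.

```python
import doctest


def percent_of(part, whole):
    """
    Return part as a percentage of whole, rounded to one decimal place.

    >>> percent_of(1, 4)
    25.0
    >>> percent_of(1, 0)
    Traceback (most recent call last):
        ...
    ValueError: whole must be positive.
    """
    if whole <= 0:
        raise ValueError("whole must be positive.")
    return round(100 * part / whole, 1)


# Parse the examples out of the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(percent_of.__doc__, {"percent_of": percent_of},
                          "percent_of", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

The expected traceback only needs the header line, an elided body (`...`), and the final exception line; doctest compares the exception type and message.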

In [4]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [6]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index (BMI) using the formula:

    BMI = weight_kg / (height_m ** 2)

    Args:
        weight_kg (float): The weight of the patient in kilograms.
        height_m (float): The height of the patient in meters.

    Returns:
        float: The calculated BMI.

    Raises:
        ValueError: If weight or height is non-positive.

    >>> round(calculate_bmi(90.28, 1.72), 2)
    30.52
    >>> round(calculate_bmi(83.09, 1.64), 2)
    30.89
    >>> calculate_bmi(0, 1.72)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")

    return weight_kg / (height_m ** 2)



### Step 2: Calculate BODE Score

In [28]:
def calculate_bode(fev_pct, distance_m, dyspnea, bmi):
    """
    Calculate the BODE score using the argument order that main() passes.

    This is a thin wrapper; the scoring logic lives in calculate_bode_score().
    """
    return calculate_bode_score(bmi, fev_pct, dyspnea, distance_m)

In [29]:
def calculate_bode_score(bmi, fev_pct, dyspnea_description, distance_m):
    """
    Calculate the BODE score (0-10) as the sum of four component scores:
    - BMI score (0-1)
    - FEV% score (0-3)
    - Dyspnea score (0-3)
    - Distance walked score (0-3)

    Args:
        bmi (float): Body Mass Index of the patient.
        fev_pct (float): Percentage of forced expiratory volume.
        dyspnea_description (str): Description of the patient's dyspnea.
        distance_m (float): The distance the patient can walk in 6 minutes (in meters).

    Returns:
        int: Total BODE score.

    Raises:
        ValueError: If a numeric input is out of range or the dyspnea
            description is not recognized.

    >>> calculate_bode_score(30.53, 57.73, 'STOPS AFTER A FEW MINUTES', 367.9)
    3
    >>> calculate_bode_score(20.0, 30.0, 'UNABLE TO LEAVE HOME', 100.0)
    10
    >>> calculate_bode_score(25.0, 70.0, 'NOT A REAL DESCRIPTION', 400.0)
    Traceback (most recent call last):
        ...
    ValueError: Unknown dyspnea description: 'NOT A REAL DESCRIPTION'
    """
    if bmi <= 0 or fev_pct < 0 or distance_m < 0:
        raise ValueError("BMI must be positive; FEV% and distance cannot be negative.")

    # BMI scoring: 0 points above 21, 1 point at or below 21
    bmi_score = 0 if bmi > 21 else 1

    # FEV% scoring
    if fev_pct >= 65:
        fev_score = 0
    elif fev_pct >= 50:
        fev_score = 1
    elif fev_pct >= 36:
        fev_score = 2
    else:
        fev_score = 3

    # Dyspnea scoring
    dyspnea_map = {
        "ONLY STRENUOUS EXERCISE": 0,
        "WHEN HURRYING": 0,
        "SLOWER THAN PEERS": 1,
        "STOPS AFTER 100 YARDS": 2,
        "STOPS AFTER A FEW MINUTES": 2,
        "BREATHLESS WHEN DRESSING": 3,
        "UNABLE TO LEAVE HOME": 3
    }
    # Normalize case and whitespace, and reject descriptions that are not in the map
    dyspnea_key = dyspnea_description.strip().upper()
    if dyspnea_key not in dyspnea_map:
        raise ValueError(f"Unknown dyspnea description: {dyspnea_description!r}")
    dyspnea_score = dyspnea_map[dyspnea_key]

    # Distance walked scoring
    if distance_m >= 350:
        distance_score = 0
    elif distance_m >= 250:
        distance_score = 1
    elif distance_m >= 150:
        distance_score = 2
    else:
        distance_score = 3

    # Calculate and return the total BODE score
    return bmi_score + fev_score + dyspnea_score + distance_score


### Step 3: Calculate BODE Risk

In [30]:
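def calculate_bode_risk(bode_score):
    """
    Calculate the 4-year survival rate for a BODE score.

    :param bode_score: Total BODE score (0-10)
    :return: 4-year survival percentage
    :raises ValueError: If the score is outside the range 0-10

    >>> calculate_bode_risk(2)
    80
    >>> calculate_bode_risk(5)
    57
    >>> calculate_bode_risk(11)
    Traceback (most recent call last):
        ...
    ValueError: Invalid BODE score.
    """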
    if 0 <= bode_score <= 2:
        return 80
    elif 3 <= bode_score <= 4:
        return 67
    elif 5 <= bode_score <= 6:
        return 57
    elif 7 <= bode_score <= 10:
        return 18
    else:
        raise ValueError("Invalid BODE score.")

### Step 4: Load Hospital Data

In [31]:
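import urllib.request as r
import json

# Fetch the JSON data and close the connection when done
with r.urlopen("https://hds5210-data.s3.amazonaws.com/hospitals.json") as response:
    content = response.read()

# Parse the JSON data directly
newDictionary = json.loads(content)

# Save a local copy so hospitals.json can be opened by file name
with open("hospitals.json", "wb") as file:
    file.write(content)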

In [32]:
def load_hospital_data(file_name):
    """
    Load hospital data from a JSON file.

    :param file_name: JSON file name
    :return: Hospital data dictionary
    """
    with open(file_name, 'r') as file:
        return json.load(file)

hospital_data = load_hospital_data("hospitals.json")


### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.

In [33]:
def main(patient_csv, hospital_json):
    """
    Score every patient in patient_csv, then write patient_output.csv and
    hospital_output.csv.

    :param patient_csv: Patient CSV file name
    :param hospital_json: Hospital JSON file name (accepted but not read here,
        so PCT_OF_COPD_CASES_OVER_BEDS is not written)
    """
    patients = []
    hospitals = {}

    with open(patient_csv, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            # Look up the name first so the error message can always use it
            name = row.get('NAME', '<unknown>')
            try:
                height = float(row['HEIGHT_M'])
                weight = float(row['WEIGHT_KG'])
                fev_pct = float(row['fev_pct'])
                distance_m = float(row['distance_in_meters'])
                dyspnea = row['dyspnea_description']
                hospital = row['hospital']

                # Keyword arguments guard against mixing up the argument order
                bmi = calculate_bmi(weight_kg=weight, height_m=height)
                bode_score = calculate_bode(fev_pct=fev_pct, distance_m=distance_m,
                                            dyspnea=dyspnea, bmi=bmi)
                bode_risk = calculate_bode_risk(bode_score)

                patients.append([name, bode_score, bode_risk, hospital])

                if hospital not in hospitals:
                    hospitals[hospital] = {'count': 0, 'total_score': 0, 'total_risk': 0}

                hospitals[hospital]['count'] += 1
                hospitals[hospital]['total_score'] += bode_score
                hospitals[hospital]['total_risk'] += bode_risk
            except ValueError as e:
                # Report and skip rows with bad values instead of aborting the run
                print(f"Error processing patient {name}: {e}")

    # Write patient_output.csv
    with open("patient_output.csv", 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
        writer.writerows(patients)

    # Write hospital_output.csv
    with open("hospital_output.csv", 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['HOSPITAL_NAME', 'COPD_COUNT', 'AVG_SCORE', 'AVG_RISK'])
        for hospital, data in hospitals.items():
            avg_score = data['total_score'] / data['count']
            avg_risk = data['total_risk'] / data['count']
            writer.writerow([hospital, data['count'], avg_score, avg_risk])
# Call the main function
main("patient.csv", "hospitals.json")
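
The assignment lists a PCT_OF_COPD_CASES_OVER_BEDS column that main() does not write. The sketch below shows one way it might be computed. It rests on two assumptions: the layout of hospitals.json is not shown in this notebook, so bed counts are assumed to have been extracted already into a plain dictionary keyed by hospital name, and the column is read as COPD cases divided by beds, times 100. The helper `summarize_hospital`, the `beds_by_hospital` dictionary, and the sample numbers are all hypothetical.

```python
def summarize_hospital(name, stats, beds_by_hospital):
    """Build one hospital_output.csv row from the running totals for a hospital."""
    count = stats["count"]
    beds = beds_by_hospital.get(name)
    # Assumed input shape: {hospital name: number of beds}.
    if not beds or beds <= 0:
        raise ValueError(f"No positive bed count for hospital {name!r}.")
    return [
        name,
        count,
        round(100 * count / beds, 2),            # PCT_OF_COPD_CASES_OVER_BEDS
        round(stats["total_score"] / count, 2),  # AVG_SCORE
        round(stats["total_risk"] / count, 2),   # AVG_RISK
    ]


# Made-up totals in the same shape as the per-hospital dictionary in main().
stats = {"count": 4, "total_score": 14, "total_risk": 262}
row = summarize_hospital("Example Hospital", stats, {"Example Hospital": 200})
print(row)
```

Raising a `ValueError` for a missing or zero bed count keeps a bad hospital record from turning into a division error or a misleading percentage.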