
### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to run an analysis using a new metric, BODE. BODE improves on a previous metric and promises to provide insight into survival risks.

BODE is defined here. https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and average BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

Patient_output will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

Hospital output will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a `ValueError`.
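
As an illustration of that requirement, here is a minimal sketch of a documented function whose doctests cover both the normal path and the `ValueError` path. The helper `parse_positive_float` and its error messages are hypothetical and are not part of the assignment; they only show the pattern.

```python
def parse_positive_float(text, field_name):
    """
    Convert a CSV text field to a positive float (hypothetical helper).

    >>> parse_positive_float("1.75", "HEIGHT_M")
    1.75
    >>> parse_positive_float("abc", "HEIGHT_M")
    Traceback (most recent call last):
        ...
    ValueError: HEIGHT_M must be a number, got 'abc'
    >>> parse_positive_float("-3", "WEIGHT_KG")
    Traceback (most recent call last):
        ...
    ValueError: WEIGHT_KG must be positive, got -3.0
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        # Re-raise with a message that names the offending column.
        raise ValueError(f"{field_name} must be a number, got {text!r}") from None
    if value <= 0:
        raise ValueError(f"{field_name} must be positive, got {value}")
    return value
```

A call to `doctest.testmod()` in the same module would run these three cases along with any other doctests.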

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
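
Without pandas, `csv.DictReader` and `json.loads` cover the parsing. The sketch below reads from in-memory strings so it runs without any files. The sample row and the one-hospital JSON are invented for illustration; only the column names and the JSON shape follow the assignment.

```python
import csv
import io
import json

# Invented sample data in the assignment's layout (not real patient data).
SAMPLE_CSV = (
    "NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,"
    "dyspnea_description,distance_in_meters,hospital\n"
    "Jane Doe,000-00-0000,English,Teacher,1.60,55,48,WALKING UPHILL,320,BJC\n"
)
SAMPLE_JSON = '[{"system": "BJC", "hospitals": [{"name": "BJC", "beds": 2000}]}]'

# DictReader yields one dict per row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
first = rows[0]

# Every CSV value arrives as a string, so numeric fields need converting.
height_m = float(first["HEIGHT_M"])
weight_kg = float(first["WEIGHT_KG"])

# json.loads returns plain lists and dicts.
systems = json.loads(SAMPLE_JSON)
beds = systems[0]["hospitals"][0]["beds"]

print(first["NAME"], height_m, weight_kg, beds)  # Jane Doe 1.6 55.0 2000
```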

In [91]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [92]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate BMI (Body Mass Index).
    :param weight_kg: Weight in kilograms.
    :param height_m: Height in meters.
    :return: BMI value.
    >>> calculate_bmi(70, 1.75)
    22.86
    >>> calculate_bmi(85, 1.8)
    26.23
    >>> calculate_bmi(0, 1.8)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")
    return round(weight_kg / (height_m ** 2), 2)
if __name__ == "__main__":
    doctest.testmod()

### Step 2: Calculate BODE Score

In [93]:
def calculate_bode(bmi, fev_pct, dyspnea_description, distance_in_meters):
    """
    Calculate BODE score based on BMI, FEV1 (% of predicted), dyspnea scale, and 6-minute walk distance.
    :param bmi: BMI value.
    :param fev_pct: FEV1 % of predicted.
    :param dyspnea_description: Dyspnea description.
    :param distance_in_meters: 6-minute walk distance in meters.
    :return: BODE score.
    >>> calculate_bode(22.86, 50, "SLOWER THAN PEERS", 300)
    3
    >>> calculate_bode(26.23, 80, "WHEN HURRYING", 400)
    0
    >>> calculate_bode(20.0, 30, "BREATHLESS WHEN DRESSING", 100)
    10
    """
    bode_score = 0
    # BMI
    if bmi <= 21:
        bode_score += 1
    # FEV1 %
    if fev_pct <= 35:  # the BODE index awards 3 points at 35% or below
        bode_score += 3
    elif fev_pct < 50:
        bode_score += 2
    elif fev_pct < 65:
        bode_score += 1
    # Dyspnea scale
    dyspnea_map = {
        "ONLY STRENUOUS EXERCISE": 0,
        "WHEN HURRYING": 0,
        "WALKING UPHILL": 0,
        "SLOWER THAN PEERS": 1,
        "STOPS WHEN WALKING AT PACE": 1,
        "STOPS AFTER A FEW MINUTES": 2,
        "STOPS AFTER 100 YARDS": 2,
        "UNABLE TO LEAVE HOME": 3,
        "BREATHLESS WHEN DRESSING": 3
    }
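    # Reject unknown descriptions instead of silently scoring them as 0.
    dyspnea_key = dyspnea_description.strip().upper()
    if dyspnea_key not in dyspnea_map:
        raise ValueError(f"Unknown dyspnea description: {dyspnea_description!r}")
    bode_score += dyspnea_map[dyspnea_key]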
    # 6-minute walk distance
    if distance_in_meters <= 149:  # the BODE index awards 3 points at 149 m or below
        bode_score += 3
    elif distance_in_meters < 250:
        bode_score += 2
    elif distance_in_meters < 350:
        bode_score += 1
    return bode_score
if __name__ == "__main__":
    doctest.testmod()

### Step 3: Calculate BODE Risk

In [94]:
def calculate_bode_risk(bode_score):
    """
    Calculate BODE risk based on BODE score.
    :param bode_score: BODE score.
    :return: BODE risk category.
    >>> calculate_bode_risk(2)
    '80% (Low)'
    >>> calculate_bode_risk(4)
    '67% (Moderate)'
    >>> calculate_bode_risk(6)
    '57% (Moderate)'
    >>> calculate_bode_risk(8)
    '18% (High)'
    """
    if 0 <= bode_score <= 2:
        return "80% (Low)"
    elif 3 <= bode_score <= 4:
        return "67% (Moderate)"
    elif 5 <= bode_score <= 6:
        return "57% (Moderate)"
    elif 7 <= bode_score <= 10:
        return "18% (High)"
    else:
        raise ValueError("BODE score must be between 0 and 10.")
if __name__ == "__main__":
    doctest.testmod()
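
Averaging survival per hospital needs the percentage as a number rather than as a label. One possible approach, sketched below, is a small lookup that returns the integer directly. The name `bode_survival_pct` and the `SURVIVAL_BANDS` table are hypothetical; the four bands are the same ones `calculate_bode_risk` uses above.

```python
# Hypothetical lookup: (lowest score, highest score, 4-year survival %).
SURVIVAL_BANDS = [(0, 2, 80), (3, 4, 67), (5, 6, 57), (7, 10, 18)]


def bode_survival_pct(bode_score):
    """Return the 4-year survival percentage for a BODE score of 0-10."""
    for low, high, pct in SURVIVAL_BANDS:
        if low <= bode_score <= high:
            return pct
    # Anything outside the four bands is treated as bad input.
    raise ValueError(f"BODE score must be between 0 and 10, got {bode_score}")


print([bode_survival_pct(s) for s in (0, 4, 6, 10)])  # [80, 67, 57, 18]
```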

### Step 4: Load Hospital Data

In [95]:
def load_hospital_data(hospital_json):
    """
    Load hospital data from a JSON file.
    :param hospital_json: JSON file path for hospital data.
    :return: List of hospital systems, each with a 'hospitals' list of name/beds entries.
    >>> load_hospital_data('hospitals.json')
    [{'system': 'BJC', 'hospitals': [{'name': 'BJC', 'beds': 2000}, {'name': 'BJC WEST COUNTY', 'beds': 1000}, {'name': 'MISSOURI BAPTIST', 'beds': 800}]}, {'system': 'SSM', 'hospitals': [{'name': 'SAINT LOUIS UNIVERSITY', 'beds': 1000}, {'name': "ST.MARY'S", 'beds': 500}]}, {'system': "ST.LUKE'S", 'hospitals': [{'name': "ST.LUKE'S", 'beds': 800}]}]
    """
    with open(hospital_json, 'r') as file:
        hospital_data = json.load(file)
    return hospital_data
if __name__ == "__main__":
    doctest.testmod()
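
The loaded data is nested (systems, then hospitals), so it helps to flatten it into a name-to-beds lookup before processing patients. A minimal sketch, using a trimmed copy of the structure shown in the doctest above; `beds_by_hospital` is an illustrative name:

```python
# Trimmed copy of the nested structure returned by load_hospital_data.
hospital_data = [
    {"system": "BJC", "hospitals": [{"name": "BJC", "beds": 2000},
                                     {"name": "MISSOURI BAPTIST", "beds": 800}]},
    {"system": "SSM", "hospitals": [{"name": "ST.MARY'S", "beds": 500}]},
]

# Walk every system, then every hospital inside it.
beds_by_hospital = {
    hospital["name"]: hospital["beds"]
    for system in hospital_data
    for hospital in system["hospitals"]
}

print(beds_by_hospital)
# {'BJC': 2000, 'MISSOURI BAPTIST': 800, "ST.MARY'S": 500}
```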

### Step 5: Main business logic

Call the BODE score and BODE risk functions for each patient.

For each hospital, count the COPD cases and calculate the average BODE score and average BODE risk.
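
The per-hospital arithmetic can be sketched on a few invented results before wiring it to the real files. The tuples, bed counts, and key names below are made up for illustration and are not taken from the assignment data.

```python
# Invented per-patient results: (hospital, BODE score, survival %).
results = [("BJC", 3, 67), ("BJC", 7, 18), ("ST.MARY'S", 1, 80)]
beds = {"BJC": 2000, "ST.MARY'S": 500}

# Accumulate running totals per hospital.
totals = {}
for hospital, score, survival in results:
    entry = totals.setdefault(hospital, {"count": 0, "score": 0, "survival": 0})
    entry["count"] += 1
    entry["score"] += score
    entry["survival"] += survival

# Turn the totals into averages and a cases-per-bed percentage.
summary = {
    name: {
        "copd_count": t["count"],
        "pct_over_beds": t["count"] / beds[name] * 100,
        "avg_score": t["score"] / t["count"],
        "avg_survival": t["survival"] / t["count"],
    }
    for name, t in totals.items()
}

print(summary["BJC"]["avg_score"], summary["BJC"]["avg_survival"])  # 5.0 42.5
```

Because only hospitals with at least one patient appear in `totals`, this sketch never divides by zero; hospitals with no cases would need a separate default row.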

In [96]:
patient_csv = "patient.csv"
hospital_json = "hospitals.json"
patient_output_file = "patient_output.csv"
hospital_output_file = "hospital_output.csv"
def process_patient_data(patient_csv, hospital_json, patient_output_file, hospital_output_file):
    """
    Process patient data and generate required outputs.
    :param patient_csv: Input file path for patient data.
    :param hospital_json: JSON file path for hospital data.
    :param patient_output_file: Output file path for patient data.
    :param hospital_output_file: Output file path for hospital data.
    """
    hospital_data = load_hospital_data(hospital_json)
    hospitals = {hospital['name']: {'copd_count': 0, 'total_score': 0, 'total_survival': 0, 'beds': hospital['beds']}
                 for system in hospital_data for hospital in system['hospitals']}
    patient_results = []
    with open(patient_csv, 'r') as infile:
        reader = csv.DictReader(infile)
        for row in reader:
            name = row['NAME']
            weight_kg = float(row['WEIGHT_KG'])
            height_m = float(row['HEIGHT_M'])
            fev_pct = float(row['fev_pct'])
            dyspnea_description = row['dyspnea_description']
            distance_in_meters = float(row['distance_in_meters'])
            hospital = row['hospital']

            bmi = calculate_bmi(weight_kg, height_m)
            bode_score = calculate_bode(bmi, fev_pct, dyspnea_description, distance_in_meters)
            bode_risk = calculate_bode_risk(bode_score)

            patient_results.append([name, bode_score, bode_risk, hospital])

            if hospital in hospitals:
                hospitals[hospital]['copd_count'] += 1
                hospitals[hospital]['total_score'] += bode_score
                # The risk label starts with the survival percentage, e.g. "67% (Moderate)".
                hospitals[hospital]['total_survival'] += int(bode_risk.split('%')[0])

    hospital_output_list = []
    for hospital, data in hospitals.items():
        copd_count = data['copd_count']
        if copd_count > 0:
            avg_score = data['total_score'] / copd_count
            avg_survival = data['total_survival'] / copd_count
            pct_of_copd_cases_over_beds = (copd_count / data['beds']) * 100
        else:
            avg_score = 0
            avg_survival = 0
            pct_of_copd_cases_over_beds = 0
        hospital_output_list.append([hospital, copd_count, pct_of_copd_cases_over_beds, avg_score, avg_survival])

    # Write Patient_output.csv
    with open(patient_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
        writer.writerows(patient_results)

    # Write Hospital_output.csv
    with open(hospital_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['HOSPITAL_NAME', 'COPD_COUNT', 'PCT_OF_COPD_CASES_OVER_BEDS', 'AVG_SCORE', 'AVG_RISK'])
        writer.writerows(hospital_output_list)

# Call the function to process patient data
if __name__ == "__main__":
    process_patient_data(patient_csv, hospital_json, patient_output_file, hospital_output_file)

if __name__ == "__main__":
    doctest.testmod()