
### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to run an analysis using a newer metric, BODE. BODE improves on an earlier metric and is intended to give insight into survival risk.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to implement the BODE calculation and use it to compute BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and average BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital
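
A minimal sketch of reading a file with this header using `csv.DictReader`. The sample row is made up for illustration, the helper name `read_patients` is hypothetical, and `io.StringIO` stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical one-row sample that uses the header described above.
SAMPLE = (
    "NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,"
    "dyspnea_description,distance_in_meters,hospital\n"
    "Jane Doe,000-00-0000,English,Teacher,1.70,65.0,55,"
    "STOPS AFTER A FEW MINUTES,310,EXAMPLE HOSPITAL\n"
)

NUMERIC_COLUMNS = ("HEIGHT_M", "WEIGHT_KG", "fev_pct", "distance_in_meters")


def read_patients(handle):
    """Return each CSV row as a dict, with numeric columns converted to float."""
    rows = []
    for row in csv.DictReader(handle):
        for column in NUMERIC_COLUMNS:
            # float() raises ValueError on malformed input, so bad data
            # surfaces immediately instead of being silently accepted.
            row[column] = float(row[column])
        rows.append(row)
    return rows


patients = read_patients(io.StringIO(SAMPLE))
print(patients[0]["NAME"], patients[0]["HEIGHT_M"])
```

With a real file, the same function could be called inside `with open(path, newline='') as handle:`.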

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK
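
A minimal sketch of writing the hospital file with `csv.DictWriter`, which keeps the header and the row values aligned by name. The row values are invented, the helper name `write_hospital_rows` is hypothetical, and an in-memory buffer replaces the output file.

```python
import csv
import io

HOSPITAL_COLUMNS = [
    "HOSPITAL_NAME",
    "COPD_COUNT",
    "PCT_OF_COPD_CASES_OVER_BEDS",
    "AVG_SCORE",
    "AVG_RISK",
]


def write_hospital_rows(handle, rows):
    """Write the header followed by one line per hospital dict."""
    writer = csv.DictWriter(handle, fieldnames=HOSPITAL_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical row; real values would come from the per-hospital aggregation.
example_rows = [
    {
        "HOSPITAL_NAME": "EXAMPLE HOSPITAL",
        "COPD_COUNT": 2,
        "PCT_OF_COPD_CASES_OVER_BEDS": 0.5,
        "AVG_SCORE": 4.5,
        "AVG_RISK": 62.0,
    }
]

buffer = io.StringIO()
write_hospital_rows(buffer, example_rows)
output_text = buffer.getvalue()
print(output_text)
```

`DictWriter` raises `ValueError` if a row contains a key that is not in `fieldnames`, which helps catch misspelled column names early.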

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a `ValueError`.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
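
One possible way to meet the testing requirement with `doctest` is sketched below. The helper `parse_positive` is hypothetical and only illustrates the pattern: a documented function, examples for valid input, and examples that expect a `ValueError`. The finder and runner classes are used here so the pass/fail tally can be inspected; in a notebook, `doctest.run_docstring_examples` would also work.

```python
import doctest


def parse_positive(text, field):
    """Convert a CSV field to a positive float.

    >>> parse_positive("1.75", "HEIGHT_M")
    1.75
    >>> parse_positive("-3", "WEIGHT_KG")
    Traceback (most recent call last):
        ...
    ValueError: WEIGHT_KG must be positive, got -3.0
    >>> parse_positive("tall", "HEIGHT_M")
    Traceback (most recent call last):
        ...
    ValueError: HEIGHT_M is not a number: 'tall'
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        raise ValueError(f"{field} is not a number: {text!r}") from None
    if value <= 0:
        raise ValueError(f"{field} must be positive, got {value}")
    return value


# Collect and run only this function's examples, then report the tally.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
namespace = {"parse_positive": parse_positive}
for test in finder.find(parse_positive, "parse_positive", globs=namespace):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results)
```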

In [27]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [28]:
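###Begin solution
def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index (BMI).

    Parameters:
    weight_kg (float): Weight in kilograms.
    height_m (float): Height in meters.

    Returns:
    float: The BMI value in kg/m^2.

    Raises:
    ValueError: If weight or height is not positive.

    Examples:
    >>> calculate_bmi(80, 2.0)
    20.0
    >>> round(calculate_bmi(70, 1.75), 2)
    22.86
    >>> calculate_bmi(70, 0)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")
    return weight_kg / (height_m ** 2)
###End solution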


### Step 2: Calculate BODE Score

In [29]:
###Begin solution
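def dyspnea_score(description):
    """
    Assign a score based on dyspnea description.

    Parameters:
    description (str): Description of dyspnea.

    Returns:
    int: The corresponding dyspnea score (0-3).

    Examples:
    >>> dyspnea_score("Unable to leave the house")
    3
    >>> dyspnea_score("Stops for breath after 100 yards")
    2
    >>> dyspnea_score("Only when hurrying uphill")
    1
    >>> dyspnea_score("No complaints")
    0
    """
    # Keyword matching is case-insensitive; checks run from most to least severe.
    description = description.lower()
    if "unbearable" in description or "unable to" in description:
        return 3
    elif "breathless" in description or "stops" in description:
        return 2
    elif "walking" in description or "hurrying" in description:
        return 1
    else:
        return 0


def calculate_bode_score(fev_pct, dyspnea_description, distance_in_meters, height_m, weight_kg):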
    """
    Calculate the BODE score based on input parameters.

    Parameters:
    fev_pct (float): FEV1 percentage.
    dyspnea_description (str): Description of dyspnea.
    distance_in_meters (float): Distance walked in meters.
    height_m (float): Height in meters.
    weight_kg (float): Weight in kilograms.

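    Returns:
    int: The total BODE score.

    Raises:
    ValueError: If any parameter is invalid.

    Examples:
    >>> calculate_bode_score(70, "no complaints", 400, 1.75, 60)
    0
    >>> calculate_bode_score(30, "Stops for breath after 100 yards", 150, 1.75, 70)
    7
    >>> calculate_bode_score(120, "no complaints", 400, 1.75, 60)
    Traceback (most recent call last):
        ...
    ValueError: FEV1 percentage must be between 0 and 100.
    """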
    if fev_pct < 0 or fev_pct > 100:
        raise ValueError("FEV1 percentage must be between 0 and 100.")
    if distance_in_meters < 0:
        raise ValueError("Distance must be non-negative.")

    bmi = calculate_bmi(weight_kg, height_m)
    bmi_score = 0
    dyspnea_score_value = dyspnea_score(dyspnea_description)
    distance_score = 0

    # BMI scoring: brackets are contiguous, so a BMI between 25 and 26
    # lands in the 21-25 bracket instead of falling through to 3.
    if bmi < 21:
        bmi_score = 0
    elif bmi < 26:
        bmi_score = 1
    elif bmi <= 30:
        bmi_score = 2
    else:
        bmi_score = 3

    # FEV1% scoring
    if fev_pct >= 50:
        fev_score = 0
    elif 35 <= fev_pct < 50:
        fev_score = 1
    elif 25 <= fev_pct < 35:
        fev_score = 2
    else:
        fev_score = 3

    # Distance scoring
    if distance_in_meters >= 300:
        distance_score = 0
    elif 200 <= distance_in_meters < 300:
        distance_score = 1
    elif 100 <= distance_in_meters < 200:
        distance_score = 2
    else:
        distance_score = 3
    return bmi_score + fev_score + dyspnea_score_value + distance_score
###End solution
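
The cut-offs coded above can be checked against the published BODE index, as summarized on the MDCalc page linked earlier. The sketch below encodes those published point values as a reference. It assumes the dyspnea description has already been mapped to an mMRC grade from 0 to 4, and the function name `bode_points` is hypothetical.

```python
def bode_points(bmi, fev1_pct, mmrc_grade, walk_m):
    """Return the BODE index (0-10) from its four published components.

    bmi        -- body-mass index in kg/m^2
    fev1_pct   -- FEV1 as a percentage of the predicted value
    mmrc_grade -- mMRC dyspnea grade, 0-4 (assumed to be pre-computed)
    walk_m     -- six-minute walk distance in meters
    """
    if mmrc_grade not in range(5):
        raise ValueError("mMRC grade must be an integer from 0 to 4.")
    if bmi <= 0 or fev1_pct < 0 or walk_m < 0:
        raise ValueError("BMI must be positive; FEV1 and distance non-negative.")

    # B: body-mass index, 1 point at 21 or below.
    bmi_pts = 0 if bmi > 21 else 1

    # O: obstruction. Published bands: >=65, 50-64, 36-49, <=35.
    if fev1_pct >= 65:
        fev_pts = 0
    elif fev1_pct >= 50:
        fev_pts = 1
    elif fev1_pct > 35:
        fev_pts = 2
    else:
        fev_pts = 3

    # D: dyspnea. mMRC 0-1 scores 0, then 2, 3, 4 score 1, 2, 3.
    dyspnea_pts = max(mmrc_grade - 1, 0)

    # E: exercise capacity. Published bands: >=350, 250-349, 150-249, <=149.
    if walk_m >= 350:
        walk_pts = 0
    elif walk_m >= 250:
        walk_pts = 1
    elif walk_m >= 150:
        walk_pts = 2
    else:
        walk_pts = 3

    return bmi_pts + fev_pts + dyspnea_pts + walk_pts


print(bode_points(bmi=20, fev1_pct=40, mmrc_grade=3, walk_m=200))
```

The published bands are stated for whole numbers, so how a fractional value such as an FEV1 of 35.5 is bucketed is a choice made in this sketch, not part of the definition.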


### Step 3: Calculate BODE Risk

In [30]:
###Begin solution
def calculate_bode_risk(bode_score):
    """
    Determine the risk level based on the BODE score.

    Parameters:
    bode_score (int): The BODE score.

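    Returns:
    str: The corresponding risk level.

    Examples:
    >>> calculate_bode_risk(2)
    'Low Risk'
    >>> calculate_bode_risk(5)
    'Moderate Risk'
    >>> calculate_bode_risk(6)
    'High Risk'
    """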
    if bode_score <= 2:
        return 'Low Risk'
    elif bode_score <= 5:
        return 'Moderate Risk'
    else:
        return 'High Risk'
###End solution
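
The assignment asks for survival rates, and an average risk per hospital needs a number rather than a label. A minimal sketch of a numeric alternative follows, assuming the approximate 4-year survival figures listed on the MDCalc page linked earlier. The function name `bode_survival_pct` is hypothetical.

```python
def bode_survival_pct(bode_score):
    """Map a BODE score (0-10) to an approximate 4-year survival percentage."""
    if bode_score not in range(11):
        raise ValueError("BODE score must be an integer from 0 to 10.")
    if bode_score <= 2:
        return 80
    if bode_score <= 4:
        return 67
    if bode_score <= 6:
        return 57
    return 18


for score in (0, 3, 6, 9):
    print(score, bode_survival_pct(score))
```

Because the return value is numeric, it can be summed and averaged per hospital in the same way as the score.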

### Step 4: Load Hospital Data

In [31]:
###Begin solution
def load_hospital_data(file_path):
    """
    Load hospital data from a JSON file.

    Parameters:
    file_path (str): Path to the JSON file.

    Returns:
    dict: Parsed hospital data.
    """
    with open(file_path, 'r') as f:
        return json.load(f)
###End solution
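
The hospital file layout is not shown in this notebook. Judging only from how Step 5 reads it, it appears to be a list of hospital systems, each with a `hospitals` list whose entries carry `name` and `beds`. A minimal sketch under that assumption, with invented sample data and a hypothetical helper `beds_by_hospital`:

```python
import json

# Hypothetical sample; only the keys used by the Step 5 code are included.
SAMPLE_JSON = """
[
  {"hospitals": [{"name": "EXAMPLE NORTH", "beds": 200},
                 {"name": "EXAMPLE SOUTH", "beds": 150}]},
  {"hospitals": [{"name": "EXAMPLE WEST", "beds": 80}]}
]
"""


def beds_by_hospital(systems):
    """Flatten the nested system/hospital structure into {name: beds}."""
    beds = {}
    for system in systems:
        for hospital in system["hospitals"]:
            count = hospital["beds"]
            # A bed count of zero or less would break the percentage
            # calculation later, so reject it here.
            if count <= 0:
                raise ValueError(f"Invalid bed count for {hospital['name']}: {count}")
            beds[hospital["name"]] = count
    return beds


lookup = beds_by_hospital(json.loads(SAMPLE_JSON))
print(lookup)
```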

### Step 5: Main business logic

Call the BODE score and BODE risk functions for each patient.

For each hospital, count the number of cases and calculate the average BODE score and average BODE risk.
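
The per-hospital aggregation described above can be sketched as follows. It assumes each patient has already been reduced to a `(hospital, score, risk)` tuple with a numeric risk value such as a survival percentage. The function name `summarize_by_hospital` and the sample data are hypothetical.

```python
def summarize_by_hospital(patients, beds):
    """Aggregate (hospital, score, risk) tuples into one row per hospital.

    patients -- iterable of (hospital_name, bode_score, numeric_risk)
    beds     -- dict mapping hospital_name to its bed count
    """
    totals = {}
    for hospital, score, risk in patients:
        if hospital not in beds:
            raise ValueError(f"Unknown hospital: {hospital}")
        entry = totals.setdefault(hospital, {"count": 0, "score": 0, "risk": 0})
        entry["count"] += 1
        entry["score"] += score
        entry["risk"] += risk

    rows = []
    for hospital in sorted(totals):
        entry = totals[hospital]
        count = entry["count"]
        rows.append({
            "HOSPITAL_NAME": hospital,
            "COPD_COUNT": count,
            "PCT_OF_COPD_CASES_OVER_BEDS": round(100 * count / beds[hospital], 2),
            "AVG_SCORE": round(entry["score"] / count, 2),
            "AVG_RISK": round(entry["risk"] / count, 2),
        })
    return rows


sample_patients = [
    ("EXAMPLE NORTH", 4, 67),
    ("EXAMPLE NORTH", 6, 57),
    ("EXAMPLE SOUTH", 1, 80),
]
sample_beds = {"EXAMPLE NORTH": 200, "EXAMPLE SOUTH": 150}
rows = summarize_by_hospital(sample_patients, sample_beds)
for row in rows:
    print(row)
```

Hospitals with no patients never enter `totals`, so the division by `count` cannot hit zero.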

In [32]:
patient_csv = "patient.csv"
hospital_json = "hospitals.json"
patient_output_file = "patient_output.csv"
hospital_output_file = "hospital_output.csv"
###Begin solution
def process_patients(patient_csv, hospital_json, patient_output_file, hospital_output_file):
    """
    Process patient data to calculate BODE scores and generate outputs.

    Parameters:
    patient_csv (str): Path to the patient CSV file.
    hospital_json (str): Path to the hospital JSON file.
    patient_output_file (str): Output CSV file for patient results.
    hospital_output_file (str): Output CSV file for hospital statistics.
    """
    # Load hospital data and initialize hospital statistics
    hospitals_data = load_hospital_data(hospital_json)
    hospital_stats = {h['name']: {'count': 0, 'total_score': 0, 'beds': h['beds']}
                      for system in hospitals_data for h in system['hospitals']}

    patient_results = []
    hospital_output_list = []

    with open(patient_csv, 'r', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            try:
                height = float(row['HEIGHT_M'])
                weight = float(row['WEIGHT_KG'])
                fev_pct = float(row['fev_pct'])
                dyspnea_description = row['dyspnea_description'].strip()
                distance = float(row['distance_in_meters'])
                hospital_name = row['hospital'].strip()

                # Calculate BODE score and risk
                bode_score = calculate_bode_score(fev_pct, dyspnea_description, distance, height, weight)
                bode_risk = calculate_bode_risk(bode_score)

                patient_results.append([row['NAME'], bode_score, bode_risk, hospital_name])

                # Update hospital statistics
                if hospital_name in hospital_stats:
                    hospital_stats[hospital_name]['count'] += 1
                    hospital_stats[hospital_name]['total_score'] += bode_score

            except ValueError as e:
                print(f"Error processing {row['NAME']}: {e}")

    # Write patient output to CSV
    with open(patient_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
        writer.writerows(patient_results)

    # Calculate hospital statistics
    for hospital_name, stats in hospital_stats.items():
        if stats['count'] > 0:
            avg_score = stats['total_score'] / stats['count']
            pct_of_copd_cases = (stats['count'] / stats['beds']) * 100
            hospital_output_list.append([hospital_name, stats['count'], pct_of_copd_cases, avg_score])

    # Write hospital output to CSV
    with open(hospital_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['HOSPITAL_NAME', 'COPD_COUNT', 'PCT_OF_COPD_CASES_OVER_BEDS', 'AVG_SCORE'])
        writer.writerows(hospital_output_list)

    # Print results to console
    print("\nPatient Results:")
    for result in patient_results:
        print(result)

    print("\nHospital Statistics:")
    for hospital in hospital_output_list:
        print(hospital)
###End solution
process_patients(patient_csv, hospital_json, patient_output_file, hospital_output_file)



Patient Results:
['Vanessa Roberts', 5, 'Moderate Risk', "ST.LUKE'S"]
['Christopher Fox', 6, 'High Risk', 'SAINT LOUIS UNIVERSITY']
['Benjamin Johnston', 6, 'High Risk', 'BJC']
['Christopher Hernandez', 4, 'Moderate Risk', 'MISSOURI BAPTIST']
['Valerie Burch', 3, 'Moderate Risk', 'BJC WEST COUNTY']
['Heather Hart', 4, 'Moderate Risk', 'SAINT LOUIS UNIVERSITY']
['Ronald Cobb', 7, 'High Risk', "ST.MARY'S"]
['Austin French', 8, 'High Risk', 'SAINT LOUIS UNIVERSITY']
['Mary Leonard', 7, 'High Risk', 'BJC']
['Mrs. Nicole Smith', 7, 'High Risk', "ST.MARY'S"]
['Ashley Warren', 8, 'High Risk', 'BJC']
['Jeffrey Jacobson', 6, 'High Risk', 'BJC WEST COUNTY']
['Angela Bauer', 7, 'High Risk', 'BJC WEST COUNTY']
['Jerry Rogers', 6, 'High Risk', 'BJC']
['Lisa Beck', 5, 'Moderate Risk', 'BJC']
['Bryan Pena', 7, 'High Risk', 'SAINT LOUIS UNIVERSITY']
['Jessica Henderson', 5, 'Moderate Risk', 'SAINT LOUIS UNIVERSITY']
['Daniel Mitchell', 5, 'Moderate Risk', 'MISSOURI BAPTIST']
['Melanie Graham', 6, 'Hi