
### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE improves on a previous metric and promises insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients, then evaluate the average BODE score and average BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital
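
As a minimal sketch of how a file with these columns can be read without pandas, the snippet below parses one invented sample row from an in-memory string using `csv.DictReader`. The row values and the helper name `read_patients` are illustrative assumptions, not taken from the real patient file.

```python
import csv
import io

# One made-up row, for illustration only; the real patient.csv is not shown here.
SAMPLE_CSV = (
    "NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,"
    "dyspnea_description,distance_in_meters,hospital\n"
    "Jane Doe,000-00-0000,English,Teacher,1.70,65.0,55,"
    "example description,300,Example Hospital\n"
)

def read_patients(handle):
    """Return the rows of a patient CSV as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))

rows = read_patients(io.StringIO(SAMPLE_CSV))
first = rows[0]
# Every value arrives as a string, so numeric columns need an explicit conversion.
height_m = float(first["HEIGHT_M"])
weight_kg = float(first["WEIGHT_KG"])
print(first["NAME"], height_m, weight_kg, first["hospital"])
```

Because `float()` raises `ValueError` on malformed text, the conversion step is also a natural place to reject bad input rows.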

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a ValueError.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
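
The documentation, testing, and validation requirements can be combined in one pattern: a docstring that holds both normal cases and failing cases, with the expected `ValueError` written as a traceback. The sketch below uses a hypothetical helper, `parse_positive_float`, that is not part of the assignment; it only illustrates the pattern.

```python
import doctest

def parse_positive_float(text, field_name):
    """
    Convert a CSV field to a positive float (hypothetical helper).

    >>> parse_positive_float("1.75", "HEIGHT_M")
    1.75
    >>> parse_positive_float("-3", "WEIGHT_KG")
    Traceback (most recent call last):
        ...
    ValueError: WEIGHT_KG must be a positive number, got '-3'
    >>> parse_positive_float("tall", "HEIGHT_M")
    Traceback (most recent call last):
        ...
    ValueError: HEIGHT_M must be a positive number, got 'tall'
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        value = None
    # "not value > 0" also rejects NaN, which compares false to everything.
    if value is None or not value > 0:
        raise ValueError(f"{field_name} must be a positive number, got {text!r}")
    return value

# Run only this function's docstring examples and collect the results.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    parse_positive_float.__doc__,
    {"parse_positive_float": parse_positive_float},
    "parse_positive_float",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

In a notebook, `doctest.testmod()` achieves the same for every function defined so far; running a single docstring, as here, keeps the output focused on one function.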

In [22]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [23]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index (BMI).

    Parameters:
    weight_kg (float): Weight in kilograms.
    height_m (float): Height in meters.

    Returns:
    float: The BMI value.

    Raises:
    ValueError: If weight or height is not a positive number.

    Example:
    >>> calculate_bmi(70, 1.75)
    22.86

    >>> calculate_bmi(90, 1.80)
    27.78

    >>> calculate_bmi(0, 1.75)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.

    >>> calculate_bmi(70, 0)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")

    return round(weight_kg / (height_m ** 2), 2)

if __name__ == "__main__":
    import doctest
    doctest.testmod()


### Step 2: Calculate BODE Score
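
In the published BODE index, the dyspnea component is scored from the modified Medical Research Council (mMRC) dyspnea grade: grades 0 and 1 earn 0 points, and grades 2, 3, and 4 earn 1, 2, and 3 points. A minimal sketch of that lookup follows. The function name `mmrc_to_bode_points` is hypothetical, and the sketch assumes the description strings in the patient file have already been translated to an integer grade.

```python
# BODE dyspnea points indexed by mMRC grade (0-4).
MMRC_POINTS = (0, 0, 1, 2, 3)

def mmrc_to_bode_points(grade):
    """Return BODE dyspnea points for an integer mMRC grade from 0 to 4."""
    if isinstance(grade, bool) or not isinstance(grade, int) or not 0 <= grade <= 4:
        raise ValueError(f"mMRC grade must be an integer from 0 to 4, got {grade!r}")
    return MMRC_POINTS[grade]

points = [mmrc_to_bode_points(g) for g in range(5)]
print(points)
```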

In [24]:
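def dyspnea_score(dyspnea_level):
    """
    Convert a dyspnea description into dyspnea points (0-3).

    The descriptions matched below must agree with the wording used in
    the patient file; any other description gets the most severe score.

    >>> dyspnea_score("no problem walking")
    0
    >>> dyspnea_score("unable to walk")
    2
    """
    if dyspnea_level == "no problem walking":
        return 0
    elif dyspnea_level == "breathless while walking":
        return 1
    elif dyspnea_level == "unable to walk":
        return 2
    else:
        return 3  # Unrecognized descriptions are scored as the most severe level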

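def calculate_bode_score(fev_pct, dyspnea_level, distance_m, height, weight):
    """
    Calculate the BODE score (0-10) from its four components.

    Parameters:
    fev_pct (float): FEV1 as a percentage of the predicted value.
    dyspnea_level (str): Dyspnea description, scored by dyspnea_score().
    distance_m (float): Six-minute walk distance in meters.
    height (float): Height in meters.
    weight (float): Weight in kilograms.

    Raises:
    ValueError: If fev_pct or distance_m is negative, or if height or
    weight is not positive.

    Example:
    >>> calculate_bode_score(70, "no problem walking", 400, 1.75, 70)
    0
    >>> calculate_bode_score(30, "unable to walk", 100, 1.80, 60)
    9
    """
    if fev_pct < 0 or distance_m < 0:
        raise ValueError("FEV1 percentage and walk distance cannot be negative.")

    # B: body mass index (1 point when BMI is 21 or lower)
    b_score = 0 if calculate_bmi(weight, height) > 21 else 1

    # O: airflow obstruction, FEV1 as % of predicted
    if fev_pct >= 65:
        f_score = 0
    elif fev_pct >= 50:
        f_score = 1
    elif fev_pct >= 36:
        f_score = 2
    else:
        f_score = 3

    # D: dyspnea
    d_score = dyspnea_score(dyspnea_level)

    # E: exercise capacity, six-minute walk distance in meters
    if distance_m >= 350:
        e_score = 0
    elif distance_m >= 250:
        e_score = 1
    elif distance_m >= 150:
        e_score = 2
    else:
        e_score = 3

    # Total BODE score
    return b_score + f_score + d_score + e_score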


### Step 3: Calculate BODE Risk
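
The assignment asks for survival rates, and a numeric rate is also what a per-hospital average requires. As a sketch, the function below maps a score to the approximate 4-year survival percentages commonly quoted for the BODE index (80% for 0-2, 67% for 3-4, 57% for 5-6, 18% for 7-10). These figures are an assumption here and should be checked against the MDCalc page linked above before use; the name `bode_survival_pct` is hypothetical.

```python
def bode_survival_pct(bode_score):
    """Return the approximate 4-year survival (%) for a BODE score of 0-10."""
    if isinstance(bode_score, bool) or not isinstance(bode_score, int):
        raise ValueError(f"BODE score must be an integer, got {bode_score!r}")
    if not 0 <= bode_score <= 10:
        raise ValueError(f"BODE score must be between 0 and 10, got {bode_score}")
    # (highest score in the band, survival %) pairs; assumed values, see the text above.
    bands = ((2, 80), (4, 67), (6, 57), (10, 18))
    for upper, survival in bands:
        if bode_score <= upper:
            return survival

survival_by_score = {score: bode_survival_pct(score) for score in range(11)}
print(survival_by_score)
```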

In [25]:
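def calculate_bode_risk(bode_score):
    """
    Determine the risk level based on the BODE score.

    Parameters:
    bode_score (int): The BODE score, from 0 to 10.

    Returns:
    str: The corresponding risk level.

    Raises:
    ValueError: If the score is outside the 0-10 range.

    Example:
    >>> calculate_bode_risk(2)
    'Low Risk'
    >>> calculate_bode_risk(4)
    'Moderate Risk'
    >>> calculate_bode_risk(8)
    'High Risk'
    >>> calculate_bode_risk(11)
    Traceback (most recent call last):
        ...
    ValueError: BODE score must be between 0 and 10.
    """
    if not 0 <= bode_score <= 10:
        raise ValueError("BODE score must be between 0 and 10.")

    if bode_score <= 2:
        return 'Low Risk'
    elif bode_score <= 5:
        return 'Moderate Risk'
    else:
        return 'High Risk'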


### Step 4: Load Hospital Data
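
The processing code in Step 5 iterates over a list of hospital systems, each holding a `hospitals` list whose entries have `name` and `beds` keys. Assuming that layout, the sketch below parses an invented sample with `json.loads` and flattens it into a name-to-beds lookup. The sample values and the helper name `beds_by_hospital` are illustrative, and any other keys in the real file are unknown here.

```python
import json

# Invented sample in the assumed layout; the real hospitals.json is not shown here.
SAMPLE_JSON = """
[
  {"hospitals": [{"name": "Example North", "beds": 200},
                 {"name": "Example South", "beds": 50}]},
  {"hospitals": [{"name": "Example East", "beds": 120}]}
]
"""

def beds_by_hospital(systems):
    """Flatten a list of hospital systems into a {hospital name: beds} dict."""
    lookup = {}
    for system in systems:
        for hospital in system["hospitals"]:
            beds = hospital["beds"]
            # A zero or negative bed count would break the cases-over-beds ratio.
            if isinstance(beds, bool) or not isinstance(beds, int) or beds <= 0:
                raise ValueError(f"Invalid bed count for {hospital['name']!r}: {beds!r}")
            lookup[hospital["name"]] = beds
    return lookup

lookup = beds_by_hospital(json.loads(SAMPLE_JSON))
print(lookup)
```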

In [26]:
def load_hospital_data(file_path):
    """
    Load hospital data from a JSON file.

    Parameters:
    file_path (str): Path to the JSON file.

    Returns:
    list: Parsed hospital data, one entry per hospital system. Each entry
    has a 'hospitals' list whose items carry 'name' and 'beds'.

    Raises:
    FileNotFoundError: If file_path does not exist.
    """
    with open(file_path, 'r') as f:
        return json.load(f)


### Step 5: Main business logic

Call the BODE score and BODE risk functions for each patient.

For each hospital, count the COPD cases and calculate the average BODE score and average BODE risk.
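
The per-hospital figures reduce to running totals: a case count, a score total, and a risk total for each hospital, divided out at the end. The sketch below assumes the risk is numeric (for example a survival percentage) so that it can be averaged, and it uses invented records; `summarize_hospitals` is a hypothetical name.

```python
def summarize_hospitals(patients, beds):
    """
    Aggregate patient records into one output row per hospital.

    patients: iterable of (hospital_name, bode_score, numeric_risk) tuples.
    beds: dict mapping hospital name to its bed count.
    Returns rows of [name, count, pct_of_beds, avg_score, avg_risk].
    """
    totals = {}
    for name, score, risk in patients:
        if name not in beds:
            raise ValueError(f"Unknown hospital: {name!r}")
        entry = totals.setdefault(name, {"count": 0, "score": 0, "risk": 0})
        entry["count"] += 1
        entry["score"] += score
        entry["risk"] += risk

    rows = []
    for name in sorted(totals):
        entry = totals[name]
        count = entry["count"]
        rows.append([
            name,
            count,
            round(count / beds[name] * 100, 2),   # cases as a percentage of beds
            round(entry["score"] / count, 2),     # average BODE score
            round(entry["risk"] / count, 2),      # average numeric risk
        ])
    return rows

# Invented records, for illustration only.
sample = [("Example North", 2, 80), ("Example North", 5, 57), ("Example South", 7, 18)]
rows = summarize_hospitals(sample, {"Example North": 200, "Example South": 50})
for row in rows:
    print(row)
```

Raising on an unknown hospital name, rather than skipping the record, keeps misspelled names in the patient file from silently lowering a hospital's count.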

In [27]:
patient_csv = "patient.csv"
hospital_json = "hospitals.json"

patient_output_file = "patient_output.csv"
hospital_output_file = "hospital_output.csv"

###Begin Solution
def process_patients(patient_csv, hospital_json, patient_output_file, hospital_output_file):
    """
    Process patient data, calculate BODE score and risk, and update hospital statistics.

    Parameters:
    patient_csv (str): Path to the patient CSV file.
    hospital_json (str): Path to the hospital JSON file.
    patient_output_file (str): Output file for patient data.
    hospital_output_file (str): Output file for hospital data.

    Example:
    >>> process_patients('patient.csv', 'hospitals.json', 'patient_output.csv', 'hospital_output.csv')
    """

    # Load hospital data with the Step 4 helper
    hospitals = load_hospital_data(hospital_json)

    # Initialize hospital stats
    hospital_stats = {}
    for system in hospitals:
        for hospital in system['hospitals']:
            hospital_stats[hospital['name']] = {'count': 0, 'total_score': 0, 'beds': hospital['beds']}

    patient_results = []

    # Load patient data
    with open(patient_csv, newline='') as csvfile:
        reader = csv.DictReader(csvfile)

        for row in reader:
            try:
                height = float(row['HEIGHT_M'])
                weight = float(row['WEIGHT_KG'])
                fev_pct = float(row['fev_pct'])
                dyspnea_description = row['dyspnea_description']
                distance = float(row['distance_in_meters'])
                hospital_name = row['hospital'].strip()

                # Calculate BODE score and risk
                bode_score = calculate_bode_score(fev_pct, dyspnea_description, distance, height, weight)
                bode_risk = calculate_bode_risk(bode_score)

                # Append results for each patient
                patient_results.append([row['NAME'], bode_score, bode_risk, hospital_name])

                # Update hospital statistics
                if hospital_name in hospital_stats:
                    hospital_stats[hospital_name]['count'] += 1
                    hospital_stats[hospital_name]['total_score'] += bode_score

            except ValueError as e:
                print(f"Error processing {row['NAME']}: {e}")

    # Write patient_output.csv
    with open(patient_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
        writer.writerows(patient_results)

    # Calculate hospital statistics
    hospital_output_list = []
    for hospital_name, stats in hospital_stats.items():
        if stats['count'] > 0:
            avg_score = stats['total_score'] / stats['count']
            pct_of_copd_cases = (stats['count'] / stats['beds']) * 100
            hospital_output_list.append([hospital_name, stats['count'], pct_of_copd_cases, avg_score])

    # Write hospital_output.csv
    with open(hospital_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['HOSPITAL_NAME', 'COPD_COUNT', 'PCT_OF_COPD_CASES_OVER_BEDS', 'AVG_SCORE'])
        writer.writerows(hospital_output_list)

# Run the pipeline with the file names defined at the top of this cell

process_patients(patient_csv, hospital_json, patient_output_file, hospital_output_file)

###End solution
patient_results = []
hospital_output_list = []
