### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE is an improvement on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a ValueError.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
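
The testing requirement above can be met with doctests that cover both a normal result and the error path. The following is a minimal sketch using a made-up `percent` function that is not part of the assignment; it only illustrates the docstring layout and one way to run the examples.

```python
import doctest

def percent(part, whole):
    """
    Return part as a percentage of whole (illustrative helper only).

    >>> percent(1, 4)
    25.0
    >>> percent(1, 0)
    Traceback (most recent call last):
        ...
    ValueError: whole must be greater than zero
    """
    if whole <= 0:
        raise ValueError("whole must be greater than zero")
    return part / whole * 100

# Run only this function's examples; no output means every example passed.
doctest.run_docstring_examples(percent, {"percent": percent}, verbose=False)
```

In a doctest, the `Traceback` header followed by `...` and the final exception line is enough to match a raised exception; the stack frames in between are ignored.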

In [143]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [144]:
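def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index (BMI): weight in kilograms divided by height in meters squared.

    Raises ValueError if weight or height is not greater than zero.

    >>> round(calculate_bmi(70, 1.75), 2)
    22.86
    >>> calculate_bmi(0, 1.75)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be greater than zero.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be greater than zero.")
    return weight_kg / (height_m ** 2)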
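
HEIGHT_M and WEIGHT_KG arrive from the CSV file as strings, so a non-numeric field would fail before the BMI check is reached. One possible way to turn that into a clear ValueError is sketched below; `parse_positive_float` is a hypothetical helper and the sample row is made up.

```python
def parse_positive_float(text, field_name):
    """Convert a CSV field to a positive float or raise ValueError (hypothetical helper)."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        raise ValueError(f"{field_name} is not a number: {text!r}") from None
    # "not value > 0" also rejects NaN, which compares False to everything
    if not value > 0:
        raise ValueError(f"{field_name} must be greater than zero: {text!r}")
    return value

row = {"HEIGHT_M": "1.75", "WEIGHT_KG": "70"}  # made-up sample row
height_m = parse_positive_float(row["HEIGHT_M"], "HEIGHT_M")
weight_kg = parse_positive_float(row["WEIGHT_KG"], "WEIGHT_KG")
bmi = weight_kg / (height_m ** 2)  # same formula as calculate_bmi
print(round(bmi, 2))
```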

### Step 2: Calculate BODE Score

In [145]:
def calculate_bode_score(bmi, fev_pct, dyspnea_description, distance_in_meters):
    """
    Calculate the BODE score (0-10) from BMI, FEV1 percent of predicted,
    dyspnea description, and 6-minute walk distance in meters.

    Raises ValueError if the dyspnea description is not recognized.

    >>> calculate_bode_score(22, 70, 'ONLY STRENUOUS EXERCISE', 400)
    0
    >>> calculate_bode_score(18, 40, 'STOPS WHEN WALKING AT PACE', 200)
    6
    >>> calculate_bode_score(22, 70, 'NOT A DESCRIPTION', 400)
    Traceback (most recent call last):
        ...
    ValueError: Invalid dyspnea description: NOT A DESCRIPTION
    """
    bode_score = 0

    # BMI: 1 point at or below 21, otherwise 0
    if bmi <= 21:
        bode_score += 1

    # FEV1 % of predicted: >= 65 -> 0, 50-64 -> 1, 36-49 -> 2, below 36 -> 3
    if fev_pct >= 65:
        bode_score += 0
    elif fev_pct >= 50:
        bode_score += 1
    elif fev_pct >= 36:
        bode_score += 2
    else:
        bode_score += 3

    # Dyspnea points by description (mMRC grade 0-1 -> 0, 2 -> 1, 3 -> 2, 4 -> 3).
    # ASSUMPTION: only 'ONLY STRENUOUS EXERCISE' and 'STOPS WHEN WALKING AT PACE'
    # appear in this notebook; the other wordings are guesses and must be
    # checked against the values actually present in patient.csv.
    dyspnea_mapping = {
        "ONLY STRENUOUS EXERCISE": 0,
        "WALKING UPHILL": 0,
        "SLOWER THAN PEERS": 1,
        "STOPS WHEN WALKING AT PACE": 1,
        "STOPS AFTER 100 YARDS": 2,
        "STOPS AFTER A FEW MINUTES": 2,
        "UNABLE TO LEAVE HOME": 3,
        "BREATHLESS WHEN DRESSING": 3,
    }
    dyspnea_key = str(dyspnea_description).strip().upper()
    if dyspnea_key not in dyspnea_mapping:
        raise ValueError(f"Invalid dyspnea description: {dyspnea_description}")
    bode_score += dyspnea_mapping[dyspnea_key]

    # 6-minute walk distance: >= 350 m -> 0, 250-349 -> 1, 150-249 -> 2, below 150 -> 3
    if distance_in_meters >= 350:
        bode_score += 0
    elif distance_in_meters >= 250:
        bode_score += 1
    elif distance_in_meters >= 150:
        bode_score += 2
    else:
        bode_score += 3

    return bode_score
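
Band cutoffs such as the FEV1 thresholds above are easy to get wrong by one comparison operator, so it can help to test each cutoff and the value just below it. The sketch below isolates the FEV1 component in a table-driven form; `FEV1_BANDS` and `fev1_points` are hypothetical names used only for illustration, with the same cutoffs as the function above.

```python
# (minimum FEV1 % of predicted, points), checked from the highest band down
FEV1_BANDS = [(65, 0), (50, 1), (36, 2)]

def fev1_points(fev_pct):
    """Return BODE points for FEV1 % of predicted (hypothetical helper)."""
    if fev_pct < 0:
        raise ValueError("fev_pct must not be negative")
    for minimum, points in FEV1_BANDS:
        if fev_pct >= minimum:
            return points
    return 3  # below the lowest band

# Check each cutoff and the value just below it
for value in (65, 64.9, 50, 49.9, 36, 35.9):
    print(value, fev1_points(value))
```

Keeping the cutoffs in one list means a threshold change touches a single line, at the cost of a slightly less direct reading than the `if`/`elif` chain.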

### Step 3: Calculate BODE Risk

In [146]:
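def calculate_bode_risk(bode_score):
    """
    Determine the BODE risk category based on the score.

    :param bode_score: The calculated BODE score (0-10)
    :return: Risk category ('Low risk', 'Moderate risk', or 'High risk')
    :raises ValueError: If the score is outside the range 0-10

    >>> calculate_bode_risk(1)
    'Low risk'
    >>> calculate_bode_risk(3)
    'Moderate risk'
    >>> calculate_bode_risk(5)
    'High risk'
    >>> calculate_bode_risk(11)
    Traceback (most recent call last):
        ...
    ValueError: BODE score must be between 0 and 10.
    """
    if not 0 <= bode_score <= 10:
        raise ValueError("BODE score must be between 0 and 10.")
    if bode_score <= 2:
        return 'Low risk'
    elif bode_score <= 4:
        return 'Moderate risk'
    else:
        return 'High risk'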
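
The assignment text asks for survival rates, while `calculate_bode_risk` returns a category. A numeric lookup could look like the sketch below. It assumes the approximate 4-year survival figures are 80%, 67%, 57%, and 18% for scores 0-2, 3-4, 5-6, and 7-10; these should be verified against the MDCalc page linked at the top before being relied on. `bode_survival_pct` is a hypothetical name.

```python
def bode_survival_pct(bode_score):
    """Approximate 4-year survival (%) for a BODE score (hypothetical helper).

    ASSUMPTION: the percentages below are taken from memory of the MDCalc
    BODE page and should be checked against it.
    """
    if not 0 <= bode_score <= 10:
        raise ValueError("BODE score must be between 0 and 10.")
    if bode_score <= 2:
        return 80
    if bode_score <= 4:
        return 67
    if bode_score <= 6:
        return 57
    return 18

print(bode_survival_pct(3))
```

A numeric result like this can be averaged per hospital directly, which a text category cannot.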

### Step 4: Load Hospital Data

In [147]:
import json

def load_hospital_data(file_path):
    """
    Load hospital data from a JSON file and return the parsed content.
    """
    with open(file_path, 'r') as f:
        return json.load(f)

def initialize_hospital_metrics(data):
    """
    Build per-hospital bed counts and zeroed metric accumulators.

    Expects a list of hospital systems, each with a 'hospitals' list whose
    entries have 'name' and 'beds' keys. Entries without a name are skipped.

    >>> beds, metrics = initialize_hospital_metrics([{'hospitals': [{'name': 'A', 'beds': 10}]}])
    >>> beds
    {'A': 10}
    >>> metrics['A']['copd_count']
    0
    """
    hospital_beds = {}
    hospital_metrics = {}

    for system in data:
        # Each system contains a list of hospitals
        hospitals = system.get('hospitals', [])
        for hospital in hospitals:
            hospital_name = hospital.get('name')
            beds = hospital.get('beds', 0)

            if hospital_name:
                # Store the number of beds in hospital_beds dictionary
                hospital_beds[hospital_name] = beds

                # Initialize hospital metrics
                hospital_metrics[hospital_name] = {
                    'total_bode_score': 0,
                    'total_risk': 0,
                    'copd_count': 0,
                    'beds': beds
                }

    return hospital_beds, hospital_metrics

# Main execution
hospital_json = "hospitals.json"
hospital_data = load_hospital_data(hospital_json)
hospital_beds, hospital_metrics = initialize_hospital_metrics(hospital_data)
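
The loading code above implies a particular layout for hospitals.json: a list of systems, each holding a `hospitals` list with `name` and `beds` keys. The sketch below parses a small made-up document in that assumed layout; the system name, hospital names, and bed counts are invented for illustration.

```python
import json

# Made-up sample in the layout the loader appears to expect
sample_json = """
[
  {"system": "Example Health",
   "hospitals": [{"name": "NORTH", "beds": 120},
                 {"name": "SOUTH", "beds": 80}]}
]
"""

data = json.loads(sample_json)

# Flatten systems -> hospitals into a name -> beds lookup
beds_by_hospital = {
    hospital["name"]: hospital["beds"]
    for system in data
    for hospital in system.get("hospitals", [])
}
print(beds_by_hospital)  # {'NORTH': 120, 'SOUTH': 80}
```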



### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.
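
The per-hospital aggregation described above can be sketched on a few in-memory values before wiring it to the files. The hospital names, scores, and bed counts below are made up, and the dictionary layout is only one possible choice.

```python
from collections import defaultdict

# (hospital, bode_score) pairs and bed counts; all values are made up
scored = [("NORTH", 2), ("NORTH", 6), ("SOUTH", 4)]
beds = {"NORTH": 100, "SOUTH": 50}

# Accumulate a case count and a score total per hospital
totals = defaultdict(lambda: {"count": 0, "score_sum": 0})
for hospital, score in scored:
    totals[hospital]["count"] += 1
    totals[hospital]["score_sum"] += score

# Derive the output columns from the accumulated totals
summary = {}
for hospital, t in totals.items():
    summary[hospital] = {
        "COPD_COUNT": t["count"],
        "PCT_OF_COPD_CASES_OVER_BEDS": t["count"] / beds[hospital] * 100,
        "AVG_SCORE": t["score_sum"] / t["count"],
    }

for hospital, values in summary.items():
    print(hospital, values)
```

Averages are computed from running totals rather than stored lists, so memory use stays constant per hospital regardless of the number of patients.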

In [148]:
import csv

patient_csv = "patient.csv"

patient_output_file = "patient_output.csv"
hospital_output_file = "hospital_output.csv"

patient_results = []
hospital_metrics = {}

# calculate_bmi, calculate_bode_score, and calculate_bode_risk are defined in Steps 1-3.

# Numeric level for each risk category so that AVG_RISK can be computed
# (0 = low, 1 = moderate, 2 = high)
RISK_LEVELS = {'Low risk': 0, 'Moderate risk': 1, 'High risk': 2}

# Read patient data from the CSV file
with open(patient_csv, 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)

    for row in reader:
        name = row['NAME']
        height_m = float(row['HEIGHT_M'])
        weight_kg = float(row['WEIGHT_KG'])
        fev_pct = float(row['fev_pct'])
        dyspnea_description = row['dyspnea_description']
        distance_in_meters = float(row['distance_in_meters'])
        hospital_name = row['hospital']

        # Calculate BMI, BODE score, and BODE risk
        bmi = calculate_bmi(weight_kg, height_m)
        bode_score = calculate_bode_score(bmi, fev_pct, dyspnea_description, distance_in_meters)
        bode_risk = calculate_bode_risk(bode_score)

        # Add patient results
        patient_results.append([name, bode_score, bode_risk, hospital_name])

        # Initialize hospital metrics if not already present
        if hospital_name not in hospital_metrics:
            hospital_metrics[hospital_name] = {
                'total_bode_score': 0,
                'total_risk': 0,
                'copd_count': 0,
                # Bed count loaded from hospitals.json in Step 4; 0 if the hospital is not listed
                'beds': hospital_beds.get(hospital_name, 0)
            }

        # Update hospital metrics
        hospital_metrics[hospital_name]['total_bode_score'] += bode_score
        hospital_metrics[hospital_name]['total_risk'] += RISK_LEVELS[bode_risk]
        hospital_metrics[hospital_name]['copd_count'] += 1

hospital_output_list = []

# Calculate hospital metrics
for hospital_name, metrics in hospital_metrics.items():
    copd_count = metrics['copd_count']
    avg_bode_score = metrics['total_bode_score'] / copd_count if copd_count > 0 else 0
    avg_bode_risk = metrics['total_risk'] / copd_count if copd_count > 0 else 0
    pct_of_copd_cases = (copd_count / metrics['beds']) * 100 if metrics['beds'] > 0 else 0
    hospital_output_list.append([hospital_name, copd_count, pct_of_copd_cases, avg_bode_score, avg_bode_risk])

# Write patient_output.csv
with open(patient_output_file, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["NAME", "BODE_SCORE", "BODE_RISK", "HOSPITAL"])
    writer.writerows(patient_results)

# Write hospital_output.csv
with open(hospital_output_file, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["HOSPITAL_NAME", "COPD_COUNT", "PCT_OF_COPD_CASES_OVER_BEDS", "AVG_SCORE", "AVG_RISK"])
    writer.writerows(hospital_output_list)
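
One way to sanity-check an output file is to write it and read it straight back. The sketch below does this in memory with `csv.DictWriter` and `io.StringIO`, so no file is touched; the patient row is made up, and `DictWriter` is shown as an alternative to the list-based writer used above, not as a required change.

```python
import csv
import io

fieldnames = ["NAME", "BODE_SCORE", "BODE_RISK", "HOSPITAL"]
rows = [  # made-up sample row
    {"NAME": "Pat Example", "BODE_SCORE": 3,
     "BODE_RISK": "Moderate risk", "HOSPITAL": "NORTH"},
]

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

# Rewind and read the same content back as dictionaries
buffer.seek(0)
read_back = list(csv.DictReader(buffer))
print(read_back[0]["NAME"], read_back[0]["BODE_SCORE"])
```

Values read back from CSV are always strings, so `BODE_SCORE` comes back as `'3'` and needs converting before any arithmetic.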

