### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to run an analysis using a newer metric, BODE, which improves on an earlier metric and is intended to give insight into survival risk.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and average BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK
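
The hospital columns above could be assembled along these lines. This is only a sketch: the function name `summarize_hospital` and its `beds` argument are hypothetical, the layout of the hospital data is not specified here, and PCT_OF_COPD_CASES_OVER_BEDS is assumed to mean COPD cases divided by beds, times 100.

```python
def summarize_hospital(name, beds, scores, risks):
    """Build one hospital_output row from per-patient BODE scores and risks."""
    if beds <= 0:
        raise ValueError("Bed count must be positive.")
    if not scores or len(scores) != len(risks):
        raise ValueError("Scores and risks must be non-empty and the same length.")
    count = len(scores)
    return {
        "HOSPITAL_NAME": name,
        "COPD_COUNT": count,
        # Assumption: percentage of COPD cases relative to bed count
        "PCT_OF_COPD_CASES_OVER_BEDS": round(100 * count / beds, 2),
        "AVG_SCORE": round(sum(scores) / count, 2),
        "AVG_RISK": round(sum(risks) / count, 2),
    }


# Made-up numbers, purely for illustration
row = summarize_hospital("EXAMPLE HOSPITAL", 200, [1, 3, 5], [80, 67, 57])
print(row)
```

A dict keyed by column name can be written with `csv.DictWriter`, which keeps the column order tied to the header list rather than to list positions.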

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a `ValueError`.
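
One way to meet both requirements at once is to put the error cases in the docstring, since doctest can match an expected traceback. The helper below is a minimal sketch; `to_positive_float` is a hypothetical name and is not part of the assignment.

```python
import doctest


def to_positive_float(text, field):
    """
    Convert a CSV string to a positive float.

    >>> to_positive_float("1.72", "HEIGHT_M")
    1.72
    >>> to_positive_float("abc", "HEIGHT_M")
    Traceback (most recent call last):
        ...
    ValueError: HEIGHT_M must be a number, got 'abc'
    >>> to_positive_float("-3", "WEIGHT_KG")
    Traceback (most recent call last):
        ...
    ValueError: WEIGHT_KG must be positive, got -3.0
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        raise ValueError(f"{field} must be a number, got {text!r}") from None
    if not value > 0:  # also rejects NaN, which compares False to everything
        raise ValueError(f"{field} must be positive, got {value}")
    return value


# Runs only this function's examples; prints nothing when they all pass
doctest.run_docstring_examples(to_positive_float, globals(), name="to_positive_float")
```

Running one function's examples this way avoids re-running every doctest in the notebook each time a cell executes.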

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.

In [1]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [15]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate the Body Mass Index (BMI) based on weight and height.

    Args:
        weight_kg (float): The weight in kilograms.
        height_m (float): The height in meters.

    Returns:
        float: The calculated BMI value, rounded to two decimal places.

    Raises:
        ValueError: If the weight or height is non-positive.

    Doctest:
    >>> calculate_bmi(70, 1.75)
    22.86
    >>> calculate_bmi(90, 1.8)
    27.78
    >>> calculate_bmi(-70, 1.75)
    Traceback (most recent call last):
        ...
    ValueError: Weight and height must be positive values.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")

    bmi = weight_kg / (height_m ** 2)
    return round(bmi, 2)


# Run the doctests
if __name__ == "__main__":
    import doctest
    doctest.testmod()


# Example BMI calculations
bmi1 = calculate_bmi(70, 1.75)  # Expected output: 22.86
bmi2 = calculate_bmi(90, 1.8)   # Expected output: 27.78
print(bmi1, bmi2)

# Example of handling invalid input (negative weight)
try:
    calculate_bmi(-70, 1.75)
except ValueError as e:
    print(e)  # Expected output: "Weight and height must be positive values."




22.86 27.78
Weight and height must be positive values.


### Step 2: Calculate BODE Score

In [19]:
def calculate_bode_score(bmi, fev_pct, dyspnea_level, walk_distance):
    """
    Calculate the BODE index score from BMI, FEV1 percentage, dyspnea level, and 6-minute walk distance.

    Args:
        bmi (float): Body Mass Index (BMI).
        fev_pct (float): FEV1 as a percentage of the predicted value.
        dyspnea_level (int): mMRC dyspnea grade from 0 to 4.
        walk_distance (float): Distance in meters walked during a 6-minute walk test.

    Returns:
        int: Total BODE score (0 to 10).

    Raises:
        ValueError: If any input value is outside its expected range.

    Example:
    >>> calculate_bode_score(22, 70, 1, 360)
    0
    >>> calculate_bode_score(20, 55, 2, 300)
    4
    >>> calculate_bode_score(18, 30, 3, 100)
    9
    >>> calculate_bode_score(22, 70, 5, 360)
    Traceback (most recent call last):
        ...
    ValueError: Dyspnea level must be between 0 and 4.
    """

    # Validate inputs
    if bmi <= 0:
        raise ValueError("BMI must be a positive number.")
    if not (0 <= fev_pct <= 100):
        raise ValueError("FEV1 percentage must be between 0 and 100.")
    if not (0 <= dyspnea_level <= 4):
        raise ValueError("Dyspnea level must be between 0 and 4.")
    if walk_distance < 0:
        raise ValueError("Walk distance must be non-negative.")

    # BMI score: 0 points above 21, otherwise 1 point
    bmi_score = 0 if bmi > 21 else 1

    # FEV1 score
    if fev_pct >= 65:
        fev_score = 0
    elif fev_pct >= 50:
        fev_score = 1
    elif fev_pct >= 36:
        fev_score = 2
    else:
        fev_score = 3

    # Dyspnea score: mMRC grades 0 and 1 score 0 points; grades 2, 3, 4 score 1, 2, 3.
    # Adding the raw grade would overcount by one point for every grade above 0.
    dyspnea_score = max(0, dyspnea_level - 1)

    # Walk distance score
    if walk_distance >= 350:
        walk_score = 0
    elif walk_distance >= 250:
        walk_score = 1
    elif walk_distance >= 150:
        walk_score = 2
    else:
        walk_score = 3

    # Calculate the total BODE score
    total_bode_score = bmi_score + fev_score + dyspnea_score + walk_score

    return total_bode_score

# Example test cases
print(calculate_bode_score(22, 70, 1, 360))  # Expected score: 0
print(calculate_bode_score(20, 55, 2, 300))  # Expected score: 4
print(calculate_bode_score(18, 30, 3, 100))  # Expected score: 9

# Run doctests to ensure correct functionality
if __name__ == "__main__":
    import doctest
    doctest.testmod()




0
4
9

### Step 3: Calculate BODE Risk

In [20]:
def calculate_bode_risk(bode_score):
    """
    Determines the risk category and survival probability based on the BODE score.

    Args:
        bode_score (int): BODE score, must be between 0 and 10.

    Returns:
        tuple: A tuple containing the risk category (str) and the survival probability (float).

    Raises:
        ValueError: If the BODE score is outside the valid range (0-10).

    Example:
    >>> calculate_bode_risk(1)
    ('Low Risk', 80.0)
    >>> calculate_bode_risk(4)
    ('Moderate Risk', 67.0)
    >>> calculate_bode_risk(6)
    ('High Risk', 57.0)
    >>> calculate_bode_risk(9)
    ('Very High Risk', 18.0)
    """

    # Validate BODE score input
    if not (0 <= bode_score <= 10):
        raise ValueError("BODE score must be between 0 and 10.")

    # Determine risk category and survival probability
    if 0 <= bode_score <= 2:
        return ("Low Risk", 80.0)
    elif 3 <= bode_score <= 4:
        return ("Moderate Risk", 67.0)
    elif 5 <= bode_score <= 6:
        return ("High Risk", 57.0)
    else:
        return ("Very High Risk", 18.0)


# Example test cases
print(calculate_bode_risk(1))  # Expected: ('Low Risk', 80.0)
print(calculate_bode_risk(4))  # Expected: ('Moderate Risk', 67.0)
print(calculate_bode_risk(6))  # Expected: ('High Risk', 57.0)
print(calculate_bode_risk(9))  # Expected: ('Very High Risk', 18.0)

# Run doctests to verify functionality
if __name__ == "__main__":
    import doctest
    doctest.testmod()


('Low Risk', 80.0)
('Moderate Risk', 67.0)
('High Risk', 57.0)
('Very High Risk', 18.0)

### Step 4: Load Hospital Data

In [21]:
import csv
from google.colab import files  # Colab-only helper for browser uploads


def upload_file():
    """Prompt for a file upload in Colab and return the uploaded file's name (None if nothing was uploaded)."""
    uploaded = files.upload()
    return list(uploaded.keys())[0] if uploaded else None


def load_patient_data(filename):
    """
    Load patient rows from a CSV file and convert the numeric columns to floats.

    Rows with a non-numeric value in a numeric column are reported and skipped.
    Returns a list of dicts (empty if the file could not be read).
    """
    numeric_columns = ('HEIGHT_M', 'WEIGHT_KG', 'fev_pct', 'distance_in_meters')
    patient_data = []

    try:
        # newline='' lets the csv module handle line endings itself
        with open(filename, mode='r', newline='') as file:
            for row in csv.DictReader(file):
                try:
                    for column in numeric_columns:
                        row[column] = float(row[column])
                except ValueError as ve:
                    print(f"Value error in row {row}: {ve}")
                    continue  # Skip rows that fail numeric conversion

                patient_data.append(row)

    except FileNotFoundError:
        print(f"Error: The file '{filename}' was not found.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

    return patient_data


# Upload the file and load the patient rows
filename = upload_file()
if filename:
    patient_rows = load_patient_data(filename)

    # Print the first 5 patients to verify the load
    for patient in patient_rows[:5]:
        print(patient)
else:
    print("No file uploaded.")


Saving patient csv.csv to patient csv.csv
{'NAME': 'Vanessa Roberts', 'SSN': '295-82-3703', 'LANGUAGE': 'Belarusian', 'JOB': 'Teacher English as a foreign language', 'HEIGHT_M': 1.72, 'WEIGHT_KG': 90.28, 'fev_pct': 57.73, 'dyspnea_description': 'STOPS AFTER A FEW MINUTES', 'distance_in_meters': 367.9, 'hospital': "ST.LUKE'S"}
{'NAME': 'Christopher Fox', 'SSN': '286-30-9664', 'LANGUAGE': 'Macedonian', 'JOB': 'Local government officer', 'HEIGHT_M': 1.64, 'WEIGHT_KG': 83.09, 'fev_pct': 61.6, 'dyspnea_description': 'WHEN HURRYING', 'distance_in_meters': 184.16, 'hospital': 'SAINT LOUIS UNIVERSITY'}
{'NAME': 'Benjamin Johnston', 'SSN': '139-07-4381', 'LANGUAGE': 'Kirghiz', 'JOB': 'Multimedia programmer', 'HEIGHT_M': 1.61, 'WEIGHT_KG': 94.91, 'fev_pct': 83.11, 'dyspnea_description': 'BREATHLESS WHEN DRESSING', 'distance_in_meters': 260.66, 'hospital': 'BJC'}
{'NAME': 'Christopher Hernandez', 'SSN': '687-37-0804', 'LANGUAGE': 'South Ndebele', 'JOB': 'Community education officer', 'HEIGHT_M': 
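
The loader above depends on a Colab upload, which makes it awkward to test. As a sketch of one alternative, the parsing step can take any open text handle, so a small in-memory CSV can stand in for the real file. The function name `read_patient_rows` and the sample row are hypothetical.

```python
import csv
import io


def read_patient_rows(handle):
    """Parse patient rows from an open text handle, converting numeric columns to floats."""
    numeric_columns = ("HEIGHT_M", "WEIGHT_KG", "fev_pct", "distance_in_meters")
    rows = []
    for row in csv.DictReader(handle):
        for column in numeric_columns:
            row[column] = float(row[column])  # raises ValueError on non-numeric text
        rows.append(row)
    return rows


# In-memory stand-in for the patient file (made-up values)
sample = io.StringIO(
    "NAME,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital\n"
    "Test Patient,1.70,65.0,55.0,WHEN HURRYING,300.0,EXAMPLE HOSPITAL\n"
)
rows = read_patient_rows(sample)
print(rows[0]["HEIGHT_M"], rows[0]["hospital"])
```

The same function works on a real file by passing the handle from `open(filename, newline='')`.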

### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.

In [30]:
import csv
import json

# File paths
patient_csv = "/content/patient.csv"
hospital_json = "/content/hospitals.json"

# Function to calculate BMI
def calculate_bmi(weight_kg, height_m):
    """Return BMI in kg/m^2; raise ValueError for non-positive weight or height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive numbers.")
    return weight_kg / (height_m ** 2)

# Function to calculate BODE score and risk
def calculate_bode_score(patient):
    """Return (bode_score, four_year_survival_pct) for one patient row (a dict of CSV fields)."""
    bmi = calculate_bmi(patient['WEIGHT_KG'], patient['HEIGHT_M'])
    bmi_score = 0 if bmi > 21 else 1
    fev_score = (0 if patient['fev_pct'] >= 65 else
                 1 if patient['fev_pct'] >= 50 else
                 2 if patient['fev_pct'] >= 36 else 3)

    # BODE points per mMRC dyspnea description: grades 0 and 1 score 0, grades 2 to 4 score 1 to 3.
    # The lookup is an exact string match; a description that is not a key scores 0.
    dyspnea_score = {
        "Only breathless with strenuous exercise": 0,
        "Breathless when hurrying or walking uphill": 0,
        "Walks slower, stops for breath": 1,
        "Stops for breath after 100 yards or a few minutes on level ground": 2,
        "Too breathless to leave house or while dressing": 3
    }.get(patient['dyspnea_description'], 0)

    distance_score = (0 if patient['distance_in_meters'] >= 350 else
                      1 if patient['distance_in_meters'] >= 250 else
                      2 if patient['distance_in_meters'] >= 150 else 3)

    bode_score = bmi_score + fev_score + dyspnea_score + distance_score
    # 4-year survival (%): scores 0-2 -> 80, 3-4 -> 67, 5-6 -> 57, 7-10 -> 18
    bode_risk = {0: 80, 1: 80, 2: 80, 3: 67, 4: 67, 5: 57, 6: 57}.get(bode_score, 18)
    return bode_score, bode_risk

# Load patient data from CSV
def load_patient_data(filename):
    with open(filename, mode='r') as file:
        csv_reader = csv.DictReader(file)
        required_columns = ['NAME', 'HEIGHT_M', 'WEIGHT_KG', 'fev_pct', 'dyspnea_description', 'distance_in_meters', 'hospital']
        for column in required_columns:
            if column not in csv_reader.fieldnames:
                raise ValueError(f"Missing required column: {column}")
        return [{**row, 'HEIGHT_M': float(row['HEIGHT_M']), 'WEIGHT_KG': float(row['WEIGHT_KG']),
                 'fev_pct': float(row['fev_pct']), 'distance_in_meters': float(row['distance_in_meters'])} for row in csv_reader]

# Load hospital data from JSON
def load_hospital_data(filename):
    with open(filename, mode='r') as file:
        return json.load(file)

# Process patients and hospitals
def process_data(patient_data, hospital_data):
    patient_results, hospital_aggregates = [], {}

    for patient in patient_data:
        bode_score, bode_risk = calculate_bode_score(patient)
        patient_id, hospital_id = patient['NAME'], patient['hospital']
        patient_results.append([patient_id, hospital_id, bode_score, bode_risk])

        if hospital_id not in hospital_aggregates:
            hospital_aggregates[hospital_id] = {'total_bode_score': 0, 'total_bode_risk': 0, 'num_patients': 0}

        hospital_aggregates[hospital_id]['total_bode_score'] += bode_score
        hospital_aggregates[hospital_id]['total_bode_risk'] += bode_risk
        hospital_aggregates[hospital_id]['num_patients'] += 1

    hospital_output = [
        [hospital_id, aggregates['total_bode_score'] / aggregates['num_patients'],
         aggregates['total_bode_risk'] / aggregates['num_patients'], aggregates['num_patients']]
        for hospital_id, aggregates in hospital_aggregates.items()
    ]

    return patient_results, hospital_output

# Write data to CSV
def write_csv(filename, data, headers=None):
    with open(filename, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        if headers:
            writer.writerow(headers)
        writer.writerows(data)

# Load data, process it, and save results
patient_data = load_patient_data(patient_csv)
hospital_data = load_hospital_data(hospital_json)
patient_results, hospital_output = process_data(patient_data, hospital_data)

write_csv("patient_output.csv", patient_results, headers=["PATIENT_NAME", "HOSPITAL", "BODE_SCORE", "BODE_RISK"])
write_csv("hospital_output.csv", hospital_output, headers=["HOSPITAL", "AVG_BODE_SCORE", "AVG_BODE_RISK", "NUM_PATIENTS"])

# Output for verification
print("Patient Results (First 10):", patient_results[:10])
print("Hospital Results (First 10):", hospital_output[:10])

Patient Results (First 10): [['Vanessa Roberts', "ST.LUKE'S", 1, 80], ['Christopher Fox', 'SAINT LOUIS UNIVERSITY', 3, 67], ['Benjamin Johnston', 'BJC', 1, 80], ['Christopher Hernandez', 'MISSOURI BAPTIST', 1, 80], ['Valerie Burch', 'BJC WEST COUNTY', 0, 80], ['Heather Hart', 'SAINT LOUIS UNIVERSITY', 3, 67], ['Ronald Cobb', "ST.MARY'S", 4, 67], ['Austin French', 'SAINT LOUIS UNIVERSITY', 6, 57], ['Mary Leonard', 'BJC', 5, 57], ['Mrs. Nicole Smith', "ST.MARY'S", 3, 67]]
Hospital Results (First 10): [["ST.LUKE'S", 2.9146341463414633, 70.70731707317073, 164], ['SAINT LOUIS UNIVERSITY', 3.0365853658536586, 69.78658536585365, 164], ['BJC', 2.972826086956522, 70.19021739130434, 184], ['MISSOURI BAPTIST', 2.8260869565217392, 70.72049689440993, 161], ['BJC WEST COUNTY', 2.8771929824561404, 70.64327485380117, 171], ["ST.MARY'S", 2.8974358974358974, 70.1923076923077, 156]]
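
The patient rows printed in Step 4 carry short uppercase dyspnea descriptions such as 'WHEN HURRYING', so a lookup keyed on longer mixed-case sentences may never match and would then fall back to 0 for every patient. A stricter lookup could normalize the text and raise a `ValueError` for anything unrecognized. The sketch below maps only the three descriptions visible in the sample rows; `DYSPNEA_POINTS` and `dyspnea_points` are hypothetical names, the grade assigned to each phrase is an assumption based on the mMRC scale wording, and the remaining descriptions in the file would need to be added.

```python
# Assumption: BODE points follow the mMRC grade (grades 0-1 -> 0, 2 -> 1, 3 -> 2, 4 -> 3).
# Only descriptions seen in the sample output are listed; the full set is not known here.
DYSPNEA_POINTS = {
    "WHEN HURRYING": 0,              # read as mMRC grade 1
    "STOPS AFTER A FEW MINUTES": 2,  # read as mMRC grade 3
    "BREATHLESS WHEN DRESSING": 3,   # read as mMRC grade 4
}


def dyspnea_points(description):
    """Return BODE dyspnea points for a description, or raise ValueError if unknown."""
    key = description.strip().upper()
    if key not in DYSPNEA_POINTS:
        raise ValueError(f"Unrecognized dyspnea description: {description!r}")
    return DYSPNEA_POINTS[key]


print(dyspnea_points("STOPS AFTER A FEW MINUTES"))
```

Raising on unknown text surfaces a mismatch between the data and the lookup table immediately, instead of lowering every affected patient's score without warning.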
