
### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE improves on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE score and average BODE survival rate for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital
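
As a minimal sketch of reading these columns with `csv.DictReader`, assuming the file has a header row matching the list above. The sample row and the `parse_patient` helper are hypothetical and only illustrate that every field arrives as a string and the numeric ones need converting.

```python
import csv
import io

# Hypothetical sample data; a real run would open the patient CSV file instead.
SAMPLE_CSV = (
    "NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,"
    "dyspnea_description,distance_in_meters,hospital\n"
    "Jane Doe,000-00-0000,English,Nurse,1.70,65,55,WALKING UPHILL,300,Example Hospital\n"
)

NUMERIC_COLUMNS = ("HEIGHT_M", "WEIGHT_KG", "fev_pct", "distance_in_meters")


def parse_patient(row):
    """Return a copy of one CSV row with its numeric columns converted to float."""
    patient = dict(row)
    for column in NUMERIC_COLUMNS:
        try:
            patient[column] = float(row[column])
        except (KeyError, TypeError, ValueError):
            # Missing, empty, or non-numeric values all surface as ValueError.
            raise ValueError(f"Bad value in column {column}: {row.get(column)!r}") from None
    return patient


patients = [parse_patient(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(patients[0]["NAME"], patients[0]["HEIGHT_M"], patients[0]["hospital"])
```

Converting inside one helper keeps the `ValueError` handling in a single place, which fits the requirement to reject bad input data.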

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME, COPD_COUNT, PCT_OF_COPD_CASES_OVER_BEDS, AVG_SCORE, AVG_RISK
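
One possible way to write rows in this layout is `csv.DictWriter`. The sketch below writes to an in-memory buffer so it can run anywhere; the `write_hospital_rows` helper and the sample values are hypothetical.

```python
import csv
import io

HOSPITAL_FIELDS = [
    "HOSPITAL_NAME",
    "COPD_COUNT",
    "PCT_OF_COPD_CASES_OVER_BEDS",
    "AVG_SCORE",
    "AVG_RISK",
]


def write_hospital_rows(rows, out):
    """Write a header line followed by one CSV line per hospital dict."""
    writer = csv.DictWriter(out, fieldnames=HOSPITAL_FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical values, for illustration only.
sample_rows = [
    {
        "HOSPITAL_NAME": "Example Hospital",
        "COPD_COUNT": 5,
        "PCT_OF_COPD_CASES_OVER_BEDS": 5.0,
        "AVG_SCORE": 4.2,
        "AVG_RISK": 61.0,
    }
]

buffer = io.StringIO()
write_hospital_rows(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file, open it with `newline=""` (for example `open("hospital_output.csv", "w", newline="")`) so the `csv` module controls line endings.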

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a `ValueError`.
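
A doctest can cover the error path as well as the normal one. The sketch below uses a hypothetical `require_positive` helper (not part of the assignment) to show the `Traceback` form doctest expects when a `ValueError` is raised.

```python
import doctest


def require_positive(name, value):
    """
    Return value as a float, or raise ValueError if it is not a positive number.

    >>> require_positive("HEIGHT_M", "1.75")
    1.75

    >>> require_positive("WEIGHT_KG", -3)
    Traceback (most recent call last):
        ...
    ValueError: WEIGHT_KG must be a positive number, got -3

    >>> require_positive("HEIGHT_M", "tall")
    Traceback (most recent call last):
        ...
    ValueError: HEIGHT_M must be a positive number, got 'tall'
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        raise ValueError(f"{name} must be a positive number, got {value!r}") from None
    # "not number > 0" also rejects NaN, which compares False to everything.
    if not number > 0:
        raise ValueError(f"{name} must be a positive number, got {value!r}")
    return number


if __name__ == "__main__":
    doctest.testmod()
```

Doctest compares only the exception type and message, so the `...` line stands in for the stack trace.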

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.

In [77]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [78]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate the BMI from weight and height.

    Parameters:
    weight_kg (float): Weight in kilograms.
    height_m (float): Height in meters.

    Returns:
    float: BMI value rounded to two decimal places.

    Raises:
    ValueError: If weight or height is non-positive.

    Examples:

    >>> calculate_bmi(80, 1.8)
    24.69

    >>> calculate_bmi(90, 1.9)
    24.93

    >>> calculate_bmi(55, 1.55)
    22.89
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive numbers.")

    bmi = weight_kg / (height_m ** 2)
    return round(bmi, 2)




### Step 2: Calculate BODE Score

In [79]:
def normalize_dyspnea_description(description):
    """
    Normalize variations of dyspnea descriptions to fit known categories.

    Parameters:
    description (str): Original dyspnea description.

    Returns:
    str: Normalized dyspnea description.

    Examples:
    >>> normalize_dyspnea_description("STOPS AFTER A FEW MINUTES")
    'Severe breathlessness'

    >>> normalize_dyspnea_description("WHEN HURRYING")
    'Moderate breathlessness'

    >>> normalize_dyspnea_description("UNABLE TO LEAVE HOME")
    'Severe breathlessness'

    >>> normalize_dyspnea_description("ONLY STRENUOUS EXERCISE")
    'Mild breathlessness'

    >>> normalize_dyspnea_description("WALKING UPHILL")
    'Moderate breathlessness'

    >>> normalize_dyspnea_description("UNKNOWN DESCRIPTION")
    'Invalid description'
    """
    description = description.upper().strip()

    dyspnea_mapping = {
        "STOPS AFTER A FEW MINUTES": "Severe breathlessness",
        "WHEN HURRYING": "Moderate breathlessness",
        "UNABLE TO LEAVE HOME": "Severe breathlessness",
        "SLOWER THAN PEERS": "Moderate breathlessness",
        "WALKING UPHILL": "Moderate breathlessness",
        "ONLY STRENUOUS EXERCISE": "Mild breathlessness",
        "BREATHLESS WHEN DRESSING": "Severe breathlessness",
        "STOPS WHEN WALKING AT PACE": "Severe breathlessness",
        "STOPS AFTER 100 YARDS": "Severe breathlessness",
    }

    return dyspnea_mapping.get(description, "Invalid description")

def calculate_bode_score(bmi, fev_pct, dyspnea_description, distance_in_meters):
    """
    Calculate the BODE score based on BMI, FEV1 percentage, dyspnea description, and distance in meters.

    Parameters:
    bmi (float): Body Mass Index.
    fev_pct (float): FEV1 percentage.
    dyspnea_description (str): Description of dyspnea.
    distance_in_meters (float): Distance walked in meters.

    Returns:
    int: BODE score.

    Raises:
    ValueError: If the dyspnea description is invalid.

    Examples:
    >>> calculate_bode_score(22, 70, 'ONLY STRENUOUS EXERCISE', 400)
    1

    >>> calculate_bode_score(18, 40, 'STOPS WHEN WALKING AT PACE', 200)
    8

    >>> calculate_bode_score(25, 80, 'WHEN HURRYING', 300)
    3

    >>> calculate_bode_score(20, 55, 'UNABLE TO LEAVE HOME', 100)
    8

    >>> calculate_bode_score(19, 30, 'UNKNOWN DESCRIPTION', 150)
    Traceback (most recent call last):
        ...
    ValueError: Invalid dyspnea description.
    """
    bode_score = 0

    # Calculate BMI score
    bode_score += 0 if bmi > 21 else 1

    # Calculate FEV1 score
    if fev_pct >= 65:
        bode_score += 0
    elif 50 <= fev_pct < 65:
        bode_score += 1
    elif 36 <= fev_pct < 50:
        bode_score += 2
    else:
        bode_score += 3

    # Normalize the dyspnea description to a category, then map the category to points
    dyspnea_category = normalize_dyspnea_description(dyspnea_description)

    if dyspnea_category == "Invalid description":
        raise ValueError("Invalid dyspnea description.")

    dyspnea_points = {
        "No breathlessness": 0,
        "Mild breathlessness": 1,
        "Moderate breathlessness": 2,
        "Severe breathlessness": 3,
    }

    # The category was validated above, so the lookup cannot fail
    bode_score += dyspnea_points[dyspnea_category]

    # Calculate distance walked score (350 m or more scores 0 points)
    if distance_in_meters >= 350:
        bode_score += 0
    elif 250 <= distance_in_meters < 350:
        bode_score += 1
    elif 150 <= distance_in_meters < 250:
        bode_score += 2
    else:
        bode_score += 3

    return bode_score


### Step 3: Calculate BODE Risk

In [80]:
def calculate_bode_risk(bode_score: int) -> str:
    """
    Calculate the BODE risk category based on the BODE score.

    Parameters:
    bode_score (int): The BODE score.

    Returns:
    str: BODE risk category.

    Examples:
    >>> calculate_bode_risk(0)
    'Low Risk'

    >>> calculate_bode_risk(2)
    'Low Risk'

    >>> calculate_bode_risk(3)
    'Moderate Risk'

    >>> calculate_bode_risk(4)
    'Moderate Risk'

    >>> calculate_bode_risk(5)
    'Moderate Risk'

    >>> calculate_bode_risk(6)
    'High Risk'

    >>> calculate_bode_risk(10)
    'High Risk'
    """
    if bode_score <= 2:
        return "Low Risk"
    elif 3 <= bode_score <= 5:
        return "Moderate Risk"
    else:
        return "High Risk"
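
The assignment also asks for BODE survival rates, which are numeric and can therefore be averaged per hospital. A minimal sketch of such a lookup follows. The `bode_survival_pct` helper is hypothetical, and the percentages are an assumption based on the approximate 4-year survival bands listed on the MDCalc page linked above; verify them against that page before relying on them.

```python
# Assumed 4-year survival percentages per BODE score band (check against the
# MDCalc page before use): 0-2 -> 80, 3-4 -> 67, 5-6 -> 57, 7-10 -> 18.
SURVIVAL_BY_BAND = [
    (range(0, 3), 80),
    (range(3, 5), 67),
    (range(5, 7), 57),
    (range(7, 11), 18),
]


def bode_survival_pct(bode_score):
    """Return the assumed 4-year survival percentage for an integer BODE score."""
    if isinstance(bode_score, bool) or not isinstance(bode_score, int):
        raise ValueError(f"BODE score must be an integer, got {bode_score!r}")
    for band, survival_pct in SURVIVAL_BY_BAND:
        if bode_score in band:
            return survival_pct
    raise ValueError(f"BODE score must be between 0 and 10, got {bode_score}")


print(bode_survival_pct(4))
```

Unlike a text category, a percentage can be summed and divided, which is what an `AVG_RISK` column needs.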




### Step 4: Load Hospital Data

In [81]:
import json

def load_hospital_data(json_file: str) -> dict:
    """
    Open a JSON file containing hospital data.

    Parameters:
    json_file (str): The path to the hospital data JSON file.

    Returns:
    dict: Hospital data parsed from the JSON file.

    Raises:
    FileNotFoundError: If the specified file does not exist.
    ValueError: If the file content is not valid JSON.

    Examples:
    >>> import json
    >>> with open('hospital_data.json', 'w') as f:
    ...     json.dump({"hospitals": [{"name": "General Hospital", "beds": 100}]}, f)
    >>> data = load_hospital_data('hospital_data.json')
    >>> isinstance(data, dict)
    True
    >>> len(data['hospitals'])
    1
    >>> data['hospitals'][0]['name']
    'General Hospital'
    >>> data['hospitals'][0]['beds']
    100

    >>> # Error case: File does not exist
    >>> load_hospital_data('non_existent_file.json')
    Traceback (most recent call last):
        ...
    FileNotFoundError: File not found: non_existent_file.json

    >>> # Error case: Invalid JSON content
    >>> with open('invalid_hospital_data.json', 'w') as f:
    ...     _ = f.write("This is not valid JSON")
    >>> load_hospital_data('invalid_hospital_data.json')
    Traceback (most recent call last):
        ...
    ValueError: Error decoding JSON from the file: invalid_hospital_data.json

    >>> # Edge case: JSON file containing an empty object
    >>> with open('empty_hospital_data.json', 'w') as f:
    ...     _ = f.write("{}")
    >>> data = load_hospital_data('empty_hospital_data.json')
    >>> isinstance(data, dict)
    True
    >>> len(data.get('hospitals', []))
    0
    """
    try:
        with open(json_file, 'r') as file:
            return json.load(file)
    except FileNotFoundError as e:
        raise FileNotFoundError(f"File not found: {json_file}") from e
    except json.JSONDecodeError as e:
        raise ValueError(f"Error decoding JSON from the file: {json_file}") from e



### Step 5: Main business logic

Call the BODE score and BODE risk functions for each patient.

For each hospital, calculate the average BODE score and average BODE risk, and count the number of COPD cases.
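
The per-hospital step can be sketched as a single pass that accumulates totals and then divides. The `summarize_by_hospital` helper, the tuple layout, and the sample values are hypothetical. Two assumptions are built in: risk is treated as a numeric value so it can be averaged, and PCT_OF_COPD_CASES_OVER_BEDS is read as COPD cases per 100 beds.

```python
def summarize_by_hospital(patients, beds_by_hospital):
    """
    Aggregate (hospital, bode_score, risk) tuples into one summary dict per hospital.

    beds_by_hospital maps each hospital name to its number of beds.
    """
    totals = {}
    for hospital, score, risk in patients:
        entry = totals.setdefault(hospital, {"count": 0, "score": 0, "risk": 0})
        entry["count"] += 1
        entry["score"] += score
        entry["risk"] += risk

    rows = []
    for hospital in sorted(totals):
        entry = totals[hospital]
        beds = beds_by_hospital.get(hospital)
        if not beds or beds <= 0:
            raise ValueError(f"No usable bed count for hospital: {hospital}")
        rows.append({
            "HOSPITAL_NAME": hospital,
            "COPD_COUNT": entry["count"],
            "PCT_OF_COPD_CASES_OVER_BEDS": round(100 * entry["count"] / beds, 2),
            "AVG_SCORE": round(entry["score"] / entry["count"], 2),
            "AVG_RISK": round(entry["risk"] / entry["count"], 2),
        })
    return rows


# Hypothetical sample input, for illustration only.
sample_patients = [("A", 2, 80), ("A", 6, 57), ("B", 8, 18)]
sample_beds = {"A": 100, "B": 50}
summary = summarize_by_hospital(sample_patients, sample_beds)
for row in summary:
    print(row)
```

Keeping running totals rather than lists of scores means each patient row is touched once, and the averages are computed only after every patient has been counted.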

In [82]:
import csv
import json

# calculate_bmi, calculate_bode_score, calculate_bode_risk, and load_hospital_data
# are defined in Steps 1-4 and reused here. Redefining them in this cell would
# shadow the documented versions and hide their doctests from doctest.testmod().

def test_calculate_bmi():
    # 70 / 1.75**2 = 22.857..., which calculate_bmi rounds to 22.86
    assert calculate_bmi(70, 1.75) == 22.86

def test_calculate_bode_score():
    # BMI > 21 -> 0, FEV1 80% -> 0, mild breathlessness -> 1, 150 m walked -> 2
    assert calculate_bode_score(22.86, 80, "ONLY STRENUOUS EXERCISE", 150) == 3

def test_hospital_metrics():
    # Mock data
    mock_hospital_data = {
        "Hospital A": {
            "total_bode_score": 50,
            "total_risk": 5,
            "copd_count": 5,
            "beds": 100
        }
    }
    metrics = mock_hospital_data["Hospital A"]
    # Average BODE score is the total score divided by the number of COPD cases
    assert metrics["total_bode_score"] / metrics["copd_count"] == 10
    # COPD cases as a percentage of beds (one reading of PCT_OF_COPD_CASES_OVER_BEDS)
    assert 100 * metrics["copd_count"] / metrics["beds"] == 5


In [83]:
if __name__ == "__main__":
    # Execute doctest to run the tests defined in the docstrings of the functions.
    doctest.testmod()
