### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE is an improvement on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE scores and BODE survival rates for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.
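
As a reminder, BMI is weight in kilograms divided by the square of height in meters:

$$\mathrm{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^2}$$

For example, a 70 kg patient who is 1.75 m tall has a BMI of $70 / 1.75^2 = 70 / 3.0625 \approx 22.86$.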

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

Patient_output will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

Hospital output will have the following columns:
HOSPITAL_NAME, COPD_COUNT, PCT_OF_COPD_CASES_OVER_BEDS, AVG_SCORE, AVG_RISK
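
One way to make sure an output file carries the column names listed above is to write a header row before the data. The following is a minimal sketch using `csv.DictWriter`; the helper name `write_patient_rows` and the sample patient are hypothetical, and the file is replaced by an in-memory buffer so the example runs on its own.

```python
import csv
import io

# Column names taken from the assignment text.
PATIENT_COLUMNS = ["NAME", "BODE_SCORE", "BODE_RISK", "HOSPITAL"]


def write_patient_rows(rows, handle):
    """Write a header row followed by one line per patient dict."""
    writer = csv.DictWriter(handle, fieldnames=PATIENT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical patient, for illustration only.
buffer = io.StringIO()
write_patient_rows(
    [{"NAME": "Jane Doe", "BODE_SCORE": 4, "BODE_RISK": "Moderate Risk", "HOSPITAL": "Example General"}],
    buffer,
)
print(buffer.getvalue())
```

With a real file, the buffer would be replaced by `open("patient_output.csv", "w", newline="")`.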

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a `ValueError`.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.
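
Doctests can also check that bad input raises a `ValueError`. The sketch below shows the traceback pattern that `doctest` recognizes; `parse_positive_float` is a hypothetical helper, not part of the assignment, used here only to demonstrate the technique.

```python
import doctest


def parse_positive_float(text):
    """
    Convert a CSV field to a positive float (hypothetical helper).

    >>> parse_positive_float("1.75")
    1.75
    >>> parse_positive_float("-3")
    Traceback (most recent call last):
        ...
    ValueError: expected a positive number, got '-3'
    >>> parse_positive_float("abc")
    Traceback (most recent call last):
        ...
    ValueError: expected a positive number, got 'abc'
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        # Re-raise with a uniform message so callers see one error format.
        raise ValueError(f"expected a positive number, got {text!r}") from None
    if value <= 0:
        raise ValueError(f"expected a positive number, got {text!r}")
    return value


# Runs only this function's examples; prints nothing when they all pass.
doctest.run_docstring_examples(parse_positive_float, globals(), name="parse_positive_float")
```

The `...` line stands in for the stack trace, so only the header and the final exception line have to match.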

In [2]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [3]:
def calculate_bmi(weight_kg, height_m):
    """
    Calculate BMI (Body Mass Index) based on weight in kg and height in meters.

    Args:
    weight_kg (float): Weight of the patient in kilograms
    height_m (float): Height of the patient in meters

    Returns:
    float: Calculated BMI value

    Raises:
    ValueError: If weight_kg or height_m is non-positive.

    >>> calculate_bmi(70, 1.75)
    22.86
    >>> calculate_bmi(90, 1.80)
    27.78
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("Weight and height must be positive values.")

    bmi = round(weight_kg / (height_m ** 2), 2)
    return bmi


In [8]:
if __name__ == "__main__":
    import doctest
    doctest.testmod()


### Step 2: Calculate BODE Score

In [None]:
def normalize_dyspnea_description(description):
    """
    Normalize the description of dyspnea to standardized breathlessness categories.

    >>> normalize_dyspnea_description("STOPS AFTER A FEW MINUTES")
    'Severe breathlessness'

    >>> normalize_dyspnea_description("WHEN HURRYING")
    'Moderate breathlessness'
    """
    # Convert the input description to uppercase and remove leading/trailing spaces
    description = description.upper().strip()

    # Map specific descriptions to standardized categories
    if "STOPS AFTER A FEW MINUTES" in description:
        return "Severe breathlessness"
    elif "WHEN HURRYING" in description:
        return "Moderate breathlessness"
    elif "UNABLE TO LEAVE HOME" in description:
        return "Severe breathlessness"
    elif "SLOWER THAN PEERS" in description:
        return "Moderate breathlessness"
    elif "WALKING UPHILL" in description:
        return "Moderate breathlessness"
    elif "ONLY STRENUOUS EXERCISE" in description:
        return "Mild breathlessness"
    elif "BREATHLESS WHEN DRESSING" in description:
        return "Severe breathlessness"
    elif "STOPS WHEN WALKING AT PACE" in description:
        return "Severe breathlessness"
    elif "STOPS AFTER 100 YARDS" in description:
        return "Severe breathlessness"

    # If no match, return the original description
    return description


def calculate_bode_score(bmi, fev_pct, dyspnea_description, distance_in_meters):
    """
    Compute the BODE score using BMI, FEV1 percentage, dyspnea description, and walking distance.

    >>> calculate_bode_score(22, 70, 'ONLY STRENUOUS EXERCISE', 400)
    1
    >>> calculate_bode_score(18, 40, 'STOPS WHEN WALKING AT PACE', 200)
    8
    """
    bode_score = 0

    # Add to BODE score based on BMI
    if bmi > 21:
        bode_score += 0
    else:
        bode_score += 1

    # Add to BODE score based on FEV1 percentage
    if fev_pct >= 65:
        bode_score += 0
    elif 50 <= fev_pct < 65:
        bode_score += 1
    elif 36 <= fev_pct < 50:
        bode_score += 2
    else:
        bode_score += 3

    # Normalize the dyspnea description and map to a score
    dyspnea_description = normalize_dyspnea_description(dyspnea_description)
    dyspnea_mapping = {
        "No breathlessness": 0,
        "Mild breathlessness": 1,
        "Moderate breathlessness": 2,
        "Severe breathlessness": 3,
    }

    # Get dyspnea score from the mapping
    dyspnea_score = dyspnea_mapping.get(dyspnea_description)
    if dyspnea_score is None:
        raise ValueError(f"Invalid dyspnea description: {dyspnea_description}")

    bode_score += dyspnea_score

    # Add to BODE score based on six-minute walk distance (350 m or more scores 0)
    if distance_in_meters >= 350:
        bode_score += 0
    elif 250 <= distance_in_meters < 350:
        bode_score += 1
    elif 150 <= distance_in_meters < 250:
        bode_score += 2
    else:
        bode_score += 3

    return bode_score
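

The published BODE index scores dyspnea from the mMRC dyspnea grade (grades 0 and 1 score 0 points, grades 2, 3, and 4 score 1, 2, and 3 points). As an alternative sketch, the lookup below assigns each of the nine dataset descriptions to an mMRC grade. That assignment is an assumption based on the wording of the mMRC scale, not something stated in the assignment, and `dyspnea_points` is a hypothetical name.

```python
# Assumed correspondence between dataset descriptions and mMRC grades (0-4).
MMRC_GRADE = {
    "ONLY STRENUOUS EXERCISE": 0,
    "WHEN HURRYING": 1,
    "WALKING UPHILL": 1,
    "SLOWER THAN PEERS": 2,
    "STOPS WHEN WALKING AT PACE": 2,
    "STOPS AFTER 100 YARDS": 3,
    "STOPS AFTER A FEW MINUTES": 3,
    "UNABLE TO LEAVE HOME": 4,
    "BREATHLESS WHEN DRESSING": 4,
}

# BODE points per mMRC grade: grades 0 and 1 score 0, then 1, 2, 3.
MMRC_POINTS = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}


def dyspnea_points(description):
    """Return the BODE dyspnea points for a free-text description."""
    key = description.strip().upper()
    if key not in MMRC_GRADE:
        raise ValueError(f"Unknown dyspnea description: {description!r}")
    return MMRC_POINTS[MMRC_GRADE[key]]


print(dyspnea_points("STOPS AFTER 100 YARDS"))
```

Under this assumption the dyspnea points are never higher than those from the category mapping in `calculate_bode_score`, and for several descriptions they are lower, so the two approaches can give different totals.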


In [10]:
if __name__ == "__main__":
    import doctest
    doctest.testmod()

### Step 3: Calculate BODE Risk

In [None]:
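def calculate_bode_risk(bode_score):
    """
    Determine the risk category based on the given BODE score.

    Parameters:
    bode_score (int): The calculated BODE score (0 to 10).

    Returns:
    str: The corresponding BODE risk category.

    Raises:
    ValueError: If bode_score is outside the range 0 to 10.

    >>> calculate_bode_risk(1)
    'Low Risk'
    >>> calculate_bode_risk(4)
    'Moderate Risk'
    >>> calculate_bode_risk(8)
    'High Risk'
    >>> calculate_bode_risk(11)
    Traceback (most recent call last):
        ...
    ValueError: BODE score must be between 0 and 10.
    """
    if not 0 <= bode_score <= 10:
        raise ValueError("BODE score must be between 0 and 10.")

    if bode_score <= 2:
        return "Low Risk"
    elif bode_score <= 5:
        return "Moderate Risk"
    else:
        return "High Risk"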


In [17]:
if __name__ == "__main__":
    import doctest
    doctest.testmod()

### Step 4: Load Hospital Data

In [12]:
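def load_hospital_data(json_file):
    """
    Load hospital data from a JSON file.

    Args:
    json_file (str): Path to the hospital JSON file.

    Returns:
    The parsed JSON content.
    """
    with open(json_file, 'r') as file:
        return json.load(file)


# Collect the distinct dyspnea descriptions that appear in the patient file
dyspnea_descriptions = set()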

with open('patient.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        dyspnea_descriptions.add(row['dyspnea_description'])

# Print the different types of dyspnea descriptions
print("Different types of dyspnea descriptions in the dataset:")
for description in dyspnea_descriptions:
    print(description)



Different types of dyspnea descriptions in the dataset:
SLOWER THAN PEERS
STOPS AFTER 100 YARDS
WALKING UPHILL
BREATHLESS WHEN DRESSING
STOPS AFTER A FEW MINUTES
STOPS WHEN WALKING AT PACE
UNABLE TO LEAVE HOME
WHEN HURRYING
ONLY STRENUOUS EXERCISE
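
The hospital file returned by `load_hospital_data` can be flattened into a name-to-beds lookup before the patients are processed. This is a sketch that assumes the JSON is a list of entries, each holding a `hospitals` list whose items have `name` and `beds` keys. The sample data and the helper name `beds_by_hospital` are made up for illustration.

```python
import json

# Made-up sample in the assumed shape; the real hospitals.json may differ.
SAMPLE_JSON = """
[
  {"hospitals": [{"name": "Example General", "beds": 200},
                 {"name": "Sample Memorial", "beds": 50}]},
  {"hospitals": [{"name": "Placeholder Clinic", "beds": 0}]}
]
"""


def beds_by_hospital(hospital_data):
    """Return a dict mapping each hospital name to its bed count."""
    beds = {}
    for entry in hospital_data:
        for hospital in entry["hospitals"]:
            count = hospital["beds"]
            if not isinstance(count, int) or count < 0:
                raise ValueError(f"Invalid bed count for {hospital['name']!r}: {count!r}")
            beds[hospital["name"]] = count
    return beds


beds = beds_by_hospital(json.loads(SAMPLE_JSON))
print(beds)
```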


In [16]:
if __name__ == "__main__":
    import doctest
    doctest.testmod()

### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.
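
The per-hospital aggregation described above can be sketched on a few in-memory records. The example assumes each patient's risk is available as a number (for example a survival percentage) so that it can be averaged. The function name `summarize_by_hospital`, the hospital names, and the values are hypothetical.

```python
from collections import defaultdict


def summarize_by_hospital(patients, beds):
    """
    Aggregate (hospital, score, risk) tuples into one row per hospital:
    [name, case count, cases as % of beds, average score, average risk].
    """
    totals = defaultdict(lambda: {"count": 0, "score": 0.0, "risk": 0.0})
    for hospital, score, risk in patients:
        totals[hospital]["count"] += 1
        totals[hospital]["score"] += score
        totals[hospital]["risk"] += risk

    rows = []
    for hospital in sorted(totals):
        t = totals[hospital]
        bed_count = beds.get(hospital, 0)
        # Avoid dividing by zero when a hospital reports no beds.
        pct = 100 * t["count"] / bed_count if bed_count else 0.0
        rows.append([
            hospital,
            t["count"],
            round(pct, 2),
            round(t["score"] / t["count"], 2),
            round(t["risk"] / t["count"], 2),
        ])
    return rows


# Made-up records: (hospital, BODE score, numeric risk).
sample = [
    ("Example General", 2, 80),
    ("Example General", 6, 57),
    ("Sample Memorial", 8, 18),
]
rows = summarize_by_hospital(sample, {"Example General": 200, "Sample Memorial": 50})
for row in rows:
    print(row)
```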

In [None]:
patient_csv = "patient.csv"
hospital_json = "hospitals.json"

patient_output_file = "patient_output.csv"
hospital_output_file = "hospital_output.csv"

# Load hospital data
hospital_data = load_hospital_data(hospital_json)

# Initialize the hospital metrics dictionary using the hospital names from the JSON data
hospital_metrics = {}

for entry in hospital_data:
    # Iterate over the hospitals list within the entry
    for hospital in entry['hospitals']:
        hospital_metrics[hospital['name']] = {
            'total_bode_score': 0,
            'total_risk': 0,
            'copd_count': 0,
            'beds': hospital['beds']
        }

patient_results = []
# Read patient data from the CSV file
with open(patient_csv, 'r') as csvfile:
    reader = csv.DictReader(csvfile)

    for row in reader:
        # Only the columns needed for the BODE calculation are read
        name = row['NAME']
        height_m = float(row['HEIGHT_M'])
        weight_kg = float(row['WEIGHT_KG'])
        fev_pct = float(row['fev_pct'])
        dyspnea_description = row['dyspnea_description']
        distance_in_meters = float(row['distance_in_meters'])
        hospital_name = row['hospital']

        # Calculate BMI, BODE score, and BODE risk
        bmi = calculate_bmi(weight_kg, height_m)
        bode_score = calculate_bode_score(bmi, fev_pct, dyspnea_description, distance_in_meters)
        bode_risk = calculate_bode_risk(bode_score)

        # Add patient results
        patient_results.append([name, bode_score, bode_risk, hospital_name])

        # Update hospital metrics
        if hospital_name in hospital_metrics:
            hospital_metrics[hospital_name]['total_bode_score'] += bode_score
            hospital_metrics[hospital_name]['total_risk'] += 1
            hospital_metrics[hospital_name]['copd_count'] += 1

hospital_output_list = []

# Calculate hospital metrics
for hospital_name, metrics in hospital_metrics.items():
    copd_count = metrics['copd_count']
    if copd_count > 0:
        avg_bode_score = metrics['total_bode_score'] / copd_count
        avg_bode_risk = metrics['total_risk'] / copd_count
    else:
        avg_bode_score = 0
        avg_bode_risk = 0
    pct_of_copd_cases = (copd_count / metrics['beds']) * 100 if metrics['beds'] > 0 else 0
    hospital_output_list.append([hospital_name, copd_count, pct_of_copd_cases, avg_bode_score, avg_bode_risk])
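
# Write patient_output.csv with the required header row
with open(patient_output_file, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
    writer.writerows(patient_results)

# Write hospital_output.csv with the required header row
with open(hospital_output_file, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['HOSPITAL_NAME', 'COPD_COUNT', 'PCT_OF_COPD_CASES_OVER_BEDS', 'AVG_SCORE', 'AVG_RISK'])
    writer.writerows(hospital_output_list)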

In [15]:
if __name__ == "__main__":
    import doctest
    doctest.testmod()