### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE is an improvement on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then evaluate the average BODE scores and BODE survival rates for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

patient_output.csv will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

hospital_output.csv will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a ValueError.

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.

In [3]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [16]:
def bmi(weight_kg, height_m):
    """
    Calculate the BMI given weight in kilograms and height in meters.

    >>> bmi(70, 1.75)
    22.86

    >>> bmi(100, 0)
    Traceback (most recent call last):
        ...
    ValueError: Height must be greater than 0
    """
    if height_m <= 0:
        raise ValueError("Height must be greater than 0")
    # Use a distinct local name so the function name bmi is not shadowed
    bmi_value = round(weight_kg / (height_m ** 2), 2)

    return bmi_value


In [17]:
import doctest
doctest.run_docstring_examples(bmi, globals(),verbose=True)

Finding tests in NoName
Trying:
    bmi(70, 1.75)
Expecting:
    22.86
ok
Trying:
    bmi(100, 0)
Expecting:
    Traceback (most recent call last):
        ...
    ValueError: Height must be greater than 0
ok


In [18]:
bmi(82, 1.65)  # my personal BMI

30.12

### Step 2: Calculate BODE Score

In [24]:
def bode_score(bmi, fev_pct, dyspnea_description, distance_in_meters):
    """
    Calculate the BODE score based on FEV percentage, distance walked, dyspnea level, and BMI.
    BODE score is calculated based on:
     FEV1 percentage (0 to 3 points)
     Distance walked (0 to 3 points)
     Dyspnea level (0 to 3 points)
     BMI (0 or 1 point)
    >>> bode_score(20,65,'SLOWER THAN PEERS',350)
    2
    >>> bode_score(25,80,'BREATHLESS WHEN DRESSING',150)
    5
    """
    if bmi <= 0:
        raise ValueError("bmi must be greater than 0")

    # BMI: 0 points above 21, otherwise 1 point
    if bmi > 21:
        bmi_score = 0
    else:
        bmi_score = 1

    # FEV1 percentage: lower values score more points
    if fev_pct >= 65:
        fev_score = 0
    elif fev_pct >= 50:
        fev_score = 1
    elif fev_pct >= 36:
        fev_score = 2
    else:
        fev_score = 3

    # Dyspnea: any description not listed below scores 0
    if dyspnea_description in ["STOPS WHEN WALKING AT PACE", "SLOWER THAN PEERS"]:
        dyspnea_score = 1
    elif dyspnea_description in ["STOPS AFTER 100 YARDS", "STOPS AFTER A FEW MINUTES"]:
        dyspnea_score = 2
    elif dyspnea_description in ["BREATHLESS WHEN DRESSING", "UNABLE TO LEAVE HOME"]:
        dyspnea_score = 3
    else:
        dyspnea_score = 0

    # Distance walked in meters: shorter distances score more points
    if distance_in_meters >= 350:
        distance_score = 0
    elif distance_in_meters >= 250:
        distance_score = 1
    elif distance_in_meters >= 150:
        distance_score = 2
    else:
        distance_score = 3

    # Use a distinct local name so the function name bode_score is not shadowed
    total_score = bmi_score + fev_score + dyspnea_score + distance_score

    return total_score


In [25]:
import doctest
doctest.run_docstring_examples(bode_score,globals(),verbose=True)

Finding tests in NoName
Trying:
    bode_score(20,65,'SLOWER THAN PEERS',350)
Expecting:
    2
ok
Trying:
    bode_score(25,80,'BREATHLESS WHEN DRESSING',150)
Expecting:
    5
ok


In [26]:
bode_score(0,70,"STOPS AFTER 100 YARDS",350)

ValueError: bmi must be greater than 0

In [27]:
bode_score(20,43,"UNABLE TO LEAVE HOME",140)

9
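
The FEV1 and distance bands in `bode_score` follow the same pattern: one point for every cutoff the value falls below. The sketch below shows that idea in table-driven form. The names `band_points`, `FEV_CUTOFFS`, and `DISTANCE_CUTOFFS` are illustrative and not part of the assignment; the cutoffs are copied from the function above.

```python
# Illustrative helper (not part of the assignment): table-driven banding.
FEV_CUTOFFS = (65, 50, 36)          # FEV1 percentage cutoffs used in bode_score
DISTANCE_CUTOFFS = (350, 250, 150)  # distance cutoffs in meters used in bode_score


def band_points(value, cutoffs):
    """Return one point for every cutoff that value falls below."""
    return sum(1 for cutoff in cutoffs if value < cutoff)


print(band_points(65, FEV_CUTOFFS))        # 0
print(band_points(43, FEV_CUTOFFS))        # 2
print(band_points(140, DISTANCE_CUTOFFS))  # 3
```

The last two calls reproduce the FEV1 and distance components of the `bode_score(20,43,"UNABLE TO LEAVE HOME",140)` example above.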

### Step 3: Calculate BODE Risk

In [35]:
def bode_risk(bode_score):
    """
    Calculate the 4-year survival rate (in percent) based on the BODE score.

    >>> bode_risk(2)
    80
    >>> bode_risk(4)
    67
    >>> bode_risk(7)
    18
    """
    if bode_score < 0 or bode_score > 10:
        raise ValueError("INVALID BODE SCORE")
    if bode_score <= 2:
        return 80  # 80% survival rate for BODE scores 0-2
    elif bode_score <= 4:
        return 67  # 67% survival rate for BODE scores 3-4
    elif bode_score <= 6:
        return 57  # 57% survival rate for BODE scores 5-6
    else:
        return 18  # 18% survival rate for BODE scores 7-10

In [36]:
import doctest
doctest.run_docstring_examples(bode_risk,globals(),verbose=True)

Finding tests in NoName
Trying:
    bode_risk(2)
Expecting:
    80
ok
Trying:
    bode_risk(4)
Expecting:
    67
ok
Trying:
    bode_risk(7)
Expecting:
    18
ok


In [37]:
bode_risk(-3)

ValueError: INVALID BODE SCORE

In [38]:
bode_risk(200)

ValueError: INVALID BODE SCORE

In [39]:
bode_risk(5)

57

### Step 4: Load Hospital Data

In [49]:
def load_hospital_data(hospital_json_file):
    """
    Load hospital data from a JSON file.

    >>> load_hospital_data("/content/hospitals.json")  # doctest: +NORMALIZE_WHITESPACE
    {'BJC': 2000, 'BJC WEST COUNTY': 1000, 'MISSOURI BAPTIST': 800,
     'SAINT LOUIS UNIVERSITY': 1000, "ST.MARY'S": 500,
     "ST.LUKE'S": 800}
    """

    hospital_beds = {}

    with open(hospital_json_file, 'r') as f:
        hospital_data = json.load(f)

    for system in hospital_data:
        for hospital in system['hospitals']:
            hospital_beds[hospital['name']] = hospital['beds']

    return hospital_beds
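
`load_hospital_data` assumes the JSON file is a list of hospital systems, each with a `hospitals` list whose entries carry `name` and `beds`. The sketch below flattens the same shape from an in-memory string so the logic can be checked without a file. The sample names and bed counts are made up, the `system` key is an assumption about the file, and `flatten_beds` is a hypothetical helper.

```python
import json

# Made-up sample in the structure load_hospital_data expects.
# The "system" key is an assumption; only "hospitals", "name", and "beds" are read.
SAMPLE_JSON = """
[
  {"system": "SYSTEM A", "hospitals": [{"name": "HOSPITAL ONE", "beds": 100},
                                       {"name": "HOSPITAL TWO", "beds": 50}]},
  {"system": "SYSTEM B", "hospitals": [{"name": "HOSPITAL THREE", "beds": 75}]}
]
"""


def flatten_beds(systems):
    """Map each hospital name to its bed count across all systems."""
    return {
        hospital["name"]: hospital["beds"]
        for system in systems
        for hospital in system["hospitals"]
    }


beds_by_hospital = flatten_beds(json.loads(SAMPLE_JSON))
print(beds_by_hospital)
# {'HOSPITAL ONE': 100, 'HOSPITAL TWO': 50, 'HOSPITAL THREE': 75}
```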

### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.
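
As a small worked example of the per-hospital summary, the sketch below groups a few made-up `(hospital, score, risk)` tuples and computes the case count and both averages. The hospital names and numbers are illustrative only, and `summarize` is a hypothetical helper, not the assignment's `main`.

```python
def summarize(results):
    """Return {hospital: (count, avg_score, avg_risk)} from result tuples."""
    grouped = {}
    for hospital, score, risk in results:
        grouped.setdefault(hospital, []).append((score, risk))

    summary = {}
    for hospital, rows in grouped.items():
        count = len(rows)
        avg_score = round(sum(score for score, _ in rows) / count, 2)
        avg_risk = round(sum(risk for _, risk in rows) / count, 2)
        summary[hospital] = (count, avg_score, avg_risk)
    return summary


# Made-up results: (hospital, BODE score, survival percentage)
sample = [("HOSPITAL ONE", 2, 80), ("HOSPITAL ONE", 5, 57), ("HOSPITAL TWO", 9, 18)]
print(summarize(sample))
# {'HOSPITAL ONE': (2, 3.5, 68.5), 'HOSPITAL TWO': (1, 9.0, 18.0)}
```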

In [50]:
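def main(patient_csv_file, hospital_json_file, patient_output_file, hospital_output_file):
    """
    Score every patient in patient_csv_file and write two CSV files:
    per-patient results and per-hospital summaries.

    Rows with invalid values are reported and skipped. A missing patient
    file or a missing column raises ValueError.
    """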
    hospitals = load_hospital_data(hospital_json_file)
    hospital_stats = {hospital: {'bode_scores': [], 'bode_risks': [], 'copd_count': 0} for hospital in hospitals}
    patient_results = []

    try:
        with open(patient_csv_file, newline='') as csvfile:
            reader = csv.DictReader(csvfile)
            for row in reader:
                try:
                    name = row['NAME']
                    weight = float(row['WEIGHT_KG'])
                    height = float(row['HEIGHT_M'])
                    fev_pct = float(row['fev_pct'])
                    dyspnea = row['dyspnea_description']
                    distance = float(row['distance_in_meters'])
                    hospital = row['hospital']

                    bmi_value = bmi(weight, height)
                    bode_score_value = bode_score(bmi_value, fev_pct, dyspnea, distance)
                    bode_risk_value = bode_risk(bode_score_value)

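                    # Report an unknown hospital as bad data instead of a missing column
                    if hospital not in hospital_stats:
                        raise ValueError(f"Unknown hospital: {hospital}")

                    patient_results.append([name, bode_score_value, bode_risk_value, hospital])
                    hospital_stats[hospital]['bode_scores'].append(bode_score_value)
                    hospital_stats[hospital]['bode_risks'].append(bode_risk_value)
                    hospital_stats[hospital]['copd_count'] += 1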

                except ValueError as e:
                    print(f"Error processing patient {name}: {e}")

    except FileNotFoundError:
        raise ValueError("Patient file not found.")
    except KeyError as e:
        raise ValueError(f"Missing column in patient data: {e}")

    hospital_output_list = [['HOSPITAL_NAME', 'COPD_COUNT', 'PCT_OF_COPD_CASES_OVER_BEDS', 'AVG_SCORE', 'AVG_RISK']]
    for hospital, stats in hospital_stats.items():
        copd_count = stats['copd_count']
        beds = hospitals[hospital]
        avg_bode_score = sum(stats['bode_scores']) / copd_count if copd_count > 0 else 0
        avg_bode_risk = sum(stats['bode_risks']) / copd_count if copd_count > 0 else 0
        pct_copd_cases_over_beds = (copd_count / beds) * 100 if beds > 0 else 0
        hospital_output_list.append([hospital, copd_count, round(pct_copd_cases_over_beds, 2), round(avg_bode_score, 2), round(avg_bode_risk, 2)])

    with open(patient_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
        writer.writerows(patient_results)

    with open(hospital_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows(hospital_output_list)

main('/content/patient.csv', '/content/hospitals.json', '/content/patient_output.csv', '/content/hospital_output.csv')