### Mid-term for HDS5210

Your supervisor is concerned about 4-year survival risks for COPD. She has asked you to do some analysis using a new metric, BODE. BODE is an improvement on a previous metric and promises to provide insight into survival risks.

BODE is defined here: https://www.mdcalc.com/calc/3916/bode-index-copd-survival#evidence

Your assignment is to create a BODE calculation and use it to calculate BODE scores and BODE survival rates for a group of patients. Then we want to evaluate the average BODE scores and BODE survival rates for each area hospital.

Your patient input file will have the following columns:
NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,dyspnea_description,distance_in_meters,hospital
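
Since the header mixes upper-case and lower-case column names, it may help to see how `csv.DictReader` exposes them. The following is a minimal sketch, not the real data: the sample row is hypothetical (the patient file is not shown here), and in particular the numeric value in `dyspnea_description` is only a placeholder.

```python
import csv
import io

# Hypothetical sample row; the real patient file is not shown in this notebook.
sample = (
    "NAME,SSN,LANGUAGE,JOB,HEIGHT_M,WEIGHT_KG,fev_pct,"
    "dyspnea_description,distance_in_meters,hospital\n"
    "Jane Doe,000-00-0000,English,Nurse,1.60,60,55,2,300,BJH\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))
first = rows[0]

# DictReader keys are case-sensitive and every value arrives as a string.
height_m = float(first["HEIGHT_M"])
fev_pct = float(first["fev_pct"])
print(first["NAME"], height_m, fev_pct, first["hospital"])
```

Looking up `first["FEV_PCT"]` would raise `KeyError`, because the column is spelled `fev_pct` in the header.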

BODE calculations require a BMI value, so you will have to create a function for it.

Your output should be in the form of two CSV files, patient_output.csv and hospital_output.csv.

Patient_output will have the following columns:
NAME,BODE_SCORE,BODE_RISK,HOSPITAL

Hospital output will have the following columns:
HOSPITAL_NAME,COPD_COUNT,PCT_OF_COPD_CASES_OVER_BEDS,AVG_SCORE,AVG_RISK
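
As a rough illustration of producing that header with the `csv` module, the sketch below writes one row to an in-memory buffer instead of a file. The hospital name and all numbers are hypothetical placeholders.

```python
import csv
import io

FIELDS = ["HOSPITAL_NAME", "COPD_COUNT", "PCT_OF_COPD_CASES_OVER_BEDS",
          "AVG_SCORE", "AVG_RISK"]

# Hypothetical values, used only to show the layout of the file.
rows = [{
    "HOSPITAL_NAME": "Example Hospital",
    "COPD_COUNT": 2,
    "PCT_OF_COPD_CASES_OVER_BEDS": 0.2,
    "AVG_SCORE": 4.5,
    "AVG_RISK": 62.0,
}]

buffer = io.StringIO()  # stands in for open("hospital_output.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)

lines = buffer.getvalue().splitlines()
print(lines[0])
print(lines[1])
```

`DictWriter` raises `ValueError` if a row contains a key that is not in `fieldnames`, so the row keys and the header have to agree exactly.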

Each function you create should have documentation and a suitable number of test cases. If the input data could be wrong, make sure to raise a ValueError.
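
One way to cover the error path in a doctest is to write the expected traceback into the docstring. The sketch below uses a hypothetical helper, `checked_bmi`, that exists only to show the syntax; it is not one of the assignment's functions.

```python
import doctest

def checked_bmi(height_m, weight_kg):
    """
    Hypothetical BMI helper used only to illustrate doctest syntax.

    >>> round(checked_bmi(1.80, 70), 1)
    21.6
    >>> checked_bmi(0, 70)
    Traceback (most recent call last):
        ...
    ValueError: height must be positive
    """
    if height_m <= 0:
        raise ValueError("height must be positive")
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return weight_kg / height_m ** 2

# Parse the docstring into a DocTest and run it, so the result can be inspected.
test = doctest.DocTestParser().get_doctest(
    checked_bmi.__doc__, {"checked_bmi": checked_bmi}, "checked_bmi", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

Doctest matches only the first and last lines of the expected traceback, so the stack frames in between can be replaced with `...`.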

For this assignment, use the doctest, json, and csv libraries. Pandas is not allowed for this assignment.

In [80]:
import doctest
import json
import csv

### Step 1: Calculate BMI

In [81]:
def calculate_bmi(height_m, weight_kg):
    """
    Calculate BMI using height in meters and weight in kilograms.
    Raises ValueError if height or weight is zero or negative.
    These are the tests for the function:
    >>> calculate_bmi(1.80, 70)
    21.604938271604937
    >>> calculate_bmi(1.60, 60)
    23.437499999999996
    >>> calculate_bmi(1.5, 50)
    22.22222222222222
    >>> calculate_bmi(1.7, 80)
    27.68166089965398

    """
    if height_m <= 0:
        raise ValueError("height cannot be zero or negative")
    if weight_kg <= 0:
        raise ValueError("weight cannot be zero or negative")

    # BMI = weight in kg divided by the square of height in m
    return weight_kg / (height_m ** 2)

In [82]:
assert round(calculate_bmi(1.80, 70),1) == 21.6
assert round(calculate_bmi(1.60, 60),1) == 23.4
assert round(calculate_bmi(1.5, 50),1) == 22.2
assert round(calculate_bmi(1.7, 80),1) == 27.7

In [83]:
doctest.run_docstring_examples(calculate_bmi, globals(), verbose=True)

Finding tests in NoName
Trying:
    calculate_bmi(1.80, 70)
Expecting:
    21.604938271604937
ok
Trying:
    calculate_bmi(1.60, 60)
Expecting:
    23.437499999999996
ok
Trying:
    calculate_bmi(1.5, 50)
Expecting:
    22.22222222222222
ok
Trying:
    calculate_bmi(1.7, 80)
Expecting:
    27.68166089965398
ok
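
The doctests above compare against the full float representation, for example `23.437499999999996` rather than `23.4375`. As a sketch of a less brittle check, `math.isclose` or rounding can be used; the `bmi` function here is a hypothetical stand-in with the same formula.

```python
import math

def bmi(height_m, weight_kg):
    """Hypothetical stand-in using the same formula as calculate_bmi."""
    return weight_kg / (height_m ** 2)

value = bmi(1.60, 60)

# Compare with a tolerance instead of matching every digit.
close_enough = math.isclose(value, 23.4375, rel_tol=1e-9)

# Or round before comparing, as the assert cell does.
rounded = round(value, 1)
print(close_enough, rounded)
```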


### Step 2: Calculate BODE Score

In [84]:
def calculate_bode_score(fev_pct, bmi, dyspnea_description, distance_in_meters):
  """
  Calculate the BODE score from fev_pct, bmi, dyspnea grade and 6-minute walk distance_in_meters.
  fev_pct: float
  bmi: float
  dyspnea_description: int grade from 1 to 4; any other value raises ValueError
  distance_in_meters: int
  fev_pct scale
  >=65: 0
  50–64: 1
  36–49: 2
  <=35: 3
  bmi scale (BMI never scores 2 or 3)
  >21: 0
  <=21: 1
  dyspnea scale
  1: 0
  2: 1
  3: 2
  4: 3
  Distance scale:
  >= 350m: 0
  250-349m: 1
  150-249m: 2
  <=149m: 3
  test cases:
  >>> calculate_bode_score (25, 80, 1, 249)
  5
  >>> calculate_bode_score (36, 80, 2, 349)
  4
  >>> calculate_bode_score (65, 80, 3, 250)
  3
  >>> calculate_bode_score (65, 70, 4, 50)
  6
  >>> calculate_bode_score (25, 60, 1, 0)
  6
  >>> calculate_bode_score (25, 80, 1, -100)
  6

  """
  # Each band is bounded only from below, so fractional values such as 64.5
  # fall into the next band down instead of slipping through a gap.
  if fev_pct >= 65:
    fev_pct_score = 0
  elif fev_pct >= 50:
    fev_pct_score = 1
  elif fev_pct >= 36:
    fev_pct_score = 2
  else:
    fev_pct_score = 3

  if dyspnea_description == 1:
    dyspnea_score = 0
  elif dyspnea_description == 2:
    dyspnea_score = 1
  elif dyspnea_description == 3:
    dyspnea_score = 2
  elif dyspnea_description == 4:
    dyspnea_score = 3
  else:  # anything else, such as None or a text description, is rejected
    raise ValueError("dyspnea_description cannot be none")

  # Zero and negative distances land in the lowest band, as the test cases expect.
  if distance_in_meters >= 350:
    distance_score = 0
  elif distance_in_meters >= 250:
    distance_score = 1
  elif distance_in_meters >= 150:
    distance_score = 2
  else:
    distance_score = 3

  # BMI contributes at most one point.
  if bmi > 21:
    bmi_score = 0
  else:
    bmi_score = 1

  bode_score = fev_pct_score + bmi_score + dyspnea_score + distance_score
  return bode_score
try:
  calculate_bode_score (25, 80, "normal", 249)
except ValueError as e:
  print(e)

dyspnea_description cannot be none


In [85]:
assert calculate_bode_score (25, 80, 1, 249) == 5
assert calculate_bode_score (36, 80, 2, 349) == 4
assert calculate_bode_score (65, 80, 3, 250) == 3
assert calculate_bode_score (65, 70, 4, 50) == 6
assert calculate_bode_score (25, 60, 1, 0) == 6
assert calculate_bode_score (25, 80, 1, -100) == 6

In [86]:
doctest.run_docstring_examples(calculate_bode_score, globals(), verbose=True)

Finding tests in NoName
Trying:
    calculate_bode_score (25, 80, 1, 249)
Expecting:
    5
ok
Trying:
    calculate_bode_score (36, 80, 2, 349)
Expecting:
    4
ok
Trying:
    calculate_bode_score (65, 80, 3, 250)
Expecting:
    3
ok
Trying:
    calculate_bode_score (65, 70, 4, 50)
Expecting:
    6
ok
Trying:
    calculate_bode_score (25, 60, 1, 0)
Expecting:
    6
ok
Trying:
    calculate_bode_score (25, 80, 1, -100)
Expecting:
    6
ok
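
The FEV and distance scales in the docstring follow the same pattern: a list of lower bounds, each worth one more point than the one above it. As an alternative sketch (not the notebook's implementation), the bands can be stored as data and scored by one hypothetical helper, `points_from_bounds`.

```python
# Lower bounds for 0, 1 and 2 points; anything below the last bound scores 3.
FEV_PCT_BOUNDS = (65, 50, 36)
DISTANCE_BOUNDS = (350, 250, 150)

def points_from_bounds(value, lower_bounds):
    """Return the index of the first lower bound that value reaches (hypothetical helper)."""
    for points, bound in enumerate(lower_bounds):
        if value >= bound:
            return points
    return len(lower_bounds)

print(points_from_bounds(64.5, FEV_PCT_BOUNDS))   # 1
print(points_from_bounds(249, DISTANCE_BOUNDS))   # 2
```

Because each band is defined only by its lower bound, a fractional reading such as 64.5 still lands in a band, which closed ranges like `50 <= x <= 64` would miss.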


### Step 3: Calculate BODE Risk

In [87]:
def calculate_bode_risk(bode_score):

  """
  Look up the BODE risk for a bode_score.
  Returns the approximate 4-year survival rate as a percentage.
  bode_score: int from 0 to 10; any other value raises ValueError
  TEST CASES:
  >>> calculate_bode_risk(1)
  80
  >>> calculate_bode_risk(4)
  67
  >>> calculate_bode_risk(5)
  57
  >>> calculate_bode_risk(9)
  18
  """
  if bode_score < 0:
    raise ValueError("BODE score cannot be negative")

  if 0 <= bode_score <= 2:
    return 80
  elif 3 <= bode_score <= 4:
    return 67
  elif 5 <= bode_score <= 6:
    return 57
  elif 7 <= bode_score <= 10:
    return 18
  else:
    raise ValueError("BODE score out of range")


In [88]:
assert calculate_bode_risk(1) == 80
assert calculate_bode_risk(4) == 67
assert calculate_bode_risk(5) == 57
assert calculate_bode_risk(9) == 18

In [89]:
doctest.run_docstring_examples(calculate_bode_risk, globals(), verbose=True)

Finding tests in NoName
Trying:
    calculate_bode_risk(1)
Expecting:
    80
ok
Trying:
    calculate_bode_risk(4)
Expecting:
    67
ok
Trying:
    calculate_bode_risk(5)
Expecting:
    57
ok
Trying:
    calculate_bode_risk(9)
Expecting:
    18
ok


### Step 4: Load Hospital Data

In [90]:
import json
from pathlib import Path
hospital_path = Path("hospitals.json")

with open(hospital_path) as f:
    hospitals = json.load(f)

print("Hospital Data loaded successfully")
print(hospitals)

Hospital Data loaded successfully
[{'System': 'BJC', 'Hospitals': [{'Beds': 1432, 'Hospital': 'BJH', 'City': 'St. Louis'}, {'Beds': 1107, 'Hospital': 'MOBap', 'City': 'Creve Coeur'}]}, {'System': 'SSM', 'Hospitals': [{'Beds': 965, 'Hospital': 'SLUH', 'City': 'St. Louis'}]}, {'System': 'Mercy', 'Hospitals': [{'Beds': 983, 'Hospital': 'Mercy STL', 'City': 'Creve Coeur'}]}]
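
The loaded structure is a list of systems, each holding a nested list of hospitals. A short sketch of flattening it into a hospital-to-beds lookup, using the data printed above; the name `beds_by_hospital` is just an illustrative choice.

```python
# The structure printed above, copied in so the sketch runs on its own.
hospitals = [
    {'System': 'BJC', 'Hospitals': [
        {'Beds': 1432, 'Hospital': 'BJH', 'City': 'St. Louis'},
        {'Beds': 1107, 'Hospital': 'MOBap', 'City': 'Creve Coeur'}]},
    {'System': 'SSM', 'Hospitals': [
        {'Beds': 965, 'Hospital': 'SLUH', 'City': 'St. Louis'}]},
    {'System': 'Mercy', 'Hospitals': [
        {'Beds': 983, 'Hospital': 'Mercy STL', 'City': 'Creve Coeur'}]},
]

# Walk systems, then the hospitals inside each one. Keys are capitalised.
beds_by_hospital = {
    hospital['Hospital']: hospital['Beds']
    for system in hospitals
    for hospital in system['Hospitals']
}
print(beds_by_hospital)
```

Note that the keys are `'Hospitals'`, `'Hospital'` and `'Beds'` with capital letters; lower-case lookups such as `'beds'` would not match.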


### Step 5: Main business logic

Call BODE Score, BODE Risk functions for each patient.

For each hospital, calculate Avg BODE score and Avg BODE risk and count the number of cases for each hospital.
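
The per-hospital averages can be built by accumulating totals in a plain dictionary and dividing at the end. This is a minimal sketch with hypothetical patient records; the field names are assumptions made for the example.

```python
# Hypothetical scored patients, used only to illustrate the aggregation.
patients = [
    {"hospital": "BJH", "bode_score": 5, "bode_risk": 57},
    {"hospital": "BJH", "bode_score": 3, "bode_risk": 67},
    {"hospital": "SLUH", "bode_score": 1, "bode_risk": 80},
]

totals = {}
for patient in patients:
    # Create the running totals the first time a hospital is seen.
    entry = totals.setdefault(patient["hospital"], {"count": 0, "score": 0, "risk": 0})
    entry["count"] += 1
    entry["score"] += patient["bode_score"]
    entry["risk"] += patient["bode_risk"]

averages = {
    name: {
        "COPD_COUNT": entry["count"],
        "AVG_SCORE": entry["score"] / entry["count"],
        "AVG_RISK": entry["risk"] / entry["count"],
    }
    for name, entry in totals.items()
}
print(averages["BJH"])
```

Only hospitals that appear in the patient list end up in `totals`, so the division never sees a count of zero.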

In [91]:
import csv
import json

patient_data = []
hospital_data = []
patient_csv = "patient.csv"
hospital_json = "hospitals.json"

patient_output_file = "patient_output.csv"
hospital_output_file = "hospital_output.csv"

###

def load_hospital_data(file_path):
    """Load the list of hospital systems from a JSON file."""
    with open(file_path, 'r') as file:
        return json.load(file)

def processing_hospital_data(patient_data, hospital_data):
    """
    Aggregate patient results per hospital.
    patient_data: list of dicts with 'system', 'bode_score' and 'bode_risk' keys
    hospital_data: list of systems as loaded from hospitals.json
    Returns one dict per hospital with at least one patient, holding the case count,
    cases as a percentage of beds, average BODE score and average BODE risk.
    """
    hospital_stats = {}
    for system_data in hospital_data:  # iterates through each system
        # then through each hospital in it; keys are capitalised in hospitals.json
        for hospital in system_data.get('Hospitals', []):
            hospital_stats[hospital['Hospital']] = {
                'copd_count': 0, 'total_score': 0, 'total_risk': 0,
                'beds': hospital.get('Beds', 0)
            }

    for patient in patient_data:
        hospital_name = patient['system']
        # hospitals missing from the JSON file start with default values and no bed count
        if hospital_name not in hospital_stats:
            hospital_stats[hospital_name] = {'copd_count': 0, 'total_score': 0, 'total_risk': 0, 'beds': 0}

        hospital_stats[hospital_name]['copd_count'] += 1
        hospital_stats[hospital_name]['total_score'] += patient['bode_score']
        hospital_stats[hospital_name]['total_risk'] += patient['bode_risk']

    results = []
    for hospital_name, stats in hospital_stats.items():
        if stats['copd_count'] > 0:
            avg_score = stats['total_score'] / stats['copd_count']
            avg_risk = stats['total_risk'] / stats['copd_count']
            # leave the percentage blank when the bed count is unknown, which also avoids division by zero
            pct_of_cases_over_beds = 100 * stats['copd_count'] / stats['beds'] if stats['beds'] else None
            results.append({
                'HOSPITAL_NAME': hospital_name,
                'COPD_COUNT': stats['copd_count'],
                'PCT_OF_CASES_OVER_BEDS': pct_of_cases_over_beds,
                'AVG_SCORE': avg_score,
                'AVG_RISK': avg_risk
            })
    return results

def main(patient_csv, hospital_json):
    """
    Score every patient in patient_csv, then write patient_output.csv and hospital_output.csv.
    Rows with invalid values are reported and skipped.
    """
    hospital_data = load_hospital_data(hospital_json)
    patient_data = []

    with open(patient_csv, 'r', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            try:
                name = row['NAME']
                height_m = float(row['HEIGHT_M'])
                weight_kg = float(row['WEIGHT_KG'])
                fev_pct = float(row['fev_pct'])  # this header is lowercase in the input file
                # calculate_bode_score compares against the int grades 1 to 4, so numeric
                # text is converted; any other text is passed through unchanged
                dyspnea_description = row['dyspnea_description'].strip()
                if dyspnea_description.isdigit():
                    dyspnea_description = int(dyspnea_description)
                distance_in_meters = float(row['distance_in_meters'])
                system = row['hospital']

                bmi = calculate_bmi(height_m, weight_kg)
                bode_score = calculate_bode_score(fev_pct, bmi, dyspnea_description, distance_in_meters)
                bode_risk = calculate_bode_risk(bode_score)

                patient_data.append({
                    'name': name,
                    'bode_score': bode_score,
                    'bode_risk': bode_risk,
                    'system': system
                })

            except ValueError as e:
                print(f"Error processing patient data: {e}")
                continue

    # Write patient_output.csv with the column names the assignment specifies
    with open(patient_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['NAME', 'BODE_SCORE', 'BODE_RISK', 'HOSPITAL'])
        for patient in patient_data:
            writer.writerow([patient['name'], patient['bode_score'], patient['bode_risk'], patient['system']])
    print("Successfully created Patient_output.csv")

    hospital_results = processing_hospital_data(patient_data, hospital_data)

    # Write hospital_output.csv with the column names the assignment specifies
    with open(hospital_output_file, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['HOSPITAL_NAME', 'COPD_COUNT', 'PCT_OF_COPD_CASES_OVER_BEDS', 'AVG_SCORE', 'AVG_RISK'])
        for result in hospital_results:
            writer.writerow([result['HOSPITAL_NAME'], result['COPD_COUNT'], result['PCT_OF_CASES_OVER_BEDS'],
                             result['AVG_SCORE'], result['AVG_RISK']])
    print("Successfully created Hospital_output.csv")


###
if __name__ == "__main__":
    main(patient_csv, hospital_json)


Successfully created Patient_output.csv
Successfully created Hospital_output.csv
