# HDS5210-2020 Midterm

In the midterm, you're going to focus on using the programming skills that you've developed so far to build a calculator for the Apache II scoring system for ICU Mortality.  
* https://www.mdcalc.com/apache-ii-score#evidence
* https://reference.medscape.com/calculator/apache-ii-scoring-system

For the midterm, we'll be building a calculator for the Apache II score and then running that against a patient file that's available to you out on the internet.  This will be broken down into three main steps:
1. Create your JSON file to encapsulate all of the calculation rules for Apache II
2. Create functions to calculate the Apache II score using your JSON configuration
3. Create a function to loop over the patients in a file on the internet and calculate Apach II scores for all of them



---

## Part 1: Creating a JSON Rules File

Look at the rules for the Apache II scoring system on the pages above.  The first step in the midterm is to use those rules and create a JSON configuration file as described in the 2019 midterm video.  I've provided a starter file named `apache.json` to get you started.

Inside that file, you'll find placeholders for all of the measures that go into the Apache II scoring model:
* Organ Failure History
* Age
* Temperature
* [pH](https://en.wikipedia.org/wiki/PH)
* Heart rate
* Respiratory rate
* [Sodium](https://www.mayoclinic.org/diseases-conditions/hyponatremia/symptoms-causes/syc-20373711)
* [Potassium](https://www.emedicinehealth.com/hyperkalemia/article_em.htm)
* [Creatinine](https://www.medicalnewstoday.com/articles/322380)
* [Hematocrit](https://labtestsonline.org/tests/hematocrit)
* White Blood Count
* [FiO2](https://www.ausmed.com/cpd/articles/oxygen-flow-rate-and-fio2)
* [PaO2](https://www.verywellhealth.com/partial-pressure-of-oyxgen-pa02-914920)
* [A-a gradient](https://www.ncbi.nlm.nih.gov/books/NBK545153/)


You may need to create a sort of nested set of rules in some cases.  For instance, the rule for Creatinine says to use certain ranges and points in the case of Acute Renal Failure and a different set of points for Chronic Renal Failure.

Similarly, the rule for FiO2 says to use PaO2 to calculate scores if the FiO2 is <50, and to use A-a Gradient if the PaO2 is >50.

When you've created your `apache.json` file, make sure it's in the same directory as this notebook.

### Testing your JSON

The assert() functions below should all run just fine.  If you want to change the names of any of the keys in the JSON I provided you, you may, but you'll also need to update this test code so that it doesn't fail.  Remember, your notebook should be able to run end-to-end before you submit it.

In [1]:
import json

with open("apache_mine.json") as f:
    rules = json.load(f)
assert('Organ Failure History' in rules.keys())
assert('Age' in rules.keys())
assert('Temperature' in rules.keys())
assert('pH' in rules.keys())
assert('Heart Rate' in rules.keys())
assert('Respiratory Rate' in rules.keys())
assert('Sodium' in rules.keys())
assert('Potassium' in rules.keys())
assert('Creatinine' in rules.keys())
assert('Hematocrit' in rules.keys())
assert('White Blood Count' in rules.keys())
assert('FiO2' in rules.keys())

In [2]:
rules.keys()

dict_keys(['Organ Failure History', 'Age', 'Temperature', 'pressure', 'Heart Rate', 'Respiratory Rate', 'FiO2', 'pH', 'Sodium', 'Potassium', 'Creatinine', 'Hematocrit', 'White Blood Count', 'Glasgow Coma Scale'])

---

## Part 2: Functions to evaluate rules

Write a series of functions, enough to satisfy all of the main criteria that we're using to calculate the Apache II score.  That list is the same as the assert statements above.

* Each of your functions should be well documented.
* Each function should have "config_file" as one of it's parameters.
* Each function should return a numerical score value.
* Similar to what we discussed in the review, if you can generalize some rules, do so.  You should **NOT** end up with one function for each input variable.  If you did that, you'd have a lot of repetative code.

The Glasgow Coma Scale is simply a 1-to-1 score translation.  Simply add the Glasgow Coma Scale value.  So, you don't need to write a function for this. [Glasgow Coma Scale](https://www.cdc.gov/masstrauma/resources/gcs.pdf)

**CORRECTION ADDED 2/29** - The Glasgow Coma Scale points should be calculated as `1 - Glasgow Coma Scale` rather than what I just stated above.  My preference would be that you do the calculation correctly, as per MDCalc, and then use the **corrected** scores files to compare against as noted in Part 4.

### Tesing Score for age, temperature, pressure, heart_rate, and respiratory_rate

In [73]:
#Age function
import json
def age_score(age, config_file):
    config = json.load(open(config_file))
    age_scores = config.get("Age")
    for rule in age_scores:
        if age >= rule.get('min') and age < rule.get('max'):
            age_score = rule.get('points')
    return (age_score)

In [126]:
#Temperature function
import json
def temperature_score(temperature, config_file):
    config = json.load(open(config_file))
    temp_score = 0
    temperature_scores = config.get("Temperature")
    for rule in temperature_scores:
        if temperature >= rule.get('min') and temperature < rule.get('max'):
            temp_score = rule.get('points')
    return (temp_score)

In [127]:
#Pressure function
import json
def pressure_score(pressure, config_file):
    config = json.load(open(config_file))
    pressure_scores = config.get("pressure")
    for rule in pressure_scores:
        if pressure >= rule.get('min') and pressure < rule.get('max'):
            pressure_score = rule.get('points')
    return (pressure_score)

In [128]:
#Heart Rate function
import json
def heartrate_score(heart_rate, config_file):
    config = json.load(open(config_file))
    Heart_Rate_scores = config.get("Heart Rate")
    for rule in Heart_Rate_scores:
        if heart_rate >= rule.get('min') and heart_rate < rule.get('max'):
            Heart_Rate_score = rule.get('points')
    return (Heart_Rate_score)

In [129]:
# Respiratory rate function
import json
def respiratoryrate_score(respiratory_rate, config_file):
    config = json.load(open(config_file))
    respiratory_rate_scores = config.get("Respiratory Rate")
    for rule in respiratory_rate_scores:
        if respiratory_rate >= rule.get('min') and respiratory_rate < rule.get('max'):
            respiratory_rate_score = rule.get('points')
    return (respiratory_rate_score)

### Organ Failure

In [130]:
import json
def of_score(parameter, config_file):
    config = json.load(open(config_file))
    of_score = config.get("Organ Failure History")
    return of_score.get(parameter)

### FiO2_PaO2_A-a_gradient Score

In [131]:
import json
def FiO2_PaO2_gradient_score(fio2, pao2, a_gradient, config_file):
    score = 0
    config = json.load(open(config_file))
    FiO2_PaO2_gradient_scores = config.get("FiO2")
    for rule in FiO2_PaO2_gradient_scores:
        if fio2 >= rule.get('min') and fio2 < rule.get('max'):
            if fio2>=0 and fio2 < 50:
                pao2_rules = rule.get('PaO2')
                for pao2_rule in pao2_rules:
                    if pao2 >= pao2_rule.get('min') and pao2 < pao2_rule.get('max'):
                        score = pao2_rule.get('points')
            elif fio2 >= 50:
                gradients_rules = rule.get('A-a gradient')
                for gradients_rule in gradients_rules:
                    if a_gradient >= gradients_rule.get('min') and a_gradient < gradients_rule.get('max'):
                        score = gradients_rule.get('points')
    return (score)

### pH, Sodium, Potassium, Hematocrit, White_Blood_Count, and GCS Score

In [132]:
#pH score
def pH_score(ph, config_file):
    config = json.load(open(config_file))
    ph_scores = config.get("pH")
    for rule in ph_scores:
        if ph >= rule.get('min') and ph < rule.get('max'):
            score = rule.get('points')
    return (score)

In [133]:
#Sodium score
def sodium_score(Sodium, config_file):
    config = json.load(open(config_file))
    Sodium_scores = config.get("Sodium")
    for rule in Sodium_scores:
        if Sodium >= rule.get('min') and Sodium < rule.get('max'):
            Sodium_score = rule.get('points')
    return (Sodium_score)

In [134]:
#Potassium score
def potassium_score(Potassium, config_file):
    config = json.load(open(config_file))
    Potassium_scores = config.get("Potassium")
    for rule in Potassium_scores:
        if Potassium >= rule.get('min') and Potassium < rule.get('max'):
            Potassium_score = rule.get('points')
    return (Potassium_score)

In [135]:
#Hematocrit score
def Hematocrit_score(Hematocrit, config_file):
    config = json.load(open(config_file))
    Hematocrit_scores = config.get("Hematocrit")
    for rule in Hematocrit_scores:
        if Hematocrit >= rule.get('min') and Hematocrit < rule.get('max'):
            Hematocrit_score = rule.get('points')
    return (Hematocrit_score)

In [136]:
#White blood count score
def wbc_score(White_Blood_Count, config_file):
    config = json.load(open(config_file))
    White_Blood_Count_scores = config.get("White Blood Count")
    for rule in White_Blood_Count_scores:
        if White_Blood_Count >= rule.get('min') and White_Blood_Count < rule.get('max'):
            White_Blood_Count_score = rule.get('points')
    return (White_Blood_Count_score)

In [137]:
#Glasgow Coma Scale score
def gcs_score(GCS, config_file):
    config = json.load(open(config_file))
    GSC_scores = config.get("Glasgow Coma Scale")
    return (GSC_scores.get(str(GCS)))    

#### Creatinine Score

In [138]:
import json
def creatinine_score(creatinine, renal_failure, config_file):
    score = 0
    config = json.load(open(config_file))
    creatinine_scores = config.get("Creatinine")
    for rule in creatinine_scores:
        if creatinine >= rule.get('min') and creatinine < rule.get('max') and renal_failure == rule.get('Failure'):
            creatinine_score = rule.get('points')
            score = creatinine_score
    return (score)

## Testing

In [139]:
# Put the name of your configuration file below
EF_CONFIG_FILE = "apache_mine.json"

In [140]:
assert(of_score("Nonoperative",  EF_CONFIG_FILE)==5)
assert(of_score("Emergency",  EF_CONFIG_FILE)==5)
assert(of_score("Elective",  EF_CONFIG_FILE)==2)

In [141]:
assert(age_score(50, EF_CONFIG_FILE)==2)
assert(age_score(66, EF_CONFIG_FILE)==5)
assert(age_score(80, EF_CONFIG_FILE)==6)

In [142]:
assert(temperature_score(40, EF_CONFIG_FILE)==3)
assert(temperature_score(33, EF_CONFIG_FILE)==2)
assert(temperature_score(25, EF_CONFIG_FILE)==4)

In [143]:
assert(pressure_score(70, EF_CONFIG_FILE)==0)
assert(pressure_score(30, EF_CONFIG_FILE)==4)
assert(pressure_score(130, EF_CONFIG_FILE)==3)

In [144]:
assert(heartrate_score(200, EF_CONFIG_FILE)==4)
assert(heartrate_score(100, EF_CONFIG_FILE)==0)
assert(heartrate_score(60, EF_CONFIG_FILE)==2)

In [145]:
assert(respiratoryrate_score(60, EF_CONFIG_FILE)==4)
assert(respiratoryrate_score(11, EF_CONFIG_FILE)==1)
assert(respiratoryrate_score(5, EF_CONFIG_FILE)==4)

In [146]:
assert(FiO2_PaO2_gradient_score(20, 20,50, EF_CONFIG_FILE)==4)
assert(FiO2_PaO2_gradient_score(65, 20,100, EF_CONFIG_FILE)==0)
assert(FiO2_PaO2_gradient_score(90, 20,550, EF_CONFIG_FILE)==4)

In [147]:
assert(pH_score(7.5, EF_CONFIG_FILE)==1)
assert(pH_score(6, EF_CONFIG_FILE)==4)
assert(pH_score(8, EF_CONFIG_FILE)==4)

In [148]:
assert(sodium_score(200, EF_CONFIG_FILE)==4)
assert(sodium_score(140, EF_CONFIG_FILE)==0)
assert(sodium_score(100, EF_CONFIG_FILE)==4)

In [149]:
assert(potassium_score(8, EF_CONFIG_FILE)==4)
assert(potassium_score(3, EF_CONFIG_FILE)==1)
assert(potassium_score(3.5, EF_CONFIG_FILE)==0)

In [150]:
assert(Hematocrit_score(60, EF_CONFIG_FILE)==4)
assert(Hematocrit_score(40, EF_CONFIG_FILE)==0)
assert(Hematocrit_score(10, EF_CONFIG_FILE)==4)

In [151]:
assert(wbc_score(50, EF_CONFIG_FILE)==4)
assert(wbc_score(7, EF_CONFIG_FILE)==0)
assert(wbc_score(2, EF_CONFIG_FILE)==2)

In [152]:
assert(gcs_score(15, EF_CONFIG_FILE)==0)
assert(gcs_score(12, EF_CONFIG_FILE)==3)
assert(gcs_score(1, EF_CONFIG_FILE)==14)

In [153]:
assert(creatinine_score(2, "Acute Renal Failure", EF_CONFIG_FILE)==6)
assert(creatinine_score(3.5, "Acute Renal Failure", EF_CONFIG_FILE)==8)
assert(creatinine_score(2, "Chronic Renal Failure", EF_CONFIG_FILE)==3)

---

## Part 3: Put it all together

Create a new function called `apache_score()` that takes all of the necessary inputs and returns the final Apache II score.  Use any variable names that you want.  For clarity and organization, my recommendation is to create them in the same order as they're documented in the website.

1. Organ Failure History
2. Age
3. Temperature
4. pH 
5. Heart rate
6. Respiratory rate
7. Sodium
8. Potassium
9. Creatinine
10. Acute renal failure
11. Hematocrit
12. White Blood Count
13. Glasgow Coma Scale
14. FiO2
15. PaO2
16. A-a gradient


In [177]:
def apache_score(organ_failure, age, temperature, pressure, pH, heart_rate, respiratory_rate, Sodium, Potassium,creatinine,
                 renal_failure, Hematocrit, White_Blood_Count, GCS, fio2, a_gradient, pao2):
    score = 0
    EF_CONFIG_FILE = "apache_mine.json"
    score += of_score(organ_failure,EF_CONFIG_FILE)
    score += age_score(age,EF_CONFIG_FILE)
    score += temperature_score(temperature,EF_CONFIG_FILE)
    score += pressure_score(pressure,EF_CONFIG_FILE)
    score += pH_score(pH,EF_CONFIG_FILE)
    score += heartrate_score(heart_rate,EF_CONFIG_FILE)
    score += respiratoryrate_score(respiratory_rate,EF_CONFIG_FILE)
    score += sodium_score(Sodium,EF_CONFIG_FILE)
    score += potassium_score(Potassium,EF_CONFIG_FILE)
    score += creatinine_score(creatinine,renal_failure,EF_CONFIG_FILE)
    score += Hematocrit_score(Hematocrit,EF_CONFIG_FILE)
    score += wbc_score(White_Blood_Count,EF_CONFIG_FILE)
    score += gcs_score(GCS,EF_CONFIG_FILE)
    score += FiO2_PaO2_gradient_score(fio2, pao2, a_gradient, EF_CONFIG_FILE)
    return (score)    

### Testing your Function

Write a few test cases to make sure that your code functions correctly.  In the last step, you'll have LOTS of test cases run through, but you should do some of your before moving on.

In [178]:
assert(apache_score("Elective", 65, 100, 100, 7.4, 60, 50, 140, 4, 1.1, "Acute Renal Failure", 40, 4, 5, 33, 100, 40)==27)
assert(apache_score("Nonoperative", 65, 100, 100, 7.4, 60, 50, 140, 4, 1.1, "Acute Renal Failure", 40, 15, 5, 33, 100, 40)==31)
assert(apache_score("Emergency", 65, 107, 100, 7.4, 110, 50, 140, 4, 1.1, "Acute Renal Failure", 50, 15, 5, 33, 100, 40)==33)

---

## Part 4: Accessing and processing the patient file

Fill out the simple function below to retrieve the patient data as a CSV file from any given URL and return a list of all of the Apache II scores based on the data you find for those patients.
* The patient file will be a CSV
* It will have column headers that match the labels shown above
* The columns will not necessarily appear in the order shown above
* You should output only the Apache II scores, not any other information
* Your output should be a list in the same order as the input rows

In [183]:
#Pressure is missing here in this file. Therefore I created another function without having the pressure.
def apache_score_2(organ_failure, age, temperature, pH, heart_rate, respiratory_rate, Sodium, Potassium,creatinine,
                 renal_failure, Hematocrit, White_Blood_Count, GCS, fio2, a_gradient, pao2):
    score = 0
    EF_CONFIG_FILE = "apache_mine.json"
    score += of_score(organ_failure,EF_CONFIG_FILE)
    score += age_score(age,EF_CONFIG_FILE)
    score += temperature_score(temperature,EF_CONFIG_FILE)
    score += pH_score(pH,EF_CONFIG_FILE)
    score += heartrate_score(heart_rate,EF_CONFIG_FILE)
    score += respiratoryrate_score(respiratory_rate,EF_CONFIG_FILE)
    score += sodium_score(Sodium,EF_CONFIG_FILE)
    score += potassium_score(Potassium,EF_CONFIG_FILE)
    score += creatinine_score(creatinine,renal_failure,EF_CONFIG_FILE)
    score += Hematocrit_score(Hematocrit,EF_CONFIG_FILE)
    score += wbc_score(White_Blood_Count,EF_CONFIG_FILE)
    score += gcs_score(GCS,EF_CONFIG_FILE)
    score += FiO2_PaO2_gradient_score(fio2, pao2, a_gradient, EF_CONFIG_FILE)
    return (score)  

In [185]:
#Pressure is missing here in this file. Therefore I created another function without having the pressure.
import pandas as pd
data=pd.read_csv('https://hds5210-2020.s3.amazonaws.com/TestPatients.csv')
for i,row in data.iterrows():
    of = row['Organ Failure History']
    age = row['Age']
    temp = row['Temperature']
    ph = row['pH']
    hrate = row['Heart Rate']
    rrate = row['Respiratory Rate']
    sodium = row['Sodium']
    potassium = row['Potassium']
    creatinine = row['Creatinine']
    rf = row['Acute Renal Failure']
    hematocrit = row['Hematocrit']
    wbc = row['White Blood Count']
    gcs = row['Glasgow Coma Scale']
    fio2 = row['FiO2']
    pao2 = row['PaO2']
    a_gradient = row['A-a Gradient']
    score = apache_score_2(of, age, temp, ph,hrate,rrate,sodium, potassium,creatinine,rf,hematocrit,wbc, gcs, fio2, pao2,a_gradient)
    
    print (row['Patient']+" "+ "Score: "+ str(score))

E3408 Score: 32
E1100 Score: 28
E8001 Score: 43
E4369 Score: 33
E5772 Score: 41
E1743 Score: 32
E3011 Score: 30
E3820 Score: 49
E2640 Score: 39
E3083 Score: 48
E2319 Score: 40
E7093 Score: 42
E1004 Score: 31
E5146 Score: 38
E8495 Score: 40
E3466 Score: 47
E8028 Score: 35
E7273 Score: 37
E5903 Score: 34
E8693 Score: 41
E8540 Score: 37
E3233 Score: 29
E5359 Score: 37
E3431 Score: 28
E7376 Score: 40
E1444 Score: 38
E2803 Score: 34
E7258 Score: 42
E2590 Score: 39
E8414 Score: 44
E5218 Score: 34
E6356 Score: 40
E5713 Score: 39
E7611 Score: 44
E1553 Score: 40
E9992 Score: 37
E9643 Score: 36
E7887 Score: 36
E5641 Score: 38
E1926 Score: 30
E5215 Score: 45
E9252 Score: 29
E9873 Score: 40
E9019 Score: 41
E8522 Score: 34
E5265 Score: 35
E6794 Score: 37
E6137 Score: 37
E1249 Score: 27
E3756 Score: 46
E2902 Score: 49
E3226 Score: 39
E5599 Score: 45
E5487 Score: 33
E8741 Score: 33
E5757 Score: 42
E9219 Score: 41
E5809 Score: 28
E8528 Score: 44
E9437 Score: 43
E9312 Score: 33
E8508 Score: 35
E1458 Sc

### Testing your Function

The URL for the test data is: https://hds5210-2020.s3.amazonaws.com/TestPatients.csv


You can verify your results by comparing them against this data: https://hds5210-2020.s3.amazonaws.com/Scores.csv

**CORRECTION ADDED 3/29** - If you calculated the Glasgow Coma Scale points as per the actual instructions in MDCalc, then please use this set of corrected scores to compare your results with: https://hds5210-2020.s3.amazonaws.com/Scores_corrected.csv
