# HDS5210-2021 Midterm

In the midterm, you're going to focus on using the programming skills that you've developed so far to build a calculator three different risk scores and apply that to a data file. The three calculations you're going to write functions for are: 
* CHA2DS2-VASc Score for Atrial Fibrillation Stroke Risk - [link](https://www.mdcalc.com/cha2ds2-vasc-score-atrial-fibrillation-stroke-risk)
* HEART Score for Major Cardiac Events - [link](https://www.mdcalc.com/heart-score-major-cardiac-events)
* Framingham Risk Score for Hard Coronary Heart Disease - [link](https://www.mdcalc.com/framingham-risk-score-hard-coronary-heart-disease)

In each of the next three parts, you'll be programming a function to calculate each score.  In the last part of the midterm, you'll take those functions and use them to calculate risk scores for a list of patients from a CSV file and select a limited group of patients that match a fourth set of risk assessment criteria.


---

## Part 1: CHA2DS2-VASc

This scoring mechanism for Atrial Fibrillation Stroke uses 7 inputs:
* Age (Number)
* Sex (Male / Female)
* CHF History (True / False)
* Hypertension History (True / False)
* Stroke History (True / False)
* Vascular Disease History (True / False)
* Diabetes History (True / False)

Fill out the function below with logic to calculate the numeric risk score for teh given input.

Be sure to provide meaningful documentation and at least two test cases in your documentation.  Also make sure your code satisfies the test cases provided in the assert statements.

In [1]:
def cha2ds2_vasc(age, sex, chf, hypertension, stroke, vascular, diabetes):
    '''
    (int, str, bool, bool, bool, bool, bool) -> int
    This function takes a variety of patient health metrics to output a score for atrial fibrillation stroke risk
    The function does this but checking each variable against the thresholds shown here:
    https://www.mdcalc.com/cha2ds2-vasc-score-atrial-fibrillation-stroke-risk
    
    The function adds up the overall score as it goes through each of the variables and returns a final score as an integer output.
    '''
    cha2ds2_score = 0
    
    if age < 65:
        cha2ds2_score += 0
    elif age <= 74:
        cha2ds2_score += 1
    else:
        cha2ds2_score += 2
    
    if sex == 'Female':
        cha2ds2_score += 1
    else:
        cha2ds2_score += 0
    
    if chf == True:
        cha2ds2_score += 1
    else:
        cha2ds2_score += 0
    
    if hypertension == True:
        cha2ds2_score += 1
    else:
        cha2ds2_score += 0
        
    if stroke == True:
        cha2ds2_score += 2
    else:
        cha2ds2_score += 0
        
    if vascular == True:
        cha2ds2_score += 1
    else:
        cha2ds2_score += 0
        
    if diabetes == True:
        cha2ds2_score += 1
    else:
        cha2ds2_score += 0
        
    return cha2ds2_score
    
    
    ### YOUR SOLUTION HERE

Testing your code with assertions....

In [2]:
assert cha2ds2_vasc(82,'Male',False,True,True,True,True) == 7
assert cha2ds2_vasc(22,'Male',False,False,True,False,False) == 2
assert cha2ds2_vasc(32,'Female',True,True,True,True,True) == 7
assert cha2ds2_vasc(21,'Female',True,True,True,False,False) == 5
assert cha2ds2_vasc(52,'Female',True,True,False,False,False) == 3
assert cha2ds2_vasc(88,'Male',True,True,True,False,False) == 6
assert cha2ds2_vasc(22,'Male',False,False,True,False,False) == 2
assert cha2ds2_vasc(71,'Female',False,False,False,True,True) == 4
assert cha2ds2_vasc(89,'Female',True,False,False,True,True) == 6
assert cha2ds2_vasc(54,'Male',True,False,False,False,True) == 2
assert cha2ds2_vasc(89,'Female',False,False,True,True,False) == 6
assert cha2ds2_vasc(36,'Male',False,True,False,True,True) == 3
assert cha2ds2_vasc(57,'Female',True,False,False,True,True) == 4
assert cha2ds2_vasc(22,'Female',False,True,False,True,False) == 3
assert cha2ds2_vasc(40,'Female',True,True,True,False,False) == 5
assert cha2ds2_vasc(54,'Female',False,False,False,True,True) == 3
assert cha2ds2_vasc(39,'Male',True,False,False,False,False) == 1
assert cha2ds2_vasc(61,'Female',False,False,False,True,False) == 2
assert cha2ds2_vasc(57,'Female',True,False,True,False,False) == 4
assert cha2ds2_vasc(76,'Female',True,True,True,True,True) == 9
assert cha2ds2_vasc(83,'Male',False,False,False,False,False) == 2
assert cha2ds2_vasc(86,'Female',False,True,False,False,False) == 4
assert cha2ds2_vasc(61,'Female',True,False,False,False,True) == 3
assert cha2ds2_vasc(46,'Male',True,True,True,True,False) == 5
assert cha2ds2_vasc(25,'Male',True,True,False,True,True) == 4
assert cha2ds2_vasc(62,'Male',False,True,True,True,True) == 5
assert cha2ds2_vasc(59,'Male',False,True,True,False,False) == 3
assert cha2ds2_vasc(60,'Female',False,True,True,False,True) == 5
assert cha2ds2_vasc(53,'Male',False,True,True,False,False) == 3

---

## Part 2: HEART Score

The HEART score is a predictor for major cardiac events.  It requires 5 high-level inputs:
* History (Slightly / Moderately / Highly suspicious)
* EKG (Normal / Non-specific repolarization disturbance / Significant ST deviation)
* Age (Number)
* Risk Factors (Number of risk factors)
* Initial Troponin (Number of times the normal limit)

Fill out the function below with logic to calculate the numeric risk score for teh given input.

Be sure to provide meaningful documentation and at least two test cases in your documentation. Also make sure your code satisfies the test cases provided in the assert statements.

In [3]:
def heart(history, ekg, age, risks, troponin):
    '''
    (str, str, int, int, int) -> int
    This function calculates the HEART score predictor for major cardiac events based on metrics from a patients medical history
    The function does this but checking each variable against the thresholds shown here:
    https://www.mdcalc.com/heart-score-major-cardiac-events
    
    The function adds up the overall score as it goes through each of the variables and returns a final score as an integer output.
    '''
    HEART_score = 0
    
    if history == "Slightly suspicious":
        HEART_score += 0
    elif history == "Moderately suspicious":
        HEART_score += 1
    else:
        HEART_score += 2
            
    if ekg == "Normal":
        HEART_score += 0
    elif ekg == "Non-specific repolarization":
        HEART_score += 1
    elif ekg == "Significant ST deviation":
        HEART_score += 2
    else:
        HEART_score == "Incorrect ekg input"
        
    if age < 45:
        HEART_score += 0
    elif age <= 64:
        HEART_score += 1
    else: 
        HEART_score += 2
    
    if risks == 0:
        HEART_score += 0
    elif risks <=2:
        HEART_score += 1
    else:
        HEART_score += 2
    
    if troponin > 3:
        HEART_score += 2
    elif troponin > 1:
        HEART_score += 1
    else:
        HEART_score += 0
    
    return HEART_score
        
    
    ### YOUR SOLUTION HERE

In [4]:
assert heart('Moderately suspicious','Normal',82,4,3.8) == 7
assert heart('Slightly suspicious','Non-specific repolarization',22,2,2.3) == 3
assert heart('Slightly suspicious','Non-specific repolarization',32,4,1.3) == 4
assert heart('Highly suspicious','Non-specific repolarization',21,1,1.1) == 5
assert heart('Slightly suspicious','Normal',52,5,1.2) == 4
assert heart('Moderately suspicious','Significant ST deviation',88,5,0.5) == 7
assert heart('Slightly suspicious','Non-specific repolarization',22,5,3.0) == 4
assert heart('Slightly suspicious','Significant ST deviation',71,4,3.9) == 8
assert heart('Moderately suspicious','Non-specific repolarization',89,5,0.3) == 6
assert heart('Highly suspicious','Normal',54,4,3.9) == 7
assert heart('Moderately suspicious','Normal',89,3,0.3) == 5
assert heart('Slightly suspicious','Non-specific repolarization',36,1,0.4) == 2
assert heart('Moderately suspicious','Normal',57,4,1.3) == 5
assert heart('Slightly suspicious','Normal',22,5,0.2) == 2
assert heart('Slightly suspicious','Normal',40,4,3.9) == 4
assert heart('Highly suspicious','Normal',54,3,3.1) == 7
assert heart('Highly suspicious','Significant ST deviation',39,4,0.9) == 6
assert heart('Moderately suspicious','Normal',61,2,1.9) == 4
assert heart('Slightly suspicious','Normal',57,1,1.7) == 3
assert heart('Moderately suspicious','Significant ST deviation',76,2,1.7) == 7
assert heart('Slightly suspicious','Normal',83,1,1.0) == 3
assert heart('Highly suspicious','Normal',86,1,2.3) == 6
assert heart('Highly suspicious','Non-specific repolarization',61,2,3.5) == 7
assert heart('Slightly suspicious','Normal',46,2,1.0) == 2
assert heart('Slightly suspicious','Significant ST deviation',25,4,3.1) == 6
assert heart('Moderately suspicious','Non-specific repolarization',62,1,2.4) == 5
assert heart('Highly suspicious','Non-specific repolarization',59,2,3.6) == 7
assert heart('Moderately suspicious','Significant ST deviation',60,1,2.1) == 6
assert heart('Slightly suspicious','Normal',53,4,0.1) == 3

## Part 3: Framingham Risk Score for Hard Coronary Heart Disease

The Framingham Risk Score for Hard Coronary Heart Disease is intended for non diabetic patients age 30-79 only.  So, if the patient's age is < 30 or > 79, your function should return `-1` rather than a specific risk score.

The Framingham Risk Score takes 7 inputs:
* Age (Number)
* Sex (Male / Female)
* Smoker (True / False)
* Total cholesterol (Number)
* HDL cholesterol (Number)
* Systolic BP (Number)
* Blood pressure being treated with medicines (No / Yes)

You'll not that rather than being a basic parametric equation, this is a regression function defined by coefficients.  It also requires you take the natural logarithm (`ln`) of many of the parameters.  To help you out, here's an example of how to interpret the formulat provided in the website's **Evidence** tab.

Take special note of the footnotes in the logic:
* *Yes=1, No=0 (for Treated for blood pressure and Smoker)
* ** Men: if age >70, use ln(70) x Smoker. Women: if age >78, use ln(78) x Smoker.


---

These segments of the equation and the coefficient table...

> $ L_{Men} = \beta \times \ln(Age) + \beta \times \ln(cholesterol) ... $
>
> | Variable &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | Men | Women
> | :---------------------------- | :--: | :--: |
> | $ \ln(Age) $ | 52.00961 | 31.764001
> | $ \ln(Cholesterol) $ | 20.014077 | 22.465206

Could be written in Python as follows:

```python
import math

L = 0
if sex == "Male":
    L += 52.00961 * math.ln(age)
    L += 20.014077 * math.ln(cholesterol)
else:
    L += 31.764001 * math.ln(age)
    L += 22.465206 * math.ln(cholesterol)
```

---

**ROUND YOUR FINAL RESULT TO 4 DECIMAL PLACES**

In [5]:
#Something is not quite right with the outcomes I am getting from my code. I feel like it has to do with the if then statements for some of the metrics, but I can't find what
#is wrong with the way I wrote it.
#I tried it several ways but I still end up slightly negative with most cases which
#doesn't make sense. 

import math
def framingham(age, sex, smoker, cholesterol, hdl, systolic, bp_treated):
    '''
    (int, str, bool, int, int, int, bool) -> int
    This function take inputs for 7 different metrics of a patient to determine their risk for hard coronary heart disease
    the function utilizes a regression method to compute the risk factor. Beta coefficients are multiplied by each metric. The natural log is taken of the integer inputs used for
    the regression while the str inputs do not use a natural log. 
    The written equation is as follows for men:
    
    LMen = β x ln(Age) + β x ln(Total cholesterol) + β x ln(HDL cholesterol) + β x ln(Systolic BP) + β x Treated for blood pressure + β x Smoker + β x ln(Age) x ln(Total cholesterol) + β x ln(Age) x Smoker + β x ln(Age) x ln(Age) - 172.300168
    
    PMen = 1 - 0.9402^exp(LMen)
    
    The written equation is as follows for women:
    
    LWomen = β x ln(Age) + β x ln(Total cholesterol) + β x ln(HDL cholesterol) + β x ln(Systolic BP) + β x Treated for blood pressure + β x Smoker + β x ln(Age) x ln(Total cholesterol) + β x ln(Age) x Smoker - 146.5933061

    PWomen = 1 - 0.98767^exp(LWomen)
    
    This test is meant only for patients age 30-79 so the function will return a -1 rather than a risk score if the patients age is outside of the acceptable range
    '''
    
    L = 0
    P = 0
    
    if sex == "Male":
        L += 52.00961 * math.log(age)
        L += 20.014077 * math.log(cholesterol)
        L += -0.905964 * math.log(hdl)
        L += 1.305784 * math.log(systolic)
        
        if bp_treated == True:
            L += 0.241549
        else:
            L += 0
            
        if smoker == True:
            L += 12.096316
        else:
            L += 0
            
        L += -4.605038 * math.log(age) * math.log(cholesterol)
        
        if smoker == True:
            if age > 70:
                L += -2.84367 * math.log(70) * 1
            else:
                L += -2.84367 * math.log(age)
        else:
            L += 0
            
        L += -2.93323 * math.log(age) * math.log(age)
        L += -172.300168
        
        if age < 30:
            L = -1
        elif age > 79:
            L = -1
        else:
            L += 0
        
    else:
        L += 31.764001 * math.log(age)
        L += 22.465206 * math.log(cholesterol)
        L += -1.187731 * math.log(hdl)
        L += 2.552905 * math.log(systolic)
        
        if bp_treated == True:
            L += 0.420251
        else:
            L += 0
        
        if smoker == True:
            L += 13.0574
        else:
            L += 0
            
        L += -5.060998 * math.log(age) * math.log(cholesterol)
        
        if smoker == True:
            if age > 78:
                L += -2.996945 * math.log(78) * 1
            else:
                L += -2.996945 * math.log(age) * 1    
        else:
            L += 0
        L += -146.5933061
        
        if age < 30:
            L = -1
        elif age > 79:
            L = -1
        else:
            L += 0
    
    if L == -1:
        P = -1
    elif sex == "Male":
        P = 1 - 0.9402 ** L
    else:
        P = 1 - 0.98767 ** L
    
    return round(P, 4)
        
    ### YOUR SOLUTION HERE

In [6]:
assert framingham(82,'Male',False,214,64,92,True) == -1
assert framingham(22,'Male',False,146,33,102,False) == -1
assert framingham(32,'Female',False,195,31,115,True) == 0.0015
assert framingham(21,'Female',False,152,42,82,True) == -1
assert framingham(52,'Female',False,214,58,85,True) == 0.005
assert framingham(88,'Male',True,173,67,104,False) == -1
assert framingham(22,'Male',False,163,62,112,False) == -1
assert framingham(71,'Female',False,188,30,99,False) == 0.0391
assert framingham(89,'Female',True,172,55,88,False) == -1
assert framingham(54,'Male',False,156,52,117,True) == 0.0437
assert framingham(89,'Female',False,147,58,127,True) == -1
assert framingham(36,'Male',True,169,33,128,True) == 0.0465
assert framingham(57,'Female',True,204,40,86,False) == 0.0189
assert framingham(22,'Female',False,177,59,81,False) == -1
assert framingham(40,'Female',False,165,43,111,True) == 0.0016
assert framingham(54,'Female',True,200,50,86,False) == 0.0126
assert framingham(39,'Male',False,189,49,130,True) == 0.0126
assert framingham(61,'Female',True,176,68,106,False) == 0.0153
assert framingham(57,'Female',False,181,47,124,True) == 0.0183
assert framingham(76,'Female',True,162,56,94,False) == 0.0239
assert framingham(83,'Male',False,215,52,98,True) == -1
assert framingham(86,'Female',True,169,55,100,True) == -1
assert framingham(61,'Female',False,151,65,86,True) == 0.0053
assert framingham(46,'Male',False,174,64,114,False) == 0.0142
assert framingham(25,'Male',False,193,31,84,False) == -1
assert framingham(62,'Male',False,167,31,115,False) == 0.1098
assert framingham(59,'Male',True,174,66,88,True) == 0.0709
assert framingham(60,'Female',True,156,63,124,True) == 0.0293
assert framingham(53,'Male',False,141,51,109,False) == 0.0244

AssertionError: 

---

## Part 4. Putting it all together

Now that we have our three scores, we need to put them together into an overall composite risk score for a whole group of patients.  Those patients are in a CSV file on the server called `/data/midterm_patients.csv`.  You can open this file in Jupyter by browsing to your Home icon -> from_instructor -> data.  There are several things that you're going to need to do read this file, calculate individual risk scores, and compute an overall risk score.

First, you'll notice that some of the column names are different between the data in the input file and the values that are expected by your functions above.  For example: "M" from the file needs to be turned into "Male" and "Yes" in the file needs to be turned into "True".  You will need to do conversions for all of the fields listed below:

| Field in CSV | Parameter Name Above | Source Values | Values Needed Above |
| :----------- | :------------------- | :-: | :-: |
| bp medicine  | bp_treated           | Yes / No | True / False |
| sex          | sex                  | M / F | Male / Female |
| smoker       | smoker               | Yes / No | True / False |
| risk factors | risks                | # | # |
| chf       | chf history               | Yes / No | True / False |
| hypertension       | hypertension history               | Yes / No | True / False |
| stroke    | stroke history             | Yes / No | True / False |
| vascular      | vascular disease history           | Yes / No | True / False |
| diabetes     | diabetes history              | Yes / No | True / False |


After calculating these three risk scores, use the rules below to determine who is at highest risk.  To be classified as "High Risk" a patient must meet all three criteria below:
1. CHA2DS2_VASc >= 2
2. HEART >= 4
3. Framingham >= 3%

Your output for this function needs to be a list where each item in this contains `[patient, CHA2DS2_VASc, HEART, Framingham, High Risk]`

In [None]:
from pathlib import Path
HOME = str(Path.home())
midterm_patients = "/data/test-patients.csv"

In [None]:
#Had trouble with the step of recalling functions from previous steps in this function
#I was able to get all the formatting where I wanted it to be, but it seems like I can't
#get the function to recognize the items in the list are to be used in the recalled functions
#I still finished out the code for the final formatting and check for High Risk, but there is a section
#in the middle that does not work. There is a note down in the code where this issue is at.

'''
This function calls data from a csv that contains a list of patients and their corresponding values for 16 metrics related to heart issues.

The function first converts the csv format into a list of lists with each list containing a patient and their subsequent health metrics.

The function then removes the first list which contains the column headers from the csv data set.

The function then uses a for loop with enumeration to replace values such as 'M' and 'F' with 'Male' and 'Female' to be used in recalled functions from the previous problems
This for loop also converts 'Yes' and 'No' into True and False.

The function then uses the three previously created functions to calculate heart issues scores for each patient.

Lastly, the function evaluates each patients scores on the three tests and determines if the patient is at High Risk for a heart issue based on the following thresholds:
CHA2DS2_VASc >= 2
HEART >= 4
Framingham >= 3%

All three thresholds must be met to be deemed High Risk
'''

def test_patients(midterm_patients):
    
    import csv
    
    #This first part of the code creates empty lists to be appended as the for loop goes through the csv file
    
    converted_format = []
    final_format = []
    
    #This for loop runs through each line of the csv and creates a list for each line. This is then stored into the converted_format list
    
    with open(midterm_patients) as csv_file:
        csv_reader = csv.reader(csv_file)
        
        for line in csv_reader:
            first_format = []
            
            case_number = line[0]
            age = line[1]
            sex = line[2]
            chf = line[3]
            hypertension = line[4]
            stroke = line[5]
            vascular = line[6]
            diabetes = line[7]
            history = line[8]
            ekg = line[9]
            risks = line[10]
            troponin = line[11]
            smoker = line[12]
            cholesterol = line[13]
            hdl = line[14]
            systolic = line[15]
            bp_treated = line[16]
                      
                        
            first_format.append(case_number)
            first_format.append(age)
            first_format.append(sex)
            first_format.append(chf)
            first_format.append(hypertension)
            first_format.append(stroke)
            first_format.append(vascular)
            first_format.append(diabetes)
            first_format.append(history)
            first_format.append(ekg)
            first_format.append(risks)
            first_format.append(troponin)
            first_format.append(smoker)
            first_format.append(cholesterol)
            first_format.append(hdl)
            first_format.append(systolic)
            first_format.append(bp_treated)
            
            converted_format.append(first_format)
                
        #I'm sure there is a better way to remove the headers, but this is what I could get to work
        
        converted_format.remove(['\ufeffpatient', 'age', 'sex', 'chf history', 'hypertension history', 'stroke history', 'vascular disease history', 'diabetes history', 'history', 'ekg', 'risk factors', 'troponin', 'smoker', 'total cholesterol', 'hdl cholesterol', 'systolic bp', 'bp medicine'])
        
                
        
        #This for loop goes through and replaces the csv item outcome names with the proper data name/type for the recalled functions.
        
        for case in converted_format:
            for n, i in enumerate(case):
                if i == 'M':
                    case[n] = 'Male'
                elif i == 'F':
                    case[n] = 'Female'
                elif i == 'Yes':
                    case[n] = True
                elif i == 'No':
                    case[n] = False
                else:
                    i = i
        
        #This for loop goes through each of the cases and calculates the scores for the three functions previously built. The final format then appends the patient name and their subsequent scores
        
        for case in converted_format:
            case_number = case[0]
            age = case[1]
            sex = case[2]
            chf = case[3]
            hypertension = case[4]
            stroke = case[5]
            vascular = case[6]
            diabetes = case[7]
            history = case[8]
            ekg = case[9]
            risks = case[10]
            troponin = case[11]
            smoker = case[12]
            cholesterol = case[13]
            hdl = case[14]
            systolic = case[15]
            bp_treated = case[16]
            
            #This is the part I can't quite get to work. I couldn't find a way to recall the previous functions built in parts 1-3
            #I think the code is not recognizing the items in the list for each case as the variables to be used in the recalled functions
            #I get an error that says I am trying to use a '<' in the first function to compare a str and an int 
            
            cha2ds2_score = cha2ds2_vasc(age, sex, chf, hypertension, stroke, vascular, diabetes)
            heart_score = heart(history, ekg, age, risks, troponin)
            framingham_risk = framingham(age, sex, smoker, cholesterol, hdl, systolic, bp_treated)
            
            final_format.append(case_number)
            final_format.append(cha2ds2_score)
            final_format.append(heart_score)
            final_format.append(framingham_risk)
        
        
        #This last section of the function creates a running risk threshold score
        #If the patient is above the threshold for a test, the threshold score is increased by 1.
        #A patient must be above the threshold for all three tests, so a threshold total score of 3 is labeled High Risk and appended to the final formatted list
        
        for cases in final_format:
            risk_threshold = []
            risk_level = []
            
            cha2ds2_score = cases[1]
            heart_score = cases[2]
            framingham_risk = cases[3]
            
            if cha2ds2_score >= 2:
                risk_threshold += 1
            else:
                risk_threshold += 0
            
            if heart_score >= 4:
                risk_threshold += 1
            else:
                risk_threshold += 0
            
            if framingham_risk >= 0.03:
                risk_threshold += 1
            else:
                risk_threshold += 0
            
            if risk_threshold < 3:
                risk_level = False
            else:
                risk_level = True
            
            final_format.append(risk_level)
            
        return(final_format)
        
    
    ### YOUR SOLUTION HERE

In [None]:
test_patients('/data/test-patients.csv')

In [None]:
answers = [['E40794', 7.0, 7.0, -1.0, False],
 ['E57853', 2.0, 2.0, -1.0, False],
 ['E63841', 7.0, 3.0, 0.0015, False],
 ['E87700', 5.0, 4.0, -1.0, False],
 ['E49662', 3.0, 4.0, 0.005, False],
 ['E19241', 6.0, 7.0, -1.0, False],
 ['E94033', 2.0, 3.0, -1.0, False],
 ['E19724', 4.0, 8.0, 0.0391, True],
 ['E77077', 6.0, 5.0, -1.0, False],
 ['E75736', 2.0, 7.0, 0.0437, True],
 ['E20246', 6.0, 5.0, -1.0, False],
 ['E58235', 3.0, 1.0, 0.0465, False],
 ['E29619', 4.0, 5.0, 0.0189, False],
 ['E18023', 3.0, 2.0, -1.0, False],
 ['E56386', 5.0, 4.0, 0.0016, False],
 ['E87379', 3.0, 7.0, 0.0126, False],
 ['E44264', 1.0, 6.0, 0.0126, False],
 ['E85955', 2.0, 4.0, 0.0153, False],
 ['E17497', 4.0, 3.0, 0.0183, False],
 ['E11391', 9.0, 7.0, 0.0239, False],
 ['E41611', 2.0, 3.0, -1.0, False],
 ['E66188', 4.0, 6.0, -1.0, False],
 ['E74052', 3.0, 6.0, 0.0053, False],
 ['E40182', 5.0, 2.0, 0.0142, False],
 ['E21161', 4.0, 6.0, -1.0, False],
 ['E59494', 5.0, 4.0, 0.1098, True],
 ['E61747', 3.0, 6.0, 0.0709, True],
 ['E42697', 5.0, 6.0, 0.0293, False],
 ['E61043', 3.0, 3.0, 0.0244, False]]

In [None]:
assert test_patients('/data/test-patients.csv') == answers

---

## Submitting Your Work

In order to submit your work, you'll need to use the `git` command line program to **add** your homework file (this file) to your local repository, **commit** your changes to your local repository, and then **push** those changes up to github.com.  From there, I'll be able to **pull** the changes down and do my grading.  I'll provide some feedback, **commit** and **push** my comments back to you.  Next week, I'll show you how to **pull** down my comments.

To run through everything one last time and submit your work:
1. Use the `Kernel` -> `Restart Kernel and Run All Cells` menu option to run everything from top to bottom and stop here.
2. Save this note with Ctrl-S (or Cmd-S)
2. Skip down to the last command cell (the one starting with `%%bash`) and run that cell.

If anything fails along the way with this submission part of the process, let me know.  I'll help you troubleshoort.

In [None]:
assert False, "DO NOT REMOVE THIS LINE"

---

In [7]:
%%bash
git pull
git add midterm-2021.ipynb
git commit -a -m "Finally submitting the midterm!"
git push

Already up to date.
[main af79590] Finally submitting the midterm!
 2 files changed, 885 insertions(+), 2 deletions(-)
 create mode 100644 week07-midterm/midterm-2021.ipynb


To github.com:nsokolis/hds5210-2021.git
   d9d33aa..af79590  main -> main



---

If the message above says something like _Finally submitting the midterm!__ or _Everything is up to date_, then your work was submitted correctly.