# Week 3 Exercises

See: _McKinney 2.3_ and [Python Documentation](https://docs.python.org/3/tutorial/controlflow.html) section 4 on flow control.


**At the beginning of the semester, all of the workshop programming exercises will be structured in a specific way, which makes it easier for you to verify that you're on the right track and easier for me to do a first pass of automated grading.  Each question will require you to write a function using Python code. Don't worry that we haven't talked about functions yet. Just edit the code between** `### BEGIN SOLUTION` and `### END SOLUTION` **as shown in the example below.**


**WHAT I PROVIDE:**
```
def some_function(parameter1, parameter2):

   ### BEGIN SOLUTION
   x = -1
   ### END SOLUTION
   
   return x
```

**WHAT YOU SHOULD DO:** Just change the parameter names (if you feel you need to) and the calculations between `### BEGIN SOLUTION` and `### END SOLUTION`.  This is just a made-up example.
```
def some_function(a, b):

   ### BEGIN SOLUTION
   temp = a + b
   x = temp / a * b
   ### END SOLUTION
   
   return x
```


---
---

**Below each programming exercise are some tests (`assertions`) that verify your code is working correctly.  If any assertions fail, you know that something isn't right with your code, but having all assertions pass doesn't necessarily mean your code is perfect yet. You should also create your own tests to make sure your code is correct.**
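As one way to write your own tests, the sketch below adds a few extra assertions for the made-up `some_function` example from above. The input values are arbitrary choices for illustration and are not part of the assignment.

```python
def some_function(a, b):
    # Same made-up calculation as the example above
    temp = a + b
    x = temp / a * b
    return x

# Hand-computed expectations: (2 + 3) / 2 * 3 = 7.5 and (1 + 1) / 1 * 1 = 2.0
assert some_function(2, 3) == 7.5
assert some_function(1, 1) == 2.0

# Edge cases are worth testing too: a = 0 leads to a division by zero
try:
    some_function(0, 5)
except ZeroDivisionError:
    print("a = 0 raises ZeroDivisionError")
```

Working out the expected value by hand before running the code is what makes a test like this meaningful.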

**For now, please don't change any function names**

### 14.1 Difference in rate per 1,000

Often in public health, we report metrics as a number per 1,000 or per 1,000,000 population. The purpose is to normalize the numbers so that areas with larger and smaller populations can be compared.
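A minimal sketch of that normalization, assuming a hypothetical helper named `rate_per` (the name and the `per` parameter are not part of the assignment):

```python
def rate_per(count, population, per=1000):
    """Return occurrences per `per` people (hypothetical helper for illustration)."""
    return count * per / population

small_town = rate_per(1, 1000)       # 1 case among 1,000 people
large_city = rate_per(15, 10000)     # 15 cases among 10,000 people

print(small_town, large_city)        # 1.0 1.5
```

The raw counts (1 versus 15) suggest a very large gap, while the normalized rates (1.0 versus 1.5 per 1,000) show the two areas are much closer than the counts imply.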

Below, we have a function already built to calculate the difference in rate per 1,000 between two regions.  The function normalizes the counts and then returns a message describing their difference.

In [1]:
def diff_in_rate_normalized(count_a, total_a, count_b, total_b):
    """(int,int,int,int) -> str
    * count_a is the number of occurrences in area A
    * total_a is the total population in area A
    * count_b is the number of occurrences in area B
    * total_b is the total population in area B
    
    This function returns a string describing how A and B compare in terms of occurrence rates per 1,000 population."""

    rate_a = count_a / total_a * 1000
    rate_b = count_b / total_b * 1000
    
    if rate_a == rate_b:
        msg = "The rate in A and the rate in B are the same ({}).".format(rate_a)
    elif rate_a > rate_b:
        msg = "The rate in A ({}) is greater than the rate in B ({}).".format(rate_a, rate_b)
    else:
        msg = "The rate in A ({}) is less than the rate in B ({}).".format(rate_a, rate_b)
        
    return msg

In [2]:
rate_covid_testing = diff_in_rate_normalized(1,1000,15,10000)
print(rate_covid_testing)

The rate in A (1.0) is less than the rate in B (1.5).


In [3]:
diff_in_rate_normalized(3,1000,30,10000)

'The rate in A and the rate in B are the same (3.0).'

### 14.2 Trimming outliers

Normalizing per 1,000 works well in most cases, but in practice it may not make numbers truly comparable between extremely large population centers (e.g. New York City at 8.5 million) and very small rural areas (e.g. Meeteetse, WY at 459 people).

Let's take that function and make some adjustments.  If the total population of either A or B is more than 100 times larger than the other, then we want to return a different message.  That is, if the populations are more than two orders of magnitude different, then we shouldn't try to compare them.
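The "more than 100 times larger" rule can be sketched as a single ratio check. The helper name `populations_comparable` and the `max_ratio` parameter are assumptions for illustration, and the sketch assumes both totals are positive.

```python
def populations_comparable(total_a, total_b, max_ratio=100):
    """Return True when neither population is more than max_ratio times the other.

    Hypothetical helper; assumes both totals are positive.
    """
    larger = max(total_a, total_b)
    smaller = min(total_a, total_b)
    return larger / smaller <= max_ratio

print(populations_comparable(8_500_000, 459))   # False: ratio is about 18,500
print(populations_comparable(1000, 10000))      # True: ratio is 10
print(populations_comparable(100, 10000))       # True: exactly 100 is not "more than 100"
```

Dividing the larger total by the smaller one covers both directions at once, so the order in which A and B are passed does not matter.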

In [4]:
def diff_in_rate_normalized(count_a, total_a, count_b, total_b):
    """(int,int,int,int) -> str
    * count_a is the number of occurrences in area A
    * total_a is the total population in area A
    * count_b is the number of occurrences in area B
    * total_b is the total population in area B
    
    This function returns a string describing how A and B compare in terms of occurrence rates per 1,000 population.
    
    If total_a / total_b > 100 OR total_b / total_a > 100 then we'll return a message saying the two can't be compared.
    "The total populations in A and B are so different that they can't be compared."
    """
    
    msg = ""
    
    if ((total_a / total_b) > 100) or ((total_b / total_a) > 100):
        msg = "The total populations in A and B are so different that they can't be compared."
    else:
        rate_a = count_a / total_a * 1000
        rate_b = count_b / total_b * 1000

        if rate_a == rate_b:
            msg = "The rates in A and B are the same: {} per 1,000".format(rate_a)
        elif rate_a > rate_b:
            msg = "The rate in A ({}) is greater than the rate in B ({}) per 1,000".format(rate_a, rate_b)
        else:
            msg = "The rate in A ({}) is less than the rate in B ({}) per 1,000".format(rate_a, rate_b)
        
    return msg

In [5]:
diff_in_rate_normalized(5, 459, 30, 8500)

'The rate in A (10.893246187363834) is greater than the rate in B (3.5294117647058827) per 1,000'
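The long decimals in the output above could be shortened with a format specification. This is a sketch only: one decimal place is an arbitrary choice, and the exercise does not require rounding.

```python
rate_a = 10.893246187363834
rate_b = 3.5294117647058827

# {:.1f} rounds to one decimal place when the value is formatted
msg = "The rate in A ({:.1f}) is greater than the rate in B ({:.1f}) per 1,000".format(rate_a, rate_b)
print(msg)
# The rate in A (10.9) is greater than the rate in B (3.5) per 1,000
```

Rounding only changes how the numbers are displayed. The comparison itself should still use the unrounded rates, since two different rates can round to the same displayed value.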

### 14.3 LACE Score
There is a simple readmission index called the LACE Score: https://www.hindawi.com/journals/bmri/2015/169870/tab1/

Use this documentation to create a function that computes LACE scores based on the 4 input parameters.  Below are the function signature and documentation to start with.

In [146]:
def LACE(length_of_stay, acute_flag, charlson, ed_visits):
    
    
    """(int, bool, int, int) -> int
    This function uses the logic from https://www.hindawi.com/journals/bmri/2015/169870/tab1/
    to compute the LACE score for this patient.
    
    >>> LACE(4, False, 1, 0)
    5
    
    >>> LACE(4, True, 4, 7)
    16
    
    """
    score=0
    if length_of_stay<1:
        score=0
    elif length_of_stay==1:
        score=1
    elif length_of_stay==2:
        score=2
    elif length_of_stay==3:
        score=3
    elif length_of_stay>=4 and length_of_stay<=6:
        score=4
    elif length_of_stay>=7 and length_of_stay<=13:
        score=5
    elif length_of_stay>=14:
        score=7
    if acute_flag==True:
        score=score+3
    if charlson==0:
        score=score+0
    elif charlson==1:
        score=score+1
    elif charlson==2:
        score=score+2
    elif charlson==3:
        score=score+3
    elif charlson>=4:
        score=score+5
    if ed_visits==0:
        score=score+0
    elif ed_visits==1:
        score=score+1
    elif ed_visits==2:
        score=score+2
    elif ed_visits==3:
        score=score+3
    elif ed_visits>=4:
        score=score+4
    return score

In [147]:
assert LACE(4, False, 1, 0) == 5
assert LACE(4, True, 4, 7) == 16

In [148]:
LACE(4, True, 4, 7)

16
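For comparison, the same point values can be expressed more compactly. The sketch below uses the thresholds from the function above; the name `lace_compact` is hypothetical, and it assumes all numeric inputs are non-negative integers.

```python
def lace_compact(length_of_stay, acute_flag, charlson, ed_visits):
    """Compact variant of the LACE scoring above (hypothetical name).

    Assumes all numeric inputs are non-negative integers.
    """
    # L: length of stay in days
    if length_of_stay <= 3:
        l_points = length_of_stay
    elif length_of_stay <= 6:
        l_points = 4
    elif length_of_stay <= 13:
        l_points = 5
    else:
        l_points = 7

    a_points = 3 if acute_flag else 0            # A: acute admission
    c_points = charlson if charlson <= 3 else 5  # C: Charlson index, 4 or more scores 5
    e_points = min(ed_visits, 4)                 # E: ED visits, capped at 4

    return l_points + a_points + c_points + e_points

assert lace_compact(4, False, 1, 0) == 5
assert lace_compact(4, True, 4, 7) == 16
```

Because each `elif` is only reached when the earlier conditions were false, the lower bound of each range does not need to be repeated.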

### 14.4 Care Management Criteria

Care managers use LACE as part of the criteria for assigning a care coordinator to a patient who has been recently discharged. If the score is above 10, then a care coordinator will be assigned. The other criterion is whether the patient has been discharged with a diagnosis of CHF or COPD.  If the diagnosis field has CHF or COPD in it, then the patient will have a care coordinator assigned.

For this exercise, write another function that takes the same inputs as LACE() plus another diagnosis parameter, and returns True or False depending on whether the patient needs a care coordinator.

**NOTE** Pay attention to the fact that the order of parameters in this function definition is not the same as the order they were in for the LACE score.
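The two criteria can be sketched as a small decision function that takes an already-computed LACE score. The name `needs_coordinator`, the `threshold` parameter, and the case-insensitive substring match are assumptions for illustration, not the required solution.

```python
def needs_coordinator(diagnosis_cd, lace_score, threshold=10):
    """Apply the two assignment criteria to an already-computed LACE score.

    Hypothetical helper; the uppercase substring match is an assumption about
    how the diagnosis field is populated.
    """
    diagnosis = diagnosis_cd.upper()
    has_flagged_diagnosis = "CHF" in diagnosis or "COPD" in diagnosis
    return has_flagged_diagnosis or lace_score > threshold

print(needs_coordinator("None", 5))    # False: no flagged diagnosis, score not above 10
print(needs_coordinator("CHF", 5))     # True: diagnosis alone qualifies
print(needs_coordinator("None", 16))   # True: score above 10
print(needs_coordinator("None", 10))   # False: exactly 10 is not above 10
```

When computing the score inside your own function, passing arguments by keyword, for example `LACE(length_of_stay=length_of_stay, acute_flag=acute_flag, charlson=charlson, ed_visits=ed_visits)`, sidesteps the difference in parameter order.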

In [170]:
def assign_care_coordinator(diagnosis_cd, ed_visits, length_of_stay, acute_flag, charlson):
    """(str, int, int, bool, int) -> bool
    Care managers use LACE as part of the criteria for assigning a care coordinator to a
    patient who has been recently discharged. If the score is above 10, then a care
    coordinator will be assigned. The other criterion is whether the patient has been
    discharged with a diagnosis of CHF or COPD. If the diagnosis field has CHF or COPD in
    it, then the patient will have a care coordinator assigned.
    """
    # A CHF or COPD diagnosis qualifies the patient regardless of the LACE score
    if 'CHF' in diagnosis_cd or 'COPD' in diagnosis_cd:
        assignment = True
        return assignment

    # L: convert length of stay (days) into points
    if length_of_stay<1:
        length_of_stay=0
    elif length_of_stay==1:
        length_of_stay=1
    elif length_of_stay==2:
        length_of_stay=2
    elif length_of_stay==3:
        length_of_stay=3
    elif length_of_stay >= 4 and length_of_stay <= 6:
        length_of_stay = 4
    elif length_of_stay >= 7 and length_of_stay <= 13:
        length_of_stay = 5
    elif length_of_stay >= 14:
        length_of_stay = 7

    # A: an acute admission adds 3 points
    if acute_flag==True:
        acute_flag = 3
    elif acute_flag==False:
        acute_flag=0
    
    if charlson==0:
        charlson=0
    elif charlson==1:
        charlson=1
    elif charlson==2:
        charlson=2
    elif charlson==3:
        charlson=3
    elif charlson>=4:
        charlson=5
        
    if ed_visits==0:
        ed_visits=0
    elif ed_visits==1:
        ed_visits=1
    elif ed_visits==2:
        ed_visits=2
    elif ed_visits==3:
        ed_visits=3
    elif ed_visits>=4:
        ed_visits=4
        
    total_score=length_of_stay+acute_flag+ charlson + ed_visits
    if total_score > 10: 
        assignment=True
    else: 
        assignment=False
    return assignment
        

In [190]:
assert assign_care_coordinator('None', 0, 4, False, 1) == False
assert assign_care_coordinator('CHF', 0, 4, False, 1) == True
assert assign_care_coordinator('COPD', 0, 4, False, 1) == True
assert assign_care_coordinator('None', 7, 4, True, 4) == True
assert assign_care_coordinator('CHF', 7, 4, True, 4) == True

In [189]:
assign_care_coordinator('None', 0, 4, False, 1)

False

### 14.5 qCSI COVID-19 Severity Index

See: https://www.mdcalc.com/quick-covid-19-severity-index-qcsi#evidence

Calculate the total risk score as per the point values assigned to respiratory rate, pulse oximetry, and O2 flow rate.  Then calculate and return the Risk Level.

In addition to the rules provided at the link above, also add the following checks for valid values:
* If `respiratory_rate <= 0` then return _invalid respiratory rate_
* If `pulse_ox <= 0` then return _invalid pulse ox_
* If `pulse_ox > 100` then return _invalid pulse ox_
* If `o2_flow <= 0` then return _invalid O2 flow rate_
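The validity checks listed above can be sketched on their own as a helper that returns the first applicable message. The name `qcsi_input_error` and the choice to return `None` for valid input are assumptions for illustration.

```python
def qcsi_input_error(respiratory_rate, pulse_ox, o2_flow):
    """Return an error message for the first invalid input, or None if all are valid.

    Hypothetical helper name; the messages match the checks listed above.
    """
    if respiratory_rate <= 0:
        return 'invalid respiratory rate'
    if pulse_ox <= 0 or pulse_ox > 100:
        return 'invalid pulse ox'
    if o2_flow <= 0:
        return 'invalid O2 flow rate'
    return None

print(qcsi_input_error(0, 97, 2))     # invalid respiratory rate
print(qcsi_input_error(22, 101, 4))   # invalid pulse ox
print(qcsi_input_error(22, 100, 4))   # None: 100% is a valid reading
```

Checking for invalid values before any scoring keeps the scoring logic free of special cases.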

In [172]:
def qcsi(respiratory_rate, pulse_ox, o2_flow):
    
    
    """(int, int, int) -> str
    * respiratory_rate is an integer value
    * pulse_ox is an integer value (e.g. 30 means 30%)
    * o2_flow is an integer value
    """
    if respiratory_rate <= 0: 
        risk = 'invalid respiratory rate'
        return risk
    if pulse_ox <= 0 or pulse_ox > 100: 
        risk = 'invalid pulse ox'
        return risk
    if o2_flow <= 0:
        risk = 'invalid O2 flow rate'
        return risk
        
    if respiratory_rate <= 22:
        respiratory_rate = 0
    elif respiratory_rate >= 23 and respiratory_rate <= 28:
        respiratory_rate = 1
    elif respiratory_rate > 28:
        respiratory_rate = 2

    if pulse_ox > 92:
        pulse_ox = 0
    elif pulse_ox <= 92 and pulse_ox >= 89:
        pulse_ox = 2
    elif pulse_ox <= 88:
        pulse_ox = 5

    if o2_flow <= 2:
        o2_flow = 0
    elif o2_flow >= 3 and o2_flow <= 4:
        o2_flow = 4
    elif o2_flow >= 5 and o2_flow <= 6:
        o2_flow = 5

    score = respiratory_rate + pulse_ox + o2_flow
    
    if score <= 3:
        risk = 'low'
    elif score >= 4 and score <= 6:
        risk = 'low-intermediate'
    elif score >= 7 and score <= 9:
        risk = 'high-intermediate'
    elif score >= 10 and score <= 12:
        risk = 'high'
  
    return risk

In [173]:
assert (qcsi(29, 95, 1) == 'low')
assert (qcsi(20, 93, 1) == 'low')
assert (qcsi(29, 88, 1) == 'high-intermediate')
assert (qcsi(29, 88, 4) == 'high')
assert (qcsi(30, 90, 1) == 'low-intermediate')
assert (qcsi(28, 92, -1) == 'invalid O2 flow rate')
assert (qcsi(22, 0, 4) == 'invalid pulse ox')
assert (qcsi(0, 97, 2) == 'invalid respiratory rate')

In [175]:
qcsi(29, 95, 1)

'low'

---

## Check your work above

If you didn't get them all correct, take a few minutes to think through those that aren't correct.


## Submitting Your Work

In order to submit your work, you'll need to use the `git` command line program to **add** your homework file (this file) to your local repository, **commit** your changes to your local repository, and then **push** those changes up to github.com.  From there, I'll be able to **pull** the changes down and do my grading.  I'll provide some feedback, **commit** and **push** my comments back to you.  Next week, I'll show you how to **pull** down my comments.

First run through everything one last time and submit your work:
1. Use the `Kernel` -> `Restart Kernel and Run All Cells` menu option to run everything from top to bottom and stop here.
2. Then open a new command line by clicking the `+` icon above the file list and choosing `Terminal`.
3. At the command line in the new Terminal, follow these steps:
   1. Change directories to your project folder and the week03 subfolder (`cd <folder name>`)
   2. Make sure your project folders are up to date with github.com (`git pull`)
   3. Add the homework files for this week (`git add <file name>`)
   4. Commit your changes (`git commit -m "message"`)
   5. Push your changes (`git push`)
  
**Here's a full example**
```
cd hds5210-2022/week03
git pull
git add week03_assignment_1.ipynb week03_assignment_2.ipynb
git commit -a -m "Submitting homework assignments for week 3"
git push
```

If anything fails along the way with this submission part of the process, let me know.  I'll help you troubleshoot.