_SLUCOR HDS5210 - Programming for Health Data Science - 2018 Spring_

Week 3 (February 5) Exercises
===

Before starting, be sure that you have completed the GitHub setup from week 2 and have your hds5210 repository set up from the last homework assignment.  You'll continue to use the same Git project / repository to submit your assignments.

#1 - Creating Functions
---

Write a function called `navy_bmi` like the BMI one we did in class, that uses the US Navy calculator for BMI for men.  You can find the formula below.  Note that the **formula assumes all measurements are in cm**.

![navy_bmi](navy_bmi_formula.png "Navy BMI")

Your function will need to take in parameters for `waist_inches`, `neck_inches`, and `height_inches`, **all in inches**.  Test your function using the following values to make sure you get the right answer.

```
navy_bmi(36.5, 16, 72)

19.499313827566784
```

You'll need to import the **math** module first to be able to use the **log** function.  You may also want to review the help documentation on **log** to make sure you use it correctly.
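As a quick check on how `math.log` treats its arguments, the sketch below compares the one-argument form (natural log) with the two ways of getting a base-10 log. The sample value `1000` is only an illustration and is not part of the assignment.

```python
import math

x = 1000  # illustrative value, not from the assignment

natural = math.log(x)        # one argument: natural log (base e)
base10 = math.log(x, 10)     # two arguments: log of x in base 10
base10_alt = math.log10(x)   # dedicated base-10 function

print(natural)      # about 6.9078
print(base10)       # about 3, possibly off in the last digit due to floating point
print(base10_alt)   # 3.0
```

Forgetting the second argument silently gives the natural log, which would produce a very different result in the formula.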

In [1]:
import math
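def navy_bmi(waist_inches, neck_inches, height_inches):
    # The formula expects centimeters, so convert each measurement from inches
    waist_cm = waist_inches * 2.54
    neck_cm = neck_inches * 2.54
    height_cm = height_inches * 2.54
    # math.log(x, 10) is the base-10 logarithm of x
    bmi = 86.010 * math.log(waist_cm - neck_cm, 10) - 70.041 * math.log(height_cm, 10) + 30.30
    return bmi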


In [2]:
navy_bmi(36.5, 16, 72)

19.499313827566784

#2 - Understanding Functions
---

Try to reason through this set of instructions and determine what the output will be.  Check your understanding by running the code in Jupyter.  Then explain why the output is what it is.

```
temp = 103

def calculate_target(temp):
    temp -= 4
    return temp

calculate_target(temp)

print("The current value of temp is " + str(temp))
```


Solution
---

1. The program-level variable `temp` is set to `103`.
2. The function `calculate_target` has a parameter also named `temp`. Inside the function, that name is a local variable, separate from the program-level `temp`.
3. When we call `calculate_target` and pass it the program-level `temp`, the function subtracts 4 from its local `temp` and returns `99`, but our program doesn't assign that return value to a variable. Nothing has changed at the program level.
4. We print the value of the program-level variable `temp`, which never changed, so the output is `The current value of temp is 103`.
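
The steps above can be checked with a small variation on the exercise code. The name `target` is an addition here, used only to capture the return value that the original code discards.

```python
temp = 103

def calculate_target(temp):
    temp -= 4          # rebinds only the function's local name
    return temp

target = calculate_target(temp)   # keep the returned value this time

print("temp is " + str(temp))       # temp is 103
print("target is " + str(target))   # target is 99
```

To actually change the program-level value, the call would have to be written as `temp = calculate_target(temp)`.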

#3 - Parsing Dosage Amounts and Units
---

Create a function that will take as input an infusion dosage value and unit in the format `# volume/time` such as `45 mg/hr` or `0.2 L/hr`, and return just the numeric part of the dosage.  It should return the numeric part as a floating point decimal number so that calculations can easily be done with the number.

Demonstrate that your function works correctly using each of these tests:
```
1.0 L/hr
10 mg/hr
0.75 g/day
```

Solution Option 1 - Using `find`
---

In [3]:
def parse_rate1(rate):
    """ (str) -> float
    Given a string that's in the format '# volume/time' 
    extract the numeric part and return it as a number
    
    >>> parse_rate1('1.0 L/hr')
    1.0
    
    >>> parse_rate1('10 mg/hr')
    10.0
    
    >>> parse_rate1('0.75 g/day')
    0.75
    """
    space_pos = rate.find(' ')    # The space is our separator between number part and units
    num_part = rate[:space_pos]   # Take everything up to that space
    return float(num_part)        # Return the float version

In [4]:
import doctest
doctest.run_docstring_examples(parse_rate1,globals(),verbose=True)

Finding tests in NoName
Trying:
    parse_rate1('1.0 L/hr')
Expecting:
    1.0
ok
Trying:
    parse_rate1('10 mg/hr')
Expecting:
    10.0
ok
Trying:
    parse_rate1('0.75 g/day')
Expecting:
    0.75
ok
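
One thing to keep in mind with `find`: it returns `-1` when the separator is missing, and slicing with `-1` silently drops the last character instead of failing. The sketch below illustrates this with a made-up input, `'10mg/hr'`, that has no space. The function `parse_rate1_checked` is a hypothetical variant, not part of the assignment solution.

```python
def parse_rate1_checked(rate):
    """Hypothetical variant of parse_rate1 that rejects input with no space."""
    space_pos = rate.find(' ')
    if space_pos == -1:                      # find() signals "not found" with -1
        raise ValueError("No space between number and units: {!r}".format(rate))
    return float(rate[:space_pos])

bad = '10mg/hr'                 # made-up malformed input
print(bad.find(' '))            # -1
print(bad[:bad.find(' ')])      # 10mg/h  (last character dropped, not a number)
print(parse_rate1_checked('0.75 g/day'))   # 0.75
```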


Solution Option 2 - Using `split`
---

In [5]:
def parse_rate2(rate):
    """ (str) -> float
    Given a string that's in the format '# volume/time' 
    extract the numeric part and return it as a number
    
    >>> parse_rate2('1.0 L/hr')
    1.0
    
    >>> parse_rate2('10 mg/hr')
    10.0
    
    >>> parse_rate2('0.75 g/day')
    0.75
    """
    parts = rate.split(' ')   # Split our string into parts using space as the separator
    return float(parts[0])    # The number part is at index 0 of the parts list

In [6]:
import doctest
doctest.run_docstring_examples(parse_rate2,globals(),verbose=True)

Finding tests in NoName
Trying:
    parse_rate2('1.0 L/hr')
Expecting:
    1.0
ok
Trying:
    parse_rate2('10 mg/hr')
Expecting:
    10.0
ok
Trying:
    parse_rate2('0.75 g/day')
Expecting:
    0.75
ok
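
Because `split` already produces both pieces, a small extension can return the units along with the number, which may be handy for exercise #4. This is only a sketch: `parse_rate_and_units` is a hypothetical name, and calling `split()` with no argument (so that repeated spaces are tolerated) is a design choice not made in the solution above.

```python
def parse_rate_and_units(rate):
    """Hypothetical helper: '45 mg/hr' -> (45.0, 'mg', 'hr')."""
    number, units = rate.split()            # split on any run of whitespace
    volume, time_unit = units.split('/')    # 'mg/hr' -> 'mg', 'hr'
    return float(number), volume, time_unit

print(parse_rate_and_units('45 mg/hr'))     # (45.0, 'mg', 'hr')
print(parse_rate_and_units('0.2  L/hr'))    # (0.2, 'L', 'hr')
```

Unpacking into exactly two names also means malformed input raises a `ValueError` instead of returning a wrong number.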


#4 - Parsing and Rewriting a String
---

Create another function or collection of functions that will take a string in the format `drug # volume/time` such as `Asprin 20 mg/hr`, and return a string in the format `In one hr, the patient will have received 20 mg of Asprin.  Doubling the dosage to 40 mg would be dangerous!`. Be sure that you reuse the other functions you've created in this assignment if possible.

Demonstrate that your function works correctly using these tests:

```
Asprin 20 mg/hr
Amoxicillin 300 mg/day
```

Solution Option 1 - Using `find`
---

In [7]:
def parse_drug1(drug):
    """ (str) -> str
    Given a string in the format 'drug # volume/time', return a sentence that describes the amount
    of drug a person will receive in that unit of time.
    
    >>> parse_drug1('Asprin 20 mg/hr')
    'In one hr, the patient will have received 20 mg of Asprin.  Doubling the dosage to 40 mg would be dangerous!'
    
    >>> parse_drug1('Amoxicillin 300 mg/day')
    'In one day, the patient will have received 300 mg of Amoxicillin.  Doubling the dosage to 600 mg would be dangerous!'
    """
    # Find the first and second spaces
    first_space = drug.find(' ')
    second_space = drug.find(' ', first_space + 1)
    
    # Parse out the three main pieces
    name = drug[:first_space]
    amount = int(drug[first_space + 1:second_space])
    units = drug[second_space + 1:]
    
    # Parse the units into numerator / denominator
    unit_sep_pos = units.find('/')
    volume = units[:unit_sep_pos]
    time = units[unit_sep_pos + 1:]
    
    statement = "In one {}, the patient will have received {} {} of {}.".format(time, amount, volume, name)
    statement += "  Doubling the dosage to {} {} would be dangerous!".format(amount*2, volume)
    return statement

In [8]:
import doctest
doctest.run_docstring_examples(parse_drug1,globals(),verbose=True)

Finding tests in NoName
Trying:
    parse_drug1('Asprin 20 mg/hr')
Expecting:
    'In one hr, the patient will have received 20 mg of Asprin.  Doubling the dosage to 40 mg would be dangerous!'
ok
Trying:
    parse_drug1('Amoxicillin 300 mg/day')
Expecting:
    'In one day, the patient will have received 300 mg of Amoxicillin.  Doubling the dosage to 600 mg would be dangerous!'
ok


Solution Option 2 - Using `split`
---

In [9]:
def parse_drug2(drug):
    """ (str) -> str
    Given a string in the format 'drug # volume/time', return a sentence that describes the amount
    of drug a person will receive in that unit of time.
    
    >>> parse_drug2('Asprin 20 mg/hr')
    'In one hr, the patient will have received 20 mg of Asprin.  Doubling the dosage to 40 mg would be dangerous!'
    
    >>> parse_drug2('Amoxicillin 300 mg/day')
    'In one day, the patient will have received 300 mg of Amoxicillin.  Doubling the dosage to 600 mg would be dangerous!'
    """
    # Split the input using space
    drug_parts = drug.split(' ')
    name = drug_parts[0]
    amount = int(drug_parts[1])
    units = drug_parts[2]
    
    # Parse the units into numerator / denominator
    unit_parts = units.split('/')
    volume = unit_parts[0]
    time = unit_parts[1]
    
    statement = "In one {}, the patient will have received {} {} of {}.".format(time, amount, volume, name)
    statement += "  Doubling the dosage to {} {} would be dangerous!".format(amount*2, volume)
    return statement

In [10]:
import doctest
doctest.run_docstring_examples(parse_drug2,globals(),verbose=True)

Finding tests in NoName
Trying:
    parse_drug2('Asprin 20 mg/hr')
Expecting:
    'In one hr, the patient will have received 20 mg of Asprin.  Doubling the dosage to 40 mg would be dangerous!'
ok
Trying:
    parse_drug2('Amoxicillin 300 mg/day')
Expecting:
    'In one day, the patient will have received 300 mg of Amoxicillin.  Doubling the dosage to 600 mg would be dangerous!'
ok
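
The assignment asks for reuse of earlier functions where possible, which neither option above does. One possible way to do that is sketched below, under a few assumptions: `parse_rate` stands in for `parse_rate2` from exercise #3 and is repeated so the block runs on its own, `parse_drug_reuse` is a hypothetical name, and the `{:g}` format is used so that the float `20.0` prints as `20`.

```python
def parse_rate(rate):
    """Same logic as parse_rate2 above: '20 mg/hr' -> 20.0."""
    return float(rate.split(' ')[0])

def parse_drug_reuse(drug):
    """Hypothetical variant of parse_drug2 that delegates number parsing to parse_rate."""
    name, rate = drug.split(' ', 1)          # 'Asprin', '20 mg/hr'
    amount = parse_rate(rate)                # reuse the exercise #3 function
    volume, time_unit = rate.split(' ')[1].split('/')
    statement = "In one {}, the patient will have received {:g} {} of {}.".format(
        time_unit, amount, volume, name)
    statement += "  Doubling the dosage to {:g} {} would be dangerous!".format(
        amount * 2, volume)
    return statement

print(parse_drug_reuse('Asprin 20 mg/hr'))
```

Since `parse_rate` returns a float, this version would also accept a fractional dosage such as `0.75`, which `int()` in the options above would reject.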


#5 - (Stretch Assignment)
---

If you complete the other assignments above easily, I offer you this additional assignment for extra credit.  If you do not finish #5 it will not count against your grade on this assignment.  If you do complete it correctly, you'll be entered into a prize drawing for something right before the end of the final.

Imagine a family tree with the following general structure:

```person ( mother + father )```

An example would use actual names for the various roles:

```Paul Boal ( Carol Boal + James Boal )```

The structure can also include information about the mother and father embedded within the text, too.

```person ( mother (mother's mother + mother's father) + father (father's mother + father's father))```

An example would be:

```Paul Boal ( Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal ))```

This kind of structure can be arbitrarily deep.  The spaces don't matter. The only significant punctuation marks are the parentheses that enclose `( mother + father )` and the plus sign that separates `mother + father`.  Note that mother always comes first, followed by the plus sign, and then the father.

Write a recursive function that can find an arbitrarily deep request using a phrase like `mother's mother` or `father's mother's mother` to identify the person to look up.  Your function should take the family tree and the request as parameters, and should return the name of the person in that position.

In our example above: `father's father` would return `Harold Boal`


Solution Description
---

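We're going to use a recursive function to do this. A helper, `get_side`, splits a `mother + father` string at the plus sign that sits outside all parentheses. The main function, `get_descendant`, handles the first word of the request, then calls itself on the chosen parent with the rest of the request.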

In [11]:
import logging

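def get_side(whole, part):
    """ (str, int) -> str
    Split `whole` at the '+' that is outside all parentheses and return
    the left side when part is 1, otherwise the right side.
    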
    >>> get_side('a + b',1)
    'a'
    
    >>> get_side('a + b',2)
    'b'
    
    >>> get_side('( a + b ) + ( c + d )',1)
    '( a + b )'

    >>> get_side('junk ( a + b ) + ( c + d )',1)
    'junk ( a + b )'
    
    >>> get_side(' Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )',1)
    'Carol Boal ( Dorothy Greenfield + Howard Greenfield )'
    """
    
    logging.debug("Getting side {} from '{}'".format(part,whole))
    depth = 0
    pos = -1
    for i in range(0,len(whole)):
        if whole[i] == '(':
            depth += 1
        if whole[i] == ')':
            depth -= 1
        if depth == 0 and whole[i] == '+':
            pos = i
    
    if part == 1:
        return whole[:pos].strip()
    else:
        return whole[pos+1:].strip()

In [12]:
import doctest
doctest.run_docstring_examples(get_side,globals(),verbose=True)

Finding tests in NoName
Trying:
    get_side('a + b',1)
Expecting:
    'a'
ok
Trying:
    get_side('a + b',2)
Expecting:
    'b'
ok
Trying:
    get_side('( a + b ) + ( c + d )',1)
Expecting:
    '( a + b )'
ok
Trying:
    get_side('junk ( a + b ) + ( c + d )',1)
Expecting:
    'junk ( a + b )'
ok
Trying:
    get_side(' Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )',1)
Expecting:
    'Carol Boal ( Dorothy Greenfield + Howard Greenfield )'
ok
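
To see why `get_side` tracks `depth`, it can help to list every `+` in a string together with the parenthesis depth at which it occurs. The helper below, `plus_depths`, is a hypothetical illustration and not part of the solution. Only the `+` at depth 0 separates mother from father.

```python
def plus_depths(text):
    """Hypothetical helper: return (index, depth) for every '+' in text."""
    depth = 0
    found = []
    for i, ch in enumerate(text):
        if ch == '(':
            depth += 1          # entering a parenthesized group
        elif ch == ')':
            depth -= 1          # leaving a parenthesized group
        elif ch == '+':
            found.append((i, depth))
    return found

print(plus_depths('( a + b ) + ( c + d )'))   # [(4, 1), (10, 0), (16, 1)]
```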


In [13]:
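def get_descendant(geneology, query):
    """ (str, str) -> str
    Given a family tree and a request such as "mother's father",
    return the name of the person in that position.
    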
    >>> get_descendant('Paul Boal ( Carol Boal + James Boal )', 'mother')
    'Carol Boal'

    >>> get_descendant('Paul Boal ( Carol Boal + James Boal )', 'father')
    'James Boal'
    
    >>> history = 'Paul Boal ( Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal ))'
    >>> get_descendant(history, "mother's mother")
    'Dorothy Greenfield'
    
    >>> get_descendant(history, "mother's father")
    'Howard Greenfield'
    
    >>> get_descendant(history, "father's mother")
    'Velma Boal'
    
    >>> get_descendant(history, "father's father")
    'Harold Boal'
    
    """
    
    parents_pos = geneology.find('(')
    parents = geneology[parents_pos+1:-1]
    logging.debug("Parents are: {}".format(parents))
    
    parent_split = parents.find('+')
    mother = parents[:parent_split]
    father = parents[parent_split + 1:]
    
    if query == 'mother':
        return mother.strip()
    elif query == 'father':
        return father.strip()
    else:
        query_pos = query.find(' ')
        first_query = query[:query_pos]
        remaining_query = query[query_pos+1:]
        logging.debug("Query is: {}".format(first_query))
        if first_query == "mother's":
            parent = get_side(parents, 1)
        elif first_query == "father's":
            parent = get_side(parents, 2)
        else:
            logging.error("Got a request that doesn't make sense: {}".format(first_query))
            return None   # Stop here: `parent` was never set, so continuing would raise an error
        logging.debug("Getting '{}' from '{}'".format(remaining_query, parent))
        return get_descendant(parent, remaining_query)
        
    

In [14]:
import logging
import importlib   # the older `imp` module is deprecated and removed in Python 3.12
importlib.reload(logging)
logging.basicConfig(format='%(asctime)s %(levelname)s:%(message)s', level=logging.DEBUG, datefmt='%I:%M:%S')
history = 'Paul Boal ( Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal ))'
get_descendant(history, "mother's father")

11:31:49 DEBUG:Parents are:  Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )
11:31:49 DEBUG:Query is: mother's
11:31:49 DEBUG:Getting side 1 from ' Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )'
11:31:49 DEBUG:Getting 'father' from 'Carol Boal ( Dorothy Greenfield + Howard Greenfield )'
11:31:49 DEBUG:Parents are:  Dorothy Greenfield + Howard Greenfield 


'Howard Greenfield'

In [15]:
import doctest
doctest.run_docstring_examples(get_descendant,globals(),verbose=True)

11:31:49 DEBUG:Parents are:  Carol Boal + James Boal 
11:31:49 DEBUG:Parents are:  Carol Boal + James Boal 
11:31:49 DEBUG:Parents are:  Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )
11:31:49 DEBUG:Query is: mother's
11:31:49 DEBUG:Getting side 1 from ' Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )'
11:31:49 DEBUG:Getting 'mother' from 'Carol Boal ( Dorothy Greenfield + Howard Greenfield )'
11:31:49 DEBUG:Parents are:  Dorothy Greenfield + Howard Greenfield 
11:31:49 DEBUG:Parents are:  Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )
11:31:49 DEBUG:Query is: mother's
11:31:49 DEBUG:Getting side 1 from ' Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal )'
11:31:49 DEBUG:Getting 'father' from 'Carol Boal ( Dorothy Greenfield + Howard Greenfield )'
11:31:49 DEBUG:Parents are:  Dorothy Greenfield + H

Finding tests in NoName
Trying:
    get_descendant('Paul Boal ( Carol Boal + James Boal )', 'mother')
Expecting:
    'Carol Boal'
ok
Trying:
    get_descendant('Paul Boal ( Carol Boal + James Boal )', 'father')
Expecting:
    'James Boal'
ok
Trying:
    history = 'Paul Boal ( Carol Boal ( Dorothy Greenfield + Howard Greenfield ) + James Boal ( Velma Boal + Harold Boal ))'
Expecting nothing
ok
Trying:
    get_descendant(history, "mother's mother")
Expecting:
    'Dorothy Greenfield'
ok
Trying:
    get_descendant(history, "mother's father")
Expecting:
    'Howard Greenfield'
ok
Trying:
    get_descendant(history, "father's mother")
Expecting:
    'Velma Boal'
ok
Trying:
    get_descendant(history, "father's father")
Expecting:
    'Harold Boal'
ok
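
One limitation worth noting: `get_descendant` locates the mother and father with a plain `find('+')`, so a final `mother` or `father` lookup on a person whose own parents are also listed can split at the wrong plus sign. The sketch below shows one way this might be handled, by always splitting at the top-level `+` and trimming any trailing parenthesized ancestors from the result. The names `split_parents` and `find_ancestor` are hypothetical, and the great-grandparents `A` and `B` in the example tree are made up for illustration.

```python
def split_parents(parents):
    """Split 'mother + father' at the '+' that is outside all parentheses."""
    depth = 0
    for i, ch in enumerate(parents):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch == '+' and depth == 0:
            return parents[:i].strip(), parents[i + 1:].strip()
    raise ValueError("No top-level '+' in: {!r}".format(parents))

def find_ancestor(tree, query):
    """Hypothetical variant of get_descendant for arbitrarily deep trees."""
    start = tree.find('(')
    if start == -1:
        raise ValueError("No parents recorded for: {!r}".format(tree.strip()))
    end = tree.rfind(')')
    mother, father = split_parents(tree[start + 1:end])
    first, _, rest = query.partition(' ')      # "mother's father" -> "mother's", "father"
    if first not in ("mother", "mother's", "father", "father's"):
        raise ValueError("Unrecognized request: {!r}".format(first))
    chosen = mother if first.startswith('mother') else father
    if rest:                                   # more generations to walk
        return find_ancestor(chosen, rest)
    return chosen.split('(')[0].strip()        # keep the name, drop listed ancestors

# Made-up tree: Dorothy's parents 'A' and 'B' are added for illustration
tree = ('Paul Boal ( Carol Boal ( Dorothy Greenfield ( A + B ) + Howard Greenfield )'
        ' + James Boal ( Velma Boal + Harold Boal ))')
print(find_ancestor(tree, "mother"))                    # Carol Boal
print(find_ancestor(tree, "mother's mother"))           # Dorothy Greenfield
print(find_ancestor(tree, "mother's mother's father"))  # B
print(find_ancestor(tree, "father's father"))           # Harold Boal
```

Raising `ValueError` for a malformed request or a missing generation is a design choice in this sketch; the solution above logs an error instead.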



## If you need any help remembering how to commit your work, look here:

```
%%bash
cd ~/hds5210/
git add week03-paulboal.ipynb
git commit -a -m "Adding homework for week 3"
git push
```