# Week 6 Exercises

_McKinney 6.1_

There are multiple ways to solve the problems below.  For example, you can read CSV files using Pandas or the csv module.  Your score won't depend on which modules you choose unless explicitly noted below, but your programming style will still matter.

### 30.1 List of Allergies

In the /data directory on the Jupyter server, there is a file called `allergies.json` that contains a list of patient allergies.  It is taken from sample data provided by the EHR vendor, Epic, here: https://open.epic.com/Clinical/Allergy

Take some time to look at the structure of the file.  You can open it directly in Jupyter by clicking the _Home_ icon, then the _from_instructor_ folder, and then the _data_ folder.

Within the file, you'll see that it is a dictionary with many items in it.  One of those items is called `entry` and that item is a list of things.  You can tell that because the item name is immediately followed by an opening square bracket, signifying the start of a list.  It's line 11 of the file: `  "entry": [`
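If you'd rather confirm that shape from Python than by eye, here is a minimal sketch.  It uses a small made-up JSON string as a stand-in (the `total` key and the patient names are invented for illustration and are not the contents of `allergies.json`), so treat it as a way to explore, not as the answer.

```python
import json

# Made-up stand-in for the file's contents; the real allergies.json has
# more keys and far more detail inside each entry.
sample_text = """
{
  "total": 2,
  "entry": [
    {"resource": {"patient": {"display": "Patient A"}}},
    {"resource": {"patient": {"display": "Patient B"}}}
  ]
}
"""

# json.loads parses a string; json.load does the same for an open file object.
bundle = json.loads(sample_text)

print(type(bundle).__name__)            # dict
print(list(bundle.keys()))              # ['total', 'entry']
print(type(bundle["entry"]).__name__)   # list
print(len(bundle["entry"]))             # 2
```

The same `type(...)` and `.keys()` checks work on the object you get back from `json.load` when you read the real file.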

Write a function named `allergy_count(json_file)` that takes one parameter, the name of the JSON file, and returns the number of entries in that file as an integer.  Your function should open the file, read the JSON into a Python object, and return how many items there are in the `entry` list.

In [1]:
import json
from pathlib import Path
HOME = str(Path.home())

ALLERGIES_FILE="/data/allergies.json"

In [2]:
### BEGIN SOLUTION
def allergy_count(json_file):
    ''' (json_file) -> (int)
    This function finds the number of items in the list of entries.

    '''
    ## opening the json_file and loading its contents into a Python object

    with open(json_file) as af:
        data = json.load(af)

    ## returning the number of items in the list stored under 'entry'

    return len(data['entry'])
                
### END SOLUTION

In [3]:
allergy_count(ALLERGIES_FILE)

4

In [4]:
assert isinstance(allergy_count(ALLERGIES_FILE), int)
assert allergy_count(ALLERGIES_FILE) == 4

### 30.2 Number of Patients

If you dig a little bit deeper into this list of allergies, you'll see that each result has a patient associated with it.  Create a function called `patient_count(json_file)` that will count how many unique patients we have in this JSON structure.

In [5]:
### BEGIN SOLUTION
def patient_count(json_file):
    ''' (json_file) -> int
    This function takes the json file and returns the number of unique patients.

    '''

    ## opening the json_file and loading its contents into a Python object

    with open(json_file) as af:
        data = json.load(af)

    ## collecting each distinct patient record exactly once

    unique_patients = []
    for item in data['entry']:
        patient = item['resource']['patient']
        if patient not in unique_patients:
            unique_patients.append(patient)

    ## returning the number of unique patients

    return len(unique_patients)
            
### END SOLUTION

In [6]:
patient_count(ALLERGIES_FILE)

2
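Another way to think about "unique" is to let a Python `set` do the de-duplication.  The sketch below assumes the patient's `display` name is enough to tell patients apart, which may not hold in real data where two people can share a name.  The helper name `unique_patient_names` and the sample records are made up for illustration.

```python
def unique_patient_names(bundle):
    """Return the set of distinct patient display names in a parsed bundle."""
    # A set keeps only one copy of each name, so duplicates drop out on their own.
    return {item["resource"]["patient"]["display"] for item in bundle["entry"]}


# Made-up records shaped like the entries described above.
sample_bundle = {
    "entry": [
        {"resource": {"patient": {"display": "Patient A"}}},
        {"resource": {"patient": {"display": "Patient B"}}},
        {"resource": {"patient": {"display": "Patient A"}}},
    ]
}

names = unique_patient_names(sample_bundle)
print(len(names))  # 2
```

Note that the patient records themselves are dictionaries, which can't be put in a set directly.  That is why this sketch pulls out a string first.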

### 30.3 How Many Allergies per Patient

Although each entry is a separate allergy, several of them are for the same patient.  Write a function called `allergy_per_patient(json_file)` that counts up how many allergies each patient has.


In [7]:
### BEGIN SOLUTION
def allergy_per_patient(json_file):
    ''' (json_file) -> dict
    This function returns the count of allergies of each patient.

    '''

    ## opening the json_file and loading its contents into a Python object

    with open(json_file) as af:
        data = json.load(af)

    ## grouping the distinct allergy substances by patient name

    allergies_by_patient = {}
    for item in data['entry']:
        patient_name = item['resource']['patient']['display']
        substance = item['resource']['substance']['text']
        allergies_by_patient.setdefault(patient_name, set()).add(substance)

    ## calculating the number of allergies of each patient

    return {name: len(substances) for name, substances in allergies_by_patient.items()}
### END SOLUTION

In [8]:
allergy_per_patient(ALLERGIES_FILE)

{'Jason Argonaut': 3, 'Paul Boal': 1}
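If you only need a tally, `collections.Counter` from the standard library is a possible shortcut.  One difference to be aware of: the solution above collects substances in a set, so a substance listed twice for the same patient counts once, while this sketch counts every entry.  The helper name `entries_per_patient` and the sample records are made up for illustration.

```python
from collections import Counter


def entries_per_patient(bundle):
    """Count how many entries each patient display name appears in."""
    names = (item["resource"]["patient"]["display"] for item in bundle["entry"])
    # dict(...) turns the Counter into a plain dictionary for display.
    return dict(Counter(names))


# Made-up records shaped like the entries described above.
sample_bundle = {
    "entry": [
        {"resource": {"patient": {"display": "Patient A"}, "substance": {"text": "SUBSTANCE 1"}}},
        {"resource": {"patient": {"display": "Patient B"}, "substance": {"text": "SUBSTANCE 1"}}},
        {"resource": {"patient": {"display": "Patient A"}, "substance": {"text": "SUBSTANCE 2"}}},
    ]
}

counts = entries_per_patient(sample_bundle)
print(counts)  # {'Patient A': 2, 'Patient B': 1}
```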

### 30.4 Patient Allergies and Reaction

You'll see in the file that each of the items in the `entry` list has several other attributes including a patient name, substance text representation, and a reaction manifestation.  Create a function named `allergy_list(json_file)` that will create an output list that has patient name, allergy, and reaction for each `entry`.  The actual result you should get will be:

```python
[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis']]
```

You'll notice that the reaction and the manifestation of that reaction are lists.  You only need to capture the first reaction and the first manifestation of that reaction.  That is, if there is a list of things, just output the first one.
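Taking "the first one" with `[0]` works as long as the list isn't empty.  Here is a small sketch of a more defensive approach.  The helper `first_or_none` and the sample record are made up for illustration; the assignment data may never need this extra care.

```python
def first_or_none(items):
    """Return the first element of a list, or None if it is empty or missing."""
    return items[0] if items else None


# Made-up record with two reactions; the first reaction has two manifestations.
sample_resource = {
    "reaction": [
        {"manifestation": [{"text": "Rash"}, {"text": "Swelling"}]},
        {"manifestation": [{"text": "Nausea"}]},
    ]
}

first_reaction = first_or_none(sample_resource.get("reaction"))
first_manifestation = (
    first_or_none(first_reaction.get("manifestation")) if first_reaction else None
)
reaction_text = first_manifestation["text"] if first_manifestation else None

print(reaction_text)        # Rash
print(first_or_none([]))    # None
```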

In [9]:
import json

### BEGIN SOLUTION
def allergy_list(ALLERGIES_FILE):
    '''(json_file) -> (list)
    This function returns a list of lists, each containing the patient, allergic substance, and reaction.
    
    '''
    
    ## opening the json_file
    
    with open(ALLERGIES_FILE) as af:
        
    ## loading the json file and assigning it to the variable patient_allergy 
        
        patient_allergy = json.load(af)
        
    ## assigning variables output to empty list
        
        output = []
        
    ## looping through each entry in the file
       
        for pat in patient_allergy['entry']:
            
    ## accessing patient from entry
            
            name = (pat['resource']['patient'])
            patient = name['display']
            
    ## accessing substance causing allergy of patient from entry
            
            allergy = (pat['resource']['substance'])
            substance = allergy['text']
            
    ## accessing the reaction of patient from entry
            
            patient_reaction = (pat['resource']['reaction'])
            patient_manifestation = patient_reaction[0]['manifestation']
            reaction = patient_manifestation[0]['text']
            
    ## appending the list of patient, substance, reaction to the output list
            
            output.append([patient,substance, reaction])
            
        return output
### END SOLUTION

In [10]:
output=[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis']]

assert allergy_list(ALLERGIES_FILE) == output


### 30.5 Allergy Reaction

Write a function called `allergy_reaction(json_file,patient,substance)` that takes three parameters and returns the reaction that will happen if the patient takes the specified substance.  Solve this, in part, by calling your `allergy_list` function inside your new `allergy_reaction` function.

If the substance is not found in the allergy list, the function should return None.

In [11]:
import json

### BEGIN SOLUTION
def allergy_reaction(ALLERGIES_FILE, patient, substance):
    ''' (json_file, patient, substance) -> (str)
    This function takes the json_file, patient, substance and returns the reaction of that patient.
    
    '''
    
    ## assigning the allergy_list to output variable
    
    output = allergy_list(ALLERGIES_FILE)
    
    ## looping through each element in output and returning the reaction
    
    for element in output:
        if patient == element[0] and substance == element[1]:
            reaction = element[2]
            return reaction
### END SOLUTION

In [12]:
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G') == 'Hives'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS') == 'Itching'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'STRAWBERRY') == 'Anaphylaxis'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN') is None
assert allergy_reaction(ALLERGIES_FILE, 'Paul Boal', 'PENICILLIN G') == 'Bruising'
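If you expected to look up many reactions, one option is to turn the rows from `allergy_list` into a dictionary keyed by `(patient, substance)`, so each lookup doesn't loop over the whole list.  The sketch below uses the expected output from 30.4 as its input.  `build_reaction_lookup` is a hypothetical helper name and not part of the assignment.

```python
def build_reaction_lookup(rows):
    """Map (patient, substance) pairs to the reaction for that pair."""
    return {(patient, substance): reaction for patient, substance, reaction in rows}


# The expected allergy_list output from 30.4.
rows = [
    ["Jason Argonaut", "PENICILLIN G", "Hives"],
    ["Paul Boal", "PENICILLIN G", "Bruising"],
    ["Jason Argonaut", "SHELLFISH-DERIVED PRODUCTS", "Itching"],
    ["Jason Argonaut", "STRAWBERRY", "Anaphylaxis"],
]

lookup = build_reaction_lookup(rows)

# dict.get returns None when the key is missing, matching the requirement above.
print(lookup.get(("Jason Argonaut", "STRAWBERRY")))  # Anaphylaxis
print(lookup.get(("Jason Argonaut", "PENICILLIN")))  # None
```

If the same patient and substance appeared in more than one row, this dictionary would keep only the last reaction, while the loop in the solution returns the first.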

---
---

# Stretch (Extra) Problem

Work on either of the stretch problems below can earn you up to 25 free points toward the midterm assignment.  That is, if you complete one of these extra problems successfully, you can skip 1 of the problems that will appear on the midterm exam coming up next week.

The midterm will be distributed 10/14 and due 10/24.



---
---

### STRETCH for October 2022 - For those looking for an additional challenge

As I've mentioned in class, CMS is now enforcing a rule around price transparency.  Every facility that takes Medicare payments is required to publish a "machine readable" file with its pricing information for a number of common procedures across all of the payers they work with.  There are two examples of such files in the `/data/` directory: `whiteriver.json` and `saline.xml`.

If you want to compare contracted prices across these two hospitals, you'll need to read in the information from both of those files into some kind of data structure, then merge the data together from those two files.  See what you can do.

See if you can create an output file that has the following fields:
* HOSPITAL
* PROCEDURE_CODE
* PAYER
* AMOUNT
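
One possible shape for that output is a CSV with one row per hospital, procedure, and payer.  The sketch below only shows the writing step, using invented rows.  It says nothing about how `whiteriver.json` or `saline.xml` are actually structured, and `write_prices` is a hypothetical helper name.

```python
import csv
import io

FIELDS = ["HOSPITAL", "PROCEDURE_CODE", "PAYER", "AMOUNT"]

# Invented rows for illustration only; real values would come from the two files.
rows = [
    {"HOSPITAL": "Hospital A", "PROCEDURE_CODE": "00000", "PAYER": "Payer X", "AMOUNT": 100.0},
    {"HOSPITAL": "Hospital B", "PROCEDURE_CODE": "00000", "PAYER": "Payer X", "AMOUNT": 125.5},
]


def write_prices(rows, out):
    """Write rows (dictionaries keyed by FIELDS) as CSV to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_prices(rows, buffer)
print(buffer.getvalue())
```

To write a real file instead, you could pass `open("prices.csv", "w", newline="")` in place of the buffer; the file name is just an example.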

If you choose to work on this, you may get stuck at some point and you won't know if you're _doing it right_. Make some assumptions. Document your questions in this notebook.



```
Procedure Code |  Description  |  Gross Charges  |  Aetna  |  QualChoice
```

---

## Check your work above

If you didn't get them all correct, take a few minutes to think through those that aren't correct.


## Submitting Your Work

In order to submit your work, you'll need to use the `git` command line program to **add** your homework file (this file) to your local repository, **commit** your changes to your local repository, and then **push** those changes up to github.com.  From there, I'll be able to **pull** the changes down and do my grading.  I'll provide some feedback, **commit** and **push** my comments back to you.  Next week, I'll show you how to **pull** down my comments.

First run through everything one last time and submit your work:
1. Use the `Kernel` -> `Restart Kernel and Run All Cells` menu option to run everything from top to bottom and stop here.
2. Then open a new command line by clicking the `+` icon above the file list and choosing `Terminal`.
3. At the command line in the new Terminal, follow these steps:
  1. Change directories to your project folder and the week03 subfolder (`cd <folder name>`)
  2. Make sure your project folders are up to date with github.com (`git pull`)
  3. Add the homework files for this week (`git add <file name>`)
  4. Commit your changes (`git commit -a -m "message"`)
  5. Push your changes (`git push`)
  
If anything fails along the way with this submission part of the process, let me know.  I'll help you troubleshoot.