# Week 6 Exercises

_McKinney 6.1_

There are multiple ways to solve the problems below.  For example, you can read CSV files using Pandas or the csv module.  Your score won't depend on which modules you choose to use unless explicitly noted below, but your programming style will still matter.

### 30.1 List of Allergies

In this GitHub repository, there is a file called `allergies.json` that contains a list of patient allergies.  You will need to download this [file from here](https://raw.githubusercontent.com/paulboal/hds5210-2023/main/week06/allergies.json) and then upload it into Google Colab to run these examples. It is taken from sample data provided by the EHR vendor, Epic, here: https://open.epic.com/Clinical/Allergy

Take some time to look at the structure of the file.  You can open it directly in Jupyter by clicking the _Home_ icon, then the _from_instructor_ folder, and then the _data_ folder.

Within the file, you'll see that it is a dictionary with many items in it.  One of those items is called `entry` and that item is a list of things.  You can tell that because the item name is immediately followed by an opening square bracket, signifying the start of a list.  It's line 11 of the file: `  "entry": [`

Write a function named `allergy_count(json_file)` that takes as one parameter the name of the JSON file and returns an integer number of entries in that file.  Your function should open the file, read the json into a Python object, and return how many items there are in the list of `entry`s.
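
As a rough sketch of the file-based approach described above, the standard-library `json` module can load the file into a dictionary, and `len` then gives the number of entries. The helper name `count_entries` and the tiny two-entry bundle written to a temporary file are assumptions made for illustration; they are not the real `allergies.json`.

```python
import json
import os
import tempfile


def count_entries(json_file):
    """Return the number of items in the top-level 'entry' list of a JSON file."""
    with open(json_file) as f:
        bundle = json.load(f)
    return len(bundle['entry'])


# Hypothetical stand-in for allergies.json with two empty entries.
sample = {'resourceType': 'Bundle', 'entry': [{'resource': {}}, {'resource': {}}]}
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as tmp:
    json.dump(sample, tmp)

sample_count = count_entries(tmp.name)
os.remove(tmp.name)  # clean up the temporary file
print(sample_count)
```

Because `len` always returns an `int`, no extra conversion is needed to meet the "returns an integer" requirement.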

In [70]:
# Load the JSON file into a pandas DataFrame.
# Each row of the 'entry' column holds one allergy record.
import pandas as pd
ALLERGIES_FILE = pd.read_json('allergies.json')
print(type(ALLERGIES_FILE))
ALLERGIES_FILE

<class 'pandas.core.frame.DataFrame'>


Unnamed: 0,entry
0,{'resource': {'resourceType': 'AllergyIntolera...
1,{'resource': {'resourceType': 'AllergyIntolera...
2,{'resource': {'resourceType': 'AllergyIntolera...
3,{'resource': {'resourceType': 'AllergyIntolera...


In [71]:
def allergy_count(allergy_data):
  """(pandas.core.frame.DataFrame) -> int
  Count the allergies in the data. Each item in the 'entry' column is one
  allergy record, so the number of allergies is the length of that column.

  >>> allergy_count(ALLERGIES_FILE)
  4

  >>> type(allergy_count(ALLERGIES_FILE))
  <class 'int'>

  """
  return len(allergy_data['entry'])

In [72]:
allergy_count(ALLERGIES_FILE)

4

In [73]:
assert type(allergy_count(ALLERGIES_FILE)) == int
assert allergy_count(ALLERGIES_FILE) == 4

In [75]:
import doctest
doctest.run_docstring_examples(allergy_count, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_count(ALLERGIES_FILE)
Expecting:
    4
ok
Trying:
    type(allergy_count(ALLERGIES_FILE))
Expecting:
    <class 'int'>
ok


### 30.2 Number of Patients

If you dig a little bit deeper into this list of allergies, you'll see that each result has a patient associated with it.  Create a function called `patient_count(json_file)` that will count how many unique patients we have in this JSON structure.
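
One way to think about "unique" is a Python `set`, which keeps a single copy of each value. The sketch below assumes the data has already been loaded into a plain dictionary (for example with `json.load`); the helper name `unique_patient_count` and the sample names are made up for illustration.

```python
def unique_patient_count(bundle):
    """Count the distinct patient display names across all entries."""
    names = {entry['resource']['patient']['display'] for entry in bundle['entry']}
    return len(names)


# Hypothetical data: three entries that belong to two patients.
sample_bundle = {'entry': [
    {'resource': {'patient': {'display': 'Patient A'}}},
    {'resource': {'patient': {'display': 'Patient B'}}},
    {'resource': {'patient': {'display': 'Patient A'}}},
]}

unique_total = unique_patient_count(sample_bundle)
print(unique_total)  # 2
```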

In [76]:
# Put your solution here
def patient_count(json_file):
  """(pandas.core.frame.DataFrame) -> int
  Count the unique patients in the data. Collect the patient display name
  from each item in 'entry' into a set, which keeps only one copy of each
  name, and return the size of that set.

  >>> patient_count(ALLERGIES_FILE)
  2
  """
  patient_names = set()
  for entry in json_file['entry']:
    patient_names.add(entry['resource']['patient']['display'])
  return len(patient_names)


In [21]:
patient_count(ALLERGIES_FILE)

2

In [22]:
assert type(patient_count(ALLERGIES_FILE)) == int
assert patient_count(ALLERGIES_FILE) == 2

In [77]:
import doctest
doctest.run_docstring_examples(patient_count, globals(), verbose=True)

Finding tests in NoName
Trying:
    patient_count(ALLERGIES_FILE)
Expecting:
    2
ok


### 30.3 How Many Allergies per Patient

Although each entry is a separate allergy, several of them are for the same patient.  Write a function called `allergy_per_patient(json_file)` that counts up how many allergies each patient has.
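
Counting occurrences per key is what `collections.Counter` from the standard library is for. The sketch below assumes the data is a plain dictionary with the same nesting as the file; the helper name `count_per_patient` and the sample names are hypothetical.

```python
from collections import Counter


def count_per_patient(bundle):
    """Return a dict that maps each patient display name to a number of entries."""
    names = (entry['resource']['patient']['display'] for entry in bundle['entry'])
    return dict(Counter(names))


# Hypothetical data: Patient A appears twice, Patient B once.
sample_bundle = {'entry': [
    {'resource': {'patient': {'display': 'Patient A'}}},
    {'resource': {'patient': {'display': 'Patient B'}}},
    {'resource': {'patient': {'display': 'Patient A'}}},
]}

per_patient = count_per_patient(sample_bundle)
print(per_patient)  # {'Patient A': 2, 'Patient B': 1}
```

A `Counter` remembers the order in which keys were first seen, so the printed result is the same on every run. Building the keys from a `set` would not give that guarantee.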


In [78]:
# Put your solution here
def allergy_per_patient(json_file):
  """(pandas.core.frame.DataFrame) -> dict
  Count how many allergies each patient has. Walk through the items in
  'entry' in order and add one to that patient's count in a dictionary,
  starting each new patient at zero. Counting in entry order keeps the
  order of the keys the same on every run.

  >>> allergy_per_patient(ALLERGIES_FILE)
  {'Jason Argonaut': 3, 'Paul Boal': 1}
  """
  allergies_per_person = {}
  for entry in json_file['entry']:
    patient = entry['resource']['patient']['display']
    allergies_per_person[patient] = allergies_per_person.get(patient, 0) + 1
  return allergies_per_person
allergy_per_patient(ALLERGIES_FILE)

{'Jason Argonaut': 3, 'Paul Boal': 1}

In [79]:
allergy_per_patient(ALLERGIES_FILE)

{'Jason Argonaut': 3, 'Paul Boal': 1}

In [80]:
assert type(allergy_per_patient(ALLERGIES_FILE)) == dict
assert allergy_per_patient(ALLERGIES_FILE) == {'Paul Boal': 1, 'Jason Argonaut': 3}

In [81]:
import doctest
doctest.run_docstring_examples(allergy_per_patient, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_per_patient(ALLERGIES_FILE)
Expecting:
    {'Jason Argonaut': 3, 'Paul Boal': 1}
ok


### 30.4 Patient Allergies and Reaction

You'll see in the file that each of the items in the `entry` list has several other attributes including a patient name, substance text representation, and a reaction manifestation.  Create a function named `allergy_list(json_file)` that will create an output list that has patient name, allergy, and reaction for each `entry`.  The actual result you should get will be:

```python
[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis']]
```

You'll notice that the reaction and the manifestation of that reaction are both lists.  You only need to capture the first reaction and the first manifestation of that reaction.  That is, if there is a list of things, just output the first one.
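
Taking "the first one" from each nested list comes down to indexing with `[0]` at both levels. The sketch below uses a made-up resource with the same nesting described above; the helper name `first_manifestation` and all sample values are hypothetical.

```python
def first_manifestation(resource):
    """Return the text of the first manifestation of the first reaction."""
    return resource['reaction'][0]['manifestation'][0]['text']


# Hypothetical resource with two reactions; the first has two manifestations.
sample_resource = {
    'patient': {'display': 'Patient A'},
    'substance': {'text': 'SUBSTANCE X'},
    'reaction': [
        {'manifestation': [{'text': 'Rash'}, {'text': 'Swelling'}]},
        {'manifestation': [{'text': 'Nausea'}]},
    ],
}

row = [
    sample_resource['patient']['display'],
    sample_resource['substance']['text'],
    first_manifestation(sample_resource),
]
print(row)  # ['Patient A', 'SUBSTANCE X', 'Rash']
```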

In [86]:
# Put your solution here
def allergy_list(json_file):
  """(pandas.core.frame.DataFrame) -> list
  Build a list with one [patient name, substance, reaction] item per entry.
  For each item in 'entry', take the patient display name, the substance
  text, and the text of the first manifestation of the first reaction.

  >>> allergy_list(ALLERGIES_FILE)  # doctest: +NORMALIZE_WHITESPACE
  [['Jason Argonaut', 'PENICILLIN G', 'Hives'],
  ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
  ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
  ['Paul Boal', 'PENICILLIN G', 'Bruising']]

  """
  allergy_lists = []
  for entry in json_file['entry']:
    resource = entry['resource']
    allergy_lists.append([
        resource['patient']['display'],
        resource['substance']['text'],
        resource['reaction'][0]['manifestation'][0]['text'],
    ])
  return allergy_lists

In [87]:
allergy_list(ALLERGIES_FILE)

[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising']]

In [88]:
assert allergy_list(ALLERGIES_FILE) == [['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising']]


In [89]:
import doctest
doctest.run_docstring_examples(allergy_list, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_list(ALLERGIES_FILE)  # doctest: +NORMALIZE_WHITESPACE
Expecting:
    [['Jason Argonaut', 'PENICILLIN G', 'Hives'],
    ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
    ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
    ['Paul Boal', 'PENICILLIN G', 'Bruising']]
ok


### 30.5 Allergy Reaction

Write a function called `allergy_reaction(json_file,patient,substance)` that takes three parameters and returns the reaction that will happen if the patient takes the specified substance.  You can solve this, in part, by calling your `allergy_list` function inside your new `allergy_reaction` function.

If the substance is not found in the allergy list, the function should return None.
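
The lookup itself can be sketched as a loop over rows of `[patient, substance, reaction]` that returns as soon as both values match and falls through to `None` otherwise. The helper name `find_reaction` and the sample rows are hypothetical, and the exact-match comparison is an assumption about how strict the lookup should be.

```python
def find_reaction(rows, patient, substance):
    """Return the reaction for an exact patient and substance match, else None."""
    for name, allergen, reaction in rows:
        if name == patient and allergen == substance:
            return reaction
    return None


# Hypothetical rows in the same [patient, substance, reaction] shape.
sample_rows = [
    ['Patient A', 'SUBSTANCE X', 'Rash'],
    ['Patient B', 'SUBSTANCE Y', 'Itching'],
]

print(find_reaction(sample_rows, 'Patient A', 'SUBSTANCE X'))  # Rash
print(find_reaction(sample_rows, 'Patient A', 'SUBSTANCE'))    # None
```

Under this assumption a partial name such as `'SUBSTANCE'` does not match `'SUBSTANCE X'`.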

In [90]:
# Put your solution here
def allergy_reaction(json_file, patient, substance):
  """(pandas.core.frame.DataFrame, str, str) -> str or None
  Return the reaction the patient has to the given substance. Build the
  [patient name, substance, reaction] list with allergy_list and return the
  reaction from the first item where both the patient name and the substance
  match exactly. If nothing matches, return None.

  >>> allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN') is None
  True
  """
  for name, allergen, reaction in allergy_list(json_file):
    if name == patient and allergen == substance:
      return reaction
  return None

In [91]:
allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G')

'Hives'

In [92]:
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G') == 'Hives'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS') == 'Itching'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'STRAWBERRY') == 'Anaphylaxis'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN') is None
assert allergy_reaction(ALLERGIES_FILE, 'Paul Boal', 'PENICILLIN G') == 'Bruising'

In [93]:
import doctest
doctest.run_docstring_examples(allergy_reaction, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN') is None
Expecting:
    True
ok


---

## Check your work above

If you didn't get them all correct, take a few minutes to think through those that aren't correct.
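
When checking your work, keep in mind that a doctest compares printed text, not values, so a correct function can still fail its doctest. The sketch below illustrates three common cases with a made-up helper named `lookup`; the function and its data are hypothetical and exist only to show how the expected output needs to be written.

```python
import doctest


def lookup(table, key):
    """Return table[key], or None when the key is missing.

    The interpreter shows the repr of a type, not its bare name:
    >>> type(lookup({'a': 1}, 'a'))
    <class 'int'>

    A bare None result prints nothing, so test it explicitly:
    >>> lookup({'a': 1}, 'b') is None
    True

    Expected output split over several lines needs NORMALIZE_WHITESPACE:
    >>> lookup({'rows': [[1, 2], [3, 4]]}, 'rows')  # doctest: +NORMALIZE_WHITESPACE
    [[1, 2],
     [3, 4]]
    """
    return table.get(key)


# Run only the examples in lookup's docstring and collect the totals.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(lookup, 'lookup', globs={'lookup': lookup}):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

Dictionaries are a fourth case: the expected text must list the keys in the same order the function produces them.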


## Submitting Your Work

In order to submit your work, you'll need to save this notebook file back to GitHub.  To do that in Google Colab:
1. File -> Save a Copy in GitHub
2. Make sure your HDS5210 repository is selected
3. Make sure the file name includes the week number like this: `week06/week06_assignment_2.ipynb`
4. Add a commit message that means something

**Be sure week names are lowercase and use a two digit week number!!**

**Be sure you use the same file name provided by the instructor!!**

