<a href="https://colab.research.google.com/github/hfatima2/HDS-assignments/blob/main/week06/week06_assignment_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Week 6 Exercises

_McKinney 6.1_

There are multiple ways to solve the problems below.  For example, you can read CSV files using Pandas or the csv module.  Your score won't depend on which modules you choose unless explicitly noted below, but your programming style will still matter.

### 30.1 List of Allergies

In the /data directory on the Jupyter server, there is a file called `allergies.json` that contains a list of patient allergies.  It is taken from sample data provided by the EHR vendor, Epic, here: https://open.epic.com/Clinical/Allergy

Take some time to look at the structure of the file.  You can open it directly in Jupyter by clicking the _Home_ icon, then the _from_instructor_ folder, and then the _data_ folder.

Within the file, you'll see that it is a dictionary with many items in it.  One of those items is called `entry` and that item is a list of things.  You can tell that because the item name is immediately followed by an opening square bracket, signifying the start of a list.  It's line 11 of the file: `  "entry": [`
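
The structure described above can be explored interactively before writing any function. The sketch below uses a small made-up document (the `sample` string and its `note` key are assumptions, not the contents of `allergies.json`) to show that a JSON object loads as a Python `dict` and a JSON array loads as a `list`.

```python
import json

# Hypothetical stand-in for the real file; only the "entry" key mirrors allergies.json.
sample = '{"note": "made-up example", "entry": [{"id": 1}, {"id": 2}]}'

data = json.loads(sample)              # JSON object -> Python dict
print(type(data).__name__)             # dict
print(sorted(data.keys()))             # ['entry', 'note']
print(type(data["entry"]).__name__)    # list (the opening square bracket in the file)
print(len(data["entry"]))              # 2
```

The same calls work on the object returned by `json.load` for a real file.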

Write a function named `allergy_count(json_file)` that takes one parameter, the name of the JSON file, and returns the number of entries in that file as an integer.  Your function should open the file, read the JSON into a Python object, and return how many items there are in the `entry` list.

In [208]:
import json
ALLERGIES_FILE="allergies.json"

In [209]:
# Use a context manager so the file handle is closed after loading.
with open(ALLERGIES_FILE) as file:
    data = json.load(file)

In [210]:
def allergy_count(json_file):
    """
    Count the number of allergy entries in the provided JSON file.

    Parameters:
        json_file (str): The path to the JSON file containing allergy data.

    Returns:
        int: The number of items in the "entry" list from the JSON file.

    Example:
    >>> allergy_count("allergies.json")
    4
    """
    # Use a context manager so the file is closed after reading.
    with open(json_file) as file:
        data = json.load(file)
    return len(data["entry"])

In [211]:
import doctest
doctest.run_docstring_examples(allergy_count, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_count("allergies.json")
Expecting:
    4
ok


In [212]:
allergy_count(ALLERGIES_FILE)

4

In [213]:
assert type(allergy_count(ALLERGIES_FILE)) == int
assert allergy_count(ALLERGIES_FILE) == 4

### 30.2 Number of Patients

If you dig a little bit deeper into this list of allergies, you'll see that each result has a patient associated with it.  Create a function called `patient_count(json_file)` that will count how many unique patients we have in this JSON structure.

In [214]:
ALLERGIES_FILE="allergies.json"
import json

def patient_count(json_file):
    """
    Count the number of unique patients in the provided JSON file.

    Reads a JSON file containing allergy data, extracts the patient from each
    entry, and returns the count of unique patients. The patient's display
    name is used to determine uniqueness.

    Parameters:
        json_file (str): The path to the JSON file containing allergy data.

    Returns:
        int: The number of unique patients represented in the JSON file.

    Example:
    >>> patient_count("allergies.json")
    2
    """
    with open(json_file) as file:
        data = json.load(file)

    # Extract the list of allergy entries from the "entry" key.
    entries = data["entry"]

    # Use a set to store unique patient display names.
    unique_patients = set()

    # Collect the display name from each entry, skipping entries without one.
    for entry in entries:
        patient_display = entry.get("resource", {}).get("patient", {}).get("display")
        if patient_display:
            unique_patients.add(patient_display)

    return len(unique_patients)



In [215]:
import doctest
doctest.run_docstring_examples(patient_count, globals(), verbose=True)

Finding tests in NoName
Trying:
    patient_count("allergies.json")
Expecting:
    2
ok


In [216]:
type(patient_count)

function

In [217]:
patient_count(ALLERGIES_FILE)

2

In [218]:
assert type(patient_count(ALLERGIES_FILE)) == int
assert patient_count(ALLERGIES_FILE) == 2

### 30.3 How Many Allergies per Patient

Although each entry is a separate allergy, several of them are for the same patient.  Write a function called `allergy_per_patient(json_file)` that counts up how many allergies each patient has.


In [219]:
ALLERGIES_FILE="allergies.json"

def allergy_per_patient(json_file):
    """
    Read a JSON file containing allergy data and return a dictionary
    mapping each patient to the number of allergies they have.

    Parameters:
        json_file (str): The path to the JSON file.

    Returns:
        dict: Keys are patient names and values are the count of their allergies.

    Example:
    >>> allergy_per_patient("allergies.json")
    {'Jason Argonaut': 3, 'Paul Boal': 1}
    """
    with open(json_file) as file:
        data = json.load(file)

    # The "entry" list contains the allergy records.
    entries = data["entry"]

    # Map each patient display name to a running allergy count.
    patient_allergy_count = {}

    for entry in entries:
        patient_display = entry.get("resource", {}).get("patient", {}).get("display")

        # Skip entries with a missing or empty patient name.
        if patient_display:
            # Start at 0 for a new patient, then add one for this entry.
            patient_allergy_count[patient_display] = patient_allergy_count.get(patient_display, 0) + 1

    return patient_allergy_count




In [132]:
allergy_per_patient(ALLERGIES_FILE)

{'Jason Argonaut': 3, 'Paul Boal': 1}

In [133]:
assert type(allergy_per_patient(ALLERGIES_FILE)) == dict
assert allergy_per_patient(ALLERGIES_FILE) == {'Paul Boal': 1, 'Jason Argonaut': 3}

In [220]:
import doctest
doctest.run_docstring_examples(allergy_per_patient, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_per_patient("allergies.json")
Expecting:
    {'Jason Argonaut': 3, 'Paul Boal': 1}
ok
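
As an alternative to incrementing dictionary values by hand, the standard library's `collections.Counter` can do the tallying. The sketch below is one possible variant, not the required solution; `count_by_patient` and the in-memory `entries` list are hypothetical, and the nested `resource` / `patient` / `display` keys are assumed to match the structure used above.

```python
from collections import Counter

def count_by_patient(entries):
    """Tally entries per patient display name, skipping entries without one."""
    names = (
        entry.get("resource", {}).get("patient", {}).get("display")
        for entry in entries
    )
    return dict(Counter(name for name in names if name))

# Made-up entries that only imitate the nesting of the real file.
entries = [
    {"resource": {"patient": {"display": "Patient A"}}},
    {"resource": {"patient": {"display": "Patient B"}}},
    {"resource": {"patient": {"display": "Patient A"}}},
    {"resource": {}},  # no patient recorded, so it is skipped
]
print(count_by_patient(entries))  # {'Patient A': 2, 'Patient B': 1}
```

Converting the `Counter` back to a plain `dict` keeps the return type the same as in the hand-written version.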


### 30.4 Patient Allergies and Reaction

You'll see in the file that each of the items in the `entry` list has several other attributes, including a patient name, a substance text representation, and a reaction manifestation.  Create a function named `allergy_list(json_file)` that builds an output list with the patient name, allergy, and reaction for each `entry`.  The result should contain the following rows (the check at the end of this section expects them in sorted order):

```python
[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis']]
```

You'll notice that the reaction and the manifestation of that reaction are lists.  You only need to capture the first reaction and its first manifestation.  That is, if there is a list of things, just output the first one.
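
Taking just the first item from each nested list can be sketched as below. `first_manifestation` is a hypothetical helper, and the sample `resource` dictionary is invented to imitate the `reaction` / `manifestation` / `text` nesting; it is not copied from `allergies.json`.

```python
def first_manifestation(resource):
    """Return the text of the first manifestation of the first reaction, or None."""
    reactions = resource.get("reaction") or []
    if not reactions:
        return None
    manifestations = reactions[0].get("manifestation") or []
    if not manifestations:
        return None
    return manifestations[0].get("text")

# Invented sample with two reactions; only the first of each list is used.
resource = {
    "reaction": [
        {"manifestation": [{"text": "Hives"}, {"text": "Rash"}]},
        {"manifestation": [{"text": "Itching"}]},
    ]
}
print(first_manifestation(resource))  # Hives
print(first_manifestation({}))        # None
```

Guarding against empty lists is optional for this assignment's data, but it avoids an `IndexError` if an entry has no recorded reaction.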

In [221]:
def allergy_list(json_file):
    """
    Generate a list of allergies from a JSON file.

    Parameters:
        json_file (str): The path to the JSON file containing allergy information.

    Returns:
        list: A sorted list of lists, where each inner list contains a patient's
              name, the substance name, and the first reaction associated with it.

    Example:
    >>> allergy_list(ALLERGIES_FILE)
    [['Jason Argonaut', 'PENICILLIN G', 'Hives'], ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'], ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'], ['Paul Boal', 'PENICILLIN G', 'Bruising']]
    """
    with open(json_file) as file:
        data = json.load(file)
    entries = data["entry"]

    # Collect one [patient, substance, reaction] row per entry.
    rows = []

    for entry in entries:
        resource = entry["resource"]
        patient_name = resource["patient"]["display"]
        substance_name = resource["substance"]["text"]
        # Only the first manifestation of the first reaction is kept.
        reaction = resource["reaction"][0]["manifestation"][0]["text"]
        rows.append([patient_name, substance_name, reaction])

    # Sort so the output order does not depend on the order of entries in the file.
    rows.sort()
    return rows



In [222]:
import doctest
doctest.run_docstring_examples(allergy_list, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_list(ALLERGIES_FILE)
Expecting:
    [['Jason Argonaut', 'PENICILLIN G', 'Hives'], ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'], ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'], ['Paul Boal', 'PENICILLIN G', 'Bruising']]
ok


In [223]:
allergy_list(ALLERGIES_FILE)

[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising']]

In [224]:
assert allergy_list(ALLERGIES_FILE) == [['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising']]


### 30.5 Allergy Reaction

Write a function called `allergy_reaction(json_file,patient,substance)` that takes three parameters and returns the reaction that will happen if the patient takes the specified substance.  You can solve this, in part, by calling your `allergy_list` function inside your new `allergy_reaction` function.

If the substance is not found in the allergy list, the function should return None.
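
One way to sketch the lookup, assuming rows in the `[patient, substance, reaction]` shape shown in 30.4: scan the rows, return the first match, and fall through to `None`. `find_reaction` is a hypothetical name, and the rows are passed in directly so the sketch runs without the JSON file.

```python
def find_reaction(rows, patient, substance):
    """Return the reaction for an exact (patient, substance) match, or None."""
    for name, allergen, reaction in rows:
        if name == patient and allergen == substance:
            return reaction
    return None  # no matching row

# Two of the rows listed in 30.4.
rows = [
    ["Jason Argonaut", "PENICILLIN G", "Hives"],
    ["Paul Boal", "PENICILLIN G", "Bruising"],
]
print(find_reaction(rows, "Paul Boal", "PENICILLIN G"))     # Bruising
print(find_reaction(rows, "Jason Argonaut", "PENICILLIN"))  # None
```

The match is exact, so a partial substance name such as `PENICILLIN` does not match `PENICILLIN G`.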

In [225]:
def allergy_reaction(json_file, patient, substance):
    """
    Retrieve the allergy reaction for a specific patient and substance from a JSON file.

    Parameters:
        json_file (str): The path to the JSON file containing allergy information.
        patient (str): The name of the patient whose allergy reaction is to be retrieved.
        substance (str): The substance (for example, a medication or food) the patient is allergic to.

    Returns:
        str: The reaction for the given patient and substance, or None if not found.

    Example:
    >>> allergy_reaction("allergies.json", "Jason Argonaut", "PENICILLIN G")
    'Hives'

    >>> allergy_reaction("allergies.json", "Jason Argonaut", "SHELLFISH-DERIVED PRODUCTS")
    'Itching'

    >>> allergy_reaction("allergies.json", "Jason Argonaut", "STRAWBERRY")
    'Anaphylaxis'
    """
    with open(json_file) as file:
        data = json.load(file)
    entries = data["entry"]

    # Map (patient, substance) pairs to the first manifestation of the first reaction.
    # A tuple key avoids collisions that plain string concatenation could produce.
    allergy_dict = {}

    for entry in entries:
        resource = entry["resource"]
        patient_name = resource["patient"]["display"]
        substance_name = resource["substance"]["text"]
        reaction = resource["reaction"][0]["manifestation"][0]["text"]
        allergy_dict[(patient_name, substance_name)] = reaction

    # dict.get returns None when the pair is not present, even if the file has no entries.
    return allergy_dict.get((patient, substance))

In [226]:
import doctest
doctest.run_docstring_examples(allergy_reaction, globals(), verbose=True)

Finding tests in NoName
Trying:
    allergy_reaction("allergies.json", "Jason Argonaut", "PENICILLIN G")
Expecting:
    'Hives'
ok
Trying:
    allergy_reaction("allergies.json", "Jason Argonaut", "SHELLFISH-DERIVED PRODUCTS")
Expecting:
    'Itching'
ok
Trying:
    allergy_reaction("allergies.json", "Jason Argonaut", "STRAWBERRY")
Expecting:
    'Anaphylaxis'
ok


In [227]:
allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G')

'Hives'

In [228]:
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G') == 'Hives'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS') == 'Itching'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'STRAWBERRY') == 'Anaphylaxis'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN') is None
assert allergy_reaction(ALLERGIES_FILE, 'Paul Boal', 'PENICILLIN G') == 'Bruising'

---

## Check your work above

If you didn't get them all correct, take a few minutes to think through those that aren't correct.


## Submitting Your Work

In order to submit your work, you'll need to save this notebook file back to GitHub.  To do that in Google Colab:
1. File -> Save a Copy in GitHub
2. Make sure your HDS5210 repository is selected
3. Make sure the file name includes the week number like this: `week06/week06_assignment_2.ipynb`
4. Add a commit message that means something

**Be sure week names are lowercase and use a two digit week number!!**

**Be sure you use the same file name provided by the instructor!!**

