
# Week 6 Exercises

_McKinney 6.1_

There are multiple ways to solve the problems below.  For example, you can read CSV files using Pandas or the csv module.  Your score won't depend on which modules you choose unless explicitly noted below, but your programming style will still matter.

### 30.1 List of Allergies

In the /data directory on the Jupyter server, there is a file called `allergies.json` that contains a list of patient allergies.  It is taken from sample data provided by the EHR vendor, Epic, here: https://open.epic.com/Clinical/Allergy

Take some time to look at the structure of the file.  You can open it directly in Jupyter by clicking the _Home_ icon, then the _from_instructor_ folder, and then the _data_ folder.

Within the file, you'll see that it is a dictionary with many items in it.  One of those items is called `entry` and that item is a list of things.  You can tell that because the item name is immediately followed by an opening square bracket, signifying the start of a list.  It's line 11 of the file: `  "entry": [`
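
One quick way to confirm that structure is to load a JSON document and look at the Python type of each top-level item. The sketch below uses a small in-memory stand-in rather than the real file: `sample_text` is hypothetical, and every key in it other than `entry` is an assumption made only for illustration.

```python
import json

# Hypothetical, heavily trimmed stand-in for allergies.json. Only the
# "entry" key comes from the assignment text; the rest is illustrative.
sample_text = """
{
  "resourceType": "Bundle",
  "total": 2,
  "entry": [
    {"resource": {"patient": {"display": "Patient A"}}},
    {"resource": {"patient": {"display": "Patient B"}}}
  ]
}
"""

data = json.loads(sample_text)

# Map each top-level key to the name of the Python type json produced for it.
top_level_types = {key: type(value).__name__ for key, value in data.items()}
print(top_level_types)

# A JSON array (square brackets) is parsed into a Python list.
entry_count = len(data["entry"])
print(entry_count)
```

With a real file, `json.load(f)` on an open file handle plays the role of `json.loads(sample_text)` here.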

Write a function named `allergy_count(json_file)` that takes one parameter, the name of the JSON file, and returns the number of entries in that file as an integer.  Your function should open the file, read the JSON into a Python object, and return how many items there are in the `entry` list.

In [176]:
import json
ALLERGIES_FILE="allergies.json"

In [177]:
import json

def allergy_count(json_file):
    """
    Counts the number of allergy entries in the JSON file.

    Args:
        json_file (str): The name of the JSON file containing allergies data.

    Returns:
        int: The number of allergy entries.

    Example:
        >>> count = allergy_count("allergies.json")
        >>> print(count)  # This will print the total number of allergy entries.
    """
    # Open the JSON file and read its contents
    with open(json_file, 'r') as f:
        data = json.load(f)  # Load the JSON data into a Python dictionary

    # Return the number of items in the "entry" list
    return len(data["entry"])


In [178]:
allergy_count(ALLERGIES_FILE)

4

In [179]:
assert type(allergy_count(ALLERGIES_FILE)) == int
assert allergy_count(ALLERGIES_FILE) == 4

### 30.2 Number of Patients

If you dig a little deeper into this list of allergies, you'll see that each result has a patient associated with it.  Create a function called `patient_count(json_file)` that counts how many unique patients there are in this JSON structure.

In [180]:
def patient_count(json_file):
    """
    Counts the number of unique patients in the JSON file containing allergy data.

    Args:
        json_file (str): The name of the JSON file containing allergy data.

    Returns:
        int: The number of unique patients.

    Example:
        >>> count = patient_count("allergies.json")
        >>> print(count)  # This will print the number of unique patients.
    """
    unique_patients = set()  # Use a set to store unique patient names

    # Open the JSON file and read its contents
    with open(json_file, "r") as f:
        data = json.load(f)  # Load the JSON data into a Python dictionary

        # Now we need to iterate through each entry in the "entry" list
        for entry in data["entry"]:
            patient = entry["resource"]["patient"]["display"]  # Get the patient name
            unique_patients.add(patient)  # Add the patient to the set (only unique values are stored)

    # Return the number of unique patients
    return len(unique_patients)


In [181]:
patient_count(ALLERGIES_FILE)

2

In [182]:
assert type(patient_count(ALLERGIES_FILE)) == int
assert patient_count(ALLERGIES_FILE) == 2
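
The loop-and-add pattern above can also be written as a set comprehension. This is a minimal sketch, assuming the same `entry` / `resource` / `patient` / `display` nesting that `patient_count` relies on. The helper name `unique_patient_names` and the `sample` data are hypothetical.

```python
def unique_patient_names(data):
    """Return the set of distinct patient display names in a parsed bundle."""
    return {entry["resource"]["patient"]["display"] for entry in data["entry"]}


# Hypothetical in-memory data standing in for the parsed JSON file.
sample = {
    "entry": [
        {"resource": {"patient": {"display": "Patient A"}}},
        {"resource": {"patient": {"display": "Patient B"}}},
        {"resource": {"patient": {"display": "Patient A"}}},
    ]
}

names = unique_patient_names(sample)
print(len(names))
```

Because a set silently discards duplicates, the repeated "Patient A" entry contributes only once to the count.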

### 30.3 How Many Allergies per Patient

Although each entry is a separate allergy, several of them are for the same patient.  Write a function called `allergy_per_patient(json_file)` that counts up how many allergies each patient has.


In [183]:
def allergy_per_patient(json_file):
    """
    Counts how many allergies each patient has.

    Args:
        json_file (str): The name of the JSON file containing allergy data.

    Returns:
        dict: A dictionary where the keys are patient names and the values are counts of their allergies.

    Example:
        >>> allergies_count = allergy_per_patient("allergies.json")
        >>> print(allergies_count)  # This will print a dictionary of patients and their allergy counts.
    """
    allergy_counts = {}  # Empty dictionary to store the count for each patient

    # Open the JSON file and read its contents
    with open(json_file, "r") as f:
        data = json.load(f)  # Load the JSON data into a Python dictionary

        # Now iterate through each entry in the "entry" list
        for entry in data["entry"]:
            patient = entry["resource"]["patient"]["display"]  # Get the patient name

            # Update the count for this patient
            if patient in allergy_counts:
                allergy_counts[patient] += 1  # Increment count if patient exists
            else:
                allergy_counts[patient] = 1  # Initialize count if patient is new

    return allergy_counts  # Return the dictionary of allergy counts


In [184]:
allergy_per_patient(ALLERGIES_FILE)

{'Jason Argonaut': 3, 'Paul Boal': 1}

In [185]:
assert type(allergy_per_patient(ALLERGIES_FILE)) == dict
assert allergy_per_patient(ALLERGIES_FILE) == {'Paul Boal': 1, 'Jason Argonaut': 3}
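
The if/else counting step above is a common enough pattern that the standard library offers `collections.Counter` for it. The sketch below is one possible alternative, not a required approach. The helper name `count_by_patient` and the `sample` data are hypothetical, and the key nesting is assumed to match the function above.

```python
from collections import Counter


def count_by_patient(data):
    """Count entries per patient display name in a parsed bundle."""
    counts = Counter(
        entry["resource"]["patient"]["display"] for entry in data["entry"]
    )
    # Convert to a plain dict so a type(...) == dict check still passes.
    return dict(counts)


# Hypothetical in-memory data standing in for the parsed JSON file.
sample = {
    "entry": [
        {"resource": {"patient": {"display": "Patient A"}}},
        {"resource": {"patient": {"display": "Patient B"}}},
        {"resource": {"patient": {"display": "Patient A"}}},
    ]
}

per_patient = count_by_patient(sample)
print(per_patient)
```

The `dict(...)` conversion matters here because `Counter` is a subclass of `dict`, so an exact type comparison against `dict` would otherwise fail.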

### 30.4 Patient Allergies and Reaction

You'll see in the file that each item in the `entry` list has several other attributes, including a patient name, a text representation of the substance, and a reaction manifestation.  Create a function named `allergy_list(json_file)` that builds an output list containing the patient name, allergy, and reaction for each `entry`.  The result should contain the following rows (the check at the end of this section expects them in sorted order):

```python
[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis']]
```

You'll notice that the reaction and the manifestation of that reaction are lists.  You only need to capture the first reaction and the first manifestation of that reaction.  That is, if there is a list of things, just output the first one.
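
To make the nesting concrete, the sketch below pulls the first manifestation of the first reaction out of a single hand-written entry. The `entry` dictionary is hypothetical. It assumes the `resource` / `reaction` / `manifestation` / `text` layout described above, and its values are made up.

```python
# Hypothetical entry with two reactions; the first reaction has two manifestations.
entry = {
    "resource": {
        "patient": {"display": "Patient A"},
        "substance": {"text": "SUBSTANCE X"},
        "reaction": [
            {"manifestation": [{"text": "First"}, {"text": "Second"}]},
            {"manifestation": [{"text": "Third"}]},
        ],
    }
}

resource = entry["resource"]

# Index 0 selects the first item at each list level.
first_reaction = resource["reaction"][0]
first_manifestation = first_reaction["manifestation"][0]["text"]

row = [
    resource["patient"]["display"],
    resource["substance"]["text"],
    first_manifestation,
]
print(row)
```

Everything after the first item at each level ("Second" and "Third" here) is ignored.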

In [186]:
def allergy_list(json_file):
    """
    Creates a list of allergies for each patient, including the patient's name, substance they are allergic to,
    and their reaction.

    Args:
        json_file (str): The name of the JSON file containing allergy data.

    Returns:
        list: A sorted list of lists, where each inner list contains the patient's name,
              the substance they are allergic to, and the first reaction.

    Example:
        >>> allergies = allergy_list("allergies.json")
        >>> print(allergies)
        # This will print a list of allergies like the following:
        # [['Jason Argonaut', 'PENICILLIN G', 'Hives'], ...]
    """
    result = []  # Initialize an empty list to store the results

    # Open the JSON file and read its contents
    with open(json_file) as file:
        data = json.load(file)  # Load the JSON data into a Python dictionary

    # Iterate through each entry in the "entry" list
    for entry in data['entry']:
        patient = entry['resource']['patient']['display']  # Get the patient name
        allergy = entry['resource']['substance']['text']  # Get the allergy substance
        reaction = entry['resource']['reaction'][0]['manifestation'][0]['text']  # Get the first reaction

        # Append the patient's name, allergy, and reaction as a list to the result
        result.append([patient, allergy, reaction])

    return sorted(result)  # Return the sorted list of results


In [187]:
allergy_list(ALLERGIES_FILE)

[['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising']]

In [188]:
assert allergy_list(ALLERGIES_FILE) == [['Jason Argonaut', 'PENICILLIN G', 'Hives'],
 ['Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS', 'Itching'],
 ['Jason Argonaut', 'STRAWBERRY', 'Anaphylaxis'],
 ['Paul Boal', 'PENICILLIN G', 'Bruising']]


### 30.5 Allergy Reaction

Write a function called `allergy_reaction(json_file,patient,substance)` that takes three parameters and returns the reaction that will happen if the patient takes the specified substance.  You can solve this, in part, by calling your `allergy_list` function inside your new `allergy_reaction` function.

If the substance is not found in the allergy list, the function should return None.

In [189]:
def allergy_reaction(json_file, patient, substance):
    """
    Retrieves the reaction for a specific patient's allergy to a specified substance.

    Args:
        json_file (str): The name of the JSON file containing allergy data.
        patient (str): The name of the patient whose allergy reaction is to be retrieved.
        substance (str): The name of the substance to check for allergies.

    Returns:
        str or None: The reaction the patient has to the specified substance,
                     or None if the patient does not have an allergy to that substance.

    Example:
        >>> reaction = allergy_reaction("allergies.json", "Jason Argonaut", "PENICILLIN G")
        >>> print(reaction)  # This will print 'Hives'
    """
    allergies = allergy_list(json_file)  # Get the list of allergies
    for record in allergies:
        # Check if the record matches the specified patient and substance
        if record[0] == patient and record[1] == substance:
            return record[2]  # Return the reaction if found
    return None  # Return None if no match is found


In [190]:
allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G')

'Hives'

In [191]:
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN G') == 'Hives'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'SHELLFISH-DERIVED PRODUCTS') == 'Itching'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'STRAWBERRY') == 'Anaphylaxis'
assert allergy_reaction(ALLERGIES_FILE, 'Jason Argonaut', 'PENICILLIN') is None
assert allergy_reaction(ALLERGIES_FILE, 'Paul Boal', 'PENICILLIN G') == 'Bruising'
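
If many lookups are needed against the same list, one alternative to scanning it every time is to index the rows by `(patient, substance)` once and use `dict.get`, which returns `None` for a missing key. This is only a sketch under assumptions: the helper name `build_reaction_index` and the `rows` data are hypothetical, and the rows are assumed to have the three-item shape that `allergy_list` produces.

```python
def build_reaction_index(rows):
    """Map (patient, substance) pairs to a reaction.

    If the same pair appears more than once, the last row wins,
    unlike a linear scan that returns the first match.
    """
    return {(patient, substance): reaction for patient, substance, reaction in rows}


# Hypothetical rows in the [patient, substance, reaction] shape.
rows = [
    ["Patient A", "SUBSTANCE X", "Hives"],
    ["Patient B", "SUBSTANCE X", "Bruising"],
]

index = build_reaction_index(rows)

found = index.get(("Patient A", "SUBSTANCE X"))
missing = index.get(("Patient A", "SUBSTANCE Y"))  # no such pair, so None
print(found, missing)
```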

---

## Check your work above

If you didn't get them all correct, take a few minutes to think through those that aren't correct.


## Submitting Your Work

To submit your work, you'll need to save this notebook file back to GitHub.  To do that in Google Colab:
1. File -> Save a Copy in GitHub
2. Make sure your HDS5210 repository is selected
3. Make sure the file name includes the week number like this: `week06/week06_assignment_2.ipynb`
4. Add a commit message that means something

**Be sure week names are lowercase and use a two-digit week number!!**

**Be sure you use the same file name provided by the instructor!!**

