# Stanford FHIR for Health Remote Data Access and Utility Exercise

## Learning Objectives and Key Concepts

In this exercise, you will:

- Query patient data hosted on a remote third-party server
- Use Python to convert JSON-formatted data into a pandas DataFrame
- Query active Prescriptions in our Patient cohort
- Understand the (non-FHIR) Drug-on-Drug Interaction API and learn how to query it
- Combine the FHIR data with the non-FHIR API to determine Drug-on-Drug Interactions.

## Drug on Drug Interactions

For this exercise we will explore potential drug-on-drug interactions in a sizable patient cohort stored in FHIR, combined with drug interaction data from the NIH's RxNav database.

## Motivation/Purpose
From a research perspective, we can envision using these sorts of analyses for post-market surveillance of drugs, both to measure the rate of known adverse events among patients and to flag additional risks not yet identified.

From a clinical perspective, this exercise demonstrates the power of SMART-on-FHIR applications, where third-party data (in this case drug-on-drug interaction data) can be pulled in, paired with FHIR-formatted clinical data, and then used to better inform patient care in the form of Clinical Decision Support tools.

This exercise does NOT build a SMART-on-FHIR application, but gives you a sense of the type of applications you could build.

## Step 1: Query all active prescriptions in our patient cohort

For this exercise we will call on the 'MedicationRequest' resource which is the closest equivalent to a prescription resource in FHIR. 

Each item in this resource is effectively a single prescription, such that you have a many-to-one relationship of prescriptions to patients.

(This fact will be critical for our exercise: determining a potential drug-on-drug interaction requires grouping MedicationRequest resources by patient to see whether a patient is on multiple concurrent prescriptions. We will therefore want to retain the relevant patient information so we can map multiple prescriptions to individual patients.)

In [None]:
# Here's a set of libraries we'll import for this exercise
import requests
import json
import pandas as pd
from pandas import json_normalize

### Compose the FHIR query

First compose a query to pull the 'MedicationRequest' resource from the FHIR server. Then convert it to JSON format. Optionally, you could output the resulting JSON file to confirm that you've successfully queried the database.

In [None]:
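# Request the MedicationRequest resources (verify=False skips TLS certificate checks)
r = requests.get("https://api.logicahealth.org/StanfordPythonHealth/open/MedicationRequest", verify=False)
# Parse the response body as JSON; the result is a FHIR Bundle
bundle = r.json()
bundle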
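
A FHIR search is just a URL with query parameters, so it can help to see one assembled explicitly. The sketch below builds a search URL with the standard library and reads the `next` link that a paged FHIR Bundle can carry. `status` and `_count` are standard FHIR search parameters, but whether this particular server honors them is an assumption, the helper names are hypothetical, and `sample_bundle` is a hand-written stand-in rather than real server output.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.logicahealth.org/StanfordPythonHealth/open"

def build_search_url(resource_type, **params):
    """Return a FHIR search URL for resource_type with optional query parameters."""
    url = f"{BASE_URL}/{resource_type}"
    return f"{url}?{urlencode(params)}" if params else url

def next_page_url(bundle):
    """Return the URL of the next page of a Bundle, or None if there is none."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None

# Hand-written stand-in for a server response (not real data)
sample_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "link": [
        {"relation": "self", "url": build_search_url("MedicationRequest", _count=50)},
        {"relation": "next", "url": "https://example.org/fhir?page=2"},
    ],
    "entry": [],
}

print(build_search_url("MedicationRequest", status="active", _count=50))
print(next_page_url(sample_bundle))
```

If a cohort spans several pages, you could keep requesting `next_page_url(bundle)` until it returns `None` and extend your list of entries each time.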

We can now leverage the methods we deployed previously in Exercises 1 and 2 to map and ultimately convert our JSON into a dataframe. 

As a first step, let's reuse the list-mapping lambda function from Exercise 1 to map out our JSON (passing in the entire bundle and mapping by resource).

As a sanity test let's return the first resource item (index 0 or [0]) so we can get a better look at what information we have to work with.

In [None]:
prescriptions = list(map(lambda e: e['resource'], bundle['entry']))  # one resource per Bundle entry
prescriptions[0]
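
To make the mapping step concrete, here is a minimal sketch on a hand-written miniature Bundle. The IDs and patient references are hypothetical, and the list comprehension is simply an equivalent spelling of the `map`/`lambda` call, with a guard in case a Bundle comes back without an `entry` list.

```python
# Hand-written miniature Bundle; the IDs and patient references are hypothetical
bundle = {
    "resourceType": "Bundle",
    "entry": [
        {"fullUrl": "urn:example:1",
         "resource": {"resourceType": "MedicationRequest", "id": "rx-1",
                      "status": "active", "subject": {"reference": "Patient/p1"}}},
        {"fullUrl": "urn:example:2",
         "resource": {"resourceType": "MedicationRequest", "id": "rx-2",
                      "status": "stopped", "subject": {"reference": "Patient/p2"}}},
    ],
}

# Same result as list(map(lambda e: e['resource'], bundle['entry'])),
# but .get('entry', []) avoids a KeyError when the list is missing.
prescriptions = [entry["resource"] for entry in bundle.get("entry", [])]

print(len(prescriptions))
print(prescriptions[0]["id"], prescriptions[0]["subject"]["reference"])
```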


### Convert Data into a pandas DataFrame

Now that we've confirmed that we've extracted the information we need from our FHIR server, we will take the FHIR-formatted data and convert it into a pandas dataframe for subsequent analysis.

Based on our previous exercises we know we can use the `json_normalize` function to parse the JSON into a pandas dataframe. Let's do that now and then output the resulting dataframe to confirm we've successfully converted it.

In [None]:
dfprescriptions = json_normalize(prescriptions)  # flatten nested fields into dotted column names
dfprescriptions
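
As a rough illustration of what `json_normalize` does to nested FHIR resources, the sketch below runs it on two hand-written MedicationRequest-like dictionaries. The IDs and patient references are hypothetical, and the resources are trimmed to a few fields. Nested dictionaries become dotted column names, while lists such as `coding` are left as lists inside a single cell, which is why that field needs further parsing later.

```python
import pandas as pd

# Hand-written, trimmed resources; IDs and patient references are hypothetical
prescriptions = [
    {
        "resourceType": "MedicationRequest",
        "id": "rx-1",
        "status": "active",
        "subject": {"reference": "Patient/p1"},
        "medicationCodeableConcept": {"coding": [{"code": "207106"}]},
    },
    {
        "resourceType": "MedicationRequest",
        "id": "rx-2",
        "status": "stopped",
        "subject": {"reference": "Patient/p1"},
        "medicationCodeableConcept": {"coding": [{"code": "656659"}]},
    },
]

df = pd.json_normalize(prescriptions)

# Nested dicts are flattened into dotted names such as 'subject.reference'
print(sorted(df.columns))
print(df[["id", "status", "subject.reference"]])
```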

Depending on how you've parsed it, certain fields are immediately usable in their current form. For others, we're going to need to do further work to parse out the precise information we want to work with. 

We will pause our work on the dataframe for the moment, but it is worth documenting our current set of available features and their potential utility.

We now have a basic dataframe with drug and patient information. Before we can begin constructing a parser, we need to examine our Drug Interaction API to see how data is submitted and returned.

## Step 2: Understanding the Drug API and using that API with FHIR data

Start by reviewing the NIH's RxNorm API documentation: https://lhncbc.nlm.nih.gov/RxNav/APIs/index.html

One clear option is to identify each drug by its RXCUI, the RxNorm concept identifier (six digits in our examples). RxNorm can also map an RXCUI to its NDC codes:
https://lhncbc.nlm.nih.gov/RxNav/APIs/api-RxNorm.getNDCs.html

This correlates with our patient data column `resource.medicationCodeableConcept.coding.codes` (quite a mouthful, but we'll deal with that shortly).

Let's pull two sample interactions using the following general notation:

URL/list.json?rxcuis=[code 1]+[code 2]


 - The URL is: https://rxnav.nlm.nih.gov/REST/interaction/

Two combinations we can try are:
 - 207106 and 656659
 - 762675 and 859258

Convert both responses into JSON format and output them to see what sort of information the API returns.

In [None]:
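d = requests.get("https://rxnav.nlm.nih.gov/REST/interaction/list.json?rxcuis=207106+656659").json()  # first pair
d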

In [None]:
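d = requests.get("https://rxnav.nlm.nih.gov/REST/interaction/list.json?rxcuis=762675+859258").json()  # second pair
d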
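
The `URL/list.json?rxcuis=[code 1]+[code 2]` notation generalizes to any number of codes. A minimal sketch of a URL builder follows; `build_interaction_url` is a hypothetical helper name, and it only assembles the string without sending a request.

```python
RXNAV_INTERACTION_URL = "https://rxnav.nlm.nih.gov/REST/interaction/"

def build_interaction_url(rxcuis):
    """Return the list.json query URL for an iterable of RXCUI codes."""
    joined = "+".join(str(code) for code in rxcuis)
    return f"{RXNAV_INTERACTION_URL}list.json?rxcuis={joined}"

print(build_interaction_url([207106, 656659]))
print(build_interaction_url(["762675", "859258"]))
```

Converting each code with `str()` means the helper accepts codes stored either as integers or as strings.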

Feel free to experiment with additional drug combinations, including 3 or more drugs to see how the information varies.

Taking stock, we have successfully accessed the Drug API, and hopefully now have an understanding of what the API returns when there is a drug interaction versus when there isn't.

We now have important information informing our next steps. 

First, we have a structured target to work toward for submitting our patient data to the Drug API. For each patient, we will need to compile a list of RXCUI codes of the prescriptions they are on, and then append them to our API query with a `+` between each code. For our next step we'll go about constructing that!

Second, we understand how the Drug API returns a known interaction versus how it responds when there isn't one. We can begin to consider how the format of this data can be used to indicate, in bulk, the presence or absence of an interaction.
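
One way to turn that observation into a bulk check is a small predicate on the parsed response. The sketch below assumes, as the loop later in this exercise does, that a response reporting an interaction contains a `fullInteractionTypeGroup` key and that a response without one omits it. Both sample payloads are hand-written placeholders rather than real API output, and `has_interaction` is a hypothetical helper name.

```python
def has_interaction(response_json):
    """Return True if the response reports at least one interaction group."""
    return bool(response_json.get("fullInteractionTypeGroup"))

# Hand-written, heavily trimmed stand-ins for API responses (not real output)
response_with_interaction = {"fullInteractionTypeGroup": [{"fullInteractionType": []}]}
response_without_interaction = {"otherField": "placeholder"}

print(has_interaction(response_with_interaction))
print(has_interaction(response_without_interaction))
```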

## Step 3: Construct a composite list of all drugs per patient (so we can determine a potential Drug-on-Drug interaction)

Now that we know we need to extract and submit each patient's six-digit RXCUI codes to query the RxNav server, let's go back to our original mapped JSON data and use a list comprehension to extract the specific code.

Then, by wrapping that in a `Series` constructor and calling the `to_frame()` method on the resulting series, we can create a dataframe with our desired RXCUI code. Try it now. Output the result to confirm you've successfully extracted the desired information and converted it properly.

In [None]:
rxcodes = pd.Series([codings['code'] for MedicationRequest in prescriptions for codings in MedicationRequest['medicationCodeableConcept']['coding']], name='rxcode')

dfcode = rxcodes.to_frame()  # one-column dataframe named 'rxcode'
dfcode
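
One thing to watch with the nested list comprehension: it emits one code per coding, so a MedicationRequest with more than one coding (or none) would leave the codes out of step with the prescription rows. The sketch below is one hedged alternative that returns exactly one value per prescription. `first_rx_code` is a hypothetical helper, the sample resources are hand-written, and `000000` is a placeholder code.

```python
def first_rx_code(medication_request):
    """Return the code of the first coding, or None if the resource has none."""
    codings = medication_request.get("medicationCodeableConcept", {}).get("coding", [])
    return codings[0].get("code") if codings else None

# Hand-written, trimmed resources; '000000' is a placeholder code
prescriptions = [
    {"id": "rx-1", "medicationCodeableConcept": {"coding": [{"code": "207106"}]}},
    {"id": "rx-2", "medicationCodeableConcept": {"coding": [{"code": "656659"}, {"code": "000000"}]}},
    {"id": "rx-3"},  # e.g. a request that carries no medicationCodeableConcept
]

rxcodes = [first_rx_code(p) for p in prescriptions]
print(rxcodes)  # one entry per prescription, in the same order
```

Because the result has the same length and order as `prescriptions`, it can be attached to the prescriptions dataframe as a column without misaligning rows.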

Let's now consolidate our dataframe to retain the information we need. Specifically, we'll need information identifying the patient, an indication of whether or not the prescription is active (as only active prescriptions could cause a drug interaction), and finally the RXCUI code we previously extracted.

Construct your final dataframe and then output the result to confirm you've retained the desired information.

### Filter data to only include active prescriptions

We want to ensure that we're only querying active prescriptions. If a patient is no longer taking a drug, the risk of a Drug-on-Drug interaction is no longer applicable. If any inactive prescriptions are present, filter your dataframe so that only active prescriptions are included.

Conduct a value count to confirm that only active prescriptions remain.
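
A minimal sketch of the filter and the value count, using a hand-written miniature of the consolidated dataframe. The column names follow the ones used elsewhere in this exercise (`subject.reference`, `status`, `rxcode`), but the rows and patient references are hypothetical, and `dfactive` is just an illustrative name.

```python
import pandas as pd

# Hypothetical miniature of the consolidated dataframe
dffinal = pd.DataFrame({
    "subject.reference": ["Patient/p1", "Patient/p1", "Patient/p2", "Patient/p2"],
    "status": ["active", "stopped", "active", "active"],
    "rxcode": ["207106", "656659", "762675", "859258"],
})

# Keep only rows whose status is exactly 'active', then renumber the index
dfactive = dffinal[dffinal["status"] == "active"].reset_index(drop=True)

# Only 'active' should remain in the counts
print(dfactive["status"].value_counts())
```

Resetting the index after filtering keeps row labels contiguous, which matters if later code looks rows up by position.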

### Merge our prescriptions into a list by patient

We now need to create a list of drug codes for each patient, in order to feed that list into the RXNav API. 

Our desired output will look something like this where we have a tuple-like structure of patient ID, and a list of codes:

![Screen%20Shot%202022-01-24%20at%2012.55.48%20AM.png](attachment:Screen%20Shot%202022-01-24%20at%2012.55.48%20AM.png)

Hint: to accomplish this, try using the `groupby` function to group our drugs by patient, and then apply a lambda function to collect the code values into a list.

In [None]:
groups_by_patient = dffinal.groupby('subject.reference', sort=False)['rxcode'].apply(lambda x: x.values.tolist())
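
To see the shape this produces, the sketch below runs the same grouping on a hand-written three-row dataframe with hypothetical patient references. It uses `.agg(list)`, which is an alternative spelling of the lambda above, and `Series.items()` to walk through patient and code-list pairs together.

```python
import pandas as pd

# Hypothetical miniature of the filtered dataframe
dffinal = pd.DataFrame({
    "subject.reference": ["Patient/p1", "Patient/p1", "Patient/p2"],
    "rxcode": ["207106", "656659", "762675"],
})

# Collect each patient's codes into a list, keeping first-seen patient order
groups_by_patient = dffinal.groupby("subject.reference", sort=False)["rxcode"].agg(list)

for patient, codes in groups_by_patient.items():
    print(patient, codes)
```

The result is a Series indexed by patient reference, so the patient ID travels with its list of codes and does not need to be looked up separately.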


Now that we've generated a list of active prescriptions for each patient, we can append this list to the RXNav query and determine whether each of these patients has a drug interaction.

## Step 4: Loop through our entire cohort and determine each patient's drug interactions

To recap: we now have a list of patients with associated drug codes in list form, and we know how to query the RXNav API to determine if a drug interaction exists. 

As a last step, create a series of functions to iterate through our patient list and for each patient return whether or not a Drug on Drug interaction could occur.

It might help to write a helper function that takes a string of RXCUI codes (e.g., `123456+654321`), submits it to the API, and returns the result as parsed JSON.

Test it on our original two drug combinations to confirm that it returns the expected responses.

In [None]:
# Function for calling the NIH RxNav interaction API
def get_api_data(drug_list):
    # drug_list is a string of RXCUI codes joined by '+', e.g. '207106+656659'
    url = 'https://rxnav.nlm.nih.gov/REST/interaction/list.json?rxcuis=' + drug_list
    response = requests.get(url)
    # Raise on HTTP errors rather than trying to parse an error page as JSON
    response.raise_for_status()
    return response.json()
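
If you want to test the helper without depending on the network, one option is to pass the fetching step in as an argument. This is a sketch of that idea, not the exercise's required design: the `fetch` parameter and `fake_fetch` are hypothetical, the codes `111111` and `222222` are placeholders, and the canned JSON is hand-written rather than real API output.

```python
import json

def get_api_data(drug_list, fetch):
    """Query the interaction endpoint for a '+'-joined string of codes.

    `fetch` is a hypothetical injection point: any callable that takes a URL
    and returns the response body as text.
    """
    url = "https://rxnav.nlm.nih.gov/REST/interaction/list.json?rxcuis=" + drug_list
    return json.loads(fetch(url))

def fake_fetch(url):
    """Stand-in for the network call; returns canned, hand-written JSON text."""
    if url.endswith("rxcuis=111111+222222"):
        return '{"fullInteractionTypeGroup": [{}]}'
    return "{}"

print("fullInteractionTypeGroup" in get_api_data("111111+222222", fake_fetch))
print("fullInteractionTypeGroup" in get_api_data("333333+444444", fake_fetch))
```

For live calls you could pass something like `lambda url: requests.get(url, timeout=10).text` as `fetch`, so a stalled connection fails instead of hanging the loop.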

Now the tricky part!

For this step you'll want to loop through your patients and, for each patient, append each drug code to a string separated by a plus sign (i.e., `drug1+drug2`).

Test this mechanism to make sure you are creating the proper string for each patient by outputting it directly.
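
A minimal sketch of the string-building step on its own, using a plain dictionary shaped like the per-patient grouping. The patient references are hypothetical, and skipping patients with fewer than two distinct codes is an added assumption: an interaction needs at least two drugs, so those queries can be avoided.

```python
# Hypothetical patient-to-codes mapping, shaped like groups_by_patient
groups_by_patient = {
    "Patient/p1": ["207106", "656659"],
    "Patient/p2": ["762675"],
    "Patient/p3": ["762675", "859258", "207106"],
}

query_strings = {}
for patient, drug_list in groups_by_patient.items():
    # De-duplicate while keeping order; a repeated code adds nothing to the query
    unique_codes = list(dict.fromkeys(str(code) for code in drug_list))
    if len(unique_codes) < 2:
        continue  # an interaction needs at least two different drugs
    query_strings[patient] = "+".join(unique_codes)

for patient, query in query_strings.items():
    print(patient, query)
```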

Then modify your `for` loop to insert that string into the helper function you previously created to return the desired API result. 

Based on the result you can tailor the output to return if an interaction might occur or not. 

While you can choose how you want to format this, here is one possible output format you may want to build towards:

![Screen%20Shot%202022-01-24%20at%201.02.15%20AM.png](attachment:Screen%20Shot%202022-01-24%20at%201.02.15%20AM.png)

In [None]:
# Declare a running total of potential drug interactions
count_drug_int = 0

In [None]:
# Iterate through each patient and their list of medications
for patient, drug_list in groups_by_patient.items():
    print('For', patient)
    # Join the codes with '+' to match the API's rxcuis parameter format
    joined_drug_list = "+".join(str(i) for i in drug_list)
    data = get_api_data(joined_drug_list) # returns JSON response
    if 'fullInteractionTypeGroup' not in data:
        print('No known drug interaction for drugs: ', joined_drug_list, '\n')
        continue
    count_drug_int += 1
    print('Possible drug interaction for drugs: ', joined_drug_list, '\n')

print('Total number of possible drug interactions in patient cohort: ', count_drug_int)

As a bonus consider some additional information you can output, such as keeping a running count of total interactions, or specific details about the interaction types.
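
For the interaction details, one possible approach is to walk the nested response and collect a severity and description for each reported pair. Apart from `fullInteractionTypeGroup`, the key names below (`fullInteractionType`, `interactionPair`, `severity`, `description`) are assumptions about the response layout and should be checked against the output you saw in Step 2. The sample payload is hand-written with placeholder values, and `summarize_interactions` is a hypothetical helper.

```python
def summarize_interactions(response_json):
    """Return a list of (severity, description) tuples found in a response."""
    pairs = []
    for group in response_json.get("fullInteractionTypeGroup", []):
        for interaction_type in group.get("fullInteractionType", []):
            for pair in interaction_type.get("interactionPair", []):
                pairs.append((pair.get("severity"), pair.get("description")))
    return pairs

# Hand-written stand-in with placeholder values (not real API output)
sample_response = {
    "fullInteractionTypeGroup": [
        {"fullInteractionType": [
            {"interactionPair": [
                {"severity": "N/A", "description": "placeholder description"}
            ]}
        ]}
    ]
}

for severity, description in summarize_interactions(sample_response):
    print(severity, "|", description)
```

Using `.get()` with empty defaults at every level means a response with no interactions simply yields an empty list.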

## Summary

This exercise demonstrates how FHIR data can interact with the broader ecosystem of healthcare data and resources to yield additional health care insights. Here we pulled FHIR data into a unified dataframe, and then reshaped how the data was stored in order to pass it through to a third-party API and flag potential drug-on-drug interactions.