## FHIR Query Activity (Python)

Querying data from a FHIR server is an essential component of any health IT solution. There are two main approaches to querying clinical data:

1) Querying for resources related to a specific cohort of individuals
2) Querying for resources related to a specific individual

This query activity gives examples of both forms. It will help you practice (a) building queries, and (b) parsing the resulting resources to access the data fields you need. You may need to consult the FHIR specification for a resource to learn which fields you can query, browse SNOMED CT or RxNorm to find the codes for conditions or medications, or look through other terminologies used by the FHIR resource specification.

The specification page for each resource will have a section on "Search Parameters" which will guide you in knowing what you can include in a query.
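
As a small illustration of turning search parameters into a query URL, the sketch below uses the standard library's `urllib.parse.urlencode`. The helper name `build_query` and the `https://example.org/fhir` base URL are assumptions made up for this example; `gender` and `_count` are the parameters used later in this activity.

```python
from urllib.parse import urlencode

def build_query(base_url, resource_type, params):
    """Return a FHIR search URL such as <base>/Patient?gender=male (hypothetical helper)."""
    query = urlencode(params)
    url = f"{base_url.rstrip('/')}/{resource_type}"
    # Only append '?' when there is at least one search parameter
    return f"{url}?{query}" if query else url

# Placeholder base URL; substitute your own Open FHIR Server URL
example_url = build_query('https://example.org/fhir', 'Patient',
                          {'gender': 'male', '_count': 200})
print(example_url)  # https://example.org/fhir/Patient?gender=male&_count=200
```

Building the query from a dictionary keeps each search parameter visible and lets `urlencode` take care of escaping special characters in values.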

### Data Server

We will be using an HSPC Sandbox as the source of our FHIR data. Follow this [link](https://sandbox.hspconsortium.org) to create an account. Once logged in, create a sandbox that uses FHIR version R4, allows an open endpoint, and imports sample patients.

![create-sandbox](imgs/mynewsandbox.png)

Once created, click on the "Settings" tab on the left and copy the "Open FHIR Server URL". It should look like this:

![open-url](imgs/mynewopenurl.png)

In [1]:
import json
import statistics

import requests
# Paste your Open FHIR Server URL here
base = 'https://api.logicahealth.org/mynewsandbox/open'

### Pagination

When responding to a wide-net cohort query (e.g. return all male patients on the server), a FHIR server returns the results as a FHIR Bundle resource. In contrast, a by-ID query (e.g. return the patient whose ID is 1234) returns only the single resource with that ID.
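
To make the difference concrete, here is a minimal sketch that accepts either kind of parsed JSON response and always hands back a list of resources. The function name `resources_from_response` and the sample dictionaries are assumptions for illustration, not real server output.

```python
def resources_from_response(payload):
    """Return a list of resources from a parsed FHIR JSON response (hypothetical helper)."""
    if payload.get('resourceType') == 'Bundle':
        # A Bundle with no matches may omit 'entry' entirely
        return [entry['resource'] for entry in payload.get('entry', [])
                if 'resource' in entry]
    # A by-ID read returns the resource itself, so wrap it in a list
    return [payload]

# Made-up sample responses
sample_bundle = {
    'resourceType': 'Bundle',
    'type': 'searchset',
    'entry': [{'resource': {'resourceType': 'Patient', 'id': '1234'}},
              {'resource': {'resourceType': 'Patient', 'id': '5678'}}],
}
sample_patient = {'resourceType': 'Patient', 'id': '1234'}

print(len(resources_from_response(sample_bundle)))   # 2
print(len(resources_from_response(sample_patient)))  # 1
```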

By default, the server returns at most 20 resources *per page* of a Bundle. You can raise this to as many as 200 resources per page by adding the following parameter to the query:
        
        &_count=200
        
However, many queries will match more than 200 resources. This is where pagination comes into play. The server holds multiple pages of results but sends them one at a time. If your query has additional pages waiting, the bundle you receive includes a link to the next page of resources. It also reports the "total" number of resources across all pages.

        {
            "resourceType": "Bundle",
            "id": "c413c0a7-e9a6-43a3-9d99-8ababefe8bbf",
            "meta": {
                "lastUpdated": "2019-06-18T14:18:47.329+00:00"
            },
            "type": "searchset",
            "total": 272,
            "link": [
                {
                    "relation": "self",
                    "url": "<URL YOU JUST USED>"
                },
                {
                    "relation": "next",
                    "url": "<URL YOU CAN USE TO GET THE NEXT PAGE>"
                }
            ],
            ....

You can then extract the link, perform another GET request to fetch the next page, and repeat. Once a bundle no longer contains a "next" link, you know that you have reached the last page of results. The example query below shows one way of looping through bundles, though the best approach depends on your specific use case.
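
The link-following logic can also be tried without a server. The sketch below is one possible arrangement, not the only one: `find_next_link`, `iter_pages`, and the `fake_pages` dictionary are hypothetical names and made-up data, and the `fetch` argument stands in for whatever function retrieves and parses a page.

```python
def find_next_link(bundle):
    """Return the URL of the 'next' page, or None on the last page (hypothetical helper)."""
    for link in bundle.get('link', []):
        if link.get('relation') == 'next':
            return link.get('url')
    return None

def iter_pages(first_url, fetch):
    """Yield each bundle in turn; `fetch` maps a URL to a parsed bundle."""
    url = first_url
    while url:
        bundle = fetch(url)
        yield bundle
        url = find_next_link(bundle)

# Two made-up pages standing in for server responses
fake_pages = {
    'page-1': {'resourceType': 'Bundle', 'total': 3,
               'link': [{'relation': 'self', 'url': 'page-1'},
                        {'relation': 'next', 'url': 'page-2'}],
               'entry': [{'resource': {'id': 'a'}}, {'resource': {'id': 'b'}}]},
    'page-2': {'resourceType': 'Bundle', 'total': 3,
               'link': [{'relation': 'self', 'url': 'page-2'}],
               'entry': [{'resource': {'id': 'c'}}]},
}

ids = [entry['resource']['id']
       for page in iter_pages('page-1', fake_pages.__getitem__)
       for entry in page.get('entry', [])]
print(ids)  # ['a', 'b', 'c']
```

Against a real server, `fetch` could be something like `lambda u: requests.get(u).json()`; passing it in as an argument keeps the paging logic easy to test with plain dictionaries.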

In [2]:
def next_bundle(url):
    # Fetch one page of results and report whether a further page exists
    has_next = False
    r = requests.get(url)
    bundle = r.json()
    if not r.ok:
        print('server error')
    # Only a 'next' link means more pages remain; without one this is the last page
    for link in bundle.get('link', []):
        if link['relation'] == 'next':
            has_next = True
            url = link['url']
    return has_next, url, bundle

#### Example - the number of male patients who live in each state

In [8]:
has_next = True
url = base + '/Patient?gender=male&_count=200'
# Use a dictionary to store state counts
state_counts = {}

while has_next:
    # Fetch the current bundle and check whether another page follows it
    has_next, url, bundle = next_bundle(url)
    # Parse resources in the current bundle (a bundle with no matches has no 'entry')
    for entry in bundle.get('entry', []):
        try:
            # Pull the state from the first address of the resource
            state = entry['resource']['address'][0]['state']
        except (KeyError, IndexError):
            # Skip patients with no recorded address or state
            continue
        # Increment the counter for this state, starting from 0 if it is new
        state_counts[state] = state_counts.get(state, 0) + 1
print(state_counts)

{'OK': 27, 'NC': 1, 'MA': 1, 'Utah': 7}
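
The same kind of tally can also be written with `collections.Counter` from the standard library. This is a minimal sketch on made-up entries, not output from the sandbox; `first_state`, `sample_entries`, and `sample_counts` are hypothetical names.

```python
from collections import Counter

# Made-up Bundle entries for illustration
sample_entries = [
    {'resource': {'resourceType': 'Patient', 'address': [{'state': 'OK'}]}},
    {'resource': {'resourceType': 'Patient', 'address': [{'state': 'OK'}]}},
    {'resource': {'resourceType': 'Patient', 'address': [{'state': 'NC'}]}},
    {'resource': {'resourceType': 'Patient'}},  # no address recorded
]

def first_state(patient):
    """Return the state of the patient's first address, or None if it is missing."""
    addresses = patient.get('address') or [{}]
    return addresses[0].get('state')

sample_counts = Counter()
for entry in sample_entries:
    state = first_state(entry['resource'])
    if state is not None:
        sample_counts[state] += 1
print(dict(sample_counts))  # {'OK': 2, 'NC': 1}
```

Using `.get()` instead of a bare `try`/`except` makes it explicit which missing fields are expected and skipped.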


#### 1. The code of the condition afflicting the most patients
Solution: 38341003

#### 2. The number of encounters in 2009
Solution: 70

#### 3. The average of all height observations
Hint: the Python `statistics` library (imported above) can find the average of a list using the `statistics.mean()` function
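
As a rough illustration of that hint, the sketch below averages values pulled from a few made-up Observation resources. It assumes each numeric result sits under `valueQuantity.value`, which is how FHIR Observations carry quantity results; the sample data is invented and is not the answer to this exercise.

```python
import statistics

# Made-up Observation resources for illustration
sample_observations = [
    {'resourceType': 'Observation', 'valueQuantity': {'value': 150.0, 'unit': 'cm'}},
    {'resourceType': 'Observation', 'valueQuantity': {'value': 160.0, 'unit': 'cm'}},
    {'resourceType': 'Observation', 'valueQuantity': {'value': 170.0, 'unit': 'cm'}},
    {'resourceType': 'Observation'},  # no value recorded
]

# Collect the numeric values, skipping observations without a quantity
values = [obs['valueQuantity']['value']
          for obs in sample_observations
          if 'valueQuantity' in obs]
average = statistics.mean(values)
print(average)  # 160.0
```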

Solution: 158.7191802969917

#### 4. First and last name of the heaviest patient
Solution: Kimberly Moore

#### 5. Number of patients taking 'clopidogrel 75 MG Oral Tablet [Plavix]'
Hint: use the MedicationDispense resource

Solution: 52

#### 6. Number of patients born in the month of December
Solution: 12

#### 7. The birth year of the oldest patient with a peanut allergy
Solution: 1968

#### 8. The average heart rate of Brian Q. Gracia for each year recorded
Printed as a dictionary where key=year and value=rate average for that year
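
One way to group readings by year is sketched below on made-up data. It assumes each reading is a pair of an ISO 8601 timestamp (such as an Observation's `effectiveDateTime`) and a numeric value, so the first four characters of the timestamp give the year; `sample_readings` and the other names are hypothetical.

```python
import statistics
from collections import defaultdict

# Made-up (timestamp, heart rate) pairs for illustration
sample_readings = [
    ('1997-03-14T09:30:00-05:00', 90.0),
    ('1997-11-02T14:00:00-05:00', 94.0),
    ('2001-06-21T10:15:00-05:00', 76.0),
]

# Collect the readings for each year, then average each group
by_year = defaultdict(list)
for timestamp, rate in sample_readings:
    by_year[timestamp[:4]].append(rate)

yearly_average = {year: statistics.mean(rates) for year, rates in by_year.items()}
print(yearly_average)  # {'1997': 92.0, '2001': 76.0}
```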

Solution: {'1997': 93.5, '2001': 76.0, '2003': 74.45454545454545, '2006': 77.0}