# MIMIC-IV-on-FHIR Google Healthcare API Tutorial
This tutorial walks through using MIMIC-IV-on-FHIR on Google Cloud Platform (GCP). The GCP Healthcare API provides the core features expected of a FHIR server.

The following features will be explored:
- Search with Export
  - Search by gender
  - Search by condition
  - Search by procedure
  - Search by medication
- Export patient-everything 


## Initial Steps
To start the tutorial, the following steps must be completed first:
1. Ensure you have the *Healthcare FHIR Resource Reader* role on the MIMIC-IV-on-FHIR datastore (contact the kind-lab group if not set up)
2. Create a GCP project <project_name> for use in the tutorial
3. Add Cloud Storage to your GCP project
    - Create a bucket <bucket_name> 
    - Set the location of the bucket to "us-central1"
    - Create a folder <export_folder> where exported resources can be sent
4. Run `gcloud auth login` from the command line to authenticate the CLI (if not done already)
5. Run `gcloud init` from the command line to set the user and project
    - Use the <project_name> project you created as default
6. Update your `.env` file to have the following:
    - export GCP_PROJECT="kind-lab"
    - export GCP_BUCKET=<bucket_name>
    - export GCP_EXPORT_FOLDER=<export_folder>
    - export GCP_LOCATION="us-central1"
    - export GCP_DATASET="mimic-iv-fhir-dataset"
    - export GCP_FHIRSTORE="mimic-iv-fhir-v2-demo"
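
Before running the notebook, it can help to confirm that every variable listed above is actually set. The following is a minimal sketch, assuming the variables have already been loaded into the process environment; `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, not part of the tutorial code.

```python
import os

# Variable names taken from the .env list above
REQUIRED_VARS = [
    "GCP_PROJECT",
    "GCP_BUCKET",
    "GCP_EXPORT_FOLDER",
    "GCP_LOCATION",
    "GCP_DATASET",
    "GCP_FHIRSTORE",
]


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

An empty list means all six settings are available to the cells that follow.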


NOTE: If exporting fails, the default project may not have been found. Try one of the following:
- Run `gcloud auth application-default login` and select your project
- OR update the following line in the *export_resources_to_storage* function
  - storage_client = storage.Client() --> storage_client = storage.Client(project=<project_name>)

In [None]:
from dotenv import load_dotenv
from pathlib import Path
import google.auth
from google.auth.transport import requests
from google.cloud import storage
import json
import time
import os

## Environment Variables
To run this tutorial, a few GCP settings must be loaded from the `.env` file, which is expected two directories above the notebook's working directory.

In [None]:
load_dotenv(Path.cwd().parents[1] / '.env')

In [None]:
# Fixed kind-lab variables
GCP_PROJECT = os.getenv('GCP_PROJECT')
GCP_LOCATION = os.getenv('GCP_LOCATION')
GCP_DATASET = os.getenv('GCP_DATASET')
GCP_FHIRSTORE = os.getenv('GCP_FHIRSTORE')

# Custom variables to configure exporting
GCP_BUCKET = os.getenv('GCP_BUCKET')
GCP_EXPORT_FOLDER = os.getenv('GCP_EXPORT_FOLDER')

credentials, project = google.auth.default()
project

In [None]:
session = requests.AuthorizedSession(credentials)
base_url = "https://healthcare.googleapis.com/v1"

project_url = f'{base_url}/projects/{GCP_PROJECT}/locations/{GCP_LOCATION}'
fhir_url = f'{project_url}/datasets/{GCP_DATASET}/fhirStores/{GCP_FHIRSTORE}/fhir'
headers = {"Content-Type": "application/fhir+json;charset=utf-8"}
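
The FHIR store URL above is assembled from four settings. As a rough sketch of the same path layout, the hypothetical helper `build_fhir_url` below (not part of the tutorial code) builds the URL from its parts, which makes it easy to point the notebook at a different store.

```python
def build_fhir_url(project, location, dataset, fhir_store,
                   base_url="https://healthcare.googleapis.com/v1"):
    """Build the FHIR endpoint URL for a store (hypothetical helper)."""
    return (
        f"{base_url}/projects/{project}/locations/{location}"
        f"/datasets/{dataset}/fhirStores/{fhir_store}/fhir"
    )


# Values from the .env settings described in the initial steps
print(build_fhir_url("kind-lab", "us-central1",
                     "mimic-iv-fhir-dataset", "mimic-iv-fhir-v2-demo"))
```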

## Support Functions

In [None]:
# export function
# -- write out all the resources to the GCP export folder, one NDJSON file per result page
def export_resources_to_storage(resources, resource_type, criteria, filter_value, pagenum=1, current_time=None):
    # A searchset Bundle with no matches has no 'entry' key
    rlist = [json.dumps(rsrc['resource']) for rsrc in resources.get('entry', [])]
    output_bundle = '\n'.join(rlist)
    if current_time is None:
        current_time = time.strftime("%Y%m%d-%H%M%S")

    storage_client = storage.Client()
    bucket = storage_client.get_bucket(GCP_BUCKET)
    filename = f"{GCP_EXPORT_FOLDER}/search/{current_time}-{resource_type}-{criteria}-{filter_value}/{pagenum}.ndjson"
    blob = bucket.blob(filename)
    blob.upload_from_string(output_bundle)
    print(f'Exported resources to {filename}')

    # Follow the 'next' link, if present, to export the remaining pages
    link_info = [
        rsrc for rsrc in resources.get('link', []) if rsrc['relation'] == 'next'
    ]
    if len(link_info) > 0:
        response = session.get(link_info[0]['url'], headers=headers)
        response.raise_for_status()
        new_resources = response.json()
        export_resources_to_storage(new_resources, resource_type, criteria, filter_value, pagenum + 1, current_time)
    return filename

def count_patients(resources):
    return len(get_linked_patients(resources))

def get_linked_patients(resources):
    # A searchset Bundle with no matches has no 'entry' key
    patients = [pat['resource'] for pat in resources.get('entry', []) if pat['resource']['resourceType']=='Patient']
    return patients

# print function
# -- simple summary statement with the number of resources with the metrics
# -- statement will say: X resources have Y criteria!
def print_search_results(resources, resource_type, criteria, filter_value):
    total_num = resources['total']
    patients = get_linked_patients(resources)
    msg = f'SUMMARY RESULTS: {total_num} {resource_type} resources have {resource_type}.{criteria} equal to {filter_value}'
    if len(patients) > 0:
        msg = f'{msg}. {len(patients)} Patient(s) linked with {resource_type} resources'
    print(msg)

# -- print the summary, then export only if export_flag (set in a later cell) is True
def resource_handling(resources, resource_type, criteria, filter_value):
    print_search_results(resources, resource_type, criteria, filter_value)
    if export_flag:
        export_resources_to_storage(resources, resource_type, criteria, filter_value)
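
The export function combines two steps that are easier to follow in isolation: flattening a Bundle into NDJSON (one resource per line) and finding the `next` link that points to the following page. The sketch below shows both as pure functions on a hand-written sample Bundle; `bundle_to_ndjson`, `next_page_url`, and the `example.org` URLs are hypothetical and not part of the tutorial code.

```python
import json


def bundle_to_ndjson(bundle):
    """Serialize each entry's resource on its own line (NDJSON)."""
    return "\n".join(
        json.dumps(entry["resource"]) for entry in bundle.get("entry", [])
    )


def next_page_url(bundle):
    """Return the URL of the 'next' link, or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


# Hand-written sample Bundle (illustrative only, not real MIMIC data)
sample_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "total": 2,
    "link": [
        {"relation": "self", "url": "https://example.org/fhir/Patient?gender=female"},
        {"relation": "next", "url": "https://example.org/fhir/Patient?gender=female&page=2"},
    ],
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "a"}},
        {"resource": {"resourceType": "Patient", "id": "b"}},
    ],
}

print(bundle_to_ndjson(sample_bundle))
print(next_page_url(sample_bundle))
```

Both helpers use `.get` with a default so that an empty result set, which may omit `entry` or `link`, yields an empty string or `None` rather than a `KeyError`.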

## Search Resources
FHIR provides extensive capabilities for searching resources and the relationships between them. The following examples demonstrate the search functionality on MIMIC-IV-on-FHIR:
- Search all Patients by gender
- Search all Conditions by a code
- Search all Procedures by a code
- Search all Medication by a code

Searches on resources other than Patient add the `_include` parameter to return the patient associated with each primary resource.
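
When `_include` is used, the returned Bundle mixes the primary matches with the included patients. The FHIR specification marks each entry with `search.mode` (`match` or `include`); the sketch below separates the two groups on that basis, assuming the server populates this field and treating entries without it as matches. `split_matches_and_includes` and the sample Bundle are hypothetical.

```python
def split_matches_and_includes(bundle):
    """Split Bundle resources into primary matches and _include additions."""
    matches, includes = [], []
    for entry in bundle.get("entry", []):
        # Assumption: entries lacking search.mode are treated as matches
        mode = entry.get("search", {}).get("mode", "match")
        if mode == "include":
            includes.append(entry["resource"])
        else:
            matches.append(entry["resource"])
    return matches, includes


# Hand-written sample Bundle (illustrative only, not real MIMIC data)
sample_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {
            "resource": {"resourceType": "Condition", "id": "c1",
                         "subject": {"reference": "Patient/p1"}},
            "search": {"mode": "match"},
        },
        {
            "resource": {"resourceType": "Patient", "id": "p1"},
            "search": {"mode": "include"},
        },
    ],
}

matches, includes = split_matches_and_includes(sample_bundle)
print(len(matches), "match(es),", len(includes), "included resource(s)")
```

The support functions in this tutorial instead filter on `resourceType == 'Patient'`, which gives the same split whenever the primary resource is not itself a Patient.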

A summary print statement will be output for each search, with the option of exporting the result to your project bucket, dependent on the `export_flag` specified below:

In [None]:
# Decide if you want all resources exported to Cloud Storage or just get summary print statements
export_flag = False

### Search By Gender
Search for all patients with a certain gender

In [None]:
resource_type = 'Patient'
gender = 'female'

resource_url = f'{fhir_url}/{resource_type}/_search?gender={gender}'
response = session.post(resource_url, headers=headers)
response.raise_for_status()
resources = response.json()
resource_handling(resources, resource_type, 'gender', gender)

### Search by Condition
Search for all Condition resources with a certain condition. All associated patients will be returned as well.

In [None]:
resource_type = 'Condition'
code = '99591' #Sepsis

resource_url = f'{fhir_url}/{resource_type}/_search?code={code}&_include={resource_type}:subject'
response = session.post(resource_url, headers=headers)
response.raise_for_status()
resources = response.json()
resource_handling(resources, resource_type, 'code', code)

### Search by Procedure
Search for all Procedure resources with a certain code. All associated patients will be returned as well.

In [None]:
resource_type = 'Procedure'
code = '227194' #Extubation

resource_url = f'{fhir_url}/{resource_type}?code={code}&_include={resource_type}:subject'
response = session.get(resource_url, headers=headers)
response.raise_for_status()
resources = response.json()
resource_handling(resources, resource_type, 'code', code)
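
The search URLs in this section are built with f-strings, which works for simple codes. For values containing spaces or other reserved characters, a query string built with the standard library's `urlencode` is safer. This is a minimal sketch; `build_search_url` is a hypothetical helper and `https://example.org/fhir` stands in for `fhir_url`.

```python
from urllib.parse import urlencode


def build_search_url(fhir_url, resource_type, params):
    """Build a GET search URL with percent-encoded query parameters."""
    return f"{fhir_url}/{resource_type}?{urlencode(params)}"


url = build_search_url(
    "https://example.org/fhir",  # placeholder for fhir_url
    "Procedure",
    {"code": "227194", "_include": "Procedure:subject"},
)
print(url)
```

Note that `urlencode` escapes the colon in `Procedure:subject` as `%3A`; servers decode this back to the original value.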

### Search by Medication
Search for all MedicationAdministration resources with a certain medication code. All associated patients will be returned as well.

In [None]:
resource_type = 'MedicationAdministration'
code = 'NACLFLUSH' #Sodium chloride flush

resource_url = f'{fhir_url}/{resource_type}/_search?medicationCodeableConcept.coding.code={code}&_include={resource_type}:subject'
response = session.post(resource_url, headers=headers)
response.raise_for_status()
resources = response.json()
resource_handling(resources, resource_type, 'code', code)

## Export patient-everything
A patient-everything export returns a patient together with all of that patient's resources of the types you specify.
- The export will be sent to your project bucket under the *patient-everything* folder
- The resources output with the patient can be specified as any valid FHIR resource type 

In [None]:
GCP_PATIENT_EVERYTHING_FOLDER = f'patient-everything/bundles-{time.strftime("%Y%m%d-%H%M%S")}'

In [None]:
# Support functions
# -- return the ids of the resources on the first page of search results
def get_resource_ids(fhir_url, resource_type):
    resource_url = f'{fhir_url}/{resource_type}/_search?_elements=id'
    response = session.post(resource_url, headers=headers)
    response.raise_for_status()
    resources = response.json()
    # A searchset Bundle with no matches has no 'entry' key
    resource_ids = [entry['resource']['id'] for entry in resources.get('entry', [])]
    return resource_ids

def send_patient_everything(export_url, headers, patient_id, page_num=1):
    response = session.get(export_url, headers=headers)
    response.raise_for_status()
    resp_fhir = response.json()

    if 'error' in resp_fhir:
        print('ERROR IN RESPONSE')
    elif resp_fhir['resourceType'] == 'OperationOutcome':
        print(resp_fhir['issue'][0])
    elif  ((resp_fhir['resourceType'] == 'Bundle') and ('link' in resp_fhir)):
        filename = export_bundle_to_storage(resp_fhir, patient_id, page_num)
        print(f'Stored file: {filename}')
        link_info = [
            resp for resp in resp_fhir['link'] if resp['relation'] == 'next'
        ]
        if len(link_info) > 0:
            send_patient_everything(
                link_info[0]['url'], headers, patient_id, page_num + 1
            )
    else:
        filename = export_bundle_to_storage(resp_fhir, patient_id, page_num)
        print(f'Stored file: {filename}')


    return resp_fhir

def export_bundle_to_storage(resp_fhir, patient_id, page_num):
    bundle = resp_fhir

    storage_client = storage.Client()
    bucket = storage_client.get_bucket(GCP_BUCKET)
    filename = f"{GCP_PATIENT_EVERYTHING_FOLDER}/patient-{patient_id}-page{page_num}"
    blob = bucket.blob(filename)
    blob.upload_from_string(json.dumps(bundle))
    return filename
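
`send_patient_everything` follows `next` links by calling itself once per page. An alternative is a loop, which avoids deep recursion when a patient has many pages. The sketch below shows that pattern with an injectable `fetch` callable so it can run without a server; `iter_bundle_pages`, `fake_pages`, and the `page-1`/`page-2` keys are hypothetical.

```python
def iter_bundle_pages(first_url, fetch):
    """Yield (page_num, bundle) per page, following 'next' links in a loop."""
    url, page_num = first_url, 1
    while url:
        bundle = fetch(url)
        yield page_num, bundle
        # Pick the 'next' link if there is one; None ends the loop
        url = next(
            (link["url"] for link in bundle.get("link", [])
             if link.get("relation") == "next"),
            None,
        )
        page_num += 1


# In-memory stand-in for the server (illustrative only)
fake_pages = {
    "page-1": {"resourceType": "Bundle",
               "link": [{"relation": "next", "url": "page-2"}]},
    "page-2": {"resourceType": "Bundle",
               "link": [{"relation": "self", "url": "page-2"}]},
}

for page_num, bundle in iter_bundle_pages("page-1", fake_pages.__getitem__):
    print(page_num, bundle["resourceType"])
```

In the notebook, `fetch` could be a small wrapper such as `lambda url: session.get(url, headers=headers).json()`, with each yielded bundle passed to `export_bundle_to_storage`.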

In [None]:
# Export patient-everything

resource_type = 'Patient'
output_resource_types = 'Patient,Encounter,Condition,Procedure' # resource types to output
num_patients = 1
count = 100 # how many resources per bundle page

patient_list = get_resource_ids(fhir_url, resource_type)
print(f'Found {len(patient_list)} patient id(s)')

# Cap the number of exports at the number of patients found
num_patients = min(num_patients, len(patient_list))
for idx in range(num_patients):
    patient_id = patient_list[idx]
    export_url = f'{fhir_url}/Patient/{patient_id}/$everything?_count={count}&_type={output_resource_types}'
    print(export_url)
    resources = send_patient_everything(export_url, headers, patient_id)