# Populating a FHIR Server with Synthea Data

## Generating a Synthetic Patient Population

Download [Synthea](https://github.com/synthetichealth/synthea)

```shell
git clone https://github.com/synthetichealth/synthea.git
cd synthea
```

Generate a synthetic dataset of 1000 individuals:

```shell
./run_synthea -p 1000
```

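Once that completes, we should have about 1000 patient FHIR Bundles in `output/fhir` (relative to the `synthea` directory), along with the hospital and practitioner bundles. Copy that `fhir` directory into the same directory as this notebook and we'll continue from there.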

You can also download a recently generated data set here: https://synthetichealth.github.io/synthea-sample-data/downloads/synthea_sample_data_fhir_r4_nov2021.zip
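
Before uploading anything, it can help to confirm that the copied directory holds what we expect. The following is a minimal sketch rather than part of the original notebook: `summarize_bundles` is a hypothetical helper, and it assumes every `.json` file in the directory is a FHIR resource with a top-level `resourceType` (and, for Bundles, a `type`).

```python
import glob
import json
import os
from collections import Counter

def summarize_bundles(directory):
    """Tally (resourceType, type) pairs for every .json file in `directory`.

    Hypothetical helper for illustration; not part of Synthea or this notebook.
    """
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(directory, '*.json'))):
        with open(path, 'r', encoding='utf-8') as json_file:
            content = json.load(json_file)
        # Missing keys are recorded as None instead of raising
        counts[(content.get('resourceType'), content.get('type'))] += 1
    return counts
```

Calling `summarize_bundles('fhir')` returns a `Counter` keyed by `(resourceType, type)`, so any file that is not a transaction Bundle stands out before it is posted to the server.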


In [1]:
import os
import json
import requests
import glob
import time
from IPython.display import clear_output

s = requests.Session()
s.headers.update({'Accept':'application/fhir+json', 'Content-Type': 'application/fhir+json'})
s.verify = False
requests.packages.urllib3.disable_warnings()

synthea_data_directory = 'fhir'
fhir_server = 'https://api.logicahealth.org/opioids/open'

In [4]:
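# Grab the patient identifier from the transaction bundle
def patient_identifier_in_transaction(transaction):
    for entry in transaction['entry']:
        if entry['resource']['resourceType'] == 'Patient':
            # Use the first identifier, formatted as a FHIR token: system|value
            identifier = entry['resource']['identifier'][0]
            return f"{identifier['system']}|{identifier['value']}"
    # A patient bundle is expected to contain a Patient resource
    raise ValueError('No Patient resource found in transaction bundle')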

# See if the patient record already exists on the server
def record_on_server(filename):
    with open(filename, 'r') as json_file:
        file_content = json.load(json_file)
    r = s.get(f"{fhir_server}/Patient", params={'identifier': patient_identifier_in_transaction(file_content)})
    # The returned search Bundle omits 'entry' when nothing matches
    return 'entry' in r.json()

# POST a transaction bundle to the server's base URL
def post_transaction_bundle(filename):
    with open(filename, 'r') as json_file:
        file_content = json.load(json_file)
    r = s.post(fhir_server, data=json.dumps(file_content))
    if r.status_code != requests.codes.ok:
        print(f'Failed to upload {filename}. Received Status Code {r.status_code}')
        # Pause before the next request (presumably to let the server catch up)
        time.sleep(120)

Synthea writes the hospitals (FHIR Organization resources) and practitioners (FHIR Practitioner resources) to separate transaction bundles. The patient bundles reference these resources, so the hospital and practitioner bundles need to be loaded first.

See the [Synthea FHIR Transaction Bundles Wiki Page](https://github.com/synthetichealth/synthea/wiki/FHIR-Transaction-Bundles) for more details.
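
The load order can be sketched as a sort key. This is an illustrative helper, not part of the original notebook: `load_order` is a hypothetical name, and it assumes the file naming seen in this data set (`hospitalInformation*.json`, `practitionerInformation*.json`, and every other `.json` file being a patient bundle). It expects bare file names, not full paths.

```python
def load_order(filenames):
    """Sort bundle file names so shared resources come before patients.

    Assumes the hospitalInformation*/practitionerInformation* naming
    convention; anything that is not a .json file is dropped.
    """
    def rank(name):
        if name.startswith('hospitalInformation'):
            return 0
        if name.startswith('practitionerInformation'):
            return 1
        return 2  # everything else is treated as a patient bundle

    json_files = [name for name in filenames if name.endswith('.json')]
    return sorted(json_files, key=lambda name: (rank(name), name))
```

For example, `load_order(os.listdir('fhir'))` would yield the hospital bundle, then the practitioner bundle, then the patient bundles in alphabetical order.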

_Note: Depending on the server, you may experience timeouts when performing the following requests since it can take a few minutes for the FHIR server to process all the transactions in the bundle._
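
One way to cope with such timeouts is to retry with an increasing delay. The sketch below is an assumption-laden illustration, not the notebook's method: `post_with_retry`, `RETRYABLE_STATUS`, and the default delays are all hypothetical, and `session` is any object with a `requests`-style `post(url, data=...)` method. It assumes `attempts` is at least 1.

```python
import time

# Gateway-style errors that are often transient (an assumption, tune as needed)
RETRYABLE_STATUS = {502, 503, 504}

def post_with_retry(session, url, body, attempts=3, base_delay=5, sleep=time.sleep):
    """POST `body` to `url`, retrying with exponential backoff.

    Returns the last response received, whether or not it succeeded.
    """
    for attempt in range(attempts):
        response = session.post(url, data=body)
        if response.status_code not in RETRYABLE_STATUS:
            return response
        if attempt < attempts - 1:
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return response
```

A 504 only means the gateway stopped waiting; the server may still finish the transaction. Checking whether the record already exists before each retry avoids posting the same bundle twice.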

In [3]:
# Post hospital transaction bundle
post_transaction_bundle(glob.glob(os.path.join(synthea_data_directory, 'hospitalInformation*.json'))[0])

Failed to upload fhir/hospitalInformation1640814769641.json. Received Status Code 504


In [4]:
# Post practitioner transaction bundle
post_transaction_bundle(glob.glob(os.path.join(synthea_data_directory, 'practitionerInformation*.json'))[0])

Failed to upload fhir/practitionerInformation1640814769641.json. Received Status Code 504


In [5]:
# Post the patient transaction bundles
for filename in os.scandir(synthea_data_directory):
    if filename.name.startswith('hospitalInformation') or filename.name.startswith('practitionerInformation') or not filename.name.endswith('.json'):
        continue
    if record_on_server(filename.path):
        print(f'{filename.path} already loaded')
        continue
    post_transaction_bundle(filename.path)

fhir/Doloris378_Konopelski743_3c2f9134-21ad-ac38-1f7e-d9fa0e0c010b.json already loaded
Failed to upload fhir/Sung603_Eichmann909_db97e2dd-206f-8b63-3639-64b76a6e461c.json. Received Status Code 504
Failed to upload fhir/Thurman577_Kreiger457_aab1648e-2344-dc6e-f783-10860ce7f173.json. Received Status Code 504
Failed to upload fhir/Helena807_Rowe323_f734c2a0-dc3a-a148-ee78-045ddbbb6408.json. Received Status Code 504
fhir/Adena533_Heidenreich818_6603b8d0-ea29-e068-b13e-f8c1c94be1a0.json already loaded
Failed to upload fhir/Logan497_Walter473_4784a127-25b3-2d20-188f-813248bdbb50.json. Received Status Code 504
Failed to upload fhir/Versie644_Turner526_9e90ab3e-b8ad-f6a3-98fc-282e57ce6eef.json. Received Status Code 504
Failed to upload fhir/Kathrine376_Corwin846_2417861b-0aeb-6be0-7ee4-8d4da85dd96f.json. Received Status Code 504
Failed to upload fhir/Cheree978_Langosh790_dc78d8ba-9f53-4bbc-0bae-068cd990ae17.json. Received Status Code 504
Failed to upload fhir/Benjamin360_O'Keefe54_079b64b7-22

## What is this notebook?

(common overview of the FHIR Training)

(overview of this specific notebook)




### Icons in this Guide
 📘 A link to a useful external reference related to the section the icon appears in  

 ⚡️ A key takeaway for the section that this icon appears in  

 🖐 A hands-on section where you will code something or interact with the server  


(any required MITRE legalese should either go here or at the very bottom of the notebook)

## Motivation / Purpose

## Scenario

(this section describes the specifics of the use case: what is the problem statement, what is the basic approach we are going to take, etc)


## Initial Setup

In [None]:
# import any required libraries here.
#  - requests
#  - fhirclient: https://github.com/smart-on-fhir/client-py
#  - Pandas - DataFrames
#  - NumPy - basic data analysis
#  - matplotlib
#  - maybe seaborn for viz on top of matplotlib ?

## Step 1 Connect to Client

Connect to the source server for data extraction.
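
A simple way to confirm the connection is to request the server's CapabilityStatement from the `metadata` endpoint. The following is a minimal sketch under stated assumptions: `server_fhir_version` is a hypothetical helper, and `session` is a `requests.Session`-like object configured with the FHIR JSON `Accept` header.

```python
def server_fhir_version(session, base_url):
    """Return the `fhirVersion` advertised by the server's CapabilityStatement.

    `session` needs a requests-style `get(url)` method.
    """
    response = session.get(f"{base_url.rstrip('/')}/metadata")
    response.raise_for_status()  # surface HTTP errors early
    return response.json().get('fhirVersion')
```

If the call returns a version string, the base URL is reachable and speaks FHIR; an exception or `None` suggests the URL or headers need another look.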

## Step 2 Query Data

Submit a query to the source server and retrieve the data. Save it locally.
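
FHIR search results arrive as paged Bundles, so retrieving everything means following each Bundle's `next` link. This sketch is illustrative only: `fetch_all` is a hypothetical helper, and `session` is assumed to be a `requests.Session`-like object.

```python
def fetch_all(session, url, params=None):
    """Collect every resource from a FHIR search, following `next` links."""
    resources = []
    while url:
        response = session.get(url, params=params)
        response.raise_for_status()
        bundle = response.json()
        resources.extend(
            entry['resource'] for entry in bundle.get('entry', []) if 'resource' in entry
        )
        # The `next` URL already encodes the search, so send params only once
        params = None
        url = next(
            (link['url'] for link in bundle.get('link', []) if link.get('relation') == 'next'),
            None,
        )
    return resources
```

The returned list is plain JSON-compatible data, so it can be saved locally with `json.dump`.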

## Step 3 Mount Data onto Pandas Dataframe

Take FHIR formatted data and convert it to a pandas dataframe for subsequent analysis.

This resource seems like a good one! https://github.com/dermatologist/fhiry
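
As a hand-rolled alternative, a few Patient fields can be flattened into one row per patient. This is a minimal sketch, not how fhiry works: `patient_rows` is a hypothetical helper, and the chosen columns are an assumption about what the analysis will need.

```python
def patient_rows(patients):
    """Flatten FHIR Patient resources into one flat dict per patient."""
    rows = []
    for patient in patients:
        # Use the first listed name; fall back to an empty dict if absent
        name = (patient.get('name') or [{}])[0]
        rows.append({
            'id': patient.get('id'),
            'family': name.get('family'),
            'given': ' '.join(name.get('given', [])),
            'gender': patient.get('gender'),
            'birthDate': patient.get('birthDate'),
        })
    return rows
```

Because the result is a list of dicts with identical keys, `pandas.DataFrame(patient_rows(patients))` produces a DataFrame with one column per key.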

## Step 4 Exploratory Data Analysis 

Conduct some limited EDA for demonstration purposes.

## Summary

(A review of what was done in this notebook, possibly reinforcing how this kind of use case could be useful in the real world)