<center><h1>🔥  FHIR Kindling</h1></center>

<center>Python FHIR client library for easier and safer interactions with FHIR servers and resources</center>

## Why this library was made
- The Personal Health Train (PHT) requires FHIR project warehouses
- Transferring data between FHIR servers is difficult and tedious
- There is no automatic conversion to a tabular format for analysis
- Existing libraries felt slow



## Features

- Create, Read, Update and Delete resources using a server's REST API
- Resource validation powered by pydantic models
- Transfer resources between FHIR servers
- CSV/Dataframe serialization for resources & bundles
- Synthetic data generation and upload


<center><img src="benchmark.png" alt="benchmark" style="width:1300px; height: 800px"/></center>

## Installation

Install the latest published version from PyPI:
```bash
pip install --user fhir-kindling
```
or install the newest version directly from GitHub:
```bash
pip install --user git+https://github.com/migraf/fhir-kindling.git
```

More details can be found in the [documentation]()


In [None]:
!pip install --upgrade fhir-kindling
!pip install RISE

<center><h2>👨‍💻   How to use the library</h2></center>

## Connecting to a server

- Different auth methods: Basic, Bearer, OIDC
- Configuration of proxies and custom headers
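
As a rough illustration of what the auth methods listed above amount to on the wire, the sketch below builds the `Authorization` header for Basic and Bearer authentication using only the standard library. The helper names and credentials are hypothetical and are not part of the library's API; with OIDC, the token obtained from the identity provider is typically sent as a Bearer token in the same way.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (RFC 7617)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(token: str) -> dict:
    """Build an HTTP Bearer Authorization header (RFC 6750)."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical credentials, for illustration only
print(basic_auth_header("demo", "secret"))
print(bearer_auth_header("abc123"))
```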

In [58]:
import os
from dotenv import load_dotenv, find_dotenv
from fhir_kindling import FhirServer

_ = load_dotenv(find_dotenv())

In [59]:

fhir_api = "https://demo.personalhealthtrain.de/demo-fhir-3"
username = os.getenv("DEMO_USER")
password = os.getenv("DEMO_PW")
server = FhirServer(
    api_address=fhir_api,
    username=username,
    password=password,
)

## Query for resources

Query the server with the `query()` method of the server class.

Three ways to define a query:
- Iteratively build the query on a resource using methods like `where()`, `include()`, `has()`
- Use an existing `query_string` to define the query, e.g. `Patient?_id=123`
- Pass a `FHIRQueryParameters` object to the query method

## Iteratively building a query

Start building a query by selecting the base resource first

In [60]:
query = server.query("Patient")
query.query_url

'https://demo.personalhealthtrain.de/demo-fhir-3/Patient?&_count=5000&_format=json'

### Querying the server
The query is executed against the server using one of the methods `all()`, `first()` or `limit()`.

In [61]:
response = query.all()
response

<QueryResponse(resource=Patient, n=673)>

In [62]:
response = query.limit(5)
response

<QueryResponse(resource=Patient, n=5)>
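
FHIR servers return search results as paged bundles, where each page may carry a `next` link to the following page. The sketch below shows one way "fetch everything" and "fetch at most n" could be implemented on top of that paging mechanism. It is an assumption about the general approach, not a description of how `all()` and `limit()` are implemented in the library; `fetch_all`, `fetch_page` and the in-memory `pages` are hypothetical stand-ins for real HTTP requests.

```python
def fetch_all(fetch_page, first_url, limit=None):
    """Collect resources across bundle pages by following 'next' links."""
    resources = []
    url = first_url
    while url:
        bundle = fetch_page(url)
        resources.extend(entry["resource"] for entry in bundle.get("entry", []))
        if limit is not None and len(resources) >= limit:
            return resources[:limit]
        # the 'next' link is absent on the last page
        url = next(
            (link["url"] for link in bundle.get("link", []) if link["relation"] == "next"),
            None,
        )
    return resources


# In-memory stand-in for a server: two pages of hypothetical patients
pages = {
    "page1": {
        "entry": [{"resource": {"resourceType": "Patient", "id": str(i)}} for i in range(3)],
        "link": [{"relation": "next", "url": "page2"}],
    },
    "page2": {
        "entry": [{"resource": {"resourceType": "Patient", "id": str(i)}} for i in range(3, 5)],
        "link": [],
    },
}

all_patients = fetch_all(pages.__getitem__, "page1")
first_two = fetch_all(pages.__getitem__, "page1", limit=2)
print(len(all_patients), len(first_two))
```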

The resources in a `QueryResponse` object are accessed via its `resources` attribute.

In [63]:
response.resources[0] 

Patient(resource_type='Patient', fhir_comments=None, id='1', implicitRules=None, implicitRules__ext=None, language=None, language__ext=None, meta=Meta(resource_type='Meta', fhir_comments=None, extension=None, id=None, lastUpdated=datetime.datetime(2022, 6, 8, 10, 36, 17, 669000, tzinfo=datetime.timezone.utc), lastUpdated__ext=None, profile=None, profile__ext=None, security=None, source='#SCHsV8ok4VcRcOjM', source__ext=None, tag=None, versionId='1', versionId__ext=None), contained=None, extension=None, modifierExtension=None, text=Narrative(resource_type='Narrative', fhir_comments=None, extension=None, id=None, div='<div xmlns="http://www.w3.org/1999/xhtml"><div class="hapiHeaderText">Duncann <b>NOVARA </b></div><table class="hapiPropertyTable"><tbody><tr><td>Date of birth</td><td><span>08 August 1968</span></td></tr></tbody></table></div>', div__ext=None, status='generated', status__ext=None), active=None, active__ext=None, address=None, birthDate=datetime.date(1968, 8, 8), birthDate__

### Adding filter conditions

Filter parameters are added on the fields of the base resource using the `where()` method.

In [64]:
query_2 = server.query("Patient").where("birthdate", "lt", "2000-01-01")
query_2.query_url

'https://demo.personalhealthtrain.de/demo-fhir-3/Patient?birthdate=lt2000-01-01&_count=5000&_format=json'

In [65]:
query_2.all()

<QueryResponse(resource=Patient, n=632)>

### Including related resources

In [66]:
query_3 = query_2.include(resource="Condition", reference_param="subject", reverse=True)
query_3.query_url

'https://demo.personalhealthtrain.de/demo-fhir-3/Patient?birthdate=lt2000-01-01&_revinclude=Condition:subject&_count=5000&_format=json'

In [67]:
resp = query_3.all()
resp

<QueryResponse(resource=Patient, format=json, included_resources=['Condition'])>
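
The query URLs shown above follow the FHIR search syntax: a filter becomes `param=<prefix><value>` and a reverse include becomes `_revinclude=<Resource>:<reference_param>`. The following standard-library sketch reproduces the two URLs from this section. The function `build_search_url` is hypothetical and only illustrates the URL format; it is not how the library builds its queries internally.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource, filters=None, revincludes=None, count=5000):
    """Assemble a FHIR search URL from filter conditions and reverse includes."""
    params = []
    # each filter is (search parameter, prefix operator, value)
    for field, operator, value in filters or []:
        params.append((field, f"{operator}{value}"))
    # each reverse include is (including resource, reference parameter)
    for inc_resource, reference_param in revincludes or []:
        params.append(("_revinclude", f"{inc_resource}:{reference_param}"))
    params.append(("_count", str(count)))
    params.append(("_format", "json"))
    # keep ':' unescaped so the include parameter stays readable
    return f"{base_url}/{resource}?{urlencode(params, safe=':')}"


base = "https://demo.personalhealthtrain.de/demo-fhir-3"
print(build_search_url(base, "Patient", filters=[("birthdate", "lt", "2000-01-01")]))
print(
    build_search_url(
        base,
        "Patient",
        filters=[("birthdate", "lt", "2000-01-01")],
        revincludes=[("Condition", "subject")],
    )
)
```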

## Working with the response

The response to the query is a `QueryResponse` object.

- The `resources` attribute contains a list of resources of the base resource type returned by the query
- The `included_resources` attribute contains a list of included resources. Each entry in the list represents a list of resources of a certain type
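
To make the distinction between base and included resources concrete, here is a minimal sketch of how a raw searchset bundle (as a plain dictionary) could be split into the two groups described above. `split_searchset` and the sample bundle are hypothetical; the library's own `QueryResponse` does this work for you.

```python
from collections import defaultdict


def split_searchset(bundle: dict, base_resource: str):
    """Separate base resources from included resources in a searchset bundle."""
    resources = []
    included = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        if resource["resourceType"] == base_resource:
            resources.append(resource)
        else:
            # group included resources by their type
            included[resource["resourceType"]].append(resource)
    return resources, dict(included)


# Hypothetical bundle with one patient and one condition referencing it
bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "899"}},
        {
            "resource": {
                "resourceType": "Condition",
                "id": "1077",
                "subject": {"reference": "Patient/899"},
            }
        },
    ],
}

patients, included = split_searchset(bundle, "Patient")
print(len(patients), list(included))
```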


In [68]:
[resource.resource_type for resource in resp.included_resources]

['Condition']

In [69]:
resp.included_resources[0].resources[0]

Condition(resource_type='Condition', fhir_comments=None, id='1077', implicitRules=None, implicitRules__ext=None, language=None, language__ext=None, meta=Meta(resource_type='Meta', fhir_comments=None, extension=None, id=None, lastUpdated=datetime.datetime(2022, 7, 27, 7, 12, 2, 341000, tzinfo=datetime.timezone.utc), lastUpdated__ext=None, profile=None, profile__ext=None, security=None, source='#oHMhBq6cAstLdzrN', source__ext=None, tag=None, versionId='1', versionId__ext=None), contained=None, extension=None, modifierExtension=None, text=None, abatementAge=None, abatementDateTime=None, abatementDateTime__ext=None, abatementPeriod=None, abatementRange=None, abatementString=None, abatementString__ext=None, asserter=None, bodySite=None, category=None, clinicalStatus=None, code=CodeableConcept(resource_type='CodeableConcept', fhir_comments=None, extension=None, id=None, coding=[Coding(resource_type='Coding', fhir_comments=None, extension=None, id=None, code='RA01.0', code__ext=None, display=

## Saving the response

Responses can be saved to a file using the `save()` method of the `QueryResponse` class.
Supported formats are `json`, `xml` (if the query was executed with `xml` format) and `csv`.

In [70]:
path = os.path.join(os.getcwd(), "query_response.json")

resp.save(file_path=path)

In [71]:
with open(path, "r") as f:
    print("".join(f.readlines()[:8]))

{
  "resourceType": "Bundle",
  "id": "d63030a4-923d-4f99-8a6d-12b531d3ff84",
  "meta": {
    "lastUpdated": "2022-09-06T15:00:39.337000+00:00"
  },
  "type": "searchset",
  "total": 632,



## Serializing resources into a pandas dataframe

A response (or any bundle) can be serialized into pandas dataframes.
If the response contains resources of different types, the resources are serialized into separate dataframes for each type.

In [72]:
dfs = resp.to_dfs()

In [73]:
dfs[0].head()

Unnamed: 0,resourceType,id,meta_versionId,meta_lastUpdated,meta_source,text_status,text_div,name_0_family,name_0_given_0,gender,birthDate
0,Patient,682,1,2022-07-26 15:13:18.680000+00:00,#hH1NYHyZmZNGbOAI,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Giovanini,Erford,other,1921-09-11
1,Patient,100,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Aiola,Tyniqua,male,1921-09-30
2,Patient,391,1,2022-07-26 14:18:27.623000+00:00,#QmJ643x11dMzK1oE,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Graveran,Elliekate,female,1921-11-04
3,Patient,683,1,2022-07-26 15:13:18.680000+00:00,#hH1NYHyZmZNGbOAI,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Loosen,Monigue,female,1921-11-27
4,Patient,407,1,2022-07-26 14:18:27.623000+00:00,#QmJ643x11dMzK1oE,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Anuska,Zamoni,female,1921-11-30


In [74]:
dfs[1].head()

Unnamed: 0,resourceType,id,meta_versionId,meta_lastUpdated,meta_source,code_coding_0_system,code_coding_0_code,code_coding_0_display,code_text,subject_reference,clinicalStatus_coding_0_system,clinicalStatus_coding_0_code,clinicalStatus_coding_0_display
0,Condition,1077,1,2022-07-27 07:12:02.341000+00:00,#oHMhBq6cAstLdzrN,http://id.who.int/icd/release/11/mms,RA01.0,"COVID-19, virus identified",COVID-19,Patient/899,,,
1,Condition,1078,1,2022-07-27 07:12:02.341000+00:00,#oHMhBq6cAstLdzrN,http://id.who.int/icd/release/11/mms,RA01.0,"COVID-19, virus identified",COVID-19,Patient/900,,,
2,Condition,1079,1,2022-07-27 07:12:02.341000+00:00,#oHMhBq6cAstLdzrN,http://id.who.int/icd/release/11/mms,RA01.0,"COVID-19, virus identified",COVID-19,Patient/901,,,
3,Condition,1080,1,2022-07-27 07:12:02.341000+00:00,#oHMhBq6cAstLdzrN,http://id.who.int/icd/release/11/mms,RA01.0,"COVID-19, virus identified",COVID-19,Patient/902,,,
4,Condition,1081,1,2022-07-27 07:12:02.341000+00:00,#oHMhBq6cAstLdzrN,http://id.who.int/icd/release/11/mms,RA01.0,"COVID-19, virus identified",COVID-19,Patient/903,,,


## Converting a list of resources to a dataframe

Any list of resources (pydantic models or dicts) can be converted to a dataframe using the `flatten_resources()` function.

In [75]:
from fhir_kindling.serde import flatten_resources

# get a list of patient resources
patients = server.query("Patient").limit(100).resources

In [76]:
flatten_resources(patients)

Unnamed: 0,resourceType,id,meta_versionId,meta_lastUpdated,meta_source,text_status,text_div,name_0_family,name_0_given_0,gender,birthDate
0,Patient,1,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Novara,Duncann,male,1968-08-08
1,Patient,2,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Weisman,Damiso,other,1981-01-26
2,Patient,3,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Newball,Kimbrick,male,1994-12-30
3,Patient,4,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Gargan,Chanae,female,1969-12-07
4,Patient,5,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Aromin,Kallysta,male,1943-11-06
...,...,...,...,...,...,...,...,...,...,...,...
95,Patient,96,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Balitas,Eligijus,female,1989-11-05
96,Patient,97,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Kliskey,Quierra,other,2000-11-14
97,Patient,98,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Arya,Welsey,female,1981-08-26
98,Patient,99,1,2022-06-08 10:36:17.669000+00:00,#SCHsV8ok4VcRcOjM,generated,"<div xmlns=""http://www.w3.org/1999/xhtml""><div...",Zaxas,Juawan,male,1951-03-10
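
The column names in the tables above (`meta_versionId`, `name_0_family`, `name_0_given_0`) suggest that nested fields are joined with underscores and list items are addressed by their index. The sketch below implements that idea for a single resource dictionary. It is a simplified illustration under that assumption, and the helper `flatten` is hypothetical rather than the library's `flatten_resources`.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a single-level dict with '_'-joined keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # leaf value: store it under the accumulated key
        return {prefix: value}
    flat = {}
    for key, item in items:
        child_prefix = f"{prefix}_{key}" if prefix else str(key)
        flat.update(flatten(item, child_prefix))
    return flat


patient = {
    "resourceType": "Patient",
    "id": "1",
    "meta": {"versionId": "1"},
    "name": [{"family": "Novara", "given": ["Duncann"]}],
    "gender": "male",
    "birthDate": "1968-08-08",
}

row = flatten(patient)
print(sorted(row))
```

A list of such rows can then be passed to `pandas.DataFrame` to obtain one table per resource type.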


## Additional CRUD operations

All other CRUD operations (and their asynchronous equivalents) are exposed as methods on the `FhirServer` object.

- Create: `add()`, `add_all()`
- Read: `get()`
- Update: `update()`
- Delete: `delete()`
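
In the FHIR REST API, each of these operations corresponds to an HTTP method on a resource URL. The sketch below spells out that mapping; `rest_request` is a hypothetical helper for illustration and does not reflect the internals of the `FhirServer` methods.

```python
def rest_request(operation, base_url, resource_type, resource_id=None):
    """Return the (HTTP method, URL) pair for a basic FHIR REST interaction."""
    methods = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}
    method = methods[operation]
    if operation == "create":
        # the server assigns the id on create, so the URL has no id segment
        return method, f"{base_url}/{resource_type}"
    if resource_id is None:
        raise ValueError(f"'{operation}' requires a resource id")
    return method, f"{base_url}/{resource_type}/{resource_id}"


base = "https://demo.personalhealthtrain.de/demo-fhir-3"
for operation, resource_id in [("create", None), ("read", "1643"), ("update", "1643"), ("delete", "1643")]:
    print(rest_request(operation, base, "Patient", resource_id))
```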

### Adding resources & resource validation

- Resources can be passed as pydantic models or plain dictionaries
- Dictionaries are validated with the corresponding model before being added to the server
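
The idea of validating a dictionary before sending it can be sketched without any dependencies. The example below uses plain Python checks instead of the pydantic models the library relies on, and it only covers a few Patient fields and full `YYYY-MM-DD` dates; `validate_patient` is a hypothetical helper, not a library function.

```python
from datetime import date


def validate_patient(data: dict) -> dict:
    """Check a few Patient fields before upload; raise ValueError on bad input."""
    if data.get("resourceType") != "Patient":
        raise ValueError("resourceType must be 'Patient'")
    validated = dict(data)
    if "birthDate" in data:
        # date.fromisoformat raises ValueError for malformed dates
        validated["birthDate"] = date.fromisoformat(data["birthDate"])
    for name in data.get("name", []):
        if not isinstance(name.get("given", []), list):
            raise ValueError("name.given must be a list of strings")
    return validated


patient_dict = {
    "resourceType": "Patient",
    "birthDate": "2000-01-01",
    "name": [{"family": "Mustermann", "given": ["Max"]}],
}
print(validate_patient(patient_dict)["birthDate"])

try:
    validate_patient({"resourceType": "Patient", "birthDate": "01.01.2000"})
except ValueError as error:
    print("rejected:", error)
```

Rejecting malformed input on the client side avoids a round trip to the server for resources it would refuse anyway.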


In [80]:
patient_dict = {"resourceType": "Patient", "birthDate": "2000-01-01", "name": [{"family": "Mustermann", "given": ["Max"]}]}
server.add(patient_dict)

<ResourceCreateResponse(resource_id=1643, location=https://demo.personalhealthtrain.de/demo-fhir-3/Patient/1643, version=None)>

### Get, Update, Delete

In [81]:
patient = server.get("Patient/1643")
patient.birthDate 

datetime.date(2000, 1, 1)

In [82]:
import datetime

patient.birthDate = datetime.date(1990, 1, 1)
server.update([patient])
updated = server.get("Patient/1643")
updated.birthDate

datetime.date(1990, 1, 1)

In [84]:
server.delete([updated])
server.get("Patient/1643")

HTTPStatusError: Client error '410 ' for url 'https://demo.personalhealthtrain.de/demo-fhir-3/Patient/1643'
For more information check: https://httpstatuses.com/410

## Generating synthetic data

Generate complex synthetic data sets using dataset and resource generator functions.
Interdependencies between resources and the likelihood of a resource being generated can be defined.


This example will generate a dataset with:
- Patients
- with Covid-19 conditions
- a certain likelihood of being vaccinated.
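
Before turning to the library's generators, the following dependency-free sketch shows what "likelihood" means for such a dataset: every patient gets a condition, while an immunization is only created when a random draw falls below the given probability. The function `generate_dataset` and the simplified resources are hypothetical and only illustrate the principle.

```python
import random


def generate_dataset(n, vaccination_likelihood, seed=None):
    """Generate patients with a condition each and an optional immunization."""
    rng = random.Random(seed)
    resources = []
    for i in range(n):
        patient_ref = f"Patient/{i}"
        resources.append({"resourceType": "Patient", "id": str(i)})
        # every patient gets a condition that references them
        resources.append(
            {"resourceType": "Condition", "subject": {"reference": patient_ref}}
        )
        # an immunization is only generated with the given probability
        if rng.random() < vaccination_likelihood:
            resources.append(
                {"resourceType": "Immunization", "patient": {"reference": patient_ref}}
            )
    return resources


dataset = generate_dataset(100, vaccination_likelihood=0.8, seed=42)
vaccinated = sum(r["resourceType"] == "Immunization" for r in dataset)
print(f"{vaccinated} of 100 patients received an immunization")
```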

Start by importing the generators and defining some constants, e.g. the codes for the condition and the vaccination.

In [None]:
from fhir.resources.codeableconcept import CodeableConcept
from fhir.resources.coding import Coding

from fhir_kindling.generators.patient import PatientGenerator
from fhir_kindling.generators.resource_generator import ResourceGenerator, GeneratorParameters, FieldValue
from fhir_kindling.generators.field_generator import FieldGenerator
from fhir_kindling.generators.dataset import DatasetGenerator
from fhir_kindling.fhir_query.query_parameters import QueryOperators

import pendulum

covid_code = CodeableConcept(
    coding=[
        Coding(
            system="http://id.who.int/icd/release/11/mms",
            code="RA01.0",
            display="COVID-19, virus identified"
        )
    ],
    text="COVID-19"
)

vaccination_code = CodeableConcept(
    coding=[
        Coding(
            system="http://id.who.int/icd/release/11/mms",
            code="XM0GQ8",
            display="COVID-19 vaccine, RNA based"
        )
    ],
    text="COVID vaccination"
)


Configure the dataset generator and its sub-generators.

In [None]:
count = 100
# create the dataset generator first so that resource generators can be attached to it
dataset_generator = DatasetGenerator("Patient", n=count)

covid_params = GeneratorParameters(
    field_values=[
        FieldValue(field="code", value=covid_code),
    ]
)

covid_generator = ResourceGenerator("Condition", generator_parameters=covid_params)
# add covid conditions to patients
dataset_generator.add_resource(covid_generator, name="covid")

vaccination_date_generator = FieldGenerator(
    field="occurrenceDateTime",
    generator_function=lambda: pendulum.now().to_date_string()
)

first_vax_params = GeneratorParameters(
    field_values=[
        FieldValue(field="vaccineCode", value=vaccination_code),
        FieldValue(field="status", value="completed"),
    ],
    field_generators=[
        vaccination_date_generator
    ]
)


In [None]:
vaccination_generator = ResourceGenerator("Immunization", generator_parameters=first_vax_params)
# add a first vaccination to patients with a likelihood of 0.8
dataset_generator.add_resource(vaccination_generator, name="first_vaccination", likelihood=0.8)

# generate the resources and upload them to the server
dataset = dataset_generator.generate(ids=True)

dataset.upload(server)

### Check if our server now has covid patients

In [85]:
covid_query = server.query("Patient").has(
    resource="Condition",
    search_param="code",
    operator=QueryOperators.eq,
    value="RA01.0",
    reference_param="subject",
).include(
    resource="Condition",
    reference_param="subject",
    reverse=True
)

covid_response = covid_query.all()
covid_response

<QueryResponse(resource=Patient, format=json, included_resources=['Condition'])>

## Transferring resources from one server to another

Use the `transfer()` method of a server object to transfer resources from one server to another while keeping referential integrity and using server-assigned IDs.  
The transfer is a three-step process:
1. Analyze the resources to be transferred and build a DAG modeling the references
2. Obtain any missing resources that are referenced from the source server
3. Upload the resources to the target server based on the reference DAG
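
The first two steps can be sketched with the standard library's `graphlib` module (Python 3.9+): collect the references of each resource into a dependency graph, sort it topologically so that referenced resources come first, and treat any referenced node that is not in the input as missing. `upload_order` and `find_references` are hypothetical helpers illustrating the approach, not the library's implementation.

```python
from graphlib import TopologicalSorter


def find_references(value):
    """Yield every 'reference' string found anywhere in a nested resource."""
    if isinstance(value, dict):
        for key, item in value.items():
            if key == "reference" and isinstance(item, str):
                yield item
            else:
                yield from find_references(item)
    elif isinstance(value, list):
        for item in value:
            yield from find_references(item)


def upload_order(resources):
    """Order resources so that referenced resources precede their referrers."""
    graph = {}
    for resource in resources:
        node = f"{resource['resourceType']}/{resource['id']}"
        # map each node to the set of nodes it depends on
        graph.setdefault(node, set()).update(find_references(resource))
    order = list(TopologicalSorter(graph).static_order())
    # referenced nodes that were not part of the input must be fetched first
    missing = [node for node in order if node not in graph]
    return order, missing


# Hypothetical input: a condition whose patient is not part of the selection
conditions = [
    {"resourceType": "Condition", "id": "1077", "subject": {"reference": "Patient/899"}}
]
order, missing = upload_order(conditions)
print(order, missing)
```

Uploading in this order means a resource is only created once everything it points to already exists on the target server.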

### Connect to an additional server and transfer the query results

In [86]:
# define a new server
transfer_api_url = "https://demo.personalhealthtrain.de/demo-fhir-4"
transfer_server = FhirServer(api_address=transfer_api_url, username=username, password=password)

server.transfer(transfer_server, covid_response)

<TransferResponse(origin_server=https://demo.personalhealthtrain.de/demo-fhir-3, destination_server=https://demo.personalhealthtrain.de/demo-fhir-4, query_parameters=resource='Patient' resource_parameters=None include_parameters=[IncludeParameter(resource='Condition', search_param='subject', target=None, reverse=True, iterate=False)] has_parameters=[ReverseChainParameter(operator=<QueryOperators.eq: 'eq'>, value='RA01.0', resource='Condition', reference_param='subject', search_param='code')], create_responses=<ResourceCreateResponse(resource_id=1543, location=Patient/1543, version=None)>...<ResourceCreateResponse(resource_id=1942, location=Patient/1942, version=None)>)

## Outlook

- Optimizations for handling large amounts of data

### Privacy Methods
- use automatic tabular serialization to evaluate bundle responses with methods like k-anonymity
- automatic anonymization of fields (e.g. rounding recorded dates to the year)
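
As a rough idea of what such an evaluation could look like on flattened rows, the sketch below computes the k-anonymity of a table (the size of the smallest group sharing the same quasi-identifier values) before and after truncating birth dates to the year. The helper names and sample rows are hypothetical, and none of this exists in the library yet.

```python
from collections import Counter


def round_to_year(row, field="birthDate"):
    """Return a copy of the row with an ISO date truncated to its year."""
    generalized = dict(row)
    generalized[field] = row[field][:4]
    return generalized


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Hypothetical flattened patient rows
rows = [
    {"gender": "male", "birthDate": "1968-08-08"},
    {"gender": "male", "birthDate": "1968-01-02"},
    {"gender": "female", "birthDate": "1981-01-26"},
    {"gender": "female", "birthDate": "1981-08-26"},
]

k_before = k_anonymity(rows, ["gender", "birthDate"])
k_after = k_anonymity([round_to_year(r) for r in rows], ["gender", "birthDate"])
print(k_before, k_after)
```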

### User interface
- Graphical user interface to create, save and execute queries
- Autocomplete for resources and their fields

<center><h2>Questions?</h2></center>


<center><h2>Feature requests and contributions are very welcome!</h2></center>