# FHIR kindling examples & tutorial

This notebook contains examples of how to use the FHIR kindling library. If you don't have access to a development FHIR server instance, you can start a [HAPI JPA Server](https://hapifhir.io/hapi-fhir/docs/server_plain/server_types.html) container by running the following command in the current working directory (requires docker and docker-compose to be installed and port 8082 to be available).
```shell
docker-compose up hapi
```

Make sure the library is installed in the current environment.

In [None]:
!pip install fhir-kindling


The following examples assume that you have a FHIR server running on localhost:8082. If you wish to connect to a remote server, you can find a more detailed description in the README.
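
Before running the examples it can help to confirm that the server is actually reachable. The FHIR specification exposes a server's CapabilityStatement at `[base]/metadata`, so a plain HTTP GET on that URL works as a simple check. The following is a minimal sketch using only the standard library; the helper names `capability_url` and `server_is_reachable` are made up for this illustration and are not part of fhir_kindling.

```python
from urllib.error import URLError
from urllib.request import urlopen


def capability_url(base_url: str) -> str:
    """Return the URL of the server's CapabilityStatement ([base]/metadata)."""
    return base_url.rstrip("/") + "/metadata"


def server_is_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on [base]/metadata answers with HTTP 200."""
    try:
        with urlopen(capability_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError, ValueError):
        # connection refused, timeout, DNS failure, HTTP error or malformed URL
        return False
```

Calling `server_is_reachable("http://localhost:8082/fhir")` returns `False` instead of raising if nothing is listening on the port, for example while the container is still starting.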


In [1]:
import os
from dotenv import load_dotenv, find_dotenv
import pandas as pd
from fhir_kindling import FhirServer

from fhir_kindling.generators import PatientGenerator

In [2]:
%load_ext autoreload
%autoreload 2


## Connecting to a FHIR server

Connect to a FHIR server at a given API endpoint. In this case, we only have to specify the API endpoint because the test API is not secured.


In [3]:
fhir_api_url = "http://localhost:8082/fhir"

server = FhirServer(api_address=fhir_api_url)

## First basic query using the server
Attempt to query the first 100 patients from the server.

In [None]:
resp = server.query("Patient")
result = resp.limit(100)
result.resources

If this is the first time you are using the development server, or if the server you are connecting to contains no Patient resources, the query will return no resources. So let's create some and upload them to the server.

In [None]:
# create 100 randomly generated patients
patients = PatientGenerator(n=100).generate()

# upload the patients to the server
response = server.add_all(patients)


Now we can query the server for the newly created patients.

In [None]:
query = server.query("Patient").all()
print(f"Num patients: {len(query.resources)}")
print(query.resources[0])

There should now be 100 patients on the server.
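
As background for the upload step: many resources are commonly sent to a FHIR server in one request by wrapping them in a Bundle. The sketch below shows the general shape of a FHIR transaction Bundle as plain dictionaries. It is not taken from the library's implementation, the helper name `make_transaction_bundle` is hypothetical, and the two patients are hand-written sample data.

```python
import json


def make_transaction_bundle(resources):
    """Wrap resource dicts in a FHIR transaction Bundle that POSTs each one."""
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {
                "resource": resource,
                # POST to the resource type endpoint lets the server assign the id
                "request": {"method": "POST", "url": resource["resourceType"]},
            }
            for resource in resources
        ],
    }


# two hand-written, hypothetical patients
sample_patients = [
    {"resourceType": "Patient", "gender": "female", "birthDate": "1980-01-01"},
    {"resourceType": "Patient", "gender": "male", "birthDate": "1975-06-15"},
]

bundle = make_transaction_bundle(sample_patients)
print(json.dumps(bundle, indent=2))
```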

## Dataset generation
Generating resources that retain referential integrity is a common difficulty when getting started with FHIR. This library provides a convenient way to generate a dataset of resources that can be uploaded to a FHIR server.
More resources, and references between resources on the server, are also required to showcase more complex queries.
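
To make "referential integrity" concrete: every reference, such as the `patient` reference of a MolecularSequence, should point at a resource that actually exists in the dataset. The following sketch checks this for plain dictionaries. It is an illustration only, it inspects just the `patient` field, and the function name `find_dangling_references` and the sample data are made up.

```python
def find_dangling_references(resources):
    """Return (source, target) pairs whose patient reference has no matching resource."""
    known = {f'{r["resourceType"]}/{r["id"]}' for r in resources if "id" in r}
    dangling = []
    for r in resources:
        ref = r.get("patient", {}).get("reference")
        if ref is not None and ref not in known:
            dangling.append((f'{r["resourceType"]}/{r.get("id")}', ref))
    return dangling


# hypothetical dataset: the second sequence points at a patient that is missing
sample_resources = [
    {"resourceType": "Patient", "id": "1"},
    {"resourceType": "MolecularSequence", "id": "10", "coordinateSystem": 0,
     "patient": {"reference": "Patient/1"}},
    {"resourceType": "MolecularSequence", "id": "11", "coordinateSystem": 0,
     "patient": {"reference": "Patient/2"}},
]

dangling = find_dangling_references(sample_resources)
print(dangling)  # [('MolecularSequence/11', 'Patient/2')]
```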

### Molecular sequence dataset based on a text file
In this example we use a text file containing molecular sequences to generate [FHIR molecular sequence resources](http://www.hl7.org/fhir/molecularsequence.html). There are example files in the `./hiv_sequences` folder.

#### Reading the relevant data from the text file
We want to extract the sequence and the variant from one of the example text files.

In [4]:
# read all lines from the text file
with open("./hiv_sequences/sequences_1.txt", "r") as sequence_file:
    seq_lines = [line.strip() for line in sequence_file.readlines()]

sequences = []
variants = []
for line in seq_lines:
    # split the line on tab chars
    _, sequence, variant = line.split("\t")
    sequences.append(sequence)
    # split the variants
    variants.append(variant.split())
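
The cell above assumes that every line holds exactly three tab-separated fields: an identifier (ignored), the sequence, and a space-separated list of variants. A line that does not match, such as a trailing blank line, raises a `ValueError` in the tuple unpacking. The following is a slightly more defensive sketch of the same parsing step; the function name `parse_sequence_lines` and the sample lines are made up for illustration and do not come from the example files.

```python
def parse_sequence_lines(lines):
    """Split tab-separated lines into a list of sequences and a list of variant lists."""
    parsed_sequences, parsed_variants = [], []
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) != 3:
            # skip blank or malformed lines instead of failing
            continue
        _, sequence, variant = fields
        parsed_sequences.append(sequence)
        # variants within a line are separated by whitespace
        parsed_variants.append(variant.split())
    return parsed_sequences, parsed_variants


# hypothetical input in the assumed format, including one blank line
sample_lines = [
    "seq_1\tCTRPNNNTRKS\tCC AT\n",
    "\n",
    "seq_2\tCTRPGNNTRKS\tGT\n",
]

demo_sequences, demo_variants = parse_sequence_lines(sample_lines)
print(demo_sequences)  # ['CTRPNNNTRKS', 'CTRPGNNTRKS']
print(demo_variants)   # [['CC', 'AT'], ['GT']]
```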

#### Creating the FHIR resources
The best way to go about creating a ResourceGenerator is by first looking up the resource definition of the resource you want to create on the [FHIR website](http://www.hl7.org/fhir/resourcelist.html).
Then you can import the corresponding FHIR resource classes from the [fhir.resources library](https://github.com/nazrulworld/fhir.resources) and use them to build a ResourceGenerator.

First import the resource and Generator classes.
Field generators can generate values based on a weighted choice from a list of values or based on a generator function that returns the matching value for the field.
Static field values can be specified as a list of `FieldValue` instances.

In [5]:
from fhir_kindling.generators import (ResourceGenerator, FieldGenerator, FieldValue, DatasetGenerator,
                                      GeneratorParameters)
from fhir.resources.molecularsequence import MolecularSequence

# create a generator for the MolecularSequence resource
mol_seq_generator = ResourceGenerator(MolecularSequence)

# turn the found variants and sequences into iterators
variant_iter = iter(variants.copy())
sequence_iter = iter(sequences.copy())


#  write a generator function for the variants found in the text file
def generate_variants():
    # variants need to be a list according to the FHIR spec
    generated_variants = []
    # get the next variant from the variants iterator
    iter_var = next(variant_iter)

    for v in iter_var:
        # append a dictionary with the variant information
        generated_variants.append(
            {
                "observedAllele": v,
            }
        )
    return generated_variants


# initialize a field generator instance for the variants using our generator function
variant_generator = FieldGenerator(field="variant", generator_function=generate_variants)

# Create a field generator for observedSeq. It simply returns the next sequence from the sequence iterator
sequence_generator = FieldGenerator(field="observedSeq", generator_function=lambda: next(sequence_iter))

# Static value for the coordinate system
coordinate_value = FieldValue(field="coordinateSystem", value=0)

# Group the generators and values into GeneratorParameters
params = GeneratorParameters(
    count=len(sequences),
    field_values=[coordinate_value],
    field_generators=[variant_generator, sequence_generator],
)

# set the parameters
mol_seq_generator.params = params

# Generate the list of resources
mol_seq = mol_seq_generator.generate()

# mol_seq
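
The two ways of producing field values described above, a weighted choice and a generator function, can be illustrated without the library. The sketch below is a plain Python analogy and not how `FieldGenerator` is implemented; the names `weighted_choice`, `demo_iter` and `next_sequence` are made up. It also shows why an iterator-backed generator function can only be used once per iterator: after the last item, `next` raises `StopIteration`, which is why the iterators are recreated before resources are generated a second time.

```python
import random

rng = random.Random(42)  # fixed seed so repeated runs give the same draws


def weighted_choice(values, weights):
    """Pick one value; larger weights are proportionally more likely."""
    return rng.choices(values, weights=weights, k=1)[0]


# Mode 1: weighted choice from a list of values
genders = [
    weighted_choice(["female", "male", "other"], [0.49, 0.49, 0.02])
    for _ in range(5)
]

# Mode 2: a generator function that hands out prepared values one by one
demo_iter = iter(["AAA", "CCC"])


def next_sequence():
    """Return the next prepared value; raises StopIteration when none are left."""
    return next(demo_iter)


first, second = next_sequence(), next_sequence()

try:
    next_sequence()
    exhausted = False
except StopIteration:
    # the iterator is used up and has to be recreated before the next run
    exhausted = True

print(first, second, exhausted)  # AAA CCC True
```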

#### Uploading the resources to the server

To associate the generated sequences with some patients we can use the data set generator.

In [None]:
# reset the iterators
variant_iter = iter(variants.copy())
sequence_iter = iter(sequences.copy())

# create a data set generator instance
sequence_ds_generator = DatasetGenerator(n=len(sequences))
# add our resource generator after setting its count to None, so that the data set generator
# handles the distribution of resources
mol_seq_generator.params.count = None
sequence_ds_generator = sequence_ds_generator.add_resource(resource_generator=mol_seq_generator)

# generate the data set
sequence_ds = sequence_ds_generator.generate()

This will generate a patient and add a reference to it for each molecular sequence.

In [None]:
# output the data set
sequence_ds

Now we can add the generated data set to the server.

In [9]:
resources, reference = sequence_ds.upload(server=server)

(<fhir_kindling.fhir_server.server_responses.BundleCreateResponse at 0x169ac415100>,
 <fhir_kindling.fhir_server.server_responses.BundleCreateResponse at 0x169ac4151f0>)

#### Accessing the generated resources
Getting our newly generated resources from the server demonstrates the more complex query functionality of this library.
We will write a query that only returns patients that have a sequence with the variant CC, and the query should include the sequence as well.

In [18]:
mol_query = server.query("MolecularSequence")
mol_query = mol_query.include(resource="MolecularSequence", search_param="patient", reverse=True)
response = mol_query.all()
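
The query in the cell above does not itself filter on the variant, so one way to narrow the results to sequences carrying the variant CC is a client-side pass over the returned resources. The following is a minimal sketch that assumes the resources have been converted to plain dictionaries; the helper names `has_variant` and `patients_with_variant` and the sample data are made up for illustration.

```python
def has_variant(sequence, allele):
    """Return True if any variant of the sequence has the given observed allele."""
    return any(
        v.get("observedAllele") == allele
        for v in sequence.get("variant") or []
    )


def patients_with_variant(sequences, allele):
    """Return the sorted patient references of all sequences carrying the allele."""
    refs = {
        s["patient"]["reference"]
        for s in sequences
        if "patient" in s and has_variant(s, allele)
    }
    return sorted(refs)


# hypothetical MolecularSequence resources as plain dicts
sample_sequences = [
    {"resourceType": "MolecularSequence", "id": "a",
     "patient": {"reference": "Patient/1"},
     "variant": [{"observedAllele": "CC"}]},
    {"resourceType": "MolecularSequence", "id": "b",
     "patient": {"reference": "Patient/2"},
     "variant": [{"observedAllele": "AT"}]},
    {"resourceType": "MolecularSequence", "id": "c",
     "patient": {"reference": "Patient/3"},
     "variant": None},
]

cc_patients = patients_with_variant(sample_sequences, "CC")
print(cc_patients)  # ['Patient/1']
```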

In [20]:
response.resources

[MolecularSequence(resource_type='MolecularSequence', fhir_comments=None, id='370', id__ext=None, implicitRules=None, implicitRules__ext=None, language=None, language__ext=None, meta=Meta(resource_type='Meta', fhir_comments=None, extension=None, id=None, id__ext=None, lastUpdated=datetime.datetime(2022, 1, 21, 17, 12, 14, 43000, tzinfo=datetime.timezone.utc), lastUpdated__ext=None, profile=None, profile__ext=None, security=None, source='#U6qbK0OfIh6tMXrg', source__ext=None, tag=None, versionId='1', versionId__ext=None), contained=None, extension=None, modifierExtension=None, text=None, coordinateSystem=0, coordinateSystem__ext=None, device=None, identifier=None, observedSeq='CTRPNNN-TRKS-I-RIQ-RGPGRAFVTI----GKIGNMRQAHC', observedSeq__ext=None, patient=Reference(resource_type='Reference', fhir_comments=None, extension=None, id=None, id__ext=None, display=None, display__ext=None, identifier=None, reference='Patient/202', reference__ext=None, type=None, type__ext=None), performer=None, po