# DigitalTWINS on FHIR - Live demo

## Introduction
The examples considered in this demo involve:
- 2 patients for Exemplar Project 1 - Biomarkers for pulmonary hypertension (note that a simplified workflow with one step/tool is used here).
- 2 patients for Exemplar Project 4 - Breast cancer reporting (6-step workflow).
- 1 patient contributed to both projects (i.e. in this scenario, they have both pulmonary hypertension and breast cancer).

## Definitions
- FHIR - Fast Healthcare Interoperability Resources
- FHIR Server - a server implemented according to the FHIR standard, allowing users to create, update, delete, and search FHIR health data.
- Primary Measurements - a SPARC SDS dataset.
- CWL - Common Workflow Language.
- Workflow tool process - generated by executing a tool/step of a workflow.
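
The queries in this demo map these concepts onto FHIR resource types (for example, workflows are searched as `PlanDefinition` and workflow tool processes as `Task`). As a quick orientation, the lookup below collects the mappings that the code cells in this notebook rely on. The names `CONCEPT_TO_RESOURCE` and `resource_type_for` are illustrative only and are not part of the digitaltwins-on-fhir client.

```python
# Concept-to-resource mapping as used by the queries in this demo.
# CONCEPT_TO_RESOURCE and resource_type_for are hypothetical names for illustration.
CONCEPT_TO_RESOURCE = {
    "dataset": "Composition",
    "assay": "ResearchStudy",
    "workflow": "PlanDefinition",
    "workflow tool": "ActivityDefinition",
    "workflow tool process": "Task",
}


def resource_type_for(concept: str) -> str:
    """Return the FHIR resource type this demo searches for a given concept."""
    return CONCEPT_TO_RESOURCE[concept.strip().lower()]


print(resource_type_for("Workflow"))
```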

## Learning outcomes

- Learn how to install the digitaltwins-on-fhir Python client.
- Learn how to find all primary measurements for a patient that have been contributed from multiple research studies. This includes:
  - finding FHIR ImagingStudy resources and their PACS endpoints.
- Learn how to find which workflow and tool generated a specific derived measurement observation.
- Learn how to find all tools and models used by a workflow and their workflow tool processes.
- Learn how to find inputs and outputs of a given tool in a workflow.




## Installing the digitaltwins-on-fhir Python client

Install the package with pip:

In [None]:
pip install digitaltwins-on-fhir

In [None]:
from digitaltwins_on_fhir import Adapter
from pprint import pprint
adapter = Adapter("http://localhost:8080/fhir")
client = adapter.async_client
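
The cells in this notebook use top-level `await`, which Jupyter and IPython support directly. In a plain Python script, the same calls need to run inside a coroutine driven by `asyncio.run`. The sketch below shows only that pattern: `fake_fetch_patients` is a hypothetical stand-in for an awaited client call such as `client.resources("Patient").search().fetch_all()`, so the example runs without a FHIR server.

```python
import asyncio


async def fake_fetch_patients():
    # Hypothetical stand-in for an awaited client call; returns canned data.
    await asyncio.sleep(0)
    return [
        {"resourceType": "Patient", "id": "p1"},
        {"resourceType": "Patient", "id": "p2"},
    ]


async def main():
    # Inside a coroutine, `await` works exactly as in the notebook cells.
    patients = await fake_fetch_patients()
    return [p["id"] for p in patients]


if __name__ == "__main__":
    # In a script, asyncio.run starts an event loop and runs main() to completion.
    print(asyncio.run(main()))
```

Note that `asyncio.run` is meant for scripts; inside a notebook, where an event loop is already running, plain `await` is the right tool.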

## Example 1: Finding all primary measurements for a patient

Let's select a patient to use as an example (the code below simply takes the second patient returned by the server).

Step 1: Find the selected patient's uuid and get all dataset Composition resources.

In [None]:

patients = await client.resources("Patient").search().fetch_all()
print(f"There are {len(patients)} patients in total: {patients}")
target_patient = patients[1]
p_uuid = target_patient.get_by_path([
        'identifier',
        {'system':'https://www.auckland.ac.nz/en/abi.html'},
        'value'
    ], '')
print(f"Let's select the patient: {p_uuid} as an example.")
dataset = []
research_subjects = await client.resources("ResearchSubject").search(patient=target_patient.to_reference()).fetch_all()
for r in research_subjects:
  # Each dataset is a Composition of type "primary measurements" whose subject is the ResearchSubject.
  compositions = await client.resources("Composition").search(type="primary measurements", subject=r.to_reference()).fetch_all()
  dataset.extend(compositions)

print(f"Data for patient: {p_uuid} was collected from {len(dataset)} datasets: {dataset}")
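
The path passed to `get_by_path` above walks into the `identifier` list, keeps the entry whose `system` matches, and returns its `value` (or the default `''` when nothing matches). For readers unfamiliar with that path syntax, the plain-Python function below performs an equivalent lookup on an ordinary dictionary. It illustrates the selection logic only and is not the client's implementation; `identifier_value` and the sample patient are made up for this sketch.

```python
ABI_SYSTEM = "https://www.auckland.ac.nz/en/abi.html"


def identifier_value(resource: dict, system: str, default: str = "") -> str:
    """Return the value of the first identifier whose system matches."""
    for identifier in resource.get("identifier", []):
        if identifier.get("system") == system:
            return identifier.get("value", default)
    return default


# Made-up patient for illustration only.
sample_patient = {
    "resourceType": "Patient",
    "identifier": [
        {"system": "urn:example:other", "value": "local-001"},
        {"system": ABI_SYSTEM, "value": "example-uuid"},
    ],
}

print(identifier_value(sample_patient, ABI_SYSTEM))
```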


Now we know that patient `c6923eb4-a5c2-4239-8b7a-16d1268b108d` appears in six datasets. From there, we can find all of their primary measurement data across those datasets.

Step 2: Next, find all primary measurements for the patient selected above. To do this, list the type of each measurement resource (e.g., Observation, ImagingStudy, DocumentReference), along with the corresponding code for each measurement, as recorded in each dataset Composition resource.

In [None]:
for c in dataset:
  measurements = c.get_by_path(["section", 0, "entry"])
  print()
  print(f"The patient {p_uuid} in dataset: '{c.get('title')}', has {len(measurements)} collected measurements.")
  duuid = c.get_by_path([
        'identifier',
        'value'
    ], '')
  for m in measurements:
    mr = await m.to_resource()
    if mr['resourceType'] == "Observation":
      code = mr.get_by_path(["code", "coding", 0, "code"])
      print(f"For dataset {c.get('title')}: {duuid}, a measurement `{mr.get('resourceType')}` was found with code: {code}.")
    elif mr['resourceType'] == "ImagingStudy":
      print(f"For dataset {c.get('title')}: {duuid}, a measurement `{mr.get('resourceType')}` with description: {mr.get('description')}.")
    elif mr['resourceType'] == "DocumentReference":
      des = mr.get_by_path([
            'content',
            0,
            "attachment",
            "title"
        ], '')
      url = mr.get_by_path([
            'content',
            0,
            "attachment",
            "url"
        ], '')
      print(f"For dataset {c.get('title')}: {duuid}, a measurement {mr.get('resourceType')} stores: {des} information, and you can find the file via this url: {url}.")
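
Each entry in a Composition section is a FHIR reference, typically of the form `ResourceType/id`. When only a summary is needed, the resource type can be read from the reference string without fetching every resource. The sketch below counts entries per type on a made-up Composition dictionary. It assumes every entry carries a relative or absolute reference ending in `ResourceType/id`, and, unlike the cell above, it walks all sections instead of only the first. `count_entries_by_type` is a hypothetical helper name.

```python
from collections import Counter


def count_entries_by_type(composition: dict) -> Counter:
    """Count section entries per resource type using the reference strings."""
    counts = Counter()
    for section in composition.get("section", []):
        for entry in section.get("entry", []):
            # "Observation/123" or ".../fhir/Observation/123" -> "Observation"
            parts = entry["reference"].rstrip("/").split("/")
            counts[parts[-2]] += 1
    return counts


# Made-up dataset Composition for illustration only.
sample_composition = {
    "resourceType": "Composition",
    "section": [{
        "entry": [
            {"reference": "Observation/obs-1"},
            {"reference": "Observation/obs-2"},
            {"reference": "ImagingStudy/img-1"},
            {"reference": "http://localhost:8080/fhir/DocumentReference/doc-1"},
        ]
    }],
}

print(dict(count_entries_by_type(sample_composition)))
```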


Step 3: List the study and series PACS endpoints for each ImagingStudy that has been performed on the specified patient. This allows us to locate all imaging study data that has been collected for that individual.

In [None]:
images = await client.resources("ImagingStudy").search(subject=target_patient.to_reference()).fetch_all()

for image in images:
  composition = await client.resources("Composition").search(entry=image.to_reference()).first()
  duuid = composition.get_by_path([
      'identifier',
      'value'
  ], '')
  print(f"The image: {image} comes from dataset ID: {duuid}")
  image_study_endpoint = await image.get("endpoint")[0].to_resource()
  print(f"The image: {image} study PACS address is: {image_study_endpoint.get('address')}")
  for series in image.get("series"):
    series_endpoint = await series.get("endpoint")[0].to_resource()
    print(f"The image: {image} study has a PACS address of: {series_endpoint.get('address')} for one of its series.")
  print("-----------------------------------------------------------------------------------\n")
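
An ImagingStudy can reference endpoints at the study level and again on each series, and the cell above takes the first endpoint at each level. If a study or series has no endpoint, or several series share one, it can be convenient to collect the unique endpoint references first and resolve them afterwards. A minimal sketch on a made-up ImagingStudy dictionary follows; `endpoint_references` is a hypothetical helper, not part of the client.

```python
def endpoint_references(imaging_study: dict) -> list:
    """Collect unique endpoint references from the study and its series, in order."""
    candidates = list(imaging_study.get("endpoint", []))
    for series in imaging_study.get("series", []):
        candidates.extend(series.get("endpoint", []))

    refs = []
    for endpoint in candidates:
        ref = endpoint.get("reference")
        if ref and ref not in refs:
            refs.append(ref)
    return refs


# Made-up ImagingStudy for illustration only.
sample_study = {
    "resourceType": "ImagingStudy",
    "endpoint": [{"reference": "Endpoint/study-pacs"}],
    "series": [
        {"uid": "1.2.3", "endpoint": [{"reference": "Endpoint/series-pacs-1"}]},
        {"uid": "1.2.4", "endpoint": [{"reference": "Endpoint/study-pacs"}]},
        {"uid": "1.2.5"},  # a series without its own endpoint
    ],
}

print(endpoint_references(sample_study))
```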

## Example 2: Find which workflow, tool, and primary data were used to generate a specific derived measurement observation

It might be of interest to find out more about the provenance of a given observation, e.g. one called "tumour position" (uuid: `231d9946-949a-4fee-8695-5887209bd2db_2673e5a3-8437-41f5-9fef-0983f5662e93_Workflow-Process-Output-Observation-0-0`). For example, we could be interested in finding:
- which assay, workflow and tool generated this observation.

We can start by defining the observation of interest:

In [None]:
ob = await client.resources("Observation").search(
            identifier="231d9946-949a-4fee-8695-5887209bd2db_2673e5a3-8437-41f5-9fef-0983f5662e93_Workflow-Process-Output-Observation-0-0").first()

We can find which workflow and tool generated this observation using the following steps:

In [None]:
# Step 1: Find the workflow result composition (dataset) that this observation belongs to.
composition = await client.resources("Composition").search(entry=ob.to_reference().reference).first()
print("Step 1: The observation belongs to this dataset (Composition resource): ", composition)

# Step 2: Find out which patient this Composition resource belongs to.
authors = composition.get("author", [])
patient = None  # stays None if none of the authors is a Patient
for a in authors:
  temp = await a.to_resource()
  if temp["resourceType"] == "Patient":
    patient = temp
    break
if patient is not None:
  p_uuid = patient.get_by_path([
        'identifier',
        {'system':'https://www.auckland.ac.nz/en/abi.html'},
        'value'
    ], '')
  print("Step 2: The patient has been found, and the uuid is: ", p_uuid)

# Step 3: Find all assays the patient was involved in.
research_subjects = await client.resources("ResearchSubject").search(individual=patient.to_reference().reference).fetch_all()
assays = [await r["study"].to_resource() for r in research_subjects if r.get("study") is not None]
print("Step 3: The patient was involved in these assays: ", assays)
# Step 4: Find the tool process that generated the observation.
tool_process = []
flag = False
for a in assays:
  processes = await client.resources("Task").search(subject=a.to_reference().reference).fetch_all()
  tool_process.extend(processes)
for t in tool_process:
  outputs = t.get("output")
  if outputs != None:
    for o in outputs:
      if ob.to_reference().reference == o["valueReference"].reference:
        flag = True
        t_uuid = t.get_by_path([
              'identifier',
              {'system':'https://www.auckland.ac.nz/en/abi.html'},
              'value'
          ], '')
        assay = await t["for"].to_resource()
        workflow = await assay["protocol"][0].to_resource()
        study = await assay["partOf"][0].to_resource()
        workflow_tool = await t["focus"].to_resource()
        print(f"Now, we find the Observation `{ob.get('code').get('text')}` was generated by the process: {t_uuid}, \n using workflow tool `{workflow_tool['name']}`: {workflow_tool} and workflow `{workflow.get('name')}`: {workflow}")
        print(f"Also, we find out the Observation belongs to this assay `{assay.get('title')}`: {assay}, \n and this assay belongs to the study `{study.get('title')}`: {study} ")
        break
    if flag:
      break
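
The core of Step 4 is a search over Task outputs for a reference that matches the observation. Isolated from the FHIR client, that matching logic can be sketched as a small function that returns the first matching Task and so avoids the `flag` variable and nested `break` statements. The Task dictionaries below are made up, and `find_producing_task` is a hypothetical helper name.

```python
def find_producing_task(tasks, target_reference):
    """Return the first Task whose outputs reference target_reference, else None."""
    for task in tasks:
        # Tasks without any output are skipped.
        for output in task.get("output") or []:
            value_ref = output.get("valueReference", {})
            if value_ref.get("reference") == target_reference:
                return task
    return None


# Made-up workflow tool processes for illustration only.
sample_tasks = [
    {"id": "t1", "output": [{"valueReference": {"reference": "Observation/a"}}]},
    {"id": "t2"},
    {"id": "t3", "output": [{"valueReference": {"reference": "Observation/b"}}]},
]

match = find_producing_task(sample_tasks, "Observation/b")
print(match["id"] if match else "no producing task found")
```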

We can also find all the inputs used to generate the Observation, along with the uuid of the dataset each input belongs to:

In [None]:
# Step 1: Find all the inputs
processes = await client.resources("Task").search(subject=assay.to_reference().reference, owner=patient.to_reference().reference).fetch_all()
print(f"To generate the Observation {ob} for patient `{p_uuid}`, the following primary inputs were used: \n")
for t in processes:
  temp = t.get("input")
  if temp != None:
    for i in temp:
      primary_input = await i["valueReference"].to_resource()
      primary_input_uuid = primary_input.get_by_path([
              'identifier',
              {'system':'https://www.auckland.ac.nz/en/abi.html'},
              'value'
          ], '')
      composition = await client.resources("Composition").search(entry=primary_input.to_reference().reference).first()
      d_uuid = composition.get_by_path([
            'identifier',
            'value'
        ], '')
      if primary_input['resourceType'] == "ImagingStudy":
        print(f"The input {primary_input['resourceType']}: `{primary_input_uuid}` \n is in this dataset: {d_uuid}\n")
      elif primary_input['resourceType'] == "DocumentReference":
        des = primary_input.get_by_path([
              'content',
              0,
              "attachment",
              "title"
          ], '')
        print(f"The input {primary_input['resourceType']} `{des}`: `{primary_input_uuid}` \n is in this dataset: {d_uuid}\n")
      else:
        print(f"The input {primary_input['resourceType']} `{primary_input.get('code').get('text')}`: `{primary_input_uuid}` \n is in this dataset: {d_uuid}\n")

## Example 3: Find all tools and models used by a workflow and their workflow tool processes

Let's find all workflows, and then choose a specific uuid that we can use as an example for finding all tools used by the workflow.

In [None]:
# Get all workflows
workflows =  await client.resources("PlanDefinition").search().fetch_all()

for i, w in enumerate(workflows):
  uid = w.get_by_path([
          'identifier',
          {'system':'https://www.auckland.ac.nz/en/abi.html'},
          'value'
      ], '')
  print(f"{i}, The workflow resource: {w.get('name')}, {uid}")


We can also find a workflow directly by searching for its `name`, e.g. "Automated torso model generation - script".

In [None]:
breast_workflow = await client.resources("PlanDefinition").search(name="Automated torso model generation - script").fetch_all()
print("Automated torso model generation workflow:", breast_workflow)
print("uuid: ", breast_workflow[0].get_by_path([
        'identifier',
        {'system':'https://www.auckland.ac.nz/en/abi.html'},
        'value'
    ]))

We can then find all the tools that were used in this workflow including the software and/or models used in the tool (here we show how we can access the workflow from its UUID).

In [None]:
workflow = await client.resources("PlanDefinition").search(identifier="e3b3eaa0-65ae-11ef-917d-484d7e9beb16").first()
actions = workflow.get("action")
for a in actions:
  if a.get("definitionCanonical") is None:
      continue
  resource_type, _id = a.get("definitionCanonical").split("/")
  workflow_tool = await client.reference(resource_type, _id).to_resource()
  print(f"workflow tool name: {workflow_tool.get('name')}", workflow_tool)
  print("Software and model uuids:")
  pprint(workflow_tool.get("participant"))
  print("---------------------------------")
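
The loop above assumes that `definitionCanonical` has exactly the form `ResourceType/id`. FHIR canonical values may also be absolute URLs and may carry a `|version` suffix, in which case unpacking `split("/")` into two names would raise an error. The sketch below is a more tolerant parse; `parse_canonical` is a hypothetical helper, and it assumes the last two path segments are the resource type and id, which is a common convention and not a guarantee.

```python
def parse_canonical(canonical: str):
    """Split a canonical reference into (resource_type, resource_id, version)."""
    # Separate an optional "|version" suffix from the rest.
    target, _, version = canonical.partition("|")
    parts = target.rstrip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"Cannot find a resource type in {canonical!r}")
    # Assumption: the last two segments are the resource type and the id.
    return parts[-2], parts[-1], version or None


print(parse_canonical("ActivityDefinition/abc"))
print(parse_canonical("http://example.org/fhir/ActivityDefinition/abc|1.0"))
```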

We can also find all the workflow tool processes that have been run for that particular workflow.

In [None]:
# Step 1: Find all assays that use this workflow.
assays = await client.resources("ResearchStudy").search(protocol=workflow.to_reference().reference).fetch_all()
w_uid = workflow.get_by_path([
          'identifier',
          {'system':'https://www.auckland.ac.nz/en/abi.html'},
          'value'
      ], '')
for a in assays:
  a_uid = a.get_by_path([
          'identifier',
          {'system':'https://www.auckland.ac.nz/en/abi.html'},
          'value'
      ], '')
  workflow_tool_processes = await client.resources("Task").search(
              subject=a.to_reference().reference).fetch_all()
  print(f"Here are all processes for the assay `{a_uid}` that use the workflow `{w_uid}`")
  pprint(workflow_tool_processes)
  print()

## Example 4: Find inputs and outputs of a given tool in a workflow
Here we specify some of the resources that we are looking for directly; however, all of this information can be queried as shown in the previous examples:

*   workflow uuid: 4c36c076-3813-4247-8317-e163901b1ae3
*   workflow tool uuid: 2673e5a3-8437-41f5-9fef-0983f5662e93

By specifying the workflow and workflow tool UUIDs above, we can retrieve all related assays and workflow tool processes.


In [None]:

# Step 1: Find the workflow and assays.
workflow = await client.resources("PlanDefinition").search(identifier="4c36c076-3813-4247-8317-e163901b1ae3").first()
assays = await client.resources("ResearchStudy").search(protocol=workflow.to_reference().reference).fetch_all()
# Step 2: Find the workflow tool.
workflow_tool = await client.resources("ActivityDefinition").search(
    identifier="2673e5a3-8437-41f5-9fef-0983f5662e93").first()
# Step 3: Find all workflow tool processes for that workflow tool.
workflow_tool_processes = []
for a in assays:
  processes = await client.resources("Task").search(subject=a.to_reference(),
                                                    focus=workflow_tool.to_reference()).fetch_all()
  workflow_tool_processes.extend(processes)

print("Workflow tool process: ", workflow_tool_processes)

Now we have all the processes for that workflow tool.

We can then find the input and output of each workflow tool process:

In [None]:
print(f"Now, we can find all inputs and outputs of the workflow tool: {workflow_tool.get('name')} \n")
for workflow_tool_process in workflow_tool_processes:
  inputs = workflow_tool_process.get("input")
  outputs = workflow_tool_process.get("output")
  patient = await workflow_tool_process.get("owner").to_resource()
  p_uid = patient.get_by_path([
          'identifier',
          {'system':'https://www.auckland.ac.nz/en/abi.html'},
          'value'
      ], '')
  print(f"Inputs of patient {p_uid}: ")
  for i in inputs:
      resource = await i.get("valueReference").to_resource()
      # await client.resources("Composition").search(entry=ob.to_reference().reference).first()
      composition = await client.resources("Composition").search(entry=i.get("valueReference")).first()
      c_uid = composition.get_by_path([
                  'identifier',
                  'value'
              ], '')
      if resource['resourceType'] == "ImagingStudy":
        print(f"The workflow tool: {workflow_tool.get('name')} `2673e5a3-8437-41f5-9fef-0983f5662e93` uses patient `{p_uid}`'s {resource.get('resourceType')} as input, which comes from dataset `{c_uid}`")
      elif resource['resourceType'] == "DocumentReference":
        des = resource.get_by_path([
              'content',
              0,
              "attachment",
              "title"
          ], '')
        print(f"The workflow tool: {workflow_tool.get('name')} `2673e5a3-8437-41f5-9fef-0983f5662e93` uses patient `{p_uid}`'s {resource.get('resourceType')} `{des}` as input, which comes from dataset `{c_uid}`")
      else:
        print(f"The workflow tool: {workflow_tool.get('name')} `2673e5a3-8437-41f5-9fef-0983f5662e93` uses patient `{p_uid}`'s {resource.get('resourceType')} `{resource.get('code').get('text')}` \n (value: {resource.get('valueString')}) as input, which comes from dataset `{c_uid}`")
  print()
  print(f"Outputs of patient {p_uid}: ")
  for o in outputs:
      resource = await o.get("valueReference").to_resource()
      # await client.resources("Composition").search(entry=ob.to_reference().reference).first()
      composition = await client.resources("Composition").search(entry=o.get("valueReference")).first()
      c_uid = composition.get_by_path([
                  'identifier',
                  'value'
              ], '')
      if resource['resourceType'] == "ImagingStudy":
        print(f"The workflow tool: {workflow_tool.get('name')} `2673e5a3-8437-41f5-9fef-0983f5662e93` generated an output {resource.get('resourceType')} for patient `{p_uid}`, which was saved in dataset `{c_uid}`")
      elif resource['resourceType'] == "DocumentReference":
        des = resource.get_by_path([
              'content',
              0,
              "attachment",
              "title"
          ], '')
        print(f"The workflow tool: {workflow_tool.get('name')} `2673e5a3-8437-41f5-9fef-0983f5662e93` generated an output {resource.get('resourceType')} `{des}` for patient `{p_uid}`, which was saved in dataset `{c_uid}`")
      else:
        print(f"The workflow tool: {workflow_tool.get('name')} `2673e5a3-8437-41f5-9fef-0983f5662e93` generated an output {resource.get('resourceType')} `{resource.get('code').get('text')}` \n (value: {resource.get('valueString')}) for patient `{p_uid}`, which was saved in dataset `{c_uid}`")
  print()
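
The input and output loops above repeat the same three-way branch on `resourceType`. One way to reduce that duplication is to move the branch into a single function that returns a short label, which both loops can then embed in their messages. The sketch below works on plain dictionaries with made-up sample resources; `describe_resource` is a hypothetical helper and not part of the client.

```python
def describe_resource(resource: dict) -> str:
    """Return a short label for an input or output resource."""
    resource_type = resource.get("resourceType", "Unknown")
    if resource_type == "ImagingStudy":
        return resource_type
    if resource_type == "DocumentReference":
        # Use the title of the first attachment, if there is one.
        content = resource.get("content") or [{}]
        title = content[0].get("attachment", {}).get("title", "")
        return f"{resource_type} `{title}`"
    # Fallback, e.g. for Observation: show the code text and string value.
    text = (resource.get("code") or {}).get("text", "")
    return f"{resource_type} `{text}` (value: {resource.get('valueString')})"


# Made-up resources for illustration only.
samples = [
    {"resourceType": "ImagingStudy"},
    {"resourceType": "DocumentReference",
     "content": [{"attachment": {"title": "mesh file"}}]},
    {"resourceType": "Observation",
     "code": {"text": "tumour position"}, "valueString": "left"},
]

for sample in samples:
    print(describe_resource(sample))
```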

