# Demonstrator - Standardized views for MomCare using SQL on FHIR

This notebook shows an alternative method for creating standardized views for MomCare using SQL on FHIR and [Pathling](https://pathling.csiro.au/).

It is based upon the process described in [FHIR-based reporting](https://health-data-commons.pharmaccess.org/docs/reporting.html).

In [1]:
import os

from pathling import PathlingContext
from pyspark.sql.functions import lit

## Load data from NDJSON

The first step is to load the data from the NDJSON files.

This data is stored in a directory specified by the `MOMCARE_DATA_PATH` environment variable.

The assumed structure of the data within the directory looks like this:

```
MOMCARE_DATA_PATH
├── Encounter.ndjson
├── Organization.ndjson
├── Procedure.ndjson
└── Condition.ndjson
```
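
Before creating the Pathling context, it can be useful to confirm that the directory actually contains the files listed above. The following is a minimal sketch using only the standard library; the helper name `missing_ndjson_files` is hypothetical, and the `.data` fallback simply mirrors the default used when the data is loaded.

```python
import os
from pathlib import Path

# Resource types this notebook expects, one NDJSON file per type.
EXPECTED_RESOURCES = ["Encounter", "Organization", "Procedure", "Condition"]


def missing_ndjson_files(data_path):
    """Return the expected NDJSON file names that are absent from data_path."""
    root = Path(data_path)
    return [
        f"{name}.ndjson"
        for name in EXPECTED_RESOURCES
        if not (root / f"{name}.ndjson").is_file()
    ]


if __name__ == "__main__":
    data_path = os.getenv("MOMCARE_DATA_PATH", ".data")
    missing = missing_ndjson_files(data_path)
    if missing:
        print(f"Missing from {data_path}: {', '.join(missing)}")
    else:
        print("All expected NDJSON files are present.")
```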

In [2]:
pc = PathlingContext.create()
data = pc.read.ndjson(os.getenv("MOMCARE_DATA_PATH", ".data"))


## Load SQL on FHIR views

We have defined a set of SQL on FHIR views that can be used to compose the patient timeline.

They are each based around extracting the relevant data from a particular FHIR resource type.

Each view conforms to both the "shareable" and "tabular" profiles, to maximise interoperability and also to ensure that the result can be accommodated within a standard database table.

### Encounter view

This view extracts the following fields from the `Encounter` resource:

- `encounter_id`
- `patient_id`
- `visit_provider_id`
- `visitType`
- `visit_type_code`

```json
{
  "name": "momcare_encounter",
  "title": "MomCare Encounter View",
  "version": "0.1.0",
  "url": "https://momcare.cot.pharmaccess.org/fhir/ViewDefinition/encounter",
  "meta": {
    "profile": [
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/ShareableViewDefinition",
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/TabularViewDefinition"
    ]
  },
  "status": "draft",
  "resource": "Encounter",
  "fhirVersion": [
    "4.0.1"
  ],
  "select": [
    {
      "column": [
        {
          "name": "encounter_id",
          "path": "identifier.where(use = 'temp').value",
          "type": "string",
          "collection": false
        },
        {
          "name": "patient_id",
          "path": "subject.reference",
          "type": "string",
          "collection": false
        },
        {
          "name": "visit_provider_id",
          "path": "serviceProvider.reference",
          "type": "string",
          "collection": false
        }
      ]
    },
    {
      "forEachOrNull": "type.coding.where(system = 'http://snomed.info/sct')",
      "column": [
        {
          "name": "visitType",
          "path": "display",
          "type": "string",
          "collection": false
        },
        {
          "name": "visit_type_code",
          "path": "code",
          "type": "code",
          "collection": false
        }
      ]
    }
  ]
}
```
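
The field list above can also be read straight off a view definition, which helps keep the prose and the JSON in step. The following is a minimal sketch using only the standard library. `column_names` is a hypothetical helper, and the inline JSON is a trimmed copy of the Encounter view (metadata and the `type` and `collection` attributes are omitted). It walks `column` entries and nested `select` clauses only, and does not handle `unionAll`.

```python
import json


def column_names(view_definition):
    """Collect column names from a ViewDefinition dict, in declaration order."""
    names = []

    def walk(selects):
        for select in selects:
            # Columns declared directly on this select clause.
            names.extend(col["name"] for col in select.get("column", []))
            # Nested select clauses contribute their columns afterwards.
            walk(select.get("select", []))

    walk(view_definition.get("select", []))
    return names


# Trimmed copy of the Encounter view, for illustration only.
encounter_view = json.loads("""
{
  "resource": "Encounter",
  "select": [
    {
      "column": [
        {"name": "encounter_id", "path": "identifier.where(use = 'temp').value"},
        {"name": "patient_id", "path": "subject.reference"},
        {"name": "visit_provider_id", "path": "serviceProvider.reference"}
      ]
    },
    {
      "forEachOrNull": "type.coding.where(system = 'http://snomed.info/sct')",
      "column": [
        {"name": "visitType", "path": "display"},
        {"name": "visit_type_code", "path": "code"}
      ]
    }
  ]
}
""")

print(column_names(encounter_view))
# ['encounter_id', 'patient_id', 'visit_provider_id', 'visitType', 'visit_type_code']
```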

### Organization view

This view extracts the following fields from the `Organization` resource:

- `visit_provider_id`
- `visit_provider_name`

```json
{
  "name": "momcare_organization",
  "title": "MomCare Organization View",
  "version": "0.1.0",
  "url": "https://momcare.cot.pharmaccess.org/fhir/ViewDefinition/organization",
  "meta": {
    "profile": [
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/ShareableViewDefinition",
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/TabularViewDefinition"
    ]
  },
  "status": "draft",
  "resource": "Organization",
  "fhirVersion": [
    "4.0.1"
  ],
  "select": [
    {
      "column": [
        {
          "name": "visit_provider_id",
          "path": "identifier.where(use = 'temp').value",
          "type": "string",
          "collection": false
        },
        {
          "name": "visit_provider_name",
          "path": "name",
          "type": "string",
          "collection": false
        }
      ]
    }
  ]
}
```

### Procedure view

This view extracts the following fields from the `Procedure` resource:

- `encounter_id`
- `procedure_time`
- `description_name`
- `system`
- `code`

```json
{
  "name": "momcare_procedure",
  "title": "MomCare Procedure View",
  "version": "0.1.0",
  "url": "https://momcare.cot.pharmaccess.org/fhir/ViewDefinition/procedure",
  "meta": {
    "profile": [
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/ShareableViewDefinition",
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/TabularViewDefinition"
    ]
  },
  "status": "draft",
  "resource": "Procedure",
  "fhirVersion": [
    "4.0.1"
  ],
  "select": [
    {
      "column": [
        {
          "name": "encounter_id",
          "path": "encounter.reference",
          "type": "string",
          "collection": false
        },
        {
          "name": "procedure_time",
          "path": "performed.ofType(dateTime)",
          "type": "dateTime",
          "collection": false
        },
        {
          "name": "description_name",
          "path": "code.text",
          "type": "string",
          "collection": false
        }
      ]
    },
    {
      "forEachOrNull": "code.coding",
      "column": [
        {
          "name": "system",
          "path": "system",
          "type": "uri",
          "collection": false
        },
        {
          "name": "code",
          "path": "code",
          "type": "code",
          "collection": false
        }
      ]
    }
  ]
}
```
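
The `system` and `code` columns come from a `forEachOrNull` clause. In SQL on FHIR this yields one row per element of the collection, or a single row with nulls in those columns when the collection is empty, so a procedure that has no `code.coding` is still kept. The pure-Python sketch below imitates that behaviour for a single resource held as a dict. It is not how Pathling evaluates the view, and `procedure_rows`, the sample resources, and the `http://example.org/codes` system are illustrative assumptions.

```python
def procedure_rows(procedure):
    """Imitate the Procedure view for one resource held as a plain dict."""
    code = procedure.get("code", {})
    shared = {
        "encounter_id": procedure.get("encounter", {}).get("reference"),
        # performed.ofType(dateTime) corresponds to performedDateTime in JSON.
        "procedure_time": procedure.get("performedDateTime"),
        "description_name": code.get("text"),
    }
    codings = code.get("coding", [])
    if not codings:
        # forEachOrNull: keep the resource as one row with null coding columns.
        return [{**shared, "system": None, "code": None}]
    # Otherwise emit one row per coding.
    return [
        {**shared, "system": c.get("system"), "code": c.get("code")}
        for c in codings
    ]


# A procedure described by text only (hypothetical sample).
uncoded = {
    "encounter": {"reference": "100010"},
    "performedDateTime": "2021-08-27",
    "code": {"text": "Weight"},
}

# A procedure with two codings (hypothetical placeholder system and codes).
coded = {
    "encounter": {"reference": "100010"},
    "performedDateTime": "2021-08-27",
    "code": {
        "text": "Blood pressure",
        "coding": [
            {"system": "http://example.org/codes", "code": "A1"},
            {"system": "http://example.org/codes", "code": "B2"},
        ],
    },
}

for row in procedure_rows(uncoded) + procedure_rows(coded):
    print(row)
```

With plain `forEach` instead, the text-only procedure would produce no rows at all and would drop out of the timeline.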

### Condition view

This view extracts the following fields from the `Condition` resource:

- `encounter_id`
- `description_name`
- `system`
- `code`

```json
{
  "name": "momcare_condition",
  "title": "MomCare Condition View",
  "version": "0.1.0",
  "url": "https://momcare.cot.pharmaccess.org/fhir/ViewDefinition/condition",
  "meta": {
    "profile": [
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/ShareableViewDefinition",
      "http://hl7.org/fhir/uv/sql-on-fhir/StructureDefinition/TabularViewDefinition"
    ]
  },
  "status": "draft",
  "resource": "Condition",
  "fhirVersion": [
    "4.0.1"
  ],
  "select": [
    {
      "column": [
        {
          "name": "encounter_id",
          "path": "encounter.reference",
          "type": "string",
          "collection": false
        },
        {
          "name": "description_name",
          "path": "code.text",
          "type": "string",
          "collection": false
        }
      ]
    },
    {
      "forEachOrNull": "code.coding",
      "column": [
        {
          "name": "system",
          "path": "system",
          "type": "uri",
          "collection": false
        },
        {
          "name": "code",
          "path": "code",
          "type": "code",
          "collection": false
        }
      ]
    }
  ]
}
```

In [3]:
view_names = ["Encounter", "Organization", "Procedure", "Condition"]

views = {}
for name in view_names:
    with open(f"views/{name}.ViewDefinition.json") as f:
        views[name] = data.view(resource=name, json=f.read())



## Combine views

The remaining step is to combine the views into the patient timeline using standard SQL operations: joins followed by a union.

First we join the encounter view to the organization view.

In [4]:
base = views["Encounter"].join(views["Organization"], on="visit_provider_id")

Then we join this to the procedure view to create a table of procedure rows, adding a literal `type` column so that the rows can be told apart after the union.

In [5]:
procedures = base.join(views["Procedure"], on="encounter_id").select(
    base.patient_id,
    base.encounter_id,
    base.visit_provider_id,
    base.visit_provider_name,
    views["Procedure"].procedure_time,
    views["Procedure"].code,
    views["Procedure"].system,
    lit("procedure").alias("type"),
    views["Procedure"].description_name,
    base.visitType,
    base.visit_type_code,
)

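We now build the matching table of diagnosis rows from the condition view. The condition view has no time column, so `procedure_time` is filled with nulls to keep the column layout identical.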

In [6]:
diagnoses = base.join(views["Condition"], on="encounter_id").select(
    base.patient_id,
    base.encounter_id,
    base.visit_provider_id,
    base.visit_provider_name,
    lit(None).alias("procedure_time"),
    views["Condition"].code,
    views["Condition"].system,
    lit("condition").alias("type"),
    views["Condition"].description_name,
    base.visitType,
    base.visit_type_code,
)

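The final step is to union the procedure and diagnosis tables into the patient timeline. Spark's `union` matches columns by position rather than by name, which is why both tables select the same columns in the same order.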

In [7]:
patient_timeline = procedures.union(diagnoses)

patient_timeline.toPandas()

                                                                                

| | patient_id | encounter_id | visit_provider_id | visit_provider_name | procedure_time | code | system | type | description_name | visitType | visit_type_code |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | a8340406-712e-11eb-9439-0242ac130002 | 100010 | 28 | Endasak Dispensary | 2021-08-27 | | | procedure | SP tabs for Presumptive malaria treatment | Antenatal care | 424525001 |
| 1 | a8340406-712e-11eb-9439-0242ac130002 | 100010 | 28 | Endasak Dispensary | 2021-08-27 | | | procedure | Fetus heart beats | Antenatal care | 424525001 |
| 2 | a8340406-712e-11eb-9439-0242ac130002 | 100010 | 28 | Endasak Dispensary | 2021-08-27 | | | procedure | Weight | Antenatal care | 424525001 |
| 3 | a8340406-712e-11eb-9439-0242ac130002 | 100010 | 28 | Endasak Dispensary | 2021-08-27 | | | procedure | Blood pressure | Antenatal care | 424525001 |
| 4 | a8340406-712e-11eb-9439-0242ac130002 | 100010 | 28 | Endasak Dispensary | 2021-08-27 | | | procedure | Folic acid/Ferous | Antenatal care | 424525001 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2186835 | 897ffef8-6156-11eb-ae93-0242ac130002 | 99994 | 5 | Babati Town Council Hospital | | Z39 | http://hl7.org/fhir/sid/icd-10 | condition | Postpartum care and examination | Postpartum care | 133906008 |
| 2186836 | 2bf609ce-9876-11eb-a8b3-0242ac130003 | 99995 | 3 | Dareda Hospital | | O98.7 | http://hl7.org/fhir/sid/icd-10 | condition | HIV in pregnancy | Antenatal care | 424525001 |
| 2186837 | f740210e-fe94-11ea-adc1-0242ac120002 | 99997 | 14 | Mwada Dispensary | | Z34 | http://hl7.org/fhir/sid/icd-10 | condition | Supervision of normal pregnancy | Antenatal care | 424525001 |
| 2186838 | a833e93a-712e-11eb-9439-0242ac130002 | 99998 | 28 | Endasak Dispensary | | Z34 | http://hl7.org/fhir/sid/icd-10 | condition | Supervision of normal pregnancy | Antenatal care | 424525001 |


This view can then be written out to CSV or Parquet, or saved into a table within a relational database.
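
As a rough illustration of the file-based options, the sketch below wraps the standard Spark `DataFrameWriter` calls for Parquet and CSV. The helper name `write_timeline`, the choice of `overwrite` mode, and the output path in the usage comment are assumptions made for this example, not part of the original notebook.

```python
def write_timeline(df, path, fmt="parquet"):
    """Write a Spark DataFrame to `path` as Parquet or CSV, replacing existing output."""
    writer = df.write.mode("overwrite")
    if fmt == "parquet":
        writer.parquet(path)
    elif fmt == "csv":
        # Include a header row so the column names survive the export.
        writer.option("header", True).csv(path)
    else:
        raise ValueError(f"Unsupported format: {fmt}")


# Hypothetical usage with the DataFrame built above:
# write_timeline(patient_timeline, "output/patient_timeline", fmt="parquet")
```

Parquet keeps the column types, whereas CSV flattens everything to text. For a relational database, Spark's `DataFrameWriter.jdbc` is the usual route, given a JDBC URL and a driver for the target database.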