The number of citizens per municipality is interpreted as:
* The number of citizens linked to an EpisodeOfCare that is managed by a CareTeam that is in turn managed by the municipality
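
A minimal sketch of this interpretation in plain Python, run on made-up records rather than the Delta Lake data. All IDs and municipality codes are hypothetical, and the helper `citizens_per_municipality` is illustrative only; the field names mirror the column names used in the views in this notebook.

```python
from collections import defaultdict

# Hypothetical toy records; none of these IDs come from the real data.
episodes_of_care = [
    {"eoc_id": "EpisodeOfCare/1", "eoc_patient_id": "Patient/A", "eoc_team_id": "CareTeam/X"},
    {"eoc_id": "EpisodeOfCare/2", "eoc_patient_id": "Patient/A", "eoc_team_id": "CareTeam/X"},
    {"eoc_id": "EpisodeOfCare/3", "eoc_patient_id": "Patient/B", "eoc_team_id": "CareTeam/Y"},
    {"eoc_id": "EpisodeOfCare/4", "eoc_patient_id": "Patient/C", "eoc_team_id": "CareTeam/Z"},
]
# CareTeam/Z deliberately has no managing organization.
careteam_to_org = {"CareTeam/X": "Organization/1", "CareTeam/Y": "Organization/2"}
org_to_municipality = {"Organization/1": "0101", "Organization/2": "0101"}


def citizens_per_municipality(episodes, team_org, org_muni):
    """Count distinct patients per municipality via EpisodeOfCare -> CareTeam -> Organization."""
    patients = defaultdict(set)
    for eoc in episodes:
        org_id = team_org.get(eoc["eoc_team_id"])
        municipality = org_muni.get(org_id)
        if municipality is not None:
            # A set ensures a patient with several episodes is counted once.
            patients[municipality].add(eoc["eoc_patient_id"])
    return {code: len(ids) for code, ids in patients.items()}


counts = citizens_per_municipality(episodes_of_care, careteam_to_org, org_to_municipality)
print(counts)
```

In this toy data Patient/A is counted once despite having two episodes, and Patient/C is dropped because the care team has no managing organization.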

In [None]:
import pyspark.sql.functions as F
from data_location import DELTA_LOCATION

from spark_bi.spark import FutPathlingContext

pc = FutPathlingContext.create(app_name="example-spark-app")
delta_lake = pc.read.delta(DELTA_LOCATION)

25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function display replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function member_of replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function subsumes replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function designation replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function property_string replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function property_code replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function property_integer replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function property_boolean replaced a previously registered function.
25/12/12 10:25:48 WARN SimpleFunctionRegistry: The function property_deci

In [19]:
from pyspark.sql.functions import col

episodes_of_care = delta_lake.view(
    resource="EpisodeOfCare",
    select=[
        {
            "column": [
                {"name": "eoc_id", "path": "getResourceKey()"},
                {"name": "eoc_patient_id", "path": "patient.getReferenceKey()"},
            ]
        },
        {"forEach": "team", "column": [{"name": "eoc_team_id", "path": "getReferenceKey()"}]},
    ],
)
episodes_of_care.head(5)

[Row(eoc_id='EpisodeOfCare/2000000029', eoc_patient_id='Patient/1000264558', eoc_team_id='CareTeam/3000108752'),
 Row(eoc_id='EpisodeOfCare/2000000035', eoc_patient_id='Patient/1000264558', eoc_team_id='CareTeam/3000108752'),
 Row(eoc_id='EpisodeOfCare/2000000042', eoc_patient_id='Patient/1000264558', eoc_team_id='CareTeam/3000108752'),
 Row(eoc_id='EpisodeOfCare/2000000049', eoc_patient_id='Patient/1000264558', eoc_team_id='CareTeam/3000108752'),
 Row(eoc_id='EpisodeOfCare/2000000068', eoc_patient_id='Patient/1000264558', eoc_team_id='CareTeam/3000108752')]

In [20]:
careteams = delta_lake.view(
    resource="CareTeam",
    select=[
        {
            "column": [
                {"name": "ct_id", "path": "getResourceKey()"},
                {
                    "name": "ct_org_id",
                    "path": "managingOrganization.first().getReferenceKey()",
                },  # Spoke with Systematic (Erik): the actual cardinality of managingOrganization is 0..1, but the IG has not been updated.
            ]
        }
    ],
)
careteams.filter(F.col("ct_org_id").isNotNull()).head(5)

[Row(ct_id='CareTeam/3000148060', ct_org_id='Organization/3000038806'),
 Row(ct_id='CareTeam/3000148061', ct_org_id='Organization/3000029719')]


Unfortunately, only 2 care teams in the TRIFORK environment have an associated organization. We continue the analysis regardless.

In [None]:
organizations = delta_lake.view(
    resource="Organization",
    select=[
        {
            "column": [
                {"name": "org_id", "path": "getResourceKey()"},
                {
                    "name": "municipality_code",
                    "path": "extension('http://ehealth.sundhed.dk/fhir/StructureDefinition/ehealth-organization-municipalityCode').valueString",
                },
            ]
        }
    ],
)
organizations.head(5)

25/12/12 09:54:37 WARN Collection: Traversing a choice element `valueString` without using ofType() is not portable and may not work in some FHIRPath implementations. Consider using ofType() to specify the type of element you want to traverse.


[Row(org_id='Organization/3000000064', municipality_code='0787'),
 Row(org_id='Organization/3000000069', municipality_code='0265'),
 Row(org_id='Organization/3000000072', municipality_code='0173'),
 Row(org_id='Organization/3000000088', municipality_code='0360'),
 Row(org_id='Organization/3000000090', municipality_code='0787')]
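
The warning above suggests narrowing the choice element with `ofType()`. A possible form of the `select` argument is sketched below. It has not been run against this environment, so treat the exact path as an assumption based on the warning's hint and on standard FHIRPath; the names `MUNICIPALITY_CODE_URL` and `organization_select` are introduced here for illustration.

```python
# Extension URL copied from the Organization view, split only for readability.
MUNICIPALITY_CODE_URL = (
    "http://ehealth.sundhed.dk/fhir/StructureDefinition/"
    "ehealth-organization-municipalityCode"
)

# Same columns as the Organization view, but the choice element is
# narrowed with ofType(string) instead of traversing valueString directly.
# Assumption: this path form is not verified against this environment.
organization_select = [
    {
        "column": [
            {"name": "org_id", "path": "getResourceKey()"},
            {
                "name": "municipality_code",
                "path": f"extension('{MUNICIPALITY_CODE_URL}').value.ofType(string)",
            },
        ]
    }
]
```

If the path is accepted, the list could be passed as `select=organization_select` in the `delta_lake.view(...)` call for `Organization`.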

In [22]:
patients = delta_lake.view(
    resource="Patient", select=[{"column": [{"name": "patient_id", "path": "getResourceKey()"}]}]
)
patients.head(5)

[Row(patient_id='Patient/1000264558'),
 Row(patient_id='Patient/1000264559'),
 Row(patient_id='Patient/1000264560'),
 Row(patient_id='Patient/1000264604'),
 Row(patient_id='Patient/1000264605')]

In [26]:
joined = (
    episodes_of_care.join(careteams, episodes_of_care.eoc_team_id == careteams.ct_id, how="left")
    .join(patients, episodes_of_care.eoc_patient_id == patients.patient_id, how="left")
    .join(organizations, careteams.ct_org_id == organizations.org_id, how="left")
    .filter(col("org_id").isNotNull())
)

joined.head(5)

[]

It turns out that TRIFORK's test environment contains no patients with episodes of care for the care teams that have a `.managingOrganization`.
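
One way to see where the chain breaks is to count how many episodes still have a match after each link. The sketch below does this in plain Python on hypothetical toy records that imitate the situation (every episode points at a care team without an organization). The IDs are made up and the helper `match_counts` is not part of the notebook.

```python
def match_counts(episodes, team_org, org_ids):
    """Return how many episodes survive each link of the chain."""
    # Link 1: the referenced care team exists.
    with_team = [e for e in episodes if e["eoc_team_id"] in team_org]
    # Link 2: that care team has a managing organization.
    with_org = [e for e in with_team if team_org[e["eoc_team_id"]] is not None]
    # Link 3: the managing organization exists in the Organization view.
    with_known_org = [e for e in with_org if team_org[e["eoc_team_id"]] in org_ids]
    return {
        "episodes": len(episodes),
        "team_found": len(with_team),
        "team_has_org": len(with_org),
        "org_found": len(with_known_org),
    }


# Hypothetical toy data: all episodes reference CareTeam/1, which has no organization.
toy_episodes = [
    {"eoc_id": "EpisodeOfCare/1", "eoc_team_id": "CareTeam/1"},
    {"eoc_id": "EpisodeOfCare/2", "eoc_team_id": "CareTeam/1"},
]
toy_team_org = {
    "CareTeam/1": None,
    "CareTeam/2": "Organization/1",
    "CareTeam/3": "Organization/2",
}
toy_org_ids = {"Organization/1", "Organization/2"}

stage_counts = match_counts(toy_episodes, toy_team_org, toy_org_ids)
print(stage_counts)
```

The stage at which the count first drops to zero is the link that empties the result. In Spark the same check could be made by counting rows after each join.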

An alternative interpretation is the number of citizens by municipality of residence:

In [36]:
patients_by_municipality = delta_lake.view(
    resource="Patient",
    select=[
        {
            "column": [
                {"name": "patient_id", "path": "getResourceKey()"},
                {
                    "name": "municipalityCode",
                    "path": "address.where(use = 'home').extension('http://hl7.dk/fhir/core/StructureDefinition/dk-core-municipalityCodes').valueCodeableConcept.coding.code",
                },
            ]
        }
    ],
)
patients_by_municipality.groupBy("municipalityCode").agg(
    F.countDistinct("patient_id").alias("n_patients")
).sort("n_patients", ascending=False).toPandas()

25/12/12 11:03:55 WARN Collection: Traversing a choice element `valueCodeableConcept` without using ofType() is not portable and may not work in some FHIRPath implementations. Consider using ofType() to specify the type of element you want to traverse.


    municipalityCode  n_patients
0               0575          18
1               0621          17
2               0370          16
3               0630          15
4               0756          15
..               ...         ...
93              0270           3
94              0849           2
95              0561           2
96              0840           2


Here we see a more meaningful distribution.