## Visit Occurrence Table Mapping

This notebook is an attempt at mapping FHIR to OMOP using the following guide: https://build.fhir.org/ig/HL7/cdmh/profiles.html#omop-to-fhir-mappings
<br>It maps the FHIR Encounter resource to the OMOP Visit Occurrence table.
<br><br><b>TODO</b>: find the correct mapping for the fields "care_site_id" and "discharge_to_concept_id".

### Load Data Frame from Parquet Catalog File

In [17]:
from pyspark.sql import SparkSession
from pyspark.sql.functions import dayofmonth,month,year,to_date,trunc,split,explode,array

# Create a local Spark session
spark = SparkSession.builder.appName('etl').getOrCreate()

In [18]:
# Read the Parquet catalog file into a DataFrame
df = spark.read.parquet('data/catalog.parquet')

DataFrame schema (uncomment to print)

In [19]:
#df.printSchema()

### Encounter Mapping 

Filter by the Encounter resource type

In [20]:
filtered = df.filter(df['resourceType'] == 'Encounter')

In [29]:
#filtered.printSchema()

Select the relevant fields

In [23]:
Encounter = filtered.select(['id','subject','type',
                              'location','hospitalization.admitSource',
                              'period','extension.valueCodeableConcept'])

Extract the start and end date along with the time from the period field.

In [25]:
# Split each timestamp on 'T' into its date part and its time part
split_start = split(Encounter['period.start'], 'T')
split_end = split(Encounter['period.end'], 'T')

# Assign each part to its own column; note that the *_datetime columns
# hold only the time portion of the timestamp
vist_date_time = Encounter\
    .withColumn("visit_start_date",split_start.getItem(0))\
    .withColumn("visit_start_datetime",split_start.getItem(1))\
    .withColumn("visit_end_date",split_end.getItem(0))\
    .withColumn("visit_end_datetime",split_end.getItem(1))
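To make the split concrete, here is a minimal plain-Python sketch of what splitting on `T` yields for a single value. The sample timestamp is hypothetical and not taken from the catalog. Note that the second element is only the time portion, and that it keeps any timezone offset present in the source value.

```python
# Hypothetical FHIR dateTime value; real values come from Encounter.period.start
sample_start = "2019-03-04T10:15:00-05:00"

# Same idea as split(col, 'T') in Spark: index 0 is the date, index 1 the time
parts = sample_start.split("T")
visit_start_date = parts[0]
visit_start_time = parts[1]

print(visit_start_date)
print(visit_start_time)
```

If a full datetime is wanted in the `*_datetime` columns, the unsplit `period.start` and `period.end` values would be the natural source, but that choice is left to the mapping.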

Drop columns no longer needed

In [26]:
dropped = vist_date_time.drop("period")

Rename the columns 

In [27]:
# Rename the FHIR-derived columns to OMOP-style names
visit_occurnace = dropped\
    .withColumnRenamed("type","preceding_visit_occurrence")\
    .withColumnRenamed("id","visit_occurrence_id")\
    .withColumnRenamed("admitSource","admitting_source_concept_id")\
    .withColumnRenamed("subject","person_id")\
    .withColumnRenamed("valueCodeableConcept","visit_type_concept_id")

#.withColumnRenamed("location.location.id","care_site_id")\    
#.withColumnRenamed("location.location.type","discharge_to_concept_id")\
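The chain of renames above can also be driven by a single mapping, which keeps the old-to-new pairs in one place. This is only a sketch: `RENAMES` and `rename_columns` are hypothetical names that are not part of the notebook, and the helper assumes only that its argument has a `withColumnRenamed(existing, new)` method, as a PySpark DataFrame does.

```python
from functools import reduce

# Hypothetical mapping from FHIR-derived column names to OMOP-style names
RENAMES = {
    "type": "preceding_visit_occurrence",
    "id": "visit_occurrence_id",
    "admitSource": "admitting_source_concept_id",
    "subject": "person_id",
    "valueCodeableConcept": "visit_type_concept_id",
}

def rename_columns(df, mapping):
    """Apply each (old, new) rename in turn and return the resulting DataFrame."""
    return reduce(
        lambda acc, pair: acc.withColumnRenamed(pair[0], pair[1]),
        mapping.items(),
        df,
    )

# Usage with the notebook's DataFrame (requires a Spark session):
# renamed = rename_columns(dropped, RENAMES)
```

Because `withColumnRenamed` does nothing when the source column is absent, a mapping entry for a missing column is silently skipped rather than raising an error.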

Show the mapped output table

In [30]:
visit_occurnace.show(5) 