# Project: Performance of phenotype algorithms for the identification of opioid-exposed infants, Andrew D. Wiese et al. Hospital Pediatrics 2024
# Title: Identify birthing parents with positive opioid toxicology results
# Summary: 
## Identify birthing parents with positive opioid toxicology results during birth hospitalization




##### Algorithm steps:
```
1. Read opioid toxicology code list table into dataframe
2. Get measurements dataframe using short name column:
    - Filter measurements table to mom IDs from mom-baby table
    - Keep results whose value text contains "positive", "present", or "reactive", or that have a numeric value
    - Join with code list table where the measurement source value contains the short name
3. Get measurements dataframe using long name column:
    - Apply the same mom ID and result filters as in step 2
    - Join with code list table where the measurement source value contains the long name
4. Union the short name and long name dataframes
5. Join with mom-baby table to get mom and baby person IDs
6. Filter to birth hospitalization visit:
    - Join with baby's first visit table on baby person ID and birth datetime
    - Keep measurements dated from first visit start date through first visit end date (inclusive)
```  
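
The date window in step 6 can be illustrated outside Spark. The following is a minimal sketch in plain Python, not the notebook's implementation: the function name `in_birth_hospitalization` and the example dates are hypothetical, and the only behavior taken from the notebook is that both bounds are inclusive. Returning `False` for a missing date is an assumption chosen to mirror how a SQL `WHERE` clause drops rows whose comparison evaluates to NULL.

```python
from datetime import date
from typing import Optional


def in_birth_hospitalization(
    measurement_date: Optional[date],
    first_visit_start_date: Optional[date],
    first_visit_end_date: Optional[date],
) -> bool:
    """Return True when the measurement date falls inside the baby's first visit.

    Both bounds are inclusive. Any missing date yields False (assumption:
    this mirrors a SQL WHERE clause discarding NULL comparisons).
    """
    if None in (measurement_date, first_visit_start_date, first_visit_end_date):
        return False
    return first_visit_start_date <= measurement_date <= first_visit_end_date


# Hypothetical dates, for illustration only.
start, end = date(2020, 3, 1), date(2020, 3, 4)
print(in_birth_hospitalization(date(2020, 3, 1), start, end))  # True: on the start date
print(in_birth_hospitalization(date(2020, 3, 5), start, end))  # False: after discharge
print(in_birth_hospitalization(date(2020, 3, 2), start, None))  # False: end date missing
```

Because the comparison uses dates rather than datetimes, a test collected at any time on the admission or discharge day counts as part of the birth hospitalization.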

##### Data dictionaries:

- meas_table: measurements table
- opioid_toxicology_code_list: table with opioid toxicology codes
- mom_baby_step1: temp table with mom and baby IDs 
- mom_baby_step1_baby1stvisit_all: temp table with baby's first visit info
- meas_df_short: measurements dataframe using short name join
- meas_df_long: measurements dataframe using long name join 
- meas_df: union of short and long name dataframes
- mom_baby_step1_matopioidtoxicology_all: all opioid toxicology measurements
- mom_baby_step1_matopioidtoxicology_all_birthhospital: filtered to birth hospitalization

##### Other usage notes:
```
- Lowercase and trim values before matching so that differences in case or surrounding whitespace do not prevent a match
```
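
To make the matching rule concrete, here is a minimal sketch in plain Python of a case-insensitive, whitespace-tolerant substring match. The helper names `normalize` and `term_matches` and the sample strings are hypothetical and do not come from the project modules. The sketch assumes both sides are trimmed and lowercased, and it rejects empty search terms, because an empty term placed between `%` wildcards in a SQL `LIKE` pattern would match every non-null value.

```python
from typing import Optional


def normalize(text: str) -> str:
    """Trim surrounding whitespace and lowercase the text."""
    return text.strip().lower()


def term_matches(
    measurement_source_value: Optional[str],
    search_term: Optional[str],
) -> bool:
    """Return True when the normalized search term occurs in the normalized value."""
    if measurement_source_value is None or search_term is None:
        return False
    term = normalize(search_term)
    if not term:
        # Guard (assumption): an empty term would otherwise match everything.
        return False
    return term in normalize(measurement_source_value)


# Hypothetical lab names, for illustration only.
print(term_matches("  OPIATES, URINE SCREEN ", "opiates"))  # True
print(term_matches("Cannabinoid screen", "opiates"))        # False
print(term_matches("Opiates, urine", "   "))                # False
```

Unlike SQL `LIKE`, Python's `in` treats `%` and `_` in the search term as literal characters, so the two approaches can differ for terms that contain those characters.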

In [0]:
%run "./project_modules"

In [0]:
%sql
select * from phenotyping.mprint_sheet_14_mat_opioid_toxicology

##### mom_baby_step1_matopioidtoxicology_all

In [0]:
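def get_meas_df(col_name):
    """Return candidate positive maternal toxicology rows matched on one code list column.

    col_name is the code list column to match on ('short_name' or 'long_name').
    meas_table and opioid_toxicology_code_list are not defined in this notebook;
    they are presumably provided by ./project_modules.
    """
    sql = f"""
        select * from (
            -- measurements for birthing parents listed in the mom-baby table
            select * from {meas_table}
            where person_id in (select fact_id_1 from global_temp.mom_baby_step1)
            -- keep results reported as positive/present/reactive, or with any numeric value
            and ((lower(value_source_value) like '%positive%' or
                  lower(value_source_value) like '%present%' or
                  lower(value_source_value) like '%reactive%') or
                 value_as_number is not null)
        ) a
        inner join
        {opioid_toxicology_code_list} b
        -- case-insensitive substring match of the search term within the measurement name
        on lower(trim(a.measurement_source_value)) like '%' || lower(trim(b.{col_name})) || '%'
        """

    df = spark.sql(sql)
    return df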

In [0]:
meas_df_short = get_meas_df('short_name')
meas_df_long = get_meas_df('long_name')
meas_df = union_dataframes([meas_df_short, meas_df_long])

# The query below reads this dataframe back as global_temp.mat_opioid_toxicology
meas_df.name = "mat_opioid_toxicology"
register_parquet_global_view(meas_df)

# Attach the baby's person ID and birth datetime to each maternal measurement
sql = """
    select * from global_temp.mat_opioid_toxicology tox
    inner join
    (select fact_id_1 as mom_person_id, fact_id_2 as baby_person_id, birth_datetime from global_temp.mom_baby_step1) as maternal
    on tox.person_id = maternal.mom_person_id
"""

mom_baby_step1_matopioidtoxicology_all = spark.sql(sql)
mom_baby_step1_matopioidtoxicology_all.createOrReplaceTempView("mom_baby_step1_matopioidtoxicology_all")

##### Validation

In [0]:
df_inspection("mom_baby_step1_matopioidtoxicology_all","all")

##### Time window: birth hospitalization

In [0]:
# Keep only measurements dated within the baby's first (birth) visit, both bounds inclusive
sql = """
       select a.baby_person_id, person_id as mom_person_id, MEASUREMENT_ID, MEASUREMENT_DATE, MEASUREMENT_DATETIME,
       VALUE_AS_NUMBER, RANGE_LOW, RANGE_HIGH, VISIT_OCCURRENCE_ID, MEASUREMENT_SOURCE_VALUE, UNIT_SOURCE_VALUE,
       VALUE_SOURCE_VALUE, SHORT_NAME as short_name_search_term, LONG_NAME as long_name_search_term, first_visit_start_date,
       first_visit_end_date, baby_1st_visit_problem from mom_baby_step1_matopioidtoxicology_all a

       join
       global_temp.mom_baby_step1_baby1stvisit_all b

       on a.baby_person_id = b.baby_person_id and a.birth_datetime = b.birth_datetime
       where measurement_date >= first_visit_start_date and measurement_date <= first_visit_end_date
    """

mom_baby_step1_matopioidtoxicology_all_birthhospital = spark.sql(sql)
mom_baby_step1_matopioidtoxicology_all_birthhospital.name='mom_baby_step1_matopioidtoxicology_all_birthhospital'
register_parquet_global_view(mom_baby_step1_matopioidtoxicology_all_birthhospital)

In [0]:
df_inspection("global_temp.mom_baby_step1_matopioidtoxicology_all_birthhospital","all")