# Project: Performance of phenotype algorithms for the identification of opioid-exposed infants, Andrew D. Wiese et al. Hospital Pediatrics 2024
# Title: Birthing Parent-Child Dyads with Live Births
# Summary: 
## Find birthing parent-child dyads with evidence of live births

##### Algorithm steps:
```
1. Create mom_baby_step1 view by joining the fact_relationship and person tables on fact_id_2 = person_id
2. Validate mom_baby_step1 view
3. Get live birth codes from live birth code sheet
4. Get procedure records containing live birth codes
5. Get condition records containing live birth codes  
6. Get observation records containing live birth codes
7. Get mom records from procedure/condition/observation tables
8. Get baby records from procedure/condition/observation tables  
9. Union mom and baby records into final mom_baby_step1_birth_code view
10. Validate final mom_baby_step1_birth_code view
```

##### Data Dictionaries:

**mom_baby_step1:** View containing the mom's person_id (fact_id_1), the baby's person_id (fact_id_2), the baby's birth_datetime, and the baby's person, gender, and race source values

**lbc_code_list:** Table containing list of live birth codes

**merged_lbc_proc_df:** View containing procedure records with live birth codes

**merged_lbc_cond_df:** View containing condition records with live birth codes

**merged_lbc_obs_df:** View containing observation records with live birth codes  

**mon_proc_df:** Mom records from procedure table 

**mon_cond_df:** Mom records from condition table

**mon_obs_df:** Mom records from observation table

**baby_proc_df:** Baby records from procedure table

**baby_cond_df:** Baby records from condition table 

**baby_obs_df:** Baby records from observation table

**mom_baby_step1_birth_code:** Final view containing all mom and baby records indicating a live birth

##### Usage Notes:
```
- Relationship concept id 4248584 indicates a mom-infant (MOM_INFANT) relationship
- Procedure/condition/observation records are kept only if they are dated within 30 days before or after the birth date
- Wildcard live birth codes (e.g. O80.%) and exact codes are fetched in separate get_table_records calls (flag 1 for wildcard, flag 0 for exact) and then unioned
```

In [0]:
%run "./project_modules"

In [0]:
%sql
-- Default bin size for range join optimization, used for all datetime comparisons
SET spark.databricks.optimizer.rangeJoin.binSize=90

##### Create a view of mom-baby pairs from the FACT_RELATIONSHIP table,
##### using relationship_concept_id = '4248584', which denotes the 'MOM_INFANT' relation.
##### 'fact_id_1' is the mom's person_id; 'fact_id_2' is the baby's person_id.


In [0]:
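# Pair each mom (fact_id_1) with her baby (fact_id_2) and attach the baby's
# birth_datetime and source demographic values from the person table.
sql=f"""
    SELECT DISTINCT fact_id_1, fact_id_2, birth_datetime,
           person_source_value, gender_source_value, race_source_value
    FROM (SELECT * FROM {fact_real_table} WHERE relationship_concept_id = '4248584') a
    INNER JOIN {person_table} b
    ON a.fact_id_2 = b.person_id
    """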
mom_baby_step1 = spark.sql(sql)
mom_baby_step1.name='mom_baby_step1'
register_parquet_global_view(mom_baby_step1)

In [0]:
%sql
select * from global_temp.mom_baby_step1;

##### Validation

In [0]:
sql="""
    select count(distinct fact_id_1) as unique_mom, count(distinct fact_id_2) as unique_baby, count(*) as total_records from global_temp.mom_baby_step1;
    """
insp_df = spark.sql(sql)
insp_df.display()
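
The three counts returned by the validation query can be read together: if `total_records` exceeds `unique_baby`, at least one baby appears in more than one row, for example because it is linked to more than one mom. The following is a minimal pure-Python sketch of the same counting logic on made-up ID pairs. The function name `dyad_counts` and the toy IDs are illustrative assumptions and are not part of the notebook or its data.

```python
def dyad_counts(pairs):
    """Mirror the validation query on a list of (mom_id, baby_id) rows."""
    moms = {mom for mom, _ in pairs}
    babies = {baby for _, baby in pairs}
    return {
        "unique_mom": len(moms),
        "unique_baby": len(babies),
        "total_records": len(pairs),
    }


# Made-up rows: mom 1 has two babies (10, 11); baby 12 is linked to moms 2 and 3.
toy_pairs = [(1, 10), (1, 11), (2, 12), (3, 12)]
counts = dyad_counts(toy_pairs)

# total_records > unique_baby flags that some baby occurs in more than one row.
has_repeated_baby = counts["total_records"] > counts["unique_baby"]
```

In this toy case a mom with two babies does not inflate `total_records` relative to `unique_baby`; only a baby that occurs in several rows does.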

##### mom_baby_step1_birth_code
##### 1. live birth codes: phenotyping.mprint_live_birth_code_sheet_1
##### 2. check for records with these codes in the procedure_occurrence / condition_occurrence / observation tables (based on mom or baby)
##### 3. time limit: (date_of_birth - 30 days) < code date < (date_of_birth + 30 days)
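
The time limit in step 3 can be sketched in plain Python as follows. This is only an illustration of the inequality as written above (strict on both sides). The actual filter is applied inside `mon_baby_records` from `project_modules`, which is not shown here, so the function name `within_birth_window` and the use of `date` rather than a datetime are assumptions.

```python
from datetime import date, timedelta


def within_birth_window(code_date, birth_date, days=30):
    """Return True when code_date lies strictly inside birth_date +/- days."""
    window = timedelta(days=days)
    return birth_date - window < code_date < birth_date + window


# A code recorded two weeks after birth falls inside the window.
inside = within_birth_window(date(2020, 1, 15), date(2020, 1, 1))
# A code recorded exactly 30 days after birth is excluded by the strict "<".
on_boundary = within_birth_window(date(2020, 1, 31), date(2020, 1, 1))
```

Whether the boundary days should be included is a design choice; the strict comparison here simply follows the inequality as stated.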

##### LBC code list

In [0]:
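sql=f"""
    select * from {lbc_code_list}
    """
# Keep the DataFrame under its own name so the lbc_code_list table name is not
# overwritten and this cell can be re-run.
lbc_code_df = spark.sql(sql)
lbc_code_df.display()

# Exact codes, collected into one comma-separated string of double-quoted values.
lbc_codes = lbc_code_df.agg(F.concat_ws(",",F.collect_list(F.concat(F.lit('"'),F.col('CODE'),F.lit('"'))))).first()[0]
# Wildcard code patterns (written with % wildcards), fetched separately from the exact codes.
wildcard_lbc_codes=['O82.%','V33.%','V30.0%','Z37.5%','V31.%','V36.%','V34.%','O81.%','O80.%','V39.%']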
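
The two variables above have different shapes: `lbc_codes` is a single pre-quoted, comma-separated string, while `wildcard_lbc_codes` is a Python list of patterns. One plausible way to turn each into a SQL predicate is sketched below. `get_table_records` lives in `project_modules` and is not shown, so this is not its implementation. The helper name `build_code_predicate` and the column argument are hypothetical.

```python
def build_code_predicate(column, codes, wildcard):
    """Build a SQL predicate matching `column` against live birth codes.

    Hypothetical helper for illustration only.
    wildcard=True  -> `codes` is a list of LIKE patterns such as 'O80.%'
    wildcard=False -> `codes` is one pre-quoted, comma-separated string
    """
    if wildcard:
        # Each pattern needs its own LIKE clause, combined with OR.
        clauses = [f"{column} LIKE '{pattern}'" for pattern in codes]
        return "(" + " OR ".join(clauses) + ")"
    # Exact codes can be matched with a single IN list.
    return f"{column} IN ({codes})"


# Two of the wildcard patterns from the list above, on a placeholder column name.
wildcard_predicate = build_code_predicate("code_col", ["O80.%", "V39.%"], True)
# Placeholder exact codes in the same quoted, comma-separated form as lbc_codes.
exact_predicate = build_code_predicate("code_col", '"A1","B2"', False)
```

Keeping the two cases separate avoids running every exact code through `LIKE`, at the cost of two queries per table that are unioned afterwards.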

##### Procedure records with (wildcard) LBC codes

In [0]:
# Third argument: 1 when passing the wildcard patterns, 0 when passing the exact codes.
wildcard_lbc_proc_df=get_table_records(proc_table,wildcard_lbc_codes, 1)
lbc_proc_df=get_table_records(proc_table,lbc_codes, 0)

merged_lbc_proc_df = union_dataframes([wildcard_lbc_proc_df,lbc_proc_df])
merged_lbc_proc_df.createOrReplaceTempView("merged_lbc_proc_df") 

##### Condition records with (wildcard) LBC codes

In [0]:
wildcard_lbc_cond_df=get_table_records(cond_table,wildcard_lbc_codes, 1)
lbc_cond_df=get_table_records(cond_table,lbc_codes, 0)

merged_lbc_cond_df = union_dataframes([wildcard_lbc_cond_df,lbc_cond_df])
merged_lbc_cond_df.createOrReplaceTempView("merged_lbc_cond_df")

##### Observation records with (wildcard) LBC codes


In [0]:
wildcard_lbc_obs_df=get_table_records(obs_table,wildcard_lbc_codes, 1)
lbc_obs_df=get_table_records(obs_table,lbc_codes, 0)

merged_lbc_obs_df = union_dataframes([wildcard_lbc_obs_df,lbc_obs_df])
merged_lbc_obs_df.createOrReplaceTempView("merged_lbc_obs_df")

##### Mom and baby records in procedure, condition and observation tables

In [0]:
# Mom records: match live birth code records on fact_id_1 (the mom's person_id).
mon_proc_df=mon_baby_records("procedure","merged_lbc_proc_df","fact_id_1")
mon_cond_df=mon_baby_records("condition","merged_lbc_cond_df","fact_id_1")
mon_obs_df=mon_baby_records("observation","merged_lbc_obs_df","fact_id_1")

# Baby records: match live birth code records on fact_id_2 (the baby's person_id).
baby_proc_df=mon_baby_records("procedure","merged_lbc_proc_df","fact_id_2")
baby_cond_df=mon_baby_records("condition","merged_lbc_cond_df","fact_id_2")
baby_obs_df=mon_baby_records("observation","merged_lbc_obs_df","fact_id_2")

mom_baby_step1_birth_code = union_dataframes([mon_proc_df,mon_cond_df,mon_obs_df,baby_proc_df,baby_cond_df,baby_obs_df])
mom_baby_step1_birth_code.name='mom_baby_step1_birth_code'
register_parquet_global_view(mom_baby_step1_birth_code)

##### Validation

In [0]:
df_inspection("global_temp.mom_baby_step1_birth_code","all")