# Project: Performance of phenotype algorithms for the identification of opioid-exposed infants, Andrew D. Wiese et al. Hospital Pediatrics 2024
# Title: Exclude infants with evidence of congenital malformations
# Summary: 
## Exclude infants with evidence of congenital malformations based on ICD codes

# Notes:
- Look for evidence of congenital malformations during the birth hospitalization stay


##### Algorithm steps:
```
1. Get list of fetal anomaly codes
fetal_anomaly_codes = get_codes_from_table(fetal_anomalies_code_list)

2. Find records in condition and observation tables matching codes
condition_records = get_records_with_codes(condition, 'fact_id_2', fetal_anomaly_codes)
observation_records = get_records_with_codes(observation, 'fact_id_2', fetal_anomaly_codes)

3. Merge condition and observation records
merged_records = merge_dataframes(condition_records, observation_records)

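4. Join with first-visit data and keep only codes dated within that visit
final_records = distinct(filter(
    join(merged_records, first_prenatal_visit_data, on='baby_person_id'),
    first_visit_start_date <= code_date <= first_visit_end_date))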

5. Save results as global temp view
save_global_temp_view(final_records, 'mom_baby_step3_fetal_anomalies')
```
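
Steps 2 and 3 above can be sketched in plain Python on toy rows. This is a minimal illustration only: `records_with_codes`, the `code` field name, the sample rows, and the example ICD-10 codes are assumptions made for this sketch, not the project's helpers or its actual code list.

```python
from datetime import date

# Hypothetical code list and rows; the real ones come from the project tables.
fetal_anomaly_codes = {"Q21.0", "Q35.9"}

condition = [
    {"baby_person_id": 1, "code": "Q21.0", "code_date": date(2020, 1, 2)},
    {"baby_person_id": 2, "code": "P07.1", "code_date": date(2020, 3, 5)},
]
observation = [
    {"baby_person_id": 3, "code": "Q35.9", "code_date": date(2020, 4, 1)},
]


def records_with_codes(records, codes):
    """Keep only the records whose code appears in the code list."""
    return [r for r in records if r["code"] in codes]


# Step 2: match each source table against the code list.
condition_records = records_with_codes(condition, fetal_anomaly_codes)
observation_records = records_with_codes(observation, fetal_anomaly_codes)

# Step 3: stack the two result sets into one list.
merged_records = condition_records + observation_records

for r in merged_records:
    print(r["baby_person_id"], r["code"], r["code_date"])
```

In the notebook itself the same idea is applied to Spark DataFrames, so the matching and the union run on the cluster rather than in local Python lists.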

##### Data Dictionaries:

**fetal_anomalies_code_list**: Table containing list of fetal anomaly codes

**condition**: Condition occurrence table

**observation**: Observation table

**fact_id_2**: Fact ID column in condition and observation tables

**baby_person_id**: Baby person ID column for joining

**first_prenatal_visit_data**: First-visit dates per baby, read in the code from `global_temp.mom_baby_step1_baby1stvisit`
    - first_visit_start_date: Start date of the first visit
    - first_visit_end_date: End date of the first visit

**final_records**:
    - Columns from condition/observation tables
    - first_visit_start_date
    - first_visit_end_date

##### Usage Notes:
```
- Only codes dated within the first visit are kept (first_visit_start_date through first_visit_end_date, both inclusive)
- Distinct records are returned to avoid duplicates
- Results are registered as a global temp view for further analysis
```
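
The inclusive date window and the de-duplication described above can be sketched in plain Python. This is a minimal sketch on made-up rows: the tuple layout, the `visits` lookup, and the sample values are assumptions for illustration, while the notebook performs the same logic in Spark SQL.

```python
from datetime import date

# Hypothetical lookup: baby_person_id -> (first_visit_start_date, first_visit_end_date)
visits = {1: (date(2020, 1, 1), date(2020, 1, 4))}

# Hypothetical merged rows: (baby_person_id, code, code_date)
merged = [
    (1, "Q21.0", date(2020, 1, 2)),  # inside the window: kept
    (1, "Q21.0", date(2020, 1, 2)),  # exact duplicate: collapsed
    (1, "Q21.0", date(2020, 1, 4)),  # on the end date: kept (bounds are inclusive)
    (1, "Q35.9", date(2020, 2, 1)),  # after the window: dropped
    (2, "Q21.0", date(2020, 1, 2)),  # no first-visit row: dropped, as in an inner join
]

kept = set()  # a set collapses exact duplicates, mirroring SELECT DISTINCT
for person_id, code, code_date in merged:
    window = visits.get(person_id)
    if window is None:
        continue  # inner-join behaviour: no matching visit, no output row
    start, end = window
    if start <= code_date <= end:
        kept.add((person_id, code, code_date))

final_records = sorted(kept)
for row in final_records:
    print(row)
```

Because both bounds are inclusive, a code recorded on the visit's start or end date counts as evidence from that visit.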

In [0]:
%run "./project_modules"

##### Search the condition_occurrence and observation tables for each baby using the fetal anomaly codes

In [0]:
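# fetal_anomalies_code_list is not defined in this notebook; it is presumably set by project_modules
sql = f"select code from {fetal_anomalies_code_list};"

fetal_anomalies_code = spark.sql(sql)
fetal_anomalies_code.display()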

##### Find records from condition and observation tables

In [0]:
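# Flatten the single-column code table into a plain Python list
code_list = fetal_anomalies_code.select('code').rdd.flatMap(lambda x: x).collect()
# Pull matching records from each source table (helpers presumably come from project_modules)
cond_df = mon_baby_records_code_list('condition', 'fact_id_2', code_list)
obs_df = mon_baby_records_code_list('observation', 'fact_id_2', code_list)
# Stack both result sets and expose them to SQL as merged_df
union_dataframes([cond_df, obs_df]).createOrReplaceTempView("merged_df")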

sql = """
    select distinct merged_df.*, c.first_visit_start_date, c.first_visit_end_date
    from merged_df
    inner join global_temp.mom_baby_step1_baby1stvisit c
        on merged_df.baby_person_id = c.baby_person_id
    where code_date >= first_visit_start_date
      and code_date <= first_visit_end_date;
"""

mom_baby_step3_fetal_anomalies = spark.sql(sql)
mom_baby_step3_fetal_anomalies.name='mom_baby_step3_fetal_anomalies'
register_parquet_global_view(mom_baby_step3_fetal_anomalies)

##### Validation

In [0]:
df_inspection("global_temp.mom_baby_step3_fetal_anomalies","all")