##### Project: Opioid Exposed Infant Covariates
##### Investigators: Stephen Patrick, Sarah Loch
##### Programmers: Sander Su, Chris Guardo
##### Date Created: 01/17/23
##### Last Modified: 09/30/25

#### Notes:
This step runs after the birth visit step in the workspace pipeline. It pulls every NAST or NOWS composite score recorded during the birth hospitalization from nursing flowsheets.


### Using flowsheet data from HCI PAT results table @ VUMC

In [0]:
%run "../Project_modules"

##### Scored for NAS: Y/N
##### Search for '%NAST%' and '%NWI%' in the 'class_name' column of table 'HCI_PAT_RESULT'
##### Cohort: infants in phenotype
##### Time window: all time. Each NAST score is flagged '1' if it was recorded during the birth hospitalization and '0' if it was not.
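As a rough illustration of the search described above, the sketch below mimics in plain Python how the flowsheet filter inside `get_score_cohort` selects rows: a case-insensitive substring match on `class_name` combined with a 'total' match on `display_text`. The helper name `matches_score` and the sample rows are hypothetical and are not taken from HCI_PAT_RESULT.

```python
def matches_score(class_name, display_text, score_name):
    """Mimic: upper(class_name) LIKE '%{score_name}%' AND lower(display_text) LIKE '%total%'."""
    if class_name is None or display_text is None:
        # In SQL a comparison against NULL is not true, so the row is filtered out
        return False
    return score_name in class_name.upper() and "total" in display_text.lower()


# Hypothetical rows for illustration only: (class_name, display_text)
rows = [
    ("Nast Scoring", "NAST Total Score"),
    ("NWI Assessment", "Total"),
    ("NAST Scoring", "Sleep"),
    ("Vitals", "Total Intake"),
]

nast_rows = [r for r in rows if matches_score(r[0], r[1], "NAST")]
nwi_rows = [r for r in rows if matches_score(r[0], r[1], "NWI")]
```

Because the comparison is made against `upper(class_name)`, the `score_name` argument has to be passed in upper case, as it is in the calls `get_score_cohort('NWI')` and `get_score_cohort('NAST')`.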



In [0]:
phenotype_table_location = " ***Insert file location*** "
phenotype_table=spark.sql(f"SELECT * FROM {phenotype_table_location}")

##### Cohort: infants in phenotype

In [0]:
phenotype_cohort=get_phenotype_cohort(phenotype_table)
phenotype_cohort.createOrReplaceTempView("phenotype_cohort")

##### Function to get NAST/NWI score cohorts
##### Time window: all time. Each score is flagged '1' if it was recorded during the birth hospitalization and '0' if it was not.

In [0]:
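def get_score_cohort(score_name):
   """Return every flowsheet 'total' score whose class_name contains score_name, for the phenotype cohort.

   Adds the column if_during_birth_hospitalization: '1' when the score date falls
   within the first (birth) visit, inclusive of both dates, otherwise '0'.
   score_name must be upper case because it is compared against upper(class_name).
   """
   # flowsheet_table is not defined in this notebook; it is presumably provided
   # by the %run of ../Project_modules above.
   sql=f"""
         select *,
                -- babies with no row in the first-visit view get NULL dates and are flagged '0'
                case when date(perform_dt)>=first_visit_start_date and date(perform_dt)<=first_visit_end_date then '1'
                else '0' end as if_during_birth_hospitalization
         from
         (select * from phenotype_cohort) a
         inner join
         (SELECT * FROM {flowsheet_table} WHERE upper(class_name) LIKE '%{score_name}%' and lower(display_text) like '%total%') score
         on a.baby_person_source_value = score.mrn
         left join (select baby_person_id,first_visit_start_date,first_visit_end_date from global_temp.mom_baby_step1_baby1stvisit_all) c
         using (baby_person_id);
   """
   score_df= spark.sql(sql)
   return score_df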
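
The flag computed by the `case when` expression above can be sketched in plain Python as follows. This is only an illustration of the window logic, not part of the pipeline: the helper name `birth_hospitalization_flag` and the sample dates are hypothetical, and it assumes `first_visit_start_date` and `first_visit_end_date` are dates rather than timestamps.

```python
from datetime import date, datetime


def birth_hospitalization_flag(perform_dt, first_visit_start_date, first_visit_end_date):
    """Return '1' if the score date lies within the birth visit (inclusive), else '0'."""
    # A missing value makes the SQL comparison NULL, so CASE falls through to '0'
    if perform_dt is None or first_visit_start_date is None or first_visit_end_date is None:
        return "0"
    score_date = perform_dt.date()  # mirrors date(perform_dt): the time of day is dropped
    in_window = first_visit_start_date <= score_date <= first_visit_end_date
    return "1" if in_window else "0"


# Hypothetical birth visit from 2023-01-10 to 2023-01-14
start, end = date(2023, 1, 10), date(2023, 1, 14)

flag_last_day = birth_hospitalization_flag(datetime(2023, 1, 14, 23, 30), start, end)
flag_after = birth_hospitalization_flag(datetime(2023, 1, 15, 0, 5), start, end)
flag_no_visit = birth_hospitalization_flag(datetime(2023, 1, 12, 8, 0), None, None)
```

Two behaviors are worth noting: a score taken late on the discharge date still counts as in-hospital because only the date part is compared, and a baby with no matching first-visit row (possible because of the left join) is always flagged '0'.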

##### 'NWI' score of the cohort

In [0]:
mprint_nwi = get_score_cohort('NWI')
mprint_nwi.name='mprint_nwi'
register_parquet_global_view(mprint_nwi)

##### Validation

In [0]:
# sql="""
#    select count(*) as total, count(distinct baby_person_source_value) as unique_baby from global_temp.mprint_nwi 
#    where if_during_birth_hospitalization =1;
#    """
# inspect_df= spark.sql(sql)
# inspect_df.display()

##### Create 'NAST' score of the cohort (NAST is more important than NAS score)

In [0]:
mprint_nast = get_score_cohort('NAST')
mprint_nast.name='mprint_nast'
register_parquet_global_view(mprint_nast)

##### Validation

In [0]:
# sql="""
#    select count(*) as total, count(distinct baby_person_source_value) as unique_baby from global_temp.mprint_nast 
#    where if_during_birth_hospitalization =1;
#    """
# inspect_df= spark.sql(sql)
# inspect_df.display()

###### How many babies have both a NAST and an NWI score

In [0]:
# sql="""
#      select count(distinct baby_person_id) as unique_baby,min(baby_birth_datetime) as earliest_baby_birth_datetime, 
#      max(baby_birth_datetime) as latest_baby_birth_datetime from
   
#      (select distinct baby_person_id,baby_birth_datetime from global_temp.mprint_nwi) a
#      inner 
#      join (select distinct baby_person_id,baby_birth_datetime from global_temp.mprint_nast) b
   
#      using(baby_person_id,baby_birth_datetime);
#    """
# inspect_df= spark.sql(sql)
# inspect_df.display()
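
The overlap check above amounts to intersecting the distinct (baby_person_id, baby_birth_datetime) pairs of the two score tables and summarizing the result. A minimal plain-Python sketch of that logic, using made-up pairs rather than real cohort data:

```python
# Hypothetical (baby_person_id, baby_birth_datetime) pairs, for illustration only
nwi_babies = {(1, "2021-03-01 04:15"), (2, "2021-05-20 11:00"), (3, "2022-01-02 09:30")}
nast_babies = {(2, "2021-05-20 11:00"), (3, "2022-01-02 09:30"), (4, "2022-07-07 18:45")}

# Equivalent of the inner join using (baby_person_id, baby_birth_datetime)
both = nwi_babies & nast_babies

unique_baby = len({baby_id for baby_id, _ in both})
# ISO-style timestamp strings sort chronologically, so min/max work on them directly
earliest_birth = min(birth for _, birth in both)
latest_birth = max(birth for _, birth in both)
```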

### Save Output for future use

In [0]:
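# Only the NAST scores are written to this table; mprint_nwi is registered as a global view above but is not saved here.
mprint_nast.write.mode("overwrite").saveAsTable("covariate_output.nast_or_nows_scoring")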