## 3. Drug-Drug Interactions
- The dataset shared in this notebook is a **synthetic dataset** created for demonstration purposes only.
- All results presented in our paper are based on the Merative™ MarketScan® Databases, which contain de-identified real-world healthcare data and cannot be publicly shared.
- For details about the actual maternal and neonatal datasets, please refer to the **Data Availability** and **Methods** sections of our paper.

In [2]:
import os
import sys
import warnings
# Set thread limits for external numerical libraries to avoid oversubscription on a shared server.
# The environment variables only take full effect if set before numpy/scipy/sklearn are imported.
def set_threads_for_external_libraries(n_threads=1):
    if ("numpy" in sys.modules) or ("scipy" in sys.modules) or ("sklearn" in sys.modules):
        warnings.warn("Call set_threads_for_external_libraries() before importing numpy/scipy/sklearn for full effect.")
    for k in ["OMP_NUM_THREADS","OPENBLAS_NUM_THREADS","MKL_NUM_THREADS","VECLIB_MAXIMUM_THREADS","NUMEXPR_NUM_THREADS"]:
        os.environ[k] = str(n_threads)
        
set_threads_for_external_libraries(n_threads=64)
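The call above requests 64 threads regardless of how many cores the machine has. A minimal sketch of one way to cap the request at the available core count follows. `capped_thread_count` is a hypothetical helper that is not part of the original notebook, and the `available=8` value is only an illustration.

```python
import os

def capped_thread_count(requested, available=None):
    """Clamp a requested thread count to the range [1, available]."""
    if available is None:
        # os.cpu_count() can return None, so fall back to a single thread
        available = os.cpu_count() or 1
    return max(1, min(int(requested), int(available)))

# Illustration only: pretend the machine exposes 8 cores
n_threads = capped_thread_count(64, available=8)
```

The result could then be passed as `n_threads` to `set_threads_for_external_libraries`.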

In [3]:
import math
import time
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf

In [5]:
## Load the dataset. This is a synthetic dataset for demonstration purposes only;
## see the "Data Availability" section of the paper and replace it with your actual dataset.
df = pd.read_csv('../0_data/synthetic_baby_mom_data.csv')
## Columns of interest for testing: neonatal complications and maternal medications
baby_cols_sel = ['RDS_Baby','NAS_Baby','Postmaturity_Baby','ROP_Baby','SGA_Baby']
med_cols_sel = ['Ondansetron, Oral','Sertraline, Oral','Oxycodone, Oral']
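When the synthetic file is replaced with another dataset, it may help to confirm that the expected columns are present before running the analysis. The sketch below is one possible check; `missing_columns` is a hypothetical helper, and `toy_header` is a made-up stand-in for `list(df.columns)`, not the real file layout.

```python
def missing_columns(available_columns, required_columns):
    """Return the required column names that are absent, in their original order."""
    available = set(available_columns)
    return [col for col in required_columns if col not in available]

# Column names taken from the notebook cells
required = ['RDS_Baby', 'NAS_Baby', 'Postmaturity_Baby', 'ROP_Baby', 'SGA_Baby',
            'Ondansetron, Oral', 'Sertraline, Oral', 'Oxycodone, Oral',
            'GESTATIONAL_AGE']

# Toy header used for illustration only
toy_header = ['RDS_Baby', 'NAS_Baby', 'Ondansetron, Oral', 'GESTATIONAL_AGE']
missing = missing_columns(toy_header, required)
```

In the notebook, the same check could be run as `missing_columns(df.columns, required)`, stopping early if the returned list is not empty.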


In [6]:
## Divide the dataframe into neonatal complications and maternal medications
baby_dz = df[baby_cols_sel]
## Assumes every column from 'Ondansetron, Oral' onward is a maternal medication column
mom_med = df.loc[:,'Ondansetron, Oral':]
mom_med_count = mom_med.sum()
ga_df = df[['GESTATIONAL_AGE']]

disease_list = baby_dz.columns
ddi_df = pd.concat([baby_dz,mom_med,ga_df],axis=1)

In [7]:
## Calculate Drug-Drug Interaction effects on Neonatal Complications
d_final={}
j=0 ## Index number (valid indices)
i=0 ## Number of pairs

for DISEASE in disease_list:
    for med1 in mom_med.columns:
        med1_ix = mom_med.columns.get_loc(med1)
        for med2 in mom_med.columns[med1_ix+1:]:
            ## Take an explicit copy so the new column is not written to a slice of ddi_df
            sel_df = ddi_df[[DISEASE,med1,med2,'GESTATIONAL_AGE']].copy()
            sel_df['Interaction'] = sel_df[med1]*sel_df[med2]
            med12_count = sel_df['Interaction'].sum()
            if med12_count==0:
                pass
            elif ((sel_df['Interaction']!=0)&(sel_df[DISEASE]!=0)).sum()<=10:
                pass
            else:
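                ## Medication names contain commas and spaces, so the two medication columns are referenced by position
                stats_model_input = '{} ~ sel_df.iloc[:,1] + sel_df.iloc[:,2] + Interaction + GESTATIONAL_AGE'.format(DISEASE)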
                try:
                    reg = smf.logit(stats_model_input, data=sel_df).fit()
                    b1= reg.params['sel_df.iloc[:, 1]']
                    b2=reg.params['sel_df.iloc[:, 2]']
                    b3=reg.params['Interaction']
                    b1_pval = reg.pvalues['sel_df.iloc[:, 1]']
                    b2_pval = reg.pvalues['sel_df.iloc[:, 2]']
                    b3_pval=reg.pvalues['Interaction']
                except Exception:  ## model fitting failed; record NaN for this pair
                    b1= np.nan
                    b2= np.nan
                    b3= np.nan
                    b1_pval = np.nan
                    b2_pval = np.nan
                    b3_pval= np.nan
                j+=1
                dz_med12_count = ((sel_df['Interaction']!=0)&(sel_df[DISEASE]!=0)).sum()
                med1_count = mom_med_count[med1]
                med2_count = mom_med_count[med2]
                d_final[j]={'Disease':DISEASE,'Med1':med1,'Med2':med2,
                            'b1':b1,'b2':b2,'b3':b3,
                            'pval(b1)':b1_pval,'pval(b2)':b2_pval,'pval(b3)':b3_pval,
                            'Med1(count)':med1_count,'Med2(count)':med2_count,
                            'Med1/2(count)':med12_count,'Dz+Med1/2(count)':dz_med12_count}
                print(d_final[j])
            i+=1
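
For reference, the formula string passed to `smf.logit` in the loop corresponds to the following model. The symbols are notation introduced here for illustration: $Y$ is the neonatal complication indicator, $M_1$ and $M_2$ are the two medication columns, and $\mathrm{GA}$ is `GESTATIONAL_AGE`. The coefficients $\beta_1$, $\beta_2$, and $\beta_3$ correspond to `b1`, `b2`, and `b3` in the code.

$$
\operatorname{logit} P(Y = 1) = \beta_0 + \beta_1 M_1 + \beta_2 M_2 + \beta_3 \,(M_1 \times M_2) + \beta_4 \,\mathrm{GA}
$$

Assuming the medication columns are coded 0/1, the product $M_1 \times M_2$ equals 1 only for records exposed to both medications, so $\beta_3$ captures the change in log-odds beyond the sum of the two individual medication effects.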


In [8]:
## Convert the final dictionary to a dataframe (one row per disease and medication pair that passed the count filters)
ddi_table = pd.DataFrame.from_dict(d_final,orient='index')
ddi_table
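
The `b3` values in the table are on the log-odds scale. One common way to read a logit coefficient is to exponentiate it, which gives a multiplicative factor on the odds. The sketch below shows that conversion on toy entries shaped like `d_final`. `to_odds_ratio` is a hypothetical helper, and the coefficients are made up for illustration, not results from the paper or the synthetic dataset.

```python
import math

def to_odds_ratio(coef):
    """Convert a logit coefficient to the odds-ratio scale; NaN stays NaN."""
    if coef is None or math.isnan(coef):
        return math.nan
    return math.exp(coef)

# Toy entries shaped like d_final; values are illustrative only
toy_results = {
    1: {'Med1': 'A', 'Med2': 'B', 'b3': 0.0},
    2: {'Med1': 'A', 'Med2': 'C', 'b3': float('nan')},  # a failed fit
}
for row in toy_results.values():
    row['exp(b3)'] = to_odds_ratio(row['b3'])
```

A value of `exp(b3)` near 1 would suggest no interaction on the odds scale, while pairs whose model fit failed keep NaN rather than being silently dropped.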