## Drug Use Frequency

### ASD Drugs Pharmacy Data
To evaluate the use of each of the target study drugs in the ASD cohort, we built a table for all of the insurance pharmacy claims data for every member in the ASD cohort.

In [None]:
library( ggplot2 )
library( ggalluvial )
library( stringr )
library( UpSetR )

First, we create a table with the pharmacy claims data for every member of the ASD cohort.

In [None]:
dbSendUpdate( cn, "SELECT Fills.*
INTO ASDPharmacyClaims
FROM ASDMembers, PharmacyClaims Fills
WHERE ASDMembers.MemberId = Fills.MemberId")
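
The query above is an inner join between the cohort and the claims table on `MemberId`. As a minimal sketch of the same logic on toy data (the data frames and their values below are made up for illustration and are not part of the study database), the join can be reproduced in base R:

```r
# Toy stand-ins for the ASDMembers and PharmacyClaims tables (values are made up).
asd_members <- data.frame(MemberId = c(1, 2, 3))
pharmacy_claims <- data.frame(
  MemberId         = c(1, 1, 2, 4),
  NationalDrugCode = c("A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

# Inner join on MemberId: keeps only the claims that belong to cohort members,
# mirroring WHERE ASDMembers.MemberId = Fills.MemberId in the SQL above.
asd_pharmacy_claims <- merge(asd_members, pharmacy_claims, by = "MemberId")
print(asd_pharmacy_claims)
```

Member 4 is not in the toy cohort, so its claim is dropped; member 3 has no claims, so it does not appear in the result either.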

Then, we mapped each ASD-associated study drug to all of its National Drug Code (NDC) variants, so that every NDC linked to a drug was captured. Each drug’s NDC map was then used to retrieve that drug's pharmacy claims for each member and to calculate the number of annual pharmacy claims for the drug.


In [None]:
## Methylphenidate: map of NDC codes
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode, 'Methylphenid' AS DrugName
INTO MethyDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription like '%Methylphenid%'")

## Obtain annual counts for Methylphenidate prescriptions
dbGetQuery( cn, "SELECT YEAR ( A.DispenseDate ) AS DispenseYear, COUNT(*) AS ClaimCount
FROM ASDPharmacyClaims A, MethyDrugCodeMap B
WHERE A.NationalDrugCode = B.NationalDrugCode
AND B.DrugName= 'Methylphenid'
GROUP BY YEAR ( A.DispenseDate )
ORDER BY YEAR ( A.DispenseDate )")

## Atomoxetine
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode, 'Atomox' AS DrugName
INTO AtomDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription like '%Atomox%'")

## Guanfacine
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode, 'Guanfac' AS DrugName
INTO GuanDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription like '%Guanfac%'")

## Risperidone
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode, 'Risperid' AS DrugName
INTO RisperidDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription like '%Risperid%'")

## Aripiprazole
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode, 'Aripipraz' AS DrugName
INTO AripipDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription like '%Aripipraz%'")

## Fluoxetine
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode,'Fluoxe' AS DrugName
INTO FluoxDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription like '%Fluoxe%'")

## Citalopram
dbSendUpdate( cn, "SELECT DISTINCT NationalDrugCode,'Citalop' AS DrugName
INTO CitalDrugCodeMap
FROM ASDPharmacyClaims
WHERE NdcDescription
like '%Citalop%'")
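
The seven mapping statements above differ only in the table-name prefix and the description stem used in the `LIKE` filter. As a hedged sketch, assuming one wanted to generate them instead of writing each by hand, the statements could be built from a lookup vector. The names `drug_stems` and `build_map_query` are hypothetical helpers introduced here for illustration; the table names and stems are taken from the cells above.

```r
# Hypothetical lookup: table-name prefix -> description stem used in the LIKE filter.
drug_stems <- c(Methy = "Methylphenid", Atom = "Atomox", Guan = "Guanfac",
                Risperid = "Risperid", Aripip = "Aripipraz",
                Fluox = "Fluoxe", Cital = "Citalop")

# Build one SELECT ... INTO statement; "%%" is a literal percent sign in sprintf().
build_map_query <- function(prefix, stem) {
  sprintf(paste(
    "SELECT DISTINCT NationalDrugCode, '%s' AS DrugName",
    "INTO %sDrugCodeMap",
    "FROM ASDPharmacyClaims",
    "WHERE NdcDescription LIKE '%%%s%%'",
    sep = "\n"), stem, prefix, stem)
}

# One query string per drug, named by the table prefix.
map_queries <- mapply(build_map_query, names(drug_stems), drug_stems)
cat(map_queries[["Methy"]], "\n")

# With a live connection, each statement could then be sent as in the cells above:
# for (q in map_queries) dbSendUpdate(cn, q)
```

The sketch only builds strings, so it can be inspected without a database connection before anything is executed.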

### ASD Drug Use Over Time (Sankey Diagram)

To build a Sankey Diagram that depicts individual member changes in the ASD study drugs over time, we used our NDC maps unique to each drug to obtain the pharmacy claims data associated with each of the seven target study drugs.

For each drug, we create a table of the distinct members in the ASD cohort who have at least one pharmacy claim with an NDC from that drug's map.

In [None]:
#Associated with Methylphenidate prescriptions
dbSendUpdate( cn, "SELECT DISTINCT( MemberId )
INTO MethyASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode
IN (SELECT NationalDrugCode
    FROM MethyDrugCodeMap )")
    
#Associated with Atomoxetine prescriptions
dbSendUpdate( cn, "SELECT DISTINCT( MemberId )
INTO AtomASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode
IN (SELECT NationalDrugCode
    FROM AtomDrugCodeMap )")

#Associated with Guanfacine prescriptions
dbSendUpdate( cn, "SELECT DISTINCT( MemberId )
INTO GuanASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode
IN (SELECT NationalDrugCode
    FROM GuanDrugCodeMap )")

#Associated with Risperidone prescriptions
dbSendUpdate( cn, "SELECT DISTINCT( MemberId )
INTO RisperidASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode
IN (SELECT NationalDrugCode
    FROM RisperidDrugCodeMap)
")

#Associated with Aripiprazole prescriptions
dbSendUpdate( cn, "SELECT DISTINCT ( MemberId )
INTO AripipASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode 
IN (SELECT NationalDrugCode
    FROM AripipDrugCodeMap )")

#Associated with Fluoxetine prescriptions
dbSendUpdate( cn, "SELECT DISTINCT( MemberId )
INTO FluoxASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode
IN (SELECT NationalDrugCode
    FROM FluoxDrugCodeMap )")

#Associated with Citalopram prescriptions
dbSendUpdate( cn, "SELECT DISTINCT ( MemberId )
INTO CitalASD
FROM ASDPharmacyClaims
WHERE NationalDrugCode
IN (SELECT NationalDrugCode
    FROM CitalDrugCodeMap )")

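Next, we create a table (PharmacySubset) with the member, dispense date, and NDC description of every claim for any of the seven target study drugs.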

In [None]:
dbSendUpdate(cn, "SELECT MemberId, DispenseDate, NdcDescription 
                    INTO PharmacySubset
                    FROM ASDPharmacyClaims
                    WHERE
                    NdcDescription like '%Methylphenid%'
                    OR  NdcDescription like '%Atomox%' 
                    OR  NdcDescription like '%Guanfac%' 
                    OR  NdcDescription like '%Risperid%' 
                    OR  NdcDescription like '%Aripipraz%' 
                    OR  NdcDescription like '%Fluoxe%' 
                    OR  NdcDescription like '%Citalop%'")

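We then keep only the claims dispensed from 2014 onward. Claims from 2020 and later are dropped in the R step that follows, which leaves the 2014 to 2019 period.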

In [None]:
dbSendUpdate( cn, "SELECT *
                      INTO PharmacySubset2014
                      FROM PharmacySubset
                      WHERE YEAR(DispenseDate) > 2013
                      ORDER BY MemberId, YEAR (PharmacySubset.DispenseDate)")


We extract the information and prepare the data as required for the Sankey diagram. 

In [None]:
drugData <- dbGetQuery( cn, "SELECT * FROM agf9.dbo.PharmacySubset2014")

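# Drug name: first word of the NDC description
drugData$therapy <-  sapply(strsplit( as.character(drugData$NdcDescription), " "), '[', 1)
# Dispense year: first component of the date, converted to numeric so that the filter below compares numbers rather than strings
drugData$timeperiod <-  as.numeric( sapply(strsplit( as.character(drugData$DispenseDate), "[-]"), '[', 1) )
drugData <- drugData[ drugData$timeperiod < 2020, ]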
drugDataSubset <- unique( drugData[ , c("MemberId", "timeperiod", "therapy")] )
drugDataSubset$therapy <- as.factor(drugDataSubset$therapy)
drugDataSubset$MemberId <- as.factor(drugDataSubset$MemberId)
drugDataSubset$timeperiod <- as.numeric( drugDataSubset$timeperiod )

drugDataSubset$pair <- paste0(drugDataSubset$MemberId, "-", drugDataSubset$timeperiod)

output <- as.data.frame( table( drugDataSubset$pair ))
onePerYear <- output[ output$Freq ==1, ]
subset <- drugDataSubset[ drugDataSubset$pair %in% onePerYear$Var1, ]
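
The last three lines keep only the member-year pairs that are associated with exactly one therapy, so that each member contributes a single stratum per year to the diagram. A minimal sketch of that filter on toy data (members, years, and therapies below are made up) may make the effect clearer:

```r
# Toy member-year-therapy rows (made up). Member m1 has two therapies in 2014.
toy <- data.frame(
  MemberId   = c("m1", "m1", "m1", "m2", "m2"),
  timeperiod = c(2014, 2014, 2015, 2014, 2015),
  therapy    = c("METHYLPHENIDATE", "GUANFACINE", "METHYLPHENIDATE",
                 "RISPERIDONE", "RISPERIDONE"),
  stringsAsFactors = FALSE
)
toy$pair <- paste0(toy$MemberId, "-", toy$timeperiod)

# Count rows per member-year pair and keep the pairs that occur exactly once.
pair_counts <- as.data.frame(table(toy$pair))
single_therapy_pairs <- pair_counts$Var1[pair_counts$Freq == 1]
toy_single <- toy[toy$pair %in% single_therapy_pairs, ]
print(toy_single)
```

Both `m1-2014` rows are removed because that member had two therapies in 2014. The result is named `toy_single` here instead of `subset` only to avoid masking base R's `subset()` function in the sketch.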

Finally, we plot the Sankey diagram and save the plot object.

In [None]:
p <- ggplot(subset, aes(x = timeperiod, stratum = therapy, alluvium = MemberId, fill = therapy, label = therapy)) +
  scale_fill_brewer(type = "qual", palette = "Set2") +
  geom_flow(color = "darkgray") +
  geom_stratum() +
  theme(legend.position = "bottom") +
  ggtitle("Treatment across observation period")

save(p, file = "./outputGraphic.RData")

### ASD Drug Use Over Time (Single- and Two-Drug Regimen)
To analyze the use of each target study drug over time, we obtained a count of distinct members from the ASD cohort who also had valid pharmacy claims between 2014 and 2019 (table: PharmacySubset2014).

#### Single- and Two-Drug Regimen Use Across All Years (2014-2019)

First, we obtained these counts for members taking only one of the target study drugs (e.g., methylphenidate only, without pharmacy claims for atomoxetine, guanfacine, etc.). The sum of distinct member counts between 2014 and 2019 was obtained.

Second, we used a similar query to determine the number of distinct members from this same sample subset who were on a two-drug regimen (e.g., methylphenidate and atomoxetine, without prescriptions for the other target drugs). The sum of distinct member counts between 2014 and 2019 was obtained.

In [None]:
dbSendUpdate( cn, "select 
MemberID,
YEAR(DispenseDate) as DispenseYear,
sum(case when NdcDescription like 'METHYL%' then 1 else 0 end) as n_Methyl,
sum(case when NdcDescription like 'ATOM%' then 1 else 0 end) as n_Atom,
sum(case when NdcDescription like 'GUANFA%' then 1 else 0 end) as n_Guan,
sum(case when NdcDescription like 'FLUOX%' then 1 else 0 end) as n_Fluox,
sum(case when NdcDescription like 'CITALO%' then 1 else 0 end) as n_Cital,
sum(case when NdcDescription like 'RISPER%' then 1 else 0 end) as n_Risperid,
sum(case when NdcDescription like 'ARIPIP%' then 1 else 0 end) as n_Aripip
into agf9.dbo.PharmacySubset2014to2019_counts
from PharmacySubset2014
group by MemberID, YEAR(DispenseDate)")
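
The query above is a conditional aggregation: each `sum(case when ... then 1 else 0 end)` counts, per member and year, the claims whose description starts with a given drug stem. As a rough base-R equivalent on toy data (the claims below are made up, and only two of the seven drugs are shown to keep the sketch short):

```r
# Toy claims (made up), one row per pharmacy claim.
claims <- data.frame(
  MemberID       = c(1, 1, 1, 2),
  DispenseYear   = c(2014, 2014, 2015, 2014),
  NdcDescription = c("METHYLPHENIDATE 10MG", "GUANFACINE 1MG",
                     "METHYLPHENIDATE 10MG", "FLUOXETINE 20MG"),
  stringsAsFactors = FALSE
)

# startsWith() plays the role of LIKE 'METHYL%' and LIKE 'GUANFA%'.
claims$n_Methyl <- as.integer(startsWith(claims$NdcDescription, "METHYL"))
claims$n_Guan   <- as.integer(startsWith(claims$NdcDescription, "GUANFA"))

# Sum the 0/1 indicators within each member-year group.
counts <- aggregate(cbind(n_Methyl, n_Guan) ~ MemberID + DispenseYear,
                    data = claims, FUN = sum)
print(counts)
```

A member-year with no claim for a drug gets a count of 0 for that drug, which is what the later "Yes"/"No" recoding relies on.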

In [None]:
for( i in c(2014:2019)){
  print(i)
  dyear <- dbGetQuery( cn, paste0("SELECT * FROM PharmacySubset2014to2019_counts where DispenseYear = ", i))
  dyear[3:9] <- lapply(dyear[3:9] , function(x) replace(x,x > 0, "Yes") )
  dyear[3:9] <- lapply(dyear[3:9] , function(x) replace(x,x %in% 0, "No") )
  
  dyear$combination <- apply( dyear[ , c(3:9) ] , 1 , paste , collapse = "-" )
  dyear$counts <- str_count(dyear$combination, "Yes")

  dyearSubset <- dyear[ dyear$counts == 1 | dyear$counts == 2, ]
  output <- as.data.frame( summary(as.factor(dyearSubset$combination)))
  output$combination <- NA
  
  drugs <- colnames(dyearSubset)[3:9]
  drugs <- gsub( "n_", "", drugs)
  
  # seq_len() avoids iterating over c(1, 0) when a year has no matching rows
  for( j in seq_len(nrow(output)) ){
    output$combination[j] <- paste( drugs[ which( unlist(strsplit( rownames(output)[j], "-")) == "Yes")], collapse = "&" )
  }
  rownames(output) <- c()
  colnames(output) <- c("Count", "Combination")
  output$year <- i
  
  if( i == 2014){
    final <- output
  }else if(i > 2014){
    final <- rbind( final, output)
  }

}
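
The loop above recodes the per-drug counts to "Yes"/"No", keeps the members on one or two drugs, and labels each combination with the drug names joined by "&". The core of that logic can be sketched without a database on a toy table (values are made up, and only three drug columns are used; the variable names are illustrative):

```r
# Toy per-member counts for one year (made up).
toy_counts <- data.frame(
  MemberID     = c(1, 2, 3, 4, 5),
  DispenseYear = 2014,
  n_Methyl     = c(3, 0, 2, 1, 1),
  n_Guan       = c(0, 1, 4, 1, 0),
  n_Fluox      = c(0, 0, 0, 2, 0)
)
drug_cols <- c("n_Methyl", "n_Guan", "n_Fluox")

# TRUE where the member had at least one claim for the drug.
taken <- toy_counts[, drug_cols] > 0
n_drugs <- rowSums(taken)

# Keep single- and two-drug regimens only.
keep <- n_drugs %in% c(1, 2)

# Label each kept member with the drugs taken, joined by "&".
labels <- apply(taken[keep, , drop = FALSE], 1, function(row) {
  paste(sub("^n_", "", drug_cols[row]), collapse = "&")
})

# Number of members per combination.
combo_counts <- table(labels)
print(combo_counts)
```

Member 4 takes all three toy drugs and is therefore excluded, just as members on three or more target drugs are excluded by the `counts == 1 | counts == 2` filter.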

#### UpSetR plots
We prepare the data to be plotted using UpSetR (https://github.com/hms-dbmi/UpSetR).
First, we extract the information by year and plot it to see what it looks like.

In [None]:
data2014 <- final[ final$year == 2014, "Count"]
names(data2014) <- final[ final$year == 2014, "Combination"]
data2014df <- fromExpression( data2014 )
data2014df$x = "Year_2014"
data2014df$Atom <- 0
data2014df$Aripip <- 0

upset( fromExpression( data2014), nsets = 7)

data2015 <- final[ final$year == 2015, "Count"]
names(data2015) <- final[ final$year == 2015, "Combination"]
data2015df <- fromExpression( data2015 )
data2015df$x = "Year_2015"
data2015df$Atom <- 0

upset( fromExpression( data2015), nsets = 7)

data2016 <- final[ final$year == 2016, "Count"]
names(data2016) <- final[ final$year == 2016, "Combination"]
data2016df <- fromExpression( data2016 )
data2016df$x = "Year_2016"
data2016df$Atom <- 0

upset( fromExpression( data2016), nsets = 7)

data2017 <- final[ final$year == 2017, "Count"]
names(data2017) <- final[ final$year == 2017, "Combination"]
data2017df <- fromExpression( data2017 )
data2017df$x = "Year_2017"

upset( fromExpression( data2017), nsets = 7)

data2018 <- final[ final$year == 2018, "Count"]
names(data2018) <- final[ final$year == 2018, "Combination"]
data2018df <- fromExpression( data2018 )
data2018df$x = "Year_2018"

upset( fromExpression( data2018), nsets = 7)

data2019 <- final[ final$year == 2019, "Count"]
names(data2019) <- final[ final$year == 2019, "Combination"]
data2019df <- fromExpression( data2019 )
data2019df$x = "Year_2019"
upset( fromExpression( data2019), nsets = 7)
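
In the cells above, `fromExpression()` receives a named count vector whose names are drug combinations joined by "&", and returns a data frame with one row per member and one 0/1 column per drug. To illustrate the general shape of that expansion, here is a base-R sketch under stated assumptions: `expand_combinations` is a hypothetical helper written for this example, not UpSetR's implementation, and details such as column order may differ from what the package produces. The counts are made up.

```r
# Named count vector in the "A&B" format (counts are made up).
expression_input <- c("Methyl" = 2, "Guan" = 1, "Methyl&Guan" = 3)

# Hypothetical helper: one row per counted member, one 0/1 column per set.
expand_combinations <- function(x) {
  sets <- unique(unlist(strsplit(names(x), "&", fixed = TRUE)))
  rows <- lapply(names(x), function(combo) {
    members <- strsplit(combo, "&", fixed = TRUE)[[1]]
    one_row <- as.integer(sets %in% members)
    # Repeat the membership pattern once per member counted in this combination.
    matrix(rep(one_row, times = x[[combo]]), ncol = length(sets), byrow = TRUE)
  })
  out <- as.data.frame(do.call(rbind, rows))
  colnames(out) <- sets
  out
}

membership <- expand_combinations(expression_input)
print(membership)
```

This also shows why a drug that never appears in a given year has no column in that year's data frame, which is presumably why columns such as `Atom` and `Aripip` are added by hand above before the yearly data frames are combined.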

Then, we combine the data from all years into a single data frame and draw one plot, to get a sense of the drug combinations used per year.

In [None]:
totalData <- rbind( data2014df, data2015df )
totalData <- rbind( data2016df, totalData )
totalData <- rbind( data2017df, totalData )
totalData <- rbind( data2018df, totalData )
totalData <- rbind( data2019df, totalData )
totalData$x <- as.factor( totalData$x )

## Rename the drugs to be displayed in the final plot
colnames( totalData )[1:7] <- c("Aripiprazole", "Risperidone", "Citalopram",
                                "Fluoxetine", "Guanfacine", "Atomoxetine", 
                                "Methylphenidate")
upset( totalData,
       nsets = 7,
       queries = list(
         list(query = elements, 
              params = list("x", c("Year_2019","Year_2018", "Year_2017", "Year_2016","Year_2015", "Year_2014")), color = "#b54e75", active = T),
         list(query = elements, 
              params = list("x", c("Year_2018", "Year_2017","Year_2016", "Year_2015", "Year_2014")), color = "#e69f00", active = T),
         list(query = elements, 
              params = list("x", c("Year_2017","Year_2016", "Year_2015", "Year_2014")), color = "#58ad97", active = T),
         list(query = elements, 
              params = list("x", c("Year_2016", "Year_2015", "Year_2014")), color = "#566fa8", active = T),
         list(query = elements, 
              params = list("x", c("Year_2015", "Year_2014")), color = "#2a2369", active = T),
         list(query = elements, 
              params = list("x", c("Year_2014")), color = grey(0.7), active = T)
         
       ), 
       query.legend = "bottom"
)