# Practicum 3: Secondary Aims Using Data Arising from a SMART


<br>
<font size=3>
    This material has been developed for [Getting SMART About Adaptive Interventions in Education](https://d3lab.isr.umich.edu/training/) led by [d3lab](https://d3lab.isr.umich.edu). 
    
    Notebooks were developed by [Nicholas J. Seewald](https://nickseewald.com). 
    SAS code originally written by Daniel Almirall, Inbal Nahum-Shani, and Susan A. Murphy.
    The code was translated into R by Audrey Boruvka and Nicholas J. Seewald.
</font>


### Practicum Tasks
- [Task 1: Estimate mean outcomes under different second-stage treatments at different levels of first-stage utterances](#task-1)
<hr>

Recall the Autism SMART:
<img src="assets/autism-smart-diagram.jpg" alt="Autism SMART diagram" style="width: 500px;"/>

**First-Stage Coding**:
- JASP+EMT: A1 = 1
- JASP+EMT+SGD: A1 = -1

**Second-Stage Coding**:
- ADD SGD: A2 = 1 
- INTENSIFY: A2 = -1

## Setup
As in previous practica, we need to set up our data frame and computing environment.

In [None]:
library(geepack)
source('functions.R')

aut <- read.csv("assets/autism-simulated-dataset.csv")
names(aut) <- tolower(names(aut))
aut <- aut[order(aut$id), ]

aut$o11c <- with(aut, o11 - mean(o11))
aut$o12c <- with(aut, o12 - mean(o12))
aut$o21c <- with(aut, o21 - mean(o21))
aut$o22c <- with(aut, o22 - mean(o22))
aut$o11cnr <- aut$o12cnr <- NA
aut$o21cnr <- aut$o22cnr <- NA
aut$o11cnr[aut$r == 0] <- with(subset(aut, r == 0), o11 - mean(o11))
aut$o12cnr[aut$r == 0] <- with(subset(aut, r == 0), o12 - mean(o12))
aut$o21cnr[aut$r == 0] <- with(subset(aut, r == 0), o21 - mean(o21))
aut$o22cnr[aut$r == 0] <- with(subset(aut, r == 0), o22 - mean(o22))

aut$s <- ifelse(aut$a1 == 1 & aut$r == 0, 1, 0)

aut <- aut[order(aut$id), ]
head(aut)
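
The `cnr` variables above are centered using only the slow responders (`r == 0`) and are left as `NA` for everyone else. A minimal sketch of that pattern on a toy data frame follows; the values are made up for illustration and are not taken from the Autism data.

```r
# Toy data (hypothetical values): r = 1 marks responders, r = 0 slow responders
toy <- data.frame(r = c(0, 0, 1, 0, 1), o11 = c(10, 20, 90, 30, 80))

# Center o11 within the slow responders only; responders stay NA
toy$o11cnr <- NA
toy$o11cnr[toy$r == 0] <- with(subset(toy, r == 0), o11 - mean(o11))

toy
# The centered variable averages to zero among slow responders
mean(toy$o11cnr, na.rm = TRUE)
```

Because the mean is taken over the subgroup, a value of zero on a `cnr` variable corresponds to the average slow responder, not the average child in the full sample.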

## Part 1: Examine Moderators of the Second-Stage Treatment Effect
Step 1 of Q-learning is to understand how intermediate outcomes can be used to make second-stage decisions about intensifying vs. adding SGD among slow responders to JASP+EMT. This will help us more deeply tailor second-stage treatment for these slow responders based on the status of the child up to the point of slow response.

First, we will fit a full model with the re-randomized children.

In [None]:
fit <- geeglm(y ~ o11cnr + o12cnr + a2*o21 + a2*o22,
              id = id, data = aut, subset = s == 1, x = TRUE) 
# x = TRUE requests R return the matrix of data used to fit the model

## summary(fit) uses "robust" standard errors, which is different from proc 
## genmod under no clustering/weighting - so we run estimate with
## combos = identity matrix (1's on diagonal, 0's elsewhere)
mat <- diag(ncol(fit$x)) #mat is the appropriately-sized identity matrix 
rownames(mat) <- names(coef(fit)) #name the rows of mat for legibility
estimate(fit, mat)

Only `o21` (level of communicative utterances made in the first stage) appears to moderate the effect of second-stage treatment. Now, we'll fit a reduced model and make appropriate comparisons. The `*` operator in a model formula gives us both main effects as well as their interaction. Here, `a2*o21` is equivalent to `a2 + o21 + a2:o21`.
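
One way to confirm how R expands the `*` operator is to compare the design-matrix columns produced by the two ways of writing the formula. The sketch below uses a toy data frame with made-up values; only the resulting column names matter.

```r
# Toy data with hypothetical values; only the column names matter here
toy <- data.frame(a2 = c(1, -1, 1, -1), o21 = c(2, 5, 3, 4))

# Two ways of writing the same right-hand side
short <- colnames(model.matrix(~ a2 * o21, data = toy))
long  <- colnames(model.matrix(~ a2 + o21 + a2:o21, data = toy))

print(short)
identical(short, long)
```

The printed column order (intercept, main effects, then the interaction) is also the order in which the fitted coefficients appear.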

In [None]:
# Fit a reduced model and make appropriate comparisons
refit <-
  geeglm(y ~ o11cnr + a2*o21,
         id = id, data = aut, subset = s == 1, x = TRUE)

### <a name="task-1"></a> Task 1: Estimate mean outcomes under different second-stage treatments at different levels of first-stage utterances
Fill in the blanks below to create contrasts that identify the mean outcome under each second-stage treatment at a particular level of `o21`, the level of communicative utterances in the first stage. Each vector is positional and follows the order of the coefficients in `refit`: the intercept, `o11cnr`, `a2`, `o21`, and the `a2:o21` interaction.

In [None]:
estimate(refit,
  rbind(
    # The effect at a higher level of utterances
    "ADD,        o21=5" = c(1, 0, ____, 5, 5),
    "INTENSIFY,  o21=5" = c(1, 0, -1, 5, ____),
    "ADD-INTSFY, o21=5" = c(0, 0, ____, ____, 10),
    # The effect at a lower level of utterances
    "ADD,        o21=2" = c(1, 0, 1, ____, 2),
    "INTENSIFY,  o21=2" = c(1, 0, ____, 2, -2),
    "ADD-INTSFY, o21=2" = c(0, 0, 2, 0, ____)))

When you're done, press `SHIFT`+`ENTER` to run the code. If you've completed the task successfully, you'll find that the difference between ADD and INTENSIFY at a high level of utterances (`o21=5`) is **8.4641**.
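
If the contrast rows feel opaque, the arithmetic behind them can be sketched on made-up numbers. The coefficient values and the names `trt` and `x` below are hypothetical and unrelated to `refit`; the point is only that each row is multiplied element by element with the coefficient vector and summed.

```r
# Hypothetical coefficients for a model of the form y ~ trt * x
beta <- c("(Intercept)" = 10, "trt" = 2, "x" = 1.5, "trt:x" = 0.5)

# Each row lists the value multiplying each coefficient, in order
contrasts <- rbind(
  "trt = 1,  x = 3" = c(1,  1, 3,  3),
  "trt = -1, x = 3" = c(1, -1, 3, -3),
  "difference"      = c(0,  2, 0,  6)
)

# Linear combinations of the coefficients, one per row
est <- drop(contrasts %*% beta)
est
```

The difference row is simply the first row minus the second, which is why the intercept and `x` entries cancel to zero.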

## Part 2: Performing Q-Learning with `qlaci`
We now demonstrate how to use `qlaci()` to perform Q-learning in R. We'll first load the package, as we did in [Demo 3](03_QLearning_Demo.ipynb).

In [None]:
remotes::install_github("d3lab-isr/qlaci")
library(qlaci)

Recall that we need to rework the data a bit before we use `qlaci()`. For one, it can't handle missing values, even if they're limited to responders in the variables of the stage-2 regression model. Therefore, we'll create a new `data.frame` where we set all missing values to 0.

In [None]:
autq <- aut
autq[is.na(autq)] <- 0
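
Before overwriting missing values, it can be worth checking which columns actually contain them. The following is a small sketch on a toy data frame (hypothetical values, not the Autism data) of the same replacement pattern, with a count of `NA`s before and after.

```r
# Toy data: the centered covariate is NA for responders (r = 1)
toy <- data.frame(id = 1:4, r = c(0, 1, 0, 1), o11cnr = c(-5, NA, 5, NA))

# Count missing values per column before the replacement
colSums(is.na(toy))

# Copy the data, then set every NA in the copy to 0
toyq <- toy
toyq[is.na(toyq)] <- 0

# No missing values remain in the copy; the original is untouched
colSums(is.na(toyq))
```

Working on a copy, as the practicum does with `autq`, keeps the original `NA`s available for any analysis where zero would be a misleading value.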

We also need to create a contrast matrix for our stage-1 regression. We'll do that below, at two levels of `o11`.
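
Each row of such a contrast matrix can be generated from a value of `o11` and a value of `a1`. The sketch below assumes the columns are ordered as intercept, `o11`, `o11:a1`, `a1`; the helper `stage1_row()` is hypothetical and written only for illustration.

```r
# Hypothetical helper: contrast row for a given o11 and a1,
# assuming the column order (intercept, o11, o11:a1, a1)
stage1_row <- function(o11, a1) c(1, o11, o11 * a1, a1)

low_emt  <- stage1_row(10,  1)  # JASP+EMT (a1 = 1) at o11 = 10
low_sgd  <- stage1_row(10, -1)  # JASP+EMT+SGD (a1 = -1) at o11 = 10
low_diff <- low_emt - low_sgd   # row for the difference in means

rbind(low_emt, low_sgd, low_diff)
```

Writing the difference as one row minus another makes it easy to see why its intercept and `o11` entries are zero.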

In [None]:
c1 <-
  rbind("Mean Y JASP+EMT, o11 = 10 = low"           = c(1, 10,  10,  1),
        "Mean Y JASP+EMT+SGD, o11 = 10 = low"       = c(1, 10, -10, -1),
        "Mean diff (no SGD - SGD), o11 = 10 = low"  = c(0,  0,  20,  2),
        "Mean Y JASP+EMT, o11 = 70 = high"          = c(1, 70,  70,  1),
        "Mean Y JASP+EMT+SGD, o11 = 70 = high"      = c(1, 70, -70, -1),
        "Mean diff (no SGD - SGD), o11 = 70 = high" = c(0,  0, 140,  2))

Now we're ready to go! See [Demo 3](03_QLearning_Demo.ipynb) for details on the syntax. Recall that we set the random seed to ensure reproducible results.

In [None]:
## setting the random seed ensures that the results are reproducible
set.seed(0)
options(warn = -1) # ignore warnings

## with attach we can be lazy and refer to variables directly; use with caution
attach(autq)
ql <- invisible(qlaci(H10 = cbind("intercept" = 1, o11),
            H11 = cbind("o11:a1" = o11, "a1" = 1),
            A1 = a1, Y1 = rep(0, nrow(autq)),
            H20 = cbind("intercept" = 1, o11cnr, o21),
            H21 = cbind("o21:a2" = o21, "a2" = 1),
            A2 = a2, Y2 = y, S = s, c1 = t(c1)))
detach(autq)
options(warn = 0)
ql

Using the results from `qlaci()`, we can estimate the expected outcome under our more deeply-tailored adaptive intervention.

In [None]:
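## predicted outcome under the best first-stage option: the main-effect part
## plus the absolute value of the a1 part (a1 is coded +1/-1)
aut$yopt <- with(aut, cbind(1, o11) %*% ql$stg1coeff[1:2] +
  abs(cbind(o11*a1, a1) %*% ql$stg1coeff[3:4]))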

round(mean(aut$yopt), 3)
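
The absolute value in the calculation above works because `a1` is coded as 1 or -1: choosing the better first-stage option turns the `a1` part of the model into its absolute value. A minimal sketch with made-up coefficients (not the `qlaci()` estimates) checks this against a direct comparison of the two options; the function `q_stage1()` is hypothetical.

```r
# Hypothetical stage-1 coefficients: intercept, o11, o11:a1, a1
beta <- c(40, 0.5, -0.1, 3)
o11  <- c(10, 40, 70)

# Predicted outcome for a given o11 and first-stage option a1
q_stage1 <- function(o11, a1) {
  beta[1] + beta[2] * o11 + a1 * (beta[3] * o11 + beta[4])
}

# Best predicted outcome, found by comparing the two options directly
best_by_search <- pmax(q_stage1(o11, 1), q_stage1(o11, -1))

# Same quantity using the absolute-value shortcut
best_by_abs <- beta[1] + beta[2] * o11 + abs(beta[3] * o11 + beta[4])
all.equal(best_by_search, best_by_abs)

# The sign of the a1 part indicates which option is better
best_a1 <- ifelse(beta[3] * o11 + beta[4] > 0, 1, -1)
best_a1
```

In this toy example the preferred first-stage option changes with `o11`, which is the kind of tailoring the stage-1 regression is meant to reveal.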