## <div align="center"> <h1 align="center"> PUPILLOMETRY BASICS </h1> </div>
## <div align="center"> <h1 align="center"> FOR LINGUISTICS </h1> </div>

## <div align="center"> <h1 align="center"> PART VI: Interactions </h1> </div>

## Interactions among continuous variables

In [None]:
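library(mgcv)   # needed so that summary() works on the fitted GAMM object
library(readr)  # provides read_rds()

data <- read.csv("../input/pupillometry-sample/data_pup.csv")
#sanity check
#head(data)

# Keep only the two target conditions (%in% also drops rows where condition is NA)
target_NV <- droplevels(data[data$condition %in% c("NVS", "NVI"), ])

# Grouping variables must be factors for the random and by-factor smooths
target_NV$participant <- as.factor(target_NV$participant)
target_NV$session <- as.factor(target_NV$session)
target_NV$condition <- as.factor(target_NV$condition)
target_NV$item <- as.factor(target_NV$item)
target_NV$regularity <- as.factor(target_NV$regularity)
#sanity check
class(target_NV$participant)

# Load the model fitted earlier in this series instead of refitting it
model1 <- readr::read_rds("../input/gamm-model1/model1.rds")
#sanity check
summary(model1)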

To explore how a continuous variable modulates the effect of a categorical variable in GAMMs, the continuous variable needs to be modeled as an interaction with time. When fitting a model to examine an interaction with a continuous variable, we need to use *tensor product smooths* (rather than binary difference smooths). The resulting non-linear interaction accounts for changes in pupil size that depend jointly on a categorical variable and a continuous variable **over time**.

In this analysis, we will use the variable Bilingual Picture Naming – Spanish Score (`BPN_Sp`) as the continuous variable in the interaction.

In [None]:
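library(mgcv)  # provides bam(), te(), s() and the scat family

model_BPN <- bam(corrected_pupil_size ~ condition
             # one time-by-BPN_Sp surface per condition
             + te(bin, BPN_Sp, by = condition)
             # smooth over gaze position
             + s(gaze_x, gaze_y)
             # random factor smooths over time for participants and items
             + s(bin, participant, bs = 'fs', m = 1, k = 10)
             + s(bin, item, bs = 'fs', m = 1, k = 10)
             , family = "scat"
             , data = target_NV
             , method = "fREML"
             , discrete = TRUE)

summary(model_BPN)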
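
The `te(..., by = condition)` structure can be easier to picture on a small example. The following is a minimal sketch on simulated data, not the dataset used in this series: `toy`, `time`, `score`, `group` and `toy_fit` are hypothetical stand-ins for `target_NV`, `bin`, `BPN_Sp`, `condition` and the fitted model, and it uses `gam()` with the default Gaussian family rather than `bam()` with `scat`, only to keep it small and quick to run.

```r
library(mgcv)  # provides gam() and te()

set.seed(1)
n <- 2000

# Hypothetical toy data (assumption: uniform predictors, two groups)
toy <- data.frame(
  time  = runif(n, 0, 1),
  score = runif(n, 0, 1),
  group = factor(sample(c("A", "B"), n, replace = TRUE))
)

# In group B the response rises with time AND score; in group A it does not
toy$y <- sin(2 * pi * toy$time) +
  ifelse(toy$group == "B", 2 * toy$time * toy$score, 0) +
  rnorm(n, sd = 0.3)

# One tensor product surface over (time, score) per level of group.
# The parametric `group` term carries the overall level difference,
# because the by-factor smooths are centered.
toy_fit <- gam(y ~ group + te(time, score, by = group),
               data = toy, method = "REML")

# The smooth-term table has one row per group-specific surface
print(summary(toy_fit)$s.table)
```

The point of the sketch is that the summary lists a separate two-dimensional smooth for each level of the `by` factor, which is why the model summary alone does not show where the conditions differ.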


In the case of interactions with continuous variables, visualization is essential for understanding the results, because it is the only method used here to determine significance. To visualize a two-dimensional pattern, such as how the effect of a categorical variable changes across a continuous variable over time, **contour plots** based on the model’s fitted values need to be plotted.

Here, the color bands represent the range of values of the dependent variable, in this case the difference in pupillary response between conditions (i.e., NVI minus NVS): the closer to red, the greater the difference. The highlighted areas in the plot indicate where the independent variable (y-axis) had a significant effect on the dependent variable, and the x-axis shows the time course of the effect.

In [None]:
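library(itsadug)  # provides plot_diff2()

# Contour plot of the estimated difference surface (NVI minus NVS)
plot_diff2(model_BPN, view = c("bin", "BPN_Sp"),
           comp = list(condition = c("NVI", "NVS")),
           rm.ranef = TRUE,    # exclude random effects from the predictions
           show.diff = TRUE,   # mark where the difference is significant
           hide.label = TRUE,
           main = "Difference: NVI minus NVS",
           xlab = "Time Bin (20ms per bin)",
           ylab = "BPN_Sp (Bilingual Picture Naming, Spanish)")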
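
To show what such a difference surface is made of, here is a rough sketch that builds one by hand with `predict()` and base graphics. It runs on simulated data, not the data from this series, and all names (`toy`, `time`, `score`, `group`, `toy_fit`, `grid_for`) are hypothetical. It draws only the estimated difference and does not compute the confidence intervals that `plot_diff2()` uses to mark significance.

```r
library(mgcv)  # provides gam() and te()

set.seed(2)
n <- 2000
# Hypothetical toy data: group B diverges from A as time and score increase
toy <- data.frame(time = runif(n), score = runif(n),
                  group = factor(sample(c("A", "B"), n, replace = TRUE)))
toy$y <- sin(2 * pi * toy$time) +
  ifelse(toy$group == "B", 2 * toy$time * toy$score, 0) +
  rnorm(n, sd = 0.3)
toy_fit <- gam(y ~ group + te(time, score, by = group),
               data = toy, method = "REML")

# Regular grid over the two continuous predictors
time_seq  <- seq(0, 1, length.out = 40)
score_seq <- seq(0, 1, length.out = 40)
grid <- expand.grid(time = time_seq, score = score_seq)

# Same grid, with the group fixed to one level
grid_for <- function(g) {
  data.frame(grid, group = factor(g, levels = levels(toy$group)))
}

# Fitted values per group, then the difference surface (B minus A).
# expand.grid varies `time` fastest, so rows index time and columns index score.
pred_A <- predict(toy_fit, newdata = grid_for("A"))
pred_B <- predict(toy_fit, newdata = grid_for("B"))
diff_mat <- matrix(pred_B - pred_A, nrow = length(time_seq))

# Contour plot: redder cells mean a larger B-minus-A difference
image(time_seq, score_seq, diff_mat, col = rev(heat.colors(20)),
      xlab = "time", ylab = "score", main = "Difference: B minus A")
contour(time_seq, score_seq, diff_mat, add = TRUE)
```

Because the simulated difference grows with both `time` and `score`, the reddest region should appear in the upper right corner of the plot.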

## This concludes this series of notebooks on the basics of the analysis of pupillary data for experimental Linguistics.
## If you found them useful, please do not hesitate to reach out to me on [LinkedIn](https://www.linkedin.com/in/prislb/) and check out my [web page](https://ry2y67bvrg.wixsite.com/prislb).

## Resources

### **Readings**
  + Sóskuthy, M. (2017). Generalised additive mixed models for dynamic analysis in linguistics: a practical introduction.
  + Wieling, M. (2018). Analyzing dynamic phonetic data using generalized additive mixed modeling: A tutorial focusing on articulatory differences between L1 and L2 speakers of English.
  + Schmidtke, J. (2018). Pupillometry in linguistic research: an introduction and review for L2 researchers.
  + Sirois, S. & Brisson, J. (2015). Pupillometry.
  