# GxE problem

- I updated the simulation function (scaling model) so that the residual standard deviation is always positive: sigma is now the exponential of a linear function of `E`.
- I estimate the models below on `yt`, the outcome generated by the updated function.
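
Written out, the data-generating process for `yt` can be summarized as follows. This only restates the simulation code below, with $e = \sqrt{1 - h^2}$; the symbols $u_i$ and $\sigma_i$ are introduced here for notation and do not appear in the code.

$$
\begin{aligned}
yt_i &= a_0 + a_1 E_i + b_0 h\, G_i + b_1 h\, E_i G_i + u_i, \\
u_i &\sim \mathcal{N}\!\left(0,\ \sigma_i^2\right), \qquad \log \sigma_i = b_0 e + b_1 e\, E_i .
\end{aligned}
$$

Because $\sigma_i$ is an exponential, it stays positive for every value of $E_i$, whereas the multiplicative scale $(b_0 + b_1 E_i)\,e$ of the original scaling model changes sign when $E_i < -b_0/b_1$.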

In [None]:
library(brms)
library(data.table)
library(texreg)

In [112]:
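# Domingue's simulation function, extended with a log-linear residual SD for yt
simData <- function(E, i = 1, b0 = .8, b1 = .2, b2 = 0, b3 = .05, h = sqrt(.6),
                    a0 = 0, a1 = .5, sigma = 1, scaling = TRUE) {
    N <- length(E)
    G <- rnorm(N, 0, 1)
    # note: eps has mean 1, which shifts the mean of y (but not of yt)
    eps <- rnorm(N, 1, sigma)
    if (scaling) {
        e <- sqrt(1 - h^2)
        # ystar and y are the original scaling outcome; the models of interest use yt
        ystar <- h * G + e * eps
        y <- a0 + a1 * E + (b0 + b1 * E) * ystar
        # residual SD of yt, modeled on the log scale so it is always positive
        fsigma <- exp(b0 * e + b1 * e * E)
        # final y values: same G and GxE effects as y, heteroskedastic error
        yt <- a0 + a1 * E + b0 * h * G + b1 * h * E * G + rnorm(N, 0, fsigma)
    } else {
        y <- b1 * G + b2 * E + b3 * G * E + eps
        # assumption: without scaling there is no separate yt, so reuse y
        # and report the constant residual SD
        yt <- y
        fsigma <- rep(sigma, N)
    }
    # error is the residual SD used to generate yt
    data.frame(E = E, y = y, yt = yt, g = G, error = fsigma)
}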


## Scaling model

In [113]:
E = rnorm(5000, 0, 1)
dts = data.table(simData(E, scaling = TRUE))
summary(dts)



       E                  y                 yt                  g            
 Min.   :-3.40134   Min.   :-2.4784   Min.   :-7.510141   Min.   :-3.463412  
 1st Qu.:-0.69075   1st Qu.:-0.2504   1st Qu.:-1.209768   1st Qu.:-0.657698  
 Median :-0.01347   Median : 0.4118   Median :-0.060209   Median :-0.002678  
 Mean   :-0.01863   Mean   : 0.4936   Mean   : 0.005862   Mean   : 0.010220  
 3rd Qu.: 0.66807   3rd Qu.: 1.1285   3rd Qu.: 1.184128   3rd Qu.: 0.663497  
 Max.   : 3.93702   Max.   : 6.0476   Max.   : 8.273707   Max.   : 3.348608  
     error      
 Min.   :1.006  
 1st Qu.:1.521  
 Median :1.657  
 Mean   :1.668  
 3rd Qu.:1.803  
 Max.   :2.817  

In [114]:
cnames = c("ystar", "y")
m1 = lm(y ~ g + E + g * E, data = dts)
m2 = lm(yt ~ g + E + g * E, data = dts)
cat(screenreg(list(m1, m2)))


             Model 1      Model 2    
-------------------------------------
(Intercept)     0.50 ***     0.01    
               (0.01)       (0.02)   
g               0.62 ***     0.62 ***
               (0.01)       (0.02)   
E               0.64 ***     0.50 ***
               (0.01)       (0.02)   
g:E             0.15 ***     0.15 ***
               (0.01)       (0.02)   
-------------------------------------
R^2             0.75         0.19    
Adj. R^2        0.75         0.19    
Num. obs.    5000         5000       
*** p < 0.001; ** p < 0.01; * p < 0.05



## Bayesian distributional model


In [115]:
# distributional model using bayesian stats

f = bf(yt ~ g + E + g * E, sigma ~ 1 + E)
m4 = brm(f, data = dts, family = brmsfamily("gaussian", link_sigma = "log"))


Compiling Stan program...

recompiling to avoid crashing R session

Start sampling




SAMPLING FOR MODEL '2f014d48f4e8e552ea3acf08e4adcff2' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 0.001844 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 18.44 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 3.42186 seconds (Warm-up)
Chain 1:                3.33484 seconds (Sampling)
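
To make explicit what the `sigma ~ 1 + E` part of the formula estimates, here is a minimal maximum-likelihood sketch of the same kind of model: a Gaussian outcome whose log standard deviation is linear in `E`. This is not how brms fits the model (brms samples from the posterior with Stan); the data are simulated inside the block, and the parameter values and the names `beta`, `gamma`, `negll`, and `est` are hypothetical, chosen only for illustration.

```r
set.seed(1)
n <- 2000
E <- rnorm(n)
g <- rnorm(n)

# hypothetical true values, for illustration only
beta  <- c(0, 0.6, 0.5, 0.15)   # Intercept, g, E, g:E
gamma <- c(0.5, 0.12)           # log-sigma intercept and slope on E

X <- cbind(1, g, E, g * E)      # design matrix for the mean
Z <- cbind(1, E)                # design matrix for log(sigma)
yt <- drop(X %*% beta) + rnorm(n, 0, exp(drop(Z %*% gamma)))

# negative log-likelihood: Gaussian with mean X b and sd exp(Z c)
negll <- function(p) {
    mu <- drop(X %*% p[1:4])
    log_sigma <- drop(Z %*% p[5:6])
    -sum(dnorm(yt, mean = mu, sd = exp(log_sigma), log = TRUE))
}

fit <- optim(rep(0, 6), negll, method = "BFGS")
est <- setNames(fit$par,
                c("Intercept", "g", "E", "g:E", "sigma_Intercept", "sigma_E"))
round(est, 2)
```

The last two entries of `est` play the role of `sigma_Intercept` and `sigma_E` in the brms output: they are coefficients on the log scale of the residual standard deviation, not on the standard deviation itself.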


In [118]:
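# the model recovers the simulated sigma coefficients (with e = sqrt(1 - h^2)):
# sigma_Intercept is close to b0*e = 0.51 and sigma_E is close to b1*e = 0.13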
cat(screenreg(m4))


                 Model 1      
------------------------------
Intercept            0.01     
                 [-0.03; 0.05]
sigma_Intercept      0.50 *   
                 [ 0.48; 0.52]
g                    0.62 *   
                 [ 0.58; 0.66]
E                    0.51 *   
                 [ 0.47; 0.55]
g:E                  0.14 *   
                 [ 0.10; 0.17]
sigma_E              0.12 *   
                 [ 0.10; 0.13]
------------------------------
R^2                  0.19     
Num. obs.         5000        
loo IC           19174.26     
WAIC             19174.24     
* 0 outside the confidence interval.
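
For comparison with the table above, the coefficient values implied by the simulation can be computed directly from the defaults of `simData()` (`b0 = 0.8`, `b1 = 0.2`, `h = sqrt(0.6)`). This is a small sketch; the vector name `truth` is introduced here for illustration.

```r
# defaults taken from simData()
b0 <- 0.8
b1 <- 0.2
h  <- sqrt(0.6)
e  <- sqrt(1 - h^2)

# values implied by the data-generating process for yt
truth <- c(
    g               = b0 * h,  # main effect of G
    `g:E`           = b1 * h,  # GxE interaction
    sigma_Intercept = b0 * e,  # intercept of log(sigma)
    sigma_E         = b1 * e   # slope of log(sigma) on E
)
round(truth, 3)
```

These are the targets that the estimates for `g`, `g:E`, `sigma_Intercept`, and `sigma_E` should be close to.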


In [119]:

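# Under the scaling model, coef(g) * coef(sigma_E) equals coef(g:E) * coef(sigma_Intercept).
# The 95% interval for the difference includes 0, so I cannot reject this constraint.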
hyp <- "g * sigma_E = g:E * sigma_Intercept"
(hyp <- hypothesis(m4, hyp, alpha = 0.05))


Hypothesis Tests for class b:
                Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
1 (g*sigma_E)-(g:E*... = 0        0      0.01    -0.02     0.03         NA
  Post.Prob Star
1        NA     
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
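
Why this hypothesis corresponds to the scaling model can be seen by matching the fitted coefficients to the simulation parameters. A short derivation, writing $\beta$ for the mean coefficients and $\gamma$ for the log-sigma coefficients (notation introduced here, not part of the brms output), with $e = \sqrt{1 - h^2}$:

$$
\beta_g = b_0 h,\quad \beta_{gE} = b_1 h,\quad \gamma_0 = b_0 e,\quad \gamma_E = b_1 e
\;\;\Longrightarrow\;\;
\beta_g\,\gamma_E - \beta_{gE}\,\gamma_0 = b_0 b_1 h e - b_1 b_0 h e = 0 .
$$

The contrast tested above is therefore exactly zero in the simulated process, which is consistent with an estimate of 0 and an interval of [-0.02, 0.03].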