# Modeling Study Results 
## Anchoring and Alignment: Data Factors in Part-to-Whole Visualization

### Imports

In [32]:
library(glmmTMB)

### Load Data

In [42]:
# Read in final results data
data <- read.csv("data/results.csv", header = TRUE)

# Create numeric binary indicators for alignment, anchoring, and chart type
data$alignment_num <- ifelse(data$alignment == "aligned", -1, 1)
data$anchor_num <- ifelse(data$anchor == "anchor", -1, 1)
data$chartType_num <- ifelse(data$chartType == "pie", -1, 1)

# Create rounding factors based on distance to 5s and 10s
data$distToNearest5 <- abs(data$selectedPart - round(data$selectedPart / 5) * 5) - 1
data$distToNearest10 <- abs(data$selectedPart - round(data$selectedPart / 10) * 10) - 2

# Create ordinal variables for alignment and anchoring
data$anchorDistance <- as.numeric(factor(data$anchorCategory, 
    levels = c("anchor", "near-anchor", "far-anchor"))) - 1
data$alignmentDistance <- as.numeric(factor(data$alignmentCategory, 
    levels = c("aligned", "near-aligned", "far-from-aligned"))) - 1
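
As a rough illustration of the derived columns above, the sketch below applies the same expressions to a handful of made-up values. The vectors `selectedPart` and `anchorCategory` defined here are hypothetical stand-ins for the columns in `data/results.csv`, not study data.

```r
# Hypothetical whole-number percentages (not taken from the study data)
selectedPart <- c(10, 12, 15, 23, 48)

# Raw distance to the nearest multiple of 5 and of 10
raw5  <- abs(selectedPart - round(selectedPart / 5) * 5)    # 0 2 0 2 2
raw10 <- abs(selectedPart - round(selectedPart / 10) * 10)  # 0 2 5 3 2

# Shifted versions, using the same constants as the cell above
distToNearest5  <- raw5 - 1
distToNearest10 <- raw10 - 2

# Ordinal coding: the factor level order sets the integer codes 0, 1, 2
anchorCategory <- c("far-anchor", "anchor", "near-anchor")
anchorDistance <- as.numeric(factor(anchorCategory,
    levels = c("anchor", "near-anchor", "far-anchor"))) - 1  # 2 0 1

data.frame(selectedPart, raw5, distToNearest5, raw10, distToNearest10)
```

For whole-number inputs the raw distances span 0 to 2 (nearest 5) and 0 to 5 (nearest 10), so the shifted columns span -1 to 1 and -2 to 3 respectively.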

### Build Absolute Error Models

#### Compare Distributions via AIC Model Comparison

In [34]:
modelPoisson <- glmmTMB(
  absError ~ chartType_num * anchor_num * alignment_num +
    (1 + anchor_num + alignment_num + chartType_num | userID),
  data = data,
  family = poisson(link = "log")
)

# Negative binomial fit (nbinom2), despite the "Binomial" in the variable name
modelBinomial <- glmmTMB(
  absError ~ chartType_num * anchor_num * alignment_num +
    (1 + anchor_num + alignment_num + chartType_num | userID),
  data = data,
  family = nbinom2(link = "log")
)

modelTweedie <- glmmTMB(
  absError ~ chartType_num * anchor_num * alignment_num +
    (1 + anchor_num + alignment_num + chartType_num | userID),
  data = data,
  family = tweedie(link = "log")
)

AIC(modelPoisson, modelBinomial, modelTweedie)

| Model | df | AIC |
| :----: | :------: | :-------: |
| modelPoisson | 18 | 26251.76 |
| modelBinomial | 19 | 23202.92 |
| modelTweedie | 20 | 23580.43 |


The negative binomial model (`modelBinomial`, fit with `nbinom2`) has the lowest AIC and is used for the remaining absolute error models.
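
To make the size of the gaps explicit, the following sketch computes each family's AIC difference from the best model. The AIC values are copied by hand from the comparison output above; the names `aic`, `deltaAIC`, and `bestModel` are introduced here for illustration only.

```r
# AIC values copied from the distribution comparison above
aic <- c(modelPoisson  = 26251.76,
         modelBinomial = 23202.92,
         modelTweedie  = 23580.43)

# Difference from the lowest AIC; the best model has a difference of 0
deltaAIC <- aic - min(aic)
bestModel <- names(which.min(aic))

round(sort(deltaAIC), 2)
bestModel
```

With the fitted objects still in memory, the same differences could be taken directly from the data frame returned by `AIC(modelPoisson, modelBinomial, modelTweedie)`.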

#### Compare Rounding Covariates via AIC Model Comparison

In [18]:
modelNoRound <- glmmTMB(
  absError ~ chartType_num * anchor_num * alignment_num +
    (1 + anchor_num + alignment_num + chartType_num | userID),
  data = data,
  family = nbinom2(link = "log")
)

modelRound5 <- glmmTMB(
  absError ~ chartType_num * anchor_num * alignment_num + distToNearest5 +
    (1 + anchor_num + alignment_num + chartType_num + distToNearest5 | userID),
  data = data,
  family = nbinom2(link = "log")
)

modelRound10 <- glmmTMB(
  absError ~ chartType_num * anchor_num * alignment_num + distToNearest10 +
    (1 + anchor_num + alignment_num + chartType_num + distToNearest10 | userID),
  data = data,
  family = nbinom2(link = "log")
)

AIC(modelNoRound, modelRound5, modelRound10)

“Model convergence problem; singular convergence (7). See vignette('troubleshooting'), help('diagnose')”


| Model | df | AIC |
| :----: | :------: | :-------: |
| modelNoRound | 19 | 23202.92 |
| modelRound5 | 25 | 23205.67 |
| modelRound10 | 25 | 23198.32 |


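The model with the distance-to-nearest-10 covariate (`modelRound10`) has the lowest AIC (23198.32, versus 23202.92 with no rounding covariate) and is used for the remaining absolute error models. This cell also raised a singular-convergence warning, so the comparison should be read with some caution.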
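
Because the three AIC values here are close, one optional way to express the comparison is with Akaike weights, which rescale the AIC differences to relative support that sums to 1. This is a sketch that was not part of the original analysis; the AIC values are copied from the output above and the variable names are hypothetical.

```r
# AIC values copied from the rounding-covariate comparison above
aicRound <- c(modelNoRound = 23202.92,
              modelRound5  = 23205.67,
              modelRound10 = 23198.32)

# Akaike weights: exp(-delta / 2), normalized to sum to 1
delta <- aicRound - min(aicRound)
relLik <- exp(-delta / 2)
akaikeWeights <- relLik / sum(relLik)

round(akaikeWeights, 3)
```

The weights only compare the models in this candidate set and do not address whether any of the fits converged cleanly.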

#### Summary of the Absolute Error Model

In [25]:
summary(modelRound10)

 Family: nbinom2  ( log )
Formula:          
absError ~ chartType_num * anchor_num * alignment_num + distToNearest10 +  
    (1 + anchor_num + alignment_num + chartType_num + distToNearest10 |  
        userID)
Data: data

      AIC       BIC    logLik -2*log(L)  df.resid 
  23198.3   23364.1  -11574.2   23148.3      5587 

Random effects:

Conditional model:
 Groups Name            Variance Std.Dev. Corr                    
 userID (Intercept)     0.498816 0.70627                          
        anchor_num      0.137221 0.37043  -0.93                   
        alignment_num   0.015398 0.12409  -0.66  0.50             
        chartType_num   0.018665 0.13662   0.18 -0.20 -0.36       
        distToNearest10 0.002904 0.05389  -0.14  0.20 -0.32 -0.01 
Number of obs: 5612, groups:  userID, 60

Dispersion parameter for nbinom2 family (): 1.91 

Conditional model:
                                        Estimate Std. Error z value Pr(>|z|)
(Intercept)                             0.23748

Main-effect results from the GLMM summary:

| Factor | Estimate | $p$ value | Significant |
| :----: | :------: | :-------: | :---------: |
| Chart Type | $\hat{\beta}=0.036388$ | $p=0.2935$ | No |
| Anchoring | $\hat{\beta}=0.656509$ | $p< 2e-16$ | Yes |
| Alignment | $\hat{\beta}=0.361369$ | $p< 2e-16$ | Yes |
| Rounding by 10s | $\hat{\beta}=0.023832$ | $p=0.0350$ | Yes |

Interaction results from the GLMM summary:

| Interaction | Estimate | $p$ value | Significant |
| :----: | :------: | :-------: | :---------: |
| Chart Type $\times$ Anchoring | $\hat{\beta}=0.018127$ | $p=0.5396$ | No |
| Chart Type $\times$ Alignment | $\hat{\beta}=0.004644$ | $p=0.8732$ | No |
| Anchoring $\times$ Alignment | $\hat{\beta}=-0.173381$ | $p=9.48e-09$ | Yes |
| Three-Way Interaction | $\hat{\beta}=0.022538$ | $p=0.4365$ | No |
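
Since the model uses a log link, the estimates above are on the log scale. The sketch below converts the main effects to multiplicative changes in expected absolute error. It assumes the -1/+1 coding defined when loading the data, so switching a factor's level moves its predictor by 2 units; the estimates are copied from the table above and the variable names are hypothetical.

```r
# Main-effect estimates copied from the table above (log scale)
beta <- c(chartType = 0.036388,
          anchoring = 0.656509,
          alignment = 0.361369)

# Factors are coded -1 / +1, so a level switch is a 2-unit change
rateRatio <- exp(2 * beta)

# distToNearest10 is not -1 / +1 coded; this is the ratio per 1-unit increase
rateRatioRound10 <- exp(0.023832)

round(rateRatio, 2)
round(rateRatioRound10, 3)
```

Because the model includes interactions, each ratio describes the effect at the midpoint (0) of the other -1/+1 coded factors, with random effects held fixed. Under that reading, the anchoring ratio compares non-anchor trials (+1) with anchor trials (-1).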

#### Build Distance Model

In [43]:
modelDistance <- glmmTMB(
  absError ~ chartType_num * anchorDistance * alignmentDistance +
    (1 + chartType_num + anchorDistance + alignmentDistance | userID),
  data = data,
  family = nbinom2(link = "log")
)

In [44]:
summary(modelDistance)

 Family: nbinom2  ( log )
Formula:          
absError ~ chartType_num * anchorDistance * alignmentDistance +  
    (1 + chartType_num + anchorDistance + alignmentDistance |          userID)
Data: data

      AIC       BIC    logLik -2*log(L)  df.resid 
  23077.3   23203.3  -11519.7   23039.3      5593 

Random effects:

Conditional model:
 Groups Name              Variance Std.Dev. Corr              
 userID (Intercept)       0.75736  0.8703                     
        chartType_num     0.01891  0.1375    0.24             
        anchorDistance    0.07442  0.2728   -0.94 -0.21       
        alignmentDistance 0.02654  0.1629   -0.71 -0.33  0.54 
Number of obs: 5612, groups:  userID, 60

Dispersion parameter for nbinom2 family (): 2.01 

Conditional model:
                                               Estimate Std. Error z value
(Intercept)                                    -0.30019    0.12753  -2.354
chartType_num                                  -0.09497    0.06040  -1.572
anchorD

### Build Response Time Models

In [26]:
modelResponseTime <- glmmTMB(
  responseTime ~ chartType_num * anchor_num * alignment_num +
    (1 + anchor_num + alignment_num + chartType_num | userID),
  data = data,
  family = Gamma(link = "log")
)

#### Summary of the Response Time Model

In [27]:
summary(modelResponseTime)

 Family: Gamma  ( log )
Formula:          
responseTime ~ chartType_num * anchor_num * alignment_num + (1 +  
    anchor_num + alignment_num + chartType_num | userID)
Data: data

      AIC       BIC    logLik -2*log(L)  df.resid 
  27055.1   27181.1  -13508.5   27017.1      5593 

Random effects:

Conditional model:
 Groups Name          Variance  Std.Dev. Corr              
 userID (Intercept)   0.1248288 0.35331                    
        anchor_num    0.0019921 0.04463   0.12             
        alignment_num 0.0007469 0.02733   0.31 -0.44       
        chartType_num 0.0033118 0.05755   0.16  0.35  0.04 
Number of obs: 5612, groups:  userID, 60

Dispersion estimate for Gamma family (sigma^2): 0.147 

Conditional model:
                                         Estimate Std. Error z value Pr(>|z|)
(Intercept)                             1.8728939  0.0463525   40.41  < 2e-16
chartType_num                           0.0008645  0.0110938    0.08  0.93788
anchor_num                     

Main-effect results from the GLMM summary:

| Factor | Estimate | $p$ value | Significant |
| :----: | :------: | :-------: | :---------: |
| Chart Type | $\hat{\beta}=0.0008645$ | $p=0.93788$ | No |
| Anchoring | $\hat{\beta}=0.1137857$ | $p< 2e-16$ | Yes |
| Alignment | $\hat{\beta}=0.0447381$ | $p< 2e-16$ | Yes |

Interaction results from the GLMM summary:

| Interaction | Estimate | $p$ value | Significant |
| :----: | :------: | :-------: | :---------: |
| Chart Type $\times$ Anchoring | $\hat{\beta}=0.0022411$ | $p=0.78566$ | No |
| Chart Type $\times$ Alignment | $\hat{\beta}=0.0076082$ | $p=0.35589$ | No |
| Anchoring $\times$ Alignment | $\hat{\beta}=-0.0234473$ | $p=0.00448$ | Yes |
| Three-Way Interaction | $\hat{\beta}=-0.0107225$ | $p=0.19339$ | No |
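
To see how the Anchoring $\times$ Alignment interaction plays out on the response scale, the sketch below back-transforms the fixed effects into predicted mean response times for the four anchoring/alignment cells. It is an illustration under stated assumptions: coefficients are copied from the summary and tables above, `chartType_num` is set to 0 (the midpoint of its -1/+1 coding, so all chart-type terms drop out), and random effects are set to zero. Predictions are in whatever units `responseTime` is recorded in.

```r
# Fixed-effect estimates copied from the response time model (log scale)
b <- c(intercept         = 1.8728939,
       anchor            = 0.1137857,
       alignment         = 0.0447381,
       anchorByAlignment = -0.0234473)

# All four combinations of the -1 / +1 coded factors
cells <- expand.grid(anchor_num = c(-1, 1), alignment_num = c(-1, 1))

# Linear predictor with chartType_num = 0, then inverse of the log link
cells$logMean <- b[["intercept"]] +
  b[["anchor"]] * cells$anchor_num +
  b[["alignment"]] * cells$alignment_num +
  b[["anchorByAlignment"]] * cells$anchor_num * cells$alignment_num
cells$predicted <- exp(cells$logMean)

cells
```

The negative interaction estimate means the combined effect of both factors being at +1 is somewhat smaller than the product of their separate multiplicative effects.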
