# Defacing pre-registration - Statistical analysis in R

In this notebook, we vary the dataset parameters (i.e., the number of subjects, the number of raters, and the number of images rated by each rater) to determine the minimum number of ratings needed to detect a significant effect.
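As a rough illustration of the idea, the sketch below estimates how often an effect is detected for a given number of rated subjects by repeating a simulation many times. It is a deliberate simplification and not the analysis used in this notebook: it uses a paired t-test from base R instead of the repeated-measures ANOVA, a single rater, and a hypothetical helper `estimate_power()` whose name and arguments are assumptions made for this example.

```r
# Hypothetical helper (not part of simulate_data.R): fraction of simulated
# datasets in which a paired t-test finds p < alpha.
estimate_power <- function(n, bias = 10, perc_biased = 40, mu = 20, sigma = 10,
                           n_sim = 500, alpha = 0.05) {
  p_values <- replicate(n_sim, {
    # Ratings of the defaced images: no bias.
    defaced <- rnorm(n, mean = mu, sd = sigma)
    # Ratings of the original images: a random subset is shifted by `bias`.
    shift <- bias * rbinom(n, size = 1, prob = perc_biased / 100)
    original <- rnorm(n, mean = mu, sd = sigma) + shift
    t.test(original, defaced, paired = TRUE)$p.value
  })
  mean(p_values < alpha)
}

set.seed(42)
power_small <- estimate_power(5)    # very few rated subjects
power_large <- estimate_power(100)  # many rated subjects
```

Scanning `n` over a grid and keeping the smallest value whose estimated detection rate exceeds a chosen target is one way to turn such a simulation into a minimum sample size.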

## Function to simulate data with missing values

In [1]:
source("simulate_data.R")
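The contents of `simulate_data.R` are not shown here. For orientation only, the following is a minimal sketch of what a simulator with the same argument order could look like. Everything in it is an assumption: the function name `simulate_ratings_sketch()`, the long-format layout, the choice to add the bias to the ratings of the non-defaced images, and the way missing values are introduced (each rater rates only `n_rated` randomly chosen subjects). The actual `simulate_normal_data()` may differ on every one of these points.

```r
# Hypothetical stand-in for simulate_normal_data(); the real function lives in
# simulate_data.R and may differ in both signature and behaviour.
simulate_ratings_sketch <- function(n_rated, n_sub, n_rater, perc_biased,
                                    mean = 20, sd = 10, bias = 10) {
  # Long format: one row per subject x rater x condition.
  df <- expand.grid(sub = seq_len(n_sub),
                    rater = seq_len(n_rater),
                    defaced = c("original", "defaced"))
  df$ratings <- rnorm(nrow(df), mean = mean, sd = sd)

  for (r in seq_len(n_rater)) {
    # Assumption: a fixed percentage of each rater's ratings of the
    # original (non-defaced) images is shifted by `bias`.
    n_biased <- round(n_sub * perc_biased[r] / 100)
    biased_subs <- sample(n_sub, n_biased)
    hit <- df$rater == r & df$defaced == "original" & df$sub %in% biased_subs
    df$ratings[hit] <- df$ratings[hit] + bias

    # Assumption: each rater rates only n_rated randomly chosen subjects;
    # the remaining ratings are missing.
    rated_subs <- sample(n_sub, n_rated)
    df$ratings[df$rater == r & !(df$sub %in% rated_subs)] <- NA
  }

  df$sub <- factor(df$sub)
  df$rater <- factor(df$rater)
  df
}

set.seed(1)
sketch_df <- simulate_ratings_sketch(400, 580, 4, c(20, 40, 50, 60))
```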

## Starting model

In [7]:
library(rstatix)  # loaded once, outside the loop

n_sub <- 580   # number of subjects available in the dataset
n_drop <- 100  # increment in the number of missing subjects per iteration
n_rater <- 4   # number of raters
rating_mean <- 20  # renamed to avoid masking base::mean
rating_sd <- 10    # renamed to avoid masking stats::sd
# Percentage of biased ratings for each rater
perc_biased <- c(20, 40, 50, 60)
bias <- 10

for (j in seq(0, n_sub, by = n_drop)) {
    print(sprintf("_______________%.02f missing values__________", j*100/n_sub))

    df <- simulate_normal_data(n_sub-j, n_sub, n_rater, perc_biased, mean=rating_mean, sd=rating_sd, bias=bias)
    df$ratings <- as.numeric(df$ratings)

    # Two-way repeated-measures ANOVA with defaced and rater as within-subject factors
    res.aov <- suppressWarnings(anova_test(data = df, dv = ratings, wid = sub, within = c(defaced, rater)))
    print(get_anova_table(res.aov))
}

[1] "_______________0.00 missing values__________"
ANOVA Table (type III tests)

         Effect  DFn     DFd        F         p p<.05   ges
1       defaced 1.00  579.00 1874.932 1.02e-183     * 0.039
2         rater 3.00 1737.00    8.918  7.29e-06     * 0.011
3 defaced:rater 2.95 1709.79   74.979  6.20e-45     * 0.005
[1] "_______________17.24 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd       F        p p<.05   ges
1       defaced   1 263 922.270 5.79e-88     * 0.041
2         rater   3 789   4.164 6.00e-03     * 0.011
3 defaced:rater   3 789  26.585 2.18e-16     * 0.004
[1] "_______________34.48 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd       F        p p<.05   ges
1       defaced   1  90 294.151 4.15e-30     * 0.040
2         rater   3 270   2.204 8.80e-02       0.016
3 defaced:rater   3 270  12.541 1.06e-07     * 0.006
[1] "_______________51.72 missing values__________"
ANOVA Table (type III tests)

         

ERROR: Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): 0 (non-NA) cases
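The residual degrees of freedom for `defaced` fall from 579 to 263 and then to 90 as the missing fraction grows, which suggests that subjects with any missing rating are excluded before the model is fitted. If that reading is right, the error above would occur once no subject has a complete set of ratings. This is an interpretation of the output, not something the notebook states. The sketch below counts the subjects that would survive such an exclusion on a small toy dataset; `count_complete_subjects()` is a hypothetical helper and the ratings are made-up values for illustration.

```r
# Toy long-format data: 3 subjects x 2 raters x 2 conditions (12 rows).
toy <- expand.grid(sub = 1:3, rater = 1:2, defaced = c("original", "defaced"))
toy$ratings <- c(21, 18, 25, 19, 22, 24, 17, 20, 23, 26, 15, 28)

# Remove a single rating for subject 2.
toy$ratings[toy$sub == 2 & toy$rater == 1 & toy$defaced == "defaced"] <- NA

# Hypothetical helper: number of subjects with no missing rating in any cell.
count_complete_subjects <- function(df) {
  has_missing <- tapply(is.na(df$ratings), df$sub, any)
  sum(!has_missing)
}

n_complete <- count_complete_subjects(toy)
```

Running such a check on `df` before calling `anova_test` would show how many subjects actually enter the ANOVA at each level of missingness.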


## Only subset of full data

In [9]:
library(rstatix)  # loaded once, outside the loop

n_sub <- 130   # number of subjects available in the dataset
n_drop <- 30   # increment in the number of missing subjects per iteration
n_rater <- 4   # number of raters
rating_mean <- 20  # renamed to avoid masking base::mean
rating_sd <- 10    # renamed to avoid masking stats::sd
# Percentage of biased ratings for each rater
perc_biased <- c(20, 40, 50, 60)
bias <- 10

for (j in seq(0, n_sub, by = n_drop)) {
    print(sprintf("_______________%.02f missing values__________", j*100/n_sub))

    df <- simulate_normal_data(n_sub-j, n_sub, n_rater, perc_biased, mean=rating_mean, sd=rating_sd, bias=bias)
    df$ratings <- as.numeric(df$ratings)

    # Two-way repeated-measures ANOVA with defaced and rater as within-subject factors
    res.aov <- suppressWarnings(anova_test(data = df, dv = ratings, wid = sub, within = c(defaced, rater)))
    print(get_anova_table(res.aov))
}

[1] "_______________0.00 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd       F        p p<.05   ges
1       defaced   1 129 312.525 2.84e-36     * 0.031
2         rater   2 258   4.884 8.00e-03     * 0.024
3 defaced:rater   2 258  11.430 1.75e-05     * 0.004
[1] "_______________23.08 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd       F        p p<.05   ges
1       defaced   1  58 165.259 1.27e-18     * 0.039
2         rater   2 116  10.007 9.79e-05     * 0.095
3 defaced:rater   2 116   5.011 8.00e-03     * 0.004
[1] "_______________46.15 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd      F        p p<.05   ges
1       defaced   1  24 48.456 3.38e-07     * 0.053
2         rater   2  48 15.314 7.18e-06     * 0.259
3 defaced:rater   2  48  2.794 7.10e-02       0.005
[1] "_______________69.23 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd        F     p p<

## Vary number of raters

In [12]:
library(rstatix)  # loaded once, outside the loop

n_sub <- 580   # number of subjects available in the dataset
n_drop <- 30   # increment in the number of missing subjects per iteration
n_rater <- 3   # number of raters
rating_mean <- 20  # renamed to avoid masking base::mean
rating_sd <- 10    # renamed to avoid masking stats::sd
# Percentage of biased ratings for each rater
perc_biased <- c(20, 40, 50, 60)
bias <- 10

for (j in seq(0, n_sub, by = n_drop)) {
    print(sprintf("_______________%.02f missing values__________", j*100/n_sub))

    df <- simulate_normal_data(n_sub-j, n_sub, n_rater, perc_biased, mean=rating_mean, sd=rating_sd, bias=bias)
    df$ratings <- as.numeric(df$ratings)

    # Two-way repeated-measures ANOVA with defaced and rater as within-subject factors
    res.aov <- suppressWarnings(anova_test(data = df, dv = ratings, wid = sub, within = c(defaced, rater)))
    print(get_anova_table(res.aov))
}

[1] "_______________0.00 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd       F        p p<.05   ges
1       defaced   1 129 312.525 2.84e-36     * 0.031
2         rater   2 258   4.884 8.00e-03     * 0.024
3 defaced:rater   2 258  11.430 1.75e-05     * 0.004
[1] "_______________23.08 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd       F        p p<.05   ges
1       defaced   1  58 165.259 1.27e-18     * 0.039
2         rater   2 116  10.007 9.79e-05     * 0.095
3 defaced:rater   2 116   5.011 8.00e-03     * 0.004
[1] "_______________46.15 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd      F        p p<.05   ges
1       defaced   1  24 48.456 3.38e-07     * 0.053
2         rater   2  48 15.314 7.18e-06     * 0.259
3 defaced:rater   2  48  2.794 7.10e-02       0.005
[1] "_______________69.23 missing values__________"
ANOVA Table (type III tests)

         Effect DFn DFd        F     p p<