
# Generalizability Theory in R 
[article walkthrough](https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1337&context=pare)

Citation: Huebner, A. & Lucht, M. (2019). Generalizability Theory in R. Practical Assessment, Research & Evaluation, 24(5).
Available online: http://pareonline.net/getvn.asp?v=24&n=5 

## Facets
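A facet is a set of similar conditions of measurement that may contribute variability to the scores. For example, if a sample of people is administered a set of items that are graded by different raters on multiple occasions, then items, raters, and occasions would all be considered facets in the study.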

## Designs
Facets are crossed or nested. 

*   In a **crossed** design, all conditions of one facet are observed in all conditions of every other facet. For example, in a crossed one-facet design with items as the only facet, every person is measured on each item; this is referred to as a *p x i* design.

*   In a **nested** design, the conditions of one facet are observed only within particular conditions of another facet. For example, if the single-facet scenario described above were modified so that each person was administered a unique set of items (rather than all persons receiving the same items), then items would be nested in persons. This design is referred to as an *i:p* design.

*   An example of a nested design with two facets is the *p x (i:o)* design, in which all persons answer the same items, but the items
are different on each occasion.
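
One way to check which design a data set follows is to cross-tabulate its facets. The sketch below uses small made-up layouts; the objects `crossed` and `nested` and the helper `is_crossed` are illustrative names, not taken from the article. In a crossed *p x i* layout every person-item cell is observed, whereas in an *i:p* layout each item appears under exactly one person.

```r
# Crossed p x i layout: 3 persons each answer the same 4 items.
crossed <- expand.grid(Item = factor(1:4), Person = factor(1:3))

# Nested i:p layout: each person gets a unique set of 2 items.
nested <- data.frame(
  Person = factor(rep(1:3, each = 2)),
  Item   = factor(1:6)
)

# Hypothetical helper: TRUE when every combination of the two facets is observed.
is_crossed <- function(a, b) all(table(a, b) > 0)

is_crossed(crossed$Person, crossed$Item)  # every person-item cell is filled
is_crossed(nested$Person, nested$Item)    # most cells are empty

# In the nested layout each item belongs to exactly one person.
all(colSums(table(nested$Person, nested$Item) > 0) == 1)
```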


## G studies
G studies ordinarily involve estimating the variance of the measurements, which is decomposed into variance components. This information allows the researcher to identify the sources contributing the greatest variability to the measurements.
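
For the one-facet crossed *p x i* design, this decomposition can be written as shown here. The symbols $\sigma^2_p$, $\sigma^2_i$, and $\sigma^2_{pi,e}$ stand for the person, item, and residual (person-by-item interaction confounded with error) variance components; the notation follows common G theory usage and is an assumption that may differ slightly from the article's tables.

$$\sigma^2(X_{pi}) = \sigma^2_p + \sigma^2_i + \sigma^2_{pi,e}$$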

## D studies 
The D study is used to identify the optimal number of conditions of each facet in order to maximize reliability. The D study variance components can be derived from the G study variance components listed in Table 1 of the article by dividing each G study component by the proposed number of conditions of the facet (or facets) it involves.

The D study also involves the computation of two reliability coefficients:

1.   generalizability coefficient - $E\rho^2$
2.   index of dependability - $\Phi$
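
For the one-facet *p x i* design, the two coefficients can be written in terms of the variance components as follows. Here $n_i'$ is the number of items proposed for the D study, $\sigma^2_\delta$ is the relative error variance, and $\sigma^2_\Delta$ is the absolute error variance. These symbols follow common G theory notation and are assumptions about notation rather than quotations from the article.

$$E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\delta}, \qquad \sigma^2_\delta = \frac{\sigma^2_{pi,e}}{n_i'}$$

$$\Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\Delta}, \qquad \sigma^2_\Delta = \frac{\sigma^2_i}{n_i'} + \frac{\sigma^2_{pi,e}}{n_i'}$$

The only difference between the two is that $\Phi$ also counts item variance as error, so it can never exceed $E\rho^2$.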







# Get packages

In [3]:
# install.packages('gtheory')
library(gtheory)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

also installing the dependencies ‘minqa’, ‘nloptr’, ‘RcppEigen’, ‘lme4’


Loading required package: lme4

Loading required package: Matrix



# Read in Data 

In [6]:
#Read in the pi_dat data:
Person <- as.factor(rep(1:6,each = 4))
Item <- as.factor(rep(1:4,times = 6))
Score <- c(9,9,7,4,9,8,4,6,8,8,6,2,
 9,8,6,3,10,9,8,7,6,4,5,1)
pi_dat <- data.frame(Person,Item,Score)

# Read in the pio_cross_dat data:
Person <- as.factor(rep(1:6,each = 8))
Occasion <- as.factor(rep(1:2,each = 4,times = 6))
Item <- as.factor(rep(1:4,times = 12))
Score <- c(9,9,7,4,9,8,5,5,9,8,4,6,
 6,5,3,3,8,8,6,2,8,7,3,2,
 9,8,6,3,9,6,6,2,10,9,8,7,
 8,8,9,7,6,4,5,1,3,2,3,2)
pio_cross_dat <- data.frame(Person,Item,Score,Occasion)

# Read in the pio_nest_dat data:
Person <- as.factor(rep(1:6,each = 4))
Occasion <- as.factor(rep(1:2,each = 2,times = 6))
Item <- as.factor(rep(1:4,times = 6))
Score <- c(9,9,5,5,9,8,3,3,8,8,3,2,
 9,8,6,2,10,9,9,7,6,4,3,2)
pio_nest_dat <- data.frame(Person,Item,Score,Occasion) 

# Single Facet Design
First, an ANOVA is conducted on the data to obtain the degrees of freedom (DF), sums of squares (SS), and mean squares (MS), shown in columns two through four of Table 3 of the article, where α denotes a generic effect.

In [7]:
summary(aov(Score~Person+Item, data = pi_dat))

            Df Sum Sq Mean Sq F value   Pr(>F)    
Person       5  44.50   8.900   6.965  0.00151 ** 
Item         3  76.33  25.444  19.913 1.73e-05 ***
Residuals   15  19.17   1.278                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
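
For a balanced *p x i* design, the variance components can be recovered from these mean squares by solving the expected mean square equations. The sketch below copies the rounded mean squares from the ANOVA table above; the variable names are illustrative, and the results agree with a model-based fit only up to rounding.

```r
# Mean squares copied from the ANOVA table above.
ms_person   <- 8.900
ms_item     <- 25.444
ms_residual <- 1.278

n_p <- 6  # number of persons
n_i <- 4  # number of items

# Expected-mean-square solutions for the balanced p x i design.
var_residual <- ms_residual
var_person   <- (ms_person - ms_residual) / n_i
var_item     <- (ms_item - ms_residual) / n_p

components <- c(Person = var_person, Item = var_item, Residual = var_residual)
round(components, 3)
round(100 * components / sum(components), 1)  # percent of total variance
```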

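The gtheory package is essentially a wrapper for lme4, so models are written in lme4's formula syntax with one random-intercept term per source of variance. For the G study, the model must be specified for the formula argument; for the one-facet *p x i* crossed design, the formula is written as follows: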

In [9]:
formula1 <- Score ~ (1|Person)+(1|Item)

This formula is passed as an argument to the gstudy() function to obtain the variance component estimates and the percentage of total variation for each term. These values correspond to the fifth and sixth columns of Table 3 of the article and may be obtained by typing g1$components.

In [13]:
g1 <- gstudy(data = pi_dat, formula1)
g1$components

| source | var | percent | n |
|---|---|---|---|
| Person | 1.905558 | 26.4 | 1 |
| Item | 4.027814 | 55.9 | 1 |
| Residual | 1.277775 | 17.7 | 1 |
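
Because gtheory wraps lme4, similar estimates can be obtained by fitting the random-effects model directly. The sketch below assumes lme4's default REML fit and rebuilds the data so that it runs on its own; the object names `fit` and `vc` are illustrative.

```r
library(lme4)

# Rebuild the one-facet data so the block runs on its own.
pi_dat <- data.frame(
  Person = as.factor(rep(1:6, each = 4)),
  Item   = as.factor(rep(1:4, times = 6)),
  Score  = c(9,9,7,4, 9,8,4,6, 8,8,6,2, 9,8,6,3, 10,9,8,7, 6,4,5,1)
)

# Fit the same random-effects model directly with lme4 (REML is the default).
fit <- lmer(Score ~ (1 | Person) + (1 | Item), data = pi_dat)

# Extract variance components: 'grp' is the source, 'vcov' the variance.
vc <- as.data.frame(VarCorr(fit))[, c("grp", "vcov")]
vc$percent <- round(100 * vc$vcov / sum(vc$vcov), 1)
vc
```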


A D study can be performed by passing the G study object to the dstudy() function, together with the names of the columns that hold the object of measurement and the scores.

The object of measurement is the person or group being measured, and the score is the numeric value of the measured variable. For this example, the results of the dstudy() function are stored in the variable d1.


In [20]:
d1 <- dstudy(g1, colname.objects = "Person", colname.scores = "Score", data = pi_dat)

d1$components

d1$var.universe # universe score variance
d1$var.error.rel # relative error variance
d1$generalizability # g coefficient
d1$dependability # dependability coefficient

| source | var | percent | n |
|---|---|---|---|
| Person | 1.9055579 | 59.0 | 1 |
| Item | 1.0069535 | 31.2 | 4 |
| Residual | 0.3194438 | 9.9 | 4 |
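
As a hand check, the two coefficients can be computed from the D study components above for 4 items. The generalizability coefficient divides the person variance by the person variance plus the relative error variance (the residual component), while the dependability coefficient uses the absolute error variance (item plus residual components). The values below use the rounded components, so the last digit may differ slightly from what `d1$generalizability` and `d1$dependability` return.

$$E\rho^2 = \frac{1.906}{1.906 + 0.319} \approx 0.86$$

$$\Phi = \frac{1.906}{1.906 + 1.007 + 0.319} \approx 0.59$$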


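To get estimates for different numbers of conditions, the G study components can be divided by each proposed value of $n_i$ directly.

Note that as the value of $n_i$ increases, the error variances decrease, so the generalizability and dependability coefficients increase.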

In [22]:
# number of conditions
n_i <- c(1, 2, 5, 10)

# rows of g1$components: 1 = Person, 2 = Item, 3 = Residual; column 2 = var

# relative error variance
rel_err_var <- g1$components[3, 2] / n_i

# absolute error variance
abs_err_var <- g1$components[2, 2] / n_i + g1$components[3, 2] / n_i

# calculate generalizability coefficient
gen_coef <- g1$components[1, 2] / (g1$components[1, 2] + rel_err_var)

# calculate dependability coefficient
dep_coef <- g1$components[1, 2] / (g1$components[1, 2] + abs_err_var)

round(rel_err_var, 2)
round(abs_err_var, 2)
round(gen_coef, 2)
round(dep_coef, 2)
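
The same calculation can be turned around to ask how many items are needed to reach a target coefficient. The sketch below hard-codes the G study components reported by `g1$components` above so that it runs on its own; `min_items` is a hypothetical helper written for illustration and is not part of the gtheory package.

```r
# G study variance components for the p x i design (from g1$components above).
var_person   <- 1.905558
var_item     <- 4.027814
var_residual <- 1.277775

# Hypothetical helper: smallest n_i in 'candidates' whose coefficient meets 'target'.
# absolute = FALSE uses relative error (generalizability);
# absolute = TRUE adds item variance to the error (dependability).
min_items <- function(target, absolute = FALSE, candidates = 1:100) {
  error_var <- var_residual / candidates
  if (absolute) error_var <- error_var + var_item / candidates
  reliability <- var_person / (var_person + error_var)
  hit <- which(reliability >= target)
  if (length(hit) == 0) NA_integer_ else candidates[hit[1]]
}

min_items(0.80)                   # items needed for generalizability >= 0.80
min_items(0.80, absolute = TRUE)  # items needed for dependability >= 0.80
```

With these components, a generalizability of 0.80 is reached with 3 items, whereas a dependability of 0.80 requires 12, because the large item variance counts as error only in the absolute case.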

# Two Facet Design

The variance in a two-facet *p x i x o* design can be attributed to seven variance components, as shown in Table 1 of the article. We use the pio_cross_dat data to illustrate.
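
Written out, the seven components are the three main effects, the three two-way interactions, and the residual (the three-way interaction confounded with error). The notation below follows common G theory usage and is an assumption that may differ slightly from the article's Table 1.

$$\sigma^2(X_{pio}) = \sigma^2_p + \sigma^2_i + \sigma^2_o + \sigma^2_{pi} + \sigma^2_{po} + \sigma^2_{io} + \sigma^2_{pio,e}$$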