Skip to content

thmsrgrs/iDIFr

Repository files navigation

iDIFr

Intersectional Differential Item Functioning Analysis

R-CMD-check CRAN status

iDIFr is an R package for detecting Differential Item Functioning (DIF) using Logistic Regression, IRT Likelihood Ratio Tests, and model-based recursive partitioning (MOB) — with first-class support for intersectional group designs and built-in Intersectional Contrast Analysis (ICA).

Why iDIFr?

Most DIF packages focus on two-group comparisons along a single demographic dimension. iDIFr is built around the idea that test-takers belong to multiple groups simultaneously, and that DIF sometimes only appears at the intersection of those identities.

Key features:

  • Intersectional group support — define groups using ~ gender * nationality * age_band
  • Effect sizes as first-class outputs — results lead with Nagelkerke ΔR² and standardised chi, not just p-values
  • Three methods in one interface — LR, LRT, and MOB with consistent output
  • Built-in ICAica = TRUE classifies each item as amplified, pure intersection, obscured, or none by comparing single-variable and intersectional analyses
  • Transparent cell-size guidancecheck_groups() and merge_groups() help you manage sparse intersectional cells
  • Tidy outputtidy() returns a flat data frame for use with dplyr and ggplot2

Installation

# From CRAN
install.packages("iDIFr")

# Development version from GitHub
# install.packages("remotes")
remotes::install_github("thmsrgrs/iDIFr")

Quick start

library(iDIFr)

# 1. Check your group structure first
check_groups(my_data, group = ~ gender * nationality * age_band)

# 2. Run DIF analysis — method selection is required
result <- idifr(
  data   = my_data,
  items  = 1:20,
  group  = ~ gender * nationality * age_band,
  method = c("LR", "LRT")
)

# 3. Explore results
print(result)                       # Flagged items with effect sizes
summary(result)                     # Full breakdown by method + concordance
plot(result)                        # Effect size heatmap
plot(result, type = "concordance")  # Method agreement
tidy(result)                        # Flat data frame
tidy(result, table = "direction")   # Group-level direction table

Methods

Argument Method Effect size Best for
"LR" Logistic Regression Nagelkerke ΔR² General use, no IRT assumptions
"LRT" IRT Likelihood Ratio Test Standardised chi (df-scaled) IRT-based programmes
"MOB" Model-based recursive partitioning Standardised score difference Intersectional designs, exploratory

Intersectional Contrast Analysis (ICA)

Pass ica = TRUE to idifr() to run ICA automatically. After the main analysis, iDIFr runs one additional idifr() per demographic variable and classifies each item by comparing where it was flagged:

Classification Meaning
amplified Flagged in single-variable and intersectional runs
pure_intersection Flagged only in the intersectional run
obscured Flagged in a single-variable run but not intersectionally
none Not flagged anywhere
result <- idifr(
  data   = my_data,
  items  = 1:20,
  group  = ~ gender * nationality * age_band,
  method = "LR",
  ica    = TRUE
)

print(result)                  # ICA section printed automatically
tidy(result, table = "ica")    # Flat ICA classification table

Note: ICA runs N + 1 analyses without cross-analysis p-value correction. Interpret pure_intersection and obscured findings with caution in small samples.

Effect size thresholds

iDIFr requires both statistical significance (after p-value adjustment) and a meaningful effect size before flagging an item. This reduces false positives in large samples.

Method Metric Negligible Moderate Large
LR (uniform) Nagelkerke ΔR² < .035 .035–.070 ≥ .070
LR (non-uniform) MAPPD < .05 .05–.10 ≥ .10
LRT (uniform) Std. chi (df-scaled) < 0.10×√(df/2) 0.10–0.20×√(df/2) ≥ 0.20×√(df/2)
LRT (non-uniform) MAPPD < .05 .05–.10 ≥ .10
MOB Std. score difference < .35 .35–.70 ≥ .70

LRT thresholds are df-adjusted following Oshima et al. (1997) to maintain equivalent sensitivity across designs with different numbers of groups. The MOB threshold of 0.35 is intentionally conservative to avoid over-detection in multigroup designs.

Group management

# Inspect cell sizes before analysis
check_groups(my_data, group = ~ gender * nationality * age_band)

# Merge sparse cells
grp <- check_groups(my_data, group = ~ gender * nationality * age_band)
merged_data <- merge_groups(
  grp,
  nationality = list("Other" = c("DE", "FR", "ES"))
)

# Merge multiple variables in one call
merged_data <- merge_groups(
  grp,
  nationality = list("Other" = c("DE", "FR")),
  age_band    = list("18-30" = c("18-24", "25-30"))
)

# Exclude groups below a minimum size at run time
result <- idifr(
  my_data, 1:20,
  group            = ~ gender * nationality * age_band,
  method           = "LR",
  exclude_below_min = TRUE,
  min_cell_size    = 50
)

Simulating DIF data

simulate_dif() generates synthetic dichotomous item response data with known DIF structure, including intersection-only DIF for validating iDIFr on controlled data:

# Standard DIF
dat <- simulate_dif(n_persons = 1000, n_items = 20, dif_items = c(3, 7))

# DIF confined to a single intersectional cell
dat_ix <- simulate_dif(
  n_persons     = 2000,
  n_items       = 20,
  dif_items     = c(5, 12),
  dif_effect    = 1.5,
  dif_structure = "intersection",
  dif_group     = list(group = "G1", nationality = "UK", age_band = "Young"),
  demo_vars     = list(nationality = c("UK", "DE", "FR"),
                       age_band    = c("Young", "Old")),
  seed          = 42
)

# Mixed DIF — some items standard, some intersectional
dat_mixed <- simulate_dif(
  n_persons     = 2000,
  n_items       = 20,
  dif_items     = list(standard = c(3, 7), intersection = c(12, 15)),
  dif_effect    = 1.0,
  dif_structure = "mixed",
  dif_group     = list(group = "G1", nationality = "UK", age_band = "Young"),
  demo_vars     = list(nationality = c("UK", "DE", "FR"),
                       age_band    = c("Young", "Old")),
  seed          = 42
)

Citation

If you use iDIFr in published work, please cite:

Rogers, T. (2026). iDIFr: Intersectional Differential Item Functioning Analysis. R package version 1.0.1. 
https://CRAN.R-project.org/package=iDIFr

Contributing

Bug reports and feature requests are welcome via GitHub Issues.

About

No description, website, or topics provided.

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors