Python port of the R ARTool package.
PyARTool implements the Aligned Rank Transform (ART) for conducting nonparametric analyses of variance on factorial models. It faithfully translates the R ARTool package by Wobbrock, Findlater, Gergle, Higgins, Kay, and Elkin to Python, producing numerically identical results.
- Overview
- Installation
- Quick Start
- API Reference
- Supported Designs
- Formula Syntax
- Detailed Walkthrough
- P-Value Adjustment Methods
- R Parity & Validation
- Architecture & Implementation Notes
- Dependencies
- Example Scripts
- Citations
- License
The Aligned Rank Transform (ART) is a nonparametric technique that allows you to use standard ANOVA procedures on ranked data, while correctly handling main effects, interactions, and contrasts in factorial designs. It works by:
- Aligning the response variable to strip out effects not of interest for each term.
- Ranking the aligned responses.
- Running standard ANOVAs on the aligned-and-ranked data.
PyARTool automates this entire pipeline and additionally supports the ART-C procedure for post-hoc contrast tests (Elkin et al., 2021).
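The three steps above can be sketched for a balanced two-factor design in plain Python. This is an illustrative, dependency-free sketch and not PyARTool's implementation: the helper names (`group_means`, `average_ranks`, `align_and_rank`) and the toy data are assumptions made for the example. Each aligned response is the cell-mean residual plus the estimated effect of the term of interest.

```python
from collections import defaultdict
from statistics import mean


def group_means(rows, keys):
    """Mean response for each combination of the given factor columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["Y"])
    return {level: mean(values) for level, values in groups.items()}


def average_ranks(values):
    """Rank values 1..n, giving tied values their average rank."""
    # Ties use exact float equality here; a real implementation
    # would likely round the aligned responses first.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for i in order[start:end + 1]:
            ranks[i] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def align_and_rank(rows):
    """Aligned responses and their ranks for the terms A, B and A:B."""
    grand = mean(row["Y"] for row in rows)
    mean_a = group_means(rows, ["A"])
    mean_b = group_means(rows, ["B"])
    mean_ab = group_means(rows, ["A", "B"])
    aligned = {"A": [], "B": [], "A:B": []}
    for row in rows:
        a, b = row["A"], row["B"]
        residual = row["Y"] - mean_ab[(a, b)]
        effect_a = mean_a[(a,)] - grand
        effect_b = mean_b[(b,)] - grand
        effect_ab = mean_ab[(a, b)] - mean_a[(a,)] - mean_b[(b,)] + grand
        aligned["A"].append(residual + effect_a)
        aligned["B"].append(residual + effect_b)
        aligned["A:B"].append(residual + effect_ab)
    ranks = {term: average_ranks(values) for term, values in aligned.items()}
    return aligned, ranks


# Toy balanced 2x2 data with two observations per cell (made up for illustration).
rows = [
    {"A": a, "B": b, "Y": y}
    for a, b, y in [
        ("a1", "b1", 1.0), ("a1", "b1", 2.0),
        ("a1", "b2", 3.0), ("a1", "b2", 5.0),
        ("a2", "b1", 6.0), ("a2", "b1", 7.0),
        ("a2", "b2", 10.0), ("a2", "b2", 14.0),
    ]
]
aligned, ranks = align_and_rank(rows)
```

Each column of `ranks` would then be the response for one ANOVA, from which only the matching term's row is read.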
Use the Aligned Rank Transform when:
- Your data violates ANOVA assumptions (non-normality, heteroscedasticity).
- You have a factorial design (two or more factors) — ART handles interactions correctly, unlike simpler rank-based tests.
- You need post-hoc pairwise or interaction contrasts on nonparametric data.
For more background, see the ARTool project page.
Install with pip:

```bash
pip install pyartool
```

Or install from source:

```bash
git clone <this-repo>
cd PyARTool
# Create and activate a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate   # macOS/Linux
# .venv\Scripts\activate    # Windows
# Install in editable mode
pip install -e .
```

Requirements:

- Python >= 3.9
- numpy >= 1.22
- pandas >= 1.4
- scipy >= 1.8
- statsmodels >= 0.13
```python
from pyartool import art, anova_art, art_con, load_higgins1990_table5

# Load data
df = load_higgins1990_table5()

# Step 1: Apply the Aligned Rank Transform
m = art("DryMatter ~ Moisture * Fertilizer + (1|Tray)", data=df)

# Step 2: Run the nonparametric ANOVA
print(anova_art(m))
#              Term  Df  Df.res        F        Pr(>F)
# 0        Moisture   3     8.0   23.833  2.419913e-04
# 1      Fertilizer   3    24.0  122.402  1.110223e-14
# 2  Moisture:Fert.   9    24.0    5.118  6.466476e-04

# Step 3: Post-hoc contrasts
print(art_con(m, "Moisture"))
#   contrast  estimate    SE  df  t.ratio  p.value
# 0  m1 - m2   -23.083  4.12   8   -5.607   0.0023
# ...
```

```python
from pyartool import art

result = art(formula, data)
```

| Parameter | Type | Description |
|---|---|---|
| `formula` | `str` | R-style formula (see Formula Syntax). |
| `data` | `pd.DataFrame` | Data in long format. Factor columns should be `pd.Categorical` or string. |
Returns an ArtResult object containing:
| Attribute | Description |
|---|---|
| `result.formula` | Original formula string. |
| `result.data` | Original DataFrame. |
| `result.aligned` | DataFrame of aligned responses (one column per term). |
| `result.aligned_ranks` | DataFrame of ranks of aligned responses. |
| `result.residuals` | Residuals from the cell-means model. |
| `result.cell_means` | Cell means for every term. |
| `result.estimated_effects` | Estimated effects for every term. |
```python
from pyartool import anova_art

anova_table = anova_art(m)
```

| Parameter | Type | Description |
|---|---|---|
| `m` | `ArtResult` | Object returned by `art()`. |
Returns a pd.DataFrame with columns: Term, Df, Df.res, F, Pr(>F).
The model type is determined automatically by the formula:
| Formula pattern | Model | R equivalent |
|---|---|---|
| `Y ~ A * B` | OLS (fixed effects) | `lm()` |
| `Y ~ A * B + (1\|S)` | Mixed-effects (REML) | `lmer()` |
| `Y ~ A * B + Error(S)` | Repeated measures | `aov(Error())` |
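The dispatch in the table above can be pictured with a small pattern check. This is a hypothetical sketch, not PyARTool's actual parser: the function name `detect_model_type` and the returned labels are assumptions chosen for the example.

```python
import re


def detect_model_type(formula):
    """Classify a formula string as 'ols', 'mixed', or 'repeated'."""
    # Only the right-hand side can carry grouping or Error() terms.
    rhs = formula.split("~", 1)[1]
    if re.search(r"\(\s*1\s*\|\s*\w+\s*\)", rhs):
        return "mixed"      # random intercept such as (1|S)
    if re.search(r"\bError\s*\(", rhs):
        return "repeated"   # aov-style Error(S) term
    return "ols"            # fixed effects only


kinds = [
    detect_model_type("Y ~ A * B"),
    detect_model_type("Y ~ A * B + (1|S)"),
    detect_model_type("Y ~ A * B + Error(S)"),
]
```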
```python
from pyartool import summary_art

s = summary_art(m)
```

Returns an `ArtSummary` object with:

| Attribute | Description |
|---|---|
| `s.aligned_col_sums` | Dict of column sums of aligned responses (should all be ~0). |
| `s.aligned_anova_f_values` | Array of F values from ANOVAs on non-target aligned responses (should all be ~0). |
These diagnostics verify that the ART alignment procedure correctly stripped out effects not of interest. If values are not close to zero, the ART may not be appropriate for your data.
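One way to turn these diagnostics into a pass/fail check is a small tolerance test. The helper below is a hypothetical sketch and not part of PyARTool; the tolerance `1e-6` is an arbitrary assumption, and the sample inputs are made-up stand-ins for `s.aligned_col_sums` and `s.aligned_anova_f_values`.

```python
import math


def alignment_looks_valid(col_sums, f_values, tol=1e-6):
    """Return True when every column sum and F value is within tol of zero."""
    sums_ok = all(math.isclose(v, 0.0, abs_tol=tol) for v in col_sums.values())
    f_ok = all(math.isclose(f, 0.0, abs_tol=tol) for f in f_values)
    return sums_ok and f_ok


# Made-up values shaped like the ArtSummary attributes described above.
good = alignment_looks_valid({"A": 1e-13, "B": -2e-14, "A:B": 0.0}, [0.0, 3e-12])
bad = alignment_looks_valid({"A": 0.0, "B": 0.0, "A:B": 0.0}, [0.0, 4.7])
```

With a fitted model, the call would be `alignment_looks_valid(s.aligned_col_sums, s.aligned_anova_f_values)`.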
```python
from pyartool import art_con

# Options after `formula` are keyword-only; defaults are shown.
contrasts = art_con(
    m,
    formula,
    response="art",
    method="pairwise",
    interaction=False,
    adjust="tukey",
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `m` | `ArtResult` | required | Object returned by `art()`. |
| `formula` | `str` | required | Term to contrast: `"A"`, `"A:B"`, or `"A:B:C"`. |
| `response` | `str` | `"art"` | `"art"` (ranked) or `"aligned"` (unranked). |
| `method` | `str` | `"pairwise"` | Contrast method. |
| `interaction` | `bool` | `False` | If `True`, compute difference-of-difference contrasts. |
| `adjust` | `str` or `None` | `"tukey"` | P-value adjustment (see below). |
Returns a pd.DataFrame with columns: contrast, estimate, SE, df, t.ratio, p.value.
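To get a feel for how many rows a pairwise contrast table has, the pairs can be enumerated directly. This is an illustrative sketch only: `pairwise_labels` is a hypothetical helper, and the comma-separated label format for interaction cells is an assumption, not necessarily what `art_con()` prints.

```python
from itertools import combinations, product


def pairwise_labels(levels_by_factor):
    """Label every unordered pair of cells formed by crossing the factors."""
    cells = [",".join(combo) for combo in product(*levels_by_factor)]
    return [f"{left} - {right}" for left, right in combinations(cells, 2)]


moisture = ["m1", "m2", "m3", "m4"]
fertilizer = ["f1", "f2", "f3", "f4"]

# Four levels of one factor give 6 pairs; 16 crossed cells give 120 pairs.
main_pairs = pairwise_labels([moisture])
interaction_pairs = pairwise_labels([moisture, fertilizer])
```

The number of pairs grows quadratically with the number of cells, which is why the choice of p-value adjustment matters for interaction contrasts.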
For advanced users who need the underlying statsmodels fit objects:
```python
from pyartool import artlm, artlm_con

# Get the fitted model for a specific ART term
lm_result = artlm(m, "A:B")

# Get the fitted model for an ART-C contrast term
lm_con_result = artlm_con(m, "A:B")
```

PyARTool bundles the same example datasets as the R package:
```python
from pyartool import (
    load_higgins1990_table1,  # 3x3 between-subjects
    load_higgins1990_table5,  # 4x4 split-plot (Moisture x Fertilizer)
    load_elkin_ab,            # 2x2 within-subjects
    load_elkin_abc,           # 2x2x2 within-subjects
    load_higgins_abc,         # 2x2x2 mixed design
)

df = load_higgins1990_table5()
print(df.head())
#   Tray Moisture Fertilizer  DryMatter
# 0   t1       m1         f1        3.3
# 1   t1       m1         f2        4.3
# 2   t1       m1         f3        4.5
# 3   t1       m1         f4        5.8
# 4   t2       m1         f1        4.0
```

| Design | Formula Example | Model |
|---|---|---|
| Between-subjects factorial | `Y ~ A * B` | OLS (lm) |
| Split-plot / mixed-effects | `Y ~ A * B + (1\|Subject)` | Mixed (lmer via MixedLM) |
| Repeated measures (aov) | `Y ~ A * B + Error(Subject)` | RM-ANOVA (aov) |
| 2-factor | `Y ~ A * B` | Any of the above |
| 3-factor | `Y ~ A * B * C + (1\|S)` | Any of the above |
| N-factor | `Y ~ A * B * C * D + ...` | Any of the above |
PyARTool uses R-style formula strings. The parser supports all the same patterns as the R ARTool package:
```python
# Full factorial (A + B + A:B)
"Y ~ A * B"
# Three-way factorial (all main effects, 2-way, and 3-way interactions)
"Y ~ A * B * C"
# You can also spell out terms explicitly
"Y ~ A + B + A:B"
```

```python
# Random intercept for Subject: fits a mixed-effects model (lmer)
"Y ~ A * B + (1|Subject)"
```

```python
# Repeated measures: fits an aov() with Error()
"Y ~ A * B + Error(Subject)"
```

- The response variable (left of `~`) must be a single numeric column.
- All factor columns should be `pd.Categorical` or string type. Numeric columns used as factors will raise a warning.
- The formula must specify the full factorial: all lower-order terms must be present for any interaction term. PyARTool will raise an error if the design is not fully crossed.
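The full-factorial requirement in the notes above can be made concrete with a short sketch. It is not PyARTool's parser: `expand_factorial` and `missing_lower_order_terms` are hypothetical helpers that show how `A * B * C` expands and how a missing lower-order term could be detected.

```python
from itertools import combinations


def expand_factorial(factors):
    """Expand crossed factors into main effects and all interactions."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return terms


def missing_lower_order_terms(terms):
    """List lower-order terms implied by an interaction but not present."""
    present = {frozenset(term.split(":")) for term in terms}
    missing = set()
    for term in present:
        for order in range(1, len(term)):
            for sub in combinations(sorted(term), order):
                if frozenset(sub) not in present:
                    missing.add(":".join(sub))
    return sorted(missing)


three_way = expand_factorial(["A", "B", "C"])
gaps = missing_lower_order_terms(["A", "A:B"])  # B is implied by A:B but absent
```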
A simple 3x3 factorial design with no repeated measures:
```python
from pyartool import art, anova_art, load_higgins1990_table1

df = load_higgins1990_table1()
print(df.head())
#   Subject  Row  Column  Response
# 0      s1    1       1         9
# 1      s2    1       1         6
# ...

# Fit the ART model (no grouping term = OLS)
m = art("Response ~ Row * Column", data=df)

# Run ANOVA
print(anova_art(m))
#          Term  Df  Df.res       F        Pr(>F)
# 0         Row   2    27.0  29.993  1.383278e-07
# 1      Column   2    27.0  77.867  6.149827e-12
# 2  Row:Column   4    27.0   0.642  6.374203e-01
```

When you have a grouping factor (e.g., trays, subjects), include `(1|Group)` to fit a mixed-effects model:
```python
from pyartool import art, anova_art, summary_art, art_con
from pyartool import load_higgins1990_table5

df = load_higgins1990_table5()

# Moisture varies between trays; Fertilizer varies within trays
m = art("DryMatter ~ Moisture * Fertilizer + (1|Tray)", data=df)

# Check diagnostics
s = summary_art(m)
print("Aligned column sums:", s.aligned_col_sums)
# Should all be ~0

# ANOVA
print(anova_art(m))
#                   Term  Df  Df.res        F        Pr(>F)
# 0             Moisture   3     8.0   23.833  2.419913e-04
# 1           Fertilizer   3    24.0  122.402  1.110223e-14
# 2  Moisture:Fertilizer   9    24.0    5.118  6.466476e-04

# Post-hoc: pairwise contrasts on Moisture
# Default adjustment is Tukey HSD
print(art_con(m, "Moisture"))
#   contrast  estimate    SE  df  t.ratio  p.value
# 0  m1 - m2   -23.083  4.12   8   -5.607   0.0023
# 1  m1 - m3   -33.750  4.12   8   -8.198   0.0002
# ...

# Interaction contrasts with Holm adjustment
print(art_con(m, "Moisture:Fertilizer", adjust="holm"))
```

A 2x2x2 fully within-subjects design:
```python
from pyartool import art, anova_art, art_con, load_elkin_abc

df = load_elkin_abc()
m = art("Y ~ A * B * C + (1|S)", data=df)

# Full ANOVA table
print(anova_art(m))
#     Term  Df  Df.res        F        Pr(>F)
# 0      A   1    49.0  288.181  0.000000e+00
# 1      B   1    49.0   28.103  2.732842e-06
# 2      C   1    49.0   60.510  4.168039e-10
# 3    A:B   1    49.0   28.528  2.377711e-06
# 4    A:C   1    49.0   16.545  1.720573e-04
# 5    B:C   1    49.0   76.258  1.481193e-11
# 6  A:B:C   1    49.0   75.592  1.690836e-11

# Contrasts on the 3-way interaction
print(art_con(m, "A:B:C", adjust="holm"))

# Contrasts on a 2-way interaction (averaged over 3rd factor)
print(art_con(m, "A:B", adjust="holm"))

# Single-factor contrasts
print(art_con(m, "A"))  # Tukey by default

# Different adjustment methods
print(art_con(m, "B:C", adjust="bonferroni"))
```

If you prefer traditional repeated-measures ANOVA (via `aov`) instead of mixed-effects models, use `Error()`:
```python
from pyartool import art, anova_art, load_higgins_abc

df = load_higgins_abc()
m = art("Y ~ A * B * C + Error(Subject)", data=df)
print(anova_art(m))
#     Term  Df  Df.res        F        Pr(>F)
# 0      A   1     4.0  120.471  3.914986e-04
# 1      B   1     4.0  120.471  3.914986e-04
# 2      C   1     4.0   14.322  1.936216e-02
# 3    A:B   1     4.0   81.920  8.257143e-04
# 4    A:C   1     4.0    0.126  7.406643e-01
# 5    B:C   1     4.0    0.232  6.552898e-01
# 6  A:B:C   1     4.0    0.972  3.800992e-01
```

The `adjust` parameter in `art_con()` supports these methods:
| Value | Method | Description |
|---|---|---|
| `"tukey"` | Tukey HSD | Default. Uses the studentized range distribution. Best for pairwise comparisons. |
| `"holm"` | Holm-Bonferroni | Step-down procedure. Good general-purpose choice. |
| `"bonferroni"` | Bonferroni | Conservative; multiplies p-values by number of tests. |
| `"fdr"` or `"bh"` | Benjamini-Hochberg | Controls false discovery rate. Less conservative. |
| `"none"` or `None` | No adjustment | Raw (unadjusted) p-values. |
Note: The default is "tukey", matching R's emmeans / art.con() behavior.
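For intuition about how the non-Tukey adjustments differ, the textbook definitions can be written in a few lines of plain Python. This is a sketch of the standard procedures, not PyARTool's code; the function names and the example p-values are assumptions made for illustration.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down: scale the k-th smallest p by (m - k + 1), keep monotone."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg step-up: scale the k-th smallest p by m / k."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03]  # made-up raw p-values
adjusted = {
    "bonferroni": bonferroni(raw),
    "holm": holm(raw),
    "bh": benjamini_hochberg(raw),
}
```

On the same input, Bonferroni is never smaller than Holm, and Holm is never smaller than Benjamini-Hochberg, which matches the "conservative" to "less conservative" ordering in the table.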
PyARTool has been validated to produce numerically identical results to R's ARTool package across all bundled datasets and model types:
| Dataset | Design | ANOVA | Contrasts |
|---|---|---|---|
| Higgins1990Table1 | 3x3 OLS | Exact match | — |
| Higgins1990Table5 | 4x4 split-plot (lmer) | Exact match | Moisture (Tukey), Fertilizer (Tukey), Moisture:Fertilizer (Holm, 120 pairs) |
| ElkinABC | 2x2x2 within (lmer) | Exact match | A:B:C (Holm), A:B (Holm), A (Tukey), B:C (Bonferroni) |
| ElkinAB | 2x2 within (lmer) | Exact match | A (Tukey), B (Tukey), A:B (Holm) |
| HigginsABC | 2x2x2 mixed (aov) | Exact match | — |
The companion files artool_example.r and example.py run the same analyses in R and Python respectively, allowing side-by-side output comparison.
To run both:
```bash
# R version (requires R and the ARTool package)
Rscript artool_example.r
# Python version
python example.py
```

```
PyARTool/
  src/pyartool/
    __init__.py      # Public API exports
    art.py           # Core ART: alignment + ranking (art())
    formula.py       # R-style formula parser
    effects.py       # Cell means & estimated effects
    anova.py         # ANOVA: OLS, split-plot, and RM dispatching
    models.py        # artlm: model fitting (OLS / MixedLM / aov)
    summary.py       # Diagnostic checks (summary_art())
    contrasts.py     # ART-C contrasts (art_con(), artlm_con())
    datasets.py      # Bundled dataset loaders
    data/            # CSV files for bundled datasets
  tests/             # Test suite (84 tests)
  example.py         # Python example script
  artool_example.r   # R reference script
  pyproject.toml     # Package metadata & dependencies
  README.md          # This file
```
1. R-style formula parsing. PyARTool parses R formula syntax (Y ~ A * B + (1|S)) with a custom parser rather than relying on patsy for formula interpretation. This ensures identical handling of interactions, Error() terms, and grouping terms.
2. Patsy name-conflict handling. Factor column names that conflict with patsy reserved words (e.g., a column literally named C or S) are automatically prefixed with _f_ internally before model fitting and unaliased in output. This is transparent to the user.
3. Sum (deviation) coding. For Type III ANOVA equivalence with R, all categorical variables are explicitly coded with statsmodels Sum coding (C(var, Sum)) rather than the default Treatment coding.
4. Split-plot ANOVA for mixed models. The ANOVA for mixed-effects models implements a full split-plot SS decomposition to correctly compute between-group and within-group error strata, matching R's lmer + Kenward-Roger behavior.
5. Satterthwaite degrees of freedom. For mixed-model contrasts, per-contrast degrees of freedom are computed using the Satterthwaite approximation with analytical gradients and the REML Fisher information matrix, matching R's lmerTest / emmeans.
6. Tukey HSD via studentized range. The default p-value adjustment for pairwise contrasts uses scipy.stats.studentized_range, matching R's emmeans Tukey method.
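As a concrete picture of the sum (deviation) coding mentioned in item 3, the contrast matrix for a factor can be built by hand. This is a minimal sketch with a hypothetical helper name; it follows the usual convention in which the last level is the one coded as -1 in every column.

```python
def sum_contrast_matrix(n_levels):
    """Rows are factor levels, columns are the n_levels - 1 coded variables."""
    if n_levels < 2:
        raise ValueError("a factor needs at least two levels")
    n_cols = n_levels - 1
    rows = []
    for level in range(n_cols):
        # Each non-final level gets a 1 in its own column and 0 elsewhere.
        rows.append([1 if col == level else 0 for col in range(n_cols)])
    # The final level is coded -1 everywhere, so every column sums to zero.
    rows.append([-1] * n_cols)
    return rows


three_level = sum_contrast_matrix(3)
column_sums = [sum(column) for column in zip(*three_level)]
```

Because every column sums to zero, each coefficient is a deviation from the grand mean rather than from a reference level, which is what makes Type III tests of main effects meaningful when interactions are present.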
| R Package | Python Equivalent |
|---|---|
| base R (`lm`, `aov`) | statsmodels (OLS, formula API) |
| lme4 (`lmer`) | `statsmodels.regression.mixed_linear_model` |
| car (Type III `Anova`) | `statsmodels.stats.anova` + custom split-plot |
| emmeans (contrasts) | Custom implementation in `contrasts.py` |
| `stats::p.adjust` | `statsmodels.stats.multitest.multipletests` |
| `stats::qtukey` | `scipy.stats.studentized_range` |
PyARTool requires:
- numpy >= 1.22
- pandas >= 1.4
- scipy >= 1.8
- statsmodels >= 0.13
All dependencies are automatically installed when installing PyARTool via pip.
Two companion scripts are included for cross-validation:
`example.py` runs all five example analyses using PyARTool. This is the best place to start understanding how to use the package.

```bash
python example.py
```

`artool_example.r` runs the same five analyses using R's ARTool package. Use this to compare outputs side-by-side.

```bash
Rscript artool_example.r
```

Both scripts cover:

- Between-subjects 3x3: `Higgins1990Table1` (OLS, no repeated measures)
- Split-plot 4x4: `Higgins1990Table5` (mixed-effects with `(1|Tray)`)
- 2x2x2 within-subjects: `ElkinABC` (mixed-effects with `(1|S)`)
- 2x2 within-subjects: `ElkinAB` (mixed-effects with `(1|S)`)
- 2x2x2 mixed with `Error()`: `HigginsABC` (repeated measures ANOVA)
If you use PyARTool in your research, please cite the original R package and methods papers:
Package:
Kay, M., Elkin, L. A., Higgins, J. J., and Wobbrock, J. O. (2025). ARTool: Aligned Rank Transform for Nonparametric Factorial ANOVAs. R package version 0.11.2. https://github.com/mjskay/ARTool. DOI: 10.5281/zenodo.594511.
ART procedure (used by art() and anova_art()):
Wobbrock, J. O., Findlater, L., Gergle, D., and Higgins, J. J. (2011). The Aligned Rank Transform for Nonparametric Factorial Analyses Using Only ANOVA Procedures. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2011). Vancouver, British Columbia (May 7–12, 2011). New York: ACM Press, pp. 143–146. DOI: 10.1145/1978942.1978963.
ART-C procedure (used by art_con() and artlm_con()):
Elkin, L. A., Kay, M., Higgins, J. J., and Wobbrock, J. O. (2021). An Aligned Rank Transform Procedure for Multifactor Contrast Tests. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST 2021). Virtual Event (October 10–14, 2021). New York: ACM Press, pp. 754–768. DOI: 10.1145/3472749.3474784.
GPL-2.0-or-later, matching the original R ARTool package.