PyARTool: Aligned Rank Transform for Nonparametric Factorial ANOVAs

Python port of the R ARTool package.

PyARTool implements the Aligned Rank Transform (ART) for conducting nonparametric analyses of variance on factorial models. It faithfully translates the R ARTool package by Wobbrock, Findlater, Gergle, Higgins, Kay, and Elkin to Python, producing numerically identical results.

Overview

The Aligned Rank Transform (ART) is a nonparametric technique that allows you to use standard ANOVA procedures on ranked data, while correctly handling main effects, interactions, and contrasts in factorial designs. It works by:

1. Aligning the response variable to strip out effects not of interest for each term.
2. Ranking the aligned responses.
3. Running standard ANOVAs on the aligned-and-ranked data.

PyARTool automates this entire pipeline and additionally supports the ART-C procedure for post-hoc contrast tests (Elkin et al., 2021).
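To make the align-then-rank idea concrete, here is a minimal standard-library sketch for a two-factor design. It is not PyARTool's implementation: the function names, the made-up data, and the rounding step are assumptions of this sketch. For each term, the aligned response is the residual from the cell mean plus the estimated effect of that term, and ties receive average ranks.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend `end` over the run of values tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def art_two_factor(rows):
    """rows: (a_level, b_level, y) tuples. Returns {term: (aligned, ranks)}."""
    grand = mean(y for _, _, y in rows)
    by_a, by_b, by_cell = defaultdict(list), defaultdict(list), defaultdict(list)
    for a, b, y in rows:
        by_a[a].append(y)
        by_b[b].append(y)
        by_cell[a, b].append(y)
    mean_a = {k: mean(v) for k, v in by_a.items()}
    mean_b = {k: mean(v) for k, v in by_b.items()}
    mean_ab = {k: mean(v) for k, v in by_cell.items()}

    # Estimated effect of each term for a given cell.
    effects = {
        "A": lambda a, b: mean_a[a] - grand,
        "B": lambda a, b: mean_b[b] - grand,
        "A:B": lambda a, b: mean_ab[a, b] - mean_a[a] - mean_b[b] + grand,
    }
    result = {}
    for term, effect in effects.items():
        # Aligned response = residual from the cell mean + effect of interest.
        # Rounding (an assumption of this sketch) keeps float noise from breaking ties.
        aligned = [round(y - mean_ab[a, b] + effect(a, b), 9) for a, b, y in rows]
        result[term] = (aligned, average_ranks(aligned))
    return result


# Made-up 2x2 data with two observations per cell.
rows = [
    ("a1", "b1", 1.0), ("a1", "b1", 2.0),
    ("a1", "b2", 4.0), ("a1", "b2", 5.0),
    ("a2", "b1", 3.0), ("a2", "b1", 5.0),
    ("a2", "b2", 9.0), ("a2", "b2", 11.0),
]
art_result = art_two_factor(rows)
for term, (aligned, ranks) in art_result.items():
    print(term, aligned, ranks)
```

Each term gets its own aligned-and-ranked column, which is why a separate ANOVA is run per term and only the matching effect is read from each one.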

When to Use ART

Use the Aligned Rank Transform when:

Your data violates ANOVA assumptions (non-normality, heteroscedasticity).
You have a factorial design (two or more factors) — ART handles interactions correctly, unlike simpler rank-based tests.
You need post-hoc pairwise or interaction contrasts on nonparametric data.

For more background, see the ARTool project page.

Installation

From PyPI (recommended)

pip install pyartool

From source

git clone <this-repo>
cd PyARTool

# Create and activate a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate   # macOS/Linux
# .venv\Scripts\activate    # Windows

# Install in editable mode
pip install -e .

Requirements

Python >= 3.9
numpy >= 1.22
pandas >= 1.4
scipy >= 1.8
statsmodels >= 0.13

Quick Start

from pyartool import art, anova_art, art_con, load_higgins1990_table5

# Load data
df = load_higgins1990_table5()

# Step 1: Apply the Aligned Rank Transform
m = art("DryMatter ~ Moisture * Fertilizer + (1|Tray)", data=df)

# Step 2: Run the nonparametric ANOVA
print(anova_art(m))
#                  Term  Df  Df.res        F        Pr(>F)
# 0            Moisture   3     8.0   23.833  2.419913e-04
# 1          Fertilizer   3    24.0  122.402  1.110223e-14
# 2  Moisture:Fertilizer  9    24.0    5.118  6.466476e-04

# Step 3: Post-hoc contrasts
print(art_con(m, "Moisture"))
#   contrast  estimate    SE  df  t.ratio   p.value
# 0  m1 - m2   -23.083  4.12   8   -5.607    0.0023
# ...

API Reference

`art()` — Aligned Rank Transform

from pyartool import art

result = art(formula, data)

Parameter	Type	Description
`formula`	`str`	R-style formula (see Formula Syntax).
`data`	`pd.DataFrame`	Data in long format. Factor columns should be `pd.Categorical` or string.

Returns an ArtResult object containing:

Attribute	Description
`result.formula`	Original formula string.
`result.data`	Original DataFrame.
`result.aligned`	DataFrame of aligned responses (one column per term).
`result.aligned_ranks`	DataFrame of ranks of aligned responses.
`result.residuals`	Residuals from the cell-means model.
`result.cell_means`	Cell means for every term.
`result.estimated_effects`	Estimated effects for every term.

`anova_art()` — ANOVA on ART Data

from pyartool import anova_art

anova_table = anova_art(m)

Parameter	Type	Description
`m`	`ArtResult`	Object returned by `art()`.

Returns a pd.DataFrame with columns: Term, Df, Df.res, F, Pr(>F).

The model type is determined automatically by the formula:

Formula pattern	Model	R equivalent
`Y ~ A * B`	OLS (fixed effects)	`lm()`
`Y ~ A * B + (1|S)`	Mixed-effects (REML)	`lmer()`
`Y ~ A * B + Error(S)`	Repeated measures	`aov(Error())`
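The dispatch in the table above can be pictured as a simple pattern check on the right-hand side of the formula. The sketch below is hypothetical: `model_type` and its return labels are names invented for illustration, and PyARTool's own parser may work quite differently.

```python
import re


def model_type(formula):
    """Classify a formula string by its grouping or error term (illustrative only)."""
    rhs = formula.split("~", 1)[1]
    # A random-intercept term such as (1|S) selects the mixed-effects model.
    if re.search(r"\(\s*1\s*\|\s*\w+\s*\)", rhs):
        return "mixed"
    # An Error(S) term selects the repeated-measures ANOVA.
    if re.search(r"\bError\s*\(\s*\w+\s*\)", rhs):
        return "repeated-measures"
    # Otherwise fall back to a fixed-effects OLS model.
    return "ols"


for f in ("Y ~ A * B", "Y ~ A * B + (1|S)", "Y ~ A * B + Error(S)"):
    print(f, "->", model_type(f))
```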

`summary_art()` — Diagnostic Summary

from pyartool import summary_art

s = summary_art(m)

Returns an ArtSummary object with:

Attribute	Description
`s.aligned_col_sums`	Dict of column sums of aligned responses (should all be ~0).
`s.aligned_anova_f_values`	Array of F values from ANOVAs on non-target aligned responses (should all be ~0).

These diagnostics verify that the ART alignment procedure correctly stripped out effects not of interest. If values are not close to zero, the ART may not be appropriate for your data.
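The reasoning behind these checks can be reproduced by hand on a small balanced design. The sketch below uses made-up data and helper names that are assumptions of this example, not part of PyARTool. It aligns the response for the main effect of A, then confirms that the aligned column sums to zero and that the between-group sum of squares for the non-target factor B (the numerator of its F statistic) vanishes while the sum of squares for A is retained.

```python
from collections import defaultdict
from statistics import mean


def group_means(rows, key):
    """Mean of the value (third field) of each row, grouped by key(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {k: mean(v) for k, v in groups.items()}


def between_ss(rows, key):
    """Between-group sum of squares of the value field for the grouping key(row)."""
    overall = mean(row[2] for row in rows)
    counts = defaultdict(int)
    for row in rows:
        counts[key(row)] += 1
    means = group_means(rows, key)
    return sum(counts[k] * (means[k] - overall) ** 2 for k in means)


# Made-up balanced 2x2 data: (A level, B level, response).
data = [
    ("a1", "b1", 1.0), ("a1", "b1", 2.0),
    ("a1", "b2", 4.0), ("a1", "b2", 5.0),
    ("a2", "b1", 3.0), ("a2", "b1", 5.0),
    ("a2", "b2", 9.0), ("a2", "b2", 11.0),
]
grand = mean(y for _, _, y in data)
mean_a = group_means(data, lambda r: r[0])
mean_ab = group_means(data, lambda r: (r[0], r[1]))

# Response aligned for the main effect of A: cell residual + estimated A effect.
aligned_a = [(a, b, y - mean_ab[a, b] + mean_a[a] - grand) for a, b, y in data]

col_sum = sum(v for _, _, v in aligned_a)
ss_a = between_ss(aligned_a, lambda r: r[0])  # target effect: retained
ss_b = between_ss(aligned_a, lambda r: r[1])  # non-target effect: stripped out
print(col_sum, ss_a, ss_b)
```

The exact zero for B relies on the design being balanced; with unequal cell sizes the non-target values are generally small rather than exactly zero.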

`art_con()` — Contrast Tests (ART-C)

from pyartool import art_con

contrasts = art_con(m, formula, *, response="art", method="pairwise",
                    interaction=False, adjust="tukey")

Parameter	Type	Default	Description
`m`	`ArtResult`	—	Object returned by `art()`.
`formula`	`str`	—	Term to contrast: `"A"`, `"A:B"`, or `"A:B:C"`.
`response`	`str`	`"art"`	`"art"` (ranked) or `"aligned"` (unranked).
`method`	`str`	`"pairwise"`	Contrast method.
`interaction`	`bool`	`False`	If `True`, compute difference-of-difference contrasts.
`adjust`	`str | None`	`"tukey"`	P-value adjustment (see below).

Returns a pd.DataFrame with columns: contrast, estimate, SE, df, t.ratio, p.value.
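As a rough illustration of what the contrast sets look like, the sketch below enumerates pairwise comparisons and computes a single difference-of-differences for a 2x2 table of cell means. The helper names, the `m1,f1` cell-label format, the `m4` level name, and the cell means are assumptions made for this example; they are not taken from PyARTool's output.

```python
from itertools import combinations, product


def pairwise_labels(*factor_levels):
    """List 'x - y' labels for all pairs of cells formed by crossing the factors."""
    cells = [",".join(combo) for combo in product(*factor_levels)]
    return [f"{x} - {y}" for x, y in combinations(cells, 2)]


def diff_of_diff(cell_means, a_pair, b_pair):
    """(a1,b1 - a1,b2) - (a2,b1 - a2,b2) for the chosen pairs of levels."""
    (a1, a2), (b1, b2) = a_pair, b_pair
    return (cell_means[a1, b1] - cell_means[a1, b2]) - (
        cell_means[a2, b1] - cell_means[a2, b2]
    )


moisture = ["m1", "m2", "m3", "m4"]
fertilizer = ["f1", "f2", "f3", "f4"]

print(len(pairwise_labels(moisture)))              # pairs among 4 levels
print(len(pairwise_labels(moisture, fertilizer)))  # pairs among 16 cells

# Made-up cell means for a 2x2 design.
cell_means = {("a1", "b1"): 1.5, ("a1", "b2"): 4.5,
              ("a2", "b1"): 4.0, ("a2", "b2"): 10.0}
print(diff_of_diff(cell_means, ("a1", "a2"), ("b1", "b2")))
```

Crossing two four-level factors gives 16 cells and therefore 120 pairwise comparisons, which is why a multiplicity adjustment matters for interaction contrasts.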

`artlm()` / `artlm_con()` — Access Fitted Models

For advanced users who need the underlying statsmodels fit objects:

from pyartool import artlm, artlm_con

# Get the fitted model for a specific ART term
lm_result = artlm(m, "A:B")

# Get the fitted model for an ART-C contrast term
lm_con_result = artlm_con(m, "A:B")

Dataset Loaders

PyARTool bundles the same example datasets as the R package:

from pyartool import (
    load_higgins1990_table1,   # 3x3 between-subjects
    load_higgins1990_table5,   # 4x4 split-plot (Moisture x Fertilizer)
    load_elkin_ab,             # 2x2 within-subjects
    load_elkin_abc,            # 2x2x2 within-subjects
    load_higgins_abc,          # 2x2x2 mixed design
)

df = load_higgins1990_table5()
print(df.head())
#   Tray Moisture Fertilizer  DryMatter
# 0   t1       m1         f1        3.3
# 1   t1       m1         f2        4.3
# 2   t1       m1         f3        4.5
# 3   t1       m1         f4        5.8
# 4   t2       m1         f1        4.0

Supported Designs

Design	Formula Example	Model
Between-subjects factorial	`Y ~ A * B`	OLS (`lm`)
Split-plot / mixed-effects	`Y ~ A * B + (1|Subject)`	Mixed (`lmer` via `MixedLM`)
Repeated measures (aov)	`Y ~ A * B + Error(Subject)`	RM-ANOVA (`aov`)
2-factor	`Y ~ A * B`	Any of the above
3-factor	`Y ~ A * B * C + (1|S)`	Any of the above
N-factor	`Y ~ A * B * C * D + ...`	Any of the above
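For N-factor designs, the `*` operator stands for every main effect plus every interaction among the listed factors. The sketch below shows that expansion with a hypothetical helper, `expand_factorial`, which is not part of PyARTool's API.

```python
from itertools import combinations


def expand_factorial(factors):
    """Expand ['A', 'B', ...] into all main effects and interactions, lowest order first."""
    terms = []
    for order in range(1, len(factors) + 1):
        terms.extend(":".join(combo) for combo in combinations(factors, order))
    return terms


print(expand_factorial(["A", "B"]))
print(expand_factorial(["A", "B", "C"]))
```

A design with k factors yields 2^k - 1 terms, so the three-factor case produces the seven rows seen in a three-way ANOVA table.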

Formula Syntax

PyARTool uses R-style formula strings. The parser supports all the same patterns as the R ARTool package:

Fixed effects

# Full factorial (A + B + A:B)
"Y ~ A * B"

# Three-way factorial (all main effects, 2-way, and 3-way interactions)
"Y ~ A * B * C"

# You can also spell out terms explicitly
"Y ~ A + B + A:B"

Random / grouping effects (mixed-effects model)

# Random intercept for Subject — fits a mixed-effects model (lmer)
"Y ~ A * B + (1|Subject)"

Error terms (repeated measures ANOVA)

# Repeated measures — fits an aov() with Error()
"Y ~ A * B + Error(Subject)"

Important notes

The response variable (left of ~) must be a single numeric column.
All factor columns should be pd.Categorical or string type. Numeric columns used as factors will trigger a warning.
The formula must specify the full factorial — all lower-order terms must be present for any interaction term. PyARTool will raise an error if the design is not fully crossed.
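One way to check a dataset before fitting is to look for factor-level combinations that have no observations. The sketch below is a standalone illustration of that idea; `missing_cells` is a hypothetical helper and the rows are made up. It does not reproduce PyARTool's own validation or error messages.

```python
from itertools import product


def missing_cells(rows, factors):
    """Return factor-level combinations that have no observations.

    rows: list of dicts mapping column name -> value.
    factors: names of the factor columns to cross.
    """
    levels = [sorted({row[f] for row in rows}) for f in factors]
    observed = {tuple(row[f] for f in factors) for row in rows}
    return [combo for combo in product(*levels) if combo not in observed]


# Made-up data in which the (a2, b2) cell was never observed.
rows = [
    {"A": "a1", "B": "b1", "Y": 1.0},
    {"A": "a1", "B": "b2", "Y": 2.0},
    {"A": "a2", "B": "b1", "Y": 3.0},
]
print(missing_cells(rows, ["A", "B"]))
```

An empty list means every cell of the factorial has at least one observation.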

Detailed Walkthrough

Example 1: Between-Subjects Factorial

A simple 3x3 factorial design with no repeated measures:

from pyartool import art, anova_art, load_higgins1990_table1

df = load_higgins1990_table1()
print(df.head())
#   Subject Row Column  Response
# 0      s1   1      1         9
# 1      s2   1      1         6
# ...

# Fit the ART model (no grouping term = OLS)
m = art("Response ~ Row * Column", data=df)

# Run ANOVA
print(anova_art(m))
#         Term  Df  Df.res       F        Pr(>F)
# 0        Row   2    27.0  29.993  1.383278e-07
# 1     Column   2    27.0  77.867  6.149827e-12
# 2  Row:Column  4    27.0   0.642  6.374203e-01

Example 2: Split-Plot / Mixed-Effects

When you have a grouping factor (e.g., trays, subjects), include (1|Group) to fit a mixed-effects model:

from pyartool import art, anova_art, summary_art, art_con
from pyartool import load_higgins1990_table5

df = load_higgins1990_table5()

# Moisture varies between trays; Fertilizer varies within trays
m = art("DryMatter ~ Moisture * Fertilizer + (1|Tray)", data=df)

# Check diagnostics
s = summary_art(m)
print("Aligned column sums:", s.aligned_col_sums)
# Should all be ~0

# ANOVA
print(anova_art(m))
#                  Term  Df  Df.res        F        Pr(>F)
# 0            Moisture   3     8.0   23.833  2.419913e-04
# 1          Fertilizer   3    24.0  122.402  1.110223e-14
# 2  Moisture:Fertilizer  9    24.0    5.118  6.466476e-04

# Post-hoc: pairwise contrasts on Moisture
# Default adjustment is Tukey HSD
print(art_con(m, "Moisture"))
#   contrast  estimate    SE  df  t.ratio   p.value
# 0  m1 - m2   -23.083  4.12   8   -5.607    0.0023
# 1  m1 - m3   -33.750  4.12   8   -8.198    0.0002
# ...

# Interaction contrasts with Holm adjustment
print(art_con(m, "Moisture:Fertilizer", adjust="holm"))

Example 3: Multi-Factor Within-Subjects

A 2x2x2 fully within-subjects design:

from pyartool import art, anova_art, art_con, load_elkin_abc

df = load_elkin_abc()
m = art("Y ~ A * B * C + (1|S)", data=df)

# Full ANOVA table
print(anova_art(m))
#    Term  Df  Df.res        F        Pr(>F)
# 0     A   1    49.0  288.181  0.000000e+00
# 1     B   1    49.0   28.103  2.732842e-06
# 2     C   1    49.0   60.510  4.168039e-10
# 3   A:B   1    49.0   28.528  2.377711e-06
# 4   A:C   1    49.0   16.545  1.720573e-04
# 5   B:C   1    49.0   76.258  1.481193e-11
# 6 A:B:C   1    49.0   75.592  1.690836e-11

# Contrasts on the 3-way interaction
print(art_con(m, "A:B:C", adjust="holm"))

# Contrasts on a 2-way interaction (averaged over 3rd factor)
print(art_con(m, "A:B", adjust="holm"))

# Single-factor contrasts
print(art_con(m, "A"))  # Tukey by default

# Different adjustment methods
print(art_con(m, "B:C", adjust="bonferroni"))

Example 4: Repeated Measures with Error()

If you prefer traditional repeated-measures ANOVA (via aov) instead of mixed-effects models, use Error():

from pyartool import art, anova_art, load_higgins_abc

df = load_higgins_abc()
m = art("Y ~ A * B * C + Error(Subject)", data=df)

print(anova_art(m))
#    Term  Df  Df.res        F        Pr(>F)
# 0     A   1     4.0  120.471  3.914986e-04
# 1     B   1     4.0  120.471  3.914986e-04
# 2     C   1     4.0   14.322  1.936216e-02
# 3   A:B   1     4.0   81.920  8.257143e-04
# 4   A:C   1     4.0    0.126  7.406643e-01
# 5   B:C   1     4.0    0.232  6.552898e-01
# 6 A:B:C   1     4.0    0.972  3.800992e-01

P-Value Adjustment Methods

The adjust parameter in art_con() supports these methods:

Value	Method	Description
`"tukey"`	Tukey HSD	Default. Uses the studentized range distribution. Best for pairwise comparisons.
`"holm"`	Holm-Bonferroni	Step-down procedure. Good general-purpose choice.
`"bonferroni"`	Bonferroni	Conservative; multiplies p-values by number of tests.
`"fdr"` or `"bh"`	Benjamini-Hochberg	Controls false discovery rate. Less conservative.
`"none"` or `None`	No adjustment	Raw (unadjusted) p-values.

Note: The default is "tukey", matching R's emmeans / art.con() behavior.
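For intuition about how the non-Tukey adjustments differ, the sketch below implements the textbook Bonferroni, Holm, and Benjamini-Hochberg procedures in plain Python. It is an illustration only, with function names chosen for this example; it is not the code path PyARTool uses. Tukey is omitted because it needs the studentized range distribution.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Step-down Holm adjustment: multipliers m, m-1, ..., 1 with a running maximum."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvals):
    """Step-up BH adjustment: p * m / rank with a running minimum from the largest p."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # 1-based rank of pvals[i] in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03]  # made-up raw p-values
print(bonferroni(raw))
print(holm(raw))
print(benjamini_hochberg(raw))
```

On the same input, Bonferroni gives the largest adjusted values and Benjamini-Hochberg the smallest, with Holm in between.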

R Parity & Validation

PyARTool has been validated to produce numerically identical results to R's ARTool package across all bundled datasets and model types:

Dataset	Design	ANOVA	Contrasts
Higgins1990Table1	3x3 OLS	Exact match	—
Higgins1990Table5	4x4 split-plot (lmer)	Exact match	Moisture (Tukey), Fertilizer (Tukey), Moisture:Fertilizer (Holm, 120 pairs)
ElkinABC	2x2x2 within (lmer)	Exact match	A:B:C (Holm), A:B (Holm), A (Tukey), B:C (Bonferroni)
ElkinAB	2x2 within (lmer)	Exact match	A (Tukey), B (Tukey), A:B (Holm)
HigginsABC	2x2x2 mixed (aov)	Exact match	—

The companion files artool_example.r and example.py run the same analyses in R and Python respectively, allowing side-by-side output comparison.

To run both:

# R version (requires R and the ARTool package)
Rscript artool_example.r

# Python version
python example.py

Architecture & Implementation Notes

Package Structure

PyARTool/
  src/pyartool/
    __init__.py          # Public API exports
    art.py               # Core ART: alignment + ranking (art())
    formula.py           # R-style formula parser
    effects.py           # Cell means & estimated effects
    anova.py             # ANOVA: OLS, split-plot, and RM dispatching
    models.py            # artlm: model fitting (OLS / MixedLM / aov)
    summary.py           # Diagnostic checks (summary_art())
    contrasts.py         # ART-C contrasts (art_con(), artlm_con())
    datasets.py          # Bundled dataset loaders
    data/                # CSV files for bundled datasets
  tests/                 # Test suite (84 tests)
  example.py             # Python example script
  artool_example.r       # R reference script
  pyproject.toml         # Package metadata & dependencies
  README.md              # This file

Key Design Decisions

1. R-style formula parsing. PyARTool parses R formula syntax (Y ~ A * B + (1|S)) with a custom parser rather than relying on patsy for formula interpretation. This ensures identical handling of interactions, Error() terms, and grouping terms.

2. Patsy name-conflict handling. Factor column names that conflict with patsy reserved words (e.g., a column literally named C or S) are automatically prefixed with _f_ internally before model fitting and unaliased in output. This is transparent to the user.

3. Sum (deviation) coding. For Type III ANOVA equivalence with R, all categorical variables are explicitly coded with statsmodels Sum coding (C(var, Sum)) rather than the default Treatment coding.

4. Split-plot ANOVA for mixed models. The ANOVA for mixed-effects models implements a full split-plot SS decomposition to correctly compute between-group and within-group error strata, matching R's lmer + Kenward-Roger behavior.

5. Satterthwaite degrees of freedom. For mixed-model contrasts, per-contrast degrees of freedom are computed using the Satterthwaite approximation with analytical gradients and the REML Fisher information matrix, matching R's lmerTest / emmeans.

6. Tukey HSD via studentized range. The default p-value adjustment for pairwise contrasts uses scipy.stats.studentized_range, matching R's emmeans Tukey method.
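To illustrate the sum (deviation) coding mentioned in item 3, the sketch below builds the contrast matrix for a factor with k levels, following the common convention (as in R's `contr.sum`) where the last level is coded as -1 in every column. The helper `sum_coding` is hypothetical and is not how statsmodels or PyARTool construct the matrix internally.

```python
def sum_coding(levels):
    """Map each level to its row of the k x (k-1) sum-coding contrast matrix."""
    k = len(levels)
    matrix = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            # Level i gets a 1 in its own column and 0 elsewhere.
            matrix[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            # The last level is coded -1 in every column.
            matrix[level] = [-1] * (k - 1)
    return matrix


coding = sum_coding(["m1", "m2", "m3"])
for level, row in coding.items():
    print(level, row)
```

Every column sums to zero across levels, so each coefficient is read as a deviation from the grand mean rather than from a reference level. That property is what makes Type III tests of main effects meaningful when interactions are present.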

Dependency Mapping (R to Python)

R Package	Python Equivalent
`base R` (`lm`, `aov`)	`statsmodels` (OLS, formula API)
`lme4` (`lmer`)	`statsmodels.regression.mixed_linear_model`
`car` (Type III Anova)	`statsmodels.stats.anova` + custom split-plot
`emmeans` (contrasts)	Custom implementation in `contrasts.py`
`stats::p.adjust`	`statsmodels.stats.multitest.multipletests`
`stats::qtukey`	`scipy.stats.studentized_range`

Dependencies

PyARTool requires:

numpy >= 1.22
pandas >= 1.4
scipy >= 1.8
statsmodels >= 0.13

All dependencies are automatically installed when installing PyARTool via pip.

Example Scripts

Two companion scripts are included for cross-validation:

`example.py` — Python

Runs all five example analyses using PyARTool. This is the best place to start understanding how to use the package.

python example.py

`artool_example.r` — R

Runs the same five analyses using R's ARTool package. Use this to compare outputs side-by-side.

Rscript artool_example.r

Both scripts cover:

Between-subjects 3x3 — Higgins1990Table1 (OLS, no repeated measures)
Split-plot 4x4 — Higgins1990Table5 (mixed-effects with (1|Tray))
2x2x2 within-subjects — ElkinABC (mixed-effects with (1|S))
2x2 within-subjects — ElkinAB (mixed-effects with (1|S))
2x2x2 mixed with Error() — HigginsABC (repeated measures ANOVA)

Citations

If you use PyARTool in your research, please cite the original R package and methods papers:

Package:

Kay, M., Elkin, L. A., Higgins, J. J., and Wobbrock, J. O. (2025). ARTool: Aligned Rank Transform for Nonparametric Factorial ANOVAs. R package version 0.11.2. https://github.com/mjskay/ARTool. DOI: 10.5281/zenodo.594511.

ART procedure (used by art() and anova_art()):

Wobbrock, J. O., Findlater, L., Gergle, D., and Higgins, J. J. (2011). The Aligned Rank Transform for Nonparametric Factorial Analyses Using Only ANOVA Procedures. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2011). Vancouver, British Columbia (May 7--12, 2011). New York: ACM Press, pp. 143--146. DOI: 10.1145/1978942.1978963.

ART-C procedure (used by art_con() and artlm_con()):

Elkin, L. A., Kay, M., Higgins, J. J., and Wobbrock, J. O. (2021). An Aligned Rank Transform Procedure for Multifactor Contrast Tests. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST 2021). Virtual Event (October 10--14, 2021). New York: ACM Press, pp. 754--768. DOI: 10.1145/3472749.3474784.

License

GPL-2.0-or-later, matching the original R ARTool package.
