ATOMS: Adaptive Tournament Model Selection

This repository implements ATOMS (Adaptive Tournament Model Selection) and the benchmark algorithms from the following paper:

Capponi, A., Huang, C., Sidaoui, J. A., Wang, K., and Zou, J. (2025). The Nonstationarity-Complexity Tradeoff in Return Prediction. Available at SSRN: https://ssrn.com/abstract=5980654

Algorithms

ATOMS: adaptive model selection via (i) adaptive validation-window selection and (ii) a tournament procedure.
Fixed-window baselines:
- Fixed-val($\ell$): select the model with the lowest average validation loss over the last $\ell$ periods.
- Fixed-CV: select a model using cross-validation on a fixed historical window.

This implementation focuses on model selection (ATOMS and the baselines). It does not include the full large-scale training pipeline used in the paper.

Quickstart

1) Create an environment and install

python -m venv .venv
source .venv/bin/activate   # macOS/Linux
# .venv\Scripts\activate    # Windows PowerShell

python -m pip install -U pip
python -m pip install -e .

2) Run the demo

Notebook walkthrough: example/demo.ipynb.

The demo generates a synthetic nonstationary dataset, trains a small set of candidate models,
runs ATOMS and baselines, and reports performance summaries.

Core usage

Inputs

ATOMS operates on per-observation validation losses organized by time period.

The inputs are:

val_losses: list of length n_models.
- Each entry is an array of per-observation validation losses for one candidate model, concatenated by period in chronological order.
- Single response: shape (n_obs,)
- Multiple responses (e.g., 17 industry portfolios): shape (n_obs, n_responses); selection is done separately for each response.
val_sizes: list/array (n_0, n_1, ..., n_{T-1}), where n_t is the number of validation observations in period t (same concatenation order as val_losses), so that n_obs = n_0 + ... + n_{T-1}.
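
As a minimal sketch of this layout, the following builds val_losses and val_sizes from per-period loss arrays with NumPy. The period sizes, the random placeholder losses, and the name losses_by_period are made up for illustration; only the resulting shapes follow the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

n_models = 3
val_sizes = [4, 6, 5]          # hypothetical n_0, n_1, n_2 (T = 3 periods)

# losses_by_period[m][t] holds the per-observation losses of model m in period t
# (random placeholders here; in practice these would be, e.g., squared errors).
losses_by_period = [
    [rng.random(n_t) for n_t in val_sizes] for _ in range(n_models)
]

# One array per model, with periods concatenated in chronological order.
val_losses = [np.concatenate(per_period) for per_period in losses_by_period]

n_obs = sum(val_sizes)
print(len(val_losses), val_losses[0].shape)  # 3 (15,)
```

For multiple responses, each per-period array would have shape (n_t, n_responses), and np.concatenate along the first axis gives (n_obs, n_responses).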

ATOMS

from atoms import ATOMS

atoms = ATOMS(delta=0.1, M=1.0, seed=0)
best_idx = atoms.select(val_losses, val_sizes)  # int (single response) or (n_responses,) array

Here, best_idx is the 0-based index of the selected model in the candidate model list. In multi-response settings, select returns one selected index per response.
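
To illustrate how a per-response index array can be used downstream, the sketch below picks, for each response, the predictions of its selected model. The arrays preds and best_idx are hypothetical stand-ins (they are not produced by the package here), and the (n_models, n_obs, n_responses) layout of preds is an assumption made for this example.

```python
import numpy as np

n_models, n_obs, n_responses = 3, 4, 2

# Hypothetical test-period predictions, laid out as (n_models, n_obs, n_responses).
preds = np.arange(n_models * n_obs * n_responses).reshape(n_models, n_obs, n_responses)

# Hypothetical selection result: model 2 for response 0, model 0 for response 1.
best_idx = np.array([2, 0])

# Column r of `selected` holds the predictions of the model chosen for response r.
selected = np.column_stack([preds[m, :, r] for r, m in enumerate(best_idx)])
print(selected.shape)  # (4, 2)
```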

Fixed-window baselines

Fixed-val($\ell$)

from atoms import fixed_val_select

best_idx = fixed_val_select(val_losses, val_sizes, L=10)
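
For intuition, the rule behind Fixed-val($\ell$) can be sketched in a few lines of NumPy. This is not the repository's fixed_val_select; the function name fixed_val_sketch is hypothetical, and it assumes that "average validation loss" means the pooled mean over all observations in the last L periods.

```python
import numpy as np

def fixed_val_sketch(val_losses, val_sizes, L):
    """Index of the model with the lowest pooled mean loss over the last L periods."""
    n_recent = int(np.sum(val_sizes[-L:]))        # observations in the last L periods
    means = []
    for v in val_losses:
        arr = np.asarray(v, dtype=float)
        recent = arr[arr.shape[0] - n_recent:]    # trailing block of the concatenation
        means.append(recent.mean(axis=0))         # one value per response
    best = np.stack(means).argmin(axis=0)         # argmin over models
    return int(best) if best.ndim == 0 else best

# Toy data: model 0 is better early on, model 1 is better in the latest period.
val_sizes = [2, 2, 2]
val_losses = [
    np.array([0.0, 0.0, 0.0, 0.0, 5.0, 5.0]),
    np.array([3.0, 3.0, 3.0, 3.0, 1.0, 1.0]),
]
print(fixed_val_sketch(val_losses, val_sizes, L=1),
      fixed_val_sketch(val_losses, val_sizes, L=3))  # 1 0
```

The toy data shows why the window length matters: a short window favors the model that fits the most recent period, while a long window favors the model that was better on average.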

Fixed-CV

from atoms import fixed_cv_select

best_idx = fixed_cv_select(
    specs,
    X_by_period,
    y_by_period,
    t=t,
    cv_window_periods=36,
    n_splits=5,
)

Here, specs is a list of CandidateSpec objects (see below), t is the testing period, and X_by_period and y_by_period store the data in period form: X_by_period[s] is the feature matrix with shape $(n_s,d)$, and y_by_period[s] is the corresponding response array with shape $(n_s,)$ for a single response, or $(n_s,R)$ for $R$ responses.
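
The period-form layout can be sketched as follows. All sizes and the random data are hypothetical. Which periods fixed_cv_select pools for cross-validation is not specified here; pooling the w periods immediately before the test period t is an assumption made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, R = 4, 2                       # hypothetical numbers of features and responses
sizes = [5, 7, 6, 8]              # hypothetical n_s for periods s = 0, ..., 3

# One feature matrix and one response array per period.
X_by_period = [rng.normal(size=(n_s, d)) for n_s in sizes]
y_by_period = [rng.normal(size=(n_s, R)) for n_s in sizes]

# Assumed for illustration: pool the w periods immediately before test period t.
t, w = 3, 2
X_pool = np.vstack(X_by_period[t - w:t])
y_pool = np.vstack(y_by_period[t - w:t])
print(X_pool.shape, y_pool.shape)  # (13, 4) (13, 2)
```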

Candidate specifications

A candidate “model” is represented by a CandidateSpec:

from atoms import CandidateSpec
from sklearn.linear_model import Ridge

spec = CandidateSpec(
    name="Ridge (10 periods)",
    estimator_factory=lambda: Ridge(alpha=1.0),
    train_window=10,   # number of periods used for training
)

Computing out-of-sample $R^2$

The paper reports two $R^2$ metrics:

$R^2$ with zero benchmark:

$$1 - \frac{\sum_{i=1}^n (\hat{y}_i - y_i)^2}{\sum_{i=1}^n y_i^2}$$

$R^2$ with demeaned denominator:

$$1 - \frac{\sum_{i=1}^n (\hat{y}_i - y_i)^2}{\sum_{i=1}^n (y_i-\overline{y})^2}$$

where $\overline{y}$ is the mean of $y_1,...,y_n$.

They can be computed via

from atoms import oos_r2, oos_r2_over_periods

r2_zero = oos_r2(y_true, y_pred, demean=False)
r2      = oos_r2(y_true, y_pred, demean=True)
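
As a cross-check of the two formulas, here is a minimal NumPy sketch for a single response. The function r2_sketch is a hypothetical helper written for this illustration, not the package's oos_r2, and the toy numbers are made up.

```python
import numpy as np

def r2_sketch(y_true, y_pred, demean=False):
    """1 - SSE / sum((y - b)^2), with b = 0 (zero benchmark) or b = mean(y)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_pred - y_true) ** 2)
    baseline = y_true.mean() if demean else 0.0
    return 1.0 - sse / np.sum((y_true - baseline) ** 2)

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 2.0]

# SSE = 4; sum of y^2 = 30; sum of squared deviations from the mean (2.5) = 5.
print(round(r2_sketch(y_true, y_pred, demean=False), 4))  # 0.8667
print(round(r2_sketch(y_true, y_pred, demean=True), 4))   # 0.2
```

When the responses have a nonzero mean, as in this toy example, the zero-benchmark denominator is larger, so the zero-benchmark $R^2$ is the higher of the two.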

Regime-switching model

The repository also includes the monthly refit Markov-switching forecast used for the new algorithm. It can run either on a dated dataframe or directly on the period-based synthetic data structure used in example/demo.ipynb:

from atoms import run_regime_switch_on_periods

pred_df = run_regime_switch_on_periods(
    X_by_period,
    y_by_period,
    start_month="2000-01",
    min_train_months=12,
    k_regimes=2,
)

This returns a dataframe with y_true, y_pred, forecast_month, and regime probability columns for the out-of-sample months.

$R^2$ over user-specified subperiods

You can compute $R^2$ over named windows specified in period indices (0-based, inclusive):

# period_sizes[t] = n_t, the sample size of period t in the concatenated y_true / y_pred
T = len(period_sizes)
windows = {
    "Full": (0, T - 1),          # (first period, last period), both inclusive
    "Late sample": (20, T - 1),  # requires T > 20
}

r2_by_window = oos_r2_over_periods(
    y_true,
    y_pred,
    period_sizes,
    windows,
    demean=False,
)
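
The mapping from an inclusive period window to rows of the concatenated arrays can be sketched with cumulative sums. The helper window_slice and the toy sizes are hypothetical; this only illustrates the indexing convention and is not the package's implementation.

```python
import numpy as np

period_sizes = [3, 2, 4]                                   # hypothetical n_0, n_1, n_2
offsets = np.concatenate(([0], np.cumsum(period_sizes)))   # [0, 3, 5, 9]

def window_slice(start, end):
    """Rows of the concatenated arrays covering periods start..end (inclusive)."""
    return slice(int(offsets[start]), int(offsets[end + 1]))

y = np.arange(9)                      # stand-in for a concatenated y_true
print(y[window_slice(1, 2)])          # [3 4 5 6 7 8]
print(y[window_slice(0, 0)])          # [0 1 2]
```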

Project structure

src/atoms/selection.py: ATOMS (adaptive window selection + tournament)
src/atoms/baselines.py: Fixed-val and Fixed-CV baselines
src/atoms/metrics.py: out-of-sample $R^2$ metrics
src/atoms/specs.py: CandidateSpec definition
src/atoms/synthetic.py: synthetic nonstationary data generators
example/demo.ipynb: step-by-step demo notebook

Citation

@article{CHS25,
  title={The Nonstationarity-Complexity Tradeoff in Return Prediction},
  author={Capponi, Agostino and Huang, Chengpiao and Sidaoui, J.~Antonio and Wang, Kaizheng and Zou, Jiacheng},
  journal={Available at SSRN 5980654},
  year={2025}
}
