Multi-timescale transfer-operator pipeline — supplementary code

This directory contains the supplementary code accompanying

Kaur, Jain, & Berman, Using timescale as a state coordinate reveals the metastable geometry of behavior (2026).

The repository has two layers:

figures/ — One Python script per published figure (Fig. 2 panels B–E, Fig. 3, Fig. 4, Fig. 5, and Supp. Figs. S1–S11). Each script loads precomputed inputs from data/ and writes PNG and PDF outputs to outputs/. Running python figures/run_all.py regenerates every panel in 5–10 minutes.
*.ipynb notebooks — End-to-end pipeline demonstrations (wavelet decomposition → PCA → delay embedding → clustering → transition matrix → G-PCCA) for each system. The notebooks show how the cached data/ files were produced and let you re-run the upstream steps on your own data via user_data.ipynb.

If you only want to reproduce the published figures, you can ignore the notebooks and use figures/ directly.

Pipeline at a glance

For every system, the pipeline runs:

Wavelet decomposition (Morlet, $\omega_0 = 5$) of each measurement channel into 25 dyadically spaced frequency bands.
PCA on the wavelet amplitudes, keeping the components that exceed a temporal-shuffle threshold (each feature permuted in time; not a phase randomization).
Delay embedding at dimension $d$, chosen by Cao's $E_1(d)$ saturation criterion.
K-means partition into $N$ clusters, with $N$ chosen by the entropy gap $\Delta H(N)$.
Transition matrix $T(\tau)$ at a working lag $\tau$ chosen by the implied-timescales plateau.
Eigendecomposition of $T$ and selection of the number of basins $M$ from the largest spectral-gap ratio $|\lambda_M| / |\lambda_{M+1}|$.
G-PCCA (Reuter–Weber 2018) for soft basin memberships $\chi_j(i) \in [0, 1]$.
Hub-and-arm geometry in the leading non-trivial eigenvectors: $\pi$-weighted hub + per-basin arm vectors as centroid$-$hub.

To check a candidate representation on your own data, diagnostics.run_diagnostics evaluates the paper's four falsifiable criteria (spectral gap, participation ratio, simplex/arms geometry, and held-out prediction beating a memoryless null) and prints a pass/fail table; diagnostics.stationarity_drift flags violations of the single-stationary-distribution assumption. When pooling multiple recordings, pass a list of per-individual cluster sequences to pipeline.make_transition_matrix so transitions are never counted across the splice between individuals.

Repository layout

slowmode/
├── pipeline.py             # core algorithms (wavelets, PCA, embed, kmeans, T)
├── gpcca_utils.py          # G-PCCA wrapper, hub/arm geometry
├── diagnostics.py          # the four falsifiable criteria + run_diagnostics()
├── figures.py              # shared matplotlib styling, ARM_PALETTE, plot_psd
├── lorenz_simulation.py    # standalone driven-Lorenz simulator
├── requirements.txt
│
├── figures/                # one script per published figure
│   ├── _paths.py           # repo path setup + setup() style helper
│   ├── figure_2B.py
│   ├── figure_2C.py
│   ├── figure_2D.py
│   ├── figure_2E.py
│   ├── figure_3.py
│   ├── figure_4.py
│   ├── figure_5.py
│   ├── supp_figure_1.py
│   ├── supp_figure_2.py
│   ├── supp_figure_3.py
│   ├── supp_figure_4.py
│   ├── supp_figure_5.py
│   ├── supp_figure_6.py
│   ├── supp_figure_7.py
│   ├── supp_figure_8.py
│   ├── supp_figure_9.py
│   ├── supp_figure_10.py
│   ├── supp_figure_11.py
│   └── run_all.py          # convenience: run every script in order
│
├── data/                   # precomputed inputs shipped with the repo (~120 MB)
│   ├── lorenz/
│   ├── worms/
│   └── flies/
│
├── outputs/                # PNG + PDF written by figure scripts
│
├── lorenz.ipynb            # pipeline demo (Lorenz)
├── worms.ipynb             # pipeline demo (C. elegans)
├── flies.ipynb             # pipeline demo (Drosophila)
├── user_data.ipynb         # pipeline applied to your own time series
│
└── README.md

Figure-to-script map

The numbering matches the published manuscript. Fig. 1 is a hand-drawn pipeline schematic and has no Python source. Fig. 2 panel A (driver double-well sketch + sample $h(t)$ trace) is generated inside lorenz.ipynb; panels B–E are mirrored as separate scripts so they can be inspected and re-rendered independently (the published figure is assembled in Illustrator from these per-panel outputs).

Figure	Script	Notes
Fig. 2A	`lorenz.ipynb`	Driver double-well + sample $h(t)$
Fig. 2B	`figures/figure_2B.py`	3D attractor + (x,y),(y,z),(x,z) projections coloured by $\phi_2$
Fig. 2C	`figures/figure_2C.py`	$
Fig. 2D	`figures/figure_2D.py`	$h(t)$, multi $\phi_2$, fixed $\phi_2$ time series
Fig. 2E	`figures/figure_2E.py`	Pearson $
Fig. 3	`figures/figure_3.py`	Worms (panels A–E)
Fig. 4	`figures/figure_4.py`	Flies, methodology (panels A–E)
Fig. 5	`figures/figure_5.py`	Flies, biology (panels A–D)
S1	`figures/supp_figure_1.py`	Lorenz: Cao $E_1$ fixed + multi; mean dwell vs $\beta$; entropy gap vs $N$
S2	`figures/supp_figure_2.py`	Worm operator diagnostics
S3	`figures/supp_figure_3.py`	Worm biological correspondence + leave-one-worm-out CV
S4	`figures/supp_figure_4.py`	12-worm grid of dwell-time CCDFs
S5	`figures/supp_figure_5.py`	Worm residence-time model selection + slow-mode shape
S6	`figures/supp_figure_6.py`	Worm time-evolving landscape $V(\phi_2, t)$
S7	`figures/supp_figure_7.py`	Fly parameter selection + basin-count justification
S8	`figures/supp_figure_8.py`	Fly residence-time model selection + slow-mode shape
S9	`figures/supp_figure_9.py`	Fly reproducibility / individuality
S10	`figures/supp_figure_10.py`	One-step-Markov surrogate dwell null (flies + worms)
S11	`figures/supp_figure_11.py`	Residence-time robustness to smoothing scale $\Delta$

Quick start: regenerate all paper figures

pip install -r requirements.txt          # installs numpy/scipy/matplotlib/pygpcca/powerlaw/umap-learn
python figures/run_all.py                # ~5--10 min, writes everything to outputs/

To render a single panel:

python figures/figure_3.py

Installation

Python 3.10 or newer. Required packages:

numpy
scipy
scikit-learn
matplotlib
pygpcca         # G-PCCA implementation (Reuter & Weber 2018)
powerlaw        # MLE power-law fits (Clauset et al. 2009)
umap-learn      # 2D embedding for worm/fly behavior maps

pip install numpy scipy scikit-learn matplotlib pygpcca powerlaw umap-learn

Running on Google Colab

Each notebook starts with a cell that, when uncommented, will pip install the missing dependencies and clone this repository:

# !pip install pygpcca powerlaw umap-learn
# !git clone https://github.com/bermanlabemory/slowmode.git
# %cd slowmode

After that, python figures/run_all.py (or any individual script) runs top-to-bottom.

Data shipped in the repository

We ship enough precomputed input data in data/ for every figure script to run end-to-end. Total size: ~125 MB. The dwell-time figures (Fig. 5D and Supp. Figs. S5, S8, S10, S11) recompute metastable residences on the fly from the shipped cluster sequences (via pipeline.metastable_residences), so they depend only on states_*.pkl and the G-PCCA memberships.

File	Size	What
Lorenz (`data/lorenz/`, 4.7 MB)
`lorenz_panel2B_data.npz`	4.9 MB	Per-frame + per-cluster $\phi_2$ for Fig 2B
`lorenz_supp_data.npz`	48 KB	Cao $E_1$, $\Delta h(N)$, dwell vs $\beta$, eigenvalue spectra
`lorenz_corr_sweep.npz`	2 KB	Single-seed $\beta$ sweep for Fig 2D
`lorenz_corr_seeds.npz`	4 KB	Multi-seed $\beta$ sweep with SEMs for Fig 2E
Worms (`data/worms/`, 33 MB)
`states_worms.pkl`	1.5 MB	Cluster sequences at $N = 250$, per worm × segment
`all_valid_segments_worms.pkl`	17 MB	Raw eigenworm coefficients (Stephens 2008) for Fig 3A power spectrum
`worm_eigs_tau3s.npz`	9 KB	Multi-timescale eigenvectors at $\tau = 3$ s
`gpcca_worms_M2_tau3s.npz`	16 KB	G-PCCA basin membership $\chi$, $\pi$, eigvals
`gpcca_worms_M3_M4_tau3s.npz`	18 KB	G-PCCA at deeper hierarchies $M = 3, 4$
`worms_costa_per_cluster.npz`	34 KB	Per-cluster $(\bar\omega,
`worms_umap_canonical_full.npz`	15 MB	2D UMAP grid for Fig 3 C/D, S3 A/B
`arm_dynamics_worms_tau3s.npz`	25 KB	Basin-level MI and apparent decay rate
`worms_supp_data.npz`	7 KB	Cao $E_1$, entropy gap, eigenvalue ladders
`M_eq_1_vs_M_eq_2_test.npz`	10 KB	Leave-one-worm-out held-out MI vs null
`lognormal_reanalysis_worms.npz`	89 KB	Slow-mode (GED) shape for S5C
Flies (`data/flies/`, 88 MB)
`states_flies.pkl`	43 MB	Multi-timescale cluster sequences at $N = 1000$
`states_flies_fixed.pkl`	43 MB	Fixed-timescale cluster sequences at $N = 3000$
`fly_eigs_tau2s.npz`	130 KB	Leading eigenvectors at $\tau = 2$ s
`gpcca_flies_M4_tau2s.npz`	85 KB	G-PCCA basin membership at $M = 4$
`arm_dynamics_results_tau2s.npz`	871 KB	$r_k(\tau)$, predictive MI, per-fly stats
`behavior_density_chi_tau2s.npz`	2.9 MB	$\chi$-weighted Berman 2014 behavior map
`pr_leave_one_out.npz`	3 KB	Leave-one-fly-out PR contrast (Fig 4D)
`cv_flies_tau2s.npz`	9 KB	Leave-one-fly-out CV of basin count
`per_fly_pcca_refit_tau2s.npz`	10 KB	Per-fly G-PCCA refit cosines
`lognormal_reanalysis.npz`	878 KB	Slow-mode (GED) shape for S8C
`flies_supp_method_data.npz`	8 KB	PCA shuffle, Cao $E_1$, $\Delta h(N)$
`flies_supp_method_v2_data.npz`	3 KB	S7 $\tau$/$M$ sweeps + basin-count CV

Files not shipped (regenerable from raw recordings):

File	Approx. size	Where to get it
`flies_wlets_pca.pkl`	612 MB	Recompute from joint-angle CSVs (`pipeline.morlet_wavelet_amplitudes` + `pipeline.pca_with_shuffle_threshold`); raw recordings are in the Berman 2014 dataset
`flies_zvals.pkl`	163 MB	Recompute from the same raw recordings using `MotionMapper`
`worms_wlets_pca.pkl`	15 MB	Recompute from the Stephens 2008 / Broekmans 2016 eigenworm coefficients shipped in `all_valid_segments_worms.pkl`

Recomputing these large intermediate files is what each *.ipynb notebook shows; they're not needed if you just want to regenerate the figures.

Pipeline demonstration notebooks

The notebooks document how the cached files in data/ were produced and serve as worked examples for applying the pipeline to your own data. They do not generate the published figures themselves (the figures/*.py scripts do); for figure regeneration you can skip the notebooks entirely.

Notebook	What it shows
`lorenz.ipynb`	Driven-Lorenz simulation, wavelet pipeline, multi- vs fixed-timescale eigendecomposition, $\beta$ sweep
`worms.ipynb`	Eigenworm → wavelet PCA → cluster → transition matrix → G-PCCA at $M=2$
`flies.ipynb`	Joint-angle → wavelet PCA → cluster → transition matrix → G-PCCA at $M=4$, plus the Berman 2014 behavior-map projection
`user_data.ipynb`	Applies the same pipeline to a generic $(T, d)$ time series of your choosing

Per-notebook runtime estimates

Rough estimates on a 2024 MacBook Pro without a GPU.

Workflow	Time
`python figures/run_all.py` (just regenerate all paper figures from `data/`)	5–10 min
`lorenz.ipynb` (single $\beta$, $N=200$)	~5 min
`lorenz.ipynb` ($\beta$ sweep + $N=1300$, multi seeds)	~30–60 min
`worms.ipynb` (cached cluster sequences)	~5–10 min
`worms.ipynb` (re-cluster from scratch at $N=250$)	~20 min
`flies.ipynb` (cached cluster sequences)	~10–15 min
`flies.ipynb` (re-cluster at $N=1000$ on all 30 flies)	~1–2 h
`user_data.ipynb`	depends on $T$, $N$

Citing

If this code is useful in your work, please cite the manuscript:

@article{KaurJainBerman2026,
  title   = {Using timescale as a state coordinate reveals the metastable
             geometry of behavior},
  author  = {Kaur, R. and Jain, K. and Berman, G. J.},
  year    = {2026},
}

The G-PCCA implementation we wrap is pygpcca (Reuter & Weber 2018); the power-law fitting routine is the powerlaw package (Alstott et al. 2014); UMAP via McInnes et al. (2018).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-timescale transfer-operator pipeline — supplementary code

Pipeline at a glance

Repository layout

Figure-to-script map

Quick start: regenerate all paper figures

Installation

Running on Google Colab

Data shipped in the repository

Pipeline demonstration notebooks

Per-notebook runtime estimates

Citing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
figures		figures
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
diagnostics.py		diagnostics.py
figures.py		figures.py
flies.ipynb		flies.ipynb
gpcca_utils.py		gpcca_utils.py
lorenz.ipynb		lorenz.ipynb
lorenz_simulation.py		lorenz_simulation.py
pipeline.py		pipeline.py
requirements.txt		requirements.txt
user_data.ipynb		user_data.ipynb
worms.ipynb		worms.ipynb

Folders and files

Latest commit

History

Repository files navigation

Multi-timescale transfer-operator pipeline — supplementary code

Pipeline at a glance

Repository layout

Figure-to-script map

Quick start: regenerate all paper figures

Installation

Running on Google Colab

Data shipped in the repository

Pipeline demonstration notebooks

Per-notebook runtime estimates

Citing

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages