yadapo/DiSPAH

DiSPAH

Code for the paper:

Decomposing heterogeneity in disease progression speeds and pathways
Yuichiro Yada, Naoki Honda
npj Digital Medicine (2026) — https://www.nature.com/articles/s41746-026-02665-8

DiSPAH (Disease progression Speed and Pathway Analysis based on Hidden Markov model) is a machine learning framework that models individual patient disease progression using an individual-progression-speed continuous-time hidden Markov model (IPS-CTHMM). Applied to ALS (ALSFRS-R longitudinal scores), it simultaneously infers each patient's latent disease state trajectory and progression speed.
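
As a rough illustration of what an individual progression speed could mean in a continuous-time hidden Markov model, the sketch below assumes that a patient's speed multiplies a shared transition-rate matrix, so the transition probabilities over an interval `dt` are `expm(speed * Q * dt)`. This parameterization, the 3-state matrix `Q`, and the function name `transition_probs` are illustrative assumptions and are not taken from IPSCTHMM_model.py.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state progressive chain (rates per month); each row sums to zero.
Q = np.array([
    [-0.10,  0.10,  0.00],
    [ 0.00, -0.05,  0.05],
    [ 0.00,  0.00,  0.00],   # absorbing final state
])

def transition_probs(Q, speed, dt):
    """Transition probabilities over an interval dt for a patient whose
    individual speed rescales the shared rate matrix (assumed form)."""
    return expm(speed * Q * dt)

slow = transition_probs(Q, speed=0.5, dt=6.0)
fast = transition_probs(Q, speed=2.0, dt=6.0)

# Probability of still being in the initial state after 6 months.
print(slow[0, 0], fast[0, 0])
```

Under this assumption, two patients share the same pathway structure (the pattern of nonzero rates) while the speed only stretches or compresses time.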


Repository structure

| File | Description |
| --- | --- |
| IPSCTHMM_model.py | Core IPS-CTHMM model (EM algorithm, forward-backward) |
| ALSdataread.py | Data loading and preprocessing for the AnswerALS and PRO-ACT datasets |
| AnswerALS_DiSPAH.py | Main DiSPAH analysis on the AnswerALS cohort |
| AnswerALS_DiSPAH_twostage.py | Two-stage model fitting (population-wide parameters and individual speeds) |
| AnswerALS_posthoc_analysis.py | Speed/cluster associations with transcriptomics and proteomics |
| PROACT_DiSPAH.py | Cross-cohort validation on PRO-ACT |
| AnswerALS_prediction.py | Prediction of speed and pathway from baseline features |
| IPSCTHMM_simulator.py | Simulator for synthetic data experiments |
| stratification.py | Trajectory clustering (DTW + hierarchical clustering) |
| num_state_selection.py | Selection of the number of latent states via cross-validation |
| compute_state_characteristics.py | State characterization table (domain-level expectations) |
| PROACT_first3_prediction.py | PRO-ACT holdout prediction using the first 3 visits |
| ENCALS_prediction_restrictPred.py | Survival prediction comparison with the ENCALS score |
| ENCALS_prediction_milestones_restrictPred.py | Functional milestone prediction comparison with ENCALS |
| survival_analysis.py | Kaplan-Meier and Cox proportional hazards analyses (overall survival) |
| survival_functional_milestone_analysis.py | Milestone-based survival analysis by ALSFRS-R domain |
| extract_discordant_state_sequences.py | Extract latent state sequences for discordant patients |
| speed_slope_discordant_analysis.py | Identification and analysis of speed-slope discordant patients |
| simdata_interval_exp.py | Simulation experiment varying observation intervals |
| replot_simdata_pathway_speed_recovery.py | Plot speed/pathway recovery curves from simulation outputs |
| recompute_speed_metrics_spearman_and_scatter.py | Recompute speed recovery metrics from saved simulation outputs |
| visualization_same_start_different_speed.py | Visualize patient pairs with similar baseline but different speed |
| plot_predcomparion.py | Prediction comparison bar chart |
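
stratification.py is described above as trajectory clustering with DTW and hierarchical clustering. The following is a minimal sketch of that general technique, not the repository's implementation: it uses a plain dynamic-programming DTW (the requirements list fastdtw, which is not used here), toy state trajectories, and average linkage, all of which are assumptions chosen for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Hypothetical latent-state trajectories of unequal length (toy values).
trajectories = [
    [0, 0, 1, 2, 3],
    [0, 1, 2, 3],
    [0, 0, 0, 4, 5, 5],
    [0, 4, 5],
]

# Pairwise DTW distance matrix (symmetric, zero diagonal).
k = len(trajectories)
dist = np.zeros((k, k))
for i in range(k):
    for j in range(i + 1, k):
        dist[i, j] = dist[j, i] = dtw_distance(trajectories[i], trajectories[j])

# Average-linkage hierarchical clustering on the condensed distance matrix,
# cut so that at most two clusters remain.
Z = linkage(squareform(dist), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

DTW lets trajectories that visit the same states at different paces land close together, which is why it pairs naturally with a model that treats speed and pathway separately.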

Code for each figure

Fig. 2 — DiSPAH applied to AnswerALS cohort

Model training and latent state/pathway estimation:

python AnswerALS_DiSPAH.py

Speed and cluster associations with clinical features (sex, age, riluzole, mutations):

python AnswerALS_posthoc_analysis.py

Overall survival analysis (KM curves, Cox regression, forest plots):

python survival_analysis.py \
  --dispah_csv AnswerALS_covar_estimated_results.csv \
  --outdir survival_out
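
For readers unfamiliar with the Kaplan-Meier curves mentioned above, here is a minimal sketch of the estimator itself on toy data. It is independent of survival_analysis.py; the function name `kaplan_meier` and the example durations and event indicators are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct observed event time.

    durations: follow-up time per patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        died = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - died / at_risk   # product-limit update
        curve.append((t, survival))
    return curve

# Toy data (hypothetical): months of follow-up and event indicators.
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]

for t, s in kaplan_meier(durations, events):
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a drop in the curve.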

Fig. 3 — Cross-cohort validation and prediction

DiSPAH applied to PRO-ACT cohort:

python PROACT_DiSPAH.py

Fig. 4 & 5 — Association analysis with genetic information and omics data of patient-derived motor neurons.

Speed and cluster associations with genetic mutations and with transcriptomic and proteomic data:

python AnswerALS_posthoc_analysis.py

Fig. 6 — Prediction of progression speeds and pathways from information available at the first medical visit.

Leave-one-out cross-validation:

python AnswerALS_prediction.py \
  --num_genes 0
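
The leave-one-out scheme can be sketched as follows. The regressor used in AnswerALS_prediction.py is not described in this README, so ordinary least squares stands in for it here; the synthetic data, the feature count, and the function name `loo_predictions` are likewise assumptions.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out predictions from ordinary least squares with an intercept."""
    X = np.column_stack([np.ones(len(X)), X])
    preds = np.empty(len(y))
    for i in range(len(y)):
        train = np.arange(len(y)) != i          # hold out patient i
        coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        preds[i] = X[i] @ coef                  # predict the held-out patient
    return preds

# Synthetic stand-in data: 20 "patients", 2 baseline features, noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = 1.0 + X @ np.array([0.8, -0.5]) + rng.normal(scale=0.1, size=20)

preds = loo_predictions(X, y)
print("LOO correlation:", np.corrcoef(y, preds)[0, 1])
```

Each patient's prediction comes from a model that never saw that patient, which matters for cohorts too small to afford a separate test split.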

Table 1 — Benchmark comparison of ENCALS-like Cox models and DiSPAH-derived features.

Comparison with ENCALS score for survival and functional milestone prediction:

python ENCALS_prediction_restrictPred.py \
  --out_dir encals_pred_out
python ENCALS_prediction_milestones_restrictPred.py \
  --out_dir encals_milestone_out

Supplementary figures and tables

Speed-slope discordant patient analysis (Supplementary Fig. 2 & 3)

python speed_slope_discordant_analysis.py \
  --est_csv AnswerALS_covar_estimated_results.csv \
  --outdir discordant_out

Robustness of relative progression speed estimates to fixing the transition-rate matrix (Supplementary Fig. 4)

python AnswerALS_DiSPAH_twostage.py

Patient pairs with similar baseline but different speeds (Supplementary Fig. 5)

python visualization_same_start_different_speed.py \
  --est-results-csv AnswerALS_covar_estimated_results.csv

Simulation-based validation of patient-specific information estimation (Supplementary Fig. 7)

python simdata_interval_exp.py \
  --outdir sim_interval_exp
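
To make the role of observation intervals concrete, the sketch below simulates the hidden states of a strictly progressive chain and reads them out at a visit schedule. It is not IPSCTHMM_simulator.py: the rates, the visit times, the assumption that speed multiplies every rate, and the function name `simulate_states` are all hypothetical, and the observation model (ALSFRS-R scores) is omitted.

```python
import random

def simulate_states(rates, speed, visit_times, rng):
    """Simulate a progressive chain 0 -> 1 -> ... and read it out at visit times.

    rates[k] is the baseline rate of leaving state k; the patient's speed
    multiplies every rate (assumed form), so holding times shrink as speed grows.
    """
    # Draw the time at which each state is left.
    t, exit_times = 0.0, []
    for rate in rates:
        t += rng.expovariate(speed * rate)
        exit_times.append(t)
    # The state at a visit is the number of exits that have already happened.
    return [sum(1 for e in exit_times if e <= v) for v in visit_times]

rng = random.Random(0)
rates = [0.10, 0.05, 0.08]            # hypothetical baseline rates per month
visits = [0, 3, 6, 12, 24, 48]        # hypothetical visit schedule in months

print("slow:", simulate_states(rates, 0.5, visits, rng))
print("fast:", simulate_states(rates, 2.0, visits, rng))
```

Widening the gaps in `visits` hides more transitions between consecutive observations, which is the kind of degradation a simulation over observation intervals would probe.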

ALSFRS-R domain-specific functional milestone analysis (Supplementary Fig. 8)

python survival_functional_milestone_analysis.py \
  --dispah_csv AnswerALS_covar_estimated_results.csv \
  --outdir milestone_out

Holdout prediction on PRO-ACT using first 3 visits (speed-CTHMM vs uniform-CTHMM; Supplementary Fig. 10)

python PROACT_first3_prediction.py

Characteristics of the estimated latent disease states (Supplementary Table 2)

python compute_state_characteristics.py \
  --out-dir AnswerALS_DiSPAH

Sensitivity analysis for the standard deviation of the speed prior (Supplementary Table 4)

python AnswerALS_DiSPAH_twostage.py

The remaining supplementary figures and tables are generated as byproducts of the code for the main figures and tables.


Requirements

easydict==1.10
fastdtw==0.3.4
gseapy==1.0.5
jax==0.4.13
jaxlib==0.4.13
matplotlib==3.7.2
mygene==3.2.2
numpy==1.25.1
numpyro==0.12.1
pandas==2.0.3
scikit-learn==1.3.0
scipy==1.11.1
seaborn==0.12.2
statsmodels==0.14.0

Install with:

pip install -r requirements.txt
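
Because the versions are pinned exactly, it can help to confirm that the active environment matches before running the analyses. The following is a small optional sketch using only the standard library; the helper names `parse_pins` and `check_pins` are hypothetical and not part of this repository.

```python
from importlib import metadata

def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins

def check_pins(pins):
    """Return {package: (wanted, installed or None)} for every mismatch."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches

# A few of the pins listed above, inlined so the sketch runs without the file.
pins = parse_pins("""
jax==0.4.13
numpy==1.25.1
scipy==1.11.1
""")

for name, (wanted, installed) in check_pins(pins).items():
    print(f"{name}: wanted {wanted}, found {installed}")
```

To check the full list, pass the contents of requirements.txt to `parse_pins` instead of the inlined string.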

Data availability

This code is designed for the AnswerALS and PRO-ACT datasets. Both require separate data access applications.


Citation

@article{yada2026dispah,
  title   = {Decomposing heterogeneity in disease progression speeds and pathways},
  author  = {Yada, Yuichiro and Honda, Naoki},
  journal = {npj Digital Medicine},
  year    = {2026},
  url     = {https://www.nature.com/articles/s41746-026-02665-8}
}
