Model Training and Evaluation Pipeline

This repository contains the code used to train, evaluate, and post-process machine learning models across multiple configurations and random seeds.

The pipeline is designed to support large-scale experimentation and comparative analysis across disease cohorts, timepoints, and model settings.

The final trained models and the complete MISPA dataset are available for download at the following link: https://drive.google.com/drive/u/1/folders/1jbK-60344zbNYwgxpiV_oz9djLEKwKF6

Overview

Trains models across multiple configurations
Uses 100 random seeds per configuration for robustness
Produces final trained models and aggregated evaluation results
Supports downstream analysis grouped by disease cohort, train/test timepoint, and model configuration

Running the Pipeline

To train models for all configurations over 100 random seeds, run:

python main.py

This will generate all intermediate outputs and trained models required for analysis.
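The loop that main.py runs over configurations and seeds can be pictured roughly as follows. This is only a sketch under stated assumptions: the cohort and timepoint names, train_one, and run_all are hypothetical placeholders and are not taken from the repository; only the count of 100 seeds per configuration comes from the text above.

```python
import itertools
import random

# Hypothetical configuration grid; the real settings are defined in the repository code.
COHORTS = ["cohort_a", "cohort_b"]
TIMEPOINTS = ["t0", "t1"]
N_SEEDS = 100  # 100 seeds per configuration, as stated above


def train_one(cohort, timepoint, seed):
    """Placeholder for one training run; returns a made-up score."""
    rng = random.Random(seed)  # seed-local RNG, so the same seed gives the same result
    return {"cohort": cohort, "timepoint": timepoint, "seed": seed, "score": rng.random()}


def run_all(cohorts=COHORTS, timepoints=TIMEPOINTS, n_seeds=N_SEEDS):
    """Run every (cohort, timepoint) configuration once per seed."""
    results = []
    for cohort, timepoint in itertools.product(cohorts, timepoints):
        for seed in range(n_seeds):
            results.append(train_one(cohort, timepoint, seed))
    return results
```

Calling run_all() on this toy grid performs 2 x 2 x 100 runs; the number of runs in the real pipeline depends on how many configurations the repository defines.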

Requirements

All required Python packages are listed in requirements.txt. Install dependencies with:

pip install -r requirements.txt

The trained models available at the download link above correspond to the full set of configurations and random seeds used in the study.

Post-processing and Analysis

The notebook "leidos final.ipynb" is used to post-process and analyze model outputs. Results are aggregated and grouped by:

Disease cohort
Train/test timepoint
Model configuration

This notebook generates the final summaries used for evaluation.
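The grouping described in the post-processing section can be illustrated with a small standard-library sketch. It assumes each run is stored as a record with cohort, timepoint, config, seed, and score fields; these field names, the summarize helper, and the toy records are hypothetical and do not reflect the notebook's actual schema or results.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(runs):
    """Group per-seed scores by (cohort, timepoint, config) and report mean and std."""
    groups = defaultdict(list)
    for run in runs:
        key = (run["cohort"], run["timepoint"], run["config"])
        groups[key].append(run["score"])

    summary = {}
    for key, scores in groups.items():
        summary[key] = {
            "n_seeds": len(scores),
            "mean": mean(scores),
            # stdev requires at least two values, so fall back to 0.0 for a single run
            "std": stdev(scores) if len(scores) > 1 else 0.0,
        }
    return summary


# Toy records standing in for real per-seed outputs (values are invented).
example_runs = [
    {"cohort": "A", "timepoint": "t0", "config": "base", "seed": 0, "score": 0.70},
    {"cohort": "A", "timepoint": "t0", "config": "base", "seed": 1, "score": 0.80},
    {"cohort": "B", "timepoint": "t1", "config": "base", "seed": 0, "score": 0.60},
]
example_summary = summarize(example_runs)
```

Keying the summary on the full (cohort, timepoint, config) tuple keeps the three grouping dimensions separate, so any one of them can be compared while the others are held fixed.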

Reproducibility

All experiments are run using fixed random seeds
Each configuration is evaluated over 100 independent seeds
Trained models and outputs are saved for reproducibility and downstream analysis

Notes

Ensure sufficient compute and storage before running the full pipeline
Downloading pre-trained models is recommended if retraining is not feasible
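As an illustration of the fixed-seed approach described under Reproducibility, a seeding helper might look like the sketch below. The set_seed function is hypothetical, and it is an assumption that the pipeline relies on NumPy or PyTorch; both are seeded only if they happen to be installed.

```python
import random


def set_seed(seed):
    """Seed the random number generators that are available in this environment."""
    random.seed(seed)
    try:
        import numpy as np  # assumption: NumPy may be used by the pipeline
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # assumption: PyTorch may be used by the pipeline
        torch.manual_seed(seed)
    except ImportError:
        pass


# The same seed reproduces the same draw from the standard-library generator.
set_seed(7)
first_draw = random.random()
set_seed(7)
second_draw = random.random()
```

Calling such a helper once at the start of each run, with the seed recorded alongside the saved model, is one common way to make an individual run repeatable.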
