# ml_workflow

A skeleton/template for an ML research codebase, accompanying this blog post on my ML Experiment Workflow.

Repo overview (adapted from Cookiecutter Data Science):

```
├── README.md          <- The README for developers using this project.
├── requirements.txt   <- The requirements file for reproducing the analysis environment.
├── env.sh             <- Script to define and set environment variables.
│
├── data
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained and serialized models, model predictions, or model summaries.
│
├── my_project         <- Source code; replace with your project name.
│   ├── __init__.py    <- Makes my_project a Python package.
│   │
│   ├── common         <- Modules shared by different parts of the project.
│   │   └── utils.py
│   │
│   ├── data           <- Scripts to download or generate data.
│   │   ├── config.py
│   │   └── make_dataset.py
│   │
│   └── model          <- Scripts to train and test models.
│       ├── config.py
│       ├── test.py
│       └── train.py
│
└── experiments        <- Configs and scripts for running experiments.
    ├── agent.sbatch   <- sbatch file to run jobs on SLURM.
    └── sweep.py       <- Run hyperparameter sweeps.
```
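
Each submodule keeps its settings in its own `config.py`, next to the code it configures. For illustration only, a `my_project/model/config.py` in this layout might look like the sketch below; the field names and the `DATA_DIR`/`MODELS_DIR` environment variable names are placeholder assumptions, not the repo's actual contents.

```python
# my_project/model/config.py -- hypothetical sketch, not the repo's actual file
import os
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # Paths read from environment variables exported by env.sh;
    # DATA_DIR and MODELS_DIR are assumed names, substitute your own.
    data_dir: str = os.environ.get("DATA_DIR", "data/processed")
    models_dir: str = os.environ.get("MODELS_DIR", "models")
    # Example hyperparameters with defaults; override per experiment.
    learning_rate: float = 1e-3
    batch_size: int = 32
    num_epochs: int = 10


config = TrainConfig()
```

`train.py` and `test.py` can then import `config` from this module instead of hard-coding paths and hyperparameters.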

Setup:

```sh
# Get dependencies
conda create -n workflow python=3.9
conda activate workflow
pip install -r requirements.txt

# Fill in the environment variables in `env.sh`, then apply them
source env.sh
```

Example usage:

```sh
# Train a single model
python -m my_project.model.train

# Run hyperparameter sweep
python experiments/sweep.py
```
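
For a sense of how `experiments/sweep.py` might drive a sweep, one common pattern is to expand a hyperparameter grid and launch one training run per configuration. The sketch below is a hedged illustration under two assumptions that are not confirmed by the repo: the grid values are invented, and `train.py` is assumed to accept `--key=value` overrides.

```python
# experiments/sweep.py -- hypothetical sketch; grid values and CLI flags are assumptions
import itertools
import subprocess

# Invented hyperparameter grid for illustration.
GRID = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32],
}

for values in itertools.product(*GRID.values()):
    overrides = dict(zip(GRID.keys(), values))
    args = [f"--{key}={value}" for key, value in overrides.items()]
    # Run one training job per configuration; on a SLURM cluster you might
    # instead submit each configuration via `sbatch experiments/agent.sbatch`.
    subprocess.run(["python", "-m", "my_project.model.train", *args], check=True)
```

Running configurations serially via `subprocess.run` keeps the sketch simple; on a cluster, swapping the launch command for an `sbatch` submission lets SLURM schedule the runs in parallel.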