This repo is a bootstrap for experiments and includes helper functions and scripts for PyTorch training and the SLURM job scheduler.

Basic idea: create a Python script (the experiment) that accepts command line arguments. Provide arg_lists and generate slurm_jobs from the cross product of the given argument lists.

Quick Start

First, set up your ssh workflow. Then let's kick-start our experiment.

ssh prince
git clone
mv exp.bootstrp my_exp
cd my_exp

First, debug the experiment in an interactive session. The script sourced below loads the required modules; update it as needed. Personally, I use python3 with pip --user packages. Call it with install the first time.

srun -t2:30:00 --mem=5000 --gres=gpu:1 --pty /bin/bash

. ./ install
cd experiments/cifar10/
python --epoch 1

After we are sure that our main script works, we can start creating automated experiments with scripts. The first thing to do is to update some of the SLURM fields in experiments/default_conf.yaml. Replace NET_ID with your net_id, for example, if you are a fellow NYU student using prince. You may need to change this file completely if you are working on another system or have different requirements.
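As a rough illustration of what such SLURM fields usually turn into, the sketch below renders a dictionary of settings as `#SBATCH` header lines. The dictionary keys, the `render_sbatch_header` helper, the job name, and the mail address are assumptions made for illustration (the time, memory, and GPU values are borrowed from the srun call above); this is not the actual layout of default_conf.yaml.

```python
def render_sbatch_header(slurm_fields):
    """Turn a dict of SLURM options into '#SBATCH --key=value' lines."""
    lines = ["#!/bin/bash"]
    for key, value in slurm_fields.items():
        lines.append(f"#SBATCH --{key}={value}")
    return "\n".join(lines)


# Hypothetical settings; replace NET_ID with your own net_id.
slurm_fields = {
    "job-name": "cifar10",
    "time": "2:30:00",
    "mem": "5000",
    "gres": "gpu:1",
    "mail-user": "NET_ID@nyu.edu",
}

header = render_sbatch_header(slurm_fields)
print(header)
```

On another cluster you would most likely adjust the values here (time limit, memory, GPU request) rather than the rendering logic.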


Note that each element under the experiment key in the yaml file is itself a dictionary of argument lists for the script under <exp_name>/. The values in each argument list are combined with those of the other lists in the dictionary by cross product to generate all possible combinations.
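The cross product described above can be sketched with `itertools.product`. This is a minimal illustration rather than the repo's actual implementation: the helper names `expand_arg_lists` and `to_command`, and the `lr` and `batch_size` arguments, are hypothetical.

```python
import itertools


def expand_arg_lists(arg_lists):
    """Yield one dict per combination of the given argument lists."""
    names = list(arg_lists)
    for values in itertools.product(*(arg_lists[name] for name in names)):
        yield dict(zip(names, values))


def to_command(script, args):
    """Format one combination as a command line string."""
    flags = " ".join(f"--{name} {value}" for name, value in args.items())
    return f"python {script} {flags}"


# Hypothetical argument lists: 2 learning rates x 3 batch sizes = 6 jobs.
arg_lists = {"lr": [0.1, 0.01], "batch_size": [32, 64, 128]}
commands = [to_command("main.py", combo) for combo in expand_arg_lists(arg_lists)]
for command in commands:
    print(command)
```

The number of generated jobs is the product of the list lengths, so adding one more list with three values would triple the job count.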

Now we can generate experiment scripts.

cd ../
python --debug

If they all look good, you can create the experiment folder and submit the jobs:

bash /scratch/ue225/my_project/exps/cifar10/cifarLR_03.26/

which would output something like this

Let's say you want to define a new experiment. You would do so by creating a new folder experiments/new_folder/ and a script experiments/new_folder/main.py that is intended to be run. The script should accept the --log_folder and --conf_file flags at minimum. Then you can change exp_name in experiments/default_conf.yaml to new_folder and create new experiments.
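A minimal sketch of the argument handling such a main.py might start from is shown below. Only the --log_folder, --conf_file, and --epoch flag names come from this README; the defaults, help texts, and the `build_parser` helper are assumptions.

```python
import argparse


def build_parser():
    """Build a parser with the two required flags plus one example argument."""
    parser = argparse.ArgumentParser(description="Sketch of an experiment entry point.")
    parser.add_argument("--log_folder", type=str, default="./logs",
                        help="Folder where logs and results are written.")
    parser.add_argument("--conf_file", type=str, default=None,
                        help="Optional path to a configuration file.")
    parser.add_argument("--epoch", type=int, default=1,
                        help="Number of training epochs.")
    return parser


# In a real script, call parse_args() with no arguments to read sys.argv.
# An explicit list is used here so the sketch runs on its own.
args = build_parser().parse_args(["--log_folder", "/tmp/exp_logs", "--conf_file", "conf.yaml"])
print(f"logging to {args.log_folder}, conf file: {args.conf_file}, epochs: {args.epoch}")
```

Any argument that appears in the experiment's argument lists would need a matching `add_argument` call, since each generated job passes its combination as command line flags.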


Visualizing Tensorboard Events

There are several options:

  • You can copy the logs with scp, like
scp -r prince:/scratch/ue225/my_project/exps/cifar10/cifarLR_03.26/tb_logs ./
  • You can open a tunnel to prince, run tensorboard there, and connect to it through port forwarding. See my remote Jupyter and port forwarding notes.
  • You can use sshfs to get the logs synced into your local file system. Details here

and read your results.


I am excited to collaborate and learn from you if you have figured out better ways of experimenting or want to add text/code to this repo. Please create an issue or reach out to me.


  • Change create_experiments so that the defaults are perhaps included in experiment.yaml and dumped.
  • Source code needs to be copied!

