- utils.py : Common utility code shared by both training scripts
- train.py : A plain training script without Ray Tune
- train_tune.py : A training script using Ray Tune
- PyTorch
- Ray Tune
- Installation:
$ pip install 'ray[tune]'
$ git clone https://github.com/machine-intelligence-lab/raytune_example.git
<train.py>
$ python3 train.py
<train_tune.py>
$ CUDA_VISIBLE_DEVICES=x,x python3 train_tune.py
- It runs a total of 5 trials, and each trial runs for at most 30 epochs
- It assigns 2 CPUs and 0.25 GPUs to each trial (that is, 4 trials share one GPU)
- It saves logs into ".ray_result/expr_name"
(expr_name looks like 'DEFAULT_yyyy_mm_dd_hh_mm_ss')
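The trial budget and resource split above can be sketched as follows. This is a minimal illustration only: the variable names mirror common Ray Tune arguments (num_samples, resources_per_trial, local_dir), but the actual settings live in train_tune.py, and the block below is plain Python so it runs without Ray installed.

```python
# Hypothetical sketch of the trial/resource settings described above;
# the real configuration is in train_tune.py.
num_samples = 5                              # total number of trials
max_epochs = 30                              # per-trial epoch cap
resources_per_trial = {"cpu": 2, "gpu": 0.25}
local_dir = ".ray_result"                    # logs land in .ray_result/expr_name

# With 0.25 GPU per trial, trials are packed onto a GPU until it is full:
trials_per_gpu = int(1 / resources_per_trial["gpu"])
print(trials_per_gpu)  # 4 trials share one GPU
```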
If TensorBoard is installed, you can visualize the trial results
Command:
$ tensorboard --logdir=.ray_result/expr_name
If you run trials on a remote server, it may be more convenient to copy the experiment directory to your local machine, where a web browser is available