In [14]:
import sys

sys.path.insert(0, "../..")
from pathlib import Path
from src.settings import SearchSpace


First:
- Go to the terminal.
- Go to the ML22 folder; use `cd ~/ML22` or `j ML` if necessary.
- If you plan to close the laptop and let this run by itself, start a tmux session with the command `tmux`. You can name the session with `ctrl+b $`, detach from it with `ctrl+b d`, and reattach with `tmux a -t 0` (replace the `0` with the session name if you gave it one). See [tmux](https://github.com/tmux/tmux/wiki/Getting-Started) for more info.
- That folder contains the file `hypertune.py`. Run it with `poetry run python hypertune.py`.
- The `hypertune.py` file specifies a `tune_dir`: `models/ray`. We will check the contents of that folder after the hypertune run has finished. You can also use TensorBoard to check the results.

 0.0992908 |           119 |            3 |     26 |         576.744  |    0 |   0.942187

In [16]:
tune_dir = Path("../../models/ray")
tune_dir.exists()


True
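
Before loading the results with Ray, it can help to peek at what the run wrote to `tune_dir`. The following is a minimal sketch using only `pathlib`. The helper name `list_subdirs` is hypothetical, and the exact folder layout depends on the Ray version, so treat the output as exploratory.

```python
from pathlib import Path


def list_subdirs(folder: Path) -> list[str]:
    """Return the names of the direct subdirectories of `folder`, sorted."""
    if not folder.is_dir():
        # Missing folder (or a plain file): nothing to list.
        return []
    return sorted(child.name for child in folder.iterdir() if child.is_dir())


# Same relative path as in the cell above; an empty list means nothing was found.
print(list_subdirs(Path("../../models/ray")))
```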

In [18]:
from ray.tune import ExperimentAnalysis
import ray
ray.init(ignore_reinit_error=True)

analysis = ExperimentAnalysis(tune_dir)


2022-12-05 19:00:18,528	INFO worker.py:1360 -- Calling ray.init() again after it has already been called.
2022-12-05 19:00:18,545	INFO experiment_analysis.py:795 -- No `self.trials`. Drawing logdirs from checkpoint file. This may result in some information that is out of sync, as checkpointing is periodic.


The analysis exposes the results as a dataframe. These are the available columns:

In [19]:
analysis.results_df.columns


Index(['iterations', 'train_loss', 'test_loss', 'Accuracy', 'time_this_iter_s',
       'done', 'timesteps_total', 'episodes_total', 'training_iteration',
       'experiment_id', 'date', 'timestamp', 'time_total_s', 'pid', 'hostname',
       'node_ip', 'time_since_restore', 'timesteps_since_restore',
       'iterations_since_restore', 'warmup_time', 'experiment_tag',
       'config/input_size', 'config/output_size', 'config/tune_dir',
       'config/data_dir', 'config/hidden_size', 'config/dropout',
       'config/num_layers'],
      dtype='object')

Let's focus on the parameters we wanted to tune.

In [32]:
import plotly.express as px

plot = analysis.results_df
select = ["Accuracy", "config/hidden_size", "config/dropout", "config/num_layers"]
p = plot[select].reset_index().dropna()
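
The `select` list above is written by hand. As a sketch, the same list can be derived from the column names by keeping everything with the `config/` prefix and dropping the settings that were not searched over. Which settings count as fixed is an assumption here, based on the parameters this notebook says it wanted to tune.

```python
# Column names copied from the `analysis.results_df.columns` output (abbreviated).
columns = [
    "train_loss", "test_loss", "Accuracy",
    "config/input_size", "config/output_size", "config/tune_dir",
    "config/data_dir", "config/hidden_size", "config/dropout",
    "config/num_layers",
]

# Assumption: these settings were fixed for the run rather than tuned.
fixed = {"config/input_size", "config/output_size", "config/tune_dir", "config/data_dir"}

# Keep only the config columns that were part of the search.
tuned = [c for c in columns if c.startswith("config/") and c not in fixed]
select = ["Accuracy", *tuned]
print(select)
```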


Let's sort by accuracy (ascending, so the best trials end up last).

In [33]:
p.sort_values("Accuracy", inplace=True)

Make a parallel coordinates plot, colored by accuracy.

In [34]:
px.parallel_coordinates(p, color="Accuracy")



iteritems is deprecated and will be removed in a future version. Use .items instead.



Get the best trial, here selected on the lowest test loss:

In [35]:
analysis.get_best_trial(metric="test_loss", mode="min")


train_ddbb95a8

The top ten trials by accuracy (the last ten rows, because `p` is sorted ascending):

In [37]:
p[-10:]


| | trial_id | Accuracy | config/hidden_size | config/dropout | config/num_layers |
|---|---|---|---|---|---|
| 42 | dcea1fe6 | 0.921875 | 125 | 0.05373 | 4 |
| 38 | db0306c0 | 0.921875 | 125 | 0.03694 | 2 |
| 36 | d9439ca0 | 0.923438 | 97 | 0.151226 | 4 |
| 46 | dfda7f84 | 0.923438 | 123 | 0.297968 | 2 |
| 47 | e07332b0 | 0.9375 | 126 | 0.112087 | 4 |
| 48 | f7dbd542 | 0.942187 | 127 | 0.296891 | 2 |
| 14 | 39894416 | 0.951562 | 124 | 0.028091 | 4 |
| 2 | 0707ac58 | 0.953125 | 119 | 0.031863 | 2 |
| 23 | cedc5414 | 0.964063 | 102 | 0.083409 | 2 |
| 43 | ddbb95a8 | 0.96875 | 127 | 0.278089 | 4 |
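
The `metric` and `mode` arguments passed to `get_best_trial` earlier boil down to taking a minimum or maximum over the trials. Below is a plain-Python sketch of that idea, using three rows copied from the table above. `best_row` is a hypothetical helper and not part of Ray; Ray's own method also has a `scope` argument that decides which reported value per trial is compared, which this sketch ignores.

```python
# Three rows copied from the top-ten table (trial_id and Accuracy only).
rows = [
    {"trial_id": "0707ac58", "Accuracy": 0.953125},
    {"trial_id": "cedc5414", "Accuracy": 0.964063},
    {"trial_id": "ddbb95a8", "Accuracy": 0.96875},
]


def best_row(rows, metric, mode):
    """Pick the row with the lowest (mode="min") or highest (mode="max") metric."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    pick = min if mode == "min" else max
    return pick(rows, key=lambda row: row[metric])


print(best_row(rows, metric="Accuracy", mode="max")["trial_id"])  # ddbb95a8
```

Use `mode="min"` for losses and `mode="max"` for scores such as accuracy; mixing them up silently returns the worst trial.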


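Or get the best config directly, this time selecting on the highest accuracy. It matches trial `ddbb95a8` from the table above: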

In [39]:
analysis.get_best_config(metric="Accuracy", mode="max")


{'input_size': 3,
 'output_size': 20,
 'tune_dir': PosixPath('/Users/raoulgrouls/code/ML22/models/ray'),
 'data_dir': PosixPath('/Users/raoulgrouls/code/ML22/data/external/gestures-dataset'),
 'hidden_size': 127,
 'dropout': 0.27808873419311075,
 'num_layers': 4}
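
If you want to reuse the best config later, one option is to store it as JSON. The returned dictionary contains `PosixPath` values, which `json` cannot encode by default. The sketch below works on a stand-in dictionary with a shortened placeholder path (not the real one) and falls back to `str()` for such values.

```python
import json
from pathlib import Path

# Stand-in for the dictionary returned above; the path is a shortened placeholder.
best_config = {
    "input_size": 3,
    "output_size": 20,
    "tune_dir": Path("models/ray"),
    "hidden_size": 127,
    "dropout": 0.27808873419311075,
    "num_layers": 4,
}

# json cannot encode Path objects, so use str() for anything it does not know.
as_json = json.dumps(best_config, indent=2, default=str)
print(as_json)

# Round trip: numbers survive as numbers, paths come back as plain strings.
restored = json.loads(as_json)
```

After loading, wrap the path entries in `Path(...)` again if the training code expects `Path` objects.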