# Unleash the Ray - Random Search

Let's revisit our random search example, but now with Ray.

A lot of this code will be familiar, as we already had our pipeline wrapped in a function.

In [2]:
%load_ext autoreload
%autoreload 2

from dependencies import *
from tuning import *

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


### Let's start Ray

In [4]:
ray.shutdown()
ray.init(num_cpus=5, num_gpus=0, include_dashboard=False)

{'node_ip_address': '192.168.123.68',
 'raylet_ip_address': '192.168.123.68',
 'redis_address': '192.168.123.68:6379',
 'object_store_address': '/tmp/ray/session_2020-11-09_13-00-41_683682_23101/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2020-11-09_13-00-41_683682_23101/sockets/raylet',
 'webui_url': None,
 'session_dir': '/tmp/ray/session_2020-11-09_13-00-41_683682_23101',
 'metrics_export_port': 52169}

After initialisation the [Ray Dashboard](https://docs.ray.io/en/master/ray-dashboard.html) is available at the address reported in **webui_url**. Here it is `None` because we passed `include_dashboard=False`.

## Go Random Go!

To move to a random search configuration, we just need to change our search parameters to use distributions.

In our last example:

```
ray_tuning_config = {
    'randomforestclassifier__n_estimators': tune.grid_search([1,5,15,50,100]),
    'randomforestclassifier__criterion': tune.grid_search(['gini', 'entropy']),
    'randomforestclassifier__max_features': tune.grid_search(['auto', 'sqrt', 'log2']),
    'randomforestclassifier__bootstrap': tune.grid_search([True, False]),
    'randomforestclassifier__min_samples_leaf': tune.grid_search([1,2,3,4]),
    'randomforestclassifier__min_samples_split': tune.grid_search([3,4,5,6])
}
```
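To see why sampling is attractive here, it helps to count how many trials the grid above implies. The sketch below is plain Python (no Ray needed); the parameter names are shortened by dropping the `randomforestclassifier__` prefix purely for readability.

```python
from itertools import product

# Same candidate values as the grid-search config above
grid = {
    "n_estimators": [1, 5, 15, 50, 100],
    "criterion": ["gini", "entropy"],
    "max_features": ["auto", "sqrt", "log2"],
    "bootstrap": [True, False],
    "min_samples_leaf": [1, 2, 3, 4],
    "min_samples_split": [3, 4, 5, 6],
}

# A grid search evaluates every combination of the listed values
combinations = list(product(*grid.values()))
n_grid_trials = len(combinations)
print(n_grid_trials)  # 5 * 2 * 3 * 2 * 4 * 4 = 960
```

A full grid would therefore need 960 trials, whereas a random search lets us pick the trial budget ourselves.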

Replace each `tune.grid_search` with another Tune distribution that we can sample from ([read the docs](https://docs.ray.io/en/latest/tune/api_docs/grid_random.html?highlight=tune.grid#random-distributions-api)).

Choose an appropriate distribution for each parameter type.

In [None]:
# tune.randint(lower, upper) excludes `upper`, so the upper bounds for
# min_samples_leaf and min_samples_split are set one past the largest grid value
ray_tuning_config = {
    'randomforestclassifier__n_estimators': tune.randint(1, 150),
    'randomforestclassifier__criterion': tune.choice(['gini', 'entropy']),
    'randomforestclassifier__max_features': tune.choice(['auto', 'sqrt', 'log2']),
    'randomforestclassifier__bootstrap': tune.choice([True, False]),
    'randomforestclassifier__min_samples_leaf': tune.randint(1, 5),    # samples 1..4
    'randomforestclassifier__min_samples_split': tune.randint(3, 7)    # samples 3..6
}
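To make the sampling behaviour concrete, here is a minimal pure-Python sketch of drawing configurations from a search space of this kind. It does not use Ray: `sample_randint`, `sample_choice` and `sample_config` are hypothetical stand-ins, written on the assumption that `tune.randint(lower, upper)` includes `lower` and excludes `upper`, and that `tune.choice` picks uniformly from its list.

```python
import random

def sample_randint(rng, lower, upper):
    """Hypothetical stand-in for tune.randint: lower inclusive, upper exclusive."""
    return rng.randrange(lower, upper)

def sample_choice(rng, options):
    """Hypothetical stand-in for tune.choice: uniform pick from a list."""
    return rng.choice(options)

# Each entry maps a parameter name to a function that draws one value
search_space = {
    "n_estimators": lambda rng: sample_randint(rng, 1, 150),
    "criterion": lambda rng: sample_choice(rng, ["gini", "entropy"]),
    "bootstrap": lambda rng: sample_choice(rng, [True, False]),
}

def sample_config(space, rng):
    """Draw one value per parameter to form a single trial configuration."""
    return {name: sampler(rng) for name, sampler in space.items()}

rng = random.Random(0)  # seeded so the draws are repeatable
configs = [sample_config(search_space, rng) for _ in range(50)]
print(configs[0])
```

Each of the 50 dictionaries plays the role of one trial's config. Note that with an exclusive upper bound, a call such as `sample_randint(rng, 1, 4)` can only return 1, 2 or 3.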

There is a slight change to the `tune.run` call too: `num_samples` sets how many configurations are drawn from the (non-grid) distributions.

In [None]:
analysis = tune.run(
    e2e_simple_training,
    config=ray_tuning_config,
    num_samples=50,  # number of samples to draw from the (non-grid) distributions
    resources_per_trial=dict(cpu=1, gpu=0),
)
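As a rough back-of-the-envelope check on how this run is parallelised, the sketch below relates the CPUs given to `ray.init` to the per-trial resources. It assumes Tune starts as many trials as the available resources allow; real scheduling is not in lockstep rounds, so `min_rounds` is only a lower-bound illustration.

```python
import math

num_cpus = 5        # from ray.init(num_cpus=5, ...)
cpu_per_trial = 1   # from resources_per_trial=dict(cpu=1, gpu=0)
num_samples = 50    # from tune.run(..., num_samples=50)

# Trials that can hold their CPU reservation at the same time
max_concurrent = num_cpus // cpu_per_trial

# If every trial took equally long, the run would need at least this many "rounds"
min_rounds = math.ceil(num_samples / max_concurrent)

print(max_concurrent, min_rounds)
```

Raising `cpu` in `resources_per_trial` gives each trial more cores but lowers the number of trials that can run side by side.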

In [None]:
from pprint import pprint
print("Best config: ")
# F1 is a score to maximise, so state the optimisation direction explicitly
pprint(analysis.get_best_config(metric="mean_f1_score", mode="max"))

In [None]:
df = analysis.dataframe()
top_n_df = df.nlargest(10, "mean_f1_score")

In [None]:
plot_some_tune_results(top_n_df)

In [None]:
%load_ext tensorboard
from tensorboard import notebook
# tune.run above sets no experiment name, so browse the whole results directory
%tensorboard --logdir "~/ray_results"

In [None]:
ray.shutdown()