# Tutorial: distributed computations on Kubernetes

This short tutorial explains how to run the hyperparameter optimisation (HPO) across multiple servers. It requires a Kubernetes cluster: refer to the `kubernetes/readme.md` file, which explains how to set one up (a few lines of code are enough, as all the configuration files are ready).

In [None]:
# auto-reload in notebooks
%reload_ext autoreload
%autoreload 2
%matplotlib inline

# constants: if you are running the K8 setup proposed in giotto-deep, do not change these values!
USER = "root"
MYSQL_IP = "mysql-service"  # "127.0.0.1", if you run locally on your infrastructure
PSW = "password"
REDIS_IP = "redis-service"  # "localhost", if you run locally on your infrastructure

## General principles of distributing computations

In this notebook we show how to distribute the HyperParametersOptimisation computations across different pods (i.e. computing servers).

There are two databases involved: one is **Redis** and is used to queue the computations, while the other is **MySQL** and it is used to keep track of the state of the HPO.

In a local setting, in which neither `K8` nor `minikube` is running, you would start MySQL with the following command:

```
docker run --name=user_mysql_1 --env="MYSQL_ROOT_PASSWORD=password" -p 3306:3306 -d mysql:latest
```

Similarly, to make sure that Redis is also running:

```
redis-server
```

Of course, both Redis and MySQL have to be installed (for MySQL we use a docker image, while we installed Redis via `brew install redis` on macOS).

Each worker will then be able to run an HPO; hence, each HPO run also has to be connected to MySQL.
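Before connecting from the notebook, it can help to check that both services accept TCP connections. The following is a minimal sketch using only the standard library; the helper name `is_reachable` is hypothetical, and the hosts and ports are assumptions based on the local setup above (MySQL on 3306, Redis on 6379).

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


# Assumed local endpoints; replace the hosts with "mysql-service" and
# "redis-service" when running inside the cluster.
for name, host, port in [("MySQL", "127.0.0.1", 3306), ("Redis", "127.0.0.1", 6379)]:
    state = "reachable" if is_reachable(host, port) else "NOT reachable"
    print(f"{name} at {host}:{port} is {state}")
```

A successful check only shows that something is listening on the port; it does not validate the credentials.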


**addendum**: 
 - to stop MySQL

```
/usr/local/bin/mysql.server stop
```
 - to make sure that the account to log in to MySQL is `root:password`, run

```
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';
```


In [None]:
# load RQ
from redis import Redis
from rq import Queue
from rq import Retry

# needed! make sure the file parallel_hpo.py exists in the cwd
from parallel_hpo import connect_to_mysql, run_hpo_parallel, test_fnc

# connect to mysql
connect_to_mysql(USER, PSW, MYSQL_IP)
print("MySQL connected")

# connect to redis
redis = Redis(host=REDIS_IP, port=6379, db=0)
print("Redis connected")



## Enqueuing principles

To distribute computations, we use queues. All the computations are stored in a queue (in Redis!) and, as soon as a worker is available, it starts crunching the next job (in FIFO order).
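The queue-and-workers idea can be sketched without Redis at all. The example below is an illustration only, not how RQ is implemented: an in-memory `queue.Queue` stands in for Redis, threads stand in for the workers, and the names `run_workers` and `shout` are hypothetical.

```python
import queue
import threading


def run_workers(jobs, n_workers=2):
    """Process (func, arg) pairs with `n_workers` threads.

    Returns a list of (job_index, result) tuples in completion order.
    """
    job_queue = queue.Queue()  # FIFO: first job in, first job out
    for index, (func, arg) in enumerate(jobs):
        job_queue.put((index, func, arg))

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                index, func, arg = job_queue.get_nowait()
            except queue.Empty:
                return  # nothing left to do, the worker stops
            outcome = func(arg)
            with lock:
                results.append((index, outcome))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def shout(text):
    return text.upper()


# With a single worker, jobs complete in the order they were enqueued.
done = run_workers([(shout, "a"), (shout, "b"), (shout, "c")], n_workers=1)
print(done)
```

Unlike these threads, which exit once the queue is empty, real workers keep running and wait for new jobs to arrive.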

In [None]:
# prepare the job queue
q = Queue(connection=redis)
job = q.enqueue(test_fnc, "hello!", retry=Retry(max=3))
job

So far you have enqueued one job: now you have to start the workers so that the jobs can be crunched!

In our `minikube` or `K8` setup the workers are set up automatically. If you are running your infrastructure from scratch locally, you need to fire up some workers (each from a different terminal, or moving the job to the background with `&`):

```
rq worker --url <redis-url> high default low
```

`<redis-url>` would most probably be `redis://127.0.0.1:6379` (or the same with `localhost`), since `--url` expects a full Redis URL. There is no need for this if you use the set-up we provide.

To monitor the workers and the jobs, you can run the dashboard with:

```
rq-dashboard
```


In [None]:
# you need to wait a bit before being able to see the result
job.result
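
Instead of re-running the cell by hand until the result appears, the job can be polled. This is a minimal sketch, assuming the job object exposes `is_finished`, `is_failed` and `result` as RQ jobs do; the helper name `wait_for_result` and the default timings are hypothetical.

```python
import time


def wait_for_result(job, timeout=30.0, poll_interval=0.5):
    """Poll a job-like object until it finishes, fails, or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if job.is_finished:
            return job.result
        if job.is_failed:
            raise RuntimeError("the job failed; check the worker logs")
        time.sleep(poll_interval)
    raise TimeoutError(f"no result after {timeout} seconds; is a worker running?")


# Usage in the notebook (needs a running worker):
# print(wait_for_result(job))
```

A timeout here usually means that no worker is listening on the queue, rather than that the job itself is slow.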

## Enqueuing the jobs for HPO

In the next section we enqueue the HPO; make sure that the workers are actively crunching the jobs! If more than one worker is active, the jobs get distributed among them.

But how does `optuna` know how to distribute the computations? This is what the MySQL database is for.

You can set up multiple workers to run the HPO in parallel; optuna stores the data of each run in a MySQL database every time a trial finishes. Every time a new trial starts, the database is read and, depending on the HPO technique, a new set of hyperparameters is chosen and recorded.
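The shared-database principle can be illustrated with a toy example. This is not optuna's implementation: a temporary SQLite file stands in for MySQL, threads stand in for the pods, and the sampling rule (perturb the best point seen so far) is made up for the illustration. All names here are hypothetical.

```python
import os
import random
import sqlite3
import tempfile
import threading


def objective(x):
    """Toy objective to minimise; its minimum is at x = 2."""
    return (x - 2.0) ** 2


def worker(db_path, worker_id, n_trials, seed):
    rng = random.Random(seed)
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        for _ in range(n_trials):
            # Read the shared history: trials from all workers are visible.
            rows = conn.execute(
                "SELECT x FROM trials ORDER BY value ASC LIMIT 1"
            ).fetchall()
            if rows:
                x = rows[0][0] + rng.gauss(0.0, 1.0)  # explore near the best
            else:
                x = rng.uniform(-10.0, 10.0)  # no history yet
            value = objective(x)
            with conn:  # commits the insert so other workers can see it
                conn.execute(
                    "INSERT INTO trials (worker, x, value) VALUES (?, ?, ?)",
                    (worker_id, x, value),
                )
    finally:
        conn.close()


def run_shared_search(n_workers=2, n_trials=20):
    """Run the toy search and return all (worker, x, value) rows."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(db_path)
        with conn:
            conn.execute("CREATE TABLE trials (worker INTEGER, x REAL, value REAL)")
        conn.close()
        threads = [
            threading.Thread(target=worker, args=(db_path, i, n_trials, i))
            for i in range(n_workers)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        conn = sqlite3.connect(db_path)
        rows = conn.execute("SELECT worker, x, value FROM trials").fetchall()
        conn.close()
        return rows
    finally:
        os.remove(db_path)


trials = run_shared_search(n_workers=2, n_trials=20)
best = min(trials, key=lambda row: row[2])
print(f"{len(trials)} trials in total, best x = {best[1]:.3f}")
```

The workers never talk to each other directly: the database is the only shared state, which is why every HPO job needs the MySQL credentials.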

In [None]:

# enqueue hpo jobs
hpo_job = q.enqueue(run_hpo_parallel, args=(USER, PSW, MYSQL_IP), retry=Retry(max=3))
hpo_job2 = q.enqueue(run_hpo_parallel, args=(USER, PSW, MYSQL_IP), retry=Retry(max=3))
print("jobs enqueued")

As before, make sure there are some active workers to crunch the jobs!