# Hyperparameter optimization
---
This notebook uses Cirrus to optimize the hyperparameters of a logistic regression model.

## Setup
---

In [1]:
# To ease development, each time a cell is run, all modules will be reloaded.
%load_ext autoreload
%autoreload 2

In [2]:
import logging
import sys
import atexit

In [3]:
# Cirrus produces logs, but they will not show unless we add a handler that prints.
from cirrus import utilities
utilities.set_logging_handler()

[     _initialize | ResourceManager] Initializing no-retries Lambda client.
[     _initialize | ResourceManager] Initializing IAM resource.
[     _initialize | ResourceManager] Initializing EC2 resource.
[     _initialize | ResourceManager] Initializing Cloudwatch Logs client.
[     _initialize | ResourceManager] Initializing S3 resource.
[     _initialize | ResourceManager] Initializing STS client.


In [4]:
from cirrus import instance, automate, lr, GridSearch

## Instances, base task configuration, hyperparameters
---

First, we start some EC2 instances.

In [5]:
NUM_INSTANCES = 2

instances = []
for i in range(NUM_INSTANCES):
    inst = instance.Instance(
        name="hyperparameter_example_instance_%d" % i,
        disk_size=32,
        typ="m4.2xlarge",
        username="ubuntu",
        ami_owner_name=("self", "cirrus_server_image")
    )
    inst.start()
    instances.append(inst)

[        __init__ |      MainThread] Resolving AMI owner/name to AMI ID.
[        __init__ |      MainThread] Done.
[         _exists |      MainThread] Listing instances.
[         _exists |      MainThread] No existing instance with the same name was found.
[ _start_and_wait |      MainThread] Starting a new instance.
[ _start_and_wait |      MainThread] Waiting for instance to enter running state.
[ _start_and_wait |      MainThread] Fetching instance metadata.
[ _start_and_wait |      MainThread] Done.
[           start |      MainThread] Done.
[        __init__ |      MainThread] Resolving AMI owner/name to AMI ID.
[        __init__ |      MainThread] Done.
[         _exists |      MainThread] Listing instances.
[         _exists |      MainThread] No existing instance with the same name was found.
[ _start_and_wait |      MainThread] Starting a new instance.
[ _start_and_wait |      MainThread] Waiting for instance to enter running state.
[ _start_and_wait |      MainThread] Fetc

Second, we define the base configuration for our machine learning task.

In [6]:
base_task_config = {
    "n_workers": 16,
    "n_ps": 1,
    "dataset": "criteo-kaggle-19b",
    "learning_rate": 0.0001,
    "epsilon": 0.0001,
    "progress_callback": None,
    "train_set": [(0, 799)],
    "test_set": (800, 850),
    "minibatch_size": 200,
    "model_bits": 19,
    "opt_method": "adagrad",
    "timeout": 60,
    "lambda_size": 192
}

Third, we identify our hyperparameters and their possible values.

In [7]:
hyperparameter_names = [
    "n_workers",
    "learning_rate"
]
hyperparameter_values = [
    [8, 16, 32],
    [0.001, 0.01]
]

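Together, the instances, the base configuration, and the hyperparameters define a hyperparameter optimization task. It consists of one machine learning task per combination of hyperparameter values: with three values for `n_workers` and two for `learning_rate`, that makes six tasks.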

In [8]:
search = GridSearch(
    task=lr.LogisticRegression,
    param_base=base_task_config,
    hyper_vars=hyperparameter_names,
    hyper_params=hyperparameter_values,
    instances=instances
)

[(8, 0.001), (8, 0.01), (16, 0.001), (16, 0.01), (32, 0.001), (32, 0.01)]
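The list printed above is the Cartesian product of the hyperparameter value lists. The following is a minimal sketch of how such a grid can be enumerated with the standard library; it is not Cirrus's own implementation. It assumes that each assignment overrides the matching keys of the base configuration, and it uses a shortened copy of `base_task_config` for brevity.

```python
import itertools

# Shortened copy of the base configuration, for illustration only.
base_task_config = {"n_workers": 16, "learning_rate": 0.0001, "minibatch_size": 200}

hyperparameter_names = ["n_workers", "learning_rate"]
hyperparameter_values = [[8, 16, 32], [0.001, 0.01]]

# One tuple per combination of values, in the same order as the names.
assignments = list(itertools.product(*hyperparameter_values))

# Assumption: each task's configuration is the base configuration with the
# hyperparameter entries replaced by one assignment.
configs = []
for assignment in assignments:
    config = dict(base_task_config)  # copy, so the base stays unchanged
    config.update(zip(hyperparameter_names, assignment))
    configs.append(config)

print(assignments)
print(len(configs))
```

The number of tasks is the product of the list lengths, so adding a third hyperparameter with four values would grow this grid from 6 to 24 tasks.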


## Run
---

Next, we run our hyperparameter optimization task.

In [None]:
search.run()

Run this cell to see the latest accuracy reported for experiment `I`. It prints the last 20 lines of that experiment's `ps.error_output()`.

In [None]:
I = 1

# Print the last 20 lines of the output; splitlines() avoids a trailing empty line.
for line in search.cirrus_objs[I].ps.error_output().splitlines()[-20:]:
    print(line)

## Cleanup
---

When we're satisfied with the results, we kill all of the tasks in the search.

In [None]:
search.kill_all()

We also need to terminate our instances to avoid further charges.

In [9]:
for inst in instances:
    inst.cleanup()

[         cleanup |      MainThread] Terminating instance.
[         cleanup |      MainThread] Waiting for instance to terminate.
[         cleanup |      MainThread] Done.
[         cleanup |      MainThread] Terminating instance.
[         cleanup |      MainThread] Waiting for instance to terminate.
[         cleanup |      MainThread] Done.
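A plain loop like the one above stops at the first `cleanup()` call that raises, which would leave the remaining instances running. A more defensive variant is sketched below. `FakeInstance` and `cleanup_all` are hypothetical names introduced for this illustration; only the `cleanup()` method is taken from the notebook, and which exceptions it can raise is not known here.

```python
class FakeInstance:
    """Hypothetical stand-in for an instance; only cleanup() is modeled."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.terminated = False

    def cleanup(self):
        if self.fail:
            raise RuntimeError("could not terminate %s" % self.name)
        self.terminated = True


def cleanup_all(instances):
    """Call cleanup() on every instance and return (name, error) pairs for failures."""
    failures = []
    for inst in instances:
        try:
            inst.cleanup()
        except Exception as error:
            # Record the failure and keep going so later instances are still cleaned up.
            failures.append((inst.name, error))
    return failures


demo_instances = [FakeInstance("a"), FakeInstance("b", fail=True), FakeInstance("c")]
failed = cleanup_all(demo_instances)
print([name for name, _ in failed])
```

Any names in the returned list identify instances that may still be running and should be checked by hand.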


If a cell raises an error, running the cell below should clean up any resources that were created. After you run it, the kernel becomes unusable and must be restarted.

In [None]:
atexit._run_exitfuncs()
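`atexit._run_exitfuncs()` is a private CPython function that immediately runs every handler registered with `atexit.register`. The sketch below shows that behavior in a child process, so the handlers of the current interpreter are left alone. The `release` handler is a hypothetical placeholder; how Cirrus registers its own cleanup is assumed here, not confirmed by the notebook.

```python
import subprocess
import sys

# Code executed in a separate Python process.
child_code = """
import atexit

def release():
    # Hypothetical handler standing in for resource cleanup.
    print("released")

atexit.register(release)
atexit._run_exitfuncs()  # run the registered handlers now, not at exit
print("after manual run")
"""

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)
output_lines = result.stdout.splitlines()
print(output_lines)
```

Because the call runs all registered handlers, not only those that release cloud resources, it likely also triggers the kernel's own shutdown handlers, which would explain why a restart is needed afterward.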