# SigOpt Explorations: How are My Hyperparameters Affecting My Training Time?
_By [Alexandra Johnson](https://sigopt.com/about#alexandra), Software Engineer_

SigOpt helps you optimize your hyperparameters, but did you know that SigOpt can also help you explore how long it takes to train your models? This Jupyter notebook tutorial shows you how to use the [SigOpt API](https://sigopt.com/docs/overview) to learn how parameter values affect model training time. We'll show you the single-threaded case, but [parallelization](https://sigopt.com/docs/overview/parallel) is simple, too.

## How it Works
If you've been using the SigOpt [optimization loop](https://sigopt.com/docs/overview/optimization) to tune your models, you've been requesting a suggestion, evaluating your metric, and reporting an observation. [Suggestion](https://sigopt.com/docs/objects/suggestion) and [Observation](https://sigopt.com/docs/objects/observation) objects support a field called [metadata](https://sigopt.com/docs/overview/metadata), which allows you to store extra data on the object. This notebook walks you through creating observations that carry training time data in their metadata, and then plotting graphs from that data.

## Setup
Run the following commands to clone this repo, install dependencies, and get this notebook up and running. After running these commands, a browser window should pop up, and from there you can navigate to this notebook ([this link](http://localhost:8888/notebooks/How%20are%20My%20Hyperparameters%20Affecting%20My%20Training%20Time%3F.ipynb) should also take you directly to the notebook once Jupyter is running).
```
git clone https://github.com/sigopt/sigopt-examples.git
cd sigopt-examples/estimated-training-time/
pip install -r requirements.txt
jupyter notebook
```
Next, get your SigOpt API token from your [user dashboard](https://sigopt.com/user/profile). Set it as the environment variable `SIGOPT_API_TOKEN`, or insert it directly into the code below.

In [None]:
# Insert your SigOpt API token below, or set as the environment variable SIGOPT_API_TOKEN
from sigopt.interface import Connection
conn = Connection() # attempt to use environment variable SIGOPT_API_TOKEN
# conn = Connection(client_token=SIGOPT_API_TOKEN) # enter token directly
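
If you would rather fail early when the token is missing, a small helper along these lines can read it before the connection is created. This is only a sketch: `load_sigopt_token` is a hypothetical name and is not part of the SigOpt client. The only SigOpt call it relies on is `Connection(client_token=...)`, as shown in the cell above.

```python
import os

def load_sigopt_token(env_var="SIGOPT_API_TOKEN"):
    """Return the API token stored in env_var, or raise a clear error.

    Hypothetical helper for this tutorial; not part of the sigopt package.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            "Set the {} environment variable to your SigOpt API token".format(env_var)
        )
    return token

# Usage, assuming the sigopt package is installed:
# conn = Connection(client_token=load_sigopt_token())
```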

# Tutorial
The first step is to either provide an experiment id, or to create an interesting experiment. Here, we're creating an experiment and an evaluation function for you. If you'd like to time your own models, fill these in yourself!

In [None]:
experiment = conn.experiments().create(
    name="Timing Test Experiment",
    parameters=[
        {'name': 'x', 'type': 'int', 'bounds': {'min': 0, 'max': 2}},
        {'name': 'y', 'type': 'int', 'bounds': {'min': 0, 'max': 2}},
        {'name': 'z', 'type': 'double', 'bounds': {'min': 0, 'max': 1}},
    ],
    metadata={
        'owner': 'Alexandra'
    }
)
print("View your experiment at https://sigopt.com/experiment/{}".format(experiment.id))

In [None]:
import math, random
def evaluate_metric(assignments):
    x = assignments['x']
    y = assignments['y']
    z = assignments['z']
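    # Work grows with x * y; z only changes the value, not the running time
    total = 0  # avoid shadowing the built-in sum()
    for i in range(100 * x):
        for j in range(100 * y):
            total += math.sin(z * random.random())
    return total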
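
Before involving the API, it can help to check locally how the running time of this toy metric responds to its parameters. The sketch below repeats the metric so that it runs on its own; `time_assignments` and the three sample assignments are hypothetical and chosen only for illustration.

```python
import math
import random
from timeit import default_timer

def evaluate_metric(assignments):
    # Same toy metric as the cell above, repeated so this sketch runs on its own
    total = 0
    for _ in range(100 * assignments['x']):
        for _ in range(100 * assignments['y']):
            total += math.sin(assignments['z'] * random.random())
    return total

def time_assignments(assignments):
    """Return the wall-clock seconds one call to evaluate_metric takes."""
    start = default_timer()
    evaluate_metric(assignments)
    return default_timer() - start

# Hypothetical assignments chosen to span the corners of the x/y search space
for x, y in [(0, 0), (1, 1), (2, 2)]:
    seconds = time_assignments({'x': x, 'y': y, 'z': 0.5})
    print("x={} y={} -> {:.4f} seconds".format(x, y, seconds))
```

Because the inner loop body runs `100 * x * 100 * y` times, the running time should grow roughly with the product of `x` and `y`, while `z` should have little effect. The plots later in the notebook are a way to confirm that from recorded observations.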

## Report Your Timing Data
Train your model using a modified evaluation loop that captures the time needed to run the evaluation function. `evaluate_metric` is the function that you implement. If you need inspiration, check out our example on [tuning a text classifier](https://github.com/sigopt/sigopt-examples/tree/master/text-classifier).

In [None]:
from time import time

# Run 10x to 20x as many observations as there are parameters (3 parameters -> 30 iterations)
for _ in range(30):
    suggestion = conn.experiments(experiment.id).suggestions().create()
    start_ts = time()
    value = evaluate_metric(suggestion.assignments)
    end_ts = time()
    # Report an observation with training_time metadata
    observation = conn.experiments(experiment.id).observations().create(
        suggestion=suggestion.id,
        value=value,
        metadata={'training_time': (end_ts - start_ts)}
    )
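
The timing logic in the loop above can also be factored into small helpers, which keeps the loop body short if you time several functions. This is a sketch under stated assumptions: `timed_call` and `build_observation_payload` are hypothetical names, and the metric and suggestion id below are stand-ins so the example runs without the API.

```python
from timeit import default_timer

def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = default_timer()
    result = func(*args, **kwargs)
    return result, default_timer() - start

def build_observation_payload(suggestion_id, value, elapsed_seconds):
    """Assemble the keyword arguments passed to observations().create() above."""
    return {
        'suggestion': suggestion_id,
        'value': value,
        'metadata': {'training_time': elapsed_seconds},
    }

# Stand-in metric and suggestion id so the sketch runs without the API
value, elapsed = timed_call(lambda assignments: sum(assignments.values()), {'x': 1, 'y': 2})
payload = build_observation_payload("12345", value, elapsed)
print(payload)
# With a live connection:
# conn.experiments(experiment.id).observations().create(**payload)
```

`timeit.default_timer` is used here instead of `time.time` because it selects a clock suited to measuring intervals; either works for coarse training times.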

## Graph A Distribution of Training Times

In [None]:
# Initialize plotly
from plotly.offline import iplot, init_notebook_mode
from plotly.graph_objs import Scatter, Histogram

init_notebook_mode() # run at the start of every notebook

In [None]:
# Plot a histogram of the distribution of training times
observations = conn.experiments(experiment.id).observations().fetch().data
training_time_map = {
    observation.id: observation.metadata['training_time'] 
    for observation 
    in observations
}

iplot({
    'data': [Histogram(x=list(training_time_map.values()))],  # list() keeps this working on Python 3
    'layout': {
        'title': 'Distribution of Training Times',
        'xaxis': {'title': 'seconds'},
    },
})
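
A few summary numbers can complement the histogram. The sketch below uses only the standard library; `summarize_times` is a hypothetical helper and the durations are made up for illustration.

```python
def summarize_times(times):
    """Return count, min, median, mean and max of a list of durations in seconds."""
    ordered = sorted(times)
    n = len(ordered)
    if n == 0:
        raise ValueError("no training times to summarize")
    middle = n // 2
    if n % 2:
        median = ordered[middle]
    else:
        median = (ordered[middle - 1] + ordered[middle]) / 2.0
    return {
        'count': n,
        'min': ordered[0],
        'median': median,
        'mean': sum(ordered) / float(n),
        'max': ordered[-1],
    }

# Made-up durations; in the notebook, pass list(training_time_map.values())
example_times = [0.5, 0.25, 1.0, 0.25]
print(summarize_times(example_times))
```

A mean well above the median suggests that a few slow configurations dominate the total training time.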

## Graph Training Time versus Parameter Value
Now that we have our `observations` and our `training_time_map` from the last step, we can create scatterplots of training time versus parameter value for each parameter in our experiment.

In [None]:
# Create x,y axes for our graphs, one graph per parameter
# y axis is always estimated training times
# x axis is parameter values
# Skip observations that have no training time (i.e., ones whose assignments
# were entered manually rather than taken from a suggestion)
training_times = []
parameter_names = [p.to_json()['name'] for p in experiment.parameters]
parameter_values = {param: [] for param in parameter_names}
for o in observations:
    training_time = training_time_map.get(o.id, None)
    if training_time is not None:
        training_times.append(training_time)
        for (param, value) in o.assignments.to_json().items():
            parameter_values[param].append(value)

In [None]:
# Ensure you have run the plotly setup earlier in the notebook
for (param, values) in parameter_values.items():
    iplot({
        'data': [Scatter(x=values, y=training_times, mode="markers")],
        'layout': {
            'title': u'Training Time vs {}'.format(param),
            # x axis shows the parameter value; y axis shows training time
            'xaxis': {'title': param},
            'yaxis': {'title': 'seconds'},
        },
    })
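
To put numbers next to the scatterplots, one option is to average the training time for each distinct parameter value. This is a sketch with made-up data; `mean_time_by_value` is a hypothetical helper. It is most useful for the integer parameters `x` and `y`, since a double parameter such as `z` rarely repeats a value.

```python
from collections import defaultdict

def mean_time_by_value(values, times):
    """Map each distinct parameter value to the mean training time observed for it."""
    grouped = defaultdict(list)
    for value, seconds in zip(values, times):
        grouped[value].append(seconds)
    return {value: sum(group) / float(len(group)) for value, group in grouped.items()}

# Made-up data; in the notebook, pass parameter_values['x'] and training_times
example_x = [0, 1, 1, 2, 2]
example_times = [0.0, 0.25, 0.75, 1.0, 2.0]
print(mean_time_by_value(example_x, example_times))
```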

## Next Steps
Did you notice anything interesting in your experiments? Do the graphs match your intuition about how parameter values affect training time? Email <contact@sigopt.com> or tweet to [@SigOpt](https://twitter.com/sigopt) to let us know what you find! Happy Optimizing!