# SigOpt Explorations: How are My Hyperparameters Affecting My Training Time?
_By [Alexandra Johnson](https://sigopt.com/about#alexandra), Software Engineer_

SigOpt helps you optimize your hyperparameters, but did you know that it can also help you explore how long your models take to train? This Jupyter notebook tutorial shows you how to use information already present in the [SigOpt API](https://sigopt.com/docs/overview) to gain insight into how parameter values affect model training time.

## How it Works
If you've been using the SigOpt [optimization loop](https://sigopt.com/docs/overview/optimization) to tune your models, you've been requesting a suggestion, evaluating your metric, and reporting an observation. [Suggestion](https://sigopt.com/docs/objects/suggestion) and [Observation](https://sigopt.com/docs/objects/observation) objects expose their creation time as a Unix timestamp in the `created` field. The observation's created time minus the suggestion's created time gives an estimated training time, in seconds. This notebook walks you through fetching the observations for several similar experiments, linking each observation to its suggestion, calculating the estimated training time, and then plotting parameter value vs estimated training time for each parameter.
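
To make the arithmetic concrete, here is a minimal sketch using made-up timestamps. The helper name `estimated_training_time` and the values are illustrative assumptions, not part of the SigOpt API. Keep in mind that the result is only an estimate: it also counts any time that passes between receiving the suggestion and starting training, and between finishing training and reporting the observation.

```python
from datetime import timedelta

def estimated_training_time(suggestion_created, observation_created):
    """Return the estimated training time in seconds from two Unix timestamps."""
    return observation_created - suggestion_created

# Made-up Unix timestamps (seconds), for illustration only
suggestion_created = 1450000000
observation_created = 1450000754

seconds = estimated_training_time(suggestion_created, observation_created)
print(seconds)                     # 754
print(timedelta(seconds=seconds))  # 0:12:34
```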

## TL;DR
Skip to the appendix at the bottom to see how hyperparameters affect training time in our example of [Tuning a Text Classifier](https://blog.sigopt.com/post/133089144983/sigopt-for-ml-automatically-tuning-text).

## Setup
Run the following commands to clone this repo, install dependencies, and get this notebook up and running. After you run these commands, a browser window should pop up, and from there you can navigate to this notebook ([this link](http://localhost:8888/notebooks/How%20are%20My%20Hyperparameters%20Affecting%20My%20Training%20Time%3F.ipynb) should also take you directly to the notebook once Jupyter is running).
```
git clone https://github.com/sigopt/sigopt-examples.git
cd sigopt-examples/estimated-training-time/
pip install -r requirements.txt
jupyter notebook
```
Next, get your SigOpt API token from your [user dashboard](https://sigopt.com/user/profile). Set it as the environment variable `SIGOPT_API_TOKEN`, or insert it directly into the code below.

In [None]:
# Insert your SigOpt API token below, or set as the environment variable SIGOPT_API_TOKEN
from sigopt.interface import Connection
conn = Connection() # attempt to use environment variable SIGOPT_API_TOKEN
# conn = Connection(client_token=SIGOPT_API_TOKEN) # enter token directly
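
If you prefer to read the token yourself and fail early with a clear message when it is missing, something like the following sketch could work. The helper `get_api_token` is a hypothetical name introduced here for illustration, and the token value is fake.

```python
import os

def get_api_token(environ=os.environ):
    """Return the SigOpt API token from the environment, or raise a clear error."""
    token = environ.get('SIGOPT_API_TOKEN')
    if not token:
        raise RuntimeError('Set the SIGOPT_API_TOKEN environment variable first')
    return token

# Usage with a stand-in environment (the token value here is fake)
print(get_api_token({'SIGOPT_API_TOKEN': 'FAKE_TOKEN'}))
```

The returned value could then be passed along as `Connection(client_token=get_api_token())`.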

Grab the ids of some interesting experiments that you'd like to look at training time for! You can find an experiment's id by clicking on the properties tab of any experiment page. This notebook can look at **ids of multiple experiments**, but they will all be **combined into one training time graph**, so pick multiple experiments only if they represent the same model. Find your experiments on your [experiment dashboard](https://sigopt.com/experiment/list).

In [None]:
# Pick your experiment ids! Use multiple ids if you used multiple experiments to train the same model
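# experiment_ids = []  # uncomment this line and fill in your experiment ids
# Use `eid` rather than `id` so the builtin id() is not shadowed
experiments = [conn.experiments(eid).fetch() for eid in experiment_ids]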

In [None]:
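# Ensure that all experiments use the same set of parameters
from functools import reduce  # builtin in Python 2, must be imported in Python 3

def equal(x, y):
    # Fail loudly if two experiments have different parameter names
    assert x == y
    return x
parameter_names = reduce(equal, [sorted(p.name for p in e.parameters) for e in experiments])
print(parameter_names)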
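
A bare `assert` does not say which experiment is the odd one out. As a rough alternative, the sketch below compares parameter names and raises an error naming the mismatched experiment. The function `common_parameter_names`, the experiment ids, and the parameter names in the usage example are assumptions made for illustration.

```python
def common_parameter_names(names_by_experiment):
    """Return the sorted parameter names shared by every experiment.

    names_by_experiment maps experiment id -> iterable of parameter names.
    Raises ValueError naming the first experiment whose parameters differ.
    """
    expected = None
    expected_id = None
    for experiment_id, names in names_by_experiment.items():
        current = sorted(names)
        if expected is None:
            # The first experiment seen sets the reference parameter list
            expected, expected_id = current, experiment_id
        elif current != expected:
            raise ValueError(
                'Experiment {} has parameters {}, but experiment {} has {}'.format(
                    experiment_id, current, expected_id, expected))
    return expected

# Made-up experiment ids and parameter names
print(common_parameter_names({
    '101': ['min_n_gram', 'n_gram_offset'],
    '102': ['n_gram_offset', 'min_n_gram'],
}))
```

With the `experiments` list fetched above, the input could be built as `{e.id: [p.name for p in e.parameters] for e in experiments}`.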

In [None]:
# Total number of observations we'll be considering
sum(e.progress.observation_count for e in experiments)

In [None]:
# Fetch all observations and all suggestions for all experiments
observations = []
suggestions = []
for e in experiments:
    obs = conn.experiments(e.id).observations().fetch()
    observations.extend(obs.data)
    suggs = conn.experiments(e.id).suggestions().fetch()
    suggestions.extend(suggs.data)

In [None]:
# Create a map of observation id -> estimated training time
estimated_training_times_by_id = {}
suggestions_by_id = {s.id: s for s in suggestions}
for o in observations:
    if o.suggestion:
        suggestion = suggestions_by_id.get(o.suggestion)
        if suggestion:
            estimated_training_times_by_id[o.id] = (o.created - suggestion.created)

In [None]:
# Create x,y axes for our graphs, one graph per parameter
# y axis is always estimated training times
# x axis is parameter values
# Skip observations that have no estimated training time
# (i.e., the assignments were entered manually rather than taken from a suggestion)
estimated_training_times = []
parameter_values = {param: [] for param in parameter_names}
for o in observations:
    ett = estimated_training_times_by_id.get(o.id, None)
    if ett is not None:
        estimated_training_times.append(ett)
        for (param, value) in o.assignments.to_json().items():
            parameter_values[param].append(value)

In [None]:
# Sanity check: ensure that all lists have the same length
count_observations_with_training_time = len(estimated_training_times)
for values in parameter_values.values():
    assert len(values) == count_observations_with_training_time

In [None]:
# Plot training time versus parameter value for each parameter
from plotly.offline import iplot, init_notebook_mode
from plotly.graph_objs import Scatter

init_notebook_mode() # run at the start of every notebook

for (param, values) in parameter_values.items():
    iplot({
            'data': [Scatter(x=values, y=estimated_training_times, mode="markers")],
            'layout': {'title': 'Estimated Training Time (s) vs {}'.format(param)}
    })
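
The scatter plots above can be complemented with a rough numeric summary. The sketch below computes a Pearson correlation between each parameter and the training times, using made-up data shaped like `parameter_values` and `estimated_training_times`. The function `pearson` and the parameter names are illustrative assumptions. Pearson correlation only captures linear trends and only applies to numeric parameters, so treat it as a companion to the plots rather than a replacement.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no linear relationship to report
    return cov / sqrt(var_x * var_y)

# Made-up data shaped like parameter_values and estimated_training_times
example_times = [10.0, 20.0, 30.0, 40.0]
example_values = {
    'param_a': [1, 2, 3, 4],              # rises with training time
    'param_b': [0.5, 0.25, 0.25, 0.5],    # no linear trend
}
for param, values in sorted(example_values.items()):
    print('{}: {:.2f}'.format(param, pearson(values, example_times)))
```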

## Next Steps
Did you notice anything interesting with your experiments? Do the graphs match your intuition about how parameter values affect training time? Email <contact@sigopt.com> or tweet to [@SigOpt](https://twitter.com/sigopt) to let us know what you find! Happy Optimizing!

## Appendix: Advanced Users
Advanced users can take advantage of [metadata](https://sigopt.com/docs/overview/metadata) to store their own information about training time.

Metadata is a user-provided object that SigOpt stores on your behalf under the metadata field. Think of metadata as your annotation for a SigOpt object. This field is currently supported by [Experiments](https://sigopt.com/docs/endpoints/experiments), [Observations](https://sigopt.com/docs/endpoints/observations) and [Suggestions](https://sigopt.com/docs/endpoints/suggestions). 

In [None]:
# Example evaluation loop using metadata to store total training time
# Make sure you've instantiated conn from earlier in the notebook
from time import time
# experiment_id = USER DEFINED

for _ in range(60):
    suggestion = conn.experiments(experiment_id).suggestions().create()
    start_ts = time()
    value = evaluate_metric(suggestion.assignments) # You implement this
    end_ts = time()
    # Report an observation with training_time metadata
    observation = conn.experiments(experiment_id).observations().create(
        suggestion=suggestion.id,
        value=value,
        metadata={'training_time': (end_ts - start_ts)}
    )
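
Once training time is stored in metadata, it can be preferred over the timestamp-based estimate when both are available. The following Python 3 sketch shows one way to do that using stand-in objects with made-up values. The helper `training_time` is hypothetical, and the sketch assumes `metadata` behaves like a dict; with real API objects you may need to call `to_json()` first, as the notebook does for `assignments`.

```python
from types import SimpleNamespace

def training_time(observation, suggestions_by_id):
    """Prefer the recorded metadata value; fall back to the timestamp estimate."""
    metadata = observation.metadata or {}
    if 'training_time' in metadata:
        return metadata['training_time']
    suggestion = suggestions_by_id.get(observation.suggestion)
    if suggestion is not None:
        return observation.created - suggestion.created
    return None  # manually entered observation with no metadata

# Stand-in objects with made-up values; real ones would come from the API
suggestions_by_id = {'s1': SimpleNamespace(id='s1', created=1000)}
with_metadata = SimpleNamespace(
    suggestion='s1', created=1090, metadata={'training_time': 75.5})
without_metadata = SimpleNamespace(suggestion='s1', created=1090, metadata=None)

print(training_time(with_metadata, suggestions_by_id))     # 75.5
print(training_time(without_metadata, suggestions_by_id))  # 90
```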

## Appendix: What we noticed
We ran this notebook on three experiments from our text classifier example and produced the graphs below. By visual inspection, the two parameters with the most influence on training time appear to be `min_n_gram` and `n_gram_offset`.

![Estimated training time versus log regularization coefficient](https://github.com/sigopt/sigopt-examples/blob/master/estimated-training-time/ett-vs-log-reg-coefficient.png?raw=true)
![Estimated training time versus l1 coefficient](https://github.com/sigopt/sigopt-examples/blob/master/estimated-training-time/ett-vs-l1-coefficient.png?raw=true)
![Estimated training time versus log minimum document frequency](https://github.com/sigopt/sigopt-examples/blob/master/estimated-training-time/ett-vs-log-min-df.png?raw=true)
![Estimated training time versus document frequency offset](https://github.com/sigopt/sigopt-examples/blob/master/estimated-training-time/ett-vs-df-offset.png?raw=true)
![Estimated training time versus min n-gram](https://github.com/sigopt/sigopt-examples/blob/master/estimated-training-time/ett-vs-min-n-gram.png?raw=true)
![Estimated training time versus n-gram offset](https://github.com/sigopt/sigopt-examples/blob/master/estimated-training-time/ett-vs-n-gram-offset.png?raw=true)