Merge pull request #35 from automl/development

Development

aaronkl committed Jul 11, 2016
2 parents 15e1a7b + 8471d02 commit 228bfab
Showing 75 changed files with 5,007 additions and 418 deletions.
1 change: 0 additions & 1 deletion .travis.yml
@@ -1,6 +1,5 @@
language: python
python:
- "2.6"
- "2.7"
- "3.3"
- "3.4"
4 changes: 4 additions & 0 deletions README.md
@@ -1,6 +1,10 @@
RoBO - a Robust Bayesian Optimization framework.
================================================

[![Build Status](https://travis-ci.org/automl/RoBO.svg?branch=development)](https://travis-ci.org/automl/RoBO)
[![Coverage Status](https://coveralls.io/repos/github/automl/RoBO/badge.svg?branch=development)](https://coveralls.io/github/automl/RoBO?branch=development)
[![Code Health](https://landscape.io/github/automl/RoBO/development/landscape.svg?style=flat)](https://landscape.io/github/automl/RoBO/development)

Documentation
-------------
http://robo-fork.readthedocs.org/en/latest/
40 changes: 39 additions & 1 deletion docs/advanced.rst
@@ -65,4 +65,42 @@ It uses the HMC method implemented in GPy to sample the marginal log-likelihood.
bo.run(10)
RoBO will then compute a marginalised acquisition value by computing the acquisition value for each single GP and summing over all of them.
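
Schematically, with :math:`M` hyperparameter samples :math:`\theta_1, \dots, \theta_M`, this amounts to (a sketch of the description above, not necessarily the exact implementation):

.. math::

    a_{marg}(\bm{x}) = \sum_{i=1}^{M} a(\bm{x} \mid \theta_i)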


Fabolas
-------

The general idea of Fabolas is to extend the traditional objective function :math:`f(\bm{x})` by an additional input :math:`s` that specifies the amount of training data used to evaluate a point :math:`\bm{x}`, i.e. the model is fit to :math:`f(\bm{x}, s)`.


In the end we want to find the best point :math:`\bm{x}_{\star}` on the full dataset, i.e. for :math:`s=s_{max}`. Because of that, Fabolas uses the information gain acquisition function but models the distribution over the minimum :math:`p_{min}(\bm{x}, s)` only on the subspace :math:`s=s_{max}`, i.e. :math:`p_{min}(\bm{x} \mid s=s_{max})`.

By additionally modeling the evaluation time of a point :math:`\bm{x}` and dividing the information gain by the cost :math:`c(\bm{x}, s)` it would take to evaluate :math:`\bm{x}` on :math:`s`, Fabolas evaluates points only on small subsets of the data and extrapolates their error to the full dataset size. For more details, have a look at the paper: http://arxiv.org/abs/1605.07079
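
Schematically, the next point is then chosen by maximising information gain per unit cost (a sketch distilled from the description above, not the exact formulation in the paper):

.. math::

    \bm{x}_{n+1}, s_{n+1} = \underset{\bm{x}, s}{\mathrm{argmax}} \; \frac{a_{IG}(\bm{x}, s)}{c(\bm{x}, s)}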


Fabolas has the same interface as RoBO's fmin function (see :ref:`fmin`). First you have to define your objective function, which now should depend on :math:`\bm{x}` and :math:`s`:

.. code-block:: python

    def objective_function(x, s):
        # Train your algorithm here with x on the dataset subset with length s
        # Estimate the validation error and the cost on the validation data set
        return np.array([[validation_error]]), np.array([[cost]])

Your objective function should return the validation error and the total cost :math:`c(\bm{x}, s)` of the point :math:`\bm{x}`. Normally the cost is the time it took to train and validate :math:`\bm{x}`.
After defining your objective function you also have to define the input bounds for :math:`\bm{x}` and :math:`s`. Make sure that the dataset size :math:`s` is the last dimension.
It is often a good idea to put the dataset size on a log scale.

.. code-block:: python

    X_lower = np.array([-10, -10, s_min])
    X_upper = np.array([10, 10, s_max])

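
For example, if you train on at most ``n_train`` data points and want subsets of at least 100 points, such log-scale bounds could look like the following sketch (``n_train`` is a placeholder name here):

.. code-block:: python

    import numpy as np

    s_min = np.log(100)       # smallest subset: 100 training points
    s_max = np.log(n_train)   # largest subset: the full training set
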
Then you can call Fabolas by:

.. code-block:: python

    x_best = fabolas_fmin(objective_function, X_lower, X_upper, num_iterations=100)

You can find a full example for training a support vector machine on MNIST `here <https://github.com/automl/RoBO/blob/development/examples/example_fmin_fabolas.py>`_.
20 changes: 17 additions & 3 deletions docs/basics.rst
@@ -2,8 +2,11 @@
Basic Usage
===========


.. _fmin:

RoBO in a few lines of code
-------------------------
---------------------------

RoBO offers a simple interface such that you can use it as an optimizer for black-box functions without knowing what's going on inside. In order to do that you first have to
define the objective function and the bounds of the configuration space:
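
A minimal sketch of that interface, based on ``examples/example_fmin.py`` from this repository (the upper bound and the number of iterations below are placeholders, and the exact return value of ``fmin`` may differ between versions):

.. code-block:: python

    import numpy as np

    from robo.fmin import fmin

    # Objective to minimise; x is a numpy array of shape (N, D)
    def objective_function(x):
        return np.sin(3 * x) * 4 * (x - 1) * (x + 2)

    # Bounds of the one-dimensional configuration space
    X_lower = np.array([0])
    X_upper = np.array([6])

    x_best = fmin(objective_function, X_lower, X_upper, num_iterations=20)
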
@@ -159,7 +162,7 @@ Saving output
^^^^^^^^^^^^^

You can save RoBO's output by passing the parameters 'save_dir' and 'num_save'. The first parameter 'save_dir' specifies where the results will be saved and
the second parameter 'num_save' specifies after how many iterations the output should be saved. RoBO will save the output both in CSV and JSON format.

.. code-block:: python
@@ -170,7 +173,7 @@ the second parameter 'num_save' after how many iterations the output should be s
save_dir="path_to_directory",
num_save=1)
RoBO will then save the following information in the CSV file:

- X: The configurations it evaluated so far
- y: Their corresponding function values
@@ -179,6 +182,17 @@ RoBO will then save the following information:
- time_function: The time each function evaluation took
- optimizer_overhead: The time RoBO needed to pick a new configuration

The following information will be saved in the JSON file, in the format shown below.

.. code-block:: javascript

    {
        "Acquisiton": {"type": ...},
        "Model": {"Y": ..., "X": ..., "hyperparameters": ...},
        "Task": {"opt": ..., "fopt": ..., "original_X_lower": ..., "original_X_upper": ...},
        "Solver": {"optimization_overhead": ..., "incumbent_fval": ..., "iteration": ...,
                   "time_func_eval": ..., "incumbent": ..., "runtime": ...}
    }

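
As a quick sanity check, you can load such a dump back with the standard library. This is just a sketch; the file name ``robo_iter_0.json`` is a placeholder and the actual naming depends on your RoBO version:

.. code-block:: python

    import json

    # Load one of the JSON dumps written to save_dir
    with open("path_to_directory/robo_iter_0.json") as fh:
        results = json.load(fh)

    # Inspect the current incumbent and its function value
    print(results["Solver"]["incumbent"], results["Solver"]["incumbent_fval"])
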
Implementing the Bayesian optimization loop
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1 change: 0 additions & 1 deletion examples/example_branin.py
@@ -1,6 +1,5 @@
'''
Created on Jun 23, 2015
@author: Aaron Klein
'''

7 changes: 2 additions & 5 deletions examples/example_fmin.py
@@ -1,8 +1,4 @@
'''
Created on Jul 3, 2015

@author: Aaron Klein
'''
import numpy as np

from robo.fmin import fmin
@@ -12,7 +8,8 @@
# It gets a numpy array with shape (N, D) where N >= 1 is the number of
# datapoints and D is the number of features
def objective_function(x):
return np.sin(3 * x) * 4 * (x - 1) * (x + 2)
y = np.sin(3 * x) * 4 * (x - 1) * (x + 2)
return y

# Defining the bounds and dimensions of the input space
X_lower = np.array([0])
125 changes: 125 additions & 0 deletions examples/example_fmin_fabolas.py
@@ -0,0 +1,125 @@
import os
import sys
import time
import numpy as np

from sklearn import svm

from robo.fmin import fabolas_fmin


# Example script to optimize the C and gamma parameters of a
# support vector machine on MNIST with Fabolas.
# Have a look into the paper "Fast Bayesian Optimization of Machine Learning
# Hyperparameters on Large Datasets" (http://arxiv.org/abs/1605.07079)
# to see how it works. Note: in order to run this example you need
# scikit-learn, which you can install via: pip install scikit-learn


def load_dataset():
    # This function loads the MNIST data; it is copied from the Lasagne tutorial.
    # We first define a download function, supporting both Python 2 and 3.
    if sys.version_info[0] == 2:
        from urllib import urlretrieve
    else:
        from urllib.request import urlretrieve

    def download(filename, source='http://yann.lecun.com/exdb/mnist/'):
        print("Downloading %s" % filename)
        urlretrieve(source + filename, filename)

    # We then define functions for loading MNIST images and labels.
    # For convenience, they also download the requested files if needed.
    import gzip

    def load_mnist_images(filename):
        if not os.path.exists(filename):
            download(filename)
        # Read the inputs in Yann LeCun's binary format.
        with gzip.open(filename, 'rb') as f:
            data = np.frombuffer(f.read(), np.uint8, offset=16)
        # The inputs are vectors now, we reshape them to monochrome 2D images,
        # following the shape convention: (examples, channels, rows, columns)
        data = data.reshape(-1, 1, 28, 28)
        # The inputs come as bytes, we convert them to float32 in range [0,1].
        # (Actually to range [0, 255/256], for compatibility to the version
        # provided at http://deeplearning.net/data/mnist/mnist.pkl.gz.)
        return data / np.float32(256)

    def load_mnist_labels(filename):
        if not os.path.exists(filename):
            download(filename)
        # Read the labels in Yann LeCun's binary format.
        with gzip.open(filename, 'rb') as f:
            data = np.frombuffer(f.read(), np.uint8, offset=8)
        # The labels are vectors of integers now, that's exactly what we want.
        return data

    # We can now download and read the training and test set images and labels.
    X_train = load_mnist_images('train-images-idx3-ubyte.gz')
    y_train = load_mnist_labels('train-labels-idx1-ubyte.gz')
    X_test = load_mnist_images('t10k-images-idx3-ubyte.gz')
    y_test = load_mnist_labels('t10k-labels-idx1-ubyte.gz')

    # We reserve the last 10000 training examples for validation.
    X_train, X_val = X_train[:-10000], X_train[-10000:]
    y_train, y_val = y_train[:-10000], y_train[-10000:]

    X_train = X_train.reshape(X_train.shape[0], 28 * 28)
    X_val = X_val.reshape(X_val.shape[0], 28 * 28)
    X_test = X_test.reshape(X_test.shape[0], 28 * 28)

    # We just return all the arrays in order, as expected in main().
    # (It doesn't matter how we do this as long as we can read them again.)
    return X_train, y_train, X_val, y_val, X_test, y_test


# The objective function that we want to optimize.
# It gets a numpy array x with shape (1, D) where D is the number of
# parameters, and a scalar s which is the log of the training set size
# that is used to evaluate this configuration
def objective_function(x, s):

    # Start the clock to determine the cost of this function evaluation
    start_time = time.time()

    # Shuffle the data and split up the requested subset of the training data
    size = int(np.exp(s))
    s_max = y_train.shape[0]
    shuffle = np.random.permutation(np.arange(s_max))
    train_subset = X_train[shuffle[:size]]
    train_targets_subset = y_train[shuffle[:size]]

    # Train the SVM on the training subset
    C = np.exp(float(x[0, 0]))
    gamma = np.exp(float(x[0, 1]))
    clf = svm.SVC(gamma=gamma, C=C)
    clf.fit(train_subset, train_targets_subset)

    # Validate this hyperparameter configuration on the full validation data
    y = 1 - clf.score(X_val, y_val)

    c = time.time() - start_time

    return np.array([[np.log(y)]]), np.array([[c]])

# Load the data
X_train, y_train, X_val, y_val, X_test, y_test = load_dataset()


# We optimize s on a log scale, as we expect that the performance varies
# logarithmically across s
s_min = np.log(100)
s_max = np.log(X_train.shape[0])

# Defining the bounds and dimensions of the
# input space (configuration space + environment space)
# We also optimize the hyperparameters of the svm on a log scale
X_lower = np.array([-10, -10, s_min])
X_upper = np.array([10, 10, s_max])

# Start Fabolas to optimize the objective function
x_best = fabolas_fmin(objective_function, X_lower, X_upper, num_iterations=100)

print(x_best)
print(objective_function(x_best[:, :-1], s=x_best[:, None, -1]))
33 changes: 33 additions & 0 deletions examples/example_json_dump.py
@@ -0,0 +1,33 @@
'''
Created on June 5th, 2016
@author: Numair Mansur (numair.mansur@gmail.com)
'''

import george

from robo.maximizers.direct import Direct
from robo.models.gaussian_process import GaussianProcess
from robo.task.synthetic_functions.levy import Levy
from robo.acquisition.ei import EI
from robo.solver.bayesian_optimization import BayesianOptimization


task = Levy()
kernel = george.kernels.Matern52Kernel([1.0], ndim=1)


model = GaussianProcess(kernel)

ei = EI(model, task.X_lower, task.X_upper)

maximizer = Direct(ei, task.X_lower, task.X_upper)

bo = BayesianOptimization(acquisition_func=ei,
                          model=model,
                          maximize_func=maximizer,
                          task=task,
                          save_dir='../JsonDumps/')

print(bo.run(20))
5 changes: 2 additions & 3 deletions examples/example_mcmc.py
@@ -14,10 +14,9 @@
noise = 1.0
cov_amp = 2
exp_kernel = george.kernels.Matern52Kernel([1.0, 1.0], ndim=2)
noise_kernel = george.kernels.WhiteKernel(noise, ndim=2)
kernel = cov_amp * (exp_kernel + noise_kernel)
kernel = cov_amp * exp_kernel

prior = DefaultPrior(len(kernel))
prior = DefaultPrior(len(kernel) + 1)
model = GaussianProcessMCMC(kernel, prior=prior,
chain_length=100, burnin_steps=200, n_hypers=20)

9 changes: 4 additions & 5 deletions examples/example_priors.py
@@ -64,10 +64,9 @@ def sample_from_prior(self, n_samples):
config_kernel = george.kernels.Matern52Kernel(np.ones([task.n_dims]),
ndim=task.n_dims)

noise_kernel = george.kernels.WhiteKernel(0.01, ndim=task.n_dims)
kernel = cov_amp * (config_kernel + noise_kernel)
kernel = cov_amp * config_kernel

prior = MyPrior(len(kernel))
prior = MyPrior(len(kernel) + 1)

model = GaussianProcessMCMC(kernel, prior=prior, burnin=burnin,
chain_length=chain_length, n_hypers=n_hypers)
@@ -82,8 +81,8 @@ def sample_from_prior(self, n_samples):
bo = BayesianOptimization(acquisition_func=acquisition_func,
model=model,
maximize_func=maximizer,
task=task)

task=task
)
bo.run(20)


3 changes: 1 addition & 2 deletions examples/example_rf.py
@@ -24,8 +24,7 @@
# Define the acquisition function
acquisition_func = EI(model,
X_upper=branin.X_upper,
X_lower=branin.X_lower,
par=0.1)
X_lower=branin.X_lower)

# Strategy of estimating the incumbent
rec = PosteriorMeanAndStdOptimization(model, branin.X_lower,
33 changes: 33 additions & 0 deletions examples/example_walker.py
@@ -0,0 +1,33 @@
import george
import numpy as np

from robo.models.gaussian_process_mcmc import GaussianProcessMCMC
from robo.acquisition.ei import EI
from robo.maximizers.direct import Direct
from robo.task.controlling_tasks.walker import Walker
from robo.solver.bayesian_optimization import BayesianOptimization
from robo.priors.default_priors import DefaultPrior
from robo.acquisition.integrated_acquisition import IntegratedAcquisition



task = Walker()
test = '/test'

kernel = 1 * george.kernels.Matern52Kernel(np.ones([task.n_dims]),
                                           ndim=task.n_dims)
prior = DefaultPrior(len(kernel) + 1)
model = GaussianProcessMCMC(kernel, prior=prior,
                            chain_length=100, burnin_steps=200, n_hypers=8)

ei = EI(model, task.X_lower, task.X_upper)
acquisition_func = IntegratedAcquisition(model, ei, task.X_lower, task.X_upper)

maximizer = Direct(acquisition_func, task.X_lower, task.X_upper)

bo = BayesianOptimization(acquisition_func=acquisition_func,
                          model=model,
                          maximize_func=maximizer,
                          task=task,
                          save_dir=test)

print(bo.run(2))
2 changes: 1 addition & 1 deletion requirements.txt
@@ -5,6 +5,6 @@ scipy>=0.13.3
matplotlib>=1.3.1
cma
direct
george
git+https://github.com/sfalkner/george.git
git+https://github.com/SheffieldML/GPy.git
git+https://bitbucket.org/aadfreiburg/random_forest_run
