Commit a10dff3: resolving conflict

JasperSnoek committed Aug 26, 2013
2 parents c1ebb23 + 6eab58c

Showing 86 changed files with 8,703 additions and 1,394 deletions.
24 changes: 12 additions & 12 deletions spearmint-lite/spearmint-lite.py
@@ -1,21 +1,21 @@
##
# Copyright (C) 2012 Jasper Snoek, Hugo Larochelle and Ryan P. Adams
#
# This code is written for research and educational purposes only to
# supplement the paper entitled "Practical Bayesian Optimization of
# Machine Learning Algorithms" by Snoek, Larochelle and Adams Advances
# in Neural Information Processing Systems, 2012
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see
# <http://www.gnu.org/licenses/>.
@@ -81,7 +81,7 @@ def main():

# Otherwise run in controller mode.
main_controller(options, args)

##############################################################################
##############################################################################
def main_controller(options, args):
@@ -151,18 +151,18 @@ def main_controller(options, args):
values = float(val)
complete = np.matrix(variables)
durations = float(dur)

infile.close()
# Some stats
sys.stderr.write("#Complete: %d #Pending: %d\n" %
(complete.shape[0], pending.shape[0]))

# Let's print out the best value so far
if type(values) is not float and len(values) > 0:
best_val = np.min(values)
best_job = np.argmin(values)
sys.stderr.write("Current best: %f (job %d)\n" % (best_val, best_job))

# Now let's get the next job to run
# First throw out a set of candidates on the unit hypercube
# Increment by the number of observed so we don't take the
@@ -173,7 +173,7 @@ def main_controller(options, args):

# Ask the chooser to actually pick one.
# First mash the data into a format that matches that of the other
# spearmint drivers to pass to the chooser modules.
grid = candidates
if (complete.shape[0] > 0):
grid = np.vstack((complete, candidates))
@@ -187,7 +187,7 @@ def main_controller(options, args):
np.nonzero(grid_idx == 1)[0],
np.nonzero(grid_idx == 2)[0],
np.nonzero(grid_idx == 0)[0])

# If the job_id is a tuple, then the chooser picked a new job not from
# the candidate list
if isinstance(job_id, tuple):
@@ -207,11 +207,11 @@ def main_controller(options, args):
output = ""
for p in params:
output = output + str(p) + " "

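# Mark the proposed experiment as pending: 'P P' stands in for the
# result and time-taken fields until the user fills them in.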
output = "P P " + output + "\n"
outfile = open(res_file,"a")
outfile.write(output)
outfile.close()

# And that's it
if __name__ == '__main__':
7 changes: 7 additions & 0 deletions spearmint/.gitignore
@@ -0,0 +1,7 @@
*.pyc
*.sw[op]
*.pkl
**/jobs
**/output
**/best_job_and_result.txt
**/trace.csv
60 changes: 44 additions & 16 deletions spearmint/README.md
@@ -2,11 +2,11 @@ Spearmint
---------

Spearmint is a package to perform Bayesian optimization according to the
algorithms outlined in the paper:

**Practical Bayesian Optimization of Machine Learning Algorithms**
Jasper Snoek, Hugo Larochelle and Ryan P. Adams
*Advances in Neural Information Processing Systems*, 2012

This code is designed to automatically run experiments (thus the code
name 'spearmint') in a manner that iteratively adjusts a number of
@@ -30,10 +30,17 @@ On Ubuntu linux you can install this package using the command:
apt-get install python-scipy

* [Google Protocol Buffers](https://developers.google.com/protocol-buffers/) (for the fully automated code).
Note that you should be able to install protocol-buffers from source without requiring administrator privileges. Otherwise, on Ubuntu linux you can install this package using the command:

apt-get install python-protobuf

and on Mac with:

pip install protobuf
@@ -48,7 +55,19 @@ expected improvement, UCB or random. The drivers determine how
experiments are distributed and run on the system. As the code is
designed to run experiments in parallel (spawning a new experiment as
soon a result comes in), this requires some engineering. The current
implementations of these are in the 'spearmint' and 'spearmint-lite' subdirectories:

**Spearmint** is designed to automatically manage the launching and associated bookkeeping
of experiments in either a single machine or cluster environment. This requires that you provide
@@ -88,7 +107,7 @@ variables over which to optimize and SIZE is the number of variables
of this type with these bounds. Spearmint will call your wrapper
function with a dictionary type (in python) containing each of your
variables in a vector of size 'size', which you can access using the
name specified.

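As a quick illustration (a hypothetical sketch, not the branin.py example
discussed next), a wrapper for a config.pb declaring a FLOAT variable "X"
of size 2 might look like:

    # hypothetical wrapper module; spearmint calls main() with a dict of vectors
    def main(job_id, params):
        x = params['X']                    # a vector of length 'size' (here 2)
        return float(x[0]**2 + x[1]**2)    # the value spearmint will minimize
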
Now take a look at branin.py (the wrapper which was
specified in the 'name' variable at the top of config.pb). You will
@@ -142,13 +161,13 @@ completed thus far. The output directory contains a text file for
each job-id, containing the output of that job. So if you want to
see, e.g. what the output (i.e. standard out and standard error) was
for the best job (as obtained from trace.csv) you can look up
job-id.txt in the output directory.

If you are debugging your code,
or the code is crashing for some reason, it's a good idea to look at
these files. Finally, for ease of use, spearmint also prints out at
each iteration a file called 'best_job_and_result.txt' that contains the
best result observed so far, the job-id it came from and a dump of
the names and values of all of the parameters corresponding to that result.

A script, bin/cleanup, is provided to completely restart an experiment
@@ -169,6 +188,7 @@ above.
To run multiple jobs in parallel, pass to spearmint the argument:
`--max-concurrent=<#jobs>`

Spearmint is designed to be run in parallel either using multiple processors on a single machine or in a cluster environment. These different environments, however, involve different queuing and fault-checking code and are thus coded as 'driver' modules. Currently two drivers are available, but one can easily create a driver for a different environment by creating a new driver module (see the driver subdirectory for examples).

Using the `--driver=sge` flag, Spearmint can run on a system with Sun Grid Engine and it uses SGE to distribute experiments on a multi-node cluster in parallel using a queueing system in a fault-tolerant way. It is particularly
@@ -177,6 +197,9 @@ well suited to the Amazon EC2 system. Using [StarCluster](http://star.mit.edu/cluster/)
will allow you to set up a large cluster and start distributing experiments within minutes.
Using the `--driver=local` flag will run Spearmint on a single machine with potentially many cores. This driver simply spawns a new process on the current machine to run a new experiment. This does not allow you to distribute across multiple machines, however.

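For example (a hedged sketch, assuming the `bin/spearmint` launcher added in
this commit and an experiment directory laid out as described above), a
parallel local run might be invoked as:

    bin/spearmint <experiment_dir> --driver=local --max-concurrent=4
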
Running the basic code: Spearmint-lite
---------------------------------------

Spearmint-lite is designed to be simple. To run an experiment in
@@ -186,32 +209,37 @@ experiment specification, which must be provided in config.json, is in
JSON format. You must specify your problem as a sequence of JSON
objects. As in the protocol buffer format above, each object must
have a name, a type (float, int or enum), a 'min', a 'max' and a
'size'. Nothing else needs to be specified.

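For instance, a minimal config.json sketch (illustrative values only, not
the actual braninpy configuration) could read:

    [
        {"name": "X", "type": "float", "size": 2, "min": 0, "max": 1}
    ]
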
Go back to the top-level directory and run:

python spearmint-lite.py braninpy

Spearmint-lite will run one iteration of Bayesian
optimization and write out to a file named results.dat in the braninpy
subdirectory. results.dat will contain a white-space delimited line
for each experiment, of the format:
`<result> <time-taken> <list of parameters in the same order as config.json>`

Spearmint will propose new experiments and append them to results.dat each
time it is run. Each proposed experiment will have a 'pending' result and
time-taken, indicated by the letter P. The user must then run the experiment
and fill in these values. Note that the time can safely be set to an arbitrary
value if the chooser module does not use it (only GPEIperSecChooser currently
does). Spearmint will condition on the pending experiments when proposing new
ones, so any number of experiments can be conducted in parallel.

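As a purely illustrative sketch (invented values), results.dat for a
two-parameter problem might contain:

    0.3979 12.2 0.1239 0.8183
    P P 0.5430 0.1479

The first line is a completed experiment (result, time taken, parameters);
the second is a pending experiment proposed by spearmint, awaiting its
result and time-taken values.
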
A script, **cleanup.sh**, is provided to completely clean up all the intermediate
files and results in an experimental directory and restart the
experiment from scratch.

Chooser modules:
---------------

The chooser modules implement functions that tell spearmint which job to
run next. Some correspond to 'acquisition functions' in the
@@ -230,7 +258,7 @@ coarse-to-fine grid.

* **RandomChooser**: Experiments are sampled randomly from the unit hypercube.

* **GPEIOptChooser:** The GP EI MCMC algorithm from the paper. Jobs
are first sampled from a dense grid on the unit hypercube and the
best candidates are then 'fine-tuned' by optimizing EI directly.
16 changes: 16 additions & 0 deletions spearmint/bin/cleanup
@@ -0,0 +1,16 @@
#! /bin/bash

# This is a simple script to cleanup the intermediate files in
# spearmint experiment directories

GARBAGE="trace.csv output/* jobs/* expt-grid.pkl expt-grid.pkl.lock \
*.pyc *GP*Chooser*.pkl *Chooser*hyperparameters.txt best_job_and_result.txt"

[[ -n "$1" ]] || { echo "Usage: cleanup <experiment_dir>"; exit 1 ; }
if [ -d "$1" ]
then
    cd "$1"
    rm -f $GARBAGE
else
    echo "$1 is not a valid directory"
fi
3 changes: 3 additions & 0 deletions spearmint/bin/make_protobufs
@@ -0,0 +1,3 @@
#!/bin/bash

protoc --python_out=./spearmint spearmint.proto
File renamed without changes.
17 changes: 17 additions & 0 deletions spearmint/bin/spearmint
@@ -0,0 +1,17 @@
#!/bin/bash

SOURCE="${BASH_SOURCE[0]}"

# resolve $SOURCE until the file is no longer a symlink
while [ -h "$SOURCE" ]; do
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
SOURCE="$(readlink "$SOURCE")"

# if $SOURCE was a relative symlink, we need to resolve it
# relative to the path where the symlink file was located
[[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE"
done
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"

CMD="python ${DIR}/../spearmint/main.py $@"
PYTHONPATH=${DIR}/.. ${CMD}
18 changes: 0 additions & 18 deletions spearmint/cleanup.sh

This file was deleted.

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
10 changes: 10 additions & 0 deletions spearmint/examples/dejong/README.md
@@ -0,0 +1,10 @@
## De Jong's function (2D)

A simple benchmark for optimization, De Jong's function is continuous, convex, and unimodal.

f(x,y) = x^2 + y^2

### Global optimum

X = 0
Y = 0
18 changes: 18 additions & 0 deletions spearmint/examples/dejong/config.pb
@@ -0,0 +1,18 @@
language: PYTHON
name: "dejong"

variable {
name: "X"
type: FLOAT
size: 1
min: -5.12
max: 5.12
}

variable {
name: "Y"
type: FLOAT
size: 1
min: -5.12
max: 5.12
}
15 changes: 15 additions & 0 deletions spearmint/examples/dejong/dejong.py
@@ -0,0 +1,15 @@
def dejong(x,y):
return x*x + y*y

# Write a function like this called 'main'
def main(job_id, params):
x = params['X'][0]
y = params['Y'][0]
res = dejong(x, y)
print "De Jong's function in 2D:"
print "\tf(%.2f, %0.2f) = %f" % (x, y, res)
return res


if __name__ == "__main__":
main(23, {'X': [1.2], 'Y': [4.3]})
10 changes: 10 additions & 0 deletions spearmint/examples/faker/config.pb
@@ -0,0 +1,10 @@
language: PYTHON
name: "faker"

variable {
name: "X"
type: FLOAT
size: 2
min: 0
max: 1
}
12 changes: 12 additions & 0 deletions spearmint/examples/faker/faker.py
@@ -0,0 +1,12 @@
import time
import random


# A fake task that sleeps and returns a random result so we can test and debug
# spearmint without wasting CPU and energy computing real functions.

def main(job_id, params):
time.sleep(random.random() * 2)
return random.random() * 100
12 changes: 12 additions & 0 deletions spearmint/examples/rosenbrocks_valley/README.md
@@ -0,0 +1,12 @@
## Rosenbrock's Valley

A slightly more difficult benchmark for optimization, Rosenbrock's valley (a.k.a. the banana function) has a global
optimum lying inside a long, narrow parabolic valley with a flat floor.

f(x,y) = 100(y - x^2)^2 + (1 - x)^2

Global minimum:

X = 1
Y = 1

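The rosenbrock wrapper file itself is not shown above; a minimal sketch in
the style of dejong.py (hypothetical contents, assuming a config.pb with
FLOAT variables X and Y) might look like:

    # hypothetical rosenbrock.py -- illustrative sketch only
    def rosenbrock(x, y):
        return 100*(y - x*x)**2 + (1 - x)**2

    # Write a function like this called 'main'
    def main(job_id, params):
        x = params['X'][0]
        y = params['Y'][0]
        res = rosenbrock(x, y)
        print "Rosenbrock's valley in 2D:"
        print "\tf(%.2f, %0.2f) = %f" % (x, y, res)
        return res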