Commit 0bfee5c

Merge pull request #30 from alan-turing-institute/random-search2

Implement RandomSearch

ablaom committed Apr 3, 2020
2 parents adae301 + b7c6d24

Showing 14 changed files with 745 additions and 190 deletions.
6 changes: 5 additions & 1 deletion Project.toml
@@ -6,13 +6,17 @@ version = "0.3.0"
[deps]
ComputationalResources = "ed09eef8-17a6-5b46-8889-db040fac31e3"
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
RecipesBase = "3cdcf5f2-1ef4-517c-9805-6587b60abb01"

[compat]
ComputationalResources = "^0.3"
MLJBase = "^0.12"
Distributions = "^0.22,^0.23"
MLJBase = "^0.12.2"
MLJModelInterface = "^0.2"
RecipesBase = "^0.8"
julia = "^1"

28 changes: 22 additions & 6 deletions README.md
@@ -61,14 +61,14 @@ This repository contains:
developers to conveniently implement common hyperparameter
optimization strategies, such as:

- [x] search models generated by an arbitrary iterator, eg `models = [model1,
  model2, ...]` (built-in `Explicit` strategy)

- [x] grid search (built-in `Grid` strategy)

- [ ] Latin hypercubes

- [x] random search (built-in `RandomSearch` strategy; see the sketch following this list)

- [ ] bandit

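For example, a `TunedModel` machine using the new `RandomSearch` strategy might be constructed as in the following sketch (`model` is assumed to be some supervised model instance with a `lambda` hyperparameter; `rms` is the root-mean-square measure from MLJBase):

```julia
using MLJTuning, MLJBase

# a one-dimensional range for one of the model's hyperparameters:
r = range(model, :lambda, lower=0.1, upper=10.0)

tuned_model = TunedModel(model=model,
                         tuning=RandomSearch(),
                         range=r,
                         measure=rms,   # any MLJBase measure
                         n=25)          # total models to evaluate
```
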
@@ -232,6 +232,8 @@ In setting up a tuning task, the user constructs an instance of the

### Implementation requirements for new tuning strategies

For sample implementations, see [/src/strategies/](/src/strategies).

#### Summary of functions

Several functions are part of the tuning strategy API:
@@ -373,6 +375,11 @@ is `fit!` the first time, and not on subsequent calls (unless
`force=true`). (Specifically, `MLJBase.fit(::TunedModel, ...)` calls
`setup` but `MLJBase.update(::TunedModel, ...)` does not.)

The `verbosity` argument is an integer indicating the level of
logging: `0` means logging should be restricted to warnings, and `-1`
means completely silent.
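
To make this concrete, here is a minimal sketch of a `setup` overload consistent with the above. The names `MyStrategy` and `process_my_range` are hypothetical, and the four-argument signature is assumed from this version of the API:

```julia
function MLJTuning.setup(tuning::MyStrategy, model, range, verbosity)
    verbosity < 1 || @info "Pre-processing the user-specified range."
    # computed once per `fit!`; whatever is returned here is what gets
    # passed to `models!` and friends as `state`:
    return process_my_range(range, verbosity)
end
```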
@@ -440,6 +447,14 @@ any number of models. If `models!` returns a number of models
exceeding the number needed to complete the history, the list returned
is simply truncated.

Some simple tuning strategies, such as `RandomSearch`, will want to
return as many models as possible in one hit. The argument
`n_remaining` is the difference between the current length of the
history and the target number of iterations, `tuned_model.n`, set by
the user when constructing the `TunedModel` instance, `tuned_model`
(or `default_n(tuning, range)` if left unspecified).
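
By way of illustration, a batch-style `models!` implementation could return all remaining models in a single call. This is a hedged sketch only: `MyBatchSearch` and `clone_and_mutate` are invented names, and the six-argument `models!` signature is assumed from this version of the API:

```julia
function MLJTuning.models!(tuning::MyBatchSearch, model, history,
                           state, n_remaining, verbosity)
    # return exactly the number of models still needed; a longer list
    # would simply be truncated by the caller:
    return [clone_and_mutate(model, state) for _ in 1:n_remaining]
end
```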


#### The `best` method: To define what constitutes the "optimal model"

```julia
@@ -487,8 +502,9 @@ where:
model

- `tuning_report(::MyTuningStrategy, ...)` is a method the implementer
  may overload. It should return a named tuple with `history` as one
  of the keys (the format is otherwise up to the implementation). The
  fallback is to return the raw history:

```julia
MLJTuning.tuning_report(tuning, history, state) = (history=history,)
```
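
A strategy wanting to expose extra diagnostics might overload the method along these lines (a sketch; the second key is invented for illustration):

```julia
MLJTuning.tuning_report(tuning::MyStrategy, history, state) =
    (history=history, n_models_evaluated=length(history))
```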
10 changes: 7 additions & 3 deletions src/MLJTuning.jl
@@ -7,7 +7,7 @@ module MLJTuning
export TunedModel

# defined in strategies/:
export Explicit, Grid, RandomSearch

# defined in learning_curves.jl:
export learning_curve!, learning_curve
@@ -17,26 +17,30 @@

import MLJBase
using MLJBase
import MLJBase: Bounded, Unbounded, DoublyUnbounded,
LeftUnbounded, RightUnbounded
using RecipesBase
using Distributed
import Distributions
import ComputationalResources: CPU1, CPUProcesses,
CPUThreads, AbstractResource
using Random


## CONSTANTS

const DEFAULT_N = 10 # for when `default_n` is not implemented


## INCLUDE FILES

include("utilities.jl")
include("tuning_strategy_interface.jl")
include("tuned_models.jl")
include("ranges.jl")
include("range_methods.jl")
include("strategies/explicit.jl")
include("strategies/grid.jl")
include("strategies/random_search.jl")
include("plotrecipes.jl")
include("learning_curves.jl")

145 changes: 145 additions & 0 deletions src/range_methods.jl
@@ -0,0 +1,145 @@
## BOUNDEDNESS TRAIT

# For random search and perhaps elsewhere, we need a variation on the
# built-in boundedness notions:
abstract type PositiveUnbounded <: Unbounded end
abstract type Other <: Unbounded end

boundedness(::NumericRange{<:Any,<:Bounded}) = Bounded
boundedness(::NumericRange{<:Any,<:LeftUnbounded}) = Other
boundedness(::NumericRange{<:Any,<:DoublyUnbounded}) = Other
function boundedness(r::NumericRange{<:Any,<:RightUnbounded})
if r.lower >= 0
return PositiveUnbounded
end
return Other
end
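
# Illustration of the trait (assuming MLJBase's `range` constructor
# accepts infinite bounds, as in the MLJBase version required by this
# package):
#
#     r = range(Float64, :lambda, lower=0.0, upper=Inf)
#     boundedness(r)  # PositiveUnbounded, since r.lower >= 0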

"""
MLJTuning.grid([rng, ] prototype, ranges, resolutions)
Given an iterable `ranges` of `ParamRange` objects, and an iterable
`resolutions` of the same length, return a vector of models generated
by cloning and mutating the hyperparameters (fields) of `prototype`,
according to the Cartesian grid defined by the specifed
one-dimensional `ranges` (`ParamRange` objects) and specified
`resolutions`. A resolution of `nothing` for a `NominalRange`
indicates that all values should be used.
Specification of an `AbstractRNG` object `rng` implies shuffling of
the results. Otherwise models are ordered, with the first
hyperparameter referenced cycling fastest.
"""
grid(rng::AbstractRNG, prototype::Model, ranges, resolutions) =
shuffle(rng, grid(prototype, ranges, resolutions))

function grid(prototype::Model, ranges, resolutions)

iterators = broadcast(iterator, ranges, resolutions)

A = MLJBase.unwind(iterators...)

N = size(A, 1)
map(1:N) do i
clone = deepcopy(prototype)
for k in eachindex(ranges)
field = ranges[k].field
recursive_setproperty!(clone, field, A[i,k])
end
clone
end
end
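
# Illustration only (hypothetical model `atom` with a numeric field
# `:lambda`):
#
#     r = range(atom, :lambda, lower=1.0, upper=9.0)
#     MLJTuning.grid(atom, [r,], [3,])
#
# returns three clones of `atom`, with `lambda` set to 1.0, 5.0 and
# 9.0 respectively.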


## PRE-PROCESSING OF USER-SPECIFIED CARTESIAN RANGE OBJECTS

"""
process_grid_range(user_specified_range, resolution, verbosity)
Utility to convert a user-specified range (see [`Grid`](@ref)) into a
pair of tuples `(ranges, resolutions)`.
For example, if `r1`, `r2` are `NumericRange`s and `s` is a
NominalRange` with 5 values, then we have:
julia> MLJTuning.process_grid_range([(r1, 3), r2, s], 42, 1) ==
((r1, r2, s), (3, 42, 5))
true
If `verbosity` > 0, then a warning is issued if a `Nominal` range is
paired with a resolution.
"""
process_grid_range(user_specified_range, args...) =
throw(ArgumentError("Unsupported range. "))

process_grid_range(usr::Union{ParamRange,Tuple{ParamRange,Int}}, args...) =
process_grid_range([usr, ], args...)

function process_grid_range(user_specified_range::AbstractVector,
resolution, verbosity)
# r unpaired:
stand(r) = throw(ArgumentError("Unsupported range. "))
stand(r::NumericRange) = (r, resolution)
stand(r::NominalRange) = (r, length(r.values))

# (r, res):
stand(t::Tuple{NumericRange,Integer}) = t
function stand(t::Tuple{NominalRange,Integer})
verbosity < 0 ||
@warn "Ignoring a resolution specified for a `NominalRange`. "
return (first(t), length(first(t).values))
end

ret = zip(stand.(user_specified_range)...) |> collect
return first(ret), last(ret)
end

"""
process_random_range(user_specified_range,
bounded,
positive_unbounded,
other)
Utility to convert a user-specified range (see [`RandomSearch`](@ref))
into an n-tuple of `(field, sampler)` pairs.
"""
process_random_range(user_specified_range, args...) =
throw(ArgumentError("Unsupported range #1. "))

const DIST = Distributions.Distribution

process_random_range(user_specified_range::Union{ParamRange, Tuple{Any,Any}},
args...) =
process_random_range([user_specified_range, ], args...)

function process_random_range(user_specified_range::AbstractVector,
bounded,
positive_unbounded,
other)
# r not paired:
stand(r) = throw(ArgumentError("Unsupported range #2. "))
stand(r::NumericRange) = stand(r, boundedness(r))
stand(r::NumericRange, ::Type{<:Bounded}) = (r.field, sampler(r, bounded))
stand(r::NumericRange, ::Type{<:Other}) = (r.field, sampler(r, other))
stand(r::NumericRange, ::Type{<:PositiveUnbounded}) =
(r.field, sampler(r, positive_unbounded))
stand(r::NominalRange) = (n = length(r.values);
(r.field, sampler(r, fill(1/n, n))))
# (r, d):
stand(t::Tuple{ParamRange,Any}) = stand(t...)
stand(r, d) = throw(ArgumentError("Unsupported range #3. "))
stand(r::NominalRange, d::AbstractVector{Float64}) = _stand(r, d)
stand(r::NumericRange, d::Union{DIST, Type{<:DIST}}) = _stand(r, d)
_stand(r, d) = (r.field, sampler(r, d))

# (field, s):
stand(t::Tuple{Union{Symbol,Expr},Any}) = t

return Tuple(stand.(user_specified_range))

# ret = zip(stand.(user_specified_range)...) |> collect
# return first(ret), last(ret)
end
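
# Illustration only (hypothetical bounded range; `Dist` abbreviates
# `Distributions`, and passing distribution *types* to `sampler` is
# assumed supported by the required MLJBase version):
#
#     r = range(Float64, :alpha, lower=1.0, upper=2.0)
#     process_random_range([r,], Dist.Uniform, Dist.Gamma, Dist.Normal)
#
# returns `((:alpha, sampler(r, Dist.Uniform)),)`, a tuple of
# `(field, sampler)` pairs ready for use by `RandomSearch`.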
70 changes: 0 additions & 70 deletions src/ranges.jl

This file was deleted.

10 changes: 7 additions & 3 deletions src/strategies/grid.jl
@@ -9,7 +9,11 @@ default `resolution` in each numeric dimension.
### Supported ranges:
A single one-dimensional range or vector of one-dimensional ranges
can be specified. Specifically, in `Grid` search, the `range` field
of a `TunedModel` instance can be:

- A single one-dimensional range (ie, `ParamRange` object) `r`, or a pair of
the form `(r, res)` where `res` specifies a resolution to override
the default `resolution`.
@@ -83,7 +87,7 @@ end

function setup(tuning::Grid, model, user_range, verbosity)
ranges, resolutions =
        process_grid_range(user_range, tuning.resolution, verbosity)
resolutions = adjusted_resolutions(tuning.goal, ranges, resolutions)

fields = map(r -> r.field, ranges)
@@ -123,7 +127,7 @@ end

function default_n(tuning::Grid, user_range)
ranges, resolutions =
        process_grid_range(user_range, tuning.resolution, -1)

resolutions = adjusted_resolutions(tuning.goal, ranges, resolutions)
len(t::Tuple{NumericRange,Integer}) = length(iterator(t[1], t[2]))
