Commit

…ing.jl into dev
ablaom committed Mar 25, 2020
2 parents e3775f1 + 7de9ab0 commit 7c0bac0
Showing 6 changed files with 160 additions and 73 deletions.
79 changes: 43 additions & 36 deletions README.md
@@ -153,14 +153,17 @@ begin, on the basis of the specific strategy and a user-specified
measures](https://alan-turing-institute.github.io/MLJ.jl/dev/performance_measures/)
for details.

- The *history* is a vector of tuples generated by the tuning
algorithm - one tuple per iteration - used to determine the optimal
model and which also records other user-inspectable statistics that
may be of interest - for example, evaluations of a measure (loss or
score) different from one being explicitly optimized. Each tuple is
of the form `(m, r)`, where `m` is a model instance and `r` is
information
about `m` extracted from an evaluation.
- The *history* is a vector of tuples of the form `(m, r)` generated
by the tuning algorithm - one tuple per iteration - where `m` is a
model instance that has been evaluated, and `r` (called the
*result*) contains three kinds of information: (i) whatever parts of
the evaluation are needed to determine the optimal model; (ii)
additional user-inspectable statistics that may be of interest - for
example, evaluations of a measure (loss or score) different from the
one being explicitly optimized; and (iii) any model "metadata" that a
tuning strategy implementation may need to record for generating the
next batch of model candidates - for example, an
implementation-specific representation of the model.

- A *tuning strategy* is an instance of some subtype `S <:
TuningStrategy`, the name `S` (e.g., `Grid`) indicating the tuning
@@ -320,19 +323,22 @@ which is recorded in its `field` attribute, but for composite models
this might be a "nested name", such as `:(atom.max_depth)`.


#### The `result` method: For declaring what parts of an evaluation goes into the history
#### The `result` method: For building each entry of the history

```julia
MLJTuning.result(tuning::MyTuningStrategy, history, e)
MLJTuning.result(tuning::MyTuningStrategy, history, state, e, metadata)
```

This method is for extracting from an evaluation `e` of some model `m`
the value of `r` to be recorded in the corresponding tuple `(m, r)` of
the history. The value of `r` is also allowed to depend on previous
events in the history. The fallback is:
This method is for constructing the result object `r` in each tuple
`(m, r)` written to the history. Here `e` is the evaluation of the
model `m` (as returned by a call to `evaluate!`) and `metadata` is
any metadata associated with `m` when this is included in the output
of `models!` (see below), and `nothing` otherwise. The value of `r` is
also allowed to depend on previous events in the history. The fallback
is:

```julia
MLJTuning.result(tuning, history, e) = (measure=e.measure, measurement=e.measurement)
MLJTuning.result(tuning, history, state, e, metadata) = (measure=e.measure, measurement=e.measurement)
```
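
For illustration only - the strategy name `MyTuningStrategy`, the extra
field names, and the availability of per-fold results in `e` are
assumptions, not part of the interface described above - a strategy
wanting to expose an extra statistic, and keep any metadata attached by
`models!`, might overload `result` along these lines:

```julia
# Hypothetical sketch: keep the fallback's fields, add per-fold losses as an
# extra user-inspectable statistic, and record any metadata attached by `models!`:
function MLJTuning.result(tuning::MyTuningStrategy, history, state, e, metadata)
    return (measure=e.measure,
            measurement=e.measurement,
            per_fold=e.per_fold,   # assumed to be available in the evaluation object
            metadata=metadata)     # `nothing` unless `models!` attaches metadata
end
```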

Note that, in the case of the fallback, the result is always a named tuple of
@@ -354,18 +360,13 @@ state = setup(tuning::MyTuningStrategy, model, range, verbosity)
```

The `setup` function is for initializing the `state` of the tuning
algorithm (needed, by the algorithm's `models!` method; see below). Be
sure to make this object mutable if it needs to be updated by the
`models!` method. The `state` generally stores, at the least, the
range or some processed version thereof. In momentum-based gradient
descent, for example, the state would include the previous
hyperparameter gradients, while in GP Bayesian optimization, it would
store the (evolving) Gaussian processes.

If a variable is to be reported as part of the user-inspectable
history, then it should be written to the history instead of stored in
state. An example of this might be the `temperature` in simulated
annealing.
algorithm (available to the `models!` method). Be sure to make this
object mutable if it needs to be updated by the `models!` method.

The `state` is a place to record the outcomes of any necessary
initialization of the tuning algorithm (performed by `setup`) and a
place for the `models!` method to save and read transient information
that does not need to be recorded in the history.
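
For example - a minimal sketch only, with hypothetical names - a
strategy whose `models!` method steps through a pre-computed list of
candidates might use a mutable state like this:

```julia
# Hypothetical sketch of a mutable state object returned by `setup`:
mutable struct MyStrategyState
    candidates::Vector   # processed version of the user-specified range
    next_index::Int      # transient bookkeeping, read and updated by `models!`
end

MLJTuning.setup(tuning::MyTuningStrategy, model, range, verbosity) =
    MyStrategyState(collect(range), 1)
```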

The `setup` function is called once only, when a `TunedModel` machine
is `fit!` the first time, and not on subsequent calls (unless
@@ -420,18 +421,24 @@ selection of `n - length(history)` models from the grid, so that
non-deterministically (such as simulated annealing), `models!` might
return a single model, or return a small batch of models to make use
of parallelization (the method becoming "semi-sequential" in that
case). In sequential methods that generate new models
deterministically (such as those choosing models that optimize the
expected improvement of a surrogate statistical model) `models!` would
return a single model.
case).

##### Including model metadata

If a tuning strategy implementation needs to pass additional
"metadata" along with each model, to be passed to `result` for
recording in the history, then instead of model instances, `models!`
should return a vector of *tuples* of the form `(m, metadata)`, where
`m` is a model instance and `metadata` is the associated data. See the
discussion above on `result`.
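
A minimal sketch (the strategy name is hypothetical, and the state is
assumed to be a plain vector of model instances, as in the `Explicit`
strategy):

```julia
# Hypothetical sketch: return the not-yet-evaluated models, each paired with
# some metadata - here just the model's named tuple of hyperparameters:
function MLJTuning.models!(tuning::MyTuningStrategy, model, history, state, verbosity)
    remaining = history === nothing ? state : state[length(history) + 1:end]
    return [(m, params(m)) for m in remaining]
end
```

The `MockExplicit` strategy in the test diff further below does
something similar, pairing each model with the value of its first
hyperparameter.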

If the tuning algorithm exhausts its supply of new models (because,
for example, there is only a finite supply) then `models!` should
return an empty vector. Under the hood, there is no fixed "batch-size"
parameter, and the tuning algorithm is happy to receive any number of
models. If `models!` returns a number of models exceeding the number
needed to complete the history, the trailing excess is simply ignored.

return an empty vector or `nothing`. Under the hood, there is no fixed
"batch-size" parameter, and the tuning algorithm is happy to receive
any number of models. If `models!` returns a number of models
exceeding the number needed to complete the history, the list returned
is simply truncated.

#### The `best` method: To define what constitutes the "optimal model"

6 changes: 6 additions & 0 deletions src/MLJTuning.jl
@@ -23,6 +23,12 @@ import ComputationalResources: CPU1, CPUProcesses,
CPUThreads, AbstractResource
using Random


## CONSTANTS

const DEFAULT_N = 10


## INCLUDE FILES

include("utilities.jl")
18 changes: 7 additions & 11 deletions src/strategies/explicit.jl
@@ -1,16 +1,12 @@
mutable struct Explicit <: TuningStrategy end

# models! returns all available models in the range at once:
MLJTuning.models!(tuning::Explicit, model, history::Nothing,
                  state, verbosity) = state
MLJTuning.models!(tuning::Explicit, model, history,
                  state, verbosity) = state[length(history) + 1:end]

function MLJTuning.default_n(tuning::Explicit, range)
    try
        length(range)
    catch MethodError
        10
    end
function MLJTuning.models!(tuning::Explicit,
                           model,
                           history,
                           state,
                           verbosity)
    history === nothing && return state
    return state[length(history) + 1:end]
end

75 changes: 56 additions & 19 deletions src/tuned_models.jl
@@ -197,13 +197,30 @@ end

## FIT AND UPDATE METHODS

# A *metamodel* is either a `Model` instance, `model`, or a tuple
# `(model, s)`, where `s` is extra data associated with `model` that
# the tuning strategy implementation wants available to the `result`
# method for recording in the history.

_first(m::MLJBase.Model) = m
_last(m::MLJBase.Model) = nothing
_first(m::Tuple{Model,Any}) = first(m)
_last(m::Tuple{Model,Any}) = last(m)

# returns a (model, result) pair for the history:
function event(model, resampling_machine, verbosity, tuning, history)
function event(metamodel,
               resampling_machine,
               verbosity,
               tuning,
               history,
               state)
    model = _first(metamodel)
    metadata = _last(metamodel)
    resampling_machine.model.model = model
    verb = (verbosity == 2 ? 0 : verbosity - 1)
    fit!(resampling_machine, verbosity=verb)
    e = evaluate(resampling_machine)
    r = result(tuning, history, e)
    r = result(tuning, history, state, e, metadata)

    if verbosity > 2
        println(params(model))
@@ -212,20 +229,30 @@ function event(model, resampling_machine, verbosity, tuning, history)
println("$r")
end

return deepcopy(model), r
return model, r
end

function assemble_events(models, resampling_machine,
                         verbosity, tuning, history, acceleration::CPU1)
    map(models) do m
        event(m, resampling_machine, verbosity, tuning, history)
function assemble_events(metamodels,
                         resampling_machine,
                         verbosity,
                         tuning,
                         history,
                         state,
                         acceleration::CPU1)
    map(metamodels) do m
        event(m, resampling_machine, verbosity, tuning, history, state)
    end
end

function assemble_events(models, resampling_machine,
                         verbosity, tuning, history, acceleration::CPUProcesses)
    pmap(models) do m
        event(m, resampling_machine, verbosity, tuning, history)
function assemble_events(metamodels,
                         resampling_machine,
                         verbosity,
                         tuning,
                         history,
                         state,
                         acceleration::CPUProcesses)
    pmap(metamodels) do m
        event(m, resampling_machine, verbosity, tuning, history, state)
    end
end

@@ -238,26 +265,36 @@ _length(::Nothing) = 0
# builds on an existing `history` until the length is `n` or the model
# supply is exhausted (method shared by `fit` and `update`). Returns
# the bigger history:
function build(history, n, tuning, model::M,
               state, verbosity, acceleration, resampling_machine) where M
function build(history,
               n,
               tuning,
               model,
               state,
               verbosity,
               acceleration,
               resampling_machine)
    j = _length(history)
    models_exhausted = false
    while j < n && !models_exhausted
        _models = models!(tuning, model, history, state, verbosity)
        models = _models === nothing ? M[] : collect(_models)
        Δj = length(models)
        metamodels = models!(tuning, model, history, state, verbosity)
        Δj = _length(metamodels)
        Δj == 0 && (models_exhausted = true)
        shortfall = n - Δj
        if models_exhausted && shortfall > 0 && verbosity > -1
            @info "Only $j (of $n) models evaluated.\n"*
                "Model supply exhausted. "
        end
        Δj == 0 && break
        shortfall < 0 && (models = models[1:n - j])
        shortfall < 0 && (metamodels = metamodels[1:n - j])
        j += Δj

        Δhistory = assemble_events(models, resampling_machine,
                                   verbosity, tuning, history, acceleration)
        Δhistory = assemble_events(metamodels,
                                   resampling_machine,
                                   verbosity,
                                   tuning,
                                   history,
                                   state,
                                   acceleration)
        history = _vcat(history, Δhistory)
    end
    return history
11 changes: 9 additions & 2 deletions src/tuning_strategy_interface.jl
@@ -5,7 +5,7 @@ MLJBase.show_as_constructed(::Type{<:TuningStrategy}) = true
setup(tuning::TuningStrategy, model, range, verbosity) = range

# for building each element of the history:
result(tuning::TuningStrategy, history, e) =
result(tuning::TuningStrategy, history, state, e, metadata) =
    (measure=e.measure, measurement=e.measurement)

# for generating batches of new models and updating the state (but not
@@ -29,4 +29,11 @@ end
tuning_report(tuning::TuningStrategy, history, state) = (history=history,)

# for declaring the default number of models to evaluate:
default_n(tuning::TuningStrategy, range) = 10
function default_n(tuning::TuningStrategy, range)
    try
        length(range)
    catch MethodError
        DEFAULT_N
    end
end

44 changes: 39 additions & 5 deletions test/tuned_models.jl
@@ -1,14 +1,16 @@
module TestTunedModels

using Distributed

using Test
using MLJTuning
using MLJBase
import ComputationalResources: CPU1, CPUProcesses, CPUThreads
using Random
Random.seed!(1234)
@everywhere using ..Models

@everywhere begin
    using ..Models
    using MLJTuning # gets extended in tests
end

using ..TestUtilities

N = 30
@@ -86,7 +88,39 @@ end
    @test map(event -> last(event).measurement[1], history) ≈ results
end)

@everywhere begin

    # variation of the Explicit strategy that annotates the models
    # with metadata
    mutable struct MockExplicit <: MLJTuning.TuningStrategy end

    annotate(model) = (model, params(model)[1])

    function MLJTuning.models!(tuning::MockExplicit,
                               model,
                               history,
                               state,
                               verbosity)
        history === nothing && return annotate.(state)
        return annotate.(state)[length(history) + 1:end]
    end

    MLJTuning.result(tuning::MockExplicit, history, state, e, metadata) =
        (measure=e.measure, measurement=e.measurement, K=metadata)
end

true
@test MockExplicit == MockExplicit

@testset_accelerated("passing of model metadata", accel,
(exclude=[CPUThreads],), begin
tm = TunedModel(model=first(r), tuning=MockExplicit(),
range=r, resampling=CV(nfolds=2),
measures=[rms, l1], acceleration=accel)
fitresult, meta_state, report = fit(tm, 0, X, y);
history, _, state = meta_state;
for (m, r) in history
#@test m.K == r.K
end
end)

true
