MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.
In addition to hold-out and cross-validation, the user can specify their own list of train/test pairs of row indices for resampling, or define their own re-usable resampling strategies.
For simultaneously evaluating multiple models and/or data sets, see Benchmarking.
```julia
using MLJ

X = (a=rand(12), b=rand(12), c=rand(12));
y = X.a + 2X.b + 0.05*rand(12);
model = @load RidgeRegressor pkg=MultivariateStats
cv = CV(nfolds=3)
evaluate(model, X, y, resampling=cv, measure=l2, verbosity=0)
```
Alternatively, instead of applying `evaluate` to a model + data, one may call `evaluate!` on an existing machine wrapping the model in data:
```julia
mach = machine(model, X, y)
evaluate!(mach, resampling=cv, measure=l2, verbosity=0)
```
(The latter is a mutating call, as the learned parameters stored in the machine will generally change.)
Multiple measures are specified by passing a vector:

```julia
evaluate!(mach,
          resampling=cv,
          measure=[l1, rms, rmslp1], verbosity=0)
```
Custom measures can also be provided, with trait declarations controlling how they are interpreted:

```julia
# custom loss returning a single value per fold:
my_loss(yhat, y) = maximum((yhat - y).^2);

# custom loss returning one value per observation:
my_per_observation_loss(yhat, y) = abs.(yhat - y);
MLJ.reports_each_observation(::typeof(my_per_observation_loss)) = true;

# custom score supporting per-observation weights:
my_weighted_score(yhat, y) = 1/mean(abs.(yhat - y));
my_weighted_score(yhat, y, w) = 1/mean(abs.((yhat - y).^w));
MLJ.supports_weights(::typeof(my_weighted_score)) = true;
MLJ.orientation(::typeof(my_weighted_score)) = :score;
```
```julia
holdout = Holdout(fraction_train=0.8)
weights = [1, 1, 2, 1, 1, 2, 3, 1, 1, 2, 3, 1];
evaluate!(mach,
          resampling=CV(nfolds=3),
          measure=[my_loss, my_per_observation_loss, my_weighted_score, l1],
          weights=weights, verbosity=0)
```
Users can either provide their own list of train/test pairs of row indices for resampling, as in this example:
```julia
fold1 = 1:6; fold2 = 7:12;
evaluate!(mach,
          resampling=[(fold1, fold2), (fold2, fold1)],
          measure=[l1, l2], verbosity=0)
```
Or define their own re-usable `ResamplingStrategy` objects; see Custom resampling strategies below.
- `MLJBase.Holdout`
- `MLJBase.CV`
- `MLJBase.StratifiedCV`
To define your own resampling strategy, make relevant parameters of your strategy the fields of a new type `MyResamplingStrategy <: MLJ.ResamplingStrategy`, and implement one of the following methods:

```julia
MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows)
MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, y)
MLJ.train_test_pairs(my_strategy::MyResamplingStrategy, rows, X, y)
```
Each method takes a vector of indices `rows` and returns a vector `[(t1, e1), (t2, e2), ..., (tk, ek)]` of train/test pairs of row indices selected from `rows`. Here `X`, `y` are the input and target data (ignored in simple strategies, such as `Holdout` and `CV`).
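To get a concrete sense of the return value, one can call the method directly on a built-in strategy (a quick sketch; the exact partitioning obtained depends on the strategy's defaults, such as whether shuffling is enabled):

```julia
using MLJ

# generate the train/test pairs that 3-fold cross-validation
# would use on rows 1:12; the result is a vector of
# (train_rows, test_rows) tuples, one per fold:
pairs = MLJ.train_test_pairs(CV(nfolds=3), 1:12)
```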
Here is the code for the `Holdout` strategy, as an example:
```julia
struct Holdout <: ResamplingStrategy
    fraction_train::Float64
    shuffle::Bool
    rng::Union{Int,AbstractRNG}

    function Holdout(fraction_train, shuffle, rng)
        0 < fraction_train < 1 ||
            error("`fraction_train` must be between 0 and 1.")
        return new(fraction_train, shuffle, rng)
    end
end

# keyword constructor:
function Holdout(; fraction_train::Float64=0.7, shuffle=nothing, rng=nothing)
    if rng isa Integer
        rng = MersenneTwister(rng)
    end
    if shuffle === nothing
        shuffle = ifelse(rng === nothing, false, true)
    end
    if rng === nothing
        rng = Random.GLOBAL_RNG
    end
    return Holdout(fraction_train, shuffle, rng)
end

function train_test_pairs(holdout::Holdout, rows)
    train, test = partition(rows, holdout.fraction_train,
                            shuffle=holdout.shuffle, rng=holdout.rng)
    return [(train, test),]
end
```
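As a further illustration of this pattern, here is a hypothetical `EvenOdd` strategy (not part of MLJ; the name and behavior are invented for this sketch) that produces two folds by splitting `rows` by position parity:

```julia
using MLJ

# hypothetical two-fold strategy: train on odd-positioned rows and
# test on even-positioned rows, then swap:
struct EvenOdd <: MLJ.ResamplingStrategy end

function MLJ.train_test_pairs(::EvenOdd, rows)
    odd  = rows[1:2:end]
    even = rows[2:2:end]
    return [(odd, even), (even, odd)]
end
```

An instance could then be passed via `resampling=EvenOdd()` to `evaluate!`, just like the built-in strategies.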
- `MLJBase.evaluate!`
- `MLJBase.evaluate`