
# Learning Curves

A learning curve in MLJ is a plot of some performance estimate as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The `learning_curve!` method does not itself generate a plot; it generates the data needed to produce one.

To generate learning curves you must bind data to a model by instantiating a machine. You may supply all available data, as performance estimates are computed using a resampling strategy, which defaults to `Holdout(fraction_train=0.7)`.

```julia
X, y = @load_boston;

atom = @load RidgeRegressor pkg=MultivariateStats
ensemble = EnsembleModel(atom=atom, n=1000)
mach = machine(ensemble, X, y)

r_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)
curve = MLJ.learning_curve!(mach; range=r_lambda, resampling=CV(), measure=rms)

using Plots
plot(curve.parameter_values,
     curve.measurements,
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab="CV estimate of RMS error")
```
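Any MLJ resampling strategy can be supplied in place of the `CV()` used above. As a sketch, here is the same curve computed with the default holdout strategy but a larger training fraction (the value `0.8` is illustrative, not a recommendation):

```julia
# Assumes X, y, ensemble and r_lambda are defined as above.
mach2 = machine(ensemble, X, y)
curve2 = MLJ.learning_curve!(mach2;
                             range=r_lambda,
                             resampling=Holdout(fraction_train=0.8),
                             measure=rms)
```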

When the hyperparameter is the number of iterations of some iterative model, `learning_curve!` does not restart training from scratch for each new value, provided a holdout resampling strategy is used (the default) and the model implements an appropriate `update` method.

```julia
atom.lambda = 200
r_n = range(ensemble, :n, lower=1, upper=250)
curves = MLJ.learning_curve!(mach; range=r_n, verbosity=0, n=5)

plot(curves.parameter_values,
     curves.measurements,
     xlab=curves.parameter_name,
     ylab="Holdout estimate of RMS error")
```

## API reference

`learning_curve!`