MLJ has a model registry, allowing the user to search models and their
properties, without loading all the packages containing model code. In
turn, this allows one to efficiently find all models solving a given
machine learning task. The task itself is specified with the help of
the matching method, and the search is executed with the models
method, as detailed below.
Terminology. In this section the word "model" refers to a metadata
entry in the model registry, as opposed to an actual model struct
that such an entry represents. One can obtain such an entry with the
info command:
using MLJ
MLJ.color_off()
info("PCA")
So a "model" in the present context is just a named tuple containing
metadata, and not an actual model type or instance. If two models with
the same name occur in different packages, the package name must be
specified, as in info("LinearRegressor", pkg="GLM").
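As a brief sketch of what working with such an entry looks like, the metadata can be read off as properties. The package is named explicitly here on the assumption that the PCA entry of interest is the one provided by MultivariateStats; the variable name meta is purely illustrative.

```julia
using MLJ

# Look up the registry entry. Naming the package avoids any ambiguity
# should another package also register a model called "PCA".
meta = info("PCA", pkg="MultivariateStats")

# Metadata is accessed as properties of the entry:
meta.name            # the model name, as a string
meta.package_name    # the package providing the model code
meta.is_supervised   # false, since PCA is an unsupervised model
meta.input_scitype   # scientific type the model accepts as input
```

No model code is loaded by any of these calls; only the registry is consulted.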
We list all models (named tuples) using models(), and list the models
for which code is already loaded with localmodels():
localmodels()
localmodels()[2]
One can search for models containing specified strings or regular
expressions in their docstring attributes, as in
models("forest")
or by specifying a filter (Bool-valued function):
filter(model) = model.is_supervised &&
                model.input_scitype >: MLJ.Table(Continuous) &&
                model.target_scitype >: AbstractVector{<:Multiclass{3}} &&
                model.prediction_type == :deterministic
models(filter)
Multiple test arguments may be passed to models, which are applied
conjunctively.
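To illustrate the conjunctive behaviour, here is a minimal sketch in which two separate tests are passed to models and the result is compared with a single combined test. The helper names is_classifier and is_pure_julia_model are hypothetical, introduced only for this example.

```julia
using MLJ

# Two independent Bool-valued tests on a metadata entry
# (helper names are illustrative, not part of MLJ):
is_classifier(m) = m.is_supervised &&
                   m.target_scitype >: AbstractVector{<:Finite}
is_pure_julia_model(m) = m.is_pure_julia

# Passing both tests returns only the entries satisfying each of them ...
both = models(is_classifier, is_pure_julia_model)

# ... which should agree with a single test combining them using &&:
combined = models(m -> is_classifier(m) && is_pure_julia_model(m))

# Compare the two results by name and package:
[(m.name, m.package_name) for m in both] ==
    [(m.name, m.package_name) for m in combined]
```

Keeping the tests separate makes them easy to reuse across different searches.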
Common searches are streamlined with the help of the matching
command, defined as follows:
- matching(model, X, y) == true exactly when model is supervised and
  admits inputs and targets with the scientific types of X and y,
  respectively
- matching(model, X) == true exactly when model is unsupervised and
  admits inputs with the scientific types of X.
So, to search for all supervised probabilistic models handling input
X and target y, one can define the testing function task by
task(model) = matching(model, X, y) && model.prediction_type == :probabilistic
and execute the search with
models(task)
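For a concrete run of this search, one needs actual data for X and y. The sketch below assumes the small iris dataset bundled with MLJ, loaded with the @load_iris macro; any table X and target vector y would serve equally well, and the name candidates is illustrative.

```julia
using MLJ

# A small built-in dataset: X is a table of continuous features,
# y a categorical vector with three classes.
X, y = @load_iris

# The scientific type of the target, which `matching` compares with
# the target_scitype recorded in each registry entry:
scitype(y)

# The testing function from the text, now with X and y defined:
task(model) = matching(model, X, y) &&
              model.prediction_type == :probabilistic
candidates = models(task)

# Each result is a metadata entry, so names and packages can be listed:
[(m.name, m.package_name) for m in candidates]
```

The list returned depends on the version of the model registry installed, so no particular output is shown here.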
Also defined are Bool-valued callable objects matching(model),
matching(X, y) and matching(X), with obvious behaviour. For example,
matching(X, y)(model) = matching(model, X, y).
So, to search for all models compatible with input X and target y,
for example, one executes
models(matching(X, y))
while the preceding search can also be written
models() do model
    matching(model, X, y) &&
    model.prediction_type == :probabilistic
end
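The different spellings of a search described above can be checked against one another. The following sketch again assumes the iris data from @load_iris for X and y; the variable names are illustrative. It also includes the unsupervised form matching(X), which the examples above do not exercise.

```julia
using MLJ

X, y = @load_iris

# The same supervised search, written three ways:
curried   = models(matching(X, y))                   # callable object
anonymous = models(model -> matching(model, X, y))   # anonymous function
doblock   = models() do model                        # do-block syntax
    matching(model, X, y)
end

# Helper (illustrative) reducing a list of entries to comparable pairs:
ids(list) = [(m.name, m.package_name) for m in list]
ids(curried) == ids(anonymous) == ids(doblock)

# Unsupervised models accepting X as input:
unsupervised = models(matching(X))
```

The do-block form is convenient when the test spans several lines; the callable form is the shortest when no extra conditions are needed.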