A model is an object storing hyperparameters associated with some machine learning algorithm, and that is all. In MLJ, hyperparameters include configuration parameters, like the number of threads, and special instructions, such as "compute feature rankings", which may or may not affect the final learning outcome. However, the logging level (`verbosity` below) is excluded. Learned parameters (such as the coefficients in a linear model) have no place in the model struct.
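To make the distinction concrete, here is a minimal sketch of such a model struct for a hypothetical ridge regressor. The name `MyRidgeRegressor` and its single hyperparameter `lambda` are illustrative, not part of any library, and the supertype declaration required in practice is omitted here:

```julia
# A model struct holds hyperparameters only. A keyword constructor
# with sensible defaults is the conventional way to build one.
mutable struct MyRidgeRegressor
    lambda::Float64   # regularization strength: a hyperparameter
end

MyRidgeRegressor(; lambda=0.1) = MyRidgeRegressor(lambda)

model = MyRidgeRegressor(lambda=0.5)
```

Note that the learned coefficients are deliberately absent: they belong to the outcome of training, not to the model.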
The name of the Julia type associated with a model indicates the associated algorithm (e.g., `DecisionTreeClassifier`). The outcome of training a learning algorithm is called a *fitresult*. For ordinary multivariate regression, for example, this would be the coefficients and intercept. For a general supervised model, it is the (generally minimal) information needed to make new predictions.
The ultimate supertype of all models is `MLJModelInterface.Model`, which has two abstract subtypes:

```julia
abstract type Supervised <: Model end
abstract type Unsupervised <: Model end
```
`Supervised` models are further divided according to whether they are able to furnish probabilistic predictions of the target (which they will then do by default) or directly predict "point" estimates, for each new input pattern:

```julia
abstract type Probabilistic <: Supervised end
abstract type Deterministic <: Supervised end
```
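A concrete model type picks its place in this hierarchy by subtyping one of the two. The sketch below uses local stand-ins for the abstract types above, so it runs without MLJModelInterface installed; `MyRidgeRegressor` is again a hypothetical example, not a library type:

```julia
# Local stand-ins mirroring the MLJModelInterface hierarchy:
abstract type Model end
abstract type Supervised <: Model end
abstract type Deterministic <: Supervised end

# A regressor making point predictions subtypes Deterministic;
# a classifier returning distributions would subtype Probabilistic.
mutable struct MyRidgeRegressor <: Deterministic
    lambda::Float64
end
```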
Further division of model types is realized through trait declarations.

Associated with every concrete subtype of `Model` there must be a `fit` method, which implements the associated algorithm to produce the fitresult. Additionally, every `Supervised` model has a `predict` method, while `Unsupervised` models must have a `transform` method. More generally, methods such as these, that are dispatched on a model instance and a fitresult (plus other data), are called *operations*. `Probabilistic` supervised models optionally implement a `predict_mode` operation (in the case of classifiers) or `predict_mean` and/or `predict_median` operations (in the case of regressors), although MLJModelInterface also provides fallbacks that will suffice in most cases. `Unsupervised` models may implement an `inverse_transform` operation.
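As a sketch of how these pieces fit together, here is a self-contained ordinary regressor following the `fit`/`predict` pattern just described. The model and function names are hypothetical; in real MLJModelInterface code these would be methods of `MLJModelInterface.fit` and `MLJModelInterface.predict` (whose exact signatures and return conventions should be checked against the current API documentation), and the model type would subtype `Deterministic`:

```julia
using LinearAlgebra

mutable struct MyLinearRegressor end   # no hyperparameters needed here

# `fit` produces the fitresult: for ordinary regression, the
# coefficients and intercept -- exactly the minimal information
# `predict` will need.
function fit(model::MyLinearRegressor, X::Matrix, y::Vector)
    A = hcat(X, ones(size(X, 1)))       # append an intercept column
    theta = A \ y                       # least-squares solve
    fitresult = (coefficients = theta[1:end-1], intercept = theta[end])
    return fitresult
end

# An "operation": dispatched on the model and the fitresult.
predict(model::MyLinearRegressor, fitresult, Xnew::Matrix) =
    Xnew * fitresult.coefficients .+ fitresult.intercept

X = [1.0 2.0; 2.0 1.0; 3.0 3.0; 4.0 5.0]
y = X * [2.0, -1.0] .+ 3.0              # exact linear data
fr = fit(MyLinearRegressor(), X, y)
yhat = predict(MyLinearRegressor(), fr, X)
```

The key design point carried over from the text: the model instance stays fixed while the fitresult flows from `fit` into the operation.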