[Link to tutorial](https://juliaai.github.io/DataScienceTutorials.jl/getting-started/learning-networks-2/)

In [1]:
using Pkg
Pkg.activate(".")
Pkg.instantiate()

  Activating project at `~/Repos/mike_scratch/mlj_tutorial/A-learning-networks-2`


└ @ nothing /Users/mph/Repos/mike_scratch/mlj_tutorial/A-learning-networks-2/Manifest.toml:0


In [2]:
using MLJ
using StableRNGs
import DataFrames: DataFrame

In [3]:
Ridge = @load RidgeRegressor pkg=MultivariateStats

rng = StableRNG(71)
x1 = rand(rng, 300)
x2 = rand(rng, 300)
x3 = rand(rng, 300)
y = exp.(x1 - x2 - 2*x3 + 0.1*rand(rng,300))
X = DataFrame(x1=x1, x2=x2, x3=x3)

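# Note: `partition` returns the leading 80% of the indices first, so `test`
# is bound to rows 1:240 and `train` to rows 241:300 (see the output below).
test, train = partition(eachindex(y), 0.8)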

┌ Info: For silent loading, specify `verbosity=0`. 
└ @ Main /Users/mph/.julia/packages/MLJModels/tMgLW/src/loading.jl:168


import MLJMultivariateStatsInterface ✔




([1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  231, 232, 233, 234, 235, 236, 237, 238, 239, 240], [241, 242, 243, 244, 245, 246, 247, 248, 249, 250  …  291, 292, 293, 294, 295, 296, 297, 298, 299, 300])
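
The tuple printed above puts indices 1 to 240 first and 241 to 300 second. As a rough illustration of that behaviour, here is a minimal sketch in plain Julia. The helper `split_indices` is hypothetical and is not MLJ's implementation of `partition`; it only assumes an unshuffled split at a rounded cut point.

```julia
# Hypothetical helper (not part of MLJ): split a range of row indices into a
# leading block holding `fraction` of them and a trailing block with the rest.
function split_indices(idx, fraction)
    n_first = round(Int, fraction * length(idx))
    return idx[1:n_first], idx[n_first+1:end]
end

first_part, second_part = split_indices(1:300, 0.8)
println(length(first_part), " rows in the first block")    # the larger block
println(length(second_part), " rows in the second block")  # the remainder
```

Whatever names are put on the left-hand side of the assignment, the first name receives the larger leading block when the fraction is 0.8.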

There are two approaches to generating a model from a network: using the `@from_network` macro, or writing the full model by hand. The first is usually simpler.

## Using the `@from_network` macro

In [4]:
Xs = source(X)
ys = source(y)

Source @140 ⏎ `AbstractVector{Continuous}`

In [5]:
# First layer
std_model = Standardizer()
stand = machine(std_model, Xs)
W = transform(stand, Xs)

box_model = UnivariateBoxCoxTransformer()
box_mach = machine(box_model, ys)
z = transform(box_mach, ys)

Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Source @140
  formula:
    transform(
        Machine{UnivariateBoxCoxTransformer,…}, 
        Source @140)
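
To make the first layer more concrete, the sketch below writes out the two transformations in plain Julia. It is an illustration under stated assumptions, not the code MLJ runs: the function names `standardize`, `boxcox` and `inv_boxcox` are hypothetical, and the Box-Cox exponent `λ` is fixed by hand here, whereas the MLJ transformer estimates it from the data when the machine is fitted.

```julia
using Statistics

# Standardization: subtract the mean and divide by the standard deviation.
standardize(v) = (v .- mean(v)) ./ std(v)

# Box-Cox transform with a fixed exponent λ (λ = 0 reduces to the logarithm).
boxcox(y, λ) = λ == 0 ? log.(y) : (y .^ λ .- 1) ./ λ

# Inverse of the transform above, used to map predictions back to the
# original scale of the target.
inv_boxcox(z, λ) = λ == 0 ? exp.(z) : (λ .* z .+ 1) .^ (1 / λ)

v = [1.0, 2.0, 3.0, 4.0]
w = standardize(v)           # zero mean, unit standard deviation
z = boxcox(v, 0.5)           # transformed target
v_back = inv_boxcox(z, 0.5)  # round trip recovers v up to rounding
```

The round trip is the reason the network keeps a single `box_mach`: the same fitted transformer is needed later to undo the transformation on the predictions.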

In [6]:
# Second layer
ridge_model = Ridge(lambda=0.1)
ridge = machine(ridge_model, W, z)
ẑ = predict(ridge, W)

Node{Machine{RidgeRegressor,…}}
  args:
    1:	Node{Machine{Standardizer,…}}
  formula:
    predict(
        Machine{RidgeRegressor,…}, 
        transform(
            Machine{Standardizer,…}, 
            Source @974))
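
For readers who want to see what the ridge step amounts to, here is a minimal closed-form sketch in plain Julia. The helpers `ridge_fit` and `ridge_predict` are hypothetical, and the treatment of the intercept and the scaling of the penalty are assumptions; the MultivariateStats implementation may differ in both respects.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper: ridge regression with an unpenalized intercept,
# solved through the normal equations on centered data.
function ridge_fit(W::AbstractMatrix, z::AbstractVector, lambda::Real)
    W_mean = mean(W, dims=1)
    z_mean = mean(z)
    Wc = W .- W_mean
    zc = z .- z_mean
    coefs = (Wc' * Wc + lambda * I) \ (Wc' * zc)
    intercept = z_mean - dot(vec(W_mean), coefs)
    return coefs, intercept
end

# Prediction is a linear map plus the intercept.
ridge_predict(W, coefs, intercept) = W * coefs .+ intercept

# Toy data generated exactly from z = 1 + 2*w1 - w2.
W = [0.0 1.0; 1.0 0.0; 2.0 1.0; 3.0 3.0]
z = 1.0 .+ 2.0 .* W[:, 1] .- W[:, 2]

coefs, intercept = ridge_fit(W, z, 0.0)      # no penalty: exact recovery
coefs_shrunk, _ = ridge_fit(W, z, 10.0)      # penalty shrinks the coefficients
```

With `lambda = 0` the sketch reduces to ordinary least squares; increasing `lambda` pulls the coefficient vector towards zero.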

In [7]:
# Output
ŷ = inverse_transform(box_mach, ẑ)

Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Node{Machine{RidgeRegressor,…}}
  formula:
    inverse_transform(
        Machine{UnivariateBoxCoxTransformer,…}, 
        predict(
            Machine{RidgeRegressor,…}, 
            transform(
                Machine{Standardizer,…}, 
                Source @974)))

A learning network must be exported to create a new stand-alone model type. Instances of that type can then be bound to data in a machine and evaluated. Somewhat paradoxically, the network is first wrapped in a special kind of machine, called a learning network machine, and the export process in fact requires this. Since a composite model type does not yet exist, one constructs the machine using a "surrogate" model, whose name indicates the ultimate model supertype (`Deterministic`, `Probabilistic`, `Unsupervised` or `Static`). This surrogate model has no fields.

In [8]:
surrogate = Deterministic()
mach = machine(surrogate, Xs, ys; predict=ŷ)

fit!(mach)
predict(mach, X[test[1:5], :])

┌ Info: Training Machine{UnivariateBoxCoxTransformer,…}.
└ @ MLJBase /Users/mph/.julia/packages/MLJBase/MuLnJ/src/machines.jl:464
┌ Info: Training Machine{Standardizer,…}.
└ @ MLJBase /Users/mph/.julia/packages/MLJBase/MuLnJ/src/machines.jl:464


┌ Info: Training Machine{RidgeRegressor,…}.
└ @ MLJBase /Users/mph/.julia/packages/MLJBase/MuLnJ/src/machines.jl:464


5-element Vector{Float64}:
 1.231116310512772
 0.8325844617296161
 0.1472180186904795
 0.13611519633371513
 0.8525369962311635

We defined the learning network machine `mach` above. The following code defines a new model type `CompositeModel <: Supervised` with a single field, `regressor`.

In [9]:
@from_network mach begin
    mutable struct CompositeModel
        regressor=ridge_model
    end
end

This defines a constructor `CompositeModel` and gives a name to the component model exposed as a field (here `regressor`); the ordering of, and connections between, the nodes are inferred from the learning network machine `mach`, whose `predict` node is `ŷ`.

Note: if this were a probabilistic model (e.g. RidgeClassifier) we would have needed to add `is_probabilistic=true` at the end.

In [10]:
cm = machine(CompositeModel(), X, y)
res = evaluate!(cm, resampling=Holdout(fraction_train=0.8, rng=71), measure=rms)
round(res.measurement[1], sigdigits=3)

0.013
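
As a rough picture of what a holdout evaluation with a root-mean-square measure computes, here is a sketch in plain Julia. The names `rms_error`, `holdout_rms` and `mean_model` are hypothetical, and the split is unshuffled, whereas passing `rng` to `Holdout` as above shuffles the rows first.

```julia
using Statistics

# Root-mean-square error between predictions and ground truth.
rms_error(yhat, y) = sqrt(mean(abs2, yhat .- y))

# Hypothetical holdout evaluation: fit on the leading fraction of the rows,
# then score the predictions on the remaining rows.
function holdout_rms(fit_predict, X, y; fraction_train=0.8)
    n_train = round(Int, fraction_train * length(y))
    train, test = 1:n_train, n_train+1:length(y)
    yhat = fit_predict(X[train, :], y[train], X[test, :])
    return rms_error(yhat, y[test])
end

# Toy "model" that ignores the features and predicts the training mean.
mean_model(Xtrain, ytrain, Xtest) = fill(mean(ytrain), size(Xtest, 1))

X = reshape(collect(1.0:10.0), :, 1)
y = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 4.0]
score = holdout_rms(mean_model, X, y)  # errors of 1 and 3 on the two test rows
```

The score depends on which rows end up in the held-out block, which is one reason two evaluations with different `Holdout` settings can report different numbers.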

## Defining a model from scratch

Instead of using `@from_network`, we can define the model from scratch.

In [11]:
mutable struct CompositeModel2 <: DeterministicComposite
    std_model::Standardizer
    box_model::UnivariateBoxCoxTransformer
    ridge_model::Ridge
end

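function MLJ.fit(m::CompositeModel2, verbosity::Int, X, y)
    # Wrap the training data in source nodes
    Xs = source(X)
    ys = source(y)
    # First layer: standardize the features, Box-Cox transform the target
    W = MLJ.transform(machine(m.std_model, Xs), Xs)
    box = machine(m.box_model, ys)
    z = MLJ.transform(box, ys)
    # Second layer: ridge regression on the transformed data
    ẑ = predict(machine(m.ridge_model, W, z), W)
    # Output: map predictions back to the original target scale
    ŷ = inverse_transform(box, ẑ)
    # Wrap the network in a learning network machine and fit it
    mach = machine(Deterministic(), Xs, ys; predict=ŷ)
    return!(mach, m, verbosity - 1)
end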

mdl = CompositeModel2(Standardizer(), UnivariateBoxCoxTransformer(),
                      Ridge(lambda=0.1))
cm = machine(mdl, X, y)
res = evaluate!(cm, resampling=Holdout(fraction_train=0.8), measure=rms)
round(res.measurement[1], sigdigits=3)

0.0207

We now have a constructor for a model which can be used as a stand-alone object, and tuned and composed as you would any other model.