# Getting started

## Learning networks(1)

> https://juliaai.github.io/DataScienceTutorials.jl/getting-started/learning-networks/
> <br> (project folder) https://raw.githubusercontent.com/juliaai/DataScienceTutorials.jl/gh-pages/__generated/A-learning-networks.tar.gz

In [1]:
using Pkg; Pkg.activate("D:/JULIA/6_ML_with_Julia/A-learning-networks"); Pkg.instantiate()

[32m[1m  Activating[22m[39m project at `D:\JULIA\6_ML_with_Julia\A-learning-networks`


> Preliminary steps <br>
> Defining a learning network
> 1. Sources and nodes
> 2. Modifying hyperparameters

### Preliminary steps

---

Let's generate a ```DataFrame``` with some dummy regression data, let's  also load the good old ridge regressor.

In [2]:
using MLJ, StableRNGs
import DataFrames
Ridge = @load RidgeRegressor pkg = MultivariateStats

┌ Info: For silent loading, specify `verbosity=0`. 
└ @ Main C:\Users\jeffr\.julia\packages\MLJModels\tMgLW\src\loading.jl:168


import MLJMultivariateStatsInterface ✔


MLJMultivariateStatsInterface.RidgeRegressor

In [3]:
rng = StableRNG(551234) # for reproducibility

x1 = rand(rng, 300)
x2 = rand(rng, 300)
x3 = rand(rng, 300)

y = exp.(x1 - x2 -2x3 + 0.1*rand(rng,300))

300-element Vector{Float64}:
 0.7989768524813763
 0.25551474166617105
 2.0625710215755886
 0.5122281643109177
 0.1443917495388283
 0.11813879976181635
 0.44907730616054214
 0.44296882087669526
 1.2573760732039552
 0.960949506473049
 0.08631117949081181
 0.5170659064112336
 0.22987789068338135
 ⋮
 0.22611734270859704
 0.10449289882428948
 0.5548012493987347
 1.0427930206469724
 0.3552638788719311
 0.684735167384825
 0.33419361690888305
 0.22963034143307637
 0.15806411334270462
 0.8097083052020333
 0.40011919768268533
 0.635678448957407

In [4]:
X = DataFrames.DataFrame(x1 = x1, x2 = x2, x3 = x3)
first(X, 3) |> pretty

┌────────────┬────────────┬────────────┐
│[1m x1         [0m│[1m x2         [0m│[1m x3         [0m│
│[90m Float64    [0m│[90m Float64    [0m│[90m Float64    [0m│
│[90m Continuous [0m│[90m Continuous [0m│[90m Continuous [0m│
├────────────┼────────────┼────────────┤
│ 0.984002   │ 0.771482   │ 0.232099   │
│ 0.891795   │ 0.747399   │ 0.770914   │
│ 0.806395   │ 0.0182751  │ 0.0721645  │
└────────────┴────────────┴────────────┘


Let's also prepare the train and test split which will be useul later on.

In [5]:
test, train = partition(eachindex(y), 0.8);

### Defining a learning network

---

In MLJ, a learning network is a directed acylclic graph (DAG) whose nodes apply trained or untrained operations such as a ```predict``` or ```transform``` (trained) or ```+```, ```vcat``` etc. (untrained). Learning networks can be seen as pipelines on steroids.

![image.png](pictures/DAG.png)

It corresponds to a fairly standard regression workflow: the data is standardized, the target is transformed using a Box-Cox transformation, a ridge regression is applied and the result is converted back by inverting the transform.

**Note** : actually this DAG is simple enough that it could also have been done with a pipeline.

### Sources and nodes

In MLJ a learning network starts at **source** nodes and flows through nodes (```X``` and ```y```) defining operations/transformations (```W```, ```z```, $\hat{z}$, $\hat{y}$). To define the source nodes, use the ```source``` function, you should specify whether it's a target:

In [6]:
Xs = source(X)
ys = source(y)

Source @717 ⏎ `AbstractVector{Continuous}`

To define an "trained-operation" node, you must simply create a machine wrapping a model and another node(the data) and indicate which operation should be performed (e.g. ```transform```):

In [7]:
stand = machine(Standardizer(), Xs)
W = transform(stand, Xs)

Node{Machine{Standardizer,…}}
  args:
    1:	Source @239
  formula:
    transform(
        [0m[1mMachine{Standardizer,…}[22m, 
        Source @239)

You can ```fit!``` a trained-operation node at any point, MLJ will fit whatever it needs that is upstream of that node. In this case, there is just a source node upstream of ```W``` so fitting ```W``` will just fit the standardizer:

In [8]:
# fit the standardizer

fit!(W, rows = train)

┌ Info: Training Machine{Standardizer,…}.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:464


Node{Machine{Standardizer,…}}
  args:
    1:	Source @239
  formula:
    transform(
        [0m[1mMachine{Standardizer,…}[22m, 
        Source @239)

If you want to get the transformed data, you can then call the node speciying on which part of the data the operation should be performed:

In [9]:
W()             # transforms all data

Unnamed: 0_level_0,x1,x2,x3
Unnamed: 0_level_1,Float64,Float64,Float64
1,1.45878,1.1752,-0.93074
2,1.14634,1.08675,0.926941
3,0.856967,-1.59115,-1.48215
4,-1.06436,-1.5056,-0.234452
5,-0.977492,1.14465,0.819002
6,-0.374852,0.89315,1.62356
7,-1.32187,1.93078,-1.64073
8,-1.71146,1.21606,-1.47321
9,-0.266556,-1.39119,-1.34372
10,1.15833,1.27126,-1.33944


In [10]:
W(rows=test, )  # transforms only test data

Unnamed: 0_level_0,x1,x2,x3
Unnamed: 0_level_1,Float64,Float64,Float64
1,1.45878,1.1752,-0.93074
2,1.14634,1.08675,0.926941
3,0.856967,-1.59115,-1.48215
4,-1.06436,-1.5056,-0.234452
5,-0.977492,1.14465,0.819002
6,-0.374852,0.89315,1.62356
7,-1.32187,1.93078,-1.64073
8,-1.71146,1.21606,-1.47321
9,-0.266556,-1.39119,-1.34372
10,1.15833,1.27126,-1.33944


In [11]:
W(X[3:4, :])    # transforms specific data

Unnamed: 0_level_0,x1,x2,x3
Unnamed: 0_level_1,Float64,Float64,Float64
1,0.856967,-1.59115,-1.48215
2,-1.06436,-1.5056,-0.234452


Let's now define the other nodes:

In [12]:
box_model = UnivariateBoxCoxTransformer()
box = machine(box_model, ys)
z = transform(box, ys)

Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Source @717
  formula:
    transform(
        [0m[1mMachine{UnivariateBoxCoxTransformer,…}[22m, 
        Source @717)

In [13]:
ridge_model = Ridge(lambda = 0.1)
ridge = machine(ridge_model, W, z)
ẑ = predict(ridge, W)
ŷ = inverse_transform(box, ẑ)

Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Node{Machine{RidgeRegressor,…}}
  formula:
    inverse_transform(
        [0m[1mMachine{UnivariateBoxCoxTransformer,…}[22m, 
        predict(
            [0m[1mMachine{RidgeRegressor,…}[22m, 
            transform(
                [0m[1mMachine{Standardizer,…}[22m, 
                Source @239)))

Note that we have not yet done any training, but if we now call ```fit!``` on ```ŷ```, it will fit all nodes upstream of ```ŷ``` that need to be re-trained:

In [14]:
fit!(ŷ, rows = train);

┌ Info: Training Machine{UnivariateBoxCoxTransformer,…}.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:464
┌ Info: Not retraining Machine{Standardizer,…}. Use `force=true` to force.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:467
┌ Info: Training Machine{RidgeRegressor,…}.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:464


Now that ```ŷ``` has been fitted, you can apply the full graph on test data (or any compatible data). For instance, let's get the ```rms``` between the ground truth and the predicted values:

In [15]:
rms(y[test], ŷ(rows = test))

0.03360496363407844

### Modifying hyperparameters

Hyperparameters can be accessed using the dot syntax as usual. Let's modify the regularisation parameter of the ridge regression:

In [16]:
ridge_model.lambda = 5.0;

Since the node ```ẑ``` corresponds to a machine that wraps ridge_model, that node has effectively changed and will be retrained:

In [17]:
fit!(ŷ, rows=train)

┌ Info: Not retraining Machine{UnivariateBoxCoxTransformer,…}. Use `force=true` to force.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:467
┌ Info: Not retraining Machine{Standardizer,…}. Use `force=true` to force.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:467
┌ Info: Updating Machine{RidgeRegressor,…}.
└ @ MLJBase C:\Users\jeffr\.julia\packages\MLJBase\MuLnJ\src\machines.jl:465


Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Node{Machine{RidgeRegressor,…}}
  formula:
    inverse_transform(
        [0m[1mMachine{UnivariateBoxCoxTransformer,…}[22m, 
        predict(
            [0m[1mMachine{RidgeRegressor,…}[22m, 
            transform(
                [0m[1mMachine{Standardizer,…}[22m, 
                Source @239)))

In [18]:
rms(y[test], ŷ(rows=test))

0.03834272597361206