# Introduction to Machine Learning in Julia with MLJ

Welcome to this little Jupyter notebook for getting to know MLJ, the go-to machine learning platform in Julia.

To start with, take a look at [MLJ's GitHub page](https://github.com/alan-turing-institute/MLJ.jl):
* well organized: it has its own [GitHub organization "JuliaAI"](https://github.com/JuliaAI)
* well maintained and supported: see the maintainers and supporting organizations listed below

> -----------------------------
>
> <div align="center">
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/MLJLogo2.svg" alt="MLJ" width="200">
> </div>
> 
> <h2 align="center">A Machine Learning Framework for Julia
> </h2>
> 
> 
> MLJ (Machine Learning in Julia) is a toolbox written in Julia
> providing a common interface and meta-algorithms for selecting,
> tuning, evaluating, composing and comparing over [160 machine learning
> models](https://alan-turing-institute.github.io/MLJ.jl/dev/list_of_supported_models/)
> written in Julia and other languages.
> 
> **New to MLJ?** Start [here](https://alan-turing-institute.github.io/MLJ.jl/dev/).
> 
> **Integrating an existing machine learning model into the MLJ
> framework?** Start [here](https://alan-turing-institute.github.io/MLJ.jl/dev/quick_start_guide_to_adding_models/).
> 
> MLJ was initially created as a Tools, Practices and Systems project at
> the [Alan Turing Institute](https://www.turing.ac.uk/)
> in 2019. Current funding is provided by a [New Zealand Strategic
> Science Investment
> Fund](https://www.mbie.govt.nz/science-and-technology/science-and-innovation/funding-information-and-opportunities/investment-funds/strategic-science-investment-fund/ssif-funded-programmes/university-of-auckland/)
> awarded to the University of Auckland.
> 
> MLJ has been developed with the support of the following organizations:
> 
> <div align="center">
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/Turing_logo.png" width = 100/>
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/UoA_logo.png" width = 100/>
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/IQVIA_logo.png" width = 100/>
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/warwick.png" width = 100/>
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/julia.png" width = 100/>
> </div>
> 
> 
> ### The MLJ Universe
> 
> The functionality of MLJ is distributed over a number of repositories
> illustrated in the dependency chart below. These repositories live at
> the [JuliaAI](https://github.com/JuliaAI) umbrella organization.
> 
> <div align="center">
>     <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/dev/material/MLJ_stack.svg" alt="Dependency Chart">
> </div>
> 
> *Dependency chart for MLJ repositories. Repositories with dashed
> connections do not currently exist but are planned/proposed.*
> 
> <br>
> <p align="center">
> <a href="CONTRIBUTING.md">Contributing</a> &nbsp;•&nbsp; 
> <a href="ORGANIZATION.md">Code Organization</a> &nbsp;•&nbsp;
> <a href="ROADMAP.md">Road Map</a> 
> </p>
> 
> #### Contributors
> 
> *Core design*: A. Blaom, F. Kiraly, S. Vollmer
> 
> *Lead contributor*: A. Blaom
> 
> *Active maintainers*: A. Blaom, S. Okon, T. Lienart, D. Aluthge
> 
>
> ------------------------

Disclaimer: Many examples and text snippets are taken directly from documentation and examples provided by MLJ.

# Let's jump into it: Supervised Learning

In [None]:
] activate .

In [None]:
using MLJ

### Loading a Machine Learning Model

In [None]:
DecisionTreeClassifier = @iload DecisionTreeClassifier  # interactive model loading

In [None]:
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree  # declarative model loading
tree = DecisionTreeClassifier()  # instance

MLJ is essentially a big wrapper that provides unified access to the models implemented in other packages.

### Loading Data

In [None]:
import RDatasets
iris = RDatasets.dataset("datasets", "iris"); # a DataFrame
y, X = unpack(iris, ==(:Species), colname -> true); # y = a vector, and X = a DataFrame 
first(X, 3) |> pretty

In [None]:
?unpack

----------------
### Fit & Predict

In [None]:
mach = machine(tree, X, y)  # adding a mutable cache to the model+data for performant training 

In [None]:
train, test = partition(eachindex(y), 0.7, shuffle=false); # 70:30 split

In [None]:
fit!(mach, rows=train)
yhat = predict(mach, X[test,:]);
yhat[3:5]

In [None]:
using Distributions
isa(yhat[1], Distribution)

In [None]:
Distributions.mode.(yhat[3:5])

In [None]:
log_loss(yhat, y[test]) |> mean
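
The cells above treat each prediction as a probability distribution over the classes. As a rough plain-Julia sketch of what "mode" and "log loss" mean here (not MLJ's implementation: the probabilities are made up, `mode_of` and `log_loss_by_hand` are hypothetical helper names, and MLJ's own `log_loss` may additionally clamp probabilities away from zero):

```julia
using Statistics

# Toy probabilistic predictions: one Dict of class probabilities per observation.
# The numbers are invented for illustration only.
toy_yhat = [
    Dict("setosa" => 0.9, "versicolor" => 0.1, "virginica" => 0.0),
    Dict("setosa" => 0.2, "versicolor" => 0.5, "virginica" => 0.3),
]
toy_y = ["setosa", "virginica"]   # the true classes

# The mode is the class with the highest predicted probability.
mode_of(d) = first(sort(collect(d); by = last, rev = true)).first

# The log loss of one observation is minus the log of the probability
# that was assigned to the true class.
log_loss_by_hand(d, truth) = -log(d[truth])

modes = mode_of.(toy_yhat)
losses = log_loss_by_hand.(toy_yhat, toy_y)
mean_loss = mean(losses)
```

A confident, correct prediction (first observation) gives a small loss, while putting little probability on the true class (second observation) is penalized more heavily.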

In [None]:
measures()

In [None]:
for m in measures()
    if "log_loss" in m.instances
        display(m)
    end
end

### Evaluate = auto fit/predict

In [None]:
mach = machine(tree, X, y)  # adding a mutable cache to the model for performant training 
evaluate!(mach, resampling=Holdout(fraction_train=0.7, shuffle=false),
    measures=[log_loss, brier_score], verbosity=0)

In [None]:
tree.max_depth = 3
evaluate!(mach, resampling=CV(shuffle=true), measure=[accuracy, balanced_accuracy], operation=predict_mode, verbosity=0)
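
To make the "auto fit/predict" idea concrete, here is a minimal plain-Julia sketch of what a cross-validated evaluation loop does. It is not MLJ's implementation: `kfold_indices`, `fit_majority` and `cv_accuracy` are hypothetical helper names, the "model" is a stand-in that always predicts the most frequent training label, and the folds are contiguous rather than shuffled.

```julia
using Statistics

# Split the indices 1:n into k contiguous folds of (nearly) equal size.
function kfold_indices(n, k)
    bounds = round.(Int, range(0, n; length = k + 1))
    return [(bounds[i] + 1):bounds[i + 1] for i in 1:k]
end

# Stand-in for a model: "fitting" just finds the most frequent training label.
function fit_majority(y_train)
    counts = Dict{eltype(y_train),Int}()
    for label in y_train
        counts[label] = get(counts, label, 0) + 1
    end
    best_label, best_count = first(y_train), 0
    for (label, c) in counts
        if c > best_count
            best_label, best_count = label, c
        end
    end
    return best_label
end

# For every fold: fit on the remaining rows, predict on the fold, measure accuracy.
function cv_accuracy(y, k)
    n = length(y)
    scores = Float64[]
    for test in kfold_indices(n, k)
        train = setdiff(1:n, test)
        label = fit_majority(y[train])          # fit
        yhat = fill(label, length(test))        # predict
        push!(scores, mean(yhat .== y[test]))   # measure
    end
    return mean(scores)                         # aggregate over the folds
end

cv_accuracy(repeat(["a", "a", "b"], 4), 3)
```

The loop of fit, predict and measure per fold, followed by an aggregation, is the part that `evaluate!` takes off your hands.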

### Unsupervised Learning: fit!, transform, inverse_transform

In [None]:
v = [1, 2, 3, 4]
mach2 = machine(UnivariateStandardizer(), v)
fit!(mach2)
w = transform(mach2, v)

In [None]:
inverse_transform(mach2, w)
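
For intuition, the same three steps can be written by hand in plain Julia. This is only a sketch under the assumption that standardizing means subtracting the mean and dividing by the sample standard deviation; the variable names are illustrative and not part of MLJ.

```julia
using Statistics

v = [1, 2, 3, 4]

# "Fitting" amounts to storing the mean and the standard deviation.
mu, sigma = mean(v), std(v)

# transform: shift to zero mean and scale to unit standard deviation
w = (v .- mu) ./ sigma

# inverse transform: undo the scaling and the shift
v_restored = w .* sigma .+ mu
```

The learned parameters (`mu`, `sigma`) play the role of what the machine stores after `fit!`.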

--------------------------------

# MLJ features


MLJ (Machine Learning in Julia) is a toolbox written in Julia
providing a common interface and meta-algorithms for selecting,
tuning, evaluating, composing and comparing machine learning models
written in Julia and other languages. In particular, MLJ wraps a large
number of [scikit-learn](https://scikit-learn.org/stable/) models.


* Data agnostic: train models on any data supported by the
  [Tables.jl](https://github.com/JuliaData/Tables.jl) interface.

* Extensive support for model composition (*pipelines* and *learning
  networks*).

* Convenient syntax to tune and evaluate (composite) models.

* Consistent interface to handle probabilistic predictions.

* Extensible [tuning
  interface](https://github.com/alan-turing-institute/MLJTuning.jl),
  to support a growing number of optimization strategies, and designed
  to play well with model composition.


More information is available from the [MLJ design paper](https://github.com/alan-turing-institute/MLJ.jl/blob/master/paper/paper.md).

### Model Registry

MLJ has a model registry, allowing the user to search models and their properties.

In [None]:
models(matching(X,y))

In [None]:
?models

In [None]:
info("DecisionTreeClassifier", pkg="DecisionTree")

-----------------

# A more advanced example

Disclaimer: This is taken almost completely from an existing MLJ example.

Like other frameworks, MLJ supports a variety of unsupervised models for pre-processing data, reducing dimensionality, etc. It also provides a [wrapper](https://alan-turing-institute.github.io/MLJ.jl/dev/tuning_models/) for tuning model hyper-parameters in various ways. Data transformations and supervised models are then typically combined into linear [pipelines](https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/#Linear-pipelines-1). However, a more advanced feature of MLJ, not common in other frameworks, allows you to combine models in more complicated ways. We give a simple demonstration of that next.

We start by loading the model code we'll need:

In [None]:
RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats
RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree;

The next step is to define a "learning network", a kind of blueprint for the new composite model type. Later we "export" the network as a new stand-alone model type. Learning networks can be seen as pipelines on steroids.

Let's consider the following simple DAG:
![graph](https://alan-turing-institute.github.io/DataScienceTutorials.jl/assets/diagrams/composite1.svg)

Our learning network will:

- standardize the input data

- learn and apply a Box-Cox transformation to the target variable

- blend the predictions of two supervised learning models - a ridge regressor and a random forest regressor; we'll blend using a simple average (for a more sophisticated stacking example, see [here](https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/stacking/))

- apply the *inverse* Box-Cox transformation to this blended prediction

**The basic idea is to proceed as if one were composing the various steps "by hand", but to wrap the training data in "source nodes" first.** In place of production data, one typically uses some dummy data to test the network as it is built. When the learning network is "exported" as a new stand-alone model type, it will no longer be bound to any data. You bind the exported model to production data when you're ready to use your new model type (just like you would with any other MLJ model).
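
As a point of reference, here is a rough plain-Julia sketch of what composing these steps "by hand" could look like, without any MLJ machinery. Several things in it are assumptions made for brevity: the synthetic data, a fixed Box-Cox exponent `lambda` (MLJ learns it from the data), a closed-form ridge fit, and a mean predictor standing in for the random forest. The helper names `boxcox` and `inv_boxcox` are hypothetical.

```julia
using Statistics, LinearAlgebra, Random

Random.seed!(1)

# Synthetic data with a strictly positive target (assumption for this sketch).
X = randn(100, 3)
y = abs.(X * [1.0, -2.0, 0.5] .+ 0.1 .* randn(100)) .+ 0.1

# 1. Standardize the inputs column by column.
W = (X .- mean(X; dims = 1)) ./ std(X; dims = 1)

# 2. Box-Cox transform the target, with a fixed exponent.
lambda = 0.5
boxcox(y, lambda) = (y .^ lambda .- 1) ./ lambda
inv_boxcox(z, lambda) = (lambda .* z .+ 1) .^ (1 / lambda)
z = boxcox(y, lambda)

# 3a. Ridge regression in closed form (penalty 0.1, intercept via centering z).
zbar = mean(z)
coefs = (W' * W + 0.1 * I) \ (W' * (z .- zbar))
ridge_pred = W * coefs .+ zbar

# 3b. Stand-in for the random forest: always predict the mean of z.
baseline_pred = fill(zbar, length(z))

# Blend the two predictions with a simple average.
zhat = 0.5 .* ridge_pred .+ 0.5 .* baseline_pred

# 4. Apply the inverse Box-Cox transformation to the blended prediction.
yhat = inv_boxcox(zhat, lambda)
```

Every intermediate result here is computed immediately and is tied to this particular `X` and `y`. Wrapping the data in source nodes is what lets MLJ record the same sequence of steps without that tie.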

There is no need to `fit!` the machines you create, as this will happen automatically when you *call* the final node in the network (assuming you provide the dummy data).

*Input layer*

In [None]:
# define some synthetic data:
X, y = make_regression(100)
y = abs.(y)

train, test = partition(eachindex(y), 0.8);  # first part (80%) is for training

# wrap as source nodes:
Xs = source(X)
ys = source(y)

*First layer and target transformation*

In [None]:
std_model = Standardizer()
stand = machine(std_model, Xs)
W = MLJ.transform(stand, Xs)

box_model = UnivariateBoxCoxTransformer()
box = machine(box_model, ys)
z = MLJ.transform(box, ys)

*Second layer*

In [None]:
ridge_model = RidgeRegressor(lambda=0.1)
ridge = machine(ridge_model, W, z)

forest_model = RandomForestRegressor(n_trees=50)
forest = machine(forest_model, W, z)

ẑ = 0.5*predict(ridge, W) + 0.5*predict(forest, W)

*Output*

In [None]:
ŷ = inverse_transform(box, ẑ)

No fitting has been done thus far; we have only defined a sequence of operations. We can test the network by fitting the final prediction node and then calling it to retrieve the prediction:

In [None]:
fit!(ŷ);
ŷ()[1:4]
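
The idea that building the network only records operations, and that work happens when the final node is called, can be illustrated with a toy example in plain Julia. This is not how MLJ implements nodes; `ToySource` and `ToyNode` are hypothetical types, and there is no training involved.

```julia
# A source simply holds data; calling it returns that data.
struct ToySource
    data::Vector{Float64}
end
(s::ToySource)() = s.data

# A node stores an operation and the nodes it depends on.
struct ToyNode
    op::Function
    args::Tuple
end
# Calling a node first calls its arguments, then applies its operation.
(n::ToyNode)() = n.op(map(arg -> arg(), n.args)...)

xs = ToySource([1.0, 2.0, 3.0])

# Nothing is computed while the graph is being defined ...
doubled = ToyNode(x -> 2 .* x, (xs,))
shifted = ToyNode(x -> x .+ 1, (xs,))
blended = ToyNode((a, b) -> 0.5 .* a .+ 0.5 .* b, (doubled, shifted))

# ... only calling the final node runs the whole chain.
result = blended()
```

In the real learning network, calling `fit!` on the final node additionally trains every machine the node depends on, in the right order.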

To "export" the network as a new stand-alone model type, we can use a macro:

In [None]:
@from_network machine(Deterministic(), Xs, ys, predict=ŷ) begin
    mutable struct CompositeModel
        rgs1 = ridge_model
        rgs2 = forest_model
    end
end

Here's an instance of our new type:

In [None]:
composite = CompositeModel()

Since we made our model type mutable, we could swap the regressors for different ones.

For now we'll evaluate this model on the famous Boston data set:

In [None]:
X, y = @load_boston
evaluate(composite, X, y, resampling=CV(nfolds=6, shuffle=true), measures=[rms, mae])

### Check out more [Data Science Tutorials in Julia](https://alan-turing-institute.github.io/DataScienceTutorials.jl/)

In [None]:
# try out one tutorial of your choice right in here
# ...

# Thank you for being here

Further information about MLJ in general:
* MLJ repository: https://github.com/alan-turing-institute/MLJ.jl
* MLJ docs: https://alan-turing-institute.github.io/MLJ.jl/dev/
* MLJ tutorials: https://alan-turing-institute.github.io/DataScienceTutorials.jl/

Further information about MLJ's model composition feature:
* MLJ docs: https://alan-turing-institute.github.io/MLJ.jl/dev/composing_models/
* MLJ paper: https://arxiv.org/abs/2012.15505
* MLJ tutorial: https://alan-turing-institute.github.io/DataScienceTutorials.jl/getting-started/learning-networks/

In case you have more questions or suggestions, always feel welcome to reach out to me at Meetup and Julia User Group Munich, or directly at Stephan.Sahm@gmx.de