<img src="https://julialang.org/assets/infra/logo.svg" alt="Julia" width="200" style="max-width:100%;">

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/darenasc/mlj-tutorials/master?filepath=README.ipynb)

# Julia programming language

The syntax of Julia is intuitive and similar to that of languages such as Matlab and Python.

In [None]:
print("Hello world!")

In [None]:
2 + 2

In [None]:
typeof(42.0)

## Type inference and multiple dispatch

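*Multiple dispatch* selects which method of a function to run based on the types of all of its arguments. *Type inference* is the compiler's process of working out those types ahead of execution, so that it can generate specialized, fast code for the selected method.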

For more background, see the blog post on [type dispatch](http://www.stochasticlifestyle.com/type-dispatch-design-post-object-oriented-programming-julia/) by [Christopher Rackauckas](http://www.chrisrackauckas.com/).

In [None]:
function function_x(x::String)
    println("this is a string: $x")
end

function function_x(x::Int)
    println("$(x^2) is the square of $x")
end

In [None]:
# each call to the function_x() will dispatch the corresponding method depending on the parameter's type
function_x("a string")
function_x(2)
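
Dispatch looks at the types of all arguments, not only the first one. The sketch below illustrates this with a hypothetical function `describe`, which is not part of the original tutorial and only assumes base Julia:

```julia
# `describe` is a hypothetical function used only for illustration.
describe(x::Number, y::Number) = "two numbers"
describe(x::String, y::Number) = "a string and a number"
describe(x, y) = "something else"   # fallback for any other combination

println(describe(1, 2.5))      # two numbers
println(describe("a", 2))      # a string and a number
println(describe([1], "b"))    # something else

# `methods` lists every method attached to the generic function.
println(length(methods(describe)))   # 3
```

Julia picks the most specific method that matches the argument types, so the untyped fallback only runs when no other method applies.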

## Automatic differentiation

Automatic differentiation computes derivatives of almost arbitrary programs with respect to their inputs. ([source](https://render.githubusercontent.com/view/ipynb?commit=89317894e2e5370a80e45d52db8a4055a4fdecd6&enc_url=68747470733a2f2f7261772e67697468756275736572636f6e74656e742e636f6d2f6d6174626573616e636f6e2f454d455f4a756c69615f776f726b73686f702f383933313738393465326535333730613830653435643532646238613430353561346664656364362f315f496e74726f64756374696f6e2e6970796e62&nwo=matbesancon%2FEME_Julia_workshop&path=1_Introduction.ipynb&repository_id=270611906&repository_type=Repository#Automatic-differentiation) by [@matbesancon](https://github.com/matbesancon))

In [2]:
using ForwardDiff

function sqrt_babylonian(s)
    x = s / 2
    while abs(x^2 - s) > 0.001
        x = (x + s/x) / 2
    end
    x
end

sqrt_babylonian (generic function with 1 method)

In [3]:
sqrt_babylonian(2) - sqrt(2)

2.123901414519125e-6

In [4]:
@show ForwardDiff.derivative(sqrt_babylonian, 2);
@show ForwardDiff.derivative(sqrt, 2);

ForwardDiff.derivative(sqrt_babylonian, 2) = 0.353541906958862
ForwardDiff.derivative(sqrt, 2) = 0.35355339059327373
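
To give an intuition for why this works on a plain `while` loop, here is a minimal sketch of the dual-number idea behind forward-mode differentiation. The `Dual` type and `sqrt_babylonian_dual` below are hypothetical, hand-written for this illustration; they are not ForwardDiff's actual implementation, which is more general.

```julia
# Hypothetical minimal dual number: `val` carries the value, `der` the derivative.
struct Dual
    val::Float64
    der::Float64
end

# Arithmetic rules needed by the Babylonian iteration (sum and quotient rules).
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:/(a::Dual, b::Dual) = Dual(a.val / b.val, (a.der * b.val - a.val * b.der) / b.val^2)
Base.:/(a::Dual, b::Real) = Dual(a.val / b, a.der / b)

# Same algorithm as above, with the stopping test applied to the value part only.
function sqrt_babylonian_dual(s::Dual)
    x = s / 2
    while abs(x.val * x.val - s.val) > 0.001
        x = (x + s / x) / 2
    end
    return x
end

# Seed the input with derivative 1 to differentiate with respect to s.
result = sqrt_babylonian_dual(Dual(2.0, 1.0))
println(result.val)  # approximation of sqrt(2)
println(result.der)  # approximation of 1 / (2 * sqrt(2))
```

Every arithmetic operation propagates the derivative alongside the value, so the derivative of the whole loop falls out of running the loop once.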


## Unitful computations
Physicists' dreams finally come true. ([source](https://render.githubusercontent.com/view/ipynb?commit=89317894e2e5370a80e45d52db8a4055a4fdecd6&enc_url=68747470733a2f2f7261772e67697468756275736572636f6e74656e742e636f6d2f6d6174626573616e636f6e2f454d455f4a756c69615f776f726b73686f702f383933313738393465326535333730613830653435643532646238613430353561346664656364362f315f496e74726f64756374696f6e2e6970796e62&nwo=matbesancon%2FEME_Julia_workshop&path=1_Introduction.ipynb&repository_id=270611906&repository_type=Repository#Unitful-computations) by [@matbesancon](https://github.com/matbesancon))

In [6]:
using Unitful
using Unitful: J, kg, m, s



In [7]:
3J + 1kg * (1m / 1s)^2

4.0 kg m² s⁻²
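
The result follows from the definition of the joule in SI base units. A short worked check of the arithmetic above:

$$
1\,\mathrm{J} = 1\,\mathrm{kg\,m^2\,s^{-2}}
\quad\Rightarrow\quad
3\,\mathrm{J} + 1\,\mathrm{kg}\left(\frac{1\,\mathrm{m}}{1\,\mathrm{s}}\right)^2
= 3\,\mathrm{kg\,m^2\,s^{-2}} + 1\,\mathrm{kg\,m^2\,s^{-2}}
= 4\,\mathrm{kg\,m^2\,s^{-2}}
$$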

<img src="https://github.com/alan-turing-institute/MLJ.jl/raw/master/material/MLJLogo2.svg?sanitize=true" alt="MLJ" width="200" style="max-width:100%;">

# MLJ

MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing machine learning models written in Julia and other languages. MLJ is released under the MIT license and sponsored by the [Alan Turing Institute](https://www.turing.ac.uk/).

### The MLJ Universe

The functionality of MLJ is distributed over a number of repositories
illustrated in the dependency chart below.

[MLJ](https://github.com/alan-turing-institute/MLJ) * [MLJBase](https://github.com/alan-turing-institute/MLJBase.jl) * [MLJModelInterface](https://github.com/alan-turing-institute/MLJModelInterface.jl) * [MLJModels](https://github.com/alan-turing-institute/MLJModels.jl) * [MLJTuning](https://github.com/alan-turing-institute/MLJTuning.jl) * [MLJLinearModels](https://github.com/alan-turing-institute/MLJLinearModels.jl) * [MLJFlux](https://github.com/alan-turing-institute/MLJFlux.jl) * [MLJTutorials](https://github.com/alan-turing-institute/MLJTutorials) * [MLJScientificTypes](https://github.com/alan-turing-institute/MLJScientificTypes.jl) * [ScientificTypes](https://github.com/alan-turing-institute/ScientificTypes.jl)


<div align="center">
    <img src="https://github.com/alan-turing-institute/MLJ.jl/raw/master/material/MLJ_stack.svg?sanitize=true" alt="Dependency Chart">
</div>

*Dependency chart for MLJ repositories. Repositories with dashed
connections do not currently exist but are planned/proposed.*

MLJ provides access to a wide variety of machine learning models. For the most up-to-date list of available models, run `models()`.

In [None]:
using MLJ
models()

## Fit, predict, transform

The following example uses MLJ's `fit!`, `predict`, and `transform` functions.

In [None]:
import Statistics
using PrettyPrinting
using StableRNGs

In [None]:
X, y = @load_iris;

Let's also load the `DecisionTreeClassifier`:

In [None]:
@load DecisionTreeClassifier
tree_model = DecisionTreeClassifier()

## MLJ Machine

In MLJ, a *model* is an object that only serves as a container for the hyperparameters of the model. A *machine* is an object wrapping both a model and data and can contain information on the *trained* model; it does *not* fit the model by itself. However, it does check that the model is compatible with the scientific type of the data and will warn you otherwise.

In [None]:
tree = machine(tree_model, X, y)

A machine is used for both supervised and unsupervised models. In this tutorial we give an example for the supervised case first and then go on with the unsupervised case.

## Training and testing a supervised model

Now that you've declared the model you'd like to consider and the data, what remains are the standard training and testing steps for a supervised learning algorithm.

### Splitting the data

To split the data into a training and testing set, you can use the function `partition` to obtain indices for data points that should be considered either as training or testing data:

In [None]:
rng = StableRNG(566)
train, test = partition(eachindex(y), 0.7, shuffle=true, rng=rng)
test[1:3]
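
To make the idea of an index split concrete, here is a minimal sketch of a shuffled 70/30 split using only Julia's standard library. The helper `split_indices` is hypothetical and is not MLJ's implementation of `partition`; the 150 rows match the size of the iris dataset.

```julia
using Random

# Hypothetical helper mimicking the idea of a shuffled split;
# it is not MLJ's implementation of `partition`.
function split_indices(n::Integer, fraction::Real; rng=Random.default_rng())
    idx = shuffle(rng, 1:n)              # random permutation of the row indices
    n_train = round(Int, fraction * n)   # number of rows in the first part
    return idx[1:n_train], idx[n_train+1:end]
end

train_idx, test_idx = split_indices(150, 0.7; rng=MersenneTwister(566))
println(length(train_idx), " training rows, ", length(test_idx), " test rows")
```

Working with indices rather than copies of the data means the same `X` and `y` can be reused for both training and testing by selecting rows.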

### Fitting and testing the machine

To fit the machine, you can use the function `fit!` specifying the rows to be used for the training:

In [None]:
fit!(tree, rows=train)

Note that this **modifies** the machine which now contains the trained parameters of the decision tree. You can inspect the result of the fitting with the `fitted_params` method:

In [None]:
fitted_params(tree) |> pprint

This `fitresult` varies from model to model, though classifiers usually return a tuple whose first element corresponds to the fitted model and whose second element keeps track of how classes are named (so that predictions can be labelled appropriately).

You can now use the machine to make predictions with the `predict` function specifying rows to be used for the prediction:

In [None]:
ŷ = predict(tree, rows=test)
@show ŷ[1]

Note that the output is probabilistic, effectively a vector with a score for each class. You could get the mode by using the `mode` function on `ŷ` or using `predict_mode`:

In [None]:
ȳ = predict_mode(tree, rows=test)
@show ȳ[1]
@show mode(ŷ[1])
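
Conceptually, taking the mode just means picking the class with the highest predicted probability. A minimal sketch in base Julia, using made-up probabilities for a single observation (the numbers are illustrative and are not output from the model above):

```julia
# Hypothetical class probabilities for one observation (illustrative values only).
probs = Dict("setosa" => 0.0, "versicolor" => 0.9, "virginica" => 0.1)

# `findmax` on a dictionary returns the largest value and its key.
p, best_class = findmax(probs)
println(best_class, " with probability ", p)
```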

To measure the discrepancy between `ŷ` and `y` you could use the average cross entropy:

In [None]:
mce = cross_entropy(ŷ, y[test]) |> mean
round(mce, digits=4)
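
For a single observation, the cross entropy is the negative logarithm of the probability that the model assigned to the class actually observed. A minimal sketch in base Julia, with made-up probabilities (the values in `p_true` are assumptions for illustration, not results from the tree above):

```julia
# Hypothetical predicted probabilities of the true class for four observations.
p_true = [0.9, 0.8, 1.0, 0.6]

# Per-observation cross entropy: negative log-probability of the observed class.
losses = -log.(p_true)

# Averaging gives a single score; it is 0 only when every true class gets probability 1.
mean_loss = sum(losses) / length(losses)
println(round(mean_loss, digits=4))
```

Lower values are better, and confident wrong predictions (a true-class probability close to 0) are penalized heavily.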

# [Check out MLJ example with TreeParzen.jl](TreeParzen_example.ipynb)

# A more advanced example

In [None]:
using MLJ
using StableRNGs
import DataFrames
@load RidgeRegressor pkg=MultivariateStats

In this example we will show how to generate a model from a network; there are two approaches:

* using the `@from_network` macro
* writing the model in full

The first approach is usually the one to prefer, as it is simpler.

Generating a model from a network allows subsequent composition of that network with other tasks and tuning of that network.

### Using the @from_network macro

Let's define a simple network

*Input layer*

In [None]:
rng = StableRNG(6616) # for reproducibility
x1 = rand(rng, 300)
x2 = rand(rng, 300)
x3 = rand(rng, 300)
y = exp.(x1 - x2 - 2x3 + 0.1*rand(rng, 300))
X = DataFrames.DataFrame(x1=x1, x2=x2, x3=x3)
train, test = partition(eachindex(y), 0.8); # the first output holds 80% of the indices

Xs = source(X)
ys = source(y, kind=:target)

*First layer*

In [None]:
std_model = Standardizer()
stand = machine(std_model, Xs)
W = MLJ.transform(stand, Xs)

box_model = UnivariateBoxCoxTransformer()
box = machine(box_model, ys)
z = MLJ.transform(box, ys)

*Second layer*

In [None]:
ridge_model = RidgeRegressor(lambda=0.1)
ridge = machine(ridge_model, W, z)
ẑ = predict(ridge, W)

*Output*

In [None]:
ŷ = inverse_transform(box, ẑ)

No fitting has been done thus far; we have just defined a sequence of operations.

Forming a model out of that network is easy with the `@from_network` macro:

In [None]:
@from_network CompositeModel(std=std_model, box=box_model,
                             ridge=ridge_model) <= ŷ;

The macro defines a constructor `CompositeModel` and assigns a name to each of the component models; the ordering of and connections between the nodes are inferred from `ŷ` via the `<= ŷ`.

**Note**: had the model been probabilistic (e.g. `RidgeClassifier`) you would have needed to add `is_probabilistic=true` at the end.

In [None]:
cm = machine(CompositeModel(), X, y)
res = evaluate!(cm, resampling=Holdout(fraction_train=0.8, rng=51),
                measure=rms)
round(res.measurement[1], sigdigits=3)

## You can check more [Data Science tutorials in Julia](https://alan-turing-institute.github.io/DataScienceTutorials.jl/).