<a href="https://colab.research.google.com/github/kuanhoong/mlr3/blob/main/demo/Introduction_to_mlr3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Machine Learning with mlr3

![](https://mlr3.mlr-org.com/reference/figures/mlr3verse.svg)

[mlr3](https://mlr3.mlr-org.com/) is an R package for machine learning. It provides a unified, object-oriented interface to existing machine learning algorithms, so you can swap one model for another without rewriting the surrounding code. It also provides tools for evaluating and comparing models, such as performance measures and resampling.

# Workshop Materials
- [Slides](https://docs.google.com/presentation/d/1fUIBp-oZ8cygwnNzUH7Y7jk2s_ZhshswcsANuMpImFA/edit?usp=sharing)
- [GitHub repository](https://github.com/kuanhoong/mlr3)

## Installation

- Install [R](https://cran.r-project.org/mirrors.html).
- Install [RStudio](https://www.rstudio.com/products/rstudio/download/).
- Follow the instructions in kuanhoong's [GitHub repo](https://github.com/kuanhoong/mlr3) to test your setup (note: you do not need to connect Kaggle to GitHub for this session).
- Install mlr3 and its extension packages by running `install.packages("mlr3")` and `install.packages("mlr3verse")` in a Kaggle or Google Colab notebook.

In [None]:
install.packages("mlr3")
install.packages("mlr3verse")
install.packages("GGally")
library(mlr3)
library(mlr3verse)
library(mlr3viz)
library(GGally)

## Example 1: Iris

In [None]:
head(iris)  ## iris ships with R (datasets package)

## Constructing Learners and Tasks

In [None]:
task = TaskClassif$new("iris", iris, "Species")

In [None]:
autoplot(task, type = "pairs")

In [None]:
task

In [None]:
task$nrow

In [None]:
task$ncol
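A task exposes more than its dimensions. The following is a minimal sketch, assuming the converter function `as_task_classif()` is used in place of `TaskClassif$new()`; the names `task_demo` and `"iris_demo"` are arbitrary labels chosen for this illustration. It lists the feature columns, the target column, and the class labels.

```r
library(mlr3)

# Build a classification task with the converter function.
# "iris_demo" is an arbitrary id chosen for this sketch.
task_demo = as_task_classif(iris, target = "Species", id = "iris_demo")

# Columns used as predictors and the column being predicted
print(task_demo$feature_names)
print(task_demo$target_names)

# Class labels of the target
print(task_demo$class_names)
```

Both construction styles should give an equivalent task for this data; the converter is simply shorter to type.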

## Basic train + predict

In [None]:
learner = lrn("classif.rpart")

In [None]:
## train 
learner$train(task)

In [None]:
print(learner$model)

In [None]:
## check hyperparameters
as.data.table(learner$param_set)

In [None]:
## change learner behaviour
learner$param_set$values = list(maxdepth = 1, xval = 0)

In [None]:
## train 
learner$train(task)

In [None]:
print(learner$model)

In [None]:
install.packages("data.table")
library(data.table)

In [None]:
## create new_data
new_data = data.table("Sepal.Length"=c(4,2), "Sepal.Width"=c(3,2), "Petal.Length"=c(2,3), "Petal.Width" =c(1,2))
new_data


In [None]:
## prediction

prediction = learner$predict_newdata(new_data)
prediction

In [None]:
## change prediction type
learner$predict_type = "prob"
prediction = learner$predict_newdata(new_data)
prediction

In [None]:
## view prediction object by using data.table

as.data.table(prediction)

In [None]:
prediction$response
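With `predict_type = "prob"`, the prediction object also carries a probability matrix. The sketch below is self-contained and assumes the probability type is requested when the learner is constructed rather than changed afterwards; `task_demo`, `learner_demo`, `new_flowers`, and `pred_demo` are illustrative names, and the two flowers reuse the made-up measurements from `new_data`.

```r
library(mlr3)

task_demo = as_task_classif(iris, target = "Species", id = "iris_demo")

# Request class probabilities at construction time
learner_demo = lrn("classif.rpart", predict_type = "prob")
learner_demo$train(task_demo)

# Two made-up flowers (same values as new_data above)
new_flowers = data.frame(
  Sepal.Length = c(4, 2), Sepal.Width = c(3, 2),
  Petal.Length = c(2, 3), Petal.Width = c(1, 2)
)
pred_demo = learner_demo$predict_newdata(new_flowers)

# One row per observation, one column per class
print(pred_demo$prob)

# The predicted class is still available
print(pred_demo$response)
```

Each row of the probability matrix sums to 1, and the predicted class is the one with the highest probability.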

## Example 2: Penguins

In [None]:
# the penguins data comes from the palmerpenguins package
install.packages("palmerpenguins")

# create learning task
task_penguins = as_task_classif(species ~ ., data = palmerpenguins::penguins)
task_penguins

In [None]:
# load learner and set hyperparameter
learner = lrn("classif.rpart", cp = .01)

In [None]:
# train/test split
split = partition(task_penguins, ratio = 0.67)

# train the model (partition() returns the row ids as split$train and split$test)
learner$train(task_penguins, split$train)

# predict data
prediction = learner$predict(task_penguins, split$test)

# confusion matrix on the test set
prediction$confusion

Measure the model performance:

In [None]:
measure = msr("classif.acc")
prediction$score(measure)

In [None]:
## confusion matrix
prediction$confusion

In [None]:
autoplot(prediction)
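Several measures can be scored in one call by passing a list built with `msrs()`. This is a minimal sketch, assuming the built-in iris data as a stand-in for the penguins task so that no extra package is needed; all `*_demo` names and the seed value are arbitrary choices for this illustration.

```r
library(mlr3)

set.seed(42)  # partition() samples rows at random; the seed value is arbitrary

task_demo = as_task_classif(iris, target = "Species", id = "iris_demo")
learner_demo = lrn("classif.rpart")

# Hold out roughly one third of the rows for testing
split_demo = partition(task_demo, ratio = 0.67)
learner_demo$train(task_demo, split_demo$train)
pred_demo = learner_demo$predict(task_demo, split_demo$test)

# Score accuracy and classification error together
scores = pred_demo$score(msrs(c("classif.acc", "classif.ce")))
print(scores)
```

Accuracy and classification error are complementary, so the two values add up to 1.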

## Resampling

In [None]:
# 3-fold cross validation
resampling = rsmp("cv", folds = 3L)

# run experiments
rr = resample(task_penguins, learner, resampling)

# access results
rr$score(measure)[, .(task_id, learner_id, iteration, classif.acc)]
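The per-fold scores can be condensed into a single number with `$aggregate()`. The sketch below is self-contained and assumes the built-in iris data as a stand-in for the penguins task; the `*_demo` names and the seed are arbitrary choices for this illustration.

```r
library(mlr3)

set.seed(42)  # fold assignment is random; the seed value is arbitrary

task_demo = as_task_classif(iris, target = "Species", id = "iris_demo")
learner_demo = lrn("classif.rpart")
resampling_demo = rsmp("cv", folds = 3L)

rr_demo = resample(task_demo, learner_demo, resampling_demo)

# One accuracy value per fold
fold_scores = rr_demo$score(msr("classif.acc"))$classif.acc
print(fold_scores)

# Single summary value across the folds
mean_acc = rr_demo$aggregate(msr("classif.acc"))
print(mean_acc)
```

For accuracy, the aggregated value is by default the mean of the per-fold scores, which gives a less split-dependent estimate than a single train/test split.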

## Example 3: mtcars

Regression with the mtcars data.


In [None]:
install.packages("ranger")
library(ranger)

In [None]:
### Create Task ---------------------------------------------------------------#
data("mtcars", package = "datasets")
task_mtcars = as_task_regr(mtcars, target = "mpg", id = "cars")

In [None]:
task_mtcars
task_mtcars$feature_names
task_mtcars$target_names

In [None]:
### Create Learner ------------------------------------------------------------#
lrn_rf = lrn("regr.ranger")
lrn_rf

In [None]:
lrn_rf$param_set
# NULL until the learner has been trained
lrn_rf$model

In [None]:
# train/test split
split = partition(task_mtcars, ratio = 0.67)

# train the model (partition() returns the row ids as split$train and split$test)
lrn_rf$train(task_mtcars, split$train)

# predict data
prediction = lrn_rf$predict(task_mtcars, split$test)

# prediction results
prediction$response

In [None]:
# Inspect predictions
head(data.frame(
  id = seq_along(prediction$truth),
  truth = prediction$truth,
  response = prediction$response
))

In [None]:
# Get nice visualization with a one-liner
mlr3viz::autoplot(prediction)

In [None]:
# Define the MAE measure (used in the next cell)
mae <- mlr3::msr("regr.mae")
# Score with the default regression measure (MSE)
prediction$score()


In [None]:
prediction$score(mae)

In [None]:
# Define MSE metric
mse <- mlr3::msr("regr.mse")
prediction$score(mse)
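The examples so far each evaluate a single learner. To compare several learners on identical splits, mlr3 provides `benchmark_grid()` and `benchmark()`. The following is a minimal sketch, assuming two learners that ship with mlr3 itself (a regression tree and a featureless baseline) instead of the random forest, so that no extra packages are required; the `*_demo` names and the seed are arbitrary choices for this illustration.

```r
library(mlr3)

set.seed(42)  # arbitrary seed so the folds are reproducible

task_demo = as_task_regr(mtcars, target = "mpg", id = "cars_demo")

# Learners that ship with mlr3: a regression tree and a
# featureless baseline that predicts the mean of the target
learners_demo = lrns(c("regr.rpart", "regr.featureless"))

# Every learner is evaluated on the same 3-fold CV splits
design = benchmark_grid(task_demo, learners_demo, rsmp("cv", folds = 3L))
bmr = benchmark(design)

# Mean absolute error per learner, aggregated over the folds
results = bmr$aggregate(msr("regr.mae"))
print(results[, c("learner_id", "regr.mae")])
```

A learner is only worth keeping if it beats the featureless baseline, which is why including one in the comparison is a common habit.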

# Conclusion

In this webinar, you used mlr3 to create classification and regression tasks, train learners and set their hyperparameters, make predictions, and evaluate models with performance measures and cross-validation. Its unified interface to existing machine learning algorithms makes it easy to experiment with and compare different models.

# Resources

- [mlr3 website](https://mlr3.mlr-org.com/)
- [Flexible and Robust Machine Learning Using mlr3 in R (ebook)](https://mlr3book.mlr-org.com/)
- [Exploring the World of Machine Learning with mlr3 in R](https://medium.com/@moonchangin/introduction-to-machine-learning-in-r-mlr3-e3229b97d422)
- [Building ML models using mlr3](https://medium.com/@natalie.a.foss/building-ml-models-using-mlr3-b91c9b26a9a3)
- [mlr3 cheatsheets](https://cheatsheets.mlr-org.com/)
- [Introduction to Machine Learning (I2ML)](https://slds-lmu.github.io/i2ml/)

# About Me

## Poo Kuan Hoong, Ph.D
Connect with me:
- [Twitter](https://twitter.com/kuanhoong)
- [LinkedIn](https://www.linkedin.com/in/kuanhoong/)
- [GitHub](https://github.com/kuanhoong/)