***
**Introduction to Machine Learning** <br>
__[https://slds-lmu.github.io/i2ml/](https://slds-lmu.github.io/i2ml/)__
***

# Exercise sheet 2: Supervised Regression

## Exercise 1: HRO

Throughout the lecture, we will frequently use the R package `mlr3` or the Python package `sklearn`, along with their
descendants, which provide an integrated ecosystem for all common machine learning tasks. Let’s recap the HRO
principle (hypothesis space, risk, optimization) and see how it is reflected in either `mlr3` or `sklearn`.<br>
An overview of the most important objects and their usage, illustrated with numerous examples, can be found at __[https://mlr3book.mlr-org.com/basics.html](https://mlr3book.mlr-org.com/basics.html)__
and __[https://scikit-learn.org/stable/index.html](https://scikit-learn.org/stable/index.html)__.

> a) How are the key concepts (i.e., hypothesis space, risk and optimization) you learned about in the lecture videos
implemented?

> **\# Entering your answer here:**

> b) Have a look at `mlr3::tsk("iris")`/`from sklearn.datasets import load_iris`. What attributes does this
object store?

In [None]:
# Entering your code here:

> c) Pick a module for classification or regression of your choice. What are the different settings for this learner?

<div class="alert alert-block alert-info">
<b>R Hint:</b> use <code>mlr3::mlr_learners$keys()</code> to see all available learners. <br>
<b>Python Hint:</b> Import the specific module and use <code>get_params()</code> to see all available settings.
</div>
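
As a rough illustration of the Python hint (a sketch, not the expected answer): `get_params()` returns a dictionary of the constructor settings of a scikit-learn estimator. `Ridge` and the value `alpha=0.5` are arbitrary choices made only for this example.

```python
from sklearn.linear_model import Ridge

# Ridge is an arbitrary example learner; other sklearn estimators behave the same way.
learner = Ridge(alpha=0.5)

# get_params() returns a dict mapping setting names to their current values.
params = learner.get_params()
for name, value in sorted(params.items()):
    print(f"{name}: {value}")
```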

In [None]:
# Entering your code here:

## Exercise 2: Loss Functions for Regression Tasks

In this exercise, we will examine loss functions for regression tasks somewhat more in depth.
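
For reference, the two pointwise losses discussed in this exercise can be written in a few lines. This is a minimal sketch: the function names `l1_loss` and `l2_loss` and the example values are made up for illustration and do not come from the exercise data.

```python
import numpy as np

def l1_loss(y, y_hat):
    """Absolute loss |y - y_hat|, evaluated pointwise."""
    return np.abs(y - y_hat)

def l2_loss(y, y_hat):
    """Squared loss (y - y_hat)^2, evaluated pointwise."""
    return (y - y_hat) ** 2

# Hypothetical targets and predictions: three small residuals and one large one.
y = np.array([1.0, 2.0, 3.0, 10.0])
y_hat = np.array([1.5, 2.0, 2.0, 4.0])

print(l1_loss(y, y_hat))
print(l2_loss(y, y_hat))
```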

![unnamed-chunk-5-1-1-2.png](attachment:unnamed-chunk-5-1-1-2.png)

<blockquote> a) Consider the above linear regression task. How will the model parameters be affected by adding the new outlier
point (orange) if you use <br>
    <blockquote>
     1. <i>L1</i> loss <br>
     2. <i>L2</i> loss 
    </blockquote>
in the empirical risk? (You do not need to actually compute the parameter values.)
</blockquote>

![unnamed-chunk-6-1-1.png](attachment:unnamed-chunk-6-1-1.png)

> **\# Entering your answer here:**

> b) The second plot visualizes another loss function popular in regression tasks, the so-called *Huber loss* (depending on $\epsilon > 0$; here: $\epsilon = 5$). Describe how the Huber loss deals with residuals as compared to *L1* and *L2* loss. <br>
Can you guess its definition?

> **\# Entering your answer here:**

> c) Derive the least-squares estimator, i.e., the solution to the linear model when using *L2* loss, analytically via
$$\hat{\theta} = \text{argmin}_{\theta \in \Theta} \| y - X \theta \|_2^2.$$

> **\# Entering your answer here:**

## Exercise 3: Polynomial Regression

Assume the following (noisy) data-generating process from which we have observed $50$ realizations:
$$y = -3 + 5 \cdot \sin(0.4 \pi x) + \epsilon$$
with $\epsilon \sim N(0, 1)$.
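
To make the data-generating process concrete, it can be simulated as follows. This is a sketch under assumptions: the exercise does not state how $x$ is distributed, so drawing it uniformly from $[-3, 3]$ is a guess, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=123)  # arbitrary seed, for reproducibility only

n = 50
# Assumption: x ~ Uniform(-3, 3); the exercise does not specify the range of x.
x = rng.uniform(-3.0, 3.0, size=n)
eps = rng.normal(loc=0.0, scale=1.0, size=n)  # noise term, epsilon ~ N(0, 1)

# y = -3 + 5 * sin(0.4 * pi * x) + epsilon
y = -3.0 + 5.0 * np.sin(0.4 * np.pi * x) + eps
```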

![unnamed-chunk-7-1-1.png](attachment:unnamed-chunk-7-1-1.png)

> a) We decide to model the data with a cubic polynomial (including intercept term). State the corresponding
hypothesis space.

> **\# Entering your answer here:**

> b) Demonstrate that this hypothesis space is simply a parameterized family of curves by plotting curves for $3$
different models belonging to the considered model class.

In [None]:
# Entering your code here:

> c) State the empirical risk w.r.t. $\theta$ for a member of the hypothesis space. Use *L2* loss and be as explicit as possible.

> **\# Entering your answer here:**

> d) We can minimize this risk using gradient descent. To make this somewhat easier, we denote the
transformed feature matrix, containing $x$ raised to the powers $0$ through $3$, by $\tilde{X}$, such that we can express our model
as $\tilde{X} \theta$ (note that the model is still linear in its parameters, even though the features have been transformed in a non-linear
manner!). Derive the gradient of the empirical risk w.r.t. $\theta$.
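
The transformed feature matrix can be built directly. The following is a minimal sketch with made-up inputs: `np.vander` with `increasing=True` produces the columns $x^0, x^1, x^2, x^3$, and both the feature values and the parameter vector `theta` are arbitrary examples.

```python
import numpy as np

x = np.array([-1.0, 0.0, 2.0])  # hypothetical feature values

# Columns are x^0, x^1, x^2, x^3, i.e. the intercept plus three powers of x.
X_tilde = np.vander(x, N=4, increasing=True)

theta = np.array([1.0, 0.5, -2.0, 0.25])  # arbitrary example parameters

# The model is a plain matrix-vector product: linear in theta, cubic in x.
y_hat = X_tilde @ theta
```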

> **\# Entering your answer here:**

> e) Using the result from d), state the calculation to update the current parameter $\theta^{[t]}$.

> **\# Entering your answer here:**

> f) You will not be able to fit the data perfectly with a cubic polynomial. Describe the advantages and disadvantages
that a more flexible model class would have. Would you opt for a more flexible learner?

> **\# Entering your answer here:**

## Exercise 4: Predicting `abalone`

We want to predict the age of an abalone using its longest shell measurement and its weight. <br>
See __[https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/](https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/)__ for more details.

In [None]:
# Load the abalone data from the UCI repository (the file has no header row)
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data"
abalone <- read.table(url, sep = ",", row.names = NULL)
colnames(abalone) <- c(
  "sex", "longest_shell", "diameter", "height", "whole_weight",
  "shucked_weight", "visceral_weight", "shell_weight", "rings"
)
# Keep only the two features and the target
abalone <- abalone[, c("longest_shell", "whole_weight", "rings")]

In [None]:
import pandas as pd

# Load the abalone data from the UCI repository (the file has no header row)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data"
abalone = pd.read_csv(
    url, sep=",",
    names=["sex", "longest_shell", "diameter", "height", "whole_weight",
           "shucked_weight", "visceral_weight", "shell_weight", "rings"])
# Keep only the two features and the target
abalone = abalone[["longest_shell", "whole_weight", "rings"]]

> a) Plot `longest_shell` and `whole_weight` on the x- and y-axis, respectively, and color points according to `rings`.

In [None]:
# Entering your code here:

> b) `R`: Create an `mlr3` task for the abalone data. Define a linear regression learner (for this you will need to load
the `mlr3learners` extension package first) and use it to train a linear model on the abalone data. <br>
`Python`: Instantiate a linear regression learner (for this you will need to import it first via
`from sklearn.linear_model import LinearRegression`) and use it to train a linear model on the abalone data.

In [None]:
# Entering your code here:

> c) Compare the fitted and observed targets visually.

<div class="alert alert-block alert-info">
<b>R Hint:</b> use <code>autoplot()</code>. <br>
<b>Python Hint:</b> use <code>import matplotlib.pyplot as plt</code>. 
</div>

In [None]:
# Entering your code here:

> d) Assess the model’s training loss in terms of MAE.

<div class="alert alert-block alert-info">
<b>R Hint:</b> losses are retrieved by calling <code>$score()</code>, which accepts different <code>mlr3</code> measures, on the prediction object. <br>
<b>Python Hint:</b> The MAE metric can be imported via <code>from sklearn.metrics import mean_absolute_error</code>.
</div>
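
As a small illustration of the Python hint, `mean_absolute_error` takes the observed targets and the predictions and returns their mean absolute difference. The arrays below are made up and unrelated to the abalone data; the manual computation is included only as a cross-check.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical observed targets and model predictions.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 9.0, 9.0])

# MAE as computed by scikit-learn.
mae = mean_absolute_error(y_true, y_pred)

# The same quantity computed by hand: mean of |y_true - y_pred|.
manual = np.mean(np.abs(y_true - y_pred))
```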

In [None]:
# Entering your code here:

![abalone.JPG](attachment:abalone.JPG)

__[https://en.wikipedia.org/wiki/Abalone#/media/File:LivingAbalone.JPG](https://en.wikipedia.org/wiki/Abalone#/media/File:LivingAbalone.JPG)__