<h1 align="center">Machine Learning Tutorial: Regression</h1>

### 1. What is Regression?
Regression analysis is the process of finding the relationship between a ***dependent variable*** and one or more ***independent variables***.

Regression analysis is mainly used for two quite different purposes:

1. ***Predict*** or ***estimate*** a future value or trend from the observed data. This is widely used in salary forecasting, price estimation, drug efficacy testing and so on.

2. ***Reveal the causal relationships*** between the dependent and the independent variables (how one variable influences the other).

For the first purpose, we have to carefully ***justify why the prediction using regression is valid and accurate***.

For the second purpose, we have to ***explain where this causal relationship comes from***.

There are many types of regression in machine learning:

#### 1.1 Linear Regression
In a linear regression model, the relationship between the dependent variable and the independent variables is strictly linear.
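
As a rough illustration of what "strictly linear" means, here is a minimal sketch that fits $y=a_0+a_1x$ to a single feature using the closed-form least-squares solution. The function name `fit_line` and the toy data are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a0 + a1 * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a1 = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    a0 = mean_y - a1 * mean_x
    return a0, a1

# Toy data generated from y = 1 + 2x (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a0, a1 = fit_line(xs, ys)
```

Because the toy data lie exactly on a line, the fit recovers the intercept and slope that generated them.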

#### 1.2 Logistic Regression
Logistic regression uses the sigmoid function $f(x)=\frac{1}{1+e^{-x}}$ to output a value between 0 and 1.

This regression model could be used to solve classification problems.
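
A minimal sketch of the sigmoid and of how its output could be turned into a class label. The `classify` helper and the 0.5 threshold are assumptions for illustration; they are a common convention, not something fixed by the model.

```python
import math

def sigmoid(x):
    """Sigmoid function f(x) = 1 / (1 + e^(-x)); the output lies in (0, 1)."""
    # Note: math.exp can overflow for very negative x; fine for a sketch.
    return 1.0 / (1.0 + math.exp(-x))

def classify(score, threshold=0.5):
    """Map a raw score to class 1 or 0 by thresholding the sigmoid output."""
    return 1 if sigmoid(score) >= threshold else 0

p_zero = sigmoid(0.0)        # the midpoint of the curve
label_pos = classify(2.0)    # sigmoid(2.0) is above 0.5, so class 1
label_neg = classify(-2.0)   # sigmoid(-2.0) is below 0.5, so class 0
```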

#### 1.3 Polynomial Regression
The polynomial regression model can learn a non-linear relationship $y=a_0+a_1x+a_2x^2+\dots$

From the above equation we can see that ***the original feature is transformed into polynomial features of a given degree, and the final relationship is then modeled as a linear model over those features.***
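
The transformation can be sketched as follows: a single feature is expanded into its powers, and the prediction is an ordinary weighted sum over those expanded features. The helper names and the coefficient values are hypothetical, and fitting the coefficients is left out.

```python
def polynomial_features(x, degree):
    """Expand a single feature x into [1, x, x^2, ..., x^degree]."""
    return [x ** k for k in range(degree + 1)]

def predict_polynomial(coefficients, x):
    """Evaluate y = a_0 + a_1 x + a_2 x^2 + ... as a linear model on the expanded features."""
    features = polynomial_features(x, len(coefficients) - 1)
    return sum(a * f for a, f in zip(coefficients, features))

coefficients = [1.0, 0.0, 2.0]              # hypothetical a_0, a_1, a_2
y = predict_polynomial(coefficients, 3.0)   # 1 + 0*3 + 2*9
```

The model is non-linear in `x` but linear in the coefficients, which is why linear regression techniques still apply.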

#### 1.4 Support Vector Regression (SVR)
SVR is very similar to the Support Vector Machine (which we covered in the classification tutorial), slightly modified to solve regression problems.

The core goal of SVR is to find a best-fit line such that as many data points as possible lie between the boundary lines on either side of it.
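
To make the idea of boundary lines concrete, the sketch below counts how many points fall within a band around a given line and sums how far the remaining points lie outside it. The half-width name `epsilon`, the candidate line, and the data are assumptions for this example; this is not a fitting procedure, only a way to score one candidate line.

```python
def within_margin(xs, ys, a0, a1, epsilon):
    """Count points whose vertical distance to y = a0 + a1*x is at most epsilon."""
    return sum(1 for x, y in zip(xs, ys) if abs(y - (a0 + a1 * x)) <= epsilon)

def epsilon_insensitive_loss(xs, ys, a0, a1, epsilon):
    """Sum of the distances by which points fall outside the band; points inside cost nothing."""
    return sum(max(0.0, abs(y - (a0 + a1 * x)) - epsilon) for x, y in zip(xs, ys))

# Hypothetical data scored against the candidate line y = x with a band of +/- 0.2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.0, 2.5, 2.9]
inside = within_margin(xs, ys, 0.0, 1.0, 0.2)
loss = epsilon_insensitive_loss(xs, ys, 0.0, 1.0, 0.2)
```

Here only the third point lies outside the band, so it is the only one contributing to the loss.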

#### 1.5 Decision Tree Regression
This regression model uses a tree-like structure to solve the problem.

Starting from the root, each internal node stores a "test". You choose a branch based on the answer to each "test" until you reach a leaf node, which holds the final answer.
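
A minimal sketch of this traversal, using a small hand-built tree stored as nested dictionaries. The feature names, thresholds, and leaf values are all hypothetical; a real model would learn them from data.

```python
# Internal nodes hold a test (feature <= threshold); leaves hold a value.
tree = {
    "feature": "years_experience", "threshold": 5,
    "left": {"value": 40000.0},
    "right": {
        "feature": "has_degree", "threshold": 0.5,
        "left": {"value": 55000.0},
        "right": {"value": 70000.0},
    },
}

def predict_tree(node, sample):
    """Walk from the root to a leaf, following the answer to each test."""
    while "value" not in node:
        if sample[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    # Leaf node: it stores the final answer.
    return node["value"]

junior = predict_tree(tree, {"years_experience": 3, "has_degree": 1})
senior = predict_tree(tree, {"years_experience": 8, "has_degree": 1})
```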

#### 1.6 Ridge Regression (L2 Regularisation)
This is a more robust variant of linear regression, obtained by adding an L2 regularisation term to the loss. We will return to this later.

#### 1.7 Lasso Regression (L1 Regularisation)
This is similar to ridge regression but uses an L1 regularisation term instead.
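
The two regularisation terms differ only in how they penalise the weights. The sketch below assumes the usual formulation, in which the penalty is added to the data loss and scaled by a strength parameter (called `lam` here, a name chosen for this example).

```python
def l2_penalty(weights, lam):
    """Ridge term: lam times the sum of squared weights."""
    return lam * sum(w ** 2 for w in weights)

def l1_penalty(weights, lam):
    """Lasso term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

weights = [3.0, -4.0]              # hypothetical model weights
ridge_term = l2_penalty(weights, 0.1)   # 0.1 * (9 + 16)
lasso_term = l1_penalty(weights, 0.1)   # 0.1 * (3 + 4)
```

Either term would be added to the data loss (for example the squared error) before minimising.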

### 2. What is Supervised Learning?
Supervised learning is the machine learning task of learning a mapping from the input features to the output. The goal is to ***generalise from the training data so that the model accurately predicts the result for unseen data***.

Here, we will start with linear regression to help you understand supervised learning better and feel the power of regression.

### 3. Preparation

#### 3.1 Loss Function
Loss measures the difference between the predicted result and the true value. It can be viewed as a distance between points in the feature space.

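A loss function that acts as a true distance obeys the following rules (not every common loss satisfies all of them; the squared error, for example, violates the triangle inequality):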

1. The result is non-negative.

2. The loss/distance is symmetric: $Loss(A, B)=Loss(B, A)$

3. Triangle inequality: $Loss(A,C) \leq Loss(A,B)+Loss(B,C)$ for any possible A, B, C
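
These rules can be checked numerically on a handful of sample values. The sketch below tests two candidate losses on scalars: the absolute difference passes all three rules, while the squared difference fails the third one. The helper names and the sample points are assumptions for illustration.

```python
from itertools import product

def absolute_loss(a, b):
    return abs(a - b)

def squared_loss(a, b):
    return (a - b) ** 2

def obeys_rules(loss, points):
    """Check the three rules for every triple (a, b, c) drawn from points."""
    for a, b, c in product(points, repeat=3):
        if loss(a, b) < 0:                       # rule 1: non-negative
            return False
        if loss(a, b) != loss(b, a):             # rule 2: symmetric
            return False
        if loss(a, c) > loss(a, b) + loss(b, c):  # rule 3: triangle inequality
            return False
    return True

points = [0, 1, 2, 5]
abs_ok = obeys_rules(absolute_loss, points)
sq_ok = obeys_rules(squared_loss, points)   # fails: (0-2)^2 > (0-1)^2 + (1-2)^2
```

A check on sample points can only find counterexamples; it does not prove that a loss obeys the rules in general.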

##### 3.1.1 Mean Squared Error (MSE)
MSE is a very commonly used loss function for regression problems, as it is well suited to continuous target variables.

The MSE can be expressed as: $MSE(y,\hat{y})=\frac{1}{N}\sum_{i=1}^{N} (y_i-\hat{y}_i)^2$

In [None]:
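def MSE(prediction, target):
    '''
        Return the MSE between the prediction and the target.

    Arguments:
        prediction: sequence of predicted values.
        target: sequence of true values, same length as prediction.
    '''
    if len(prediction) != len(target):
        raise ValueError("prediction and target must have the same length")
    # Average of the squared element-wise differences.
    return sum((t - p) ** 2 for p, t in zip(prediction, target)) / len(target)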