# Linear Models for Regression

Given a training data set comprising $N$ observations $\mathbf{x}^n \in \mathbb{R}^D$, where $n = 1, \ldots, N$, together with corresponding target values $y^n \in \mathbb{R}$, the goal is to predict the value of $y$ for a new value of $\mathbf{x}$.

In the simplest approach, this can be done by directly constructing an appropriate function $f(\mathbf{x} )$ whose values for new inputs $\mathbf{x}$ constitute the predictions for the corresponding values of $y$. 




But first we import the required packages and create a small dataset.


### Packages

Let's first import all the packages that you will need during this assignment.
* `numpy` is the main package for scientific computing with Python.
* `matplotlib` is a library to plot graphs in Python.
* `np.random.seed(1)` is used to keep all the random function calls consistent.


In [None]:
# Import the required packages
import numpy as np
import matplotlib.pyplot as plt
# to make this notebook's output stable across runs
np.random.seed(1)

plt.style.use('fivethirtyeight')
%matplotlib inline

Define a function to generate a random data set

In [None]:
def dataset_regression(n=300,form = 'linear'): 
    x=np.sort(np.random.uniform(size=n)*2.*np.pi)
    
    if form =='non_linear':
        y= 3.*np.sin(x)
        # Add some noise to the observation 
        y += np.random.randn(n)/2.
    else:
        y = 2. * x - 5.5
        # Add some noise to the observation 
        y += np.random.randn(n)
    
    
    
    # Reshape the observations to the shape (N,1)
    x = x.reshape((-1, 1))
    y = y.reshape((-1, 1))
    return x,y

In [None]:
# Load the dataset
n_samples = 150 # number of samples
X_l, y_l = dataset_regression(n_samples, form = 'linear')
X_n, y_n = dataset_regression(n_samples, form = 'non_linear')

# Print the shape of the data
print('We have %d %d-dimensional observations.'%X_l.shape)

fig,ax = plt.subplots(1,2,figsize=(15,5))

# Plot the dataset
ax[0].scatter(X_l[:,0], y_l[:,0],s=40, cmap=plt.cm.Spectral, label = 'Dataset')
ax[0].set_xlabel('x')
ax[0].set_ylabel('y')
ax[0].legend()

ax[1].scatter(X_n[:,0], y_n[:,0],s=40, cmap=plt.cm.Spectral, label = 'Dataset')
ax[1].set_xlabel('x')
ax[1].set_ylabel('y')
ax[1].legend()

plt.show()

## Linear Basis Function Models
The simplest linear model for regression is one that involves a linear combination of
the input variables.

The output of the model for a single observation $\mathbf{x}^{i} \in \mathbb{R}^D$ is
$$ y^i = w_0 + w_1 x_1^i + \ldots + w_D x_D^i  \tag{1}$$

or, for $N$ observations, in matrix form:
$$ \mathbf{y} = \begin{bmatrix}
 y^1 \\ y^2 \\ \vdots \\ y^N
\end{bmatrix}, \quad  \mathbf{X} = \begin{bmatrix}
1 & (\mathbf{x}^1)^\top \\ 1 & (\mathbf{x}^2)^\top \\ \vdots & \vdots \\ 1 & (\mathbf{x}^N)^\top
\end{bmatrix},  \quad 
\mathbf{w} = \begin{bmatrix}
 w_0 \\ w_1 \\ \vdots \\ w_D
\end{bmatrix}   \Rightarrow$$

$$\mathbf{y} =  \mathbf{X}\mathbf{w} \tag{2}$$


In [None]:
# 1D Models
def constant_model(x):
    return 0*x+3

def inverter_model(x):
    return (-1)*x+0    

In [None]:
# Linear dataset
res_1 = constant_model(X_l)
res_2 = inverter_model(X_l)

# Non linear dataset
res_3 = constant_model(X_n)
res_4 = inverter_model(X_n)


fig,ax = plt.subplots(1,2,figsize=(15,5))

# Plot the dataset
ax[0].scatter(X_l[:,0], y_l[:,0],s=40, cmap=plt.cm.Spectral, label = 'Dataset')
ax[0].plot(X_l, res_1, c='g', label="constant")
ax[0].plot(X_l, res_2, c ='r', label="inverter")
ax[0].set_xlabel('x')
ax[0].set_ylabel('y')
ax[0].legend()

ax[1].scatter(X_n[:,0], y_n[:,0],s=40, cmap=plt.cm.Spectral, label = 'Dataset')
ax[1].plot(X_n, res_3, c='g', label="constant")
ax[1].plot(X_n, res_4, c ='r', label="inverter")
ax[1].set_xlabel('x')
ax[1].set_ylabel('y')
ax[1].legend()

plt.show()
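
Both hand-written models are special cases of Equation 2 with a fixed weight vector. As a minimal sketch on three made-up inputs (the names `x_toy`, `X_toy`, `w_constant` and `w_inverter` are illustrative and not part of the assignment), the constant model corresponds to the weights $(3, 0)^\top$ and the inverter model to $(0, -1)^\top$:

```python
import numpy as np

# Three made-up 1D inputs (illustrative values only)
x_toy = np.array([[0.0], [1.0], [2.0]])

# Augmented inputs written out by hand: a column of ones next to x_toy
X_toy = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])

# Weight vectors [w_0, w_1]^T for the two models
w_constant = np.array([[3.0], [0.0]])   # y = 3 + 0 * x
w_inverter = np.array([[0.0], [-1.0]])  # y = 0 - 1 * x

# Matrix form of the model: one prediction per row of X_toy
y_constant = np.dot(X_toy, w_constant)
y_inverter = np.dot(X_toy, w_inverter)

print(y_constant.ravel())
print(y_inverter.ravel())
```

Changing only the two entries of the weight vector is therefore enough to move between these models, which is what makes the matrix form convenient.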


### Bringing data into matrix form

Write a function that brings the data into the **matrix form** of Equation 2 by prepending a column of ones (the bias term) to the observations:

**Note** You can use [np.hstack](https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html) to stack arrays in sequence horizontally and [np.ones_like](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ones_like.html) to create an array of ones with the same shape and type as a given array.


In [None]:
# Returns the training data with an additional leading column of ones (bias)
def data_augmentation(X):
    X_aug = np.hstack((np.ones_like(X), X))
    
    return X_aug

Run the next cell to test your function.

In [None]:
X_aug = data_augmentation(X_l)
print('The first five augmented observations are:\n', X_aug[:5,:])

### Task 1 (5 Min)

Write a function that implements the general linear model in **matrix form** (see Equation 2):

**Note** The matrix multiplication can be done using `np.dot(a, b)`.

In [None]:
# Returns the predicted values for the augmented inputs X_aug and the weights w
def linear_model(X_aug,w):
    
    # ToDo: compute y_estimate
    y_estimate =    
    
    return  y_estimate

Define the best weight vector for the linear data set and test your function `linear_model(X_aug, w)` on the linear dataset.

**Note** You can define a column vector with numpy through:
```python
w = np.array([[1],[2]])
```
or use [np.expand_dims](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.expand_dims.html) to expand the shape of a one-dimensional array.


```python
w = np.array([1,2])
w = np.expand_dims(w,axis=1)
```
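
Both variants should produce the same column vector of shape `(2, 1)`. A quick check, reusing the example values from the note above (the names `w_a` and `w_b` are only for illustration):

```python
import numpy as np

# Variant 1: nested list, one inner list per row
w_a = np.array([[1], [2]])

# Variant 2: flat vector with a trailing axis added
w_b = np.expand_dims(np.array([1, 2]), axis=1)

print(w_a.shape, w_b.shape)
print(np.array_equal(w_a, w_b))
```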




In [None]:
# Todo: write the best weight vector
w =

Run the next cell to test your function.

In [None]:
# Make prediction 
y_estimate = linear_model(X_aug,w)

# Plot the result
fig,ax = plt.subplots(1,1,figsize=(7,5))

# Plot the dataset
ax.scatter(X_l[:,0], y_l[:,0],s=40, cmap=plt.cm.Spectral, label = 'Dataset')
ax.plot(X_l, y_estimate, c='g', label="Linear model")
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()

plt.show()

## Error Function

In [None]:
# Index of the largest positive error (y - y_estimate)
idx = np.argmax(- y_estimate[:,0] + y_l[:,0] )

# Plot the error
fig,ax = plt.subplots(1,1,figsize=(7,5))


# Plot the dataset
ax.scatter(X_l[:,0], y_l[:,0],s=40, cmap=plt.cm.Spectral, label = 'Dataset')
ax.plot(X_l, y_estimate, c='g', label="Linear model")
ax.arrow(X_l[idx,0],y_estimate[idx,0], 0,- y_estimate[idx,0] + y_l[idx,0]-0.3, head_width=0.1, head_length=0.2, fc='k', ec='k')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()

plt.show()

To assess how well our model fits the data, we use an error function. One such function is the mean squared error, which measures the average squared difference between an estimate and the ground-truth value. The loss used here is the sum of the squared errors over all training points $\mathbf{x}^i$, divided by twice the number of points $N$. The extra factor $\frac{1}{2}$ is a common convention that simplifies the derivative of the loss.

$$L = \frac{1}{2N} \sum^N_{i=1}(y^i- \hat{y}^i )^2 \tag{3}$$
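
As a small worked example with made-up numbers (not taken from the dataset), let $N = 3$, $\mathbf{y} = (1, 2, 3)^\top$ and $\hat{\mathbf{y}} = (1, 1, 5)^\top$. The errors $y^i - \hat{y}^i$ are then $0$, $1$ and $-2$, so Equation 3 gives

$$L = \frac{1}{2 \cdot 3}\left(0^2 + 1^2 + (-2)^2\right) = \frac{5}{6} \approx 0.83$$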

### Task 2 (5 Min)

Write a function to compute the mean squared error (see Equation 3). This function should return the loss value and the error vector:
$$
\mathbf{E} =\mathbf{y} -  \hat{\mathbf{y}} $$


**Note** Use ``y.shape[0]`` to obtain $N$, ``E**2`` to compute the squared errors, and [np.sum](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.sum.html) to sum the squared errors over the dataset.


In [None]:
def loss_function(y,y_estimate):
  
  # ToDo: Compute error vector
  E =
  # ToDo: Compute the mean squared error
  L =

  return L, E

Use the ``loss_function`` to determine the mean squared error of your model.

In [None]:
L, E = loss_function(y_l,y_estimate)
print('The mean squared error is ', L)

## Linear Regression using scikit-learn
**Note** See the example on [sklearn.linear_model.LinearRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html)

In [None]:
from sklearn.linear_model import LinearRegression
# fit_intercept: whether to learn the bias term w_0 (True) or force the fit through the origin (False)
model_l = LinearRegression(fit_intercept=True)
model_l.fit(X_l, y_l)
yfit_l = model_l.predict(X_l)

model_n = LinearRegression(fit_intercept=False)
model_n.fit(X_n, y_n)
yfit_n = model_n.predict(X_n)

Run the next cell to test your model.

In [None]:
# Plot the result
fig,ax = plt.subplots(1,2,figsize=(15,5))
ax[0].scatter(X_l, y_l , s=40, cmap=plt.cm.Spectral, label = 'Ground Truth')
ax[0].plot(X_l, yfit_l, c='g', label="Trained model")
ax[0].set_xlabel('x')
ax[0].set_ylabel('y')
ax[0].legend()

ax[1].scatter(X_n, y_n, s=40, cmap=plt.cm.Spectral, label = 'Ground Truth')
ax[1].plot(X_n, yfit_n, c='g', label="Trained model")
ax[1].set_xlabel('x')
ax[1].set_ylabel('y')
ax[1].legend()

plt.show()
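
After fitting, scikit-learn exposes the learned parameters through the `coef_` and `intercept_` attributes. The sketch below is only an illustration on noise-free toy data (the names `x_toy`, `y_toy`, `with_bias` and `no_bias` are assumptions, not part of the assignment). It shows how the `fit_intercept` setting used for the two models above changes the fitted line:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free toy data on the line y = 2x - 5.5 (illustrative only)
x_toy = np.linspace(0.0, 6.0, 20).reshape(-1, 1)
y_toy = 2.0 * x_toy - 5.5

# With an intercept, the fit recovers both slope and bias
with_bias = LinearRegression(fit_intercept=True).fit(x_toy, y_toy)
print('slope:', with_bias.coef_.ravel(), 'bias:', with_bias.intercept_)

# Without an intercept, the line is forced through the origin,
# so the slope has to compensate for the missing bias term
no_bias = LinearRegression(fit_intercept=False).fit(x_toy, y_toy)
print('slope:', no_bias.coef_.ravel(), 'bias:', no_bias.intercept_)
```

Printing the same attributes for `model_l` and `model_n` is one way to compare the trained weights with the weight vector you chose by hand in Task 1.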


## Polynomial regression
The key property of the linear model is that it is a linear function of the parameters $w_0, \ldots, w_D$. It is also, however, a linear function of the input variables $x_i$, and this imposes significant limitations on the model. We therefore extend the class of models by considering linear combinations of fixed nonlinear functions of the input variables, of the form

$$y^i = f(x^i, \mathbf{w}) = w_0 + \sum_{j=1}^{D} w_j \phi_j(x^i) \tag{7}$$

where $\phi_j(x^i)$ are known as basis functions.

Polynomial regression is a simple example of nonlinear basis functions:
$$ y = w_0 + w_1 x + w_2 x^2 + w_3 x^3  \tag{8}$$

**Note** See the example on [sklearn.preprocessing.PolynomialFeatures](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html)

In [None]:
from sklearn.preprocessing import PolynomialFeatures
poly=PolynomialFeatures(degree=3)

poly_X =poly.fit_transform(X_n)
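
To see what the transformation produces, here is a small sketch on two made-up inputs (`x_toy`, `poly_toy` and `features` are illustrative names). With `degree=3` and the default settings, each input $x$ is expanded into the columns $1, x, x^2, x^3$, matching the terms of Equation 8:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two made-up 1D inputs (illustrative values only)
x_toy = np.array([[2.0], [3.0]])

# Expand each input into the polynomial basis [1, x, x^2, x^3]
poly_toy = PolynomialFeatures(degree=3)
features = poly_toy.fit_transform(x_toy)

print(features)
```

The leading column of ones plays the role of the bias term. Since `LinearRegression` fits an intercept by default, passing `include_bias=False` to `PolynomialFeatures` would likely work just as well here.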

In [None]:
from sklearn.linear_model import LinearRegression
model=LinearRegression()
model.fit(poly_X,y_n)

yfit = model.predict(poly_X)

Run the next cell to test your model.

In [None]:
# Plot the result
fig,ax = plt.subplots(1,1,figsize=(7,5))
ax.scatter(X_n, y_n , s=40, cmap=plt.cm.Spectral, label = 'Ground Truth')
ax.plot(X_n, yfit, c='g', label="Trained model")
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()
plt.show()