# PolynomialRegression from scratch

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

In [None]:
import numpy as np
import matplotlib.pyplot as plt

## Algorithm
**Input:**  
- `X`: an array of shape `(N,1)` whose rows are samples and columns are features
- `y`: the labels of shape `(N,)`
- `degree`: the degree of the polynomial 
- `**kwargs`: keywords for your linear regression function

**Output:**  
The output `predict, coef, intercept` of your linear regression function, with `predict` revised so that it accepts unexpanded data such as `X_test`.

**Steps:**
1. Let `X_ex = X**np.arange(1, degree + 1)` .
2. Suppose `LR` is your linear regression function.  
Let `predict_lin,coef,intercept = LR(X_ex, y, **kwargs)` .  
3. Define the function `predict` that sends `X_test` to `(X_test**np.arange(1, degree+1)).dot(coef) + intercept` .
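
As a rough illustration of the three steps, the sketch below uses a hypothetical stand-in for `LR` built on `np.linalg.lstsq`. The names `lstsq_linear_regression` and `polynomial_regression`, and the extra `LR` argument, are assumptions made for this example; they are not the linear regression function you wrote earlier.

```python
import numpy as np

def lstsq_linear_regression(X, y):
    """Hypothetical stand-in for `LR`: ordinary least squares via np.linalg.lstsq."""
    N = X.shape[0]
    # Prepend a column of ones so the intercept is fitted with the coefficients.
    A = np.hstack([np.ones((N, 1)), X])
    sol, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, coef = sol[0], sol[1:]

    def predict_lin(X_test):
        return X_test.dot(coef) + intercept

    return predict_lin, coef, intercept

def polynomial_regression(X, y, degree, LR=lstsq_linear_regression, **kwargs):
    powers = np.arange(1, degree + 1)
    # Step 1: broadcasting expands the (N, 1) column into x, x^2, ..., x^degree.
    X_ex = X**powers
    # Step 2: fit a linear model on the expanded features.
    predict_lin, coef, intercept = LR(X_ex, y, **kwargs)

    # Step 3: expand the test data in the same way before predicting.
    def predict(X_test):
        return (X_test**powers).dot(coef) + intercept

    return predict, coef, intercept

# Noise-free data, so the fit should recover the generating polynomial.
x = np.arange(10)
X = x[:, np.newaxis]
y = 0.1 * x**2 + 0.2 * x + 0.3
predict, coef, intercept = polynomial_regression(X, y, degree=2)
print(coef, intercept)
```

Passing `LR` as an argument is only a convenience here; it lets you swap in your own function without changing the rest of the sketch.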

## Pseudocode
Translate the algorithm into pseudocode.  
This helps you identify the parts that you do not yet know how to do.  

    1. 
    2. 
    3. ...

## Code

In [None]:
### your answer here

## Test
Take some sample data from [PolynomialRegression-with-scikit-learn](PolynomialRegression-with-scikit-learn.ipynb) and check whether your code produces outputs similar to those of the existing packages.

##### Name of the data
Description of the data.

In [None]:
### results with your code

In [None]:
### results with existing packages

## Comparison

##### Exercise 1
Let  
```python
degree = 3
x = np.arange(5)
X = x[:,np.newaxis]
```

###### 1(a)
Let `X_ex1 = X**np.arange(1, degree+1)` .  
The new data `X_ex1` is supposed to be the same as the output of `sklearn.preprocessing.PolynomialFeatures` with `include_bias=False` .  
Check if this is true.
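
The expression `X**np.arange(1, degree+1)` relies on NumPy broadcasting. The sketch below, which uses only NumPy, shows the resulting shape and cross-checks the columns against `np.vander`. It does not replace the comparison with `PolynomialFeatures` asked for above, and the variable names `powers` and `vander` are chosen for this example.

```python
import numpy as np

degree = 3
x = np.arange(5)
X = x[:, np.newaxis]                 # shape (5, 1)
powers = np.arange(1, degree + 1)    # shape (3,)

# (5, 1) ** (3,) broadcasts to (5, 3): column j holds x**(j + 1).
X_ex1 = X**powers
print(X_ex1.shape)
print(X_ex1)

# Cross-check with NumPy's Vandermonde matrix, dropping its constant column x**0.
vander = np.vander(x, degree + 1, increasing=True)
print(np.array_equal(X_ex1, vander[:, 1:]))
```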

In [None]:
### your answer here

###### 1(b)
Let `X_ex1 = X**np.arange(0, degree+1)` .  
The new data `X_ex1` is supposed to be the same as the output of `sklearn.preprocessing.PolynomialFeatures` with `include_bias=True` .  
Check if this is true.

In [None]:
### your answer here

##### Exercise 2
Let  
```python
x = np.arange(10)
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)
X = x[:,np.newaxis]
```

###### 2(a)
Let `degree=2` .
Apply the linear regression algorithm to `X`  
1. by your code with `algorithm=="projection"` ,  
2. by your code with `algorithm=="grad_descent"` ,  
3. by `sklearn.linear_model.LinearRegression` .  

Check if the outputs are almost the same (up to some numerical errors).  

In [None]:
### your answer here

###### 2(b)
Modify your code so that it prints the mean squared error at each step of the gradient descent.  
Check whether it decreases at every step.
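
One possible way to track the error is sketched below. It is a minimal gradient descent written for this illustration, not your own implementation: the function name `grad_descent_with_mse`, the learning rate, the number of steps, and the random seed are all assumptions. It stores the mean squared error of every step in a list and prints only every 100th value to keep the output short.

```python
import numpy as np

def grad_descent_with_mse(A, y, learning_rate=1e-4, n_steps=500, print_every=100):
    """Plain gradient descent on the mean squared error of A.dot(c) - y."""
    N, d = A.shape
    c = np.zeros(d)
    history = []
    for step in range(n_steps):
        residual = A.dot(c) - y
        mse = np.mean(residual**2)
        history.append(mse)
        if step % print_every == 0:
            print(f"step {step}: MSE = {mse:.6f}")
        # Gradient of (1/N) * sum(residual**2) with respect to c.
        g = 2 / N * A.T.dot(residual)
        c = c - learning_rate * g
    return c, history

# Data as in Exercise 2, with a fixed seed (an assumption) for reproducibility.
rng = np.random.default_rng(0)
x = np.arange(10)
y = 0.1 * x**2 + 0.2 * x + 0.3 + 0.5 * rng.standard_normal(10)
A = x[:, np.newaxis] ** np.arange(0, 3)   # columns 1, x, x^2

c, history = grad_descent_with_mse(A, y)
is_decreasing = all(later <= earlier for earlier, later in zip(history, history[1:]))
print(is_decreasing)
```

The learning rate is kept small because the `x**2` column reaches 81; with a step that is too large the error can grow instead of shrink, which is worth trying as part of the check.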

In [None]:
### your answer here

##### Exercise 3
Add a new keyword `regularization`, which can be `None`, `"L1"`, or `"L2"` .  
Add another keyword `alpha`, which is a positive number.  

When `regularization==None`, the cost function is 
$$\frac{1}{N}\sum_{i=0}^{N-1}\|f({\bf x}_i) - y_i\|^2.$$ 
When `regularization=="L1"`, the cost function is 
$$\frac{1}{N}\sum_{i=0}^{N-1}\|f({\bf x}_i) - y_i\|^2 + \alpha\sum_{i=0}^{d-1}|c_i|.$$ 
When `regularization=="L2"`, the cost function is 
$$\frac{1}{N}\sum_{i=0}^{N-1}\|f({\bf x}_i) - y_i\|^2 + \alpha\sum_{i=0}^{d-1}c_i^2.$$ 
Here ${\bf x}_i$ are the data, $y_i$ are the labels, $f$ is the fitted model, $c_i$ are the coefficients to be solved, and $d$ is the number of coefficients.

The regularization term penalizes large coefficients, which keeps them from growing too large.
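
The three cost functions can be evaluated as in the sketch below. It assumes the model is linear in the coefficients, $f({\bf x}_i) = (A{\bf c})_i$ for a design matrix `A` whose columns match the entries of `c`; the function name `cost` and its argument order are choices made for this example. Whether `A` contains a constant column, and hence whether the intercept is penalized, is left to you.

```python
import numpy as np

def cost(c, A, y, regularization=None, alpha=1.0):
    """Mean squared error of A.dot(c) - y plus an optional L1 or L2 penalty."""
    residual = A.dot(c) - y
    mse = np.mean(residual**2)
    if regularization is None:
        return mse
    if regularization == "L1":
        return mse + alpha * np.sum(np.abs(c))
    if regularization == "L2":
        return mse + alpha * np.sum(c**2)
    raise ValueError('regularization must be None, "L1", or "L2"')

# Small example: every residual is -1, so the unregularized cost is 1.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
c = np.array([0.0, 2.0])

cost_none = cost(c, A, y)
cost_l1 = cost(c, A, y, regularization="L1", alpha=0.5)
cost_l2 = cost(c, A, y, regularization="L2", alpha=0.5)
print(cost_none, cost_l1, cost_l2)
```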

###### 3(a)
When `regularization=="L1"`, the correct gradient is `g = g0 + alpha * np.sign(c)` , where `g0` is the gradient when `regularization==None` .  
Update your code for L1.

In [None]:
### your answer here

###### 3(b)
When `regularization=="L2"`, the correct gradient is `g = g0 + alpha * 2 * c` , where `g0` is the gradient when `regularization==None` .  
Update your code for L2.
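
The gradient formulas in 3(a) and 3(b) can be combined in a single helper, sketched below under the assumption that the model is `A.dot(c)` for a design matrix `A` and that `g0` is the gradient of the mean squared error. The name `regularized_gradient` is hypothetical, and how this plugs into your own gradient descent loop depends on your code.

```python
import numpy as np

def regularized_gradient(c, A, y, regularization=None, alpha=1.0):
    """Gradient of the cost with respect to c, for None, "L1", or "L2"."""
    N = A.shape[0]
    # g0: gradient of (1/N) * sum((A.dot(c) - y)**2).
    g0 = 2 / N * A.T.dot(A.dot(c) - y)
    if regularization is None:
        return g0
    if regularization == "L1":
        # np.sign(c) is 0 where c is 0, where |c| is not differentiable.
        return g0 + alpha * np.sign(c)
    if regularization == "L2":
        return g0 + alpha * 2 * c
    raise ValueError('regularization must be None, "L1", or "L2"')

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
c = np.array([0.0, 2.0])

g_none = regularized_gradient(c, A, y)
g_l1 = regularized_gradient(c, A, y, regularization="L1", alpha=0.5)
g_l2 = regularized_gradient(c, A, y, regularization="L2", alpha=0.5)
print(g_none, g_l1, g_l2)
```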

In [None]:
### your answer here