# PolynomialRegression from scratch

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

In [None]:
import numpy as np
import matplotlib.pyplot as plt

## Algorithm
**Input:**  
- `X`: an array of shape `(N,1)` whose rows are samples and columns are features
- `y`: the labels of shape `(N,)`
- `degree`: the degree of the polynomial 
- `**kwargs`: keywords for your linear regression function

**Output:**  
The output of your linear regression function, with the prediction function revised so that it accepts the original, unexpanded `X_test`.

**Steps:**
1. Let `X_ex = X**np.arange(1, degree + 1)` .
2. Suppose `LR` is your linear regression function.  
Let `predict_lin, coef, intercept = LR(X_ex, y, **kwargs)` .  
3. Define the function `predict` that sends `X_test` to `(X_test**np.arange(1, degree+1)).dot(coef) + intercept` .
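
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: `linear_regression` is a hypothetical stand-in for your own `LR` (here an ordinary least-squares fit with an intercept, built on `np.linalg.lstsq`), and `polynomial_regression` is a made-up name for the wrapper.

```python
import numpy as np

def linear_regression(X, y):
    """Hypothetical stand-in for LR: ordinary least squares with an intercept."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    v, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, coef = v[0], v[1:]

    def predict_lin(X_new):
        return X_new.dot(coef) + intercept

    return predict_lin, coef, intercept

def polynomial_regression(X, y, degree, **kwargs):
    # Step 1: broadcasting (N,1) against (degree,) gives columns x, x^2, ..., x^degree
    powers = np.arange(1, degree + 1)
    X_ex = X**powers
    # Step 2: fit a linear model on the expanded features
    predict_lin, coef, intercept = linear_regression(X_ex, y, **kwargs)

    # Step 3: wrap the prediction so that it accepts unexpanded inputs
    def predict(X_test):
        return (X_test**powers).dot(coef) + intercept

    return predict, coef, intercept

# Noise-free quadratic data, so the fit should recover the generating coefficients
x = np.arange(10)
X = x[:, np.newaxis]
y = 0.1 * x**2 + 0.2 * x + 0.3
predict, coef, intercept = polynomial_regression(X, y, degree=2)
print(coef, intercept)
```

Because the data here carry no noise, `coef` should come out close to `[0.2, 0.1]` and `intercept` close to `0.3`, up to numerical error.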

## Pseudocode
Translate the algorithm into the pseudocode.  
This helps you identify the parts that you do not yet know how to implement.  

    1. 
    2. 
    3. ...

## Code

In [None]:
### your answer here
import numpy as np
from sklearn.metrics import mean_squared_error

class MyPolynomialRegression():
    def __init__(self, degree, fit_intercept=True, algorithm="projection", learning_rate=0.01, n_iter=10000, regularization=None, alpha=1):
        self.degree = degree
        self.fit_intercept = fit_intercept
        self.algorithm = algorithm
        self.learning_rate = learning_rate
        self.n_iter = n_iter
        self.regularization = regularization
        self.alpha = alpha
        self.coef_ = 0
        self.intercept_ = 0
        
    def predict(self, X):
        return (X**np.arange(1, self.degree+1)).dot(self.coef_) + self.intercept_
    
    def fit(self, X, y):
        X_ex = X**np.arange(1, self.degree + 1)
        
        if self.fit_intercept:
            A = np.hstack([np.ones([X_ex.shape[0],1]), X_ex])
        else:
            A = X_ex.copy()
            
        N = A.shape[0]
        dp = A.shape[1]
        
        if self.algorithm == "projection":
            # Solve the normal equations A^T A v = A^T y without forming the inverse
            v = np.linalg.solve(A.T.dot(A), A.T.dot(y))
        elif self.algorithm == "grad_descent":
            v = np.random.randn(dp)

            # R^2 and MSE on the training data, recorded before each update
            self.fit_score = np.zeros(self.n_iter)
            self.fit_MSE = np.zeros(self.n_iter)

            for i in range(self.n_iter):
                # Expose the current iterate so that predict and score can use it
                if self.fit_intercept:
                    self.coef_ = v[1:]
                    self.intercept_ = v[0]
                else:
                    self.coef_ = v
                    self.intercept_ = 0.0
                self.fit_score[i] = self.score(X, y)
                self.fit_MSE[i] = mean_squared_error(y, self.predict(X))
                # Gradient of the mean squared error with respect to v
                gradient = (2/N)*((A.dot(v)-y).T.dot(A))
                if self.regularization == "L1":
                    gradient = gradient + self.alpha * np.sign(v)
                if self.regularization == "L2":
                    gradient = gradient + self.alpha * 2 * v
                v = v - self.learning_rate * gradient
        else:
            raise ValueError("algorithm must be 'projection' or 'grad_descent'")
        
        if self.fit_intercept:
            self.coef_ = v[1:]
            self.intercept_ = v[0]
        else:
            self.coef_ = v
            self.intercept_ = 0.0
        
    
    def score(self, X, y):
        y_new = self.predict(X)
        R2 = 1-((y-y_new)**2).sum()/((y-y.mean())**2).sum()
        return R2

## Test
Take some sample data from [PolynomialRegression-with-scikit-learn](PolynomialRegression-with-scikit-learn.ipynb) and check whether your code produces outputs similar to those of the existing packages.

##### Name of the data
Description of the data.

In [None]:
### results with your code

x = np.arange(10)
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)
X = x[:,np.newaxis]
x_test = np.linspace(0,10,20)
X_test = x_test[:,np.newaxis]

model = MyPolynomialRegression(2)
model.fit(X, y)
y_new = model.predict(X_test)
plt.scatter(x,y)
plt.plot( x_test , y_new, c='r')

print(model.coef_)
print(model.intercept_)

In [None]:
### results with existing packages
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

def PolynomialRegression(degree=2, fit_intercept=True):
    return make_pipeline(PolynomialFeatures(degree=degree, include_bias=False), 
                         LinearRegression(fit_intercept=fit_intercept))

model = PolynomialRegression(2)
model.fit(X, y)
y_new = model.predict(X_test)
plt.scatter(x,y)
plt.plot( x_test , y_new, c='r') 

print(model[1].coef_)
print(model[1].intercept_)

## Comparison

##### Exercise 1
Let  
```python
degree = 3
x = np.arange(5)
X = x[:,np.newaxis]
```

###### 1(a)
Let `X_ex1 = X**np.arange(1, degree+1)` .  
The new data `X_ex1` is supposed to be the same as the output of `sklearn.preprocessing.PolynomialFeatures` with `include_bias=False` .  
Check if this is true.

In [None]:
### your answer here
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(10)
X = x[:,np.newaxis]
degree = 2

model = PolynomialFeatures(degree=degree, include_bias=False)
X_ex = model.fit_transform(X)

X_ex1 = X**np.arange(1, degree+1)
# a single True means every entry agrees
print((X_ex1 == X_ex).all())

###### 1(b)
Let `X_ex1 = X**np.arange(0, degree+1)` .  
The new data `X_ex1` is supposed to be the same as the output of `sklearn.preprocessing.PolynomialFeatures` with `include_bias=True` .  
Check if this is true.

In [None]:
### your answer here
model = PolynomialFeatures(degree=degree, include_bias=True)
X_ex = model.fit_transform(X)

X_ex1 = X**np.arange(0, degree+1)
# a single True means every entry agrees
print((X_ex1 == X_ex).all())

##### Exercise 2
Let  
```python
x = np.arange(10)
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)
X = x[:,np.newaxis]
```

###### 2(a)
Let `degree=2` .
Apply the linear regression algorithm to `X`  
1. by your code with `algorithm=="projection"` ,  
2. by your code with `algorithm=="grad_descent"` ,  
3. by `sklearn.linear_model.LinearRegression` .  

Check if the outputs are almost the same (up to some numerical errors).  

In [None]:
### your answer here
model = MyPolynomialRegression(degree=2, algorithm="projection")
model.fit(X, y)
print("projection:")
print(model.coef_)
print(model.intercept_)

model = MyPolynomialRegression(degree=2, algorithm="grad_descent", learning_rate=0.0006, n_iter=30000)
model.fit(X, y)
print("grad_descent:")
print(model.coef_)
print(model.intercept_)

model = PolynomialRegression(2)
model.fit(X, y)
print("sklearn:")
print(model[1].coef_)
print(model[1].intercept_)

###### 2(b)
Modify your code so that it prints or records the mean squared error at each step of the gradient descent.  
Check if it is always decreasing.
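
Whether the mean squared error decreases at every step depends on the learning rate. For a quadratic cost, plain gradient descent cannot increase the cost as long as the learning rate is below `2/L`, where `L` is the largest eigenvalue of the Hessian `(2/N) * A.T.dot(A)`. The sketch below computes this bound for the design matrix of the exercise (columns `1`, `x`, `x**2`) and compares two learning rates. The helper `mse_history` is a hypothetical, stripped-down gradient descent without regularization, not the class above, and the data are taken noise-free for simplicity.

```python
import numpy as np

x = np.arange(10)
A = np.vstack([np.ones(10), x, x**2]).T   # design matrix with intercept column
N = A.shape[0]

# Hessian of the mean squared error with respect to the parameter vector
H = (2 / N) * A.T.dot(A)
L = np.linalg.eigvalsh(H).max()
print("largest eigenvalue:", L)
print("step-size bound 2/L:", 2 / L)

def mse_history(A, y, learning_rate, n_iter=50, seed=0):
    """Run plain gradient descent and return the MSE recorded before each update."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=A.shape[1])
    history = np.zeros(n_iter)
    for i in range(n_iter):
        residual = A.dot(v) - y
        history[i] = np.mean(residual**2)
        v = v - learning_rate * (2 / len(y)) * A.T.dot(residual)
    return history

y = 0.1 * x**2 + 0.2 * x + 0.3
for lr in (0.0006, 0.001):
    h = mse_history(A, y, lr)
    print(lr, "always decreasing:", bool(np.all(np.diff(h) <= 0)))
```

For this data the bound is about `0.00064`, which suggests why a learning rate such as `0.0006` still works while a slightly larger one makes the error blow up.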

In [None]:
### your answer here
x = np.arange(10)
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)
X = x[:,np.newaxis]

model = MyPolynomialRegression(degree=2, algorithm="grad_descent", learning_rate=0.0006, n_iter=30000)
model.fit(X, y)
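print(model.fit_MSE[:10])
# True if the MSE never increases from one step to the next
print(np.all(np.diff(model.fit_MSE) <= 0))
plt.scatter(np.arange(model.n_iter), model.fit_MSE)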

##### Exercise 3
Add a new keyword `regularization`, which can be `None`, `"L1"`, or `"L2"` .  
Add another keyword `alpha`, which is a positive number.  

When `regularization==None`, the cost function is 
$$\frac{1}{N}\sum_{i=0}^{N-1}\|f({\bf x}_i) - y_i\|^2.$$ 
When `regularization=="L1"`, the cost function is 
$$\frac{1}{N}\sum_{i=0}^{N-1}\|f({\bf x}_i) - y_i\|^2 + \alpha\sum_{i=0}^{d-1}|c_i|.$$ 
When `regularization=="L2"`, the cost function is 
$$\frac{1}{N}\sum_{i=0}^{N-1}\|f({\bf x}_i) - y_i\|^2 + \alpha\sum_{i=0}^{d-1}c_i^2.$$ 
Here ${\bf x}_i$ are the data, $y_i$ are the labels, $c_i$ are the coefficients to be solved, and $\alpha$ is the value of the keyword `alpha`.

The regularization term penalizes large coefficients, which keeps them from growing too large.
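
The three cost functions can be written as one small function. This is a sketch under assumptions: the name `regularized_cost` is made up, the model is taken to be linear in the expanded features (`f(x_i)` is row `i` of `A.dot(c)`), the penalty is weighted by `alpha`, and every entry of `c` is penalized.

```python
import numpy as np

def regularized_cost(A, y, c, regularization=None, alpha=1):
    """Mean squared error of the model A.dot(c), plus an optional penalty on c."""
    cost = np.mean((A.dot(c) - y)**2)
    if regularization == "L1":
        cost += alpha * np.abs(c).sum()
    elif regularization == "L2":
        cost += alpha * (c**2).sum()
    return cost

# Tiny example in which c fits the data exactly, so only the penalty remains
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
c = np.array([1.0, 2.0])

print(regularized_cost(A, y, c))          # 0.0
print(regularized_cost(A, y, c, "L1"))    # 3.0, that is |1| + |2|
print(regularized_cost(A, y, c, "L2"))    # 5.0, that is 1^2 + 2^2
```

With a perfect fit the unregularized cost is zero, so the L1 and L2 values show exactly how much each penalty charges for the size of the coefficients.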

In [None]:
model = MyPolynomialRegression(degree=2, algorithm="grad_descent", learning_rate=0.0006, n_iter=30000)
model.fit(X, y)
print(model.coef_)
print(model.intercept_)
print("MSE:", model.fit_MSE[-1])

###### 3(a)
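When `regularization=="L1"`, the gradient becomes `g = g0 + alpha * np.sign(c)` , where `g0` is the gradient when `regularization==None` . The absolute value is not differentiable at 0; `np.sign` returns 0 there, which is a valid subgradient.  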
Update your code for L1.

In [None]:
### your answer here
model = MyPolynomialRegression(degree=2, algorithm="grad_descent", learning_rate=0.0006, n_iter=30000, regularization="L1")
model.fit(X, y)
print(model.coef_)
print(model.intercept_)
print("MSE:", model.fit_MSE[-1])

###### 3(b)
When `regularization=="L2"`, the gradient becomes `g = g0 + alpha * 2 * c` , where `g0` is the gradient when `regularization==None` .  
Update your code for L2.
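
The gradient formulas in 3(a) and 3(b) can be checked numerically against central finite differences of the cost. The sketch below assumes that the penalty is weighted by `alpha` and applied to every entry of `c`; the helper names `cost`, `grad`, and `numeric_grad` are made up for this check. The test point has no zero entries, since the L1 penalty is not differentiable there.

```python
import numpy as np

def cost(A, y, c, regularization, alpha):
    """Mean squared error plus the chosen penalty."""
    value = np.mean((A.dot(c) - y)**2)
    if regularization == "L1":
        value += alpha * np.abs(c).sum()
    if regularization == "L2":
        value += alpha * (c**2).sum()
    return value

def grad(A, y, c, regularization, alpha):
    """Gradient formulas from the exercise: g0 plus the penalty term."""
    N = A.shape[0]
    g = (2 / N) * A.T.dot(A.dot(c) - y)
    if regularization == "L1":
        g = g + alpha * np.sign(c)
    if regularization == "L2":
        g = g + alpha * 2 * c
    return g

def numeric_grad(A, y, c, regularization, alpha, eps=1e-6):
    """Central finite differences, one coordinate direction at a time."""
    return np.array([
        (cost(A, y, c + eps * e, regularization, alpha)
         - cost(A, y, c - eps * e, regularization, alpha)) / (2 * eps)
        for e in np.eye(len(c))
    ])

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 3))
y = rng.normal(size=10)
c = np.array([0.5, -1.5, 2.0])   # no zero entries
alpha = 0.5

for reg in (None, "L1", "L2"):
    ok = np.allclose(numeric_grad(A, y, c, reg, alpha), grad(A, y, c, reg, alpha), atol=1e-6)
    print(reg, ok)
```

If a formula were implemented with a wrong sign or a missing factor of 2, the corresponding line would print `False`.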

In [None]:
### your answer here
model = MyPolynomialRegression(degree=2, algorithm="grad_descent", learning_rate=0.0006, n_iter=30000, regularization="L2")
model.fit(X, y)
print(model.coef_)
print(model.intercept_)
print("MSE:", model.fit_MSE[-1])

##### Jephian:
Wonderful.