# E 2.1: Linear Regression

In this exercise we will explore linear regression.  We will implement 
a linear regression fitter, evaluate its performance on the ZIP code 
data set and compare it to the implementation in the `scikit-learn` 
library.


## A. Implementation

Implement a linear regression estimator using the results derived in
the lecture.  You will need to implement two functions, a fit function 
that determines the parameter vector $\hat{\beta}$ from a training
dataset and a prediction function that computes $\hat{Y}$ 
from a test dataset. 

The fit function should implement 

\begin{align}
\hat{\beta} = (X^T X)^{-1} X^T Y
\end{align}

where $X$ is the training dataset and $Y$ are the training labels. Remember our 
convention to include the bias in $X$ by prepending the constant $1$ to each of the 
$p$-vectors $x_i$ from the training data. That is, the dimensions of $X$
are $(N, p+1)$.
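
As a quick sanity check of this formula, the following sketch builds $X$ for a small synthetic dataset and compares the closed-form $\hat{\beta}$ with the result of `np.linalg.lstsq`. The data, the helper `add_bias` and the names `xs`, `ys` and `true_beta` are made up for illustration and are not part of the exercise.

```python
import numpy as np

def add_bias(xs):
    """Return xs with a constant-1 column prepended, shape (N, p+1)."""
    return np.hstack([np.ones((xs.shape[0], 1)), xs])

# Synthetic data (illustrative only): N = 50 samples, p = 3 features.
rng = np.random.default_rng(0)
xs = rng.normal(size=(50, 3))
true_beta = np.array([1.0, 2.0, -1.0, 0.5])  # first entry is the bias
ys = add_bias(xs).dot(true_beta) + 0.01 * rng.normal(size=50)

X = add_bias(xs)

# Closed-form solution (X^T X)^{-1} X^T Y.
beta_hat = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(ys)

# Reference solution from numpy's least-squares solver.
beta_ref, *_ = np.linalg.lstsq(X, ys, rcond=None)

print(np.allclose(beta_hat, beta_ref))
```

Both routes should give the same parameters on well-conditioned data such as this. When $X^T X$ is close to singular, the explicit inverse can become numerically unreliable, and a least-squares solver is usually the safer choice.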

The prediction function should implement

\begin{align}
\hat{Y} = X \hat{\beta}
\end{align}

where $\hat{Y}$ are the predicted labels and $X$ is the input data from
the test data set (again with a constant $1$ bias column added).
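
A small shape check can make the dimensions concrete. In the sketch below the parameter values and test points are chosen by hand purely for illustration. Each row of the bias-augmented test matrix is one sample, so multiplying it by $\hat{\beta}$ yields one prediction per row.

```python
import numpy as np

# Hypothetical fitted parameters for p = 2 features: [bias, beta_1, beta_2].
beta_hat = np.array([0.5, 2.0, -1.0])

# Three test points with p = 2 features each.
xs_test = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [2.0, 2.0]])

# Prepend the bias column so that X has shape (N, p+1) = (3, 3).
X = np.hstack([np.ones((xs_test.shape[0], 1)), xs_test])

# (N, p+1) times (p+1,) gives one prediction per sample, shape (N,).
y_hat = X.dot(beta_hat)
print(y_hat.shape, y_hat)  # predictions are 2.5, -0.5, 2.5
```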

#### *Hint:* 
*Use the `numpy` library to perform all matrix operations.*

In [2]:
import numpy as np

class LREstimator(object):
    
    def __init__(self):
        self.betas = None
        
    def fit(self, xs_train, ys_train):
        X = np.ones((xs_train.shape[0], xs_train.shape[1] + 1))
        X[:,1:] = xs_train
        s = np.linalg.inv(X.T.dot(X))
        # multi_dot expects a single sequence of arrays
        self.betas = np.linalg.multi_dot([s, X.T, ys_train])
        
        return self
    
    
    def predict(self, xs):
        # protect against premature calls 
        if self.betas is None:
            raise ValueError('can not predict from unfitted model')
        
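        # prepend the constant 1 bias column, as in fit
        X = np.ones((xs.shape[0], xs.shape[1] + 1))
        X[:,1:] = xs
        ys = X.dot(self.betas)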
        
        return ys

    print('TAT')

TAT
