# Day 2.1: Making our own OLS class

Today we are going to write our own OLS class, plus some helper functions that generate random data. Specifically, our goals are to:
- Make a linear projection function with b0, b1, x as input, y as output
- Write a data generating function
- Give a brief explanation of the scipy.optimize.minimize function
- Minimize the squared errors to estimate b0 and b1
- Create a class that implements the same minimization,
  takes data at instantiation, and has an `estimate` method.

In [1]:
# for later
import numpy as np
from scipy.stats import distributions as iid
from scipy.stats import rv_continuous

## Building blocks of an OLS class
Before defining the OLS class, we are going to write its methods as standalone functions outside the class so that we know they are behaving properly. Our goal is to write 2 main methods (plus a data generating helper so we have something to test them on):
1. Linear projection
    - inputs: b0, b1, x
    - outputs: y
2. Minimizer function: minimizes the sum of squared errors between the linear projection and y

### 1. Linear projection
Suppose we have a matrix X that is N by 2, where the first column is a column of ones, and a vector of betas: b = [b0, b1]. The linear projection, i.e. the vector of predicted values of y, is given by $Xb$.

In [2]:
def linear_projection(X, b):
    return X@b
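To see what `X@b` is doing, here is a small worked example. The three-row matrix and the values in `b_small` are made up purely for illustration; each predicted value is just `b0 + b1 * x` for that row.

```python
import numpy as np

def linear_projection(X, b):
    # predicted values: each row of X dotted with b
    return X @ b

# three observations; the first column is the intercept column of ones
X_small = np.array([[1.0, 0.0],
                    [1.0, 1.0],
                    [1.0, 2.0]])
b_small = np.array([2.0, 8.0])  # illustrative values for [b0, b1]

y_hat = linear_projection(X_small, b_small)
print(y_hat)  # values: 2, 10, 18
```

Note that the result is a length-N vector, one prediction per row of X.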

### 2. Data generating process
We will create data that has N observations and a "true" but noisy relationship between $x$ and $y$. This type of data is used in Monte Carlo simulations to test theory.

In [3]:
# Now let's create some data with a function that takes a true
# value for beta and a desired number of observations, and creates X and y
def dataGenerator(beta, N):
    # create an x vector
    # note I instantiate iid.norm() and call method rvs() in the same step!
    x = iid.norm().rvs(N)

    # create a random error
    e = iid.norm().rvs(N)
    # add an intercept by vertically stacking x with an array of ones,
    # then transposing, so that X is N by 2
    X = np.vstack([np.ones(N), x]).T
    # create y as the linear projection plus noise (assumed form: y = Xb + e)
    y = X @ np.array(beta) + e

    return X, y

In [4]:
# create data
beta_true = [2,8]
N = 100

X, y = dataGenerator(beta_true, N)
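Because the data are random, every run of the cell above gives different numbers. If you want reproducible draws (handy for debugging a Monte Carlo), one option is to seed a NumPy generator. The sketch below is a variation, not the function used in the rest of this notebook: the name `data_generator_seeded` and the `seed` argument are hypothetical, and it assumes the same model, $y = Xb + e$ with standard normal $x$ and $e$.

```python
import numpy as np

def data_generator_seeded(beta, N, seed=0):
    # hypothetical helper: same idea as dataGenerator, but reproducible
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(N)            # regressor
    e = rng.standard_normal(N)            # random error
    X = np.column_stack([np.ones(N), x])  # N by 2, first column is ones
    y = X @ np.asarray(beta) + e          # assumed form: y = Xb + e
    return X, y

X_a, y_a = data_generator_seeded([2, 8], 100, seed=42)
X_b, y_b = data_generator_seeded([2, 8], 100, seed=42)
print(X_a.shape, y_a.shape)      # (100, 2) (100,)
print(np.array_equal(y_a, y_b))  # True: same seed, same data
```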

### 3. Minimizer function
In order to minimize, we are going to use a minimizer function from ``scipy.optimize``. The documentation for this function can be found [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html), but the key arguments are the following:
* ``fun``: the function to be minimized. This must be a function of only one input; if there are multiple inputs, we will "mask" these using lambda functions.
* ``x0``: The starting guess for the solution. When the objective has a single minimum that is also the global minimum (as the least squares problem does when X has full column rank), the choice affects only computation time, not the answer.

Thus the final syntax is `minimize(fun = function, x0 = [start guess])`, where `function` is passed by name (or as a lambda) rather than called.

The function returns an instance of the ``OptimizeResult`` class, which has several attributes. The only one we will be interested in for now is ``x``, the solution that solves the minimization.
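Before using it on our regression problem, here is a toy sketch of the same pattern. The function `bowl` and the `center` values are made up for illustration: `bowl` takes two inputs, so we mask `center` with a lambda, and we try two different start guesses to see that both land on the same solution.

```python
import numpy as np
from scipy.optimize import minimize

def bowl(v, center):
    # squared distance from v to center; minimized exactly at v == center
    return np.sum((np.asarray(v) - center) ** 2)

center = np.array([3.0, -1.0])  # illustrative values

# the lambda turns bowl into a function of v only
res_a = minimize(lambda v: bowl(v, center), x0=[0, 0])
res_b = minimize(lambda v: bowl(v, center), x0=[10, 10])

# .x holds the solution; both start guesses should end up near [3, -1]
print(res_a.x, res_b.x)
```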

Let's set up a function that returns the object we want to minimize: the sum of squared errors:

In [5]:
def sse(y, X, b):
    yhat = linear_projection(X, b)
    sse = np.sum((yhat - y)**2)
    return sse

In [6]:
sse(y,X,beta_true)

73.86071221203768

Now we can minimize this function, making it a function of just one variable by masking the other inputs in a lambda function:

In [7]:
from scipy.optimize import minimize

In [8]:
# remember the syntax:
# minimize(fun = function, x0 = [start guess])
# the lambda function makes sse a function of only x; the other inputs
# come from the variables X and y we already defined.
minimize(lambda x: sse(y, X, x), x0 = [0,0])
# as expected, we get an intercept of around 2 and a slope around 8

      fun: 70.0929008081801
 hess_inv: array([[ 0.00511418, -0.00068726],
       [-0.00068726,  0.00413653]])
      jac: array([0.00000000e+00, 2.86102295e-06])
  message: 'Optimization terminated successfully.'
     nfev: 27
      nit: 6
     njev: 9
   status: 0
  success: True
        x: array([2.19598518, 7.96358817])

In [16]:
# let's do it again with a higher N, letting the LLN work for us!
X, y = dataGenerator(beta_true, 10000)
minimize(lambda x: sse(y, X, x), x0 = [0,0])
# now it's even more accurate!

      fun: 10101.768512789446
 hess_inv: array([[5.00039507e-05, 4.39765871e-07],
       [4.39765871e-07, 4.91367966e-05]])
      jac: array([0., 0.])
  message: 'Optimization terminated successfully.'
     nfev: 57
      nit: 7
     njev: 19
   status: 0
  success: True
        x: array([2.02638879, 8.00740285])
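A quick sanity check on the minimizer itself is to compare its answer with NumPy's built-in least squares solver, `np.linalg.lstsq`. The sketch below is self-contained: it draws its own data (the names `X_chk` and `y_chk`, the seed, and N = 1000 are arbitrary choices for illustration) and assumes the same model as above.

```python
import numpy as np
from scipy.optimize import minimize

# draw a fresh, reproducible data set with true beta = [2, 8]
rng = np.random.default_rng(0)
N = 1000
X_chk = np.column_stack([np.ones(N), rng.standard_normal(N)])
y_chk = X_chk @ np.array([2.0, 8.0]) + rng.standard_normal(N)

def sse(y, X, b):
    # sum of squared errors of the linear projection Xb
    return np.sum((X @ b - y) ** 2)

# numerical minimization, as in the cells above
b_min = minimize(lambda b: sse(y_chk, X_chk, b), x0=[0, 0]).x

# NumPy's least squares solver, for comparison
b_lstsq, *_ = np.linalg.lstsq(X_chk, y_chk, rcond=None)

print(b_min, b_lstsq)
print(np.allclose(b_min, b_lstsq, atol=1e-3))
```

The two estimates should agree up to the optimizer's tolerance, since both are minimizing the same sum of squared errors.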

## Creating an OLS class
Now let's take our functions and organize them into an OLS class. This class will have attributes X and y, and methods defined by the functions above.
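The class definition cell itself is not shown here, so what follows is only a minimal sketch of what such a class might look like. It is an assumption pieced together from the functions above and from the calls used below (`OLS(X, y)`, `.sse(b)`, `.estimate()`); the start guess `[0, 0]` is carried over from the earlier cells.

```python
import numpy as np
from scipy.optimize import minimize

class OLS:
    def __init__(self, X, y):
        # store the data as attributes at instantiation
        self.X = X
        self.y = y

    def linear_projection(self, b):
        # predicted y for a candidate b, using the stored X
        return self.X @ b

    def sse(self, b):
        # sum of squared errors as a function of b only
        yhat = self.linear_projection(b)
        return np.sum((yhat - self.y) ** 2)

    def estimate(self):
        # self.sse already takes a single argument, so no lambda is needed
        results = minimize(self.sse, x0=[0, 0])
        return results.x
```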

Notice that inside the class, `minimize` no longer needs a lambda function to mask `X` and `y`. The reason starts with the `sse` method: now that `X` and `y` are attributes of the class, `self.sse()` can reach them through the `self` argument that is implicitly passed into it. Therefore we can call `sse()` as a function of only one argument: `b`. Here's proof:

In [11]:
# instantiate model
model1 = OLS(X, y)

# call sse method
model1.sse([2,1])

5011505.949727456

Now that `OLS.sse()` is a function of only one argument, we can drop the lambda in the `minimize` call and pass the method just by its name, `self.sse`. `minimize` treats that single argument as the one it is minimizing over. Let's test it:

In [12]:
# call the estimate() method
model1.estimate()

array([1.99100375, 7.99879494])

## Exercises
Try these out for yourself:
1. After estimating $\widehat{\beta}$, add it as an attribute of the OLS class.
2. Estimate the White-robust SEs. There are a few ways to do this; which do you prefer? Why?
     - Estimate and return them with beta in a tuple
     - Estimate them with beta and add as an attribute
     - Write a method that calculates and returns them upon request (nice code will avoid re-estimating the betas each time you do this. How can this be avoided?)
3. Rewrite the estimation in terms of matrix algebra instead of a minimization.