# Week 1 

The purpose of this week's exercise is twofold: first, to introduce you to NumPy and make you familiar with the library and some of its pitfalls; second, to use this knowledge to estimate the linear model using OLS.

## A short introduction to Numpy and Linear Algebra (Linalg)
First, import all necessary packages. If you are missing a package, you can either install it through your terminal using pip, or an Anaconda terminal using conda.

In [1]:
import numpy as np
from numpy import linalg as la
from numpy import random as random
from tabulate import tabulate
from matplotlib import pyplot as plt

### Entering matrices manually
To create a $1\times9$ *row* vector write,

In [2]:
row = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(row)

[1 2 3 4 5 6 7 8 9]


To create a $9\times1$ *column* vector write,

In [3]:
col = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9]])
print(col)

[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]]


An easier method is to define a row vector and transpose it. Notice the double brackets `[[ ]]`, which make the array two-dimensional. Try to see what happens if you transpose a row vector defined with single brackets `[ ]`.

In [4]:
col = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9]]).T
print(col)

[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]]


**A short note on numpy vectors**
NumPy does not treat vectors and matrices the same. A *true* NumPy vector is one-dimensional and has the shape `(k,)`. The shape of a NumPy array is an attribute; how do you access this attribute for the `row` and `col` arrays? What is the shape of the `row.T` array?

In [5]:
# Call the shape attribute for the row and col vars. Check the shape of row.T

print(col.shape,
row.shape,
row.T.shape)

(9, 1) (9,) (9,)
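The shapes above show that `.T` leaves a 1-D array unchanged. A minimal sketch of three common ways to turn a 1-D array into an explicit column follows; the names `v`, `col_a`, `col_b` and `col_c` are illustrative and not part of the exercise.

```python
import numpy as np

v = np.arange(1, 10)            # 1-D array, shape (9,)

# Transposing a 1-D array is a no-op: there is only one axis to permute.
print(v.T.shape)

col_a = v.reshape(-1, 1)        # -1 lets NumPy infer the number of rows
col_b = v[:, np.newaxis]        # insert a new axis of length 1
col_c = np.atleast_2d(v).T      # promote to shape (1, 9), then transpose

print(col_a.shape, col_b.shape, col_c.shape)
```

All three give the same `(9, 1)` array; which one to use is mostly a matter of taste.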


To create a matrix, combine what you have learned: manually create a $3 \times 3$ matrix called `x` that contains the numbers 0 to 8.

In [6]:
# FILL IN HERE
x = np.array([[0,1,2], [3,4,5], [6,7,8]])
print(x, '\n\nThe shape of the matrix is:\n', x.shape)

[[0 1 2]
 [3 4 5]
 [6 7 8]] 

The shape of the matrix is:
 (3, 3)


Create the same $3 \times 3$ matrix using `np.arange()` and `np.reshape()`.

In [7]:
x = np.arange(0,8+1).reshape(3,3)
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

### Matrix calculations 
There are several types of matrix calculations available to us with the numpy library, and we will introduce some here.

For matrix **multiplication** of the matrices `a` and `b`, you can use `a@b`, `np.dot(a, b)` or `a.dot(b)`.

Use the `row`, `col` vectors and `x` matrix and perform these matrix multiplications. Does the `row` vector behave as you would expect?

In [8]:
row @ col
row.T @ col  # Returns a 1-D array of shape (1,), not a (1, 1) matrix: transposing the 1-D row has no effect


array([285])

In [9]:
np.dot(col,col.T)

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 2,  4,  6,  8, 10, 12, 14, 16, 18],
       [ 3,  6,  9, 12, 15, 18, 21, 24, 27],
       [ 4,  8, 12, 16, 20, 24, 28, 32, 36],
       [ 5, 10, 15, 20, 25, 30, 35, 40, 45],
       [ 6, 12, 18, 24, 30, 36, 42, 48, 54],
       [ 7, 14, 21, 28, 35, 42, 49, 56, 63],
       [ 8, 16, 24, 32, 40, 48, 56, 64, 72],
       [ 9, 18, 27, 36, 45, 54, 63, 72, 81]])

In [10]:
np.dot(x,x.T)

array([[  5,  14,  23],
       [ 14,  50,  86],
       [ 23,  86, 149]])
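The result of `@` depends on whether the operands are 1-D or 2-D. The following sketch, which rebuilds `row` and `col` so that it runs on its own, lists the shapes for the main combinations; the names `s1`, `s2`, `s3` and `outer` are illustrative.

```python
import numpy as np

row = np.arange(1, 10)                 # shape (9,)
col = row.reshape(-1, 1)               # shape (9, 1)

s1 = row @ row                         # 1-D @ 1-D: inner product, a scalar
s2 = row @ col                         # 1-D @ 2-D: result has shape (1,)
s3 = col.T @ col                       # 2-D @ 2-D: result has shape (1, 1)
outer = col @ col.T                    # (9, 1) @ (1, 9): shape (9, 9)

print(np.shape(s1), s2.shape, s3.shape, outer.shape)
# Note: col @ row raises a ValueError, because (9, 1) and (9,) do not align.
```

Keeping vectors explicitly two-dimensional avoids these differences, at the cost of a little extra reshaping.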

What happens if you use `/` and `*` operators with the  `row` and `col` vectors or the `x` matrix?

In [11]:
test = row/row
test2 = row * row
test2

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])

In [12]:
test3 = col/col
test4 = col*col
print(test3, '\n\n', test4)

[[1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]] 

 [[ 1]
 [ 4]
 [ 9]
 [16]
 [25]
 [36]
 [49]
 [64]
 [81]]


In [13]:
test_5 = x/x #dividing by zero, therefore nan
test_6 = x*x
print(test_5, test_5.shape, '\n\n', test_6, test_6.shape)

[[nan  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]] (3, 3) 

 [[ 0  1  4]
 [ 9 16 25]
 [36 49 64]] (3, 3)
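The `nan` in the top-left entry comes from the element-wise division 0/0. If that entry should be handled explicitly, one possible approach is sketched below; the names `safe` and `ratio` are illustrative, and replacing the undefined entry with 0 is an assumption about what is wanted rather than a general rule.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)

# Divide only where the denominator is non-zero; the remaining entries keep
# the value already stored in `out` (here 0.0).
safe = np.divide(x, x, out=np.zeros(x.shape), where=(x != 0))
print(safe)

# Alternatively, silence the warning locally and keep the nan.
with np.errstate(divide='ignore', invalid='ignore'):
    ratio = x / x
print(np.isnan(ratio[0, 0]))
```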




For OLS we need to be able to calculate the inverse. This is done with the `linalg` submodule. Create a new matrix that we can calculate the inverse on. Why can't we take the inverse of `x`?

In [14]:
print('the determinant of x is', np.linalg.det(x))
np.linalg.inv(x) 
# We can't take the inverse because the matrix is singular, i.e. its determinant is zero.

the determinant of x is 0.0


LinAlgError: Singular matrix

We cannot take the inverse of `x`. What do we normally need to check before we take the inverse? Which `numpy.linalg` function can help us check this?

In [15]:
print('the determinant of x is', np.linalg.det(x))

the determinant of x is 0.0
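A determinant computed in floating point can come out as a tiny non-zero number even for a singular matrix, so checking the rank is often a sturdier test. The sketch below is one way to do this, not the only one; `a` is the invertible $2 \times 2$ matrix used in the next cell and `a_inv` is an illustrative name.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)
a = np.array([[4.0, 9.0], [1.0, 3.0]])

# A square matrix is invertible only if it has full rank.
print(np.linalg.matrix_rank(x))   # x has linearly dependent rows
print(np.linalg.matrix_rank(a))

a_inv = np.linalg.inv(a)
# Multiplying a matrix by its inverse should recover the identity
# up to floating-point error.
print(np.allclose(a @ a_inv, np.eye(2)))
```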


Scalar operations can be performed as usual with `*` and `/`, and behave as expected.

In [16]:
a = np.array([[4, 9], [1, 3]])
print(a/2)
print(a*2)

[[2.  4.5]
 [0.5 1.5]]
[[ 8 18]
 [ 2  6]]


### Stack vectors or matrices together
If you have several 1-D vectors (each with the shape `(k,)`), you can use `np.column_stack()` to get a matrix with the input vectors as columns.

If you have arrays that are two-dimensional (with the shape `(k, t)`), you can use `np.hstack()` (horizontal stack). This is very useful if you already have a matrix and want to add a column vector to it.

Try to make a matrix from two `row` vectors; this should give you a $9 \times 2$ matrix.

Make a new vector and add it to the `x` matrix. The result should be a $3 \times 4$ matrix.

In [17]:
a = np.arange(0,9)
b = np.arange(0,9)
c = np.column_stack((a,b))
print(c, '\n\n', c.shape)

[[0 0]
 [1 1]
 [2 2]
 [3 3]
 [4 4]
 [5 5]
 [6 6]
 [7 7]
 [8 8]] 

 (9, 2)


In [18]:
y = np.arange(0,3).reshape(3,1)

x = np.hstack((x,y))
x

array([[0, 1, 2, 0],
       [3, 4, 5, 1],
       [6, 7, 8, 2]])

### Other methods that you need to know.
The numpy library is vast. Some other methods that are useful are `ones`, `diag`, `diagonal`, `eye`.
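A short sketch of what these four do; the variable names are illustrative.

```python
import numpy as np

ones = np.ones((3, 1))            # 3x1 column of ones, e.g. for a constant term
identity = np.eye(3)              # 3x3 identity matrix
d = np.diag([1, 2, 3])            # 1-D input: builds a diagonal matrix
m = np.arange(9).reshape(3, 3)

print(np.diag(m))                 # 2-D input: extracts the main diagonal
print(m.diagonal())               # same values, as an array method
```

Note that `np.diag` works in both directions: it builds a matrix from a 1-D input and extracts the diagonal from a 2-D input.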

## Exercise 1 - Data generation
### 1.1 
Create a synthetic dataset with the following characteristics

\begin{align}
    y_i &= \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \varepsilon_i
\end{align}

where $\beta_0=1$, $\beta_1 = -0.5$, $\beta_2 = 2$, $x_{1i} \sim \mathcal{N}(0, 4)$, $x_{2i} \sim \mathcal{N}(5, 9)$, $\varepsilon_i \sim \mathcal{N}(0, 1)$, and where $i = 0, ..., 99$. <br>
The code may look something like this:

In [68]:
# Create a seed to always have identical draws.
seed = 42
# Create a random number generator instance using this seed.
rng = random.default_rng(seed=seed)
n = 100
b = np.array([1, -0.5, 2]).reshape(-1, 1)

# Make random draws from a normal distribution.
def random_draws(n):
    # Draw from the seeded generator rng so that the results are reproducible.
    # Note: the second argument of normal() is the standard deviation.
    x0 = np.ones(n)
    x1 = rng.normal(0, 4, n)
    x2 = rng.normal(5, 9, n)
    eps = rng.normal(0, 1, n).reshape(-1, 1)

    # Stack the single columns into a matrix, return
    # the matrix along with eps.
    X = np.column_stack((x0, x1, x2))
    return X, eps

X, eps = random_draws(n)

# Create y using the betas and X.
y = X@b+eps
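NumPy's `normal` functions take the standard deviation, not the variance, as their scale argument, and the code above passes 4 and 9 directly. If $\mathcal{N}(0, 4)$ and $\mathcal{N}(5, 9)$ are instead read as mean and *variance* (an assumption about the notation, which the exercise text does not settle), the draws would use the square roots. A minimal sketch, with a hypothetical sample size `n_big` chosen large so that the sample moments are close to the population values:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_big = 100_000  # hypothetical sample size, only used for this check

x1 = rng.normal(loc=0, scale=np.sqrt(4), size=n_big)   # variance 4 -> sd 2
x2 = rng.normal(loc=5, scale=np.sqrt(9), size=n_big)   # variance 9 -> sd 3

# Sample variance of x1, sample mean and variance of x2.
print(x1.var(ddof=1), x2.mean(), x2.var(ddof=1))
```

Either reading gives a valid data set for the exercise; only the size of the standard errors of $\hat{\beta}_1$ and $\hat{\beta}_2$ changes.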

### 1.2 
Imagine that you had not generated the dataset yourself, but that you were given a similar data set that was already collected (generated) and ready to analyze. What would you observe and not observe in that data set?

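We would observe $y_i$, $x_{1i}$ and $x_{2i}$, but not the error term $\varepsilon_i$ (nor the true $\boldsymbol{\beta}$). For inference we therefore have to assume that the conditional mean of the error is zero, which is what gives consistency.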

## Exercise 2 - OLS
### 2.1
Make sure that you remember the mathematical equation for the OLS estimator, which we will later use to estimate the beta coefficients using data from the previous exercise. <br> 
**Write out the OLS estimator in matrix form:**


$$\boldsymbol{\hat{\beta}} = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X'}\mathbf{y}$$

*Hint: Look it up on p.53 in Wooldridge*
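The formula translates directly into NumPy with an explicit inverse. As a hedged aside, the same estimate can also be obtained without forming the inverse, which is usually preferred numerically. The sketch below generates its own small data set (the seed, `beta` and the regressors are illustrative, not the data from Exercise 1) and compares three routes.

```python
import numpy as np

rng = np.random.default_rng(seed=0)          # illustrative seed
n = 100
X = np.column_stack((np.ones(n), rng.normal(size=n), rng.normal(size=n)))
beta = np.array([[1.0], [-0.5], [2.0]])
y = X @ beta + rng.normal(size=(n, 1))

# Textbook formula with an explicit inverse.
b_inv = np.linalg.inv(X.T @ X) @ X.T @ y
# Solve the normal equations (X'X) b = X'y without forming the inverse.
b_solve = np.linalg.solve(X.T @ X, X.T @ y)
# Least-squares routine working on X directly.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(b_inv, b_solve), np.allclose(b_solve, b_lstsq))
```

For a well-conditioned problem like this one all three agree to floating-point precision, so the explicit inverse is fine for the exercise.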

### 2.2
As you might remember, to perform inference on the OLS estimators, we need to calculate the standard errors for the previously estimated OLS coefficients. Again, make sure you remember its equation, *and write up the OLS standard errors in matrix form:*

$\widehat{\mathrm{Var}}(\boldsymbol{\hat{\beta}}) = \hat{\sigma}^2 (\mathbf{X}'\mathbf{X})^{-1}$, for $\hat{\sigma}^2 = \frac{SSR}{N - K}$, <br>

where $SSR = \sum_{i=0}^{N - 1} \hat{u}^2_i$, $N$ is the number of observations, and $K$ is the number of explanatory variables including the constant. The standard errors are the square roots of the diagonal elements of this matrix.

*Hint: Look it up on p.55 in Wooldridge* <br>
*Hint: Remember that the variance is a function of $\hat{\sigma}^2$, which is calculated using SSR*
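
For reference, a sketch of where the variance formula comes from, assuming the model $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}$ with homoskedastic errors, $\mathrm{Var}(\mathbf{u} \mid \mathbf{X}) = \sigma^2 \mathbf{I}_N$ (this is one of the classical assumptions; the symbol $\mathbf{I}_N$ for the identity matrix is introduced here for illustration):

\begin{align}
    \boldsymbol{\hat{\beta}} &= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\boldsymbol{\beta} + \mathbf{u}) = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u} \\
    \mathrm{Var}(\boldsymbol{\hat{\beta}} \mid \mathbf{X}) &= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}' \, \mathrm{Var}(\mathbf{u} \mid \mathbf{X}) \, \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}
\end{align}

Replacing the unknown $\sigma^2$ with the estimate $\hat{\sigma}^2$ gives the estimator used in practice.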

### 2.3
Estimate $\boldsymbol{\hat{\beta}}$ from the synthetic data set. Furthermore, calculate standard errors and t-values (assuming that the assumptions of the classical linear regression model are satisfied). The code may look something like this:

In [27]:
# Check that the shapes of the two factors are conformable.
print(la.inv((X.T@X)).shape, (X.T@y).shape)

(3, 3) (3, 1)

In [72]:
def ols_estimation(y, X):
    # Make sure that y and X are 2-D.
    y = y.reshape(-1, 1)
    if len(X.shape)<2:
        X = X.reshape(-1, 1) 

    # Estimate beta
    b_hat = la.inv(X.T@X)@X.T@y

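    # Calculate standard errors
    N, K = X.shape  # observations and regressors, taken from X rather than from globals
    residual = y-X@b_hat
    sigma2 = residual.T@residual/(N-K)  # estimate of the error variance, SSR/(N-K)
    cov = sigma2*la.inv(X.T@X)  # estimated covariance matrix of b_hat
    se = np.sqrt(cov.diagonal()).reshape(-1, 1)  # square roots of the diagonal elements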

    # Calculate t-values
    t_values = b_hat/se
    return b_hat, se, t_values

b_hat, se, t_values = ols_estimation(y, X)

$$
\widehat{\text{s.e.}}\left(\hat{\beta}_j\right)=\sqrt{\hat{\sigma}^2\left[\left(\mathbf{X}'\mathbf{X}\right)^{-1}\right]_{jj}}
$$

(Source: Wikipedia)

NumPy stores vectors as one-dimensional rather than two-dimensional objects by default. This can cause havoc when we want to compute matrix products. Compute the outer and inner products of the residuals from above and compare these with your results when using matrix multiplication `@`.

In [89]:
res = y-X@b_hat
elementwise = res*res  # element-wise squares, shape (100, 1); this is not an outer product
outer2 = np.outer(res,res)
inner = res.T@res
inner2 = np.inner(res,res)

In [95]:
(res*res).shape

(100, 1)

In [98]:
# Use new names here so that y and X from Exercise 1 are not overwritten.
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

c = np.outer(u,v)
c

array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])

In [104]:
res = y-X@b_hat
print(res.shape)
outer = np.outer(res,res)
inner = np.inner(res,res)
matmul_inner = res.T@res
matmul_outer = res@res.T
print(inner.shape)
print(outer.shape)
print(matmul_inner.shape)
print(matmul_outer.shape)
# For 2-D input, np.inner returns the (100, 100) matrix of pairwise products.
# Its trace equals the sum of squared residuals from res.T@res.
print(np.isclose(np.sum(np.diag(inner)), matmul_inner.item()))

(100, 1)
(100, 100)
(100, 100)
(1, 1)
(100, 100)
True


Now if we flatten the residuals so that they are stored in NumPy's default mode (i.e. one-dimensional), what happens?

In [105]:
res=res.flatten()
print(res.shape)
outer = np.outer(res,res)
inner = np.inner(res,res)
matmul_inner = res.T@res
matmul_outer = res@res.T
print(inner.shape)
print(outer.shape)
print(matmul_inner.shape)
print(matmul_outer.shape)
# @ gives the expected inner and outer products for 2-D arrays of shape (k, 1)
# np.inner gives the scalar inner product only for 1-D vectors of shape (k,)

(100,)
()
(100, 100)
()
()


I have written code to print a table using the `tabulate` package. You will need to add the row names for this code to work; each row contains information about the coefficient on one explanatory variable.

In [115]:
def print_table(row_names, b, b_hat, se, t_values):
    table = []

    # Make a list, where each row contains the estimated and calculated values.
    for index, name in enumerate(row_names):
        table_row = [
            name, b[index], b_hat[index], se[index], t_values[index]
        ]
        table.append(table_row)

    # Print the list using the tabulate function.
    headers = ['', '\u03b2', '\u03b2\u0302 ', 'Se', 't-value']
    print('OLS Estimates:\n')
    print(tabulate(table, headers, floatfmt=['', '.1f', '.3f', '.3f', '.1f']))

# Label the rows 0, 1, 2 to match the indices of the betas.
row_names = list(range(len(b_hat)))

print_table(row_names, b, b_hat, se, t_values)

OLS Estimates:

       β      β̂      Se    t-value
--  ----  ------  -----  ---------
 0   1.0   0.991  0.103        9.6
 1  -0.5  -0.432  0.024      -17.8
 2   2.0   2.000  0.009      215.1


Alternatively, you can print a table that you can paste straight into LaTeX using the following code. This uses pandas DataFrames, which we'll cover next week.

In [116]:
import pandas as pd
dat = pd.DataFrame(zip(b,b_hat.round(4),se.round(4),t_values.round(4)))
dat.columns = ['\u03b2','\u03b2\u0302','se','t-values']
dat.index = ['beta1','beta2','beta3']
print(dat.style.to_latex())

\begin{tabular}{lllll}
 & β & β̂ & se & t-values \\
beta1 & [1.] & [0.9912] & [0.1031] & [9.614] \\
beta2 & [-0.5] & [-0.4321] & [0.0243] & [-17.809] \\
beta3 & [2.] & [2.0005] & [0.0093] & [215.129] \\
\end{tabular}



## Exercise 3 - a simple Monte Carlo Experiment
Carry out a Monte Carlo experiment with $S = 200$ replications and $N = 100$ observations to check if the OLS estimator provides an unbiased estimate of $\boldsymbol{\beta}$.
### 3.1
Generate 200 data sets similar to what you did in exercise 1, and estimate $\boldsymbol{\beta}$ on each of them.

*Hint:* Start by preallocating two arrays using `np.zeros`: one to store the estimated beta coefficients and one to store the estimated standard errors. What shape should these arrays have?

Then write a loop in which each iteration makes a random draw, estimates the model on that draw, and stores the estimated coefficients and standard errors.

In [121]:
# Initialize the variables and lists
s = 200
n = 100

# Allocate memory for arrays to later fill
b_coeffs = np.zeros((s, b.size))
b_ses = np.zeros((s, b.size))

for i in range(s):
    # Generate data
    X, eps = random_draws(n)
    y = X@b+eps

    # Estimate coefficients and variance
    b_hat, se, t_values = ols_estimation(y,X)

    # Store estimates
    b_coeffs[i, :] = b_hat.flatten()
    b_ses[i, :] = se.flatten()

# Make sure that there are no more zeros left in the arrays.
assert np.all(b_coeffs) and np.all(b_ses), 'Not all coefficients or standard errors are non-zero.'


### 3.2
Do the following three calculations:
- Calculate the means of the estimates (means across simulations)
- Calculate the means of the standard errors (means across simulations)
- Calculate the standard error of the MC estimates

In [128]:
b_coeffs.mean(axis=0)

array([ 1.02255108, -0.50007585,  1.99859394])

In [130]:
mean_b_hat = b_coeffs.mean(axis=0)  # mean over the s simulations, one value per coefficient
mean_b_se = b_ses.mean(axis=0)  # mean over the s simulations, one value per coefficient
# Standard error of the MC estimates: sample standard deviation across simulations.
mean_mc_se = np.sqrt(np.sum((b_coeffs - mean_b_hat)**2, axis=0)/(s - 1))

### 3.3
Draw a histogram for the 200 estimates of $\beta_1$

In [None]:
# Fill in here
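
One possible way to fill in this cell is sketched below. So that it runs on its own, the sketch repeats a compact version of the Monte Carlo loop; the seed, the name `b1_hats`, the number of bins and the use of `np.linalg.solve` are illustrative choices, and 4 and 9 are used as standard deviations as in Exercise 1.

```python
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(seed=42)
s, n = 200, 100
beta = np.array([1.0, -0.5, 2.0])

b1_hats = np.zeros(s)
for i in range(s):
    # Same design as in Exercise 1: a constant and two normal regressors.
    X = np.column_stack((np.ones(n), rng.normal(0, 4, n), rng.normal(5, 9, n)))
    y = X @ beta + rng.normal(0, 1, n)
    # Keep only the estimate of beta_1 (index 1).
    b1_hats[i] = np.linalg.solve(X.T @ X, X.T @ y)[1]

fig, ax = plt.subplots()
ax.hist(b1_hats, bins=20, edgecolor='black')
ax.axvline(beta[1], color='red', linestyle='--', label='true value')
ax.set_xlabel('estimate of beta_1')
ax.set_ylabel('frequency')
ax.legend()
plt.show()
```

With the arrays from 3.1 already in memory, `b_coeffs[:, 1]` can be passed to `ax.hist` directly instead of rerunning the simulation.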