# Lecture 34
# A Look Ahead; Regression Example, Sampling from a Finite Population

## The Top-10 List

1. Conditioning ... is the soul of statistics!
1. Symmetry ... is powerful but dangerous
1. Random variables and their distributions
1. Stories (proofs, backgrounds of the distributions covered)
1. Linearity
1. Indicator random variables
1. LOTUS
1. Law of Large Numbers
1. Central Limit Theorem
1. Markov Chains

Items 1 through 4 deal with the Big Picture<sup>&trade;</sup> questions: _What is randomness? How do we think about uncertainty?_

Items 5 through 7 are tools for computing expected values (and, through them, means, variances &amp; standard deviations).

Items 8 through 10 are important for understanding long-run behavior.

## Where to go from here?

Some topics to study from here on out:

* Statistical inference (we have data, need to estimate parameters or make predictions)
* Regression &amp; linear models
* Finance
* Computational biology
* Stochastic processes

## Advice

* Learn R
* Learn C
* Read Mostly Harmless Econometrics

## Ex. A Simple Linear Regression

You've seen this before:

\begin{align}
  Y &= \beta_0 + \beta_1 \, X + \epsilon
\end{align}

* We want to use $X$ to predict $Y$
* $\beta_0$ and $\beta_1$ are the linear coefficients: $\beta_0$ is the intercept, the predicted value of $Y$ when $X=0$ (the default value), and $\beta_1$ is the slope
* $\epsilon$ is the error term (since $X$ does not predict $Y$ perfectly)
* a common assumption is $\mathbb{E}(\epsilon | X) = 0$ (the error is centered at 0 given $X$; $\epsilon$'s distribution may or may not be normal)

So how would we solve for $\beta_1$?

We can start by treating $Cov$ as an _operator_!

\begin{align}
  Cov(Y, X) &= Cov\left( (\beta_0 + \beta_1 \, X + \epsilon), X \right) \\
  &= Cov(\beta_0, X) + Cov\left( (\beta_1 \, X), X\right) + Cov(\epsilon, X) \\
  \\
  \text{now } Cov(\beta_0, X) &= 0 &\quad \text{ since } Cov \text{ of constant with anything is } 0 \\
  \\
  \text{and } Cov\left( (\beta_1 \, X), X\right) &= \beta_1 \, Cov(X, X) &\quad \text{since constants factor out of } Cov \\
  &= \beta_1 \, Var(X) &\quad \text{by definition of } Var \\
  \\
  \text{and since } \mathbb{E}(\epsilon) &= \mathbb{E}\left( \mathbb{E}(\epsilon|X) \right) = \mathbb{E}(0) = 0 \\
  \text{and further } \mathbb{E}(\epsilon \, X) &= \mathbb{E}\left( \mathbb{E}(\epsilon \, X | X)  \right) &\quad \text{ by Adam's Law} \\
  &= \mathbb{E}\left( X \mathbb{E}(\epsilon | X) \right) &\quad \text{ since } X \text{ is known, we can pull it out} \\
  &= \mathbb{E}(0) \\
  &= 0 \\
  \text{so }Cov(\epsilon, X) &= \mathbb{E}(\epsilon \, X) - \mathbb{E}(\epsilon) \, \mathbb{E}(X) \\
  &= 0 - 0 = 0 \\
  \\
  \text{so } Cov(Y, X) &= \beta_1 \, Var(X) \\
  \Rightarrow \beta_1 &= \frac{Cov(X,Y)}{Var(X)} &\quad \text{(population version)} 
\end{align}
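
The derivation can be sanity-checked by simulation. The sketch below is only an illustration under assumed values: the "true" coefficients `beta_0 = 1.5` and `beta_1 = 2.0`, the Exponential distribution for $X$, and the form of the error term are all hypothetical choices, not part of the lecture. The error is built so that $\mathbb{E}(\epsilon | X) = 0$ even though it is neither normal nor independent of $X$.

```python
import numpy as np

rng = np.random.default_rng(110)  # arbitrary seed, for reproducibility only

n = 200_000
beta_0, beta_1 = 1.5, 2.0  # hypothetical "true" coefficients (assumption)

# X can have any distribution with finite variance; Exponential is an arbitrary choice
X = rng.exponential(scale=3.0, size=n)

# Error term with E(eps | X) = 0: a symmetric Uniform(-1, 1) draw scaled by X,
# so eps is non-normal and its spread depends on X
eps = X * rng.uniform(-1.0, 1.0, size=n)

Y = beta_0 + beta_1 * X + eps

# Sample versions of the quantities in the derivation
cov_eps_x = np.cov(eps, X)[0, 1]                     # should be near 0
beta_1_hat = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)  # should be near beta_1

print(cov_eps_x, beta_1_hat)
```

With a large sample, `cov_eps_x` should land close to 0 and `beta_1_hat` close to the assumed slope, which is what the population formula predicts.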

### Calculate $\beta_1$ with $Cov(X,Y)$ and $Var(X)$

```python
import numpy as np

X = np.array([95, 85, 80, 70, 60])
Y = np.array([85, 95, 70, 65, 70])

# np.cov(X, Y) returns the 2x2 sample covariance matrix
# [ Cov(X,X), Cov(X,Y) ]
# [ Cov(Y,X), Cov(Y,Y) ]
# where Cov(X,X) is the sample variance of X
covM = np.cov(X, Y)

# slope = Cov(X,Y) / Var(X)
beta_1 = covM[0, 1] / covM[0, 0]
beta_1
```

0.64383561643835618

### Calculate $\beta_1$ via sklearn LinearRegression API

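```python
from sklearn import linear_model

regr = linear_model.LinearRegression()

# fit() expects a 2-D feature array of shape (n_samples, n_features),
# so reshape X into a single column (X and Y are defined in the cell above)
regr.fit(X.reshape(-1, 1), Y).coef_[0]
```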

0.64383561643835607
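
The intercept follows from the same model: taking expectations of both sides and using $\mathbb{E}(\epsilon) = 0$ gives $\mathbb{E}(Y) = \beta_0 + \beta_1 \, \mathbb{E}(X)$, so $\beta_0 = \mathbb{E}(Y) - \beta_1 \, \mathbb{E}(X)$. A minimal sketch of the sample version on the same five data points, cross-checked against `np.polyfit` (the cross-check is an addition here, not part of the lecture):

```python
import numpy as np

X = np.array([95, 85, 80, 70, 60])
Y = np.array([85, 95, 70, 65, 70])

# Slope from the covariance formula (sample versions, ddof=1 in both)
beta_1 = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

# Intercept from E(Y) = beta_0 + beta_1 * E(X), using sample means
beta_0 = Y.mean() - beta_1 * X.mean()

# Cross-check: a degree-1 least-squares fit returns [slope, intercept]
slope, intercept = np.polyfit(X, Y, deg=1)

print(beta_0, beta_1)
print(intercept, slope)
```

Both lines should print the same pair of numbers up to floating-point rounding.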

----