# Linear Regression Models

**References:**
- William Mendenhall, Dennis Wackerly, Richard Scheaffer: Mathematical Statistics with Applications
- MIT 18.S096 - Topics in Mathematics with Applications in Finance, [Lecture 6: Regression Analysis](https://ocw.mit.edu/courses/mathematics/18-s096-topics-in-mathematics-with-applications-in-finance-fall-2013/video-lectures/lecture-6-regression-analysis)
- [Trevor Hastie, Robert Tibshirani, Jerome Friedman: The Elements of Statistical Learning](https://web.stanford.edu/~hastie/Papers/ESLII.pdf)


\begin{equation}
\boxed{
    y_{i} = \beta_{0} + \sum_{j=1}^{k} \beta_{j} ~ g_{j}(x_{ij}) + \epsilon_{i}}
\end{equation}

where:
- $y_{i}$ is the dependent (response) variable ($i=1,...,n$)
- $x_{ij}$ are the independent variables ($i=1,...,n$, $j=1,...,k$)
- $\beta_{j}$ are the regression parameters ($j=0,...,k$)
- $\epsilon_{i}$ is the residual, or error term ($i=1,...,n$)
- $g_{j}(x)$ is some arbitrary function of $x$ ($j=1,...,k$)

Assuming that the expected value of the residual term is 0:

\begin{equation}
    E[\epsilon_{i}]
    =
    0
\end{equation}

The expected value of $y_{i}$ is:

\begin{equation}
    E[y_{i}]
    =
    \beta_{0} + \sum_{j=1}^{k} \beta_{j} ~ g_{j}(x_{ij})
\end{equation}
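
The expectation above can be checked numerically. The following is a minimal sketch, not part of the original notebook: the basis functions `g`, the coefficients `beta`, and the sample sizes are hypothetical choices made only for illustration. It averages $y_{i}$ over many draws of a zero-mean error term and compares the result with the deterministic part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical choices for illustration: k = 2 basis functions g_1, g_2
g = [np.sin, np.square]
beta = np.array([1.0, 2.0, -0.5])  # beta_0, beta_1, beta_2

n = 5
x = rng.uniform(-1.0, 1.0, size=(n, len(g)))  # x[i, j-1] plays the role of x_ij

# Deterministic part of the model: beta_0 + sum_j beta_j * g_j(x_ij)
G = np.column_stack([g_j(x[:, j]) for j, g_j in enumerate(g)])
mean_y = beta[0] + G @ beta[1:]

# Draw many realisations of the error term and average y over them.
# The errors are normal here for convenience; only the zero mean matters.
n_draws = 200_000
eps = rng.normal(loc=0.0, scale=1.0, size=(n_draws, n))
y = mean_y + eps          # each row is one realisation of (y_1, ..., y_n)
y_avg = y.mean(axis=0)

# The largest deviation between the empirical mean and the model mean is small
print(np.max(np.abs(y_avg - mean_y)))
```

The model is called linear because it is linear in the parameters $\beta_{j}$; the functions $g_{j}$ themselves may be non-linear, as `np.sin` and `np.square` are in this sketch.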

The sum of squares for error (SSE), where $\hat{y}_{i}$ is the fitted value for observation $i$:

\begin{equation}
    \text{SSE}
    =
    \sum_{i=1}^{n}
    \left( 
    y_{i} - \hat{y}_{i}
    \right)^{2}
\end{equation}
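
As a hedged illustration, the SSE can be computed from a least-squares fit. The sketch below assumes that the fitted values $\hat{y}_{i}$ come from ordinary least squares and that every $g_{j}$ is the identity; the synthetic data and the coefficient values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (hypothetical): y = 1 + 2*x1 - 0.5*x2 + noise
n = 100
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([2.0, -0.5]) + rng.normal(scale=0.3, size=n)

# Prepend a column of ones so that the first coefficient is the intercept beta_0
A = np.column_stack([np.ones(n), X])

# Least-squares estimate: the coefficient vector that minimises SSE
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Fitted values and the sum of squares for error
y_hat = A @ beta_hat
sse = np.sum((y - y_hat) ** 2)

# Any other coefficient vector gives an SSE at least as large
sse_perturbed = np.sum((y - A @ (beta_hat + 0.1)) ** 2)
print(sse, sse_perturbed)
```

Shifting the estimated coefficients away from `beta_hat` can only increase the SSE, which is what makes the least-squares solution the minimiser.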

A common additional assumption is that the residual term follows a normal distribution with zero mean and variance $\sigma^{2}$:

\begin{equation}
    \epsilon_{i}
    \sim
    \mathcal{N}(\mu=0, \sigma^{2})
\end{equation}
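
One consequence of this assumption can be sketched as follows. Assuming in addition that the $\epsilon_{i}$ are independent and share the same variance $\sigma^{2}$ (neither is stated above), the log-likelihood of the observations is:

\begin{equation}
    \ln L(\beta, \sigma^{2})
    =
    -\frac{n}{2} \ln \left( 2 \pi \sigma^{2} \right)
    -
    \frac{1}{2 \sigma^{2}}
    \sum_{i=1}^{n}
    \left(
    y_{i} - \beta_{0} - \sum_{j=1}^{k} \beta_{j} ~ g_{j}(x_{ij})
    \right)^{2}
\end{equation}

Only the second term depends on the $\beta_{j}$, so under these assumptions maximising the likelihood with respect to the regression parameters is equivalent to minimising the sum of squared errors.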

## Simple Linear Regression ($y=\beta_{0} + \beta_{1} x + \epsilon$)

[Relegated to a separate notebook](./simple_linear_regression.ipynb)

## Multiple Linear Regression

In [1]:
import scipy.stats
import statsmodels
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn import datasets
import matplotlib.pyplot as plt
%matplotlib inline