# Linear Regression
## Definition
In statistics, linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). *-From Wikipedia*

## Formula
### Simple Linear Regression
In simple linear regression, we model the relationship between two variables, for example income (the response $y$) and number of years of education (the predictor $x$).  
$$y = \beta_0 + \beta_1x + \varepsilon$$
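
The formula above can be tried out on simulated data. The following is a minimal sketch, assuming the coefficients are estimated by ordinary least squares (an estimation method this section has not introduced yet). The data, the "true" values `beta0_true` and `beta1_true`, and the noise level are made up purely for illustration.

```python
import numpy as np

# Simulated data (illustrative only): x plays the role of years of education,
# y the role of income, generated from y = beta_0 + beta_1 * x + noise.
rng = np.random.default_rng(0)
x = rng.uniform(8, 20, size=200)
beta0_true, beta1_true = 10.0, 2.5  # hypothetical "true" parameters
y = beta0_true + beta1_true * x + rng.normal(0, 1.0, size=200)

# Closed-form least-squares estimates for a single predictor:
#   slope     = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
#   intercept = y_mean - slope * x_mean
x_centered = x - x.mean()
beta1_hat = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

print("estimated intercept:", beta0_hat)
print("estimated slope:", beta1_hat)
```

With enough observations, the estimates should land close to the values used to generate the data. `np.polyfit(x, y, 1)` fits the same line and can serve as a cross-check.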

### Multiple Linear Regression
In multiple regression, we attempt to predict a dependent or response variable $y$ on the basis of an assumed linear relationship with several independent or predictor variables $x_1, x_2, \dots, x_k$.  
$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \dots + \beta_kx_k + \varepsilon$$   
where  
$y$: dependent or response variable (continuous)  
$x$: independent or predictor variable (discrete or continuous)  
$\varepsilon$: error term in the model. In this context, error does not mean mistake but is a statistical term representing random fluctuations, measurement errors, or the effect of factors outside of our control  
$\beta_0$: constant or intercept of the regression; the expected value of $y$ when all $x_i = 0$  
$\beta_i$: coefficient (weight) of $x_i$; measures the expected change in $y$ for a one-unit increase in $x_i$, holding the other predictors fixed  
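
The multiple regression model can be written in matrix form by adding a column of ones for the intercept. The sketch below assumes a least-squares fit via `np.linalg.lstsq` and uses simulated data; `beta_true`, the sample size, and the noise level are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 3  # number of observations and number of predictors

# Simulated predictors and response (illustrative only)
X = rng.normal(size=(n, k))
beta_true = np.array([4.0, 1.5, -2.0, 0.5])  # [beta_0, beta_1, beta_2, beta_3]
eps = rng.normal(0, 0.5, size=n)              # error term
y = beta_true[0] + X @ beta_true[1:] + eps

# Prepend a column of ones so that beta_0 is estimated with the other weights
X_design = np.column_stack([np.ones(n), X])

# Least-squares solution of X_design @ beta ≈ y
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print("estimated [beta_0, ..., beta_k]:", beta_hat)
```

The first entry of `beta_hat` estimates the intercept $\beta_0$, and the remaining entries estimate the weights $\beta_1, \dots, \beta_k$.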

## Assumptions
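Inference from a linear regression model (for example, confidence intervals and hypothesis tests on the coefficients) is typically valid when the following assumptions hold: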
1. **Linearity**: The relationship between X and the mean of Y is linear.  

<center>$E(\varepsilon) = 0$, or, equivalently, $E(y) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \dots + \beta_kx_k$</center>

2. **Homoscedasticity**: The variance of the residuals is the same for any value of X.  

<center>$var(\varepsilon) = \sigma^2$, or, equivalently, $var(y) = \sigma^2$</center>  

3. **Independence**: Observations are independent of each other.

<center>$cov(\varepsilon_i, \varepsilon_j) = 0$ for all $i \ne j$, or, equivalently, $cov(y_i, y_j) = 0$</center>  

4. **Normality**: For any fixed value of X, Y is normally distributed.

<center>$\varepsilon \sim N(0, \sigma^2)$</center>  
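
The four assumptions above can be probed informally through the residuals of a fitted model. The sketch below uses simulated data that satisfies all four assumptions by construction. The specific checks (variance ratio, lag-1 correlation, skewness and kurtosis) are simple illustrative choices rather than formal statistical tests.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Simulated data that satisfies the assumptions (illustrative only)
x = rng.uniform(0, 10, size=n)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, size=n)

# Least-squares fit with an intercept column
A = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ beta_hat
resid = y - fitted

# 1. Zero-mean errors: with an intercept, least-squares residuals average to
#    zero by construction, so a residual-vs-fitted plot is the more
#    informative check for non-linearity.
mean_resid = resid.mean()

# 2. Homoscedasticity: compare residual variance for low vs. high fitted values.
order = np.argsort(fitted)
low, high = resid[order[: n // 2]], resid[order[n // 2:]]
var_ratio = high.var() / low.var()  # close to 1 under constant variance

# 3. Independence: lag-1 correlation of residuals in observation order.
lag1_corr = np.corrcoef(resid[:-1], resid[1:])[0, 1]  # close to 0 if uncorrelated

# 4. Normality: sample skewness and excess kurtosis (both 0 for a normal law).
z = (resid - resid.mean()) / resid.std()
skew = np.mean(z ** 3)
excess_kurt = np.mean(z ** 4) - 3.0

print(mean_resid, var_ratio, lag1_corr, skew, excess_kurt)
```

A variance ratio far from 1, a large lag-1 correlation, or strongly non-zero skewness or kurtosis would suggest that the corresponding assumption deserves a closer look.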


## Estimation

## Illustration

## Example

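In [20]:
import numpy as np
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell requires scikit-learn < 1.2.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Load the feature matrix X and the target vector y
X, y = load_boston(return_X_y=True)
print(X.shape)
print(y.shape)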

(506, 13)
(506,)


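In [32]:
# Hold out 30% of the data for testing; a fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
lreg = LinearRegression()
model = lreg.fit(X_train, y_train)  # fit() returns the fitted estimator itself
print("R2 score for training set is:", model.score(X_train, y_train))
print(model.intercept_)  # estimated intercept (beta_0)
print(model.coef_)       # estimated coefficients (beta_1, ..., beta_13)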

R2 score for training set is: 0.7434997532004697
31.631084035694585
[-1.33470103e-01  3.58089136e-02  4.95226452e-02  3.11983512e+00
 -1.54170609e+01  4.05719923e+00 -1.08208352e-02 -1.38599824e+00
  2.42727340e-01 -8.70223437e-03 -9.10685208e-01  1.17941159e-02
 -5.47113313e-01]


In [36]:
# Predict on the held-out test set and compute the mean squared error (MSE)
y_predicted = model.predict(X_test)
error = y_test - y_predicted
print(np.mean(error ** 2))

21.517444231176995
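
The value printed above is the mean squared error (MSE) on the test set, expressed in squared units of the target. Its square root (about 4.64 here) is in the target's own units and is often easier to interpret. The sketch below shows how the MSE, its square root (RMSE), and $R^2$ relate to one another. The helper `regression_metrics` and the small arrays are hypothetical, introduced only for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_true - y_pred
    mse = np.mean(error ** 2)                       # mean squared error
    rmse = np.sqrt(mse)                             # same units as the target
    ss_res = np.sum(error ** 2)                     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    return mse, rmse, r2

# Tiny made-up example
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.0]
mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(mse, rmse, r2)
```

Calling `regression_metrics(y_test, y_predicted)` with the arrays from the cell above would give the test-set counterparts of the training $R^2$ printed earlier.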
