# Simple Linear Regression 

## Simple Linear Regression Model

The simple linear regression model has a single independent variable (regressor) $x$ that is linearly related to the dependent variable $y$. The model can be written as:
$$
    y = \beta_0 + \beta_1 x + \epsilon \tag{2.1}
$$
In this formula, the parameter $\beta_1$ is the slope and $\beta_0$ is the intercept. The unobservable errors $\epsilon$ are i.i.d. (independent and identically distributed) random variables with mean zero and constant variance $\sigma^2$.

The independent variable is viewed as controlled by the experimenter, so it is treated as non-stochastic, whereas $y$ is viewed as a random variable with a probability distribution. The mean of this distribution is:
$$
    E(y\vert x) = \beta_0 + \beta_1 x \tag{2.2a}
$$
and the variance is:
$$
    Var(y\vert x) = Var(\beta_0 + \beta_1 x + \epsilon) = Var(\epsilon) = \sigma^2 \tag{2.2b}
$$
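
Equations (2.2a) and (2.2b) can be checked by simulation. The sketch below is only an illustration: the parameter values ($\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 2$) are made up, and the errors are assumed to be normal, which the model above does not require (it only asks for mean zero and constant variance).

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical parameter values chosen for illustration
beta_0 <- 1
beta_1 <- 2
sigma  <- 2

# Hold the regressor fixed at one value and draw many responses
x_fixed <- 3
n_rep   <- 10000
eps     <- rnorm(n_rep, mean = 0, sd = sigma)  # normal errors are an assumption
y_sim   <- beta_0 + beta_1 * x_fixed + eps

mean(y_sim)  # should be close to beta_0 + beta_1 * x_fixed = 7
var(y_sim)   # should be close to sigma^2 = 4
```

The sample mean and variance of `y_sim` only approximate the theoretical values; the approximation improves as `n_rep` grows.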

## Least Squares Estimation

Suppose a sample of $n$ paired observations $(x_i, y_i), \ i = 1, 2, \dots, n$, is available. These observations are assumed to satisfy the simple linear regression model, so we can write:
$$
    y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad (i = 1, 2, \dots, n) \tag{2.3}
$$
Equation $(2.1)$ is sometimes called the **population regression model** and equation $(2.3)$ the **sample regression model**, since the latter is written in terms of the $n$ pairs of data. The least squares criterion, which the estimates must minimize, is therefore:
$$
    S(\beta_0, \beta_1) = \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2 \tag{2.4}
$$
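
To make the criterion (2.4) concrete, here is a minimal sketch of $S$ as an `R` function. The five data points are made up for illustration and are not the example data used later; the function name `S` simply mirrors the notation above.

```r
# Made-up illustrative data, not the example data set
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Least squares criterion S(beta_0, beta_1) from (2.4)
S <- function(beta_0, beta_1) {
  sum((y - beta_0 - beta_1 * x)^2)
}

S(0, 2)  # a line close to the data gives a small sum of squares
S(1, 1)  # a poorly fitting line gives a much larger one
```

Least squares estimation looks for the pair $(\beta_0, \beta_1)$ that makes this value as small as possible.
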
The **least squares estimators** of $\beta_0$ and $\beta_1$, denoted $\hat{\beta}_0$ and $\hat{\beta}_1$, must satisfy the conditions:
$$
    \frac{\partial S}{\partial \beta_0} \vert_{\hat{\beta}_0, \hat{\beta}_1} = -2 \sum_{i = 1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0
$$
and 
$$
    \frac{\partial S}{\partial \beta_1} \vert_{\hat{\beta}_0, \hat{\beta}_1} = -2 \sum_{i = 1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) x_i = 0
$$
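
These two first-order conditions can be checked numerically. The sketch below uses made-up data and takes the estimates from `lm()`, which fits by ordinary least squares; it then evaluates both partial derivatives of $S$ at those estimates.

```r
# Made-up illustrative data, not the example data set
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit <- lm(y ~ x)         # ordinary least squares fit
b   <- unname(coef(fit)) # b[1] = intercept, b[2] = slope

# Deviations of y from the fitted line
r <- y - b[1] - b[2] * x

# Partial derivatives of S with respect to beta_0 and beta_1
grad <- c(-2 * sum(r), -2 * sum(r * x))
grad  # both components are zero up to rounding error
```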

Simplifying these two equations gives:
$\begin{align*}
    n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^n x_i &= \sum_{i=1}^n y_i \\
    \hat{\beta}_0 \sum_{i=1}^n x_i + \hat{\beta}_1 \sum_{i=1}^n x_i^2 &= \sum_{i=1}^n x_i y_i \tag{2.5}
\end{align*}$

Equations $(2.5)$ are called the **least squares normal equations**. Their solution is:

<center>
$\begin{align*}
    \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \tag{2.6}\\
    \hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \frac{(\sum_{i=1}^n x_i)(\sum_{i=1}^n y_i)}{n}}{\sum_{i=1}^n x_i^2 - \frac{(\sum_{i=1}^n x_i)^2}{n}} \tag{2.7}
\end{align*}$
</center>
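
As a quick check, the normal equations (2.5) can be solved directly as a $2 \times 2$ linear system and compared with the closed-form solution (2.6) and (2.7). This is a sketch on made-up data; the names `A`, `b` and `beta_hat` are introduced here for illustration only.

```r
# Made-up illustrative data, not the example data set
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
n <- length(x)

# Coefficient matrix and right-hand side of the normal equations (2.5)
A <- matrix(c(n,      sum(x),
              sum(x), sum(x^2)), nrow = 2, byrow = TRUE)
b <- c(sum(y), sum(x * y))
beta_hat <- solve(A, b)  # c(intercept, slope)

# Closed-form solution (2.6) and (2.7)
hat_beta_1 <- (sum(x * y) - sum(x) * sum(y) / n) / (sum(x^2) - sum(x)^2 / n)
hat_beta_0 <- mean(y) - hat_beta_1 * mean(x)

all.equal(beta_hat, c(hat_beta_0, hat_beta_1))  # the two routes agree
```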

In $(2.6)$, $\bar{y} = \frac{1}{n} \displaystyle \sum_{i=1}^n y_i$ and $\bar{x} = \frac{1}{n} \displaystyle \sum_{i=1}^n x_i$ are the averages of $y_i$ and $x_i$. Therefore, $\hat{\beta}_0$ and $\hat{\beta}_1$ are the **least squares estimators** of the intercept and slope, and the fitted simple linear regression model is:
$$
    \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \tag{2.8}
$$

From $(2.7)$, we can see that the **denominator** is the corrected sum of squares of the $x_i$, which can be written as $S_{xx}$:
$$
    S_{xx} = \sum_{i=1}^n x_i^2 - \frac{(\sum_{i=1}^n x_i)^2}{n} = \sum_{i=1}^n (x_i - \bar{x})^2 \tag{2.9}
$$
Similarly:
$\begin{align*}
    S_{xy} &= \sum_{i=1}^n x_i y_i - \frac{(\sum_{i=1}^n x_i)(\sum_{i=1}^n y_i)}{n} = \sum_{i=1}^n y_i (x_i - \bar{x}) \tag{2.10a} \\
    S_{yy} &= \sum_{i=1}^n y_i^2 - \frac{(\sum_{i=1}^n y_i)^2}{n} = \sum_{i=1}^n (y_i - \bar{y})^2 \tag{2.10b}
\end{align*}$

Hence, $(2.7)$ simplifies to:
$$
    \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \tag{2.11}
$$
$$
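
The raw-sum and centred forms of $S_{xx}$, $S_{xy}$ and $S_{yy}$ are algebraically equal, which is easy to confirm numerically. The sketch below again uses made-up data, and the `_raw` / `_centred` variable names are only for illustration.

```r
# Made-up illustrative data, not the example data set
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
n <- length(x)

# Raw-sum forms
S_xx_raw <- sum(x^2) - sum(x)^2 / n
S_xy_raw <- sum(x * y) - sum(x) * sum(y) / n
S_yy_raw <- sum(y^2) - sum(y)^2 / n

# Centred forms
S_xx_centred <- sum((x - mean(x))^2)
S_xy_centred <- sum(y * (x - mean(x)))
S_yy_centred <- sum((y - mean(y))^2)

all.equal(S_xx_raw, S_xx_centred)
all.equal(S_xy_raw, S_xy_centred)
all.equal(S_yy_raw, S_yy_centred)

S_xy_raw / S_xx_raw  # slope estimate, as in (2.11)
```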

The difference between the observed value $y_i$ and the corresponding fitted (or predicted) value $\hat{y}_i$ is called a **residual**. The $i^{th}$ residual is defined as: $e_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i), \quad i = 1, 2, \dots, n$
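
A short sketch of the residual calculation, on made-up data. Because the estimates satisfy the two normal equations, the residuals sum to zero and are orthogonal to the regressor; the code checks both properties.

```r
# Made-up illustrative data, not the example data set
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Least squares estimates of the slope and intercept
hat_beta_1 <- sum(y * (x - mean(x))) / sum((x - mean(x))^2)
hat_beta_0 <- mean(y) - hat_beta_1 * mean(x)

y_hat <- hat_beta_0 + hat_beta_1 * x  # fitted values
e     <- y - y_hat                    # residuals e_i = y_i - y_hat_i

sum(e)      # zero up to rounding error
sum(x * e)  # zero up to rounding error
```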

### Example Data

Let's use `R` to compute the least squares estimates for some example data:
```{r}
# Paired observations (n = 20): regressor x and response y
x <- c(15.50, 23.75, 8.00, 17.00, 5.50, 19.00, 24.00, 2.50, 7.50, 11.00,
       13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50)
y <- c(2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70, 2375.00, 2357.90, 2256.70,
       2165.20, 2399.55, 1779.80, 2336.75, 1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70)

data1 <- data.frame(x, y)

# Raw sums used in (2.9), (2.10a) and (2.10b)
sum_x <- sum(x)
sum_y <- sum(y)
sum_x_2 <- sum(x ^ 2)
sum_y_2 <- sum(y ^ 2)  # sum of squared responses
sum_xy <- sum(x * y)

# Corrected sums of squares and cross products
S_xx <- sum_x_2 - ((sum_x) ^ 2 / length(x))
S_yy <- sum_y_2 - ((sum_y) ^ 2 / length(y))
S_xy <- sum_xy - (sum_x * sum_y / length(x))

# Least squares estimates of the slope (2.11) and intercept (2.6)
hat_beta_1 <- S_xy / S_xx
hat_beta_0 <- sum_y / length(y) - hat_beta_1 * sum_x / length(x)

hat_beta_1
hat_beta_0

paste("beta1 equals", hat_beta_1, "; beta0 equals", hat_beta_0)