## Import packages

In [36]:
from ISLP import load_data
import pandas as pd
import numpy as np
from ISLP.models import summarize

## Simple Linear Regression

Simple linear regression is an approach for predicting a quantitative response $Y$ from a single predictor $X$, assuming that the relationship between $X$ and $Y$ is approximately linear.

$$
y \approx \beta_0 + \beta_1 x \tag{1}
$$

In equation (1), $\beta_0$ and $\beta_1$ are two unknown constants that represent the *intercept* and *slope* of the linear model. They are also known as the model *coefficients* or *parameters*. Once we have used our training data to produce estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ of the model coefficients, we can predict the response $\hat{y}$ for a given value $x$ by computing:

$$
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \tag{2}
$$
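
As a minimal sketch of equation (2), the snippet below applies a pair of coefficient estimates to new predictor values. The numbers `2.0` and `0.5` are made-up placeholders rather than estimates from any dataset, and `predict` is a hypothetical helper name introduced only for this illustration.

```python
import numpy as np

# Placeholder estimates chosen for illustration only (not fitted values).
beta_0_hat = 2.0
beta_1_hat = 0.5

def predict(x, beta_0_hat, beta_1_hat):
    """Return y_hat = beta_0_hat + beta_1_hat * x for a scalar or array x."""
    return beta_0_hat + beta_1_hat * np.asarray(x, dtype=float)

x_new = np.array([0.0, 4.0, 10.0])
y_hat = predict(x_new, beta_0_hat, beta_1_hat)
print(y_hat)  # [2. 4. 7.]
```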

Let's first load the Boston dataset using the `load_data` function from the `ISLP` package:

In [30]:
Boston_dataset = load_data(dataset="Boston")

We start by fitting a simple linear regression model with `lstat` as the only predictor and `medv` as the response.

In [23]:
X = Boston_dataset['lstat']
y = Boston_dataset['medv']

## Least Squares Method

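The least squares method estimates the coefficients $\beta_0$ and $\beta_1$ of equation (1) by choosing the values that minimize the sum of squared residuals. The resulting estimates are: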

$$
\begin{aligned}
\hat{\beta}_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x}
\end{aligned}
$$
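
For reference, here is a short sketch of where these two formulas come from. The residual sum of squares (RSS) for candidate coefficients $\beta_0$ and $\beta_1$ is

$$
\mathrm{RSS}(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2
$$

Setting both partial derivatives to zero gives

$$
\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0, \qquad
\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0
$$

The first condition yields $\beta_0 = \bar{y} - \beta_1 \bar{x}$. Substituting this into the second condition gives

$$
\beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}
= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$

where the last step uses $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.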

We now calculate $\hat{\beta}_1$ and $\hat{\beta}_0$, starting with the sample means $\bar{x}$ and $\bar{y}$:

In [39]:
X_avg = 
y_avg = 

Then fill in the expressions for $\hat{\beta}_1$ and $\hat{\beta}_0$:

In [31]:
beta_1_hat = 
beta_0_hat = 

Print the values of $\hat{\beta}_1$ and $\hat{\beta}_0$:

In [None]:
print(beta_1_hat)
print(beta_0_hat)
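
The manual computation above can be sketched on a tiny synthetic dataset, which makes the result easy to check by hand. This is one possible way to write it, not the only one. The arrays `x` and `y` below are made up and lie exactly on the line $y = 1 + 2x$; they are not the Boston data.

```python
import numpy as np

# Synthetic data lying exactly on y = 1 + 2x (illustrative, not the Boston data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x

# Sample means of the predictor and the response.
x_avg = x.mean()
y_avg = y.mean()

# Slope: sum of cross-deviations divided by the sum of squared x-deviations.
beta_1_hat = np.sum((x - x_avg) * (y - y_avg)) / np.sum((x - x_avg) ** 2)
# Intercept: makes the fitted line pass through the point (x_avg, y_avg).
beta_0_hat = y_avg - beta_1_hat * x_avg

print(beta_1_hat)
print(beta_0_hat)
```

Because the data are noise-free, the estimates recover the slope 2 and the intercept 1. The same expressions should also work on pandas Series such as `Boston_dataset['lstat']` and `Boston_dataset['medv']`, since Series support `.mean()` and elementwise arithmetic.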

## Using libraries

### Using the *scikit-learn* library:

Import the `LinearRegression` class from `sklearn.linear_model` and use it to fit the linear regression model to the data.

In [40]:
from sklearn.linear_model import LinearRegression

reg = 

Print the values of $\hat{\beta}_1$ and $\hat{\beta}_0$:

In [None]:
print(reg.coef_)
print(reg.intercept_)
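
A minimal sketch of the scikit-learn workflow follows, again on made-up data that lie exactly on $y = 1 + 2x$ rather than on the Boston data. The point it illustrates is that `LinearRegression.fit` expects a two-dimensional feature array, so a single predictor has to be reshaped into one column. The name `X_2d` is an assumption introduced for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data on y = 1 + 2x (illustrative, not the Boston data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x

# scikit-learn expects features of shape (n_samples, n_features),
# so the single predictor is reshaped into one column.
X_2d = x.reshape(-1, 1)

# fit_intercept=True is the default, so the intercept is estimated as well.
reg = LinearRegression().fit(X_2d, y)

print(reg.coef_)       # array holding one slope per feature
print(reg.intercept_)  # estimated intercept
```

With a pandas Series such as `Boston_dataset['lstat']`, either `X.to_frame()` or `Boston_dataset[['lstat']]` should give the required two-dimensional input.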

How do these values of $\hat{\beta}_1$ and $\hat{\beta}_0$ compare with the ones computed above?

### Using the *statsmodels* library:

Import the `statsmodels` module, create the linear regression model from the data, and fit it.

In [41]:
import statsmodels.api as sm

model = 
results = 

Display the estimates $\hat{\beta}_1$ and $\hat{\beta}_0$ with `summarize`:

In [None]:
summarize(results)

How do these values of $\hat{\beta}_1$ and $\hat{\beta}_0$ compare with the ones computed above?