
# <center> Linear Regression </center>

## References

* Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow - Aurélien Géron
* Machine Learning Pocket Reference - Matt Harrison
* https://www.youtube.com/@patloeber
* https://www.youtube.com/@Dataquestio

## Overview

Linear regression is a supervised learning method for regression that assumes the variable of interest can be explained by a linear combination of the input features. Estimating the model means finding the weights (parameters) that minimize the prediction errors.

The resulting model will have an equation of the type $ \hat{y} = \hat{\beta}_{0} + \hat{\beta}_{1}x_{1} + \hat{\beta}_{2}x_{2} + ... + \hat{\beta}_{n}x_{n} $ or $ \hat{y} = X\hat{\beta} $ in vectorized form, where $X$ includes a leading column of ones for the intercept $\hat{\beta}_{0}$.
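
As a minimal sketch of the vectorized form, the snippet below builds a design matrix with a leading column of ones and checks that the matrix product gives the same predictions as the term-by-term equation. The toy data and coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: 3 observations, 2 features
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Hypothetical coefficients: intercept first, then one weight per feature
beta = np.array([0.5, 2.0, -1.0])

# Prepend a column of ones so the intercept is handled by the matrix product
X_b = np.column_stack([np.ones(len(X)), X])

# Vectorized prediction: y_hat = X_b @ beta
y_hat = X_b @ beta

# Same prediction written term by term
y_hat_loop = np.array([beta[0] + beta[1] * x1 + beta[2] * x2 for x1, x2 in X])

print(y_hat)
print(np.allclose(y_hat, y_hat_loop))
```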

Certain assumptions must hold for statistical inference to be valid. When these assumptions are satisfied, the Ordinary Least Squares (OLS) estimator is the best linear unbiased estimator (BLUE), meaning it has the lowest variance among all linear unbiased estimators.

#### Gauss-Markov assumptions

* Random sampling

* The model is linear in parameters

* Non-zero variance in X

* No Perfect Collinearity

* The conditional mean of the errors is zero (the expected value of the errors given X is zero)

* Homoscedasticity (the variance of the errors is constant)

* No correlation between errors
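
Some of these assumptions can be checked numerically. The sketch below covers only two of them (non-zero variance in X and no perfect collinearity) by comparing the rank of the design matrix with its number of columns. The helper name `has_perfect_collinearity` and the synthetic data are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np

def has_perfect_collinearity(X):
    """Return True if the columns of X, plus an intercept column, are linearly dependent."""
    X = np.asarray(X, dtype=float)
    X_b = np.column_stack([np.ones(len(X)), X])
    # A rank lower than the number of columns means at least one column
    # is an exact linear combination of the others. A constant feature
    # (zero variance) is also caught, because it duplicates the intercept.
    return np.linalg.matrix_rank(X_b) < X_b.shape[1]

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)

ok = np.column_stack([x1, x2])
bad = np.column_stack([x1, x2, 2 * x1 - x2])   # third column is an exact combination
constant = np.column_stack([x1, np.full(50, 3.0)])  # second column has zero variance

print(has_perfect_collinearity(ok))
print(has_perfect_collinearity(bad))
print(has_perfect_collinearity(constant))
```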


### Some math on estimation methods:

* ### Direct Method
The direct method optimizes the model by differentiating the cost function with respect to each parameter, setting the derivatives to zero, and solving the resulting system of equations.

   * #### Closed-form solution (normal equation)
   $ \hat{\beta}  = (X^{T}X)^{-1} X^{T} y $

   * #### Using Pseudo-inverse
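   The normal equation method fails when $ X^{T}X $ has no inverse, for example when features are perfectly collinear or when there are more features than observations. In this case the parameters cannot be estimated with that formula. The Moore-Penrose pseudo-inverse overcomes this obstacle because it is always defined.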

   $ \hat{\beta}  = X^{+} y $

   The idea is to decompose the training matrix $X$ into the product of three matrices using Singular Value Decomposition (SVD), $ X = U \Sigma V^{T} $, and then compute the pseudo-inverse as

   $ X^{+} = V \Sigma^{+} U^{T} $

   To calculate $\Sigma^{+}$, all singular values smaller than a small threshold are set to zero, the remaining values are replaced by their inverse, and the resulting matrix is transposed.
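
The two direct approaches can be sketched in NumPy as follows. The synthetic data, the true coefficients and the threshold formula are assumptions chosen for illustration. The normal equation is solved as a linear system rather than by forming the inverse explicitly, and the pseudo-inverse is built by hand from the SVD and then compared with `np.linalg.pinv`.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data (illustrative assumption): y = 4 + 3*x1 - 2*x2 + noise
n = 200
X = rng.normal(size=(n, 2))
y = 4.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Add the intercept column
X_b = np.column_stack([np.ones(n), X])

# Normal equation: solve (X^T X) beta = X^T y
beta_normal = np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# Pseudo-inverse from the SVD: X^+ = V Sigma^+ U^T
U, s, Vt = np.linalg.svd(X_b, full_matrices=False)
threshold = np.finfo(float).eps * max(X_b.shape) * s.max()  # assumed cutoff
s_inv = np.zeros_like(s)
s_inv[s > threshold] = 1.0 / s[s > threshold]  # invert only the non-negligible values
# With the reduced SVD, Sigma is square and diagonal, so transposing it changes nothing
X_pinv = Vt.T @ np.diag(s_inv) @ U.T
beta_svd = X_pinv @ y

# NumPy's built-in pseudo-inverse for comparison
beta_pinv = np.linalg.pinv(X_b) @ y

print(beta_normal)
print(beta_svd)
print(beta_pinv)
```

On well-conditioned data like this, the three estimates should agree up to floating-point error. The pseudo-inverse route matters when $X^{T}X$ is singular or nearly so.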


* ### Gradient Descent method

    This method minimizes the cost function $ \sum{(y - \hat{y})}^2 $ by iteratively adjusting the parameters $ \hat{\beta}_{i} $ in the equation $ \hat{y} = \hat{\beta}_{0} + \hat{\beta}_{1}x_{1} + \hat{\beta}_{2}x_{2} + ... + \hat{\beta}_{n}x_{n} $, moving at each step in the direction opposite to the gradient of the cost.


    * Normal
    * Batch
    * Stochastic    
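
A minimal sketch of the batch variant is shown below, assuming a fixed learning rate and a fixed number of iterations. The function name `batch_gradient_descent`, the hyperparameter values and the synthetic data are illustrative assumptions. The cost is divided by the number of observations (mean squared error), which does not change the minimizer but makes the learning rate independent of the sample size.

```python
import numpy as np

def batch_gradient_descent(X, y, learning_rate=0.1, n_iterations=1000):
    """Estimate linear regression coefficients with batch gradient descent."""
    X_b = np.column_stack([np.ones(len(X)), X])  # add intercept column
    n = len(y)
    beta = np.zeros(X_b.shape[1])                # start from all-zero parameters
    for _ in range(n_iterations):
        residuals = X_b @ beta - y
        gradient = (2.0 / n) * (X_b.T @ residuals)  # gradient of the mean squared error
        beta -= learning_rate * gradient            # step against the gradient
    return beta

rng = np.random.default_rng(7)

# Synthetic data (illustrative assumption): y = 1 + 2*x1 - 3*x2 + noise
X = rng.normal(size=(300, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

beta_gd = batch_gradient_descent(X, y)

# Reference solution from NumPy's least-squares solver
X_b = np.column_stack([np.ones(len(X)), X])
beta_ref, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print(beta_gd)
print(beta_ref)
```

Gradient descent is sensitive to feature scale. The features here are already standardized, which is why a learning rate of 0.1 converges. On raw data, scaling the features first is usually necessary.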


## Requirements:

The residuals should not exhibit any pattern that can be modeled. If they do, the model may suffer from bias caused by an incorrect functional form or by the omission of a relevant explanatory variable, among other problems.
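
To illustrate what a pattern in the residuals looks like, the sketch below fits a straight line to data generated from a quadratic relationship and then checks whether the residuals still correlate with the omitted term. The data-generating process and the helper `fit_ols` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data with a quadratic relationship (illustrative assumption)
x = rng.uniform(-3, 3, size=300)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=0.5, size=300)

def fit_ols(X_b, y):
    """Least-squares coefficients for a design matrix that already has an intercept."""
    beta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return beta

# Misspecified model: y ~ 1 + x (the quadratic term is omitted)
X_lin = np.column_stack([np.ones_like(x), x])
res_lin = y - X_lin @ fit_ols(X_lin, y)

# Better specified model: y ~ 1 + x + x^2
X_quad = np.column_stack([np.ones_like(x), x, x**2])
res_quad = y - X_quad @ fit_ols(X_quad, y)

# Correlation between the residuals and the candidate omitted term x^2
corr_lin = np.corrcoef(res_lin, x**2)[0, 1]
corr_quad = np.corrcoef(res_quad, x**2)[0, 1]

print("linear model residuals vs x^2:    %.3f" % corr_lin)
print("quadratic model residuals vs x^2: %.3f" % corr_quad)
```

A strong correlation for the linear model signals that the residuals can still be modeled. Once the quadratic term is included, the correlation vanishes, since OLS residuals are orthogonal to every regressor in the model.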

## More ...

In another file there is a more detailed explanation of each aspect mentioned here. There are also two other files dealing with the interpretation of the models given transformations in the variables and on the interpretation of the model using dummy variables.

___

## Imports

In [1]:
# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

In [2]:
# Imports
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

___

## Data

In [3]:
# Load data
url = "https://raw.githubusercontent.com/FernandoSartorGH/data/main/housing.csv"
df = pd.read_csv(url)

# Define X and y
X, y = df.iloc[:,0:-1], df.iloc[:,-1]

In [4]:
# Separate data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

## From scratch

In [5]:
# Import
from my_LinearModels import my_LinearRegression, r2

In [6]:
# Define the model
reg = my_LinearRegression(fit_mode = 'ols')

# Fit the model
reg.fit(X_train, y_train)

# Predictions
predictions = reg.predict(X_test)

# R² score (stored under a new name so the imported r2 function is not overwritten)
r2_value = r2(y_test, predictions)
print("\n r2: %.2f%%" % (r2_value*100))


 r2: 75.68%
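
The `r2` helper comes from `my_LinearModels`, which is not shown in this file. As a rough sketch of what such a function commonly computes, the coefficient of determination can be written as $ R^{2} = 1 - \frac{\sum{(y - \hat{y})}^2}{\sum{(y - \bar{y})}^2} $. The name `r2_sketch` below is hypothetical and is not the author's implementation.

```python
import numpy as np

def r2_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]

print(r2_sketch(y_true, y_true))       # perfect predictions -> 1.0
print(r2_sketch(y_true, [6.0] * 4))    # always predicting the mean -> 0.0
```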


___

## From sklearn

In [7]:
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [8]:
# Define the model
modelo = LinearRegression()

# Fit the model
modelo.fit(X_train, y_train)

# Predictions
y_pred = modelo.predict(X_test)

# R² score (a distinct name avoids overwriting the r2 function imported earlier)
r2_sklearn = r2_score(y_test, y_pred)
print("\n r2: %.2f%%" % (r2_sklearn*100))


 r2: 75.68%


___