
# <center> Linear Models

### Definition

A linear regression model is estimated with statistical methods that minimize the prediction error, typically the sum of squared residuals.

If the Gauss-Markov assumptions hold, estimating the parameters by Ordinary Least Squares (OLS) yields the best linear unbiased estimator (BLUE).
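
As a compact statement of the paragraph above, in standard textbook notation that does not appear in the original notebook: with design matrix $X$, response vector $y$ and coefficient vector $\beta$, the OLS estimator minimizes the sum of squared residuals, and it has a closed form whenever $X^\top X$ is invertible.

$$
\hat{\beta}_{\text{OLS}} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2 = (X^\top X)^{-1} X^\top y
$$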

The closed-form OLS solution requires the matrix $X^\top X$ to be invertible. When it is not (for example, when columns of $X$ are linearly dependent), the Moore-Penrose pseudoinverse or gradient descent can be used instead.
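
The three estimation routes mentioned above can be compared on a small example. This is a minimal sketch under stated assumptions: the data are synthetic, and the names `gradient_descent`, `Xb` and `X_dup` are illustrative and do not come from `my_LinearModels`, whose implementation is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 3 features.
n_samples, n_features = 200, 3
X = rng.normal(size=(n_samples, n_features))
true_w = np.array([2.0, -1.0, 0.5])
true_b = 4.0
y = X @ true_w + true_b + rng.normal(scale=0.1, size=n_samples)

# Prepend a column of ones so the intercept is estimated with the weights.
Xb = np.hstack([np.ones((n_samples, 1)), X])

# 1) Normal equations: solve (X^T X) beta = X^T y.
#    This route breaks down when X^T X is singular.
beta_normal = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# 2) Moore-Penrose pseudoinverse: defined even when X^T X is singular.
beta_pinv = np.linalg.pinv(Xb) @ y


# 3) Batch gradient descent on the mean squared error.
def gradient_descent(X, y, lr=0.1, n_iters=2000):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        residuals = X @ beta - y
        grad = (2.0 / len(y)) * (X.T @ residuals)
        beta -= lr * grad
    return beta


beta_gd = gradient_descent(Xb, y)

# Rank-deficient case: duplicating a column makes X^T X singular.
# The pseudoinverse still returns a (minimum-norm) least-squares solution.
X_dup = np.hstack([Xb, Xb[:, [1]]])
beta_dup = np.linalg.pinv(X_dup, rcond=1e-10) @ y

print("normal equations:", beta_normal)
print("pseudoinverse:   ", beta_pinv)
print("gradient descent:", beta_gd)
```

On well-conditioned data all three routes agree. In the rank-deficient case the coefficients are no longer unique, but the fitted values `X_dup @ beta_dup` match those of the full-rank fit.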

### Requirements

The response variable must be continuous, and the residuals are assumed to be normally distributed.

In [1]:
#suppress warnings
import warnings
warnings.filterwarnings('ignore')

In [2]:
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
#cmap = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])

___

# <center> Linear Regression

In [3]:
from my_LinearModels import my_LinearRegression, r2

## Data

In [4]:
# Note: load_boston was removed in scikit-learn 1.2; this cell needs an older release.
houses = datasets.load_boston()
X, y = houses.data, houses.target

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

## Regressor

In [6]:
%%time
reg = my_LinearRegression(lr = 0.000001, n_iters = 1000000, fit_mode = 'GD')
reg.fit(X_train, y_train)
predictions = reg.predict(X_test)

r2(y_test, predictions)

Wall time: 22.7 s


0.6601088520543237

In [7]:
%%time
reg = my_LinearRegression(fit_mode = 'ols')
reg.fit(X_train, y_train)
predictions = reg.predict(X_test)

r2(y_test, predictions)

Wall time: 2.99 ms


0.7665382927362874
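
The `r2` helper imported from `my_LinearModels` is not shown in this notebook. Assuming it computes the usual coefficient of determination, a minimal sketch could look like the following; the name `r2_sketch` is hypothetical and chosen so that it does not clash with the imported helper.

```python
import numpy as np


def r2_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


y_true = np.array([3.0, 5.0, 7.0, 9.0])

# Perfect predictions give a score of 1.
perfect = r2_sketch(y_true, y_true)

# Always predicting the mean of y_true gives a score of 0.
baseline = r2_sketch(y_true, np.full_like(y_true, y_true.mean()))

# Imperfect predictions land in between.
partial = r2_sketch(y_true, np.array([2.5, 5.5, 7.0, 9.5]))
```

Under this definition, a score of 0.77 on the test set means the model explains about 77% of the variance of `y_test` around its mean.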

___

In [8]:
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression

In [9]:
%%time

# Create the linear regression object
modelo = LinearRegression()

# Fit the model on the training data (the training score is computed but not displayed)
modelo.fit(X_train, y_train)
modelo.score(X_train, y_train)

# Predictions
y_pred = modelo.predict(X_test)

# Result (named r2_sklearn so the r2 helper imported earlier is not shadowed)
r2_sklearn = r2_score(y_test, y_pred)
print("\n R2: %.2f%%" % (r2_sklearn*100))


 R2: 76.65%
Wall time: 23.9 ms


___

# <center> Logistic Regression

In [10]:
from my_LinearModels import my_LogisticRegression, accuracy, confusionMatrix

## Data

In [11]:
bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target

In [12]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

In [13]:
clf = my_LogisticRegression()
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

In [14]:
accuracy(predictions, y_test)

0.9210526315789473
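
The implementation of `my_LogisticRegression` is not shown here. As a rough illustration of how such a classifier can be trained, the sketch below fits a logistic regression by batch gradient descent on the log-loss. Everything in it is an assumption: the function names (`sigmoid`, `fit_logistic`, `predict_labels`), the learning rate, the iteration count and the synthetic two-cluster data are all illustrative and may differ from what the author's class does.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on the average log-loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)             # predicted probability of class 1
        grad_w = X.T @ (p - y) / n_samples
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_labels(X, w, b, threshold=0.5):
    return (sigmoid(X @ w + b) >= threshold).astype(int)


# Synthetic, well-separated clusters (an assumption for illustration).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, size=(50, 2))
X1 = rng.normal(loc=2.0, size=(50, 2))
X_toy = np.vstack([X0, X1])
y_toy = np.concatenate([np.zeros(50), np.ones(50)])

w, b = fit_logistic(X_toy, y_toy)
acc = np.mean(predict_labels(X_toy, w, b) == y_toy)
print("training accuracy:", acc)
```

On real data with features on very different scales, such as the breast cancer dataset, plain gradient descent usually needs a much smaller learning rate or standardized inputs to converge.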

In [15]:
# confusion matrix
confusionMatrix(y_test, predictions)

Predicted   0   1  All
Actual
0          39   6   45
1           3  66   69
All        42  72  114
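
The accuracy reported earlier can be checked directly against the confusion matrix above. The short calculation below uses only the four counts from that matrix; treating label 1 as the positive class is a choice made here for illustration.

```python
# Counts read from the confusion matrix (rows: actual, columns: predicted).
tn, fp = 39, 6   # actual 0: predicted 0, predicted 1
fn, tp = 3, 66   # actual 1: predicted 0, predicted 1

total = tn + fp + fn + tp

# Accuracy: share of correct predictions (the diagonal of the matrix).
accuracy_from_matrix = (tn + tp) / total

# Precision and recall for label 1.
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)

print(f"accuracy:  {accuracy_from_matrix:.4f}")
print(f"precision: {precision_1:.4f}")
print(f"recall:    {recall_1:.4f}")
```

The diagonal holds 105 of the 114 test samples, which reproduces the accuracy of about 0.921 returned by `accuracy(predictions, y_test)`.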


___

In [39]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report

In [40]:
bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target

In [41]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 1234)

In [42]:
modelo = LogisticRegression(solver = 'liblinear')
modelo.fit(X_train, y_train)

predict = modelo.predict(X_test)

In [43]:
modelo.score(X_test, y_test)

0.9473684210526315

In [49]:
print(classification_report(y_test, predict))

              precision    recall  f1-score   support

           0       1.00      0.87      0.93        45
           1       0.92      1.00      0.96        69

    accuracy                           0.95       114
   macro avg       0.96      0.93      0.94       114
weighted avg       0.95      0.95      0.95       114



___