# **ARTIFICIAL NEURAL NETWORKS: REGRESSION**

This project develops a machine learning model to predict the value of homes in Boston neighborhoods.

The data comes from Kaggle:

https://www.kaggle.com/schirmerchad/bostonhoustingmlnd

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.read_csv('housing.csv',
                    sep=',', encoding='iso-8859-1')

In [4]:
df.head()

|   | RM | LSTAT | PTRATIO | MEDV |
|---|---|---|---|---|
| 0 | 6.575 | 4.98 | 15.3 | 504000.0 |
| 1 | 6.421 | 9.14 | 17.8 | 453600.0 |
| 2 | 7.185 | 4.03 | 17.8 | 728700.0 |
| 3 | 6.998 | 2.94 | 18.7 | 701400.0 |
| 4 | 7.147 | 5.33 | 18.7 | 760200.0 |


**Predictor attributes**

RM: the average number of rooms among homes in the neighborhood.

LSTAT: the percentage of homeowners in the neighborhood considered "lower class" (working poor).

PTRATIO: the ratio of students to teachers in primary and secondary schools in the neighborhood.

**Target variable**

MEDV: the median value of the homes.

In [7]:
df.shape

(489, 4)

In [8]:
independente = df.iloc[:, 0:3].values
independente

array([[ 6.575,  4.98 , 15.3  ],
       [ 6.421,  9.14 , 17.8  ],
       [ 7.185,  4.03 , 17.8  ],
       ...,
       [ 6.976,  5.64 , 21.   ],
       [ 6.794,  6.48 , 21.   ],
       [ 6.03 ,  7.88 , 21.   ]])

In [9]:
independente.shape

(489, 3)

In [10]:
dependente = df.iloc[:, 3].values

In [11]:
dependente.shape

(489,)
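
Positional slicing with `iloc` depends on the column order in the CSV. A minimal sketch of the same split by column name, assuming the columns are named as in the `df.head()` output above. The small frame `df_demo` is a hypothetical stand-in built from those five rows, not the full dataset:

```python
import pandas as pd

# Hypothetical stand-in for df: the five rows shown by df.head()
df_demo = pd.DataFrame({
    "RM": [6.575, 6.421, 7.185, 6.998, 7.147],
    "LSTAT": [4.98, 9.14, 4.03, 2.94, 5.33],
    "PTRATIO": [15.3, 17.8, 17.8, 18.7, 18.7],
    "MEDV": [504000.0, 453600.0, 728700.0, 701400.0, 760200.0],
})

predictors = ["RM", "LSTAT", "PTRATIO"]  # predictor columns
target = "MEDV"                          # target column

# Selecting by name yields the same arrays as df.iloc[:, 0:3] and df.iloc[:, 3]
independente_demo = df_demo[predictors].to_numpy()
dependente_demo = df_demo[target].to_numpy()

print(independente_demo.shape, dependente_demo.shape)
```

If a column is later added to or reordered in the file, the name-based selection still picks the intended predictors, while the positional version would silently change.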

## **TRAINING**

In [12]:
from sklearn.model_selection import train_test_split
x_treino, x_teste, y_treino, y_teste = train_test_split(independente, dependente, test_size = 0.3, random_state = 0)

In [13]:
x_treino.shape, x_teste.shape

((342, 3), (147, 3))
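
The 342/147 split follows from how scikit-learn turns a fractional `test_size` into a row count: the test count is rounded up and the training set gets the remainder. A short sketch of that arithmetic, checked against `train_test_split` on placeholder arrays (the zero-filled `X_demo` and `y_demo` are assumptions used only for their length):

```python
import math
import numpy as np
from sklearn.model_selection import train_test_split

n_samples = 489   # rows in the dataset, from df.shape
test_size = 0.3

# For a float test_size, the test count is the ceiling of test_size * n_samples
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test
print(n_train, n_test)

# Check against train_test_split on placeholder arrays of the same length
X_demo = np.zeros((n_samples, 3))
y_demo = np.zeros(n_samples)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=test_size, random_state=0)
print(X_tr.shape, X_te.shape)
```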

In [14]:
from sklearn.neural_network import MLPRegressor

In [15]:
redes = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu', verbose=True, max_iter=2000,
                    solver='lbfgs', random_state = 12)

In [16]:
redes.fit(x_treino, y_treino)

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =        10601     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.16928D+11    |proj g|=  3.29234D+06

...

At iterate   30    f=  2.83614D+09    |proj g|=  1.89965D+07

...

In [17]:
redes.n_layers_

4
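
The value `N = 10601` in the L-BFGS-B log is the number of variables being optimized, that is, every weight and bias in the network. A small sketch that reproduces the count for a fully connected network with 3 inputs and 1 output; the helper `count_parameters` is hypothetical, written for this illustration:

```python
def count_parameters(n_inputs, hidden_layers, n_outputs=1):
    """Count the weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total

# 3 predictors -> 100 -> 100 -> 1 output
print(count_parameters(3, (100, 100)))

# n_layers_ also counts the input and output layers: 2 hidden + 2 = 4
print(len((100, 100)) + 2)
```

The same function gives 115 for the `(6, 6, 6)` network trained later in this notebook, which matches the `N = 115` reported in that run's log. With only 342 training rows, the 10601-parameter network has far more parameters than samples.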

In [18]:
redes.score(x_treino, y_treino)

0.8689899184966059

## **TEST**

In [19]:
redes.score(x_teste, y_teste)

0.7756554302714839
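
For a scikit-learn regressor, `score` returns the coefficient of determination R^2, which is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of that calculation on illustrative values (the arrays below are made up, not taken from the dataset):

```python
import numpy as np
from sklearn.metrics import r2_score

# Illustrative values only, not taken from the dataset
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot

print(r2_manual, r2_score(y_true, y_pred))
```

A value of 1 means perfect predictions, and 0 means the model does no better than always predicting the mean of `y_true`.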

In [20]:
previsoes_teste = redes.predict(x_teste)

## **METRICS**

In [21]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [22]:
# Mean absolute error (MAE)
mean_absolute_error(y_teste, previsoes_teste)

59549.90254923086

In [23]:
# Root mean squared error (RMSE)
np.sqrt(mean_squared_error(y_teste, previsoes_teste))

80679.08841185785
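
Both metrics are in the same units as the target, so they can be read directly as price errors. A short sketch of what each one computes, using made-up values rather than the notebook's predictions:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Illustrative values only, not taken from the dataset
y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

errors = y_true - y_pred
mae_manual = np.mean(np.abs(errors))         # average absolute error
rmse_manual = np.sqrt(np.mean(errors ** 2))  # square root of the mean squared error

print(mae_manual, mean_absolute_error(y_true, y_pred))
print(rmse_manual, np.sqrt(mean_squared_error(y_true, y_pred)))
```

Because the errors are squared before averaging, RMSE is never smaller than MAE and grows faster when a few predictions are far off.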

### **Cross-Validation**

In [24]:
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

In [25]:
# Split the data into folds
kfold = KFold(n_splits = 12, shuffle=True, random_state = 5)

In [26]:
# Create the model
from sklearn.neural_network import MLPRegressor
modelo = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu', verbose=True, max_iter=2000,
                    solver='lbfgs', random_state = 12)
resultado = cross_val_score(modelo, independente, dependente, cv = kfold)
resultado

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =        10601     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.17207D+11    |proj g|=  3.30086D+06

...

   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.

...

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  self.n_iter_ = _check_optimize_result("lbfgs", opt_res, self.max_iter)

...


array([0.87642701, 0.86425483, 0.82201182, 0.78885709, 0.80743894,
       0.80809082, 0.70198158, 0.80960983, 0.76938105, 0.69058599,
       0.84589773, 0.65546748])

In [28]:
# Mean of the R² scores across the folds
print("Mean R²: %.2f%%" % (resultado.mean() * 100.0))

Mean R²: 78.67%
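
The values returned by `cross_val_score` for a regressor are R^2 scores, one per fold, and the optimizer log for this run suggests scaling the data. A hedged sketch of one way to do both: report the mean together with the standard deviation, and put the scalers inside the estimator so each fold fits them on its own training part. The synthetic `X_demo` and `y_demo` are assumptions standing in for the housing data, and wrapping the model in `TransformedTargetRegressor` is a choice made for this example, not something the notebook does:

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 3 features, target on a price-like scale
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 3))
y_demo = (450000 + 90000 * X_demo[:, 0] - 60000 * X_demo[:, 1]
          + rng.normal(scale=20000, size=120))

# Inputs are scaled inside the pipeline, the target inside the wrapper,
# so no information from a validation fold leaks into the scalers
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(6, 6, 6), solver='lbfgs',
                 max_iter=1500, random_state=12),
)
model_demo = TransformedTargetRegressor(regressor=mlp, transformer=StandardScaler())

kfold_demo = KFold(n_splits=4, shuffle=True, random_state=5)
scores = cross_val_score(model_demo, X_demo, y_demo, cv=kfold_demo)

print("Mean R²: %.2f%%" % (scores.mean() * 100.0))
print("Standard deviation: %.2f%%" % (scores.std() * 100.0))
```

The standard deviation shows how much the score depends on which rows land in the validation fold. In the run above the fold scores range from about 0.66 to 0.88, so the mean alone hides a wide spread.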


## **RESULTS**

R^2 is reported as train/test.

**SIMPLE LINEAR REGRESSION:** R^2 = 0.57/0.60; RMSE = 99315.5; Cross-validation R^2: 55.97%.

**MULTIPLE LINEAR REGRESSION:** R^2 = 0.73/0.68; RMSE = 96087.3; Cross-validation R^2: 69.25%.

**POLYNOMIAL REGRESSION:** R^2 = 0.59/0.54; RMSE = 114670.6.

**SVR REGRESSION:** R^2 = 0.87/0.81; RMSE = 73422.7; Cross-validation R^2: 82.37%.

**DECISION TREE REGRESSION:** R^2 = 0.91/0.83; RMSE = 71114.5; Cross-validation R^2: 74.60%.

**RANDOM FOREST REGRESSION:** R^2 = 0.92/0.85; RMSE = 66729.3; Cross-validation R^2: 82.85%.

**XGBOOST REGRESSION:** R^2 = 0.93/0.84; RMSE = 67788.8; Cross-validation R^2: 83.22%.

**LIGHT GBM REGRESSION:** R^2 = 0.88/0.82; RMSE = 71906.4; Cross-validation R^2: 82.38%.

**CATBOOST REGRESSION:** R^2 = 0.90/0.84; RMSE = 69053.3; Cross-validation R^2: 83.40%.

**NEURAL NETWORK REGRESSION:** R^2 = 0.88/0.83; RMSE = 69717.4; Cross-validation R^2: 77.15% (with scaled data).

## **Standardizing the scale**

In [29]:
from sklearn.preprocessing import StandardScaler
x_scaler = StandardScaler()
x_treino_scaler = x_scaler.fit_transform(x_treino)

In [31]:
x_treino_scaler

array([[ 0.05327517, -0.70150711, -0.05467118],
       [ 1.12799963, -0.44487061, -0.52922816],
       [ 0.60711128, -0.79792304,  0.230063  ],
       ...,
       [-0.33111532, -0.36121561, -0.33940537],
       [-0.31699486,  0.84398345, -0.29194967],
       [-0.33268427, -0.38815536, -0.90887374]])
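
`StandardScaler` subtracts each column's mean and divides by its standard deviation, so every predictor ends up with mean 0 and unit variance. A minimal sketch that checks this by hand on a few sample rows in the (RM, LSTAT, PTRATIO) layout; `X_demo` and `scaler_demo` are names introduced only for this illustration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# A few rows in the (RM, LSTAT, PTRATIO) layout, used only as sample input
X_demo = np.array([
    [6.266, 7.90, 18.4],
    [6.951, 9.71, 17.4],
    [6.619, 7.22, 19.0],
    [6.021, 10.30, 17.8],
])

scaler_demo = StandardScaler()
X_scaled = scaler_demo.fit_transform(X_demo)

# Same result by hand: subtract the column mean, then divide by the
# population standard deviation (ddof=0)
X_manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0, ddof=0)

print(np.allclose(X_scaled, X_manual))
print(X_scaled.mean(axis=0).round(6), X_scaled.std(axis=0).round(6))
```

The scaler is fitted on the training data only and then reused with `transform` on the test data, which is why the notebook calls `fit_transform` for `x_treino` but `transform` for `x_teste`.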

In [32]:
y_scaler = StandardScaler()
y_treino_scaler = y_scaler.fit_transform(y_treino.reshape(-1,1))

In [33]:
y_treino_scaler

array([[-1.05925606e-02],
       [ 6.46900118e-01],
       [ 2.85923746e-01],
       [-1.13728667e-01],
       [ 1.44111599e-01],
       [-7.84113359e-01],
       [-1.24822584e+00],
       [-2.81324840e-01],
       [-1.39512694e-01],
       [-1.01616960e+00],
       [ 2.00056152e+00],
       [ 1.21414870e+00],
       [ 1.27860877e+00],
       [-1.52404707e-01],
       [ 9.04740384e-01],
       [ 2.29945267e-03],
       [-2.81324840e-01],
       [ 2.73031732e-01],
       [-2.81324840e-01],
       [ 2.73031732e-01],
       [ 4.27735892e-01],
       [ 2.60139719e-01],
       [-1.06773765e+00],
       [-1.17087376e+00],
       [ 3.49603506e+00],
       [ 9.04740384e-01],
       [-2.07331469e+00],
       [ 1.69895626e-01],
       [ 4.27735892e-01],
       [ 2.76119030e+00],
       [-3.45784907e-01],
       [ 2.73031732e-01],
       [-1.48028208e+00],
       [ 1.57003612e-01],
       [-1.91080747e-01],
       [-1.01616960e+00],
       [ 3.37491799e-01],
       [-2.34845740e-02],
       ...

In [34]:
x_teste_scaler = x_scaler.transform(x_teste)
x_teste_scaler

array([[-6.24507256e-01, -6.20687880e-01,  1.17917695e+00],
       [ 9.56985082e-01, -8.43295235e-01, -2.61727885e+00],
       [-1.30072075e+00,  1.98112421e+00, -1.81053199e+00],
       [ 9.72674490e-01,  9.82935809e-01,  7.99531373e-01],
       [ 2.90185237e-01, -5.72479917e-01, -3.39405369e-01],
       [ 2.72926888e-01,  9.46070896e-01,  7.99531373e-01],
       [-2.08519115e+00,  2.33134087e+00, -1.81053199e+00],
       [-1.88341711e-01, -2.51777567e-02,  7.99531373e-01],
       [-1.08325729e-01, -2.13755962e-01, -2.44493974e-01],
       [-4.18976010e-01,  1.39296468e-01,  1.17917695e+00],
       [-1.41892805e-02,  1.26651206e+00,  7.99531373e-01],
       [-6.19800434e-01,  4.03022379e-01,  7.99531373e-01],
       [-2.99736508e-01, -7.29864736e-01,  5.14797187e-01],
       [-2.51664988e+00,  3.05162455e+00,  7.99531373e-01],
       [ 7.76556889e-01, -4.85989161e-01,  1.13172126e+00],
       [-7.45315699e-01,  6.32719141e-01,  1.27408835e+00],
       [ 7.36714050e-02, -1.24429444e-01, ...

In [35]:
y_teste_scaler = y_scaler.transform(y_teste.reshape(-1,1))
y_teste_scaler

array([[-2.29756787e-01],
       [ 1.08522857e+00],
       [-1.06773765e+00],
       [ 7.50036225e-01],
       [ 1.18327572e-01],
       [-5.90733160e-01],
       [-1.27400987e+00],
       [ 1.18327572e-01],
       [-3.63765873e-02],
       [-2.68432827e-01],
       [-1.48028208e+00],
       [-1.48028208e+00],
       [-1.65296720e-01],
       [-4.87597053e-01],
       [ 7.50036225e-01],
       [-7.84113359e-01],
       [-2.16864774e-01],
       [ 2.08571666e-01],
       [ 9.25435458e-02],
       [ 1.13679662e+00],
       [ 1.20125669e+00],
       [ 1.44620494e+00],
       [-1.89282650e+00],
       [-8.79446405e-02],
       [-5.39165106e-01],
       [ 3.13505869e+00],
       [ 1.84585736e+00],
       [ 2.65805419e+00],
       [ 2.29945267e-03],
       [-6.21606139e-02],
       [-8.09897386e-01],
       [ 2.29945267e-03],
       [-1.52404707e-01],
       [ 2.52913406e+00],
       [-6.21606139e-02],
       [ 2.58070211e+00],
       [-2.42648800e-01],
       [-1.14508973e+00],
       ...

In [36]:
redes = MLPRegressor(hidden_layer_sizes=(6,6,6), activation='relu', verbose=True, max_iter=1500,
                    solver='lbfgs', random_state = 12)

In [38]:
redes.fit(x_treino_scaler, y_treino_scaler.ravel())

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =          115     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.46378D+00    |proj g|=  1.97544D+00

...

At iterate   52    f=  6.80253D-02    |proj g|=  9.91262D-03

...

In [39]:
redes.n_layers_

5

In [40]:
redes.score(x_treino_scaler, y_treino_scaler)

0.8840227764062709

**TEST**

In [41]:
redes.score(x_teste_scaler, y_teste_scaler)

0.8259255442446645

In [42]:
previsoes_teste_scaler = redes.predict(x_teste_scaler)

In [43]:
previsoes_teste_scaler

array([-2.59888882e-01,  1.27507402e+00, -8.38271864e-01, -8.78397518e-01,
        1.07502723e-01, -9.22845124e-01, -7.70075099e-01, -2.99692112e-01,
        1.05960786e-02, -4.33305555e-01, -1.17229051e+00, -5.85023244e-01,
        2.57793720e-02, -1.36920206e+00,  3.06571741e-02, -7.72445900e-01,
       -2.93163530e-02,  6.42772666e-01, -1.22064038e-01,  1.20701089e+00,
        1.39681005e+00,  1.64978821e+00, -1.31488238e+00,  1.58250721e-01,
       -8.97689203e-02,  2.83796888e+00,  1.86009300e+00,  2.77549340e+00,
       -3.44717212e-01, -1.34956555e-01, -8.45018767e-01, -3.65488136e-01,
        7.48788409e-02,  1.55981906e+00, -6.63844097e-02,  2.67788981e+00,
       -1.08754382e-02, -1.52965729e+00,  1.40450239e-01,  5.85970223e-01,
       -1.15158864e+00, -4.82106561e-01, -8.46105539e-02,  1.64382302e-01,
       -1.92245345e-01, -4.15692378e-01,  9.00289013e-02, -8.11893407e-01,
        1.66158884e+00, -1.35306354e-01,  1.57635806e+00,  3.56088478e-02,
        8.15563821e-01, ...

## **METRICS**

**Reverting the transformation**

In [45]:
previsoes_teste_inverse = y_scaler.inverse_transform(previsoes_teste_scaler.reshape(-1,1))

In [47]:
previsoes_teste_inverse

array([[412991.73644368],
       [663024.21874183],
       [318778.03730495],
       [312241.90745754],
       [472836.72358515],
       [305001.76818305],
       [329886.71395796],
       [406508.12670469],
       [457051.45024426],
       [384743.62623457],
       [264369.22321282],
       [360030.09691612],
       [459524.68040404],
       [232293.99632532],
       [460319.23313607],
       [329500.53055277],
       [450550.05248043],
       [560027.67296014],
       [435442.23848138],
       [651937.31021844],
       [682853.98907936],
       [724061.99309271],
       [241142.2128865 ],
       [481103.14351972],
       [440702.84024135],
       [917606.58137768],
       [758318.86656135],
       [907429.85379097],
       [399173.91850129],
       [433342.1565315 ],
       [317679.02394911],
       [395790.51046533],
       [467522.5688776 ],
       [709406.77849354],
       [444511.97936438],
       [891531.05405976],
       [453553.92163039],
       [206157.19489692],
       ...

In [48]:
y_teste

array([ 417900.,  632100.,  281400.,  577500.,  474600.,  359100.,
        247800.,  474600.,  449400.,  411600.,  214200.,  214200.,
        428400.,  375900.,  577500.,  327600.,  420000.,  489300.,
        470400.,  640500.,  651000.,  690900.,  147000.,  441000.,
        367500.,  966000.,  756000.,  888300.,  455700.,  445200.,
        323400.,  455700.,  430500.,  867300.,  445200.,  875700.,
        415800.,  268800.,  590100.,  497700.,  231000.,  315000.,
        388500.,  449400.,  413700.,  352800.,  453600.,  306600.,
        898800.,  514500.,  743400.,  474600.,  600600.,  304500.,
        661500.,  489300.,  422100.,  184800.,  525000.,  249900.,
        407400.,  361200.,  428400.,  392700.,  428400.,  472500.,
        258300.,  550200.,  346500.,  199500.,  302400.,  611100.,
        396900.,  585900.,  279300.,  483000.,  462000.,  218400.,
        518700.,  420000.,  392700.,  980700.,  455700.,  514500.,
        480900.,  520800.,  485100.,  525000.,  390600.,  ...

In [49]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [50]:
# Mean absolute error (MAE)
mean_absolute_error(y_teste, previsoes_teste_inverse)

54239.75771276947

In [51]:
# Root mean squared error (RMSE)
np.sqrt(mean_squared_error(y_teste, previsoes_teste_inverse))

71067.4486827693
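
The predictions are mapped back to price units before computing MAE and RMSE because an error measured on the standardized target is expressed in standard deviations, not in currency. A short sketch of the relationship, assuming a target scaled with `StandardScaler`; the `y_pred` values are hypothetical and `y_scaler_demo` is a name used only here:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative prices with hypothetical predictions, not the notebook's results
y_true = np.array([417900.0, 632100.0, 281400.0, 577500.0])
y_pred = np.array([410000.0, 660000.0, 320000.0, 500000.0])

y_scaler_demo = StandardScaler().fit(y_true.reshape(-1, 1))
y_true_z = y_scaler_demo.transform(y_true.reshape(-1, 1)).ravel()
y_pred_z = y_scaler_demo.transform(y_pred.reshape(-1, 1)).ravel()

rmse_scaled = np.sqrt(np.mean((y_true_z - y_pred_z) ** 2))  # in standard deviations
rmse_original = np.sqrt(np.mean((y_true - y_pred) ** 2))    # in price units

# The two differ by exactly the scaler's standard deviation
print(np.isclose(rmse_original, rmse_scaled * y_scaler_demo.scale_[0]))
```

R^2 is unaffected by this rescaling, which is why `redes.score` can be called on the scaled arrays directly while MAE and RMSE need the inverse transform to be comparable with the unscaled model.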

**Reverting the transformation**

In [52]:
x_treino_inverse = x_scaler.inverse_transform(x_treino_scaler)

In [53]:
x_treino_inverse

array([[ 6.266,  7.9  , 18.4  ],
       [ 6.951,  9.71 , 17.4  ],
       [ 6.619,  7.22 , 19.   ],
       ...,
       [ 6.021, 10.3  , 17.8  ],
       [ 6.03 , 18.8  , 17.9  ],
       [ 6.02 , 10.11 , 16.6  ]])

In [54]:
x_treino

array([[ 6.266,  7.9  , 18.4  ],
       [ 6.951,  9.71 , 17.4  ],
       [ 6.619,  7.22 , 19.   ],
       ...,
       [ 6.021, 10.3  , 17.8  ],
       [ 6.03 , 18.8  , 17.9  ],
       [ 6.02 , 10.11 , 16.6  ]])
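
Comparing the two printed arrays by eye only covers the rows that are displayed. A minimal sketch of a programmatic round-trip check, using a few sample rows and a scaler fitted just for this illustration (`X_demo` and `scaler_demo` are assumptions, not the notebook's objects):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Sample rows in the (RM, LSTAT, PTRATIO) layout
X_demo = np.array([
    [6.266, 7.90, 18.4],
    [6.951, 9.71, 17.4],
    [6.619, 7.22, 19.0],
])

scaler_demo = StandardScaler().fit(X_demo)
X_round_trip = scaler_demo.inverse_transform(scaler_demo.transform(X_demo))

# allclose tolerates the tiny floating-point differences an exact == could flag
print(np.allclose(X_round_trip, X_demo))
```

In the notebook the equivalent check would be `np.allclose(x_treino_inverse, x_treino)`.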

In [55]:
y_treino_inverse = y_scaler.inverse_transform(y_treino_scaler)

In [56]:
y_treino_inverse

array([[ 453600.],
       [ 560700.],
       [ 501900.],
       [ 436800.],
       [ 478800.],
       [ 327600.],
       [ 252000.],
       [ 409500.],
       [ 432600.],
       [ 289800.],
       [ 781200.],
       [ 653100.],
       [ 663600.],
       [ 430500.],
       [ 602700.],
       [ 455700.],
       [ 409500.],
       [ 499800.],
       [ 409500.],
       [ 499800.],
       [ 525000.],
       [ 497700.],
       [ 281400.],
       [ 264600.],
       [1024800.],
       [ 602700.],
       [ 117600.],
       [ 483000.],
       [ 525000.],
       [ 905100.],
       [ 399000.],
       [ 499800.],
       [ 214200.],
       [ 480900.],
       [ 424200.],
       [ 289800.],
       [ 510300.],
       [ 451500.],
       [ 766500.],
       [ 510300.],
       [ 369600.],
       [ 466200.],
       [ 312900.],
       [ 407400.],
       [ 273000.],
       [ 392700.],
       [ 294000.],
       [ 556500.],
       [ 178500.],
       [ 728700.],
       [ 157500.],
       [ 367500.],
       [ 327

In [57]:
x_teste_inverse = x_scaler.inverse_transform(x_teste_scaler)

In [58]:
x_teste_inverse

array([[ 5.834,  8.47 , 21.   ],
       [ 6.842,  6.9  , 13.   ],
       [ 5.403, 26.82 , 14.7  ],
       [ 6.852, 19.78 , 20.2  ],
       [ 6.417,  8.81 , 17.8  ],
       [ 6.406, 19.52 , 20.2  ],
       [ 4.903, 29.29 , 14.7  ],
       [ 6.112, 12.67 , 20.2  ],
       [ 6.163, 11.34 , 18.   ],
       [ 5.965, 13.83 , 21.   ],
       [ 6.223, 21.78 , 20.2  ],
       [ 5.837, 15.69 , 20.2  ],
       [ 6.041,  7.7  , 19.6  ],
       [ 4.628, 34.37 , 20.2  ],
       [ 6.727,  9.42 , 20.9  ],
       [ 5.757, 17.31 , 21.2  ],
       [ 6.279, 11.97 , 18.7  ],
       [ 6.51 ,  7.39 , 14.7  ],
       [ 5.807, 16.03 , 18.6  ],
       [ 6.739,  4.69 , 15.2  ],
       [ 7.327, 11.25 , 13.   ],
       [ 7.135,  4.45 , 17.   ],
       [ 4.519, 36.98 , 20.2  ],
       [ 5.85 ,  8.77 , 19.2  ],
       [ 5.569, 15.1  , 19.2  ],
       [ 7.645,  3.01 , 14.9  ],
       [ 7.333,  7.79 , 13.   ],
       [ 7.61 ,  3.11 , 14.7  ],
       [ 6.395, 13.27 , 20.2  ],
       [ 6.019, 12.92 , 19.2  ],
       [ 6

In [59]:
y_teste_inverse = y_scaler.inverse_transform(y_teste_scaler)

In [60]:
y_teste_inverse

array([[ 417900.],
       [ 632100.],
       [ 281400.],
       [ 577500.],
       [ 474600.],
       [ 359100.],
       [ 247800.],
       [ 474600.],
       [ 449400.],
       [ 411600.],
       [ 214200.],
       [ 214200.],
       [ 428400.],
       [ 375900.],
       [ 577500.],
       [ 327600.],
       [ 420000.],
       [ 489300.],
       [ 470400.],
       [ 640500.],
       [ 651000.],
       [ 690900.],
       [ 147000.],
       [ 441000.],
       [ 367500.],
       [ 966000.],
       [ 756000.],
       [ 888300.],
       [ 455700.],
       [ 445200.],
       [ 323400.],
       [ 455700.],
       [ 430500.],
       [ 867300.],
       [ 445200.],
       [ 875700.],
       [ 415800.],
       [ 268800.],
       [ 590100.],
       [ 497700.],
       [ 231000.],
       [ 315000.],
       [ 388500.],
       [ 449400.],
       [ 413700.],
       [ 352800.],
       [ 453600.],
       [ 306600.],
       [ 898800.],
       [ 514500.],
       [ 743400.],
       [ 474600.],
       [ 600

In [61]:
y_teste

array([ 417900.,  632100.,  281400.,  577500.,  474600.,  359100.,
        247800.,  474600.,  449400.,  411600.,  214200.,  214200.,
        428400.,  375900.,  577500.,  327600.,  420000.,  489300.,
        470400.,  640500.,  651000.,  690900.,  147000.,  441000.,
        367500.,  966000.,  756000.,  888300.,  455700.,  445200.,
        323400.,  455700.,  430500.,  867300.,  445200.,  875700.,
        415800.,  268800.,  590100.,  497700.,  231000.,  315000.,
        388500.,  449400.,  413700.,  352800.,  453600.,  306600.,
        898800.,  514500.,  743400.,  474600.,  600600.,  304500.,
        661500.,  489300.,  422100.,  184800.,  525000.,  249900.,
        407400.,  361200.,  428400.,  392700.,  428400.,  472500.,
        258300.,  550200.,  346500.,  199500.,  302400.,  611100.,
        396900.,  585900.,  279300.,  483000.,  462000.,  218400.,
        518700.,  420000.,  392700.,  980700.,  455700.,  514500.,
        480900.,  520800.,  485100.,  525000.,  390600.,  5691
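
The outputs above show `y_teste_inverse` printed as a column (one value per row), while `y_teste` is a flat array. When comparing or subtracting the two, that shape difference matters because NumPy broadcasts a column against a flat array into a full matrix. A minimal sketch, using the first three test targets and hypothetical variable names:

```python
import numpy as np

y_1d = np.array([417900., 632100., 281400.])  # flat, like y_teste
y_2d = y_1d.reshape(-1, 1)                     # column, like y_teste_inverse

# Column minus flat array broadcasts to an n x n matrix, not n differences.
broadcast_shape = (y_2d - y_1d).shape

# Flattening the column first gives the intended element-wise result.
aligned_shape = (y_2d.ravel() - y_1d).shape
same_values = np.allclose(y_2d.ravel(), y_1d)

print(broadcast_shape, aligned_shape, same_values)
```

In the notebook, this corresponds to calling `y_teste_inverse.ravel()` before comparing it with `y_teste` or passing it to code that expects a 1-D target.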

# **CHALLENGE 4**

DEVELOP AN ARTIFICIAL NEURAL NETWORK REGRESSION ALGORITHM FOR THE DATASET AT THE FOLLOWING LINK:

https://www.kaggle.com/mirichoi0218/insurance/code