# **ARTIFICIAL NEURAL NETWORKS: REGRESSION**

This project aims to develop a machine learning algorithm that predicts health insurance charges.

The data come from Kaggle:

https://www.kaggle.com/mirichoi0218/insurance/code

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.read_csv('insurance.csv',
                    sep=',', encoding='iso-8859-1')

In [3]:
df.head()

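|   | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.9 | 0 | yes | southwest | 16884.924 |
| 1 | 18 | male | 33.77 | 1 | no | southeast | 1725.5523 |
| 2 | 28 | male | 33.0 | 3 | no | southeast | 4449.462 |
| 3 | 33 | male | 22.705 | 0 | no | northwest | 21984.47061 |
| 4 | 32 | male | 28.88 | 0 | no | northwest | 3866.8552 |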


**Predictor attributes** (only age, bmi, and children are used as model inputs)

age: age of the primary beneficiary

sex: gender of the policyholder (female, male)

bmi: body mass index, an objective measure of whether body weight is high or low relative to height, computed as weight divided by height squared (kg/m^2); ideally between 18.5 and 24.9

children: number of children covered by the health insurance (number of dependents)

smoker: whether the beneficiary smokes (yes, no)

region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest)

**Target variable**

charges: individual medical costs billed by the health insurance

In [4]:
df.shape

(1338, 7)

In [5]:
# Columns 0, 2, 3 are age, bmi, children (the three numeric predictors)
independente = df.iloc[:, [0,2,3]].values
independente

array([[19.  , 27.9 ,  0.  ],
       [18.  , 33.77,  1.  ],
       [28.  , 33.  ,  3.  ],
       ...,
       [18.  , 36.85,  0.  ],
       [21.  , 25.8 ,  0.  ],
       [61.  , 29.07,  0.  ]])

In [6]:
independente.shape

(1338, 3)

In [7]:
# Column 6 is charges (the target)
dependente = df.iloc[:, 6].values

In [8]:
dependente.shape

(1338,)
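
Positional indices such as `[0, 2, 3]` select the wrong columns without warning if the file layout ever changes. As a minimal sketch, the same selection can be written by column name. The frame `df_demo` and the `by_name_*` / `by_position_*` variables are hypothetical names for this illustration, and the rows are the first three shown by `df.head()` above, rebuilt by hand so the snippet runs without the CSV file.

```python
import pandas as pd

# First three rows from df.head(), rebuilt by hand (illustrative frame, not the full dataset).
df_demo = pd.DataFrame({
    "age": [19, 18, 28],
    "sex": ["female", "male", "male"],
    "bmi": [27.9, 33.77, 33.0],
    "children": [0, 1, 3],
    "smoker": ["yes", "no", "no"],
    "region": ["southwest", "southeast", "southeast"],
    "charges": [16884.924, 1725.5523, 4449.462],
})

# Selection by name: explicit about which predictors and target are used.
by_name_x = df_demo[["age", "bmi", "children"]].to_numpy()
by_name_y = df_demo["charges"].to_numpy()

# Selection by position, as in the cells above.
by_position_x = df_demo.iloc[:, [0, 2, 3]].to_numpy()
by_position_y = df_demo.iloc[:, 6].to_numpy()

print(by_name_x.shape, by_name_y.shape)
```

Both forms return the same arrays here; the named form simply documents the intent in the code itself.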

## **TRAINING**

In [9]:
from sklearn.model_selection import train_test_split
x_treino, x_teste, y_treino, y_teste = train_test_split(independente, dependente, test_size = 0.3, random_state = 0)

In [10]:
x_treino.shape, x_teste.shape

((936, 3), (402, 3))
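
The shapes above follow from how a fractional `test_size` is rounded: the test set is rounded up and the training set receives the remaining rows. A small sketch of that arithmetic, using the row count reported by `df.shape` (the variable names are illustrative):

```python
import math

n_samples = 1338   # rows reported by df.shape
test_size = 0.3    # fraction passed to train_test_split

# With a float test_size, scikit-learn rounds the test set up
# and gives the remaining rows to the training set.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```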

In [11]:
from sklearn.neural_network import MLPRegressor

In [70]:
redes = MLPRegressor(hidden_layer_sizes=(200, 200), activation='relu', verbose=True, max_iter=3000,
                    solver='lbfgs', random_state = 15)

In [71]:
redes.fit(x_treino, y_treino)

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =        41201     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.58026D+08    |proj g|=  1.29382D+05

At iterate    1    f=  6.38955D+07    |proj g|=  3.36074D+05

At iterate    2    f=  6.37493D+07    |proj g|=  4.79045D+04

At iterate    3    f=  6.37473D+07    |proj g|=  2.03924D+04

At iterate    4    f=  6.37466D+07    |proj g|=  2.32001D+04

At iterate    5    f=  6.37388D+07    |proj g|=  5.76124D+04

At iterate    6    f=  6.37313D+07    |proj g|=  6.37420D+04

At iterate    7    f=  6.37240D+07    |proj g|=  3.22828D+04

At iterate    8    f=  6.37214D+07    |proj g|=  1.72532D+04


 This problem is unconstrained.



At iterate    9    f=  6.37208D+07    |proj g|=  1.63136D+04

At iterate   10    f=  6.37202D+07    |proj g|=  2.58463D+04

At iterate   11    f=  6.37191D+07    |proj g|=  2.99470D+04

At iterate   12    f=  6.36784D+07    |proj g|=  4.72400D+04

At iterate   13    f=  6.36187D+07    |proj g|=  5.20836D+04

At iterate   14    f=  6.35447D+07    |proj g|=  1.16857D+04

At iterate   15    f=  6.35425D+07    |proj g|=  1.53621D+04

At iterate   16    f=  6.35423D+07    |proj g|=  1.28739D+04

At iterate   17    f=  6.35420D+07    |proj g|=  1.35434D+04

At iterate   18    f=  6.35415D+07    |proj g|=  1.58496D+04

At iterate   19    f=  6.35409D+07    |proj g|=  1.29023D+04

At iterate   20    f=  6.35405D+07    |proj g|=  1.69616D+04

At iterate   21    f=  6.35400D+07    |proj g|=  1.49444D+04

At iterate   22    f=  6.35354D+07    |proj g|=  2.13373D+04

At iterate   23    f=  6.34897D+07    |proj g|=  1.46515D+05

At iterate   24    f=  6.34793D+07    |proj g|=  2.14110D+05

At iter

In [64]:
redes.n_layers_

4
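
The value `N = 41201` in the L-BFGS log is the number of trainable parameters, and `n_layers_` counts the input and output layers along with the hidden ones. Both can be checked by hand. The helper functions below are hypothetical names written for this check, not part of scikit-learn.

```python
def mlp_parameter_count(n_inputs, hidden_layer_sizes, n_outputs=1):
    """Count the weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


def mlp_layer_count(hidden_layer_sizes):
    """Input layer + hidden layers + output layer, as counted by n_layers_."""
    return len(hidden_layer_sizes) + 2


# Three inputs (age, bmi, children), one output (charges).
print(mlp_parameter_count(3, (200, 200)), mlp_layer_count((200, 200)))
print(mlp_parameter_count(3, (6, 6, 6)), mlp_layer_count((6, 6, 6)))
```

The first line reproduces `N = 41201` and 4 layers for the `(200, 200)` network; the second gives 115 parameters and 5 layers for the `(6, 6, 6)` network used later with scaled data.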

In [72]:
redes.score(x_treino, y_treino)

0.1311145821565438

## **TEST**

In [73]:
redes.score(x_teste, y_teste)

0.11554430351369116
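
For a regressor, `score` returns the coefficient of determination R^2, not an accuracy. The sketch below computes it by hand on made-up numbers (the function `r2` and the array `y_demo` are illustrative names) to show what the values mean: 1 is a perfect fit, 0 is no better than always predicting the mean, and negative values are worse than that.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot


y_demo = np.array([1.0, 2.0, 3.0, 4.0])  # made-up targets

print(r2(y_demo, y_demo))                      # perfect predictions
print(r2(y_demo, np.full(4, y_demo.mean())))   # always predicting the mean
print(r2(y_demo, y_demo[::-1]))                # worse than predicting the mean
```

Read this way, a test score of about 0.12 means the network explains roughly 12% of the variance in charges.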

In [21]:
previsoes_teste = redes.predict(x_teste)

## **METRICS**

In [74]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [75]:
# Mean absolute error (MAE)
mean_absolute_error(y_teste, previsoes_teste)

9182.333574680157

In [76]:
# Root mean squared error (RMSE)
np.sqrt(mean_squared_error(y_teste, previsoes_teste))

11655.946342538547
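
Both metrics are in the units of the target (charges). A minimal sketch on made-up numbers shows how they are computed and why RMSE is never smaller than MAE: squaring gives large misses more weight. The arrays `y_true` and `y_pred` are illustrative, not taken from the dataset.

```python
import numpy as np

y_true = np.array([100.0, 200.0, 300.0, 400.0])   # made-up charges
y_pred = np.array([110.0, 190.0, 300.0, 500.0])   # made-up predictions

errors = y_true - y_pred
mae = np.mean(np.abs(errors))          # average absolute miss
rmse = np.sqrt(np.mean(errors ** 2))   # penalizes large misses more heavily

print(mae, rmse)
```

Here a single miss of 100 pulls the RMSE well above the MAE, which mirrors the gap between the two values reported above.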

### **Cross-Validation**

In [77]:
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

In [78]:
# Split the data into folds
kfold = KFold(n_splits = 12, shuffle=True, random_state = 5)
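
With 1338 rows and 12 folds the split is not perfectly even. A short sketch of the fold sizes `KFold` produces (the variable names are illustrative):

```python
n_samples = 1338   # rows in the dataset
n_splits = 12      # folds requested in KFold

# KFold makes the first (n_samples % n_splits) folds one sample larger.
base, extra = divmod(n_samples, n_splits)
fold_sizes = [base + 1 if i < extra else base for i in range(n_splits)]

print(fold_sizes)
print(sum(fold_sizes))
```

Each model is therefore validated on only 111 or 112 rows, which is one reason individual fold scores can vary widely.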

In [79]:
# Build the model
from sklearn.neural_network import MLPRegressor
modelo = MLPRegressor(hidden_layer_sizes=(200, 200), activation='relu', verbose=True, max_iter=3000,
                    solver='lbfgs', random_state = 12)
resultado = cross_val_score(modelo, independente, dependente, cv = kfold)
resultado

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =        41201     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.63078D+08    |proj g|=  1.27548D+05

At iterate    1    f=  1.62286D+08    |proj g|=  2.30429D+05

At iterate    2    f=  6.57042D+07    |proj g|=  1.60834D+05

At iterate    3    f=  6.56665D+07    |proj g|=  4.36640D+04

At iterate    4    f=  6.56638D+07    |proj g|=  3.39517D+04

At iterate    5    f=  6.56630D+07    |proj g|=  1.89696D+04

At iterate    6    f=  6.56598D+07    |proj g|=  2.84496D+04

At iterate    7    f=  6.56514D+07    |proj g|=  3.25690D+04

At iterate    8    f=  6.56382D+07    |proj g|=  1.69512D+04

At iterate    9    f=  6.56368D+07    |proj g|=  1.73661D+04


 This problem is unconstrained.



At iterate   10    f=  6.56354D+07    |proj g|=  1.86816D+04

At iterate   11    f=  6.56331D+07    |proj g|=  1.87482D+04

At iterate   12    f=  6.56159D+07    |proj g|=  3.89895D+04

At iterate   13    f=  6.56017D+07    |proj g|=  7.42756D+04

At iterate   14    f=  6.55992D+07    |proj g|=  1.33033D+05

At iterate   15    f=  6.55853D+07    |proj g|=  9.61358D+04

At iterate   16    f=  6.55427D+07    |proj g|=  8.94515D+04

At iterate   17    f=  6.55398D+07    |proj g|=  9.20581D+04

At iterate   18    f=  6.55361D+07    |proj g|=  5.42692D+04

At iterate   19    f=  6.55335D+07    |proj g|=  6.42700D+04

At iterate   20    f=  6.55266D+07    |proj g|=  8.24848D+04

At iterate   21    f=  6.55197D+07    |proj g|=  9.06039D+04

At iterate   22    f=  6.55091D+07    |proj g|=  5.94987D+04

At iterate   23    f=  6.54966D+07    |proj g|=  1.10151D+05

At iterate   24    f=  6.54954D+07    |proj g|=  5.61925D+04

At iterate   25    f=  6.54945D+07    |proj g|=  1.06405D+05

At iter

 This problem is unconstrained.



At iterate    4    f=  6.56500D+07    |proj g|=  2.09099D+04

At iterate    5    f=  6.56492D+07    |proj g|=  1.93447D+04

At iterate    6    f=  6.56455D+07    |proj g|=  3.66128D+04

At iterate    7    f=  6.56272D+07    |proj g|=  4.65871D+04

At iterate    8    f=  6.56252D+07    |proj g|=  6.35854D+04

At iterate    9    f=  6.56236D+07    |proj g|=  2.79290D+04

At iterate   10    f=  6.56232D+07    |proj g|=  2.80971D+04

At iterate   11    f=  6.56222D+07    |proj g|=  1.78982D+04

At iterate   12    f=  6.56161D+07    |proj g|=  9.18352D+04

At iterate   13    f=  6.56092D+07    |proj g|=  1.08458D+05

At iterate   14    f=  6.55919D+07    |proj g|=  1.02624D+05

At iterate   15    f=  6.55661D+07    |proj g|=  1.95187D+05

At iterate   16    f=  6.55197D+07    |proj g|=  2.31626D+05

At iterate   17    f=  6.55056D+07    |proj g|=  6.71375D+04

At iterate   18    f=  6.54927D+07    |proj g|=  1.16425D+05

At iterate   19    f=  6.54449D+07    |proj g|=  2.45256D+05

At iter

 This problem is unconstrained.



At iterate    2    f=  6.53817D+07    |proj g|=  1.32500D+05

At iterate    3    f=  6.53551D+07    |proj g|=  3.35217D+04

At iterate    4    f=  6.53536D+07    |proj g|=  2.54098D+04

At iterate    5    f=  6.53527D+07    |proj g|=  3.32737D+04

At iterate    6    f=  6.53493D+07    |proj g|=  4.36173D+04

At iterate    7    f=  6.53368D+07    |proj g|=  2.72577D+04

At iterate    8    f=  6.53321D+07    |proj g|=  3.69651D+04

At iterate    9    f=  6.53304D+07    |proj g|=  1.58885D+04

At iterate   10    f=  6.53301D+07    |proj g|=  1.21976D+04

At iterate   11    f=  6.53298D+07    |proj g|=  1.25807D+04

At iterate   12    f=  6.53297D+07    |proj g|=  3.95335D+04

At iterate   13    f=  6.53296D+07    |proj g|=  2.63337D+04

At iterate   14    f=  6.53285D+07    |proj g|=  2.97804D+04

At iterate   15    f=  6.53249D+07    |proj g|=  4.75742D+04

At iterate   16    f=  6.53204D+07    |proj g|=  2.47823D+04

At iterate   17    f=  6.52989D+07    |proj g|=  1.53729D+05

At iter


   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
 This problem is unconstrained.



At iterate    9    f=  6.62051D+07    |proj g|=  9.96605D+03

At iterate   10    f=  6.62050D+07    |proj g|=  1.26652D+04

At iterate   11    f=  6.62043D+07    |proj g|=  6.08745D+03

At iterate   12    f=  6.62033D+07    |proj g|=  2.69914D+04

At iterate   13    f=  6.61958D+07    |proj g|=  3.44458D+04

At iterate   14    f=  6.61779D+07    |proj g|=  8.03241D+04

At iterate   15    f=  6.61558D+07    |proj g|=  1.07451D+05

At iterate   16    f=  6.61215D+07    |proj g|=  6.58215D+04

At iterate   17    f=  6.60905D+07    |proj g|=  9.74528D+04

At iterate   18    f=  6.60892D+07    |proj g|=  5.32051D+04

At iterate   19    f=  6.60842D+07    |proj g|=  1.04477D+05

At iterate   20    f=  6.60836D+07    |proj g|=  7.54939D+04

At iterate   21    f=  6.60822D+07    |proj g|=  4.01750D+04

At iterate   22    f=  6.60800D+07    |proj g|=  4.42844D+04

At iterate   23    f=  6.60786D+07    |proj g|=  1.27062D+05

At iterate   24    f=  6.60751D+07    |proj g|=  4.60771D+04

At iter

 This problem is unconstrained.



At iterate    6    f=  6.55638D+07    |proj g|=  3.95155D+04

At iterate    7    f=  6.55628D+07    |proj g|=  2.42752D+04

At iterate    8    f=  6.55610D+07    |proj g|=  2.23467D+04

At iterate    9    f=  6.55599D+07    |proj g|=  3.40655D+04

At iterate   10    f=  6.55575D+07    |proj g|=  4.89385D+04

At iterate   11    f=  6.55547D+07    |proj g|=  2.15975D+04

At iterate   12    f=  6.55529D+07    |proj g|=  3.42375D+04

At iterate   13    f=  6.55521D+07    |proj g|=  9.12807D+03

At iterate   14    f=  6.55512D+07    |proj g|=  1.05205D+04

At iterate   15    f=  6.55404D+07    |proj g|=  2.26629D+04

At iterate   16    f=  6.54592D+07    |proj g|=  1.12538D+05

At iterate   17    f=  6.54456D+07    |proj g|=  1.68745D+05

At iterate   18    f=  6.54394D+07    |proj g|=  5.33429D+04

At iterate   19    f=  6.54030D+07    |proj g|=  1.31450D+05

At iterate   20    f=  6.53894D+07    |proj g|=  7.39627D+04

At iterate   21    f=  6.53847D+07    |proj g|=  1.27448D+05

At iter

 This problem is unconstrained.



At iterate    2    f=  6.61731D+07    |proj g|=  1.55391D+05

At iterate    3    f=  6.61379D+07    |proj g|=  3.46591D+04

At iterate    4    f=  6.61360D+07    |proj g|=  1.33170D+04

At iterate    5    f=  6.61356D+07    |proj g|=  8.96492D+03

At iterate    6    f=  6.61326D+07    |proj g|=  2.23795D+04

At iterate    7    f=  6.61288D+07    |proj g|=  2.00138D+04

At iterate    8    f=  6.61243D+07    |proj g|=  1.48554D+04

At iterate    9    f=  6.61187D+07    |proj g|=  2.38061D+04

At iterate   10    f=  6.60877D+07    |proj g|=  6.60008D+04

At iterate   11    f=  6.60391D+07    |proj g|=  8.62945D+04

At iterate   12    f=  6.60264D+07    |proj g|=  7.04481D+04

At iterate   13    f=  6.60242D+07    |proj g|=  8.04615D+04

At iterate   14    f=  6.60011D+07    |proj g|=  7.37943D+04

At iterate   15    f=  6.60002D+07    |proj g|=  7.44248D+04

At iterate   16    f=  6.59938D+07    |proj g|=  2.91256D+04

At iterate   17    f=  6.59914D+07    |proj g|=  3.67929D+04

At iter

 This problem is unconstrained.



At iterate   11    f=  6.51169D+07    |proj g|=  1.61636D+04

At iterate   12    f=  6.51166D+07    |proj g|=  1.12916D+04

At iterate   13    f=  6.51146D+07    |proj g|=  1.99238D+04

At iterate   14    f=  6.51146D+07    |proj g|=  2.96250D+04

At iterate   15    f=  6.51141D+07    |proj g|=  2.42090D+04

At iterate   16    f=  6.51080D+07    |proj g|=  1.20949D+04

At iterate   17    f=  6.50489D+07    |proj g|=  1.71996D+04

At iterate   18    f=  6.50225D+07    |proj g|=  9.16338D+04

At iterate   19    f=  6.50087D+07    |proj g|=  9.16736D+04

At iterate   20    f=  6.50048D+07    |proj g|=  9.59532D+04

At iterate   21    f=  6.50031D+07    |proj g|=  2.47631D+04

At iterate   22    f=  6.50029D+07    |proj g|=  5.89668D+04

At iterate   23    f=  6.50022D+07    |proj g|=  3.84990D+04

At iterate   24    f=  6.50017D+07    |proj g|=  1.50569D+04

At iterate   25    f=  6.50012D+07    |proj g|=  1.75116D+04

At iterate   26    f=  6.50005D+07    |proj g|=  1.19056D+04

At iter

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  self.n_iter_ = _check_optimize_result("lbfgs", opt_res, self.max_iter)
 This problem is unconstrained.



At iterate    4    f=  6.52863D+07    |proj g|=  2.28201D+04

At iterate    5    f=  6.52850D+07    |proj g|=  2.27793D+04

At iterate    6    f=  6.52794D+07    |proj g|=  4.21356D+04

At iterate    7    f=  6.52703D+07    |proj g|=  4.45888D+04

At iterate    8    f=  6.52556D+07    |proj g|=  3.81764D+04

At iterate    9    f=  6.52533D+07    |proj g|=  2.04216D+04

At iterate   10    f=  6.52531D+07    |proj g|=  2.42970D+04

At iterate   11    f=  6.52465D+07    |proj g|=  7.97171D+04

At iterate   12    f=  6.52447D+07    |proj g|=  6.03162D+04

At iterate   13    f=  6.52440D+07    |proj g|=  4.57038D+04

At iterate   14    f=  6.52272D+07    |proj g|=  5.86554D+04

At iterate   15    f=  6.51753D+07    |proj g|=  1.09481D+05

At iterate   16    f=  6.51520D+07    |proj g|=  1.96224D+05

At iterate   17    f=  6.51213D+07    |proj g|=  3.81643D+04

At iterate   18    f=  6.51099D+07    |proj g|=  1.02130D+05

At iterate   19    f=  6.50956D+07    |proj g|=  5.60829D+04

At iter


   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
 This problem is unconstrained.



At iterate    2    f=  6.47285D+07    |proj g|=  1.18886D+05

At iterate    3    f=  6.47068D+07    |proj g|=  3.06556D+04

At iterate    4    f=  6.47056D+07    |proj g|=  1.85727D+04

At iterate    5    f=  6.47045D+07    |proj g|=  2.00432D+04

At iterate    6    f=  6.47002D+07    |proj g|=  3.86278D+04

At iterate    7    f=  6.46902D+07    |proj g|=  4.35889D+04

At iterate    8    f=  6.46848D+07    |proj g|=  1.08088D+05

At iterate    9    f=  6.46817D+07    |proj g|=  8.05359D+04

At iterate   10    f=  6.46808D+07    |proj g|=  4.05523D+04

At iterate   11    f=  6.46784D+07    |proj g|=  3.25819D+04

At iterate   12    f=  6.46780D+07    |proj g|=  1.92259D+04

At iterate   13    f=  6.46774D+07    |proj g|=  2.03235D+04

At iterate   14    f=  6.46759D+07    |proj g|=  3.69663D+04

At iterate   15    f=  6.46735D+07    |proj g|=  2.53540D+04

At iterate   16    f=  6.46584D+07    |proj g|=  1.55584D+05

At iterate   17    f=  6.46297D+07    |proj g|=  1.58511D+05

At iter

 This problem is unconstrained.



At iterate    3    f=  6.44595D+07    |proj g|=  2.76861D+04

At iterate    4    f=  6.44583D+07    |proj g|=  1.55259D+04

At iterate    5    f=  6.44572D+07    |proj g|=  2.30438D+04

At iterate    6    f=  6.44558D+07    |proj g|=  2.59110D+04

At iterate    7    f=  6.44520D+07    |proj g|=  2.25151D+04

At iterate    8    f=  6.44480D+07    |proj g|=  2.42748D+04

At iterate    9    f=  6.44472D+07    |proj g|=  2.27115D+04

At iterate   10    f=  6.44469D+07    |proj g|=  2.17017D+04

At iterate   11    f=  6.44465D+07    |proj g|=  1.13538D+04

At iterate   12    f=  6.44460D+07    |proj g|=  2.52579D+04

At iterate   13    f=  6.44402D+07    |proj g|=  5.40626D+04

At iterate   14    f=  6.44328D+07    |proj g|=  3.29970D+04

At iterate   15    f=  6.44055D+07    |proj g|=  3.97972D+04

At iterate   16    f=  6.43865D+07    |proj g|=  8.42007D+04

At iterate   17    f=  6.43728D+07    |proj g|=  6.94038D+04

At iterate   18    f=  6.43661D+07    |proj g|=  1.12107D+05

At iter

 This problem is unconstrained.



At iterate    2    f=  6.51367D+07    |proj g|=  1.61044D+05

At iterate    3    f=  6.50990D+07    |proj g|=  3.76367D+04

At iterate    4    f=  6.50967D+07    |proj g|=  1.33960D+04

At iterate    5    f=  6.50966D+07    |proj g|=  1.65132D+04

At iterate    6    f=  6.50957D+07    |proj g|=  2.91063D+04

At iterate    7    f=  6.50948D+07    |proj g|=  2.19402D+04

At iterate    8    f=  6.50931D+07    |proj g|=  1.32181D+04

At iterate    9    f=  6.50574D+07    |proj g|=  9.55867D+04

At iterate   10    f=  6.49848D+07    |proj g|=  9.85943D+04

At iterate   11    f=  6.49820D+07    |proj g|=  6.62805D+04

At iterate   12    f=  6.49399D+07    |proj g|=  4.73996D+04

At iterate   13    f=  6.49380D+07    |proj g|=  1.95471D+05

At iterate   14    f=  6.49325D+07    |proj g|=  8.04230D+04

At iterate   15    f=  6.49291D+07    |proj g|=  7.11423D+04

At iterate   16    f=  6.49195D+07    |proj g|=  6.57416D+04

At iterate   17    f=  6.49161D+07    |proj g|=  8.51990D+04

At iter

 This problem is unconstrained.



At iterate    1    f=  1.57880D+08    |proj g|=  2.25140D+05

At iterate    2    f=  6.48623D+07    |proj g|=  1.11816D+05

At iterate    3    f=  6.48428D+07    |proj g|=  1.93177D+04

At iterate    4    f=  6.48422D+07    |proj g|=  1.55771D+04

At iterate    5    f=  6.48419D+07    |proj g|=  1.72626D+04

At iterate    6    f=  6.48407D+07    |proj g|=  2.41238D+04

At iterate    7    f=  6.48397D+07    |proj g|=  1.98571D+04

At iterate    8    f=  6.48346D+07    |proj g|=  2.30745D+04

At iterate    9    f=  6.48337D+07    |proj g|=  1.11479D+04

At iterate   10    f=  6.48329D+07    |proj g|=  1.64504D+04

At iterate   11    f=  6.48327D+07    |proj g|=  1.20526D+04

At iterate   12    f=  6.48298D+07    |proj g|=  6.83745D+03

At iterate   13    f=  6.48116D+07    |proj g|=  9.37125D+04

At iterate   14    f=  6.47632D+07    |proj g|=  6.41403D+04

At iterate   15    f=  6.47346D+07    |proj g|=  5.73051D+04

At iterate   16    f=  6.47160D+07    |proj g|=  1.64181D+05

At iter

array([ 0.07541978,  0.07708261,  0.15364231, -0.16781229,  0.08781898,
        0.04960831,  0.0239296 ,  0.08089971,  0.10710512,  0.14930951,
        0.08985837,  0.14734728])

In [80]:
# Mean of the fold scores; for a regressor, cross_val_score returns R^2 per fold, not accuracy
print("Mean R^2: %.2f%%" % (resultado.mean() * 100.0))

Mean R^2: 7.29%
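
The mean alone hides how much the folds disagree. The sketch below reports the mean together with the standard deviation, using the twelve fold scores copied from the `cross_val_score` output above (`resultado_demo`, `mean_r2`, and `std_r2` are illustrative names).

```python
import numpy as np

# Fold scores copied from the cross_val_score output above.
resultado_demo = np.array([
    0.07541978, 0.07708261, 0.15364231, -0.16781229, 0.08781898, 0.04960831,
    0.0239296, 0.08089971, 0.10710512, 0.14930951, 0.08985837, 0.14734728,
])

mean_r2 = resultado_demo.mean()
std_r2 = resultado_demo.std()   # population standard deviation (ddof=0)

print("Mean R^2: %.2f%%" % (mean_r2 * 100.0))
print("Std  R^2: %.2f%%" % (std_r2 * 100.0))
```

One fold has a negative score (-0.168), so on that fold the network did worse than predicting the mean charge.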


## **RESULTS**

Paired R^2 values are listed as train/test, matching the neural network run above (0.13 train, 0.11 test).

**MULTIPLE LINEAR REGRESSION:** R^2 = 0.10/0.14; RMSE = 11644.32. Cross-validation R^2: 10.30%.

**POLYNOMIAL REGRESSION:** R^2 = 0.08; RMSE = 20178.48.

**SVR REGRESSION:** R^2 = -0.09/-0.06; RMSE = 12985.28. Cross-validation R^2: -9.45%.

**DECISION TREE REGRESSION:** R^2 = 0.32/0.06; RMSE = 12267.58. Cross-validation R^2: 1.50%.

**RANDOM FOREST REGRESSION:** R^2 = 0.27/0.14; RMSE = 11708.37. Cross-validation R^2: 6.14%.

**XGBOOST REGRESSION:** R^2 = 0.30/0.11; RMSE = 11894.67. Cross-validation R^2: 4.25%.

**LIGHT GBM REGRESSION:** R^2 = 0.20/0.13; RMSE = 11765.60. Cross-validation R^2: 6.80%.

**CATBOOST REGRESSION:** R^2 = 0.23/0.15; RMSE = 11765.60. Cross-validation R^2: 7.36%.

**NEURAL NETWORK REGRESSION:** R^2 = 0.13/0.11; RMSE = 11655.95. Cross-validation R^2: 7.29%.

## **Feature Scaling (Standardization)**

In [29]:
from sklearn.preprocessing import StandardScaler
x_scaler = StandardScaler()
x_treino_scaler = x_scaler.fit_transform(x_treino)

In [31]:
x_treino_scaler

array([[ 0.05327517, -0.70150711, -0.05467118],
       [ 1.12799963, -0.44487061, -0.52922816],
       [ 0.60711128, -0.79792304,  0.230063  ],
       ...,
       [-0.33111532, -0.36121561, -0.33940537],
       [-0.31699486,  0.84398345, -0.29194967],
       [-0.33268427, -0.38815536, -0.90887374]])
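
Standardization subtracts each column's mean and divides by its standard deviation, so every feature ends up with mean 0 and unit variance. A minimal sketch of what `fit_transform` and `transform` do, on made-up columns standing in for age, bmi, and children (all names and numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up columns on different scales, standing in for age, bmi, children.
x_train_demo = rng.normal(loc=[40.0, 30.0, 1.0], scale=[14.0, 6.0, 1.2], size=(100, 3))
x_test_demo = rng.normal(loc=[40.0, 30.0, 1.0], scale=[14.0, 6.0, 1.2], size=(40, 3))

# "fit": the statistics come from the training rows only.
mean_ = x_train_demo.mean(axis=0)
scale_ = x_train_demo.std(axis=0)   # ddof=0, as StandardScaler uses

# "transform": the same statistics are applied to both sets.
x_train_scaled = (x_train_demo - mean_) / scale_
x_test_scaled = (x_test_demo - mean_) / scale_

print(x_train_scaled.mean(axis=0).round(6), x_train_scaled.std(axis=0).round(6))
```

The test set is transformed with the training statistics, so its columns are close to, but not exactly, mean 0 and variance 1. This keeps information about the test rows out of the fitting step.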

In [32]:
y_scaler = StandardScaler()
y_treino_scaler = y_scaler.fit_transform(y_treino.reshape(-1,1))

In [33]:
y_treino_scaler

array([[-1.05925606e-02],
       [ 6.46900118e-01],
       [ 2.85923746e-01],
       [-1.13728667e-01],
       [ 1.44111599e-01],
       [-7.84113359e-01],
       [-1.24822584e+00],
       [-2.81324840e-01],
       [-1.39512694e-01],
       [-1.01616960e+00],
       [ 2.00056152e+00],
       [ 1.21414870e+00],
       [ 1.27860877e+00],
       [-1.52404707e-01],
       [ 9.04740384e-01],
       [ 2.29945267e-03],
       [-2.81324840e-01],
       [ 2.73031732e-01],
       [-2.81324840e-01],
       [ 2.73031732e-01],
       [ 4.27735892e-01],
       [ 2.60139719e-01],
       [-1.06773765e+00],
       [-1.17087376e+00],
       [ 3.49603506e+00],
       [ 9.04740384e-01],
       [-2.07331469e+00],
       [ 1.69895626e-01],
       [ 4.27735892e-01],
       [ 2.76119030e+00],
       [-3.45784907e-01],
       [ 2.73031732e-01],
       [-1.48028208e+00],
       [ 1.57003612e-01],
       [-1.91080747e-01],
       [-1.01616960e+00],
       [ 3.37491799e-01],
       [-2.34845740e-02],
       [ 1.9

In [34]:
x_teste_scaler = x_scaler.transform(x_teste)
x_teste_scaler

array([[-6.24507256e-01, -6.20687880e-01,  1.17917695e+00],
       [ 9.56985082e-01, -8.43295235e-01, -2.61727885e+00],
       [-1.30072075e+00,  1.98112421e+00, -1.81053199e+00],
       [ 9.72674490e-01,  9.82935809e-01,  7.99531373e-01],
       [ 2.90185237e-01, -5.72479917e-01, -3.39405369e-01],
       [ 2.72926888e-01,  9.46070896e-01,  7.99531373e-01],
       [-2.08519115e+00,  2.33134087e+00, -1.81053199e+00],
       [-1.88341711e-01, -2.51777567e-02,  7.99531373e-01],
       [-1.08325729e-01, -2.13755962e-01, -2.44493974e-01],
       [-4.18976010e-01,  1.39296468e-01,  1.17917695e+00],
       [-1.41892805e-02,  1.26651206e+00,  7.99531373e-01],
       [-6.19800434e-01,  4.03022379e-01,  7.99531373e-01],
       [-2.99736508e-01, -7.29864736e-01,  5.14797187e-01],
       [-2.51664988e+00,  3.05162455e+00,  7.99531373e-01],
       [ 7.76556889e-01, -4.85989161e-01,  1.13172126e+00],
       [-7.45315699e-01,  6.32719141e-01,  1.27408835e+00],
       [ 7.36714050e-02, -1.24429444e-01

In [35]:
y_teste_scaler = y_scaler.transform(y_teste.reshape(-1,1))
y_teste_scaler

array([[-2.29756787e-01],
       [ 1.08522857e+00],
       [-1.06773765e+00],
       [ 7.50036225e-01],
       [ 1.18327572e-01],
       [-5.90733160e-01],
       [-1.27400987e+00],
       [ 1.18327572e-01],
       [-3.63765873e-02],
       [-2.68432827e-01],
       [-1.48028208e+00],
       [-1.48028208e+00],
       [-1.65296720e-01],
       [-4.87597053e-01],
       [ 7.50036225e-01],
       [-7.84113359e-01],
       [-2.16864774e-01],
       [ 2.08571666e-01],
       [ 9.25435458e-02],
       [ 1.13679662e+00],
       [ 1.20125669e+00],
       [ 1.44620494e+00],
       [-1.89282650e+00],
       [-8.79446405e-02],
       [-5.39165106e-01],
       [ 3.13505869e+00],
       [ 1.84585736e+00],
       [ 2.65805419e+00],
       [ 2.29945267e-03],
       [-6.21606139e-02],
       [-8.09897386e-01],
       [ 2.29945267e-03],
       [-1.52404707e-01],
       [ 2.52913406e+00],
       [-6.21606139e-02],
       [ 2.58070211e+00],
       [-2.42648800e-01],
       [-1.14508973e+00],
       [ 8.2

In [36]:
redes = MLPRegressor(hidden_layer_sizes=(6,6,6), activation='relu', verbose=True, max_iter=1500,
                    solver='lbfgs', random_state = 12)

In [38]:
redes.fit(x_treino_scaler, y_treino_scaler.ravel())

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =          115     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.46378D+00    |proj g|=  1.97544D+00

At iterate    1    f=  4.31094D-01    |proj g|=  2.78340D-01

At iterate    2    f=  3.91974D-01    |proj g|=  1.95031D-01

At iterate    3    f=  3.17777D-01    |proj g|=  1.69936D-01

At iterate    4    f=  2.87962D-01    |proj g|=  2.23520D-01

At iterate    5    f=  2.45481D-01    |proj g|=  2.78317D-01

At iterate    6    f=  1.97417D-01    |proj g|=  1.04130D-01

At iterate    7    f=  1.81568D-01    |proj g|=  1.07492D-01

At iterate    8    f=  1.60761D-01    |proj g|=  1.30627D-01

At iterate    9    f=  1.26457D-01    |proj g|=  1.21617D-01

At iterate   10    f=  1.17529D-01    |proj g|=  1.12683D-01

At iterate   11    f=  1.03539D-01    |proj g|=  2.61634D-02

At iterate   12    f=  1.00173D-01    |proj g|=  3.17116D-02

At iterate   13    f=  9.3

 This problem is unconstrained.



At iterate   37    f=  7.13029D-02    |proj g|=  1.17874D-02

At iterate   38    f=  7.10690D-02    |proj g|=  4.42663D-03

At iterate   39    f=  7.09035D-02    |proj g|=  3.29202D-03

At iterate   40    f=  7.05117D-02    |proj g|=  6.81710D-03

At iterate   41    f=  7.04289D-02    |proj g|=  9.45045D-03

At iterate   42    f=  7.01716D-02    |proj g|=  5.03428D-03

At iterate   43    f=  6.99832D-02    |proj g|=  4.18464D-03

At iterate   44    f=  6.98241D-02    |proj g|=  1.58673D-02

At iterate   45    f=  6.95533D-02    |proj g|=  6.47964D-03

At iterate   46    f=  6.95228D-02    |proj g|=  1.00810D-02

At iterate   47    f=  6.91868D-02    |proj g|=  7.85405D-03

At iterate   48    f=  6.89923D-02    |proj g|=  9.37762D-03

At iterate   49    f=  6.87619D-02    |proj g|=  4.56974D-03

At iterate   50    f=  6.85360D-02    |proj g|=  7.41762D-03

At iterate   51    f=  6.82562D-02    |proj g|=  1.87878D-02

At iterate   52    f=  6.80253D-02    |proj g|=  9.91262D-03

At iter

In [39]:
redes.n_layers_

5

In [40]:
redes.score(x_treino_scaler, y_treino_scaler)

0.8840227764062709

**TEST**

In [41]:
redes.score(x_teste_scaler, y_teste_scaler)

0.8259255442446645
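
The scaled model can also be cross-validated without scaling the full dataset up front, by wrapping the scalers and the network in a single estimator. The sketch below is one possible arrangement under stated assumptions: it uses synthetic stand-in data rather than the insurance file, and the names `x_demo`, `y_demo`, `modelo_demo`, and `kfold_demo` are hypothetical. `make_pipeline` and `TransformedTargetRegressor` are scikit-learn utilities that this notebook does not otherwise use.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in data (not the insurance file): 3 features, noisy linear target.
x_demo = rng.normal(size=(300, 3)) * [14.0, 6.0, 1.2] + [40.0, 30.0, 1.0]
y_demo = 250.0 * x_demo[:, 0] + 300.0 * x_demo[:, 1] + rng.normal(scale=500.0, size=300)

# Features are scaled inside the pipeline; the target is scaled for fitting
# and un-scaled for prediction by TransformedTargetRegressor.
modelo_demo = TransformedTargetRegressor(
    regressor=make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(6, 6, 6), solver="lbfgs",
                     max_iter=1500, random_state=12),
    ),
    transformer=StandardScaler(),
)

kfold_demo = KFold(n_splits=5, shuffle=True, random_state=5)
scores = cross_val_score(modelo_demo, x_demo, y_demo, cv=kfold_demo)
print(scores.shape)
```

Because the scalers are refit inside every fold, each validation fold is scored by a model that never saw its rows, including during scaling.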

In [42]:
previsoes_teste_scaler = redes.predict(x_teste_scaler)

In [43]:
previsoes_teste_scaler

array([-2.59888882e-01,  1.27507402e+00, -8.38271864e-01, -8.78397518e-01,
        1.07502723e-01, -9.22845124e-01, -7.70075099e-01, -2.99692112e-01,
        1.05960786e-02, -4.33305555e-01, -1.17229051e+00, -5.85023244e-01,
        2.57793720e-02, -1.36920206e+00,  3.06571741e-02, -7.72445900e-01,
       -2.93163530e-02,  6.42772666e-01, -1.22064038e-01,  1.20701089e+00,
        1.39681005e+00,  1.64978821e+00, -1.31488238e+00,  1.58250721e-01,
       -8.97689203e-02,  2.83796888e+00,  1.86009300e+00,  2.77549340e+00,
       -3.44717212e-01, -1.34956555e-01, -8.45018767e-01, -3.65488136e-01,
        7.48788409e-02,  1.55981906e+00, -6.63844097e-02,  2.67788981e+00,
       -1.08754382e-02, -1.52965729e+00,  1.40450239e-01,  5.85970223e-01,
       -1.15158864e+00, -4.82106561e-01, -8.46105539e-02,  1.64382302e-01,
       -1.92245345e-01, -4.15692378e-01,  9.00289013e-02, -8.11893407e-01,
        1.66158884e+00, -1.35306354e-01,  1.57635806e+00,  3.56088478e-02,
        8.15563821e-01, -

## **METRICS**

**Reverting the transformation**

In [45]:
previsoes_teste_inverse = y_scaler.inverse_transform(previsoes_teste_scaler.reshape(-1,1))

In [47]:
previsoes_teste_inverse

array([[412991.73644368],
       [663024.21874183],
       [318778.03730495],
       [312241.90745754],
       [472836.72358515],
       [305001.76818305],
       [329886.71395796],
       [406508.12670469],
       [457051.45024426],
       [384743.62623457],
       [264369.22321282],
       [360030.09691612],
       [459524.68040404],
       [232293.99632532],
       [460319.23313607],
       [329500.53055277],
       [450550.05248043],
       [560027.67296014],
       [435442.23848138],
       [651937.31021844],
       [682853.98907936],
       [724061.99309271],
       [241142.2128865 ],
       [481103.14351972],
       [440702.84024135],
       [917606.58137768],
       [758318.86656135],
       [907429.85379097],
       [399173.91850129],
       [433342.1565315 ],
       [317679.02394911],
       [395790.51046533],
       [467522.5688776 ],
       [709406.77849354],
       [444511.97936438],
       [891531.05405976],
       [453553.92163039],
       [206157.19489692],
       [4782

In [48]:
y_teste

array([ 417900.,  632100.,  281400.,  577500.,  474600.,  359100.,
        247800.,  474600.,  449400.,  411600.,  214200.,  214200.,
        428400.,  375900.,  577500.,  327600.,  420000.,  489300.,
        470400.,  640500.,  651000.,  690900.,  147000.,  441000.,
        367500.,  966000.,  756000.,  888300.,  455700.,  445200.,
        323400.,  455700.,  430500.,  867300.,  445200.,  875700.,
        415800.,  268800.,  590100.,  497700.,  231000.,  315000.,
        388500.,  449400.,  413700.,  352800.,  453600.,  306600.,
        898800.,  514500.,  743400.,  474600.,  600600.,  304500.,
        661500.,  489300.,  422100.,  184800.,  525000.,  249900.,
        407400.,  361200.,  428400.,  392700.,  428400.,  472500.,
        258300.,  550200.,  346500.,  199500.,  302400.,  611100.,
        396900.,  585900.,  279300.,  483000.,  462000.,  218400.,
        518700.,  420000.,  392700.,  980700.,  455700.,  514500.,
        480900.,  520800.,  485100.,  525000.,  390600.,  5691

In [49]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [50]:
# Mean absolute error (MAE)
mean_absolute_error(y_teste, previsoes_teste_inverse)

54239.75771276947

In [51]:
# Root mean squared error (RMSE)
np.sqrt(mean_squared_error(y_teste, previsoes_teste_inverse))

71067.4486827693
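
Reverting the target scaling is what puts MAE and RMSE back into the original units. Since `inverse_transform` multiplies by the training standard deviation and adds the training mean, the RMSE in original units equals the scaled RMSE times that standard deviation. A small sketch on made-up numbers (all arrays and names are illustrative, and the "model output" is simulated noise):

```python
import numpy as np

rng = np.random.default_rng(1)
y_train_demo = rng.normal(loc=13000.0, scale=12000.0, size=200)  # made-up training targets
y_test_demo = rng.normal(loc=13000.0, scale=12000.0, size=80)    # made-up test targets

mean_, scale_ = y_train_demo.mean(), y_train_demo.std()          # what y_scaler would learn

y_test_scaled = (y_test_demo - mean_) / scale_
pred_scaled = y_test_scaled + rng.normal(scale=0.3, size=80)     # simulated model output

# inverse_transform: multiply by the scale, then add the mean back.
pred_original = pred_scaled * scale_ + mean_

rmse_scaled = np.sqrt(np.mean((y_test_scaled - pred_scaled) ** 2))
rmse_original = np.sqrt(np.mean((y_test_demo - pred_original) ** 2))

print(np.isclose(rmse_original, rmse_scaled * scale_))
```

R^2, by contrast, is unchanged by this rescaling, which is why the scores above can be computed directly on the scaled arrays.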

**Reverting the transformation for the remaining arrays**

In [52]:
x_treino_inverse = x_scaler.inverse_transform(x_treino_scaler)

In [53]:
x_treino_inverse

array([[ 6.266,  7.9  , 18.4  ],
       [ 6.951,  9.71 , 17.4  ],
       [ 6.619,  7.22 , 19.   ],
       ...,
       [ 6.021, 10.3  , 17.8  ],
       [ 6.03 , 18.8  , 17.9  ],
       [ 6.02 , 10.11 , 16.6  ]])

In [54]:
x_treino

array([[ 6.266,  7.9  , 18.4  ],
       [ 6.951,  9.71 , 17.4  ],
       [ 6.619,  7.22 , 19.   ],
       ...,
       [ 6.021, 10.3  , 17.8  ],
       [ 6.03 , 18.8  , 17.9  ],
       [ 6.02 , 10.11 , 16.6  ]])

In [55]:
y_treino_inverse = y_scaler.inverse_transform(y_treino_scaler)

In [56]:
y_treino_inverse

array([[ 453600.],
       [ 560700.],
       [ 501900.],
       [ 436800.],
       [ 478800.],
       [ 327600.],
       [ 252000.],
       [ 409500.],
       [ 432600.],
       [ 289800.],
       [ 781200.],
       [ 653100.],
       [ 663600.],
       [ 430500.],
       [ 602700.],
       [ 455700.],
       [ 409500.],
       [ 499800.],
       [ 409500.],
       [ 499800.],
       [ 525000.],
       [ 497700.],
       [ 281400.],
       [ 264600.],
       [1024800.],
       [ 602700.],
       [ 117600.],
       [ 483000.],
       [ 525000.],
       [ 905100.],
       [ 399000.],
       [ 499800.],
       [ 214200.],
       [ 480900.],
       [ 424200.],
       [ 289800.],
       [ 510300.],
       [ 451500.],
       [ 766500.],
       [ 510300.],
       [ 369600.],
       [ 466200.],
       [ 312900.],
       [ 407400.],
       [ 273000.],
       [ 392700.],
       [ 294000.],
       [ 556500.],
       [ 178500.],
       [ 728700.],
       [ 157500.],
       [ 367500.],
       [ 327

In [57]:
x_teste_inverse = x_scaler.inverse_transform(x_teste_scaler)

In [58]:
x_teste_inverse

array([[ 5.834,  8.47 , 21.   ],
       [ 6.842,  6.9  , 13.   ],
       [ 5.403, 26.82 , 14.7  ],
       [ 6.852, 19.78 , 20.2  ],
       [ 6.417,  8.81 , 17.8  ],
       [ 6.406, 19.52 , 20.2  ],
       [ 4.903, 29.29 , 14.7  ],
       [ 6.112, 12.67 , 20.2  ],
       [ 6.163, 11.34 , 18.   ],
       [ 5.965, 13.83 , 21.   ],
       [ 6.223, 21.78 , 20.2  ],
       [ 5.837, 15.69 , 20.2  ],
       [ 6.041,  7.7  , 19.6  ],
       [ 4.628, 34.37 , 20.2  ],
       [ 6.727,  9.42 , 20.9  ],
       [ 5.757, 17.31 , 21.2  ],
       [ 6.279, 11.97 , 18.7  ],
       [ 6.51 ,  7.39 , 14.7  ],
       [ 5.807, 16.03 , 18.6  ],
       [ 6.739,  4.69 , 15.2  ],
       [ 7.327, 11.25 , 13.   ],
       [ 7.135,  4.45 , 17.   ],
       [ 4.519, 36.98 , 20.2  ],
       [ 5.85 ,  8.77 , 19.2  ],
       [ 5.569, 15.1  , 19.2  ],
       [ 7.645,  3.01 , 14.9  ],
       [ 7.333,  7.79 , 13.   ],
       [ 7.61 ,  3.11 , 14.7  ],
       [ 6.395, 13.27 , 20.2  ],
       [ 6.019, 12.92 , 19.2  ],
       [ 6

In [59]:
y_teste_inverse = y_scaler.inverse_transform(y_teste_scaler)

In [60]:
y_teste_inverse

array([[ 417900.],
       [ 632100.],
       [ 281400.],
       [ 577500.],
       [ 474600.],
       [ 359100.],
       [ 247800.],
       [ 474600.],
       [ 449400.],
       [ 411600.],
       [ 214200.],
       [ 214200.],
       [ 428400.],
       [ 375900.],
       [ 577500.],
       [ 327600.],
       [ 420000.],
       [ 489300.],
       [ 470400.],
       [ 640500.],
       [ 651000.],
       [ 690900.],
       [ 147000.],
       [ 441000.],
       [ 367500.],
       [ 966000.],
       [ 756000.],
       [ 888300.],
       [ 455700.],
       [ 445200.],
       [ 323400.],
       [ 455700.],
       [ 430500.],
       [ 867300.],
       [ 445200.],
       [ 875700.],
       [ 415800.],
       [ 268800.],
       [ 590100.],
       [ 497700.],
       [ 231000.],
       [ 315000.],
       [ 388500.],
       [ 449400.],
       [ 413700.],
       [ 352800.],
       [ 453600.],
       [ 306600.],
       [ 898800.],
       [ 514500.],
       [ 743400.],
       [ 474600.],
       [ 600

In [61]:
y_teste

array([ 417900.,  632100.,  281400.,  577500.,  474600.,  359100.,
        247800.,  474600.,  449400.,  411600.,  214200.,  214200.,
        428400.,  375900.,  577500.,  327600.,  420000.,  489300.,
        470400.,  640500.,  651000.,  690900.,  147000.,  441000.,
        367500.,  966000.,  756000.,  888300.,  455700.,  445200.,
        323400.,  455700.,  430500.,  867300.,  445200.,  875700.,
        415800.,  268800.,  590100.,  497700.,  231000.,  315000.,
        388500.,  449400.,  413700.,  352800.,  453600.,  306600.,
        898800.,  514500.,  743400.,  474600.,  600600.,  304500.,
        661500.,  489300.,  422100.,  184800.,  525000.,  249900.,
        407400.,  361200.,  428400.,  392700.,  428400.,  472500.,
        258300.,  550200.,  346500.,  199500.,  302400.,  611100.,
        396900.,  585900.,  279300.,  483000.,  462000.,  218400.,
        518700.,  420000.,  392700.,  980700.,  455700.,  514500.,
        480900.,  520800.,  485100.,  525000.,  390600.,  5691

# **CHALLENGE 4**

Develop an artificial neural network regression algorithm for the dataset at the following link:

https://www.kaggle.com/mirichoi0218/insurance/code