# **ARTIFICIAL NEURAL NETWORKS: REGRESSION**

This project aims to develop a machine learning algorithm to predict the median value of homes in Boston.

The data was obtained from Kaggle:

https://www.kaggle.com/schirmerchad/bostonhoustingmlnd

In [4]:
import numpy as np
import pandas as pd

In [5]:
df = pd.read_csv('./housing.csv',
                    sep=',', encoding='iso-8859-1')

In [6]:
df.head()

Unnamed: 0,RM,LSTAT,PTRATIO,MEDV
0,6.575,4.98,15.3,504000.0
1,6.421,9.14,17.8,453600.0
2,7.185,4.03,17.8,728700.0
3,6.998,2.94,18.7,701400.0
4,7.147,5.33,18.7,760200.0


**Predictor attributes**

RM: the average number of rooms among homes in the neighborhood.

LSTAT: the percentage of homeowners in the neighborhood considered "lower class" (working class).

PTRATIO: the ratio of students to teachers in primary and secondary schools in the neighborhood.

**Target variable**

MEDV: the median value of the homes.

In [7]:
df.shape

(489, 4)

In [8]:
independente = df.iloc[:, 0:3].values
independente

array([[ 6.575,  4.98 , 15.3  ],
       [ 6.421,  9.14 , 17.8  ],
       [ 7.185,  4.03 , 17.8  ],
       ...,
       [ 6.976,  5.64 , 21.   ],
       [ 6.794,  6.48 , 21.   ],
       [ 6.03 ,  7.88 , 21.   ]])

In [9]:
independente.shape

(489, 3)

In [10]:
dependente = df.iloc[:, 3].values

In [11]:
dependente.shape

(489,)

## **TRAINING**

In [12]:
from sklearn.model_selection import train_test_split
x_treino, x_teste, y_treino, y_teste = train_test_split(independente, dependente, test_size = 0.3, random_state = 0)

In [13]:
x_treino.shape, x_teste.shape

((342, 3), (147, 3))
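The 342/147 split follows from `test_size = 0.3` applied to 489 rows. As a minimal sketch of the arithmetic, assuming the test count is rounded up and the training set receives the remaining rows (the helper `split_sizes` is hypothetical, not part of scikit-learn):

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test count is rounded up and the training set
    gets whatever is left.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


n_train, n_test = split_sizes(489, 0.3)
print(n_train, n_test)
```

With 489 rows this gives the same counts as the shapes shown above.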

In [14]:
from sklearn.neural_network import MLPRegressor

In [15]:
redes = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu', verbose=True, max_iter=2000,
                    solver='lbfgs', random_state = 12)

In [16]:
redes.fit(x_treino, y_treino)

RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =        10601     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.16928D+11    |proj g|=  3.29234D+06

At iterate    1    f=  1.16918D+11    |proj g|=  4.33329D+06

At iterate    2    f=  2.71782D+10    |proj g|=  4.09900D+08

At iterate    3    f=  2.69460D+10    |proj g|=  2.97406D+08

At iterate    4    f=  1.52752D+10    |proj g|=  2.88989D+08

At iterate    5    f=  8.56902D+09    |proj g|=  1.97168D+08

At iterate    6    f=  8.39413D+09    |proj g|=  1.71036D+08

At iterate    7    f=  8.17655D+09    |proj g|=  5.13025D+07

At iterate    8    f=  7.96029D+09    |proj g|=  8.40343D+07

At iterate    9    f=  7.56165D+09    |proj g|=  1.72619D+08

At iterate   10    f=  6.63223D+09    |proj g|=  2.74530D+08

At iterate   11    f=  6.42420D+09    |proj g|=  2.56539D+08

At iterate   12    f=  5.99207D+09    |proj g|=  1.46870D+08

[... output truncated ...]

 This problem is unconstrained.



At iterate  125    f=  2.60434D+09    |proj g|=  8.48020D+06

At iterate  126    f=  2.60403D+09    |proj g|=  5.17217D+06

At iterate  127    f=  2.60377D+09    |proj g|=  8.94736D+06

At iterate  128    f=  2.60275D+09    |proj g|=  2.35178D+07

At iterate  129    f=  2.60194D+09    |proj g|=  1.71559D+07

At iterate  130    f=  2.60157D+09    |proj g|=  1.25856D+07

At iterate  131    f=  2.60095D+09    |proj g|=  7.58459D+06

At iterate  132    f=  2.60036D+09    |proj g|=  1.08760D+07

At iterate  133    f=  2.59936D+09    |proj g|=  2.48261D+07

At iterate  134    f=  2.59845D+09    |proj g|=  2.06077D+07

At iterate  135    f=  2.59784D+09    |proj g|=  1.84219D+07

At iterate  136    f=  2.59686D+09    |proj g|=  2.99250D+07

At iterate  137    f=  2.59522D+09    |proj g|=  3.33876D+07

At iterate  138    f=  2.59424D+09    |proj g|=  2.37643D+07

At iterate  139    f=  2.59181D+09    |proj g|=  1.37618D+07

At iterate  140    f=  2.58983D+09    |proj g|=  2.06735D+07

[... output truncated ...]


   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.


In [17]:
redes.n_layers_

4
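The value `N = 10601` in the L-BFGS-B log is the number of variables being optimized, and it appears to match the number of weights and biases in a fully connected network with 3 inputs, two hidden layers of 100 units and 1 output. The sketch below reproduces that count and the layer count of 4 (input + hidden layers + output). The helper names are hypothetical and used only for illustration.

```python
def count_parameters(n_inputs, hidden_layer_sizes, n_outputs=1):
    """Count weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    # Each pair of consecutive layers contributes a weight matrix
    # (fan_in * fan_out) plus one bias per unit of the next layer.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


def count_layers(hidden_layer_sizes):
    """Input layer + hidden layers + output layer."""
    return len(hidden_layer_sizes) + 2


n_params = count_parameters(3, (100, 100))
n_layers = count_layers((100, 100))
print(n_params, n_layers)
```

The same arithmetic applied to `hidden_layer_sizes=(6, 6, 6)` gives 115 parameters and 5 layers, which shows how much smaller such a network is.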

In [18]:
redes.score(x_treino, y_treino)

0.8510204805906196

## **TESTING**

In [19]:
redes.score(x_teste, y_teste)

0.8165474544077103
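For a regressor, `score` returns the coefficient of determination R^2, so the values above are not accuracies. As a rough sketch of what is being measured, the function below computes R^2 by hand. The data are small illustrative values, not taken from the dataset.

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only
y_true = [100.0, 200.0, 300.0, 400.0]
mean_prediction = [sum(y_true) / len(y_true)] * len(y_true)

r2_perfect = r2_score_manual(y_true, y_true)            # perfect predictions
r2_baseline = r2_score_manual(y_true, mean_prediction)  # always predict the mean
print(r2_perfect, r2_baseline)
```

A model that always predicts the mean scores 0, and a perfect model scores 1, so a test R^2 near 0.82 means the model explains most of the variance in the held-out prices.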

In [20]:
previsoes_teste = redes.predict(x_teste)

## **METRICS**

In [21]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [22]:
# Mean absolute error (MAE)
mean_absolute_error(y_teste, previsoes_teste)

55416.17551850673

In [23]:
# Root mean squared error (RMSE)
np.sqrt(mean_squared_error(y_teste, previsoes_teste))

72956.68171409974
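Both metrics are expressed in the same unit as the target. The sketch below, on small illustrative numbers rather than the dataset, shows how each is computed and why RMSE is never smaller than MAE: squaring gives large errors more weight.

```python
import math


def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the mean of the squared errors."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Illustrative values only: three small errors and one large error
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 300.0, 440.0]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
print(mae_value, rmse_value)
```

A gap between the two, as in the outputs above, suggests that a few predictions have much larger errors than the rest.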

### **Cross-Validation**

In [24]:
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

In [25]:
# Splitting the data into folds
kfold = KFold(n_splits = 12, shuffle=True, random_state = 5)
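Since 489 is not divisible by 12, the folds cannot all have the same size. A minimal sketch of the fold sizes, assuming the extra rows are assigned one each to the first folds (the helper `fold_sizes` is hypothetical):

```python
def fold_sizes(n_samples, n_splits):
    """Sizes of the validation folds when n_samples is split into n_splits parts.

    Assumption: the first (n_samples % n_splits) folds get one extra row.
    """
    base, extra = divmod(n_samples, n_splits)
    return [base + 1 if i < extra else base for i in range(n_splits)]


sizes = fold_sizes(489, 12)
print(sizes)
```

Each model in the cross-validation is therefore evaluated on only about 40 rows, which helps explain why the per-fold scores vary noticeably.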

In [26]:
# Creating the model
from sklearn.neural_network import MLPRegressor
modelo = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu', verbose=True, max_iter=2000,
                    solver='lbfgs', random_state = 12)
resultado = cross_val_score(modelo, independente, dependente, cv = kfold)
resultado

 This problem is unconstrained.


RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =        10601     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.17207D+11    |proj g|=  3.30086D+06

At iterate    1    f=  1.17197D+11    |proj g|=  4.35095D+06

At iterate    2    f=  2.73537D+10    |proj g|=  3.96555D+08

[... remaining per-iteration L-BFGS-B output for the 12 folds omitted; the captured log was truncated ...]

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  self.n_iter_ = _check_optimize_result("lbfgs", opt_res, self.max_iter)

   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.


array([0.89673189, 0.88708774, 0.81057339, 0.79231959, 0.83500736,
       0.80825629, 0.74871691, 0.81462795, 0.73919502, 0.74344363,
       0.81419476, 0.69957929])

In [27]:
# Mean R^2 across the folds (for a regressor, cross_val_score returns each fold's R^2, not an accuracy)
print("Mean R^2: %.2f%%" % (resultado.mean() * 100.0))

Mean R^2: 79.91%
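The mean alone hides how much the score changes from fold to fold. As a sketch, the snippet below recomputes the mean from the twelve fold scores printed above and adds the standard deviation. The population standard deviation (`statistics.pstdev`) is an assumption, chosen to mirror NumPy's default behavior for `std()`.

```python
import statistics

# Fold scores copied from the cross_val_score output above
scores = [
    0.89673189, 0.88708774, 0.81057339, 0.79231959, 0.83500736,
    0.80825629, 0.74871691, 0.81462795, 0.73919502, 0.74344363,
    0.81419476, 0.69957929,
]

mean_r2 = statistics.mean(scores)
std_r2 = statistics.pstdev(scores)  # population standard deviation

print("Mean R^2: %.2f%%" % (mean_r2 * 100.0))
print("Std of R^2: %.2f percentage points" % (std_r2 * 100.0))
```

Reporting both values makes it easier to compare this model with others evaluated on the same folds.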


## **RESULTS**

**SIMPLE LINEAR REGRESSION:** R^2 = 0.57/0.60; RMSE = 99315.5; Cross-validation R^2: 55.97%.

**MULTIPLE LINEAR REGRESSION:** R^2 = 0.73/0.68; RMSE = 96087.3; Cross-validation R^2: 69.25%.

**POLYNOMIAL REGRESSION:** R^2 = 0.59/0.54; RMSE = 114670.6.

**SVR REGRESSION:** R^2 = 0.87/0.81; RMSE = 73422.7; Cross-validation R^2: 82.37%.

**DECISION TREE REGRESSION:** R^2 = 0.91/0.83; RMSE = 71114.5; Cross-validation R^2: 74.60%.

**RANDOM FOREST REGRESSION:** R^2 = 0.92/0.85; RMSE = 66729.3; Cross-validation R^2: 82.85%.

**XGBOOST REGRESSION:** R^2 = 0.93/0.84; RMSE = 67788.8; Cross-validation R^2: 83.22%.

**LIGHTGBM REGRESSION:** R^2 = 0.88/0.82; RMSE = 71906.4; Cross-validation R^2: 82.38%.

**CATBOOST REGRESSION:** R^2 = 0.90/0.84; RMSE = 69053.3; Cross-validation R^2: 83.40%.

**NEURAL NETWORK REGRESSION:** R^2 = 0.88/0.83; RMSE = 69717.4; Cross-validation R^2: 77.15%. Scaled data.

## **Feature Scaling (Standardization)**

In [28]:
from sklearn.preprocessing import StandardScaler
x_scaler = StandardScaler()
x_treino_scaler = x_scaler.fit_transform(x_treino)

In [29]:
x_treino_scaler

array([[ 0.05327517, -0.70150711, -0.05467118],
       [ 1.12799963, -0.44487061, -0.52922816],
       [ 0.60711128, -0.79792304,  0.230063  ],
       ...,
       [-0.33111532, -0.36121561, -0.33940537],
       [-0.31699486,  0.84398345, -0.29194967],
       [-0.33268427, -0.38815536, -0.90887374]])
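Standardization subtracts the mean of each column and divides by its standard deviation, using statistics learned from the training data only. The sketch below does this for one column in plain Python. The values are a handful of illustrative RM-like numbers, and the helpers `fit_scaler` and `transform` are hypothetical stand-ins for the scaler's `fit` and `transform` steps, assuming the population standard deviation is used.

```python
import statistics


def fit_scaler(values):
    """Learn the mean and population standard deviation of one column."""
    return statistics.mean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Apply z = (x - mean) / std with previously learned statistics."""
    return [(v - mean) / std for v in values]


# Illustrative values only
train_column = [6.266, 6.951, 6.619, 6.021, 6.030, 6.020]
test_column = [5.834, 6.842]

mean, std = fit_scaler(train_column)                 # fit on training data only
train_scaled = transform(train_column, mean, std)
test_scaled = transform(test_column, mean, std)      # reuse training statistics

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics for the test rows keeps information about the test set out of the preprocessing step.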

In [30]:
y_scaler = StandardScaler()
y_treino_scaler = y_scaler.fit_transform(y_treino.reshape(-1,1))

In [31]:
y_treino_scaler

array([[-1.05925606e-02],
       [ 6.46900118e-01],
       [ 2.85923746e-01],
       [-1.13728667e-01],
       [ 1.44111599e-01],
       [-7.84113359e-01],
       [-1.24822584e+00],
       [-2.81324840e-01],
       [-1.39512694e-01],
       [-1.01616960e+00],
       [ 2.00056152e+00],
       [ 1.21414870e+00],
       [ 1.27860877e+00],
       [-1.52404707e-01],
       [ 9.04740384e-01],
       [ 2.29945267e-03],
       [-2.81324840e-01],
       [ 2.73031732e-01],
       [-2.81324840e-01],
       [ 2.73031732e-01],
       [ 4.27735892e-01],
       [ 2.60139719e-01],
       [-1.06773765e+00],
       [-1.17087376e+00],
       [ 3.49603506e+00],
       [ 9.04740384e-01],
       [-2.07331469e+00],
       [ 1.69895626e-01],
       [ 4.27735892e-01],
       [ 2.76119030e+00],
       [-3.45784907e-01],
       [ 2.73031732e-01],
       [-1.48028208e+00],
       [ 1.57003612e-01],
       [-1.91080747e-01],
       [-1.01616960e+00],
       [ 3.37491799e-01],
       [-2.34845740e-02],
       [ 1.9

In [32]:
x_teste_scaler = x_scaler.transform(x_teste)
x_teste_scaler

array([[-6.24507256e-01, -6.20687880e-01,  1.17917695e+00],
       [ 9.56985082e-01, -8.43295235e-01, -2.61727885e+00],
       [-1.30072075e+00,  1.98112421e+00, -1.81053199e+00],
       [ 9.72674490e-01,  9.82935809e-01,  7.99531373e-01],
       [ 2.90185237e-01, -5.72479917e-01, -3.39405369e-01],
       [ 2.72926888e-01,  9.46070896e-01,  7.99531373e-01],
       [-2.08519115e+00,  2.33134087e+00, -1.81053199e+00],
       [-1.88341711e-01, -2.51777567e-02,  7.99531373e-01],
       [-1.08325729e-01, -2.13755962e-01, -2.44493974e-01],
       [-4.18976010e-01,  1.39296468e-01,  1.17917695e+00],
       [-1.41892805e-02,  1.26651206e+00,  7.99531373e-01],
       [-6.19800434e-01,  4.03022379e-01,  7.99531373e-01],
       [-2.99736508e-01, -7.29864736e-01,  5.14797187e-01],
       [-2.51664988e+00,  3.05162455e+00,  7.99531373e-01],
       [ 7.76556889e-01, -4.85989161e-01,  1.13172126e+00],
       [-7.45315699e-01,  6.32719141e-01,  1.27408835e+00],
       [ 7.36714050e-02, -1.24429444e-01

In [33]:
y_teste_scaler = y_scaler.transform(y_teste.reshape(-1,1))
y_teste_scaler

array([[-2.29756787e-01],
       [ 1.08522857e+00],
       [-1.06773765e+00],
       [ 7.50036225e-01],
       [ 1.18327572e-01],
       [-5.90733160e-01],
       [-1.27400987e+00],
       [ 1.18327572e-01],
       [-3.63765873e-02],
       [-2.68432827e-01],
       [-1.48028208e+00],
       [-1.48028208e+00],
       [-1.65296720e-01],
       [-4.87597053e-01],
       [ 7.50036225e-01],
       [-7.84113359e-01],
       [-2.16864774e-01],
       [ 2.08571666e-01],
       [ 9.25435458e-02],
       [ 1.13679662e+00],
       [ 1.20125669e+00],
       [ 1.44620494e+00],
       [-1.89282650e+00],
       [-8.79446405e-02],
       [-5.39165106e-01],
       [ 3.13505869e+00],
       [ 1.84585736e+00],
       [ 2.65805419e+00],
       [ 2.29945267e-03],
       [-6.21606139e-02],
       [-8.09897386e-01],
       [ 2.29945267e-03],
       [-1.52404707e-01],
       [ 2.52913406e+00],
       [-6.21606139e-02],
       [ 2.58070211e+00],
       [-2.42648800e-01],
       [-1.14508973e+00],
       [ 8.2

In [34]:
redes = MLPRegressor(hidden_layer_sizes=(6,6,6), activation='relu', verbose=True, max_iter=1500,
                    solver='lbfgs', random_state = 12)

In [35]:
redes.fit(x_treino_scaler, y_treino_scaler.ravel())

 This problem is unconstrained.


RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =          115     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.46378D+00    |proj g|=  1.97544D+00

At iterate    1    f=  4.31094D-01    |proj g|=  2.78340D-01

At iterate    2    f=  3.91974D-01    |proj g|=  1.95031D-01

At iterate    3    f=  3.17777D-01    |proj g|=  1.69936D-01

At iterate    4    f=  2.87962D-01    |proj g|=  2.23520D-01

At iterate    5    f=  2.45481D-01    |proj g|=  2.78317D-01

At iterate    6    f=  1.97417D-01    |proj g|=  1.04130D-01

At iterate    7    f=  1.81568D-01    |proj g|=  1.07492D-01

At iterate    8    f=  1.60761D-01    |proj g|=  1.30627D-01

At iterate    9    f=  1.26457D-01    |proj g|=  1.21617D-01

At iterate   10    f=  1.17529D-01    |proj g|=  1.12683D-01

At iterate   11    f=  1.03539D-01    |proj g|=  2.61634D-02

At iterate   12    f=  1.00173D-01    |proj g|=  3.17116D-02

[... output truncated ...]


At iterate  857    f=  5.76613D-02    |proj g|=  7.34765D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
  115    857   1089      1     0     0   7.348D-03   5.766D-02
  F =   5.7661339163487847E-002

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             


In [36]:
redes.n_layers_

5

In [37]:
redes.score(x_treino_scaler, y_treino_scaler)

0.8846876257097267

**TESTING**

In [38]:
redes.score(x_teste_scaler, y_teste_scaler)

0.8262650819721504

In [39]:
previsoes_teste_scaler = redes.predict(x_teste_scaler)

In [40]:
previsoes_teste_scaler

array([-0.28986152,  1.29766031, -0.86352149, -0.88422386,  0.13354029,
       -0.92886199, -0.80022877, -0.28526393,  0.01418334, -0.44582461,
       -1.16182068, -0.57631861, -0.03622503, -1.3447516 ,  0.04652836,
       -0.74773912, -0.03095814,  0.64394035, -0.11361627,  1.20448944,
        1.39941369,  1.65804675, -1.28742397,  0.17962618, -0.0555947 ,
        2.84898864,  1.85886882,  2.78270688, -0.37100983, -0.1397362 ,
       -0.85357823, -0.32256198,  0.04979702,  1.55329908, -0.06500968,
        2.68075059, -0.00756254, -1.56630237,  0.14671343,  0.58440398,
       -1.14086931, -0.4041412 , -0.10556309,  0.1745674 , -0.14680319,
       -0.42843982,  0.1126581 , -0.8146019 ,  1.62992123, -0.15052513,
        1.56603039,  0.04144897,  0.78929165, -0.75944142,  1.84190899,
        0.59168607, -0.39851383, -1.09546307,  0.09154048, -0.01132947,
       -0.56712241, -1.04993361, -0.09357453, -0.14730661, -0.46033306,
       -0.15861688, -1.77334929,  0.53667625,  0.47864062, -1.35

## **METRICS**

**Reverting the transformation**

In [42]:
previsoes_teste_inverse = y_scaler.inverse_transform(previsoes_teste_scaler.reshape(-1,1))

In [43]:
previsoes_teste_inverse

array([[408109.44757434],
       [666703.33409063],
       [314665.08674253],
       [311292.84613135],
       [477078.02309254],
       [304021.67204705],
       [324974.93603596],
       [408858.35611317],
       [457635.78482049],
       [382704.37815031],
       [266074.66940893],
       [361448.00876891],
       [449424.6872055 ],
       [236276.7696601 ],
       [462904.51591013],
       [333525.05614887],
       [450282.61878178],
       [560217.878244  ],
       [436818.30876654],
       [651526.58730567],
       [683278.09971215],
       [725407.23982776],
       [245614.95615211],
       [484585.02450617],
       [446269.53158559],
       [919401.6065998 ],
       [758119.45769832],
       [908604.86875503],
       [394891.07329199],
       [432563.59331173],
       [316284.76009882],
       [402782.81887377],
       [463436.95242219],
       [708344.72899059],
       [444735.91114834],
       [891997.05028987],
       [454093.56506069],
       [200188.02230375],
       [4792

In [44]:
y_teste

array([ 417900.,  632100.,  281400.,  577500.,  474600.,  359100.,
        247800.,  474600.,  449400.,  411600.,  214200.,  214200.,
        428400.,  375900.,  577500.,  327600.,  420000.,  489300.,
        470400.,  640500.,  651000.,  690900.,  147000.,  441000.,
        367500.,  966000.,  756000.,  888300.,  455700.,  445200.,
        323400.,  455700.,  430500.,  867300.,  445200.,  875700.,
        415800.,  268800.,  590100.,  497700.,  231000.,  315000.,
        388500.,  449400.,  413700.,  352800.,  453600.,  306600.,
        898800.,  514500.,  743400.,  474600.,  600600.,  304500.,
        661500.,  489300.,  422100.,  184800.,  525000.,  249900.,
        407400.,  361200.,  428400.,  392700.,  428400.,  472500.,
        258300.,  550200.,  346500.,  199500.,  302400.,  611100.,
        396900.,  585900.,  279300.,  483000.,  462000.,  218400.,
        518700.,  420000.,  392700.,  980700.,  455700.,  514500.,
        480900.,  520800.,  485100.,  525000.,  390600.,  5691

In [45]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [46]:
# Mean absolute error (MAE)
mean_absolute_error(y_teste, previsoes_teste_inverse)

54336.54241534761

In [47]:
# Root mean squared error (RMSE)
np.sqrt(mean_squared_error(y_teste, previsoes_teste_inverse))

70998.10519974722
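The predictions are converted back to the original units before computing MAE and RMSE so that the errors can be read as prices. As a sketch of why this matters, the snippet below shows that an RMSE computed on standardized values differs from the RMSE in original units by exactly the factor used to scale the target. The training targets are a few illustrative prices and the scaled predictions are hypothetical.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Illustrative training targets used to "fit" the target scaler
y_train = [453600.0, 560700.0, 501900.0, 436800.0, 478800.0]
mean_y = statistics.mean(y_train)
std_y = statistics.pstdev(y_train)

# Illustrative test targets and hypothetical predictions in standardized units
y_test = [417900.0, 632100.0, 281400.0]
pred_scaled = [-0.3, 1.2, -0.9]

# Inverse transform: back to the original units
pred_original = [p * std_y + mean_y for p in pred_scaled]
y_test_scaled = [(y - mean_y) / std_y for y in y_test]

rmse_scaled = rmse(y_test_scaled, pred_scaled)
rmse_original = rmse(y_test, pred_original)
print(rmse_scaled, rmse_original)
```

The R^2 score, in contrast, is unaffected by this linear rescaling, which is why it can be read directly from the scaled data.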

**Reverting the transformation on the features and targets**

In [48]:
x_treino_inverse = x_scaler.inverse_transform(x_treino_scaler)

In [49]:
x_treino_inverse

array([[ 6.266,  7.9  , 18.4  ],
       [ 6.951,  9.71 , 17.4  ],
       [ 6.619,  7.22 , 19.   ],
       ...,
       [ 6.021, 10.3  , 17.8  ],
       [ 6.03 , 18.8  , 17.9  ],
       [ 6.02 , 10.11 , 16.6  ]])

In [50]:
x_treino

array([[ 6.266,  7.9  , 18.4  ],
       [ 6.951,  9.71 , 17.4  ],
       [ 6.619,  7.22 , 19.   ],
       ...,
       [ 6.021, 10.3  , 17.8  ],
       [ 6.03 , 18.8  , 17.9  ],
       [ 6.02 , 10.11 , 16.6  ]])

In [51]:
y_treino_inverse = y_scaler.inverse_transform(y_treino_scaler)

In [52]:
y_treino_inverse

array([[ 453600.],
       [ 560700.],
       [ 501900.],
       [ 436800.],
       [ 478800.],
       [ 327600.],
       [ 252000.],
       [ 409500.],
       [ 432600.],
       [ 289800.],
       [ 781200.],
       [ 653100.],
       [ 663600.],
       [ 430500.],
       [ 602700.],
       [ 455700.],
       [ 409500.],
       [ 499800.],
       [ 409500.],
       [ 499800.],
       [ 525000.],
       [ 497700.],
       [ 281400.],
       [ 264600.],
       [1024800.],
       [ 602700.],
       [ 117600.],
       [ 483000.],
       [ 525000.],
       [ 905100.],
       [ 399000.],
       [ 499800.],
       [ 214200.],
       [ 480900.],
       [ 424200.],
       [ 289800.],
       [ 510300.],
       [ 451500.],
       [ 766500.],
       [ 510300.],
       [ 369600.],
       [ 466200.],
       [ 312900.],
       [ 407400.],
       [ 273000.],
       [ 392700.],
       [ 294000.],
       [ 556500.],
       [ 178500.],
       [ 728700.],
       [ 157500.],
       [ 367500.],
       [ 327

In [53]:
x_teste_inverse = x_scaler.inverse_transform(x_teste_scaler)

In [54]:
x_teste_inverse

array([[ 5.834,  8.47 , 21.   ],
       [ 6.842,  6.9  , 13.   ],
       [ 5.403, 26.82 , 14.7  ],
       [ 6.852, 19.78 , 20.2  ],
       [ 6.417,  8.81 , 17.8  ],
       [ 6.406, 19.52 , 20.2  ],
       [ 4.903, 29.29 , 14.7  ],
       [ 6.112, 12.67 , 20.2  ],
       [ 6.163, 11.34 , 18.   ],
       [ 5.965, 13.83 , 21.   ],
       [ 6.223, 21.78 , 20.2  ],
       [ 5.837, 15.69 , 20.2  ],
       [ 6.041,  7.7  , 19.6  ],
       [ 4.628, 34.37 , 20.2  ],
       [ 6.727,  9.42 , 20.9  ],
       [ 5.757, 17.31 , 21.2  ],
       [ 6.279, 11.97 , 18.7  ],
       [ 6.51 ,  7.39 , 14.7  ],
       [ 5.807, 16.03 , 18.6  ],
       [ 6.739,  4.69 , 15.2  ],
       [ 7.327, 11.25 , 13.   ],
       [ 7.135,  4.45 , 17.   ],
       [ 4.519, 36.98 , 20.2  ],
       [ 5.85 ,  8.77 , 19.2  ],
       [ 5.569, 15.1  , 19.2  ],
       [ 7.645,  3.01 , 14.9  ],
       [ 7.333,  7.79 , 13.   ],
       [ 7.61 ,  3.11 , 14.7  ],
       [ 6.395, 13.27 , 20.2  ],
       [ 6.019, 12.92 , 19.2  ],
       [ 6

In [55]:
y_teste_inverse = y_scaler.inverse_transform(y_teste_scaler)

In [56]:
y_teste_inverse

array([[ 417900.],
       [ 632100.],
       [ 281400.],
       [ 577500.],
       [ 474600.],
       [ 359100.],
       [ 247800.],
       [ 474600.],
       [ 449400.],
       [ 411600.],
       [ 214200.],
       [ 214200.],
       [ 428400.],
       [ 375900.],
       [ 577500.],
       [ 327600.],
       [ 420000.],
       [ 489300.],
       [ 470400.],
       [ 640500.],
       [ 651000.],
       [ 690900.],
       [ 147000.],
       [ 441000.],
       [ 367500.],
       [ 966000.],
       [ 756000.],
       [ 888300.],
       [ 455700.],
       [ 445200.],
       [ 323400.],
       [ 455700.],
       [ 430500.],
       [ 867300.],
       [ 445200.],
       [ 875700.],
       [ 415800.],
       [ 268800.],
       [ 590100.],
       [ 497700.],
       [ 231000.],
       [ 315000.],
       [ 388500.],
       [ 449400.],
       [ 413700.],
       [ 352800.],
       [ 453600.],
       [ 306600.],
       [ 898800.],
       [ 514500.],
       [ 743400.],
       [ 474600.],
       [ 600

In [57]:
y_teste

array([ 417900.,  632100.,  281400.,  577500.,  474600.,  359100.,
        247800.,  474600.,  449400.,  411600.,  214200.,  214200.,
        428400.,  375900.,  577500.,  327600.,  420000.,  489300.,
        470400.,  640500.,  651000.,  690900.,  147000.,  441000.,
        367500.,  966000.,  756000.,  888300.,  455700.,  445200.,
        323400.,  455700.,  430500.,  867300.,  445200.,  875700.,
        415800.,  268800.,  590100.,  497700.,  231000.,  315000.,
        388500.,  449400.,  413700.,  352800.,  453600.,  306600.,
        898800.,  514500.,  743400.,  474600.,  600600.,  304500.,
        661500.,  489300.,  422100.,  184800.,  525000.,  249900.,
        407400.,  361200.,  428400.,  392700.,  428400.,  472500.,
        258300.,  550200.,  346500.,  199500.,  302400.,  611100.,
        396900.,  585900.,  279300.,  483000.,  462000.,  218400.,
        518700.,  420000.,  392700.,  980700.,  455700.,  514500.,
        480900.,  520800.,  485100.,  525000.,  390600.,  5691

# **CHALLENGE 4**

DEVELOP AN ARTIFICIAL NEURAL NETWORK REGRESSION ALGORITHM FOR THE DATASET AT THE FOLLOWING LINK:

https://www.kaggle.com/mirichoi0218/insurance/code