<a href="https://colab.research.google.com/github/CristianoDataScience/Redes_Neurais_Artificiais_Regressao/blob/main/Redes_Neurais_Artificiais_Regressao.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Artificial Neural Networks: Regression

* This project develops a Machine Learning model to predict the median value of homes (MEDV) in Boston neighborhoods.

### Importing Libraries

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
df = pd.read_csv('/content/drive/MyDrive/ml_project/housing.csv', sep=',', encoding='iso-8859-1')
df.head(10)

|   | RM | LSTAT | PTRATIO | MEDV |
|---|---|---|---|---|
| 0 | 6.575 | 4.98 | 15.3 | 504000.0 |
| 1 | 6.421 | 9.14 | 17.8 | 453600.0 |
| 2 | 7.185 | 4.03 | 17.8 | 728700.0 |
| 3 | 6.998 | 2.94 | 18.7 | 701400.0 |
| 4 | 7.147 | 5.33 | 18.7 | 760200.0 |
| 5 | 6.43 | 5.21 | 18.7 | 602700.0 |
| 6 | 6.012 | 12.43 | 15.2 | 480900.0 |
| 7 | 6.172 | 19.15 | 15.2 | 569100.0 |
| 8 | 5.631 | 29.93 | 15.2 | 346500.0 |
| 9 | 6.004 | 17.1 | 15.2 | 396900.0 |


### Attributes
1. RM: the average number of rooms per dwelling in the neighborhood.
2. LSTAT: the percentage of homeowners in the neighborhood considered "lower class" (working poor).
3. PTRATIO: the ratio of students to teachers in the neighborhood's primary and secondary schools.

### Target Variable
1. MEDV: the median value of homes in the neighborhood.

In [4]:
df.shape

(489, 4)

In [5]:
independente = df.iloc[:, 0:3].values
independente

array([[ 6.575,  4.98 , 15.3  ],
       [ 6.421,  9.14 , 17.8  ],
       [ 7.185,  4.03 , 17.8  ],
       ...,
       [ 6.976,  5.64 , 21.   ],
       [ 6.794,  6.48 , 21.   ],
       [ 6.03 ,  7.88 , 21.   ]])

In [6]:
independente.shape

(489, 3)

In [7]:
dependente = df.iloc[:, 3].values
dependente.shape

(489,)
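
The positional slices above (`iloc[:, 0:3]` and `iloc[:, 3]`) depend on the column order of the CSV. As a hedged alternative, the same split can be written with explicit column names. The sketch below uses only the first three rows shown in the `df.head(10)` output as a stand-in for the full file; the names `df_sample`, `feature_columns`, and `target_column` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# First three rows of the table shown above, used only as a stand-in
# for the full CSV (which is not available here).
df_sample = pd.DataFrame(
    {
        "RM": [6.575, 6.421, 7.185],
        "LSTAT": [4.98, 9.14, 4.03],
        "PTRATIO": [15.3, 17.8, 17.8],
        "MEDV": [504000.0, 453600.0, 728700.0],
    }
)

feature_columns = ["RM", "LSTAT", "PTRATIO"]  # predictors (illustrative name)
target_column = "MEDV"                        # target (illustrative name)

# Selecting by name keeps working even if the column order changes.
X = df_sample[feature_columns].to_numpy()
y = df_sample[target_column].to_numpy()

print(X.shape, y.shape)
```

On the full DataFrame, the same two selections should yield arrays equivalent to `independente` and `dependente`, assuming the columns are named as in the table above.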

## Training

In [8]:
x_train, x_test, y_train, y_test = train_test_split(independente, dependente, test_size = 0.3, random_state = 0)

In [9]:
x_train.shape, x_test.shape

((342, 3), (147, 3))

In [10]:
redes = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu', verbose=True, max_iter=2000, solver='lbfgs', random_state= 12)

In [11]:
redes.fit(x_train, y_train)

In [12]:
redes.n_layers_

4
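
The value 4 reported by `n_layers_` counts the input layer, the two hidden layers, and the output layer. The short sketch below reproduces that count and, as a side calculation not present in the original notebook, tallies the trainable parameters implied by 3 input features, `hidden_layer_sizes=(100, 100)`, and a single output. All variable names are illustrative.

```python
# Architecture implied by the notebook: 3 features, two hidden layers, 1 output.
n_features = 3
hidden_layer_sizes = (100, 100)
n_outputs = 1

layer_sizes = [n_features, *hidden_layer_sizes, n_outputs]
n_layers = len(layer_sizes)  # input + hidden layers + output

# One weight per connection between consecutive layers,
# and one bias per unit in every non-input layer.
n_weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
n_biases = sum(layer_sizes[1:])

print(n_layers, n_weights, n_biases, n_weights + n_biases)
```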

In [13]:
redes.score(x_train, y_train)

0.8536909521067109

## Testing

In [14]:
redes.score(x_test, y_test)

0.8178252425873315
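
For a regressor, `score` returns the coefficient of determination R², not an accuracy. A minimal pure-Python sketch of that formula is shown below on toy numbers; the helper name `r2_score_manual` and the toy values are assumptions for illustration and are unrelated to the housing data.

```python
def r2_score_manual(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, the quantity a regressor's score reports."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Toy values (illustrative only).
y_true_toy = [100.0, 200.0, 300.0, 400.0]
y_pred_toy = [110.0, 190.0, 310.0, 390.0]

r2_toy = r2_score_manual(y_true_toy, y_pred_toy)
# A model that always predicts the mean of y_true scores exactly 0.
r2_baseline = r2_score_manual(y_true_toy, [250.0] * 4)

print(r2_toy, r2_baseline)
```

Read this way, the 0.85 (train) and 0.82 (test) scores above mean the network explains roughly 85% and 82% of the variance in `MEDV` on each split.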

In [15]:
previsores_teste = redes.predict(x_test)

In [16]:
previsores_teste

array([ 407767.5422182 ,  701972.97746861,  360968.61260667,
        281563.70376671,  504260.72534338,  294965.6791673 ,
        359661.83736238,  431768.94901103,  442911.66636384,
        397832.99417084,  260141.63435007,  371667.00215915,
        454351.69865628,  226556.5690648 ,  507897.48862296,
        337577.04366966,  443883.56705347,  581355.47934491,
        379434.56248093,  662706.21063195,  656462.36051081,
        707732.77231626,  233167.74937699,  422449.31968087,
        395208.70855643,  894121.71653521,  785687.31679089,
        889086.76826785,  429798.37739983,  432212.51644925,
        304791.7724169 ,  410770.49971032,  454159.07587872,
        717608.9934564 ,  433960.36992074, 1004099.59438876,
        475574.50404788,  253389.73178383,  485572.05145804,
        548798.09174937,  265174.97141844,  377724.48636809,
        423111.58269052,  478653.31057874,  295901.29030843,
        390240.67345036,  501059.85130627,  324797.84885648,
        910002.13737627,

## Metrics

In [17]:
# Mean absolute error (MAE)
print(mean_absolute_error(y_test, previsores_teste))

54889.90145604481


In [18]:
# Root mean squared error (RMSE)
print(np.sqrt(mean_squared_error(y_test, previsores_teste)))

72702.15789942915
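
Both metrics are expressed in the same unit as `MEDV`. RMSE squares the errors before averaging, so it penalizes large misses more heavily and is never smaller than MAE. The sketch below computes both by hand on toy values; the helper names `mae` and `rmse` and the numbers are illustrative assumptions, not part of the notebook.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the average squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy values (illustrative only): three small errors and one large one.
y_true_toy = [100.0, 200.0, 300.0, 400.0]
y_pred_toy = [110.0, 190.0, 300.0, 360.0]

mae_toy = mae(y_true_toy, y_pred_toy)
rmse_toy = rmse(y_true_toy, y_pred_toy)

print(mae_toy, rmse_toy)
```

The gap between the two values reported above (about 54,890 versus 72,702) is consistent with a few test predictions missing by a wide margin.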


## Cross-Validation

In [19]:
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

In [20]:
# Split the data into folds
kfold = KFold(n_splits= 12, shuffle=True, random_state= 5)

In [21]:
# Create the model
modelo = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu', verbose=True, max_iter= 2000, solver='lbfgs', random_state= 12)
resultado = cross_val_score(modelo, independente, dependente, cv= kfold)
resultado

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  self.n_iter_ = _check_optimize_result("lbfgs", opt_res, self.max_iter)
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  self.n_iter_ = _check_optimize_result("lbfgs", opt_res, self.max_iter)


array([0.89287907, 0.87189736, 0.81416127, 0.78567585, 0.80046946,
       0.80653329, 0.56896982, 0.79833847, 0.74399243, 0.68651471,
       0.85013494, 0.5445834 ])
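
The convergence warning above suggests scaling the data. One possible way to do that, sketched here under stated assumptions, is to wrap the network in a pipeline with `StandardScaler` and to standardize the target with `TransformedTargetRegressor`, so that scaling is refit inside each training fold. The data below is synthetic (the housing CSV is not available here), the smaller network is chosen only to keep the example fast, and the names `X_demo`, `y_demo`, `scaled_mlp`, and `kfold_demo` are illustrative. This is not the notebook's method, and no claim is made about how it would change the scores reported above.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three features and a large-valued target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = (
    450_000
    + 90_000 * X_demo[:, 0]
    - 60_000 * X_demo[:, 1]
    + rng.normal(scale=20_000, size=200)
)

# Features are standardized inside the pipeline; the target is standardized
# for training and mapped back to its original unit at prediction time.
scaled_mlp = TransformedTargetRegressor(
    regressor=make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16, 16), solver="lbfgs",
                     max_iter=500, random_state=12),
    ),
    transformer=StandardScaler(),
)

kfold_demo = KFold(n_splits=5, shuffle=True, random_state=5)
scores_demo = cross_val_score(scaled_mlp, X_demo, y_demo, cv=kfold_demo)

print(scores_demo.mean(), scores_demo.std())
```

Because the scalers live inside the estimator passed to `cross_val_score`, they are fit only on each fold's training portion, which avoids leaking test-fold statistics into training.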

In [22]:
# Summarize the fold scores with their mean and standard deviation
# (for a regressor, cross_val_score returns R², not accuracy)
print("Mean R²: %.2f%%" % (resultado.mean() * 100.0))
print("Standard Deviation: %.2f%%" % (resultado.std() * 100.0))

Mean R²: 76.37%
Standard Deviation: 10.63%
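
The summary above can be checked by hand from the twelve fold scores printed earlier. The sketch below uses only the standard library; note that NumPy's `std()` defaults to the population formula (`ddof=0`), so the sample standard deviation (`ddof=1`) comes out slightly larger. The variable names are illustrative.

```python
import statistics

# Fold scores copied from the cross_val_score output above.
fold_scores = [
    0.89287907, 0.87189736, 0.81416127, 0.78567585, 0.80046946,
    0.80653329, 0.56896982, 0.79833847, 0.74399243, 0.68651471,
    0.85013494, 0.5445834,
]

mean_r2 = statistics.fmean(fold_scores)
population_std = statistics.pstdev(fold_scores)  # same convention as NumPy's default std()
sample_std = statistics.stdev(fold_scores)       # divides by n - 1 instead of n

print(f"Mean R²: {mean_r2 * 100:.2f}%")
print(f"Population std: {population_std * 100:.2f}%")
print(f"Sample std: {sample_std * 100:.2f}%")
```

The spread is driven largely by two folds scoring near 0.55, which is worth keeping in mind when comparing the cross-validated mean with the single train/test split.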


## Results

## **REGRESSION WITH NEURAL NETWORKS:**
1. R² (train/test) = 0.85/0.82;
2. RMSE (test) = 72702.16;
3. Mean cross-validated R² (12 folds) = 76.37%;
4. Standard deviation of cross-validated R² = 10.63%.