# Prediction using Deep Learning with numerical data
In this notebook we implement a regression neural network that uses only the numerical data (no images).

We will use an MLP (Multilayer Perceptron) neural network.

## Reading the dataset
We load the train and test datasets that had already been cleaned.

In [1]:
import pandas as pd
import os
from sklearn import preprocessing

# Load the dataset and return it as a Pandas DataFrame
def load_airbnb_dataset(ruta,nombre):
    csv_path = os.path.join(ruta, nombre)
    return pd.read_csv(csv_path, sep=';')

In [2]:
copTrain = load_airbnb_dataset('datasets', 'train_final.csv')
copTest = load_airbnb_dataset('datasets', 'test_final.csv')

## Feature scaling

In [3]:
train_vote_average = copTrain['vote_average']
test_vote_average = copTest['vote_average']

train_sin_y = copTrain.drop(['vote_average'], axis=1, inplace=False)
test_sin_y = copTest.drop(['vote_average'], axis=1, inplace=False)

y_train = train_vote_average.values
X_train = train_sin_y.values

y_test = test_vote_average.values
X_test = test_sin_y.values

# Feature names exclude the target column (vote_average)
feature_names = train_sin_y.columns

In [4]:
from sklearn import preprocessing

scaler = preprocessing.StandardScaler().fit(X_train)
XtrainScaled = scaler.transform(X_train)

# Reuse the scaler fitted on the training set; refitting on the test set
# would scale both sets with different statistics and leak test information
XtestScaled = scaler.transform(X_test)
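The standardization step can be illustrated without scikit-learn. The following is a minimal sketch on toy values, not the notebook's data: the helper names `fit_standardizer` and `standardize` are hypothetical, and the population standard deviation is used because that is what `StandardScaler` computes. The mean and standard deviation are learned from the training column only and then applied to both columns.

```python
import statistics

def fit_standardizer(column):
    """Learn the mean and population standard deviation of a training column."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    # Guard against constant columns to avoid dividing by zero
    return mean, (std if std > 0 else 1.0)

def standardize(column, mean, std):
    """Apply previously learned statistics to any column."""
    return [(x - mean) / std for x in column]

# Toy values (assumption: these are illustrative, not taken from the dataset)
train_col = [1.0, 2.0, 3.0, 4.0]
test_col = [2.5, 5.0]

mean, std = fit_standardizer(train_col)           # statistics come from train only
train_scaled = standardize(train_col, mean, std)
test_scaled = standardize(test_col, mean, std)    # test reuses the train statistics

print(mean, std)
print(train_scaled)
print(test_scaled)
```

After this transformation the training column has zero mean and unit variance, while the test column generally does not, which is expected.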

## Neural Network (MLP)

In [5]:
import tensorflow as tf
#tf.enable_eager_execution()

In [6]:
from tensorflow.python.client import device_lib 
print(device_lib.list_local_devices())
print( tf.constant( 'Hello from TensorFlow ' + tf.__version__ ) )

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17749256256812236872
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3063309926
locality {
  bus_id: 1
  links {
  }
}
incarnation: 14071241766969559359
physical_device_desc: "device: 0, name: GeForce GTX 1650 with Max-Q Design, pci bus id: 0000:03:00.0, compute capability: 7.5"
]
Tensor("Const:0", shape=(), dtype=string)


In [7]:
# Get the maximum target value (vote_average) in train and use it to scale
# the train and test targets to the [0, 1] range
maxPrice = copTrain["vote_average"].max()
trainY = copTrain["vote_average"] / maxPrice
testY = copTest["vote_average"] / maxPrice
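Because the targets are divided by the training maximum, the network's predictions and its loss are expressed in scaled units. Below is a minimal sketch, with toy numbers and hypothetical helper names (`scale_targets`, `unscale`, `mse`), of how predictions could be mapped back to the original scale and how a scaled MSE relates to the MSE in original units.

```python
def scale_targets(values, max_value):
    """Divide targets by the maximum observed in the training set."""
    return [v / max_value for v in values]

def unscale(values, max_value):
    """Map scaled values back to the original target units."""
    return [v * max_value for v in values]

def mse(y_true, y_pred):
    """Mean squared error between two equally sized lists."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy targets (assumption: illustrative values, not taken from the dataset)
train_targets = [6.5, 8.0, 10.0, 4.0]
test_targets = [7.0, 5.0]

max_target = max(train_targets)                   # computed on train only
train_scaled_y = scale_targets(train_targets, max_target)
test_scaled_y = scale_targets(test_targets, max_target)

# Hypothetical model outputs in scaled units
scaled_predictions = [0.72, 0.48]
predictions = unscale(scaled_predictions, max_target)

mse_scaled = mse(test_scaled_y, scaled_predictions)
mse_original = mse(test_targets, predictions)

# The scaled MSE differs from the original one by a factor of max_target ** 2
print(mse_scaled * max_target ** 2, mse_original)
```

Keeping this factor in mind helps when reading the loss values reported during training, since they refer to the scaled target.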

In [8]:
# Number of samples and features
XtrainScaled.shape

(7921, 66)

To choose the number of neurons per hidden layer I use the geometric pyramid rule.

We have 66 features (66 input neurons) and, since this is a regression problem, a single output neuron.

With 3 hidden layers, the layer sizes are computed as follows:

r = (66 / 1)^(1/4) ≈ 2.85, rounded to r = 3

h1 = 1 * 3^3 = 27

h2 = 1 * 3^2 = 9

h3 = 1 * 3 = 3
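The calculation above can be reproduced with a small helper. This is a sketch under the assumption that the ratio is rounded to the nearest integer before computing the layer sizes, as done in the text; the function name `geometric_pyramid` is hypothetical.

```python
def geometric_pyramid(n_inputs, n_outputs, n_hidden):
    """Hidden layer sizes from the geometric pyramid rule.

    The ratio r is the (n_hidden + 1)-th root of n_inputs / n_outputs,
    rounded to the nearest integer. Layer k (counting from the output)
    gets n_outputs * r ** k neurons.
    """
    r = round((n_inputs / n_outputs) ** (1.0 / (n_hidden + 1)))
    return [n_outputs * r ** k for k in range(n_hidden, 0, -1)]

# 66 input features, 1 output neuron, 3 hidden layers
hidden_sizes = geometric_pyramid(66, 1, 3)
print(hidden_sizes)  # [27, 9, 3]
```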

In [9]:
# Only the imports the MLP actually needs
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

def create_mlp(dim):
	# define our MLP network
	model = Sequential()
	model.add(Dense(27, input_dim=dim, activation="relu"))
	model.add(Dense(9, activation="relu"))
	model.add(Dense(3, activation="relu"))
	model.add(Dense(1, activation="linear"))
	
	return model

Using TensorFlow backend.
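As a sanity check on the size of this architecture, the number of trainable parameters of a stack of fully connected layers can be counted by hand. The sketch below assumes the layer widths 66, 27, 9, 3 and 1 described in this notebook and that every layer has a bias term; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for consecutive fully connected layers."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Input width followed by the width of each Dense layer
layer_sizes = [66, 27, 9, 3, 1]
total_params = dense_param_count(layer_sizes)
print(total_params)  # 2095
```

With 7921 training samples, this gives roughly four samples per trainable parameter.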


In [10]:
model = create_mlp(XtrainScaled.shape[1])
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="mean_squared_error", optimizer=opt)

historico = model.fit(XtrainScaled, trainY, validation_data=(XtestScaled, testY),
    epochs=10, batch_size=27)

Train on 7921 samples, validate on 1977 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
