<a href="https://colab.research.google.com/github/Kaiziferr/Deep_Learning_Workshop/blob/master/workshop_regresion.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [13]:
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score, KFold

In [4]:
# Carga de data
url = 'https://raw.githubusercontent.com/eric-bunch/boston_housing/master/boston.csv'
data = pd.read_csv(url)
data.head(3)

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,MDEV
0,0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1.0,296.0,15.3,396.9,4.98,24.0
1,0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.9,9.14,21.6
2,0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03,34.7


In [5]:
X = data.drop('MDEV', axis=1)
y = data['MDEV']

- Tiene una sola capa oculta 13 neuronas
- Funcion de activación ReLu para la capa oculta
  - Al ser un problema de regresión no tiene una función de activación en la capa de salida
  - Optimizador ADAM y función de perdida de error cuadrático medio
  - El objetivo Wrapper para la regresión se llama KerasRegressor
  - kfold 10

In [7]:
#Modelo base
def baseline_model():
  model = Sequential()
  model.add(Dense(13, input_dim = 13, activation='relu'))
  model.add(Dense(1))

  model.compile(loss='mean_squared_error', optimizer = 'adam')
  return model

In [14]:
estimator = KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=5, verbose=0)
kfold = KFold(n_splits=10)
result = cross_val_score(estimator, X, y, cv = kfold)
print("Linea base: %.2f (%.2f) MSE" % (result.mean(), result.std()))

Linea base: -41.13 (29.92) MSE


- Da como resultado un valor negativo, ya que la libreria de sklearn macimiza la función, en vez de reducirla. Ignorar el valor negativa. por lo tanto tenemos un valor alto.

## Optimizar el rendimiento con procesamiento de datos

In [21]:
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

estimator =[]
estimator.append(('standarize', StandardScaler()))
estimator.append(('NN', KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=5,verbose=0)))
pipeline = Pipeline(estimator)
kfold = KFold(n_splits=10)
result = cross_val_score(pipeline, X, y, cv = kfold)
print("Estandarizado: %.2f (%.2f) MSE" % (result.mean(), result.std()))

Estandarizado: -20.53 (19.95) MSE


- Si el valor es muy cercano a 0 para que es menor, aparentemente por el contenedor, ignorando el negativo. **Investigar**

## Ajuste de la topología

In [23]:
#Modelo con más profundidad
def larger_model():
  model = Sequential()
  model.add(Dense(13, input_dim = 13, activation='relu'))
  model.add(Dense(6, activation='relu'))
  model.add(Dense(1))

  model.compile(loss='mean_squared_error', optimizer = 'adam')
  return model


estimator =[]
estimator.append(('standarize', StandardScaler()))
estimator.append(('Deep-NN', KerasRegressor(build_fn=larger_model, epochs=100, batch_size=5,verbose=0)))
pipeline = Pipeline(estimator)
kfold = KFold(n_splits=10)
result = cross_val_score(pipeline, X, y, cv = kfold)
print("Deep-NN: %.2f (%.2f) MSE" % (result.mean(), result.std()))

Deep-NN: -21.01 (26.88) MSE


- No mejoro mucho

In [24]:
# wider

def wider_model():
  model = Sequential()
  model.add(Dense(20, input_dim = 13, activation='relu'))
  model.add(Dense(1))

  model.compile(loss='mean_squared_error', optimizer = 'adam')
  return model


estimator =[]
estimator.append(('standarize', StandardScaler()))
estimator.append(('Wider-NN', KerasRegressor(build_fn=larger_model, epochs=50, batch_size=5,verbose=0)))
pipeline = Pipeline(estimator)
kfold = KFold(n_splits=10)
result = cross_val_score(pipeline, X, y, cv = kfold)
print("Wider-NN: %.2f (%.2f) MSE" % (result.mean(), result.std()))


Wider-NN: -19.46 (19.59) MSE
