source: https://machinelearningmastery.com/binary-classification-tutorial-with-the-keras-deep-learning-library/

## Description of the Dataset

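This tutorial uses the [Sonar dataset](https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks)) from the UCI repository. Each row holds 60 numeric features followed by a class label; the code below splits these into `X` and `y`.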

In [3]:
import pandas as pd
import numpy as np
np.random.seed(42)

In [4]:
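# The file has no header row, so pandas numbers the columns 0..60.
sonar_df = pd.read_csv("dataset/connectionist_bench/sonar.all-data", header=None)
dataset = sonar_df.values

# Columns 0-59 are the numeric features; column 60 is the class label.
X = dataset[:, :60].astype(float)
y = dataset[:, 60]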

## Baseline Neural Network Model Performance

In [8]:
from sklearn.preprocessing import LabelEncoder

In [10]:
encoder = LabelEncoder()
encoder.fit(y)
y_encoded = encoder.transform(y)
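
To make the encoding step concrete, here is a minimal sketch of what `LabelEncoder` does to string labels. The `sample_*` names and the five labels are made up for illustration, written in the "R"/"M" style of the Sonar label column; they are not rows from the dataset.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels in the style of the Sonar target column (not real rows).
sample_labels = np.array(["R", "M", "M", "R", "M"], dtype=object)

sample_encoder = LabelEncoder()
sample_encoded = sample_encoder.fit_transform(sample_labels)

# Classes are stored in sorted order, so "M" maps to 0 and "R" maps to 1.
print(sample_encoder.classes_.tolist())  # ['M', 'R']
print(sample_encoded.tolist())           # [1, 0, 0, 1, 0]

# inverse_transform recovers the original string labels.
sample_decoded = sample_encoder.inverse_transform(sample_encoded)
print(sample_decoded.tolist())           # ['R', 'M', 'M', 'R', 'M']
```

The integer targets are what the sigmoid output and `binary_crossentropy` loss used later expect.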

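To use Keras models with scikit-learn, we must use the KerasClassifier wrapper. This class takes a function that creates and returns our neural network model. Arguments such as `epochs` and `batch_size` given to the wrapper are passed on to the model's `fit` call.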

In [16]:
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

In [18]:
def baseline_model():
    model = Sequential()
    model.add(Dense(60, input_dim=60, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

In [19]:
# Keras 2 names this argument `epochs`; `nb_epoch` is the older Keras 1 spelling.
estimator = KerasClassifier(build_fn=baseline_model, epochs=100, batch_size=5, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
results = cross_val_score(estimator, X, y_encoded, cv=kfold)
print("Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Results: 78.44% (10.57%)
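
The evaluation above relies on `StratifiedKFold`, which keeps the class proportions roughly equal in every fold. The sketch below illustrates that on hypothetical labels (60 of one class, 40 of the other, not the Sonar data); the `toy_*` names are assumptions introduced only for this example.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 60 samples of class 0 and 40 of class 1.
toy_y = np.array([0] * 60 + [1] * 40)
toy_X = np.zeros((len(toy_y), 1))  # feature values do not affect the split

toy_kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

fold_sizes = []
fold_positive_counts = []
for train_idx, test_idx in toy_kfold.split(toy_X, toy_y):
    # Record the size of each held-out fold and how many class-1 samples it has.
    fold_sizes.append(len(test_idx))
    fold_positive_counts.append(int(toy_y[test_idx].sum()))

print(fold_sizes)            # [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
print(fold_positive_counts)  # [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
```

Every held-out fold keeps the 60/40 split, so per-fold accuracy is not skewed by a fold that happens to contain mostly one class.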


## Re-Run The Baseline Model with Data Preparation

In [21]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

In [22]:
pipeline = Pipeline([
    ('standardize', StandardScaler()),
    ('mlp', KerasClassifier(build_fn=baseline_model, epochs=100, batch_size=5, verbose=0)),
])

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X, y_encoded, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 85.14% (10.11%)
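
Wrapping the scaler and the model in a `Pipeline` matters because `cross_val_score` refits the whole pipeline on each training split, so the scaler never sees the held-out fold. The following sketch checks this on synthetic data. It uses `LogisticRegression` as a lightweight stand-in for the Keras model, and the `toy_*` names and the 80/20 split are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (not the Sonar dataset): 100 samples, 4 features.
rng = np.random.RandomState(42)
toy_X = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
toy_y = (toy_X[:, 0] > 5.0).astype(int)

# Manual split standing in for one cross-validation fold.
train_X, test_X = toy_X[:80], toy_X[80:]
train_y = toy_y[:80]

toy_pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("clf", LogisticRegression()),  # stand-in for the KerasClassifier step
])
toy_pipeline.fit(train_X, train_y)

fitted_scaler = toy_pipeline.named_steps["standardize"]

# The scaler's statistics come from the training rows only.
uses_train_mean = np.allclose(fitted_scaler.mean_, train_X.mean(axis=0))
uses_full_mean = np.allclose(fitted_scaler.mean_, toy_X.mean(axis=0))
print(uses_train_mean, uses_full_mean)

# At prediction time the held-out rows are scaled with those training statistics.
test_predictions = toy_pipeline.predict(test_X)
```

Scaling the full dataset once before cross-validation would instead let statistics from the held-out rows leak into training.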


## Tuning Layers and Number of Neurons in the Model

In [23]:
def deep_model():
    model = Sequential([
        Dense(60, input_dim=60, kernel_initializer="normal", activation="relu"),
        Dense(30, kernel_initializer="normal", activation="relu"),
        Dense(1, kernel_initializer="normal", activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy",
                  optimizer="adam",
                  metrics=['accuracy'])
    return model

In [25]:
pipeline = Pipeline([
    ('standardize', StandardScaler()), 
    ('mlp', KerasClassifier(build_fn=deep_model, epochs=100, batch_size=5, verbose=0)),
])
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X, y_encoded, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 84.28% (11.43%)
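
One way to put the two architectures side by side is to count their trainable parameters. The helper below is a hypothetical function written for this comparison, not part of Keras; it assumes fully connected layers with one bias per unit, which matches the default `Dense` layers used above.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by the width of each layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

baseline_params = dense_param_count([60, 60, 1])   # input 60 -> 60 -> 1
deep_params = dense_param_count([60, 60, 30, 1])   # input 60 -> 60 -> 30 -> 1

print(baseline_params)  # 3721
print(deep_params)      # 5521
```

The deeper network has roughly half again as many parameters, yet the gap between 85.14% and 84.28% is well inside the reported standard deviations, so these results do not show the extra layer helping.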
