<h1>Table of Contents<span class="tocSkip"></span></h1>


# Introduction
<hr style="border:2px solid black"> </hr>


**What?** Deep learning for binary classification



# Imports
<hr style="border:2px solid black"> </hr>

In [1]:
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

### Baseline Neural Network Model Performance

In [None]:
"""
Our model will have a single fully connected hidden layer with the same
number of neurons as input variables. This is a good default starting 
point when creating neural networks on a new problem.
"""

In [9]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("../DATASETS/sonar.all-data.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, kernel_initializer = "normal" , activation= "relu" ))
    model.add(Dense(1, kernel_initializer = "normal" , activation= "sigmoid" ))
    # Compile model
    model.compile(loss= "binary_crossentropy" , optimizer= "adam" , metrics=["accuracy"])
    return model

# evaluate model with standardized dataset
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, encoded_Y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Baseline: 79.76% (9.68%)


### Improve Performance With Data Preparation

In [None]:
"""
An e↵ective data preparation scheme for tabular data when building neural
network models is standardization. This is where the data is rescaled such
that the mean value for each attribute is 0 and the standard deviation is 1.
This preserves Gaussian and Gaussian-like distributions whilst normalizing
the central tendencies for each attribute.
"""

In [11]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("../DATASETS/sonar.all-data.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, kernel_initializer = "normal" , activation= "relu" ))
    model.add(Dense(1, kernel_initializer = "normal" , activation= "sigmoid" ))
    # Compile model
    model.compile(loss= "binary_crossentropy" , optimizer= "adam" , metrics=["accuracy"])
    return model

# evaluate baseline model with standardized dataset
numpy.random.seed(seed)
estimators = []
estimators.append(("standardize" , StandardScaler()))
estimators.append(( "mlp" , KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))


Standardized: 85.52% (8.34%)


In [None]:
"""
Running this example provides the results below. We do see a small but 
very nice lift in the mean accuracy.
"""

### Evaluate a Smaller Network

In [None]:
"""
In this experiment we take our baseline model with 60 neurons in the 
hidden layer and reduce it by half to 30. This will put pressure on 
the network during training to pick out the most important structure 
in the input data to model. We will also standardize the data as in 
the previous experiment with data preparation and try to take advantage 
of the small lift in performance.
"""

In [14]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("../DATASETS/sonar.all-data.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# smaller model
def create_smaller():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, kernel_initializer = "normal" , activation= "relu" ))
    model.add(Dense(1, kernel_initializer = "normal" , activation= "sigmoid" ))
    # Compile model
    model.compile(loss= "binary_crossentropy" , optimizer= "adam" , metrics=["accuracy"])
    return model

numpy.random.seed(seed)
estimators = []
estimators.append(( "standardize" , StandardScaler()))
estimators.append(( "mlp" , KerasClassifier(build_fn=create_smaller, epochs=100,
    batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Smaller: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))    

Smaller: 86.02% (7.48%)


In [None]:
"""
Running this example provides the following result. We can see that we 
have a very slight boost in the mean estimated accuracy and an important
reduction in the standard deviation (average spread) of the accuracy 
scores for the model. This is a great result because we are
doing slightly better with a network half the size, which in turn takes 
half the time to train.
"""

### Evaluate a Larger Network

In [None]:
"""
Here, we add one new layer (one line) to the network that introduces 
another hidden layer with 30 neurons after the first hidden layer. Our
network now has the topology: 60 inputs -> [60 -> 30] -> 1 output

The idea here is that the network is given the opportunity to model 
all input variables before being bottlenecked and forced to halve the
representational capacity, much like we did in the experiment above 
with the smaller network. Instead of squeezing the representation of
the inputs themselves, we have an additional hidden layer to aid in 
the process.
"""

In [16]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("../DATASETS/sonar.all-data.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# smaller model
def create_larger():
    # create model
    model = Sequential()
    model.add(Dense(60, input_dim=60, kernel_initializer = "normal" , activation= "relu" ))
    model.add(Dense(1, kernel_initializer = "normal" , activation= "relu" ))
    model.add(Dense(1, kernel_initializer = "normal" , activation= "sigmoid" ))
    # Compile model
    model.compile(loss= "binary_crossentropy" , optimizer= "adam" , metrics=["accuracy"])
    return model

numpy.random.seed(seed)
estimators = []
estimators.append(( "standardize" , StandardScaler()))
estimators.append(("mlp" , KerasClassifier(build_fn=create_larger, epochs=100, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Larger: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Larger: 81.24% (6.61%)


In [None]:
"""
We can see that we do get a nice lift in the model performance, achieving
near state-of-the-art results with very little e↵ort indeed.
"""

# References
<hr style="border:2px solid black"> </hr>


- https://machinelearningmastery.com/binary-classification-tutorial-with-the-keras-deep-learning-library/

