# Sonar Object Classification Dataset
* The dataset describes sonar chirp returns bouncing off different surfaces.
* The 60 input variables are the strength of the returns at different angles.
* It is a binary classification problem that requires a model to differentiate rocks from metal cylinders.

## 1. Baseline Neural Network Model Performance

In [1]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
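# note: keras.wrappers.scikit_learn ships only with older Keras releases;
# on newer installs the scikeras package provides a KerasClassifier wrapper instead
from keras.wrappers.scikit_learn import KerasClassifier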
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

In [2]:
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

In [4]:
# load dataset
dataframe = pd.read_csv('sonar.csv',header=None)
dataset = dataframe.values

# split into input(X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]


In [6]:
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
print(encoded_Y)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
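The integer codes printed above come from sorting the distinct class labels and numbering them from 0. The sketch below reproduces that mapping with plain NumPy. It assumes the class column holds the strings `'R'` (rock) and `'M'` (metal cylinder); the notebook never prints the raw labels, so the `labels` array here is hypothetical. Under that assumption, 1 stands for rocks and 0 for metal cylinders.

```python
import numpy as np

# hypothetical raw labels; the real column is assumed to contain 'R' and 'M'
labels = np.array(['R', 'R', 'M', 'R', 'M'])

# np.unique returns the sorted distinct labels and, for every row,
# the index of its label within that sorted array
classes, encoded = np.unique(labels, return_inverse=True)

print(classes)   # sorted class names; the position of each name is its integer code
print(encoded)   # one integer code per row
```

Sorting is alphabetical, so `'M'` receives code 0 and `'R'` receives code 1 regardless of the order in which they appear in the file.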


In [7]:
# baseline model
def create_baseline():
    model = Sequential()
    model.add(Dense(60,input_dim=60,activation='relu'))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    return model
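
To make concrete what the two-layer network defined above computes, here is a minimal NumPy sketch of its forward pass. The weights are random, untrained stand-ins (Keras initializes and trains its own), and the helper names `relu`, `sigmoid` and `forward` are illustrative rather than part of Keras.

```python
import numpy as np

def relu(z):
    # hidden-layer activation: negative values are clipped to zero
    return np.maximum(z, 0.0)

def sigmoid(z):
    # output activation: squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # 60 inputs -> 60 hidden units (ReLU) -> 1 output (sigmoid)
    hidden = relu(x @ W1 + b1)
    return sigmoid(hidden @ W2 + b2)

rng = np.random.default_rng(7)
x = rng.random((5, 60))                        # a batch of 5 made-up samples
W1 = rng.normal(scale=0.1, size=(60, 60))      # untrained stand-in weights
b1 = np.zeros(60)
W2 = rng.normal(scale=0.1, size=(60, 1))
b2 = np.zeros(1)

probs = forward(x, W1, b1, W2, b2)
print(probs.shape)                             # one probability per sample
```

The single sigmoid output can be read as the predicted probability of class 1, which is why the model is compiled with the `binary_crossentropy` loss.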

In [9]:
# evaluate the baseline model on the raw (unstandardized) data with stratified 10-fold cross-validation
estimator = KerasClassifier(build_fn=create_baseline,epochs=100,batch_size=5,verbose=0)
kfold = StratifiedKFold(n_splits=10,shuffle=True,random_state=seed)
results = cross_val_score(estimator,X,encoded_Y,cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" %(results.mean()*100,results.std()*100))

Baseline: 80.76% (8.64%)
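
The two numbers reported are the mean and standard deviation of accuracy across the 10 folds. Stratification keeps the class ratio roughly equal in every fold. The sketch below illustrates the idea with a simple per-class round-robin assignment; it is not scikit-learn's exact algorithm, and `stratified_folds` is a hypothetical helper. The class counts (97 ones, 111 zeros) are read off the encoded labels printed earlier.

```python
import numpy as np

def stratified_folds(y, n_splits, seed):
    """Assign each sample a fold id so every fold has a similar class mix."""
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(y), dtype=int)
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)   # positions of this class
        rng.shuffle(idx)                   # randomize order within the class
        fold_of[idx] = np.arange(len(idx)) % n_splits  # deal out round-robin
    return fold_of

y_demo = np.array([1] * 97 + [0] * 111)    # same class counts as encoded_Y
fold_of = stratified_folds(y_demo, n_splits=10, seed=7)

for k in range(10):
    in_fold = y_demo[fold_of == k]
    # fold id, fold size, number of class-1 samples in the fold
    print(k, len(in_fold), int(in_fold.sum()))
```

With only about 20 test samples per fold, a single misclassification shifts that fold's accuracy by roughly 5 percentage points, which helps explain the large standard deviation.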


## 2. Improve Performance With Data Preparation
* An effective data preparation scheme for tabular data when building neural network models is standardization.
* Data is rescaled such that the mean value for each attribute is 0 and the standard deviation is 1.
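
As a minimal sketch of the rescaling described above, the snippet below standardizes made-up data with NumPy. The statistics are computed on the training rows only and then reused for the held-out rows, which is the behaviour a scaler placed inside a cross-validated pipeline provides. The helper names and the random demo data are assumptions for illustration.

```python
import numpy as np

def fit_standardizer(X_train):
    # per-column mean and (population) standard deviation
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0.0] = 1.0          # avoid dividing by zero for constant columns
    return mean, std

def standardize(X, mean, std):
    return (X - mean) / std

rng = np.random.default_rng(7)
# made-up data: three columns on very different scales
X_demo = rng.random((20, 3)) * [1.0, 10.0, 100.0]
X_train, X_test = X_demo[:15], X_demo[15:]

mean, std = fit_standardizer(X_train)        # learn statistics from training rows only
Z_train = standardize(X_train, mean, std)
Z_test = standardize(X_test, mean, std)      # reuse them for the held-out rows

print(Z_train.mean(axis=0).round(6), Z_train.std(axis=0).round(6))
```

The held-out rows will generally not have exactly zero mean and unit standard deviation after this transform; that is expected, because their statistics were never seen during fitting.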

In [10]:
# chain the scaler and the network so the scaler is re-fit on each training fold only
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
# same stratified 10-fold split as the baseline, for a like-for-like comparison
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 86.95% (6.97%)


## 3. Tuning Layers and Neurons in the Model
* There are many things to tune on a neural network, such as the weight initialization, activation functions, and optimization procedure.
* One aspect that may have an outsized effect is the structure of the network itself, called the **network topology**.
* We run two experiments on the structure of the network: making it smaller and making it larger.

## 3.1 Evaluate a Smaller Network
* We take the baseline model with 60 neurons in the hidden layer and halve that to 30.
* This will put pressure on the network during training to pick out the most important structure in the input data to model.

In [12]:
# smaller model
def create_smaller():
    model = Sequential()
    model.add(Dense(30,input_dim=60,activation='relu'))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    return model

np.random.seed(seed)
estimators = []
estimators.append(('standardize',StandardScaler()))
estimators.append(('mlp',KerasClassifier(build_fn=create_smaller,epochs=100,batch_size=5,verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10,shuffle=True,random_state=seed)
results = cross_val_score(pipeline,X,encoded_Y,cv=kfold)
print("Smaller: %.2f%% (%.2f%%)" %(results.mean()*100,results.std()*100))

Smaller: 85.57% (8.50%)


## 3.2 Evaluate a Larger Network
* Here we add a second hidden layer with 30 neurons after the first hidden layer.
* Topology: 60 inputs -> [60 -> 30] -> 1 output
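
One way to compare the three topologies used in this notebook is by trainable parameter count. The sketch below counts weights and biases for a stack of fully connected layers; `dense_params` is a hypothetical helper, and the layer sizes are taken from the model definitions in this notebook.

```python
def dense_params(layer_sizes):
    """Weights plus biases for a chain of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# input width followed by the width of each Dense layer
topologies = {
    "baseline": [60, 60, 1],
    "smaller": [60, 30, 1],
    "larger": [60, 60, 30, 1],
}

for name, sizes in topologies.items():
    print(name, dense_params(sizes))
# baseline 3721
# smaller 1861
# larger 5521
```

The smaller network has about half the parameters of the baseline, and the larger one about one and a half times as many, all trained on only 208 samples.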

In [13]:
# larger model
def create_larger():
    model = Sequential()
    model.add(Dense(60,input_dim=60,activation='relu'))
    model.add(Dense(30,activation='relu'))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
    return model

np.random.seed(seed)
estimators = []
estimators.append(('standardize',StandardScaler()))
estimators.append(('mlp',KerasClassifier(build_fn=create_larger,epochs=100,batch_size=5,verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10,shuffle=True,random_state=seed)
results = cross_val_score(pipeline,X,encoded_Y,cv=kfold)
print("Larger: %.2f%% (%.2f%%)" %(results.mean()*100,results.std()*100))


Larger: 87.40% (6.78%)


# Summary
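* We worked through a binary classification problem step by step with Keras: a baseline network (80.76%), the same network on standardized inputs (86.95%), a smaller network (85.57%), and a larger network (87.40%).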