# Training a Classifier on the *Salammbô* Dataset with Keras
Author: Pierre Nugues

We first import the modules we need.

In [1]:
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

## Reading the Dataset
We can either read the data from a file in the svmlight format or create the NumPy arrays directly.
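As a sketch of the first option, the snippet below writes a few rows to a temporary file in the svmlight format and reads them back with scikit-learn's `load_svmlight_file()`. The file name `salammbo_demo.libsvm` and the `*_demo` variables are illustrative assumptions and do not come from the notebook.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

# A few rows of the dataset, used here only to produce a small file
X_demo = np.array([[35680, 2217], [42514, 2761],
                   [36961, 2503], [43621, 2992]], dtype=float)
y_demo = np.array([0, 0, 1, 1])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, chosen for this example
    path = os.path.join(tmp_dir, 'salammbo_demo.libsvm')
    dump_svmlight_file(X_demo, y_demo, path)
    # load_svmlight_file returns a sparse matrix and the label vector
    X_sparse, y_loaded = load_svmlight_file(path)

# Convert the sparse matrix to a dense NumPy array
X_loaded = X_sparse.toarray()
print(X_loaded)
print(y_loaded)
```

The loader returns a SciPy sparse matrix, so `toarray()` is needed before passing the data to the rest of the notebook.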

In [2]:
X = np.array(
    [[35680, 2217], [42514, 2761], [15162, 990], [35298, 2274],
     [29800, 1865], [40255, 2606], [74532, 4805], [37464, 2396],
     [31030, 1993], [24843, 1627], [36172, 2375], [39552, 2560],
     [72545, 4597], [75352, 4871], [18031, 1119], [36961, 2503],
     [43621, 2992], [15694, 1042], [36231, 2487], [29945, 2014],
     [40588, 2805], [75255, 5062], [37709, 2643], [30899, 2126],
     [25486, 1784], [37497, 2641], [40398, 2766], [74105, 5047],
     [76725, 5312], [18317, 1215]
     ])

y = np.array(
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

## Scaling the Data
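Scaling and normalizing the input usually matter a great deal with neural networks. We use sklearn transformers, which expose two main methods: `fit()`, which estimates the parameters from the data, and `transform()`, which applies them. `fit_transform()` does both in one call.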
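The following minimal sketch separates the two steps: `fit()` estimates the statistics once, and `transform()` reuses them on any data, including samples not seen during fitting. The arrays `X_fit_demo` and `X_new_demo` are illustrative assumptions built from a few rows of the dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Rows used to estimate the statistics, and one row held aside
X_fit_demo = np.array([[35680., 2217.], [42514., 2761.],
                       [36961., 2503.], [43621., 2992.]])
X_new_demo = np.array([[15162., 990.]])

scaler_demo = StandardScaler()
scaler_demo.fit(X_fit_demo)                        # estimates the mean and standard deviation
X_fit_scaled = scaler_demo.transform(X_fit_demo)   # applies them
X_new_scaled = scaler_demo.transform(X_new_demo)   # reuses the same statistics

# fit_transform() is a shortcut for fit() followed by transform()
same = np.allclose(X_fit_scaled, StandardScaler().fit_transform(X_fit_demo))
print(same)
```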

### Normalizing

In [3]:
from sklearn.preprocessing import Normalizer
normalizer = Normalizer()
X_norm = normalizer.fit_transform(X)
X_norm[:4]



array([[0.99807515, 0.06201605],
       [0.99789783, 0.06480679],
       [0.99787509, 0.06515607],
       [0.99793128, 0.06428964]])
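With its default settings, `Normalizer` rescales each row (each sample) to a unit Euclidean norm. The sketch below reproduces the first two rows of the output above with plain NumPy; `X_demo` is an illustrative name.

```python
import numpy as np

# First two rows of the dataset
X_demo = np.array([[35680, 2217], [42514, 2761]], dtype=float)

# Euclidean (L2) norm of each row, kept as a column for broadcasting
row_norms = np.linalg.norm(X_demo, axis=1, keepdims=True)
X_demo_norm = X_demo / row_norms
print(X_demo_norm)
```

Because the normalization is computed row by row, it does not depend on the other samples, unlike the standardization that follows.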

### Standardizing

In [4]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler(with_mean=True, with_std=True)
X_scaled = scaler.fit_transform(X_norm)
X_scaled[:4]

array([[ 1.68336574, -1.7197772 ],
       [ 0.57376529, -0.56145427],
       [ 0.43143908, -0.41648279],
       [ 0.78308579, -0.77610221]])
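`StandardScaler` subtracts the mean of each column and divides by its standard deviation. A minimal NumPy sketch of this operation follows. It uses only the four normalized rows printed earlier, so its values differ from the output above, which relies on statistics computed over all 30 rows.

```python
import numpy as np

# The four normalized rows shown in the Normalizing section
X_demo = np.array([[0.99807515, 0.06201605],
                   [0.99789783, 0.06480679],
                   [0.99787509, 0.06515607],
                   [0.99793128, 0.06428964]])

mean = X_demo.mean(axis=0)
std = X_demo.std(axis=0)  # population standard deviation (ddof=0), as in StandardScaler
X_demo_scaled = (X_demo - mean) / std
print(X_demo_scaled)
```

Each column of the result has a mean of 0 and a standard deviation of 1.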

## Creating a Model

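We set a seed to make the results reproducible. Seeding NumPy alone does not fix the initial weights, which Keras draws from TensorFlow's random generator, so we also seed TensorFlow.

In [5]:
np.random.seed(1337)
# Seeds Python, NumPy, and TensorFlow at once (TensorFlow 2.7 and later)
keras.utils.set_random_seed(1337)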

We create a classifier equivalent to a logistic regression 

In [6]:
model = keras.Sequential([
        layers.Dense(1, activation='sigmoid')
    ])



Or with one hidden layer

In [7]:
model2 = keras.Sequential([
        layers.Dense(10, activation='relu'),
        # layers.Dropout(0.5),
        layers.Dense(1, activation='sigmoid')
    ])
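To make the structure of this second network concrete, here is a NumPy sketch of its forward pass: an affine map to 10 hidden units, a ReLU, then an affine map to one output followed by a sigmoid. The weights are random placeholders drawn for the illustration; they are neither the Keras initial values nor trained values.

```python
import numpy as np

rng = np.random.default_rng(1337)

# Placeholder parameters with the shapes of the two Dense layers
W1, b1 = rng.normal(size=(2, 10)), np.zeros(10)   # hidden layer: 2 inputs -> 10 units
W2, b2 = rng.normal(size=(10, 1)), np.zeros(1)    # output layer: 10 units -> 1 output

def forward(X):
    hidden = np.maximum(0.0, X @ W1 + b1)          # ReLU activation
    logits = hidden @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logits))           # sigmoid activation

X_demo = np.array([[1.68, -1.72], [0.57, -0.56]])
proba_demo = forward(X_demo)

# Number of trainable parameters: 2*10 + 10 + 10*1 + 1
n_params = W1.size + b1.size + W2.size + b2.size
print(proba_demo.shape, n_params)
```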

To try the network with one hidden layer, set `complex` to `True`. The outputs shown below come from the logistic regression model, that is, `complex = False`.

In [8]:
complex = False
if complex:
    model = model2

## Fitting the Model

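We compile the model with a binary cross-entropy loss and stochastic gradient descent, then fit it for 20 epochs with a batch size of 1.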

In [9]:
model.compile(loss='binary_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
model.fit(X_scaled, y, epochs=20, batch_size=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7fc008cb3460>
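For the logistic regression case, what `fit()` does with `batch_size=1` can be sketched in NumPy: the parameters are updated after every single sample, using the gradient of the binary cross-entropy. This is an illustration under stated assumptions and not the Keras implementation. The toy data, the zero initialization, and the function names are assumptions; the learning rate of 0.01 is the default of the Keras `'sgd'` optimizer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logistic(X, y, epochs=20, lr=0.01, seed=1337):
    """Stochastic gradient descent for logistic regression, one sample per update."""
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):   # visit the samples in a new random order
            p = sigmoid(X[i] @ w + b)
            error = p - y[i]                # gradient of the loss with respect to the logit
            w -= lr * error * X[i]
            b -= lr * error
    return w, b

# Toy standardized data, linearly separable
X_toy = np.array([[1.7, -1.7], [0.6, -0.6], [-0.8, 0.9], [-1.5, 1.4]])
y_toy = np.array([0, 0, 1, 1])

w_toy, b_toy = sgd_logistic(X_toy, y_toy)
pred_toy = (sigmoid(X_toy @ w_toy + b_toy) >= 0.5).astype(int)
print(w_toy, b_toy, pred_toy)
```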

### The weights

In [10]:
model.get_weights()

[array([[-2.168058  ],
        [ 0.21561582]], dtype=float32),
 array([0.02216917], dtype=float32)]
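The first array printed above is the kernel, of shape (2, 1), and the second is the bias. As a worked check, the sketch below plugs these printed values and the first four standardized rows into the logistic function with plain NumPy. The variable names are illustrative, and the numbers are copied from the outputs shown in this notebook.

```python
import numpy as np

# Values copied from the outputs printed in this notebook
W = np.array([[-2.168058], [0.21561582]])
b = np.array([0.02216917])
X_first = np.array([[1.68336574, -1.7197772],
                    [0.57376529, -0.56145427],
                    [0.43143908, -0.41648279],
                    [0.78308579, -0.77610221]])

# Logistic regression: sigmoid of an affine function of the input
z = X_first @ W + b
proba_first = 1.0 / (1.0 + np.exp(-z))
print(proba_first)
```

For the first row, the logit is about -4.0, which gives a probability of about 0.018 of belonging to class 1.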

## Prediction
### Probabilities

We compute the probability of belonging to class 1 for every sample in the training set.

In [11]:
y_pred_proba = model.predict(X_scaled)
y_pred_proba[:4]



array([[0.01801668],
       [0.20704249],
       [0.26834884],
       [0.13670324]], dtype=float32)

We recompute these probabilities directly from the weight matrices.

In [12]:
from tensorflow.keras.activations import sigmoid, relu

weights = model.get_weights()
if complex:
    # Hidden layer (ReLU) followed by the output layer (sigmoid)
    W1, b1, W2, b2 = weights
    print(sigmoid(relu(X_scaled @ W1 + b1) @ W2 + b2)[:4])
else:
    # Logistic regression: one affine map followed by a sigmoid
    W, b = weights
    print(sigmoid(X_scaled @ W + b)[:4])

tf.Tensor(
[[0.01801668]
 [0.20704248]
 [0.26834885]
 [0.13670324]], shape=(4, 1), dtype=float64)


### Classes

In [13]:
def predict_class(y_pred_proba):
    # Class 1 when the probability is at least 0.5, class 0 otherwise
    return (y_pred_proba[:, 0] >= 0.5).astype(float)

In [14]:
y_pred = predict_class(y_pred_proba)
y_pred

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

## Evaluation

With Keras

In [15]:
scores = model.evaluate(X_scaled, y)
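With the loss and metric chosen at compile time, `evaluate()` returns the binary cross-entropy and the accuracy. The sketch below computes both quantities by hand with NumPy for the four probabilities printed earlier, all of which belong to class 0. The function names and the clipping constant `eps` are assumptions made for the illustration.

```python
import numpy as np

def binary_crossentropy(y_true, proba, eps=1e-7):
    # Clip the probabilities to avoid log(0)
    proba = np.clip(proba, eps, 1.0 - eps)
    losses = -(y_true * np.log(proba) + (1 - y_true) * np.log(1.0 - proba))
    return float(np.mean(losses))

def accuracy(y_true, proba, threshold=0.5):
    # Share of samples whose thresholded prediction matches the true class
    return float(np.mean((proba >= threshold).astype(int) == y_true))

# First four samples: true classes and predicted probabilities from above
y_demo = np.array([0, 0, 0, 0])
proba_demo = np.array([0.01801668, 0.20704249, 0.26834884, 0.13670324])

loss_demo = binary_crossentropy(y_demo, proba_demo)
acc_demo = accuracy(y_demo, proba_demo)
print(loss_demo, acc_demo)
```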



With sklearn

In [16]:
from sklearn.metrics import classification_report

print(classification_report(y, y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        15
           1       1.00      1.00      1.00        15

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30



We computed the accuracy on the training set. This is not good practice, as it tends to overestimate how well the model generalizes. We should use a dedicated test set instead.
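One possible way to set up such an evaluation is sketched below with scikit-learn's `train_test_split()`. The 80/20 split, the stratification, and the `random_state` value are assumptions chosen for the illustration. The transformers are fitted on the training part only, and the test part is then transformed with the same statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import Normalizer, StandardScaler

X = np.array(
    [[35680, 2217], [42514, 2761], [15162, 990], [35298, 2274],
     [29800, 1865], [40255, 2606], [74532, 4805], [37464, 2396],
     [31030, 1993], [24843, 1627], [36172, 2375], [39552, 2560],
     [72545, 4597], [75352, 4871], [18031, 1119], [36961, 2503],
     [43621, 2992], [15694, 1042], [36231, 2487], [29945, 2014],
     [40588, 2805], [75255, 5062], [37709, 2643], [30899, 2126],
     [25486, 1784], [37497, 2641], [40398, 2766], [74105, 5047],
     [76725, 5312], [18317, 1215]])
y = np.array([0] * 15 + [1] * 15)

# Hold out 20% of the samples, keeping the class balance in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1337)

# Fit the transformers on the training part only
normalizer = Normalizer()
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(normalizer.fit_transform(X_train))
# Apply the already fitted transformers to the test part
X_test_scaled = scaler.transform(normalizer.transform(X_test))

print(X_train_scaled.shape, X_test_scaled.shape)
```

The model would then be fitted on `X_train_scaled` and `y_train`, and evaluated with `model.evaluate(X_test_scaled, y_test)`.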