# SciKeras Benchmarks

SciKeras wraps Keras Models, but should not meaningfully change their training time or accuracy, since all of the heavy lifting still happens within Keras/TensorFlow. In this notebook, we compare the training time and accuracy of a pure-Keras Model to the same model wrapped in SciKeras.


<table align="left"><td>
<a target="_blank" href="https://colab.research.google.com/github/adriangb/scikeras/blob/master/notebooks/Basic_Usage.ipynb">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>  
</td><td>
<a target="_blank" href="https://github.com/adriangb/scikeras/blob/master/notebooks/Basic_Usage.ipynb"><img width=32px src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a></td></table>

### Table of contents

* [Dataset](#Dataset)
* [Define the Keras Model](#Define-the-Keras-Model)
* [Keras benchmark](#Keras-benchmark)
* [SciKeras benchmark](#SciKeras-benchmark)

Install SciKeras

In [None]:
!python -m pip install scikeras

Silence TensorFlow warnings to keep output succinct.

In [2]:
import warnings
from tensorflow import get_logger
get_logger().setLevel('ERROR')
warnings.filterwarnings("ignore", message="Setting the random state for TF")

In [3]:
import numpy as np
from scikeras.wrappers import KerasClassifier, KerasRegressor
from tensorflow import keras

## Dataset

We will be using the MNIST dataset available within Keras.

In [4]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
# Reduced subset (first 5,000 samples), used below for the accuracy comparison
x_bench, y_bench = x_train[:5000], y_train[:5000]

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


## Define the Keras Model

Next we will define our Keras model (adapted from [keras.io](https://keras.io/examples/vision/mnist_convnet/)):

In [5]:
num_classes = 10
input_shape = (28, 28, 1)


def get_model():
    model = keras.Sequential(
        [
            keras.Input(input_shape),
            keras.layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
            keras.layers.MaxPooling2D(pool_size=(2, 2)),
            keras.layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
            keras.layers.MaxPooling2D(pool_size=(2, 2)),
            keras.layers.Flatten(),
            keras.layers.Dropout(0.5),
            keras.layers.Dense(num_classes, activation="softmax"),
        ]
    )
    model.compile(
        loss="sparse_categorical_crossentropy", optimizer="adam"
    )
    return model
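
As a sanity check on the architecture above, the layer output shapes and parameter counts can be worked out by hand. The sketch below is plain Python (no Keras needed) and assumes the Keras defaults this model relies on: `valid` padding and stride 1 for `Conv2D`, and a stride equal to the pool size for `MaxPooling2D`. The helper names are illustrative and not part of any library.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial size after non-overlapping max pooling (stride == pool size)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


size = 28
size = conv_out(size, 3)  # 26
size = pool_out(size, 2)  # 13
size = conv_out(size, 3)  # 11
size = pool_out(size, 2)  # 5
flat_features = size * size * 64  # features entering the Dense layer

total_params = (
    conv_params(3, 1, 32)        # first Conv2D
    + conv_params(3, 32, 64)     # second Conv2D
    + flat_features * 10 + 10    # Dense: weights + biases for 10 classes
)
print(flat_features, total_params)  # 1600 34826
```

If these assumptions hold, `get_model().summary()` should report the same total parameter count.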

## Keras benchmark

Performance:

In [13]:
fit_kwargs = {"batch_size": 128, "validation_split": 0.1, "verbose": 0}
%timeit get_model().fit(x_train, y_train, **fit_kwargs)

1 loop, best of 3: 38 s per loop
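
The `%timeit` magic only works inside IPython/Jupyter. Outside a notebook, a roughly equivalent measurement can be taken with the standard-library `timeit` module. The sketch below mirrors the "1 loop, best of 3" output above by running the statement once per repetition and keeping the minimum. `train_once` is a hypothetical stand-in for `get_model().fit(x_train, y_train, **fit_kwargs)`, so the snippet runs without TensorFlow.

```python
import timeit


def train_once():
    # Hypothetical stand-in workload; in the notebook, replace the body with
    # get_model().fit(x_train, y_train, **fit_kwargs).
    return sum(i * i for i in range(100_000))


# number=1 -> one call per timing ("1 loop"); repeat=3 -> three timings ("best of 3")
timings = timeit.repeat(train_once, number=1, repeat=3)
best = min(timings)
print(f"1 loop, best of 3: {best:.3g} s per loop")
```

Taking the minimum rather than the mean follows the `timeit` documentation's advice: slower runs usually reflect interference from other processes rather than the code being measured.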


Accuracy:

In [22]:
from sklearn.metrics import accuracy_score
from scikeras._utils import TFRandomState

In [23]:
with TFRandomState(seed=0):  # we force a TF random state to be able to compare accuracy
    model = get_model()
    model.fit(x_bench, y_bench, **fit_kwargs)
    y_pred = np.argmax(model.predict(x_test), axis=1)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

Accuracy: 0.7948
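
The `np.argmax` step is needed because a Keras model with a softmax output returns one probability per class, whereas `accuracy_score` expects class labels. A dependency-free sketch of that conversion, using made-up probabilities for three samples and three classes (the numbers are illustrative, not model output):

```python
# Made-up softmax outputs: one row per sample, one column per class.
probabilities = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
y_true = [1, 0, 1]

# Equivalent of np.argmax(probabilities, axis=1): index of the largest value per row.
y_pred = [row.index(max(row)) for row in probabilities]

# Equivalent of accuracy_score(y_true, y_pred): fraction of exact matches.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(y_pred, accuracy)
```

`KerasClassifier.predict` returns class labels directly, so the SciKeras cell further down can pass its predictions to `accuracy_score` without this step.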


## SciKeras benchmark

In [24]:
clf = KerasClassifier(
    model=get_model,
    batch_size=128,
    validation_split=0.1,
    verbose=0,
    random_state=0,
)

Performance:

In [27]:
%timeit clf.fit(x_train, y_train)

1 loop, best of 3: 38.4 s per loop
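
To put the two timings side by side, the relative difference can be computed from the numbers reported in this notebook (38 s for plain Keras, 38.4 s for SciKeras). This is simple arithmetic on two single `%timeit` results, so a gap this small is plausibly within run-to-run noise:

```python
keras_seconds = 38.0     # from the Keras benchmark output
scikeras_seconds = 38.4  # from the SciKeras benchmark output

overhead_seconds = scikeras_seconds - keras_seconds
overhead_percent = 100 * overhead_seconds / keras_seconds
print(f"Overhead: {overhead_seconds:.1f} s ({overhead_percent:.1f}%)")
# → Overhead: 0.4 s (1.1%)
```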


Accuracy:

In [26]:
clf.fit(x_bench, y_bench)
y_pred = clf.predict(x_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

Accuracy: 0.7948
