# Tutorial
Welcome to Keras-MML! This notebook will introduce you to the basics of working with Keras-MML.

Keras-MML mainly provides layers that replace built-in Keras layers with versions that do not use matrix multiplications. In this notebook, we focus on a matrix-multiplication-free implementation of the `Dense` layer, appropriately called `DenseMML`.

We will demonstrate its use in predicting handwritten digits from the [MNIST dataset](https://en.wikipedia.org/wiki/MNIST_database) using a very simple [multi-layer perceptron (MLP)](https://en.wikipedia.org/wiki/Multilayer_perceptron).

First, let's prepare the imports.

In [1]:
import keras
import numpy as np


Next, we define constants relating to the data. In particular, we know that there are 10 distinct digits in the dataset, and that each entry is a $28 \times 28$ greyscale image. This means that the model's input shape is `(28, 28)`.

In [2]:
NUM_CLASSES = 10
INPUT_SHAPE = (28, 28)

Let's now load the data. Keras already provides the MNIST dataset, so we just need to call the `load_data()` function from `keras.datasets.mnist`.

In [3]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

We do some simple preprocessing. The pixel values are stored as integers in $[0, 255]$, so we rescale them to the interval $[0, 1]$, which generally helps the model train more stably.

In [4]:
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
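
As a quick sanity check of the scaling step, the sketch below applies the same operation to a small made-up `uint8` array. The array `fake_pixels` is purely illustrative and is not part of MNIST.

```python
import numpy as np

# Hypothetical stand-in for a tiny greyscale image; MNIST pixels are uint8 in [0, 255].
fake_pixels = np.array([[0, 51], [204, 255]], dtype="uint8")

# Same operation as in the tutorial: cast to float32, then divide by 255.
scaled = fake_pixels.astype("float32") / 255

print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Casting before dividing keeps the data in `float32` rather than NumPy's default `float64`, which halves the memory needed for the images.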

Finally, we convert the class vectors (integer labels from 0 to 9) into binary class matrices, that is, one-hot encoded rows of length `NUM_CLASSES`.

In [5]:
y_train = keras.utils.to_categorical(y_train, NUM_CLASSES)
y_test = keras.utils.to_categorical(y_test, NUM_CLASSES)
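
To make the idea of a binary class matrix concrete, here is a minimal NumPy sketch of one-hot encoding. It is only a conceptual stand-in for `keras.utils.to_categorical`, and the `labels` array is made up for illustration.

```python
import numpy as np

NUM_CLASSES = 10

# Hypothetical labels, standing in for a few entries of y_train.
labels = np.array([3, 0, 9])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix by the labels encodes all of them at once.
one_hot = np.eye(NUM_CLASSES, dtype="float32")[labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

Each row contains a single 1 at the index of the digit, which is the target format that the `categorical_crossentropy` loss used later expects.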

Now we are ready to define the prediction model. To do so, we first need to import `keras_mml`.

In [6]:
import keras_mml

We are now ready to define the `Sequential` model. Notice that we swap out `Dense` layers for `DenseMML` layers, except for the last one, which we leave alone so that the model outputs work correctly. This is because `DenseMML` uses quantization internally: its computations are deliberately restricted so that they never need a matrix multiplication, at the cost of some precision. That trade-off is fine for the hidden layers, but the outputs of our model require the highest precision, so the final layer remains a standard `Dense` layer.

In [7]:
model = keras.Sequential(
    [
        keras.Input(shape=INPUT_SHAPE),
        keras.layers.Flatten(),
        keras_mml.layers.DenseMML(256),
        keras_mml.layers.DenseMML(256),
        keras_mml.layers.DenseMML(256),
        keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ],
    name="MNIST-Classifier"
)

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
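
To give a feel for why quantization can remove the need for matrix multiplications, here is a minimal NumPy sketch. It is not Keras-MML's actual implementation. It assumes ternary weights in $\{-1, 0, 1\}$, obtained by dividing by the mean absolute weight and rounding; whether `DenseMML` does exactly this is an assumption made here for illustration. The function names `ternarize` and `matmul_free_dense` are hypothetical.

```python
import numpy as np


def ternarize(w):
    """Quantize weights to {-1, 0, 1} with one shared scale (assumed scheme)."""
    scale = np.mean(np.abs(w)) + 1e-8
    w_q = np.clip(np.round(w / scale), -1, 1)
    return w_q, scale


def matmul_free_dense(x, w_q, scale):
    """Apply ternary weights using additions and subtractions, then one rescale."""
    out = np.zeros((x.shape[0], w_q.shape[1]), dtype=x.dtype)
    for j in range(w_q.shape[1]):
        plus = x[:, w_q[:, j] == 1].sum(axis=1)    # inputs with weight +1 are added
        minus = x[:, w_q[:, j] == -1].sum(axis=1)  # inputs with weight -1 are subtracted
        out[:, j] = plus - minus                   # inputs with weight 0 are skipped
    return out * scale


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8)).astype("float32")
w = rng.normal(size=(8, 3)).astype("float32")

w_q, scale = ternarize(w)
result = matmul_free_dense(x, w_q, scale)

# Reference: the same quantized layer computed with an ordinary matmul.
reference = x @ (w_q * scale)
print(np.allclose(result, reference, atol=1e-5))
```

Under this assumed scheme, the quantized weights only approximate the original `w`, which illustrates why a full-precision `Dense` layer is kept for the model outputs.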

In [8]:
model.summary()

We can now train the model.

In [9]:
model.fit(x_train, y_train, batch_size=128, epochs=20, validation_split=0.1)

Epoch 1/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.4256 - loss: 2.1773 - val_accuracy: 0.7970 - val_loss: 1.2788
Epoch 2/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.7718 - loss: 1.1203 - val_accuracy: 0.8463 - val_loss: 0.6688
Epoch 3/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8174 - loss: 0.7031 - val_accuracy: 0.8652 - val_loss: 0.5159
Epoch 4/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8374 - loss: 0.5814 - val_accuracy: 0.8757 - val_loss: 0.4529
Epoch 5/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8495 - loss: 0.5237 - val_accuracy: 0.8823 - val_loss: 0.4178
Epoch 6/20
422/422 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8577 - loss: 0.4859 - val_accuracy: 0.8878 - val_loss: 0.3959
Epoch 7/20
422/422

<keras.src.callbacks.history.History at 0x7fa1fbe10ee0>
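
The `422/422` in the training log follows from the arguments passed to `fit()`. The short calculation below reproduces it, assuming the standard MNIST training set of 60,000 images and that `validation_split=0.1` holds out one tenth of them.

```python
import math

TRAIN_SAMPLES = 60_000   # size of the MNIST training set
VALIDATION_SPLIT = 0.1   # same value as passed to fit()
BATCH_SIZE = 128         # same value as passed to fit()

# Samples held out for validation, and samples left for training.
val_samples = int(TRAIN_SAMPLES * VALIDATION_SPLIT)
fit_samples = TRAIN_SAMPLES - val_samples

# The final, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(fit_samples / BATCH_SIZE)

print(val_samples, fit_samples, steps_per_epoch)  # 6000 54000 422
```

The validation metrics (`val_accuracy` and `val_loss`) are therefore computed on 6,000 images that the model does not train on.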

Once the model is trained, let's evaluate it.

In [10]:
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.33874186873435974
Test accuracy: 0.8985999822616577
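
To see what the accuracy figure measures, the sketch below recomputes it by hand on made-up data. It assumes that, with one-hot targets, a prediction counts as correct when the most probable class matches the target class. The `probs` and `targets` arrays are invented for illustration and are not outputs of the model above.

```python
import numpy as np

# Made-up softmax outputs for three samples and ten classes (illustrative only).
probs = np.full((3, 10), 0.02, dtype="float32")
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 4] = 0.82

# Made-up one-hot targets; the third sample's true digit is 9, not 4.
targets = np.eye(10, dtype="float32")[[7, 2, 9]]

# The predicted digit is the index of the largest probability in each row.
predicted = probs.argmax(axis=1)
actual = targets.argmax(axis=1)

# Accuracy is the fraction of samples where prediction and target agree.
accuracy = float((predicted == actual).mean())

print(predicted)  # [7 2 4]
print(accuracy)
```

The same `argmax` step can be applied to the output of `model.predict(x_test)` to obtain the predicted digit for each test image.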


Congratulations! You have seen how to use Keras-MML in your Keras models!