# TensorFlow Keras MNIST Example on Habana Gaudi<sup>TM</sup>

This Jupyter notebook demonstrates how to train a simple neural network on a Habana Gaudi<sup>TM</sup> card. The network is built with the Keras API and trained on the MNIST dataset.


In [1]:
%pwd

'/home/ubuntu'

In [2]:
!ls

BUILD_FROM_SOURCE_PACKAGES_LICENCES  THIRD_PARTY_SOURCE_CODE_URLS
LINUX_PACKAGES_LICENSES		     resnet50_keras_example.ipynb
LINUX_PACKAGES_LIST		     resnet50_keras_lars_bf16_1card.yaml
PYTHON_PACKAGES_LICENSES	     tf_mnist.ipynb


We will clone the `0.15.4` branch of the Habana `Model-References` repository into the current directory.

In [3]:
!git clone -b 0.15.4 https://github.com/HabanaAI/Model-References.git

Cloning into 'Model-References'...
remote: Enumerating objects: 5011, done.
remote: Counting objects: 100% (2786/2786), done.
remote: Compressing objects: 100% (2035/2035), done.
remote: Total 5011 (delta 1148), reused 2149 (delta 696), pack-reused 2225
Receiving objects: 100% (5011/5011), 64.04 MiB | 51.11 MiB/s, done.
Resolving deltas: 100% (1951/1951), done.
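
Re-running the clone cell fails once the `Model-References` directory exists. A minimal sketch of a guard in Python is shown below; the helper names `build_clone_command` and `clone_if_missing` are hypothetical and not part of the Habana tooling, and the URL and branch are the ones used in the cell above.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/HabanaAI/Model-References.git"
BRANCH = "0.15.4"


def build_clone_command(url, branch):
    """Return the git command from the cell above as an argument list."""
    return ["git", "clone", "-b", branch, url]


def clone_if_missing(url, branch, target):
    """Clone into `target` only if it does not exist yet.

    Returns True if a clone was started, False if `target` was already there.
    """
    target = Path(target)
    if target.exists():
        return False
    # check=True raises CalledProcessError if git exits with a non-zero status.
    subprocess.run(build_clone_command(url, branch) + [str(target)], check=True)
    return True


# Show the command without running it.
print(build_clone_command(REPO_URL, BRANCH))
```

Calling `clone_if_missing(REPO_URL, BRANCH, "Model-References")` would then be safe to repeat when the notebook is re-executed.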


Check that the repository was cloned successfully.

In [4]:
%ls

BUILD_FROM_SOURCE_PACKAGES_LICENCES  THIRD_PARTY_SOURCE_CODE_URLS
LINUX_PACKAGES_LICENSES              resnet50_keras_example.ipynb
LINUX_PACKAGES_LIST                  resnet50_keras_lars_bf16_1card.yaml
[0m[01;34mModel-References[0m/                    tf_mnist.ipynb
PYTHON_PACKAGES_LICENSES


Now let's check whether the `Model-References` repository location is in `sys.path`. If it is not, we will add it.

In [5]:
import sys
sys.path

['/home/ubuntu',
 '/usr/lib/python37.zip',
 '/usr/lib/python3.7',
 '/usr/lib/python3.7/lib-dynload',
 '',
 '/home/ubuntu/.local/lib/python3.7/site-packages',
 '/usr/local/lib/python3.7/dist-packages',
 '/usr/lib/python3/dist-packages',
 '/usr/local/lib/python3.7/dist-packages/IPython/extensions',
 '/home/ubuntu/.ipython']

Add the `Model-References` location to `sys.path` so that the Python packages in the `Model-References` repository can be imported in this notebook.

In [6]:
sys.path.append('/home/ubuntu/Model-References')
sys.path

['/home/ubuntu',
 '/usr/lib/python37.zip',
 '/usr/lib/python3.7',
 '/usr/lib/python3.7/lib-dynload',
 '',
 '/home/ubuntu/.local/lib/python3.7/site-packages',
 '/usr/local/lib/python3.7/dist-packages',
 '/usr/lib/python3/dist-packages',
 '/usr/local/lib/python3.7/dist-packages/IPython/extensions',
 '/home/ubuntu/.ipython',
 '/home/ubuntu/Model-References']
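
Running the append cell twice adds the same entry twice. The check-then-add step can be sketched as a small helper; `ensure_on_sys_path` is a hypothetical name introduced here for illustration, and the path is the one used above.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(directory):
    """Append `directory` to sys.path unless it is already listed.

    Returns True if the path was added, False if it was already present.
    """
    # Normalize so that `~` or a trailing slash does not produce a duplicate.
    resolved = str(Path(directory).expanduser().resolve())
    if resolved in sys.path:
        return False
    sys.path.append(resolved)
    return True


added = ensure_on_sys_path("/home/ubuntu/Model-References")
print("added" if added else "already present")
```

The helper only compares normalized strings, so an entry that was added earlier in a different spelling (for example through a symlink) would not be detected.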

Now let's build a neural network and run it on the HPU.

First, import the required packages.

In [7]:
import tensorflow as tf

2021-11-16 20:47:42.335801: W tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
2021-11-16 20:47:42.335894: W tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped.
2021-11-16 20:47:42.357561: W tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.


Disable TensorFlow eager execution when running TF 2.x.

In [8]:
tf.compat.v1.disable_eager_execution()

Load Habana TensorFlow software modules.

In [9]:
from TensorFlow.common.library_loader import load_habana_module
load_habana_module()

2021-11-16 20:47:47.972003: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX512F
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 13765085378285791497,
 name: "/device:HPU:0"
 device_type: "HPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 13815380069492528153]

Download the MNIST dataset, which `load_data()` returns already split into training and test sets, and scale the pixel values from the range [0, 255] to [0, 1].

In [10]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

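Build a simple neural network with the Keras API: the 28x28 image is flattened and fed to a single dense layer with 10 outputs, one per digit class.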

In [11]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10),
])

loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
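
Because the dense layer has no softmax, the model outputs raw logits, which is why the loss is created with `from_logits=True`. The following pure-Python sketch illustrates the per-sample quantity that sparse categorical cross-entropy computes from logits, along with the parameter count of the model above. It is an illustration only, not the TensorFlow implementation, and the function name is hypothetical.

```python
import math


def sparse_categorical_crossentropy_from_logits(logits, label):
    """Cross-entropy for one sample: -log(softmax(logits)[label])."""
    # Subtract the largest logit before exponentiating for numerical stability.
    largest = max(logits)
    log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_sum_exp - logits[label]


# Trainable parameters of Flatten(28x28) -> Dense(10): weights plus biases.
n_inputs = 28 * 28
n_outputs = 10
n_params = n_inputs * n_outputs + n_outputs
print(n_params)  # 7850

# If all 10 logits are equal, the loss is ln(10) regardless of the label.
uniform_loss = sparse_categorical_crossentropy_from_logits([0.0] * 10, label=3)
print(round(uniform_loss, 4))  # 2.3026
```

A loss near 2.3 at the start of training is therefore what an uninformed 10-class classifier would produce, which gives a baseline for reading the loss values reported later.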

Train the model for 5 epochs with a batch size of 128.

In [12]:
model.fit(x_train, y_train, epochs=5, batch_size=128)

Train on 60000 samples


2021-11-16 20:48:02.457201: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2999995000 Hz


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f26a5942550>
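
The epoch and batch size settings determine how many weight updates the optimizer performs. A quick arithmetic sketch, assuming the final partial batch of each epoch is kept rather than dropped:

```python
import math

num_samples = 60000  # from "Train on 60000 samples" in the output above
batch_size = 128
epochs = 5

# The last batch of each epoch is smaller when the sizes do not divide evenly.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(last_batch_size)  # 96
print(total_updates)    # 2345
```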

Evaluate the model on the test dataset. The output has the format [loss, accuracy].

In [13]:
model.evaluate(x_test, y_test)



[0.44847774155139924, 0.8872]
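
To put the result in concrete terms, the sketch below converts the reported accuracy into a count of correctly classified images, using the fact that the standard MNIST test split holds 10,000 images. It also shows how accuracy can be derived from logits by taking the index of the largest value; `accuracy_from_logits` is a hypothetical helper for illustration, not the Keras metric itself.

```python
# Values copied from the evaluate() output above.
loss, accuracy = [0.44847774155139924, 0.8872]

num_test_images = 10000  # size of the standard MNIST test split
correct = round(accuracy * num_test_images)
print(correct)                    # 8872
print(num_test_images - correct)  # 1128


def accuracy_from_logits(logits_batch, labels):
    """Fraction of samples whose largest logit sits at the label index."""
    hits = 0
    for logits, label in zip(logits_batch, labels):
        predicted = max(range(len(logits)), key=logits.__getitem__)
        if predicted == label:
            hits += 1
    return hits / len(labels)


# Toy two-class batch: the first two samples are right, the third is wrong.
toy_accuracy = accuracy_from_logits([[0.1, 2.0], [1.5, 0.3], [0.2, 0.9]], [1, 0, 0])
print(toy_accuracy)
```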