## How to get reproducible results with TensorFlow
A first step towards deterministic behaviour is the tf.random.set_seed function (tf.set_random_seed in TensorFlow 1.x), which sets a global, graph-level seed. This is helpful when the same final set of network weights must be obtained from a given training dataset, for example when a model is used in a production environment.

This notebook runs with Python 3.x and TensorFlow 2.1.x. For example, you can use this container: nvcr.io/nvidia/tensorflow:20.03-tf2-py3

If you wish to run a Jupyter notebook on your local machine, start the container with

docker run -ti --rm -p 8888:8888 -v $(pwd):/tmp nvcr.io/nvidia/tensorflow:20.03-tf2-py3 /bin/bash

then start Jupyter inside the container and open the notebook in your browser at

localhost:8888




In [1]:
import json
import pprint
import tensorflow as tf
import numpy as np
print(tf.version.VERSION)

2.1.0


Now create random numbers with two functions f and g and print them. Because no seed has been set, we expect f and g to return different values.

In [None]:
@tf.function
def f():
  a = tf.random.uniform([1])
  b = tf.random.uniform([1])
  return a, b

@tf.function
def g():
  a = tf.random.uniform([1])
  b = tf.random.uniform([1])
  return a, b

print(f())  # prints '(A1, A2)'
print(g())  # prints '(A3, A4)', different from f() because no seed is set

The global seed can be set to a fixed number, such as 123, to ensure that the same sequence of random numbers is generated each time the code is run.

Reusing this number reproduces the same random numbers in your model. The expected behaviour is that the two functions e and h return exactly the same values for A1 and A2.

In [None]:
tf.random.set_seed(123)

@tf.function
def e():
  a = tf.random.uniform([1])
  b = tf.random.uniform([1])
  return a, b

@tf.function
def h():
  a = tf.random.uniform([1])
  b = tf.random.uniform([1])
  return a, b

print(e())  # prints '(A1, A2)'
print(h())  # prints '(A1, A2)'
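
The seeding steps used in this notebook can be gathered into one helper. The sketch below is a hypothetical convenience function, not part of TensorFlow: seed_everything is an assumed name, and NumPy and TensorFlow are only seeded when they can be imported. It seeds Python's random module, NumPy's global generator and TensorFlow's global seed with the same value.

```python
import random


def seed_everything(seed):
    """Seed the available random number generators (hypothetical helper).

    Returns the names of the libraries that were actually seeded.
    """
    random.seed(seed)
    seeded = ["random"]
    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global generator
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy not installed: nothing to seed
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # TensorFlow 2.x global seed
        seeded.append("tensorflow")
    except ImportError:
        pass  # TensorFlow not installed: nothing to seed
    return seeded


# Re-seeding with the same value restarts the same random sequence.
seeded = seed_everything(123)
first = random.random()
seed_everything(123)
second = random.random()
print(seeded, first == second)
```

Calling the helper once at the top of a notebook, before any model is built, keeps all seeds in one place. As the trials further down suggest, seeding alone may not be enough for bit-identical training results on a GPU.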



## Future Work
See https://pypi.org/project/tensorflow-determinism/ for details.

In the past, tf.math.reduce_sum and tf.math.reduce_mean operated non-deterministically when running on a GPU. This was resolved before TensorFlow version 1.12. These ops now function deterministically by default when running on a GPU.

NGC TensorFlow containers, starting with version 19.06, implement GPU-deterministic TensorFlow functionality. In Python code running inside the container, it is controlled through the environment variables TF_DETERMINISTIC_OPS and TF_CUDNN_DETERMINISTIC, where a value of '1' enables it. The environment-variable cell below sets both to '0', which leaves it disabled for the trials that follow.

In [None]:
!pip install tensorflow-determinism

In [None]:
!pip install keras

In [None]:
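import os
# '1' enables deterministic GPU ops; '0' leaves them disabled,
# so the trials below are not expected to give identical results.
os.environ['TF_DETERMINISTIC_OPS'] = '0'
os.environ['TF_CUDNN_DETERMINISTIC'] = '0'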
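
To switch the deterministic GPU ops on rather than off, the same two variables could be set to '1'. The sketch below assumes TensorFlow 2.1 or an NGC container from version 19.06 onwards, and assumes the variables are set in a fresh kernel before TensorFlow is imported. The name enable_deterministic_ops is a hypothetical helper, not a TensorFlow API.

```python
import os

# Environment variables named in this notebook.
DETERMINISM_VARS = ("TF_DETERMINISTIC_OPS", "TF_CUDNN_DETERMINISTIC")


def enable_deterministic_ops(enabled=True):
    """Set both variables to '1' (enabled) or '0' (disabled)."""
    value = "1" if enabled else "0"
    for name in DETERMINISM_VARS:
        os.environ[name] = value
    # Return the resulting settings so they can be logged with the run.
    return {name: os.environ[name] for name in DETERMINISM_VARS}


settings = enable_deterministic_ops()
print(settings)
```

Logging the returned dictionary next to each trial makes it easier to tell afterwards which runs were made with the deterministic ops enabled.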

In [None]:
'''Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import time

batch_size = 32
num_classes = 10
epochs = 6

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

start_time=time.time()


model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
elapsedtime = time.time() - start_time
print('Test loss:', score[0])
print('Test accuracy:', score[1])
print('Elapsed Time:', elapsedtime)

## Trial 1 :

- val_accuracy: 0.9927  
Test loss: 0.026081169054882777  
Test accuracy: 0.9926999807357788  

In [None]:
'''Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import time

batch_size = 32
num_classes = 10
epochs = 6

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

start_time=time.time()

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
elapsedtime = time.time() - start_time
print('Test loss:', score[0])
print('Test accuracy:', score[1])
print('Elapsed Time:', elapsedtime)

## Trial 2 :

- val_accuracy: 0.9911     
Test loss: 0.027497793700697368      
Test accuracy: 0.991100013256073  

As you can see, these numbers differ from those of Trial 1, although the code is identical.
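
A run-to-run comparison like this can also be made programmatically. The sketch below uses the loss and accuracy reported for Trials 1 and 2; runs_match is a hypothetical helper name. Exact equality is used on purpose, on the assumption that a fully deterministic setup should reproduce the numbers bit for bit.

```python
def runs_match(score_a, score_b):
    """Return True only if both runs report exactly the same metrics."""
    return len(score_a) == len(score_b) and all(
        a == b for a, b in zip(score_a, score_b)
    )


# (test loss, test accuracy) as printed for the two trials
trial_1 = (0.026081169054882777, 0.9926999807357788)
trial_2 = (0.027497793700697368, 0.991100013256073)

print(runs_match(trial_1, trial_2))  # False: the two runs differ
print(runs_match(trial_1, trial_1))  # True: a run always matches itself
```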


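## Now try the Theano flags

Note that these flags are Theano configuration options. Theano reads them from the THEANO_FLAGS environment variable or from .theanorc, and they apply only when Keras runs on the Theano backend. Setting them as individual environment variables, as in the cells below, is not expected to change TensorFlow's behaviour.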

In [None]:
os.environ['dnn.conv.algo_bwd_filter'] = 'deterministic'

In [None]:
os.environ['dnn.conv.algo_bwd_filter'] = 'none'

In [None]:
'''Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import time


batch_size = 32
num_classes = 10
epochs = 6

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

start_time=time.time()

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
elapsedtime = time.time() - start_time
print('Test loss:', score[0])
print('Test accuracy:', score[1])
print('Elapsed Time:', elapsedtime)

## Trial 3 :

- val_accuracy: 0.9916  
Test loss: 0.027999836079927717   
Test accuracy: 0.991599977016449     
    

In [None]:
tf.random.set_seed(123)
os.environ['dnn.conv.algo_bwd_filter'] = 'deterministic'
os.environ['dnn.conv.algo_bwd_data'] = 'deterministic'

In [8]:
import os
os.environ['dnn.conv.algo_bwd_filter'] = 'deterministic'
os.environ['dnn.conv.algo_bwd_data'] = 'deterministic'
#os.environ['dnn.conv.algo_fwd'] = 'time_once'
os.environ['optimizer_excluding'] = 'conv_dnn'

In [9]:
from numpy.random import seed
seed(1)

In [10]:
'''Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import time


batch_size = 32
num_classes = 10
epochs = 2

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

start_time=time.time()

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
elapsedtime = time.time() - start_time
print('Test loss:', score[0])
print('Test accuracy:', score[1])
print('Elapsed Time:', elapsedtime)

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
Epoch 2/2
Test loss: 0.04017468197933631
Test accuracy: 0.9854999780654907
Elapsed Time: 413.41916728019714


The Theano flags dnn.conv.algo_bwd_filter and dnn.conv.algo_bwd_data specify the cuDNN convolution implementation that Theano should use for gradient convolutions. Possible values include:

- none (default): use the default non-deterministic convolution implementation
- deterministic: use a slower but deterministic implementation
- fft: use the Fast Fourier Transform implementation of convolution (very high memory usage)
- guess_once: the first time a convolution is executed, the implementation to use is chosen according to cuDNN’s heuristics and reused for every subsequent execution of the convolution
- guess_on_shape_change: like guess_once, but a new convolution implementation is selected every time the shapes of the inputs and kernels don’t match the shapes from the last execution
- time_once: the first time a convolution is executed, every convolution implementation offered by cuDNN is executed and timed; the fastest is reused for every subsequent execution of the convolution
- time_on_shape_change: like time_once, but a new convolution implementation is selected every time the shapes of the inputs and kernels don’t match the shapes from the last execution
- (algo_bwd_data only) fft_tiling: use the Fast Fourier Transform implementation of convolution with tiling (high memory usage, but less than fft)
- (algo_bwd_data only) small: use a convolution implementation with small memory usage
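
For reference, Theano reads such options from the THEANO_FLAGS environment variable as a comma-separated list of name=value pairs. The sketch below builds that string for the two flags described above. It assumes a multi-backend Keras installation running on Theano (selected here through KERAS_BACKEND) and assumes the variables are set before Keras is imported; build_theano_flags is a hypothetical helper name.

```python
import os


def build_theano_flags(options):
    """Join a dict of Theano options into a THEANO_FLAGS string."""
    return ",".join(f"{name}={value}" for name, value in options.items())


theano_flags = build_theano_flags({
    "dnn.conv.algo_bwd_filter": "deterministic",
    "dnn.conv.algo_bwd_data": "deterministic",
})

# Assumption: these only take effect if set before Keras/Theano is imported.
os.environ["THEANO_FLAGS"] = theano_flags
os.environ["KERAS_BACKEND"] = "theano"

print(theano_flags)
```

With the TensorFlow backend used elsewhere in this notebook, these settings are not expected to have any effect.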