# TUTORIAL: TensorFlow basic computation using multiple GPUs

## Introduction

The aim of this tutorial is to use the `AI TRAINING` product to run a **very simple tensor computation** with the `TensorFlow` library and to show how to place that computation on one or several GPUs.

## Prerequisites

* a Public cloud project
* an `AI-TRAINING` notebook job launched with the `Tensorflow 2` preset image ([documentation available here](https://docs.ovh.com/gb/en/ai-training/start-use-notebooks/))
* the notebook resources should include at least 2 GPUs

## In practice

### Step 1: Import the TensorFlow library

In [1]:
import tensorflow as tf

### Step 2 (optional): Enable the log of device placement

This step is optional, but it helps to understand what is happening under the hood. If executed, it logs the device selected for each operation.

In [2]:
tf.debugging.set_log_device_placement(True)

*Example with a simple computation on the default device:*

In [3]:
# Initialize 2 tensors
tensor_1 = tf.constant([[4.0, 3.0, 1.0], [2.0, 6.0, 6.0]])
tensor_2 = tf.constant([[5.0, 1.0], [5.0, 5.0], [7.0, 8.0]])

# Multiply them
tensor_3 = tf.matmul(tensor_1, tensor_2)

Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
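
As a complement to the placement log, the device can also be read back from the result tensor itself. The following is a minimal sketch that reuses the two tensors above and relies on the `device` attribute and `numpy()` method of eager tensors; the names `result` and `result_device` are introduced here for illustration only.

```python
import tensorflow as tf

# Same two tensors as in the example above
tensor_1 = tf.constant([[4.0, 3.0, 1.0], [2.0, 6.0, 6.0]])
tensor_2 = tf.constant([[5.0, 1.0], [5.0, 5.0], [7.0, 8.0]])

# Multiply them on the default device
tensor_3 = tf.matmul(tensor_1, tensor_2)

# Convert the result to a NumPy array to read its values
result = tensor_3.numpy()
print(result)

# Name of the device holding the result
# (a CPU device if no GPU is visible to TensorFlow)
result_device = tensor_3.device
print(result_device)
```

On a notebook with GPUs, the printed device name should match the one reported by the placement log.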


### Step 3: Check that multiple GPUs are available on your notebook

In [4]:
# Get the list of all logical GPU devices on your notebook
GPU_DEVICES = tf.config.list_logical_devices('GPU')
# Keep only the name of each GPU device
GPU_DEVICES_NAMES = [x.name for x in GPU_DEVICES]
# The number of GPU devices
GPU_DEVICES_NB = len(GPU_DEVICES)

if GPU_DEVICES_NB == 0:
    raise SystemError('No GPU device found on your notebook, please restart with at least 2 GPUs')
elif GPU_DEVICES_NB == 1:
    raise SystemError('Only 1 GPU found on your notebook, please restart with at least 2 GPUs')
else:
    print(f'{GPU_DEVICES_NB} GPU device(s) have been found on your notebook :')

for nb in range(GPU_DEVICES_NB):
    gpu_name = GPU_DEVICES_NAMES[nb]
    print(f'* GPU n°{nb} whose name is "{gpu_name}"')

2 GPU device(s) have been found on your notebook :
* GPU n°0 whose name is "/device:GPU:0"
* GPU n°1 whose name is "/device:GPU:1"


### Step 4: Define your own placement strategy

To manually choose the GPU on which each operation runs, wrap the operation in a `tf.device` context:

In [5]:
# Initialize 2 tensors
tensor_1 = tf.constant([[4.0, 3.0, 1.0], [2.0, 6.0, 6.0]])
tensor_2 = tf.constant([[5.0, 1.0], [5.0, 5.0], [7.0, 8.0]])

# Force execution of the multiplication on the first GPU
with tf.device(GPU_DEVICES_NAMES[0]):
    print('Executing first operation :')
    tensor_3 = tf.matmul(tensor_1, tensor_2)
    print('')

# Force execution of the multiplication on the second GPU
with tf.device(GPU_DEVICES_NAMES[1]):
    print('Executing second operation :')
    tensor_3 = tf.matmul(tensor_1, tensor_2)
    print('')

Executing first operation :
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0

Executing second operation :
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:1
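
Manual placement can also be used to spread independent operations over all GPUs and then combine the partial results. The sketch below follows the pattern shown in the official TensorFlow GPU guide: one multiplication per device, summed on the CPU with `tf.add_n`. The fallback to CPU devices is an assumption added here so that the cell also runs on a machine without GPUs.

```python
import tensorflow as tf

# Use every logical GPU; fall back to CPU devices so the sketch
# still runs when no GPU is available (assumption for illustration)
devices = tf.config.list_logical_devices('GPU') or tf.config.list_logical_devices('CPU')

partial_results = []
for device in devices:
    # Place one independent multiplication on each device
    with tf.device(device.name):
        a = tf.constant([[4.0, 3.0, 1.0], [2.0, 6.0, 6.0]])
        b = tf.constant([[5.0, 1.0], [5.0, 5.0], [7.0, 8.0]])
        partial_results.append(tf.matmul(a, b))

# Gather the partial results and sum them on the CPU
with tf.device('/CPU:0'):
    total = tf.add_n(partial_results)

print(total.numpy())
```

With two GPUs, each entry of `total` is twice the corresponding entry of a single multiplication, since the same product is computed once per device.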



### Step 5: Use a predefined distribution strategy

Manual placement over GPUs is great for experienced users, but it is often simpler to let the library choose for you by using a predefined distribution strategy.

For this example, we use a `MirroredStrategy`, which creates a replica of your model on each available GPU device. See the official documentation linked in the *Going further* section below.

In [6]:
# Build and compile a simple model inside the strategy scope
with tf.distribute.MirroredStrategy().scope():
    inputs = tf.keras.layers.Input(shape=(1,))
    predictions = tf.keras.layers.Dense(1)(inputs)
    model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
    model.compile(loss='mse', optimizer=tf.keras.optimizers.SGD(learning_rate=0.2))

INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1')
Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Add in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarIsInitializedOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op LogicalNot in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Assert in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/
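
Once the model is compiled inside the strategy scope, it can be trained with the usual Keras `fit` call, and each batch is split across the replicas. The following is a minimal sketch under stated assumptions: the toy dataset (`y = 3x + 2`), the per-replica batch size of 16, the learning rate, and the number of epochs are all hypothetical values chosen for illustration, not part of the original notebook.

```python
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# Hypothetical per-replica batch size; the global batch is divided
# between the replicas, so it is scaled by the number of replicas
PER_REPLICA_BATCH_SIZE = 16
global_batch_size = PER_REPLICA_BATCH_SIZE * strategy.num_replicas_in_sync

# Toy regression data (assumption for illustration): y = 3x + 2
x = np.linspace(-1.0, 1.0, 256, dtype=np.float32).reshape(-1, 1)
y = 3.0 * x + 2.0
dataset = (
    tf.data.Dataset.from_tensor_slices((x, y))
    .shuffle(256)
    .batch(global_batch_size)
)

# Variables must be created inside the strategy scope to be mirrored
with strategy.scope():
    inputs = tf.keras.layers.Input(shape=(1,))
    predictions = tf.keras.layers.Dense(1)(inputs)
    model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
    model.compile(loss='mse', optimizer=tf.keras.optimizers.SGD(learning_rate=0.1))

# Train for a few epochs and keep the loss recorded after each one
history = model.fit(dataset, epochs=5, verbose=0)
losses = history.history['loss']

print(f'Replicas in sync: {strategy.num_replicas_in_sync}')
print(f'Loss per epoch: {losses}')
```

Scaling the batch size by `strategy.num_replicas_in_sync` keeps the amount of data seen by each GPU per step constant when GPUs are added.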

## Going further

* For more information about running computations over GPUs with TensorFlow, we advise you to follow the [official documentation](https://www.tensorflow.org/guide/gpu)
* For more information about running distributed computations over several GPUs with TensorFlow, we advise you to follow the [official documentation](https://www.tensorflow.org/guide/distributed_training)
* **Resource consumption** of your notebook is displayed in a monitoring dashboard. **Execute the following cell to get the dashboard URL for your notebook** session. The credentials needed to access this dashboard are the same as those used for the current notebook.

In [None]:
import os

if 'NOTEBOOK_ID' in os.environ:
    VARID = "var-notebook=" + os.environ['NOTEBOOK_ID']
    HOST = os.environ['NOTEBOOK_HOST']
    SUBDOMAIN = "notebook"
else:
    VARID = "var-job=" + os.environ['JOB_ID']
    HOST = os.environ['JOB_HOST']
    SUBDOMAIN = "job"


print('Your resource monitoring dashboard URL is :')
print(f'http://{HOST.replace(SUBDOMAIN, "monitoring")}/d/gpu/job-monitoring?orgId=1&from=now-5m&{VARID}&to=now')
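
The same URL construction can be wrapped in a small function, which makes it easier to check outside of a running notebook. This is only a sketch: the function name `monitoring_url`, the explicit error when neither variable is set, and the example identifier and host are hypothetical and not real OVHcloud values.

```python
def monitoring_url(env):
    """Build the monitoring dashboard URL from a mapping of environment variables."""
    if 'NOTEBOOK_ID' in env:
        var_id = 'var-notebook=' + env['NOTEBOOK_ID']
        host = env['NOTEBOOK_HOST']
        subdomain = 'notebook'
    elif 'JOB_ID' in env:
        var_id = 'var-job=' + env['JOB_ID']
        host = env['JOB_HOST']
        subdomain = 'job'
    else:
        # Added for illustration: fail with an explicit message
        raise KeyError('Neither NOTEBOOK_ID nor JOB_ID is set')

    # Swap the "notebook" or "job" part of the host name for "monitoring"
    monitoring_host = host.replace(subdomain, 'monitoring')
    return f'http://{monitoring_host}/d/gpu/job-monitoring?orgId=1&from=now-5m&{var_id}&to=now'


# Hypothetical values, for illustration only
example_env = {
    'NOTEBOOK_ID': 'abc123',
    'NOTEBOOK_HOST': 'abc123.notebook.example.com',
}
example_url = monitoring_url(example_env)
print(example_url)
```

Inside a notebook or job, the function can be called as `monitoring_url(os.environ)` after `import os`.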