# Using multiple devices

An easy way to get access to a GPU is to run the code in Google Colab and select GPU as the hardware accelerator in the notebook settings.

In [None]:
import tensorflow as tf

In [None]:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

## Find out where placement occurs

If a TensorFlow operation has both CPU and GPU implementations, it is executed on a GPU device by default when a GPU is available.

In [None]:
# To find out where placement occurs, set 'log_device_placement'
tf.debugging.set_log_device_placement(True)

a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

We can also read a tensor's `device` attribute, which returns the name of the device on which the tensor is placed.

In [None]:
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
print(a.device)
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
print(b.device)
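
The string returned by `device` is a fully qualified device name. As a small sketch, `tf.DeviceSpec.from_string` can split it into its components; the variable name `spec` is only illustrative, and the printed values depend on the machine.

```python
import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0], name='a')

# Parse the fully qualified device name into its components.
spec = tf.DeviceSpec.from_string(a.device)

print(a.device)           # full device string
print(spec.device_type)   # 'CPU' or 'GPU', depending on the machine
print(spec.device_index)  # index of the device within its type
```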

## Create a device context

We can select the device to use by creating a device context with the `tf.device` context manager (`with tf.device(...)`).
Each operation executed inside this context runs on the selected device.

In [None]:
tf.debugging.set_log_device_placement(True)
with tf.device('/device:CPU:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

If we move the `matmul` operation out of the context, it is executed on a GPU device if one is available.

In [None]:
tf.debugging.set_log_device_placement(True)
with tf.device('/device:CPU:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
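
Besides reading the placement log, the difference between the two cells above can be checked programmatically with the `device` attribute of each result. This is a minimal sketch; the names `inside` and `outside` are illustrative, and the device reported for `outside` depends on whether a GPU is present.

```python
import tensorflow as tf

with tf.device('/device:CPU:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    inside = tf.matmul(a, b)   # pinned to the CPU by the context

outside = tf.matmul(a, b)      # placed by TensorFlow's default policy

print(inside.device)   # ends with 'CPU:0'
print(outside.device)  # a GPU device if one is available, else the CPU
```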

## Limit the GPU memory allocation
Be careful with GPU memory allocation: by default, TensorFlow starts with almost all of the GPU memory allocated, and it never releases that memory.

Instead, we can let the allocation grow gradually, as needed, with the `tf.config.experimental.set_memory_growth` method. Another solution is to set the environment variable `TF_FORCE_GPU_ALLOW_GROWTH` to `true`.

In [None]:
gpu_devices = tf.config.list_physical_devices('GPU')
if gpu_devices:
    try:
        tf.config.experimental.set_memory_growth(gpu_devices[0], True)
    except RuntimeError as e:
        # Memory growth cannot be modified after GPU has been initialized
        print(e)
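
The environment-variable route can be sketched as follows. The assumption here is that this runs before TensorFlow initializes the GPU, so in practice it belongs in the first notebook cell, ahead of `import tensorflow`.

```python
import os

# Set the variable before TensorFlow touches the GPU, ideally before
# `import tensorflow` is executed in this process.
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

print(os.environ['TF_FORCE_GPU_ALLOW_GROWTH'])
```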

We can also create a virtual GPU device with `tf.config.experimental.set_virtual_device_configuration` and set the maximum amount of memory (in MB) to allocate on this virtual GPU.

In [None]:
gpu_devices = tf.config.list_physical_devices('GPU')
if gpu_devices:
    try:
        tf.config.experimental.set_virtual_device_configuration(gpu_devices[0],
                                                   [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    except RuntimeError as e:
        # Virtual devices must be set before the GPU has been initialized
        print(e)
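
Recent TensorFlow 2.x releases also expose the same setting outside the `experimental` namespace, as `tf.config.set_logical_device_configuration` and `tf.config.LogicalDeviceConfiguration`. The sketch below assumes such a release is installed; on a machine without a GPU it simply reports zero devices.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Cap the first GPU at 1024 MB through a single logical device.
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    except RuntimeError as e:
        # Logical devices must be configured before the GPU is initialized.
        print(e)

logical_gpus = tf.config.list_logical_devices('GPU')
print(len(gpus), "physical GPU(s),", len(logical_gpus), "logical GPU(s)")
```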

# Using multiple GPUs

We can set placements on multiple devices.
Here, we assume that three devices are available: CPU:0, GPU:0, and GPU:1.

In [None]:
# Create two virtual GPUs
gpu_devices = tf.config.list_physical_devices('GPU')
tf.debugging.set_log_device_placement(True)
if gpu_devices:
    try:
        tf.config.experimental.set_virtual_device_configuration(gpu_devices[0],
                                                   [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024),
                                                    tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024) ])
    except RuntimeError as e:
        # Virtual devices must be set before the GPU has been initialized
        print(e)

print("Num GPUs Available: ", len(tf.config.list_logical_devices('GPU')))

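# Run the example only if two logical GPUs (GPU:0 and GPU:1) exist;
# a CUDA-enabled build alone does not guarantee that a GPU is present.
if len(tf.config.list_logical_devices('GPU')) >= 2: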
    with tf.device('/cpu:0'):
        a = tf.constant([1.0, 3.0, 5.0], shape=[1, 3])
        b = tf.constant([2.0, 4.0, 6.0], shape=[3, 1])
        
        with tf.device('/gpu:0'):
            c = tf.matmul(a,b)
            c = tf.reshape(c, [-1])
        
        with tf.device('/gpu:1'):
            d = tf.matmul(b,a)
            flat_d = tf.reshape(d, [-1])
        
        combined = tf.multiply(c, flat_d)
    print(combined)
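
When no GPU is at hand, the same placement pattern can be tried on virtual CPUs. The sketch below is not part of the original example: it assumes a recent TensorFlow 2.x release, splits the physical CPU into two logical CPUs, and falls back to a single CPU if the runtime was already initialized. The names `first` and `second` are illustrative.

```python
import tensorflow as tf

cpus = tf.config.list_physical_devices('CPU')
try:
    # Split the physical CPU into two logical devices: CPU:0 and CPU:1.
    tf.config.set_logical_device_configuration(
        cpus[0],
        [tf.config.LogicalDeviceConfiguration(),
         tf.config.LogicalDeviceConfiguration()])
except RuntimeError as e:
    # Logical devices cannot be changed once the runtime is initialized.
    print(e)

logical_cpus = tf.config.list_logical_devices('CPU')
first = logical_cpus[0].name    # '/device:CPU:0'
second = logical_cpus[-1].name  # '/device:CPU:1' if the split succeeded

a = tf.constant([1.0, 3.0, 5.0], shape=[1, 3])
b = tf.constant([2.0, 4.0, 6.0], shape=[3, 1])

with tf.device(first):
    c = tf.reshape(tf.matmul(a, b), [-1])        # shape [1]

with tf.device(second):
    flat_d = tf.reshape(tf.matmul(b, a), [-1])   # shape [9]

# Broadcasts the single value in c across the nine values in flat_d.
combined = tf.multiply(c, flat_d)
print(combined)
```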
