<center>
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/assets/logos/SN_web_lightmode.png" width="300" alt="cognitiveclass.ai logo">
</center>


# **GPU with Keras**


Estimated time needed: **25** minutes


You may have heard of GPUs (Graphics Processing Units) and CPUs (Central Processing Units), but what is the difference? GPUs are best known among gamers for faster visual rendering, but nowadays their applications extend well beyond improving the video game experience. In deep learning, GPUs are extremely helpful because they speed up certain computations. The difference is especially evident for models that train on large datasets, where the researcher can take advantage of parallel computing to run operations simultaneously and save time. In this lab, you will learn how to use a GPU with `tensorflow`, specifically `keras`.


<center>
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-ML311-Coursera/labs/Module6/L1/img_GPU.jpeg" width="600" alt="computer components">
</center>


**_Note_**: Skills Network Labs currently doesn't have any GPUs available. In order to test the difference between CPU and GPU, please run this lab on a local machine or environment that has GPUs.


## __Table of Contents__

<ol>
    <li><a href="#Objectives">Objectives</a></li>
    <li>
        <a href="#Setup">Setup</a>
        <ol>
            <li><a href="#Installing-Required-Libraries">Installing Required Libraries</a></li>
            <li><a href="#Importing-Required-Libraries">Importing Required Libraries</a></li>
        </ol>
    </li>
    <li>
        <a href="#Benefits-of-Using-GPU">Benefits of Using GPU</a>
    </li>  
    <li>
        <a href="#Using-CPU">Using CPU</a>
    </li> 
    <li>
        <a href="#Using-GPU">Using GPU</a>
        <ol>
            <li><a href="#Check-Availability">Check Availability</a></li>
            <li><a href="#Choosing-Specific-GPUs">Choosing Specific GPUs</a></li>
        </ol>    
    </li>
    <li>
        <a href="#Using-CPU-and-GPU-jointly">Using CPU and GPU jointly</a>
    </li>     
</ol>


## Objectives

After completing this lab you will be able to:

 - Set environment to CPU/GPU
 - Control usage of CPU/GPU in parts of the code


----


## Setup


For this lab, we will be using the following libraries:

*   [`pandas`](https://pandas.pydata.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for managing the data.
*   [`numpy`](https://numpy.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for mathematical operations.
*   [`sklearn`](https://scikit-learn.org/stable/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for machine learning and machine-learning-pipeline related functions.
*   [`seaborn`](https://seaborn.pydata.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for visualizing the data.
*   [`matplotlib`](https://matplotlib.org/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMML0187ENSkillsNetwork31430127-2021-01-01) for additional plotting tools.


### Installing Required Libraries

The following required libraries are pre-installed in the Skills Network Labs environment. However, if you run these notebook commands in a different Jupyter environment (e.g. Watson Studio or Anaconda), you will need to install these libraries by removing the `#` sign before `!mamba` in the code cell below. The `!pip install --upgrade tensorflow` cell that follows is already uncommented.


In [ ]:
# All Libraries required for this lab are listed below. The libraries pre-installed on Skills Network Labs are commented.
# !mamba install -qy pandas==1.3.4 numpy==1.21.4 seaborn==0.9.0 matplotlib==3.5.0 scikit-learn==0.20.1
# Note: If your environment doesn't support "!mamba install", use "!pip install"

In [ ]:
# %%capture
!pip install --upgrade tensorflow -qqq

### Importing Required Libraries

_We recommend you import all required libraries in one place, as follows:_


In [ ]:
import warnings
warnings.simplefilter('ignore')

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 

import numpy as np

import tensorflow as tf
# Import the keras library
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.python.client import device_lib

## Benefits of Using GPU


GPUs excel at parallel computing compared with CPUs. Parallelism is especially useful for deep learning models such as a Convolutional Neural Network (CNN) or, for that matter, any neural network. An example of a task that parallelizes well is convolution over an input layer, in which the kernel is multiplied with the input matrix one local region at a time. Each region's result is independent of the others, so many regions can be computed simultaneously.

The runtime difference is especially noticeable when you train a CNN for multiple epochs, a task that involves a lot of matrix operations!
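
To make the "one local region at a time" idea concrete, here is a minimal sketch in plain `numpy`. The function name `conv2d_valid` and the toy inputs are assumptions made for illustration. It computes the sliding-window product without flipping the kernel, which is the operation deep learning libraries usually call convolution. It is far slower than what `keras` does internally.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the output map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value depends only on one local region of the input,
            # so all (i, j) positions could be computed at the same time.
            region = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(region * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.ones((2, 2))                          # toy 2x2 kernel that sums each region
print(conv2d_valid(image, kernel))
```

The two nested loops run sequentially here, but no iteration reads the result of another. That independence is what lets a GPU spread the work across many cores.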


## Using CPU


By default, `tensorflow` searches for an available GPU to use. There are two ways to force your machine to ignore all GPUs and run code on the CPU instead.


If you want the entire code/notebook to run on the CPU, you can set the environment variable below _**before**_ importing tensorflow/keras. If you decide to switch back, you can restart the kernel and run the imports as usual without the line below.


In [ ]:
# Hide all GPUs from tensorflow (run this before importing tensorflow, as described above)
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

The environment variable `CUDA_VISIBLE_DEVICES` holds a comma-separated list of the GPU indices that CUDA programs are allowed to see. Values such as `0` or `0,1` expose those GPUs to `tensorflow`. Because `-1` is not a valid GPU index, setting the variable to `-1` hides every GPU, so the code runs on the CPU.
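
To make the effect of different `CUDA_VISIBLE_DEVICES` values concrete, here is a small sketch that models how the variable is interpreted. The helper `visible_gpu_indices` is hypothetical and is not part of `tensorflow` or CUDA. It assumes integer indices only and relies on the documented CUDA behavior that an invalid index hides itself and every entry after it.

```python
def visible_gpu_indices(value, physical_gpu_count):
    """Model which physical GPU indices a CUDA program would see.

    Hypothetical helper for illustration only; it handles integer indices
    and ignores other identifier forms such as GPU UUIDs.
    """
    if value is None:
        # Variable not set: every physical GPU is visible
        return list(range(physical_gpu_count))
    visible = []
    for token in value.split(','):
        try:
            index = int(token)
        except ValueError:
            break  # not an integer: treated as invalid in this sketch
        if not 0 <= index < physical_gpu_count:
            break  # invalid index: this entry and everything after it is ignored
        visible.append(index)
    return visible

print(visible_gpu_indices(None, 2))    # variable unset on a 2-GPU machine
print(visible_gpu_indices('0,1', 4))   # first two GPUs of four
print(visible_gpu_indices('-1', 4))    # no GPUs, so tensorflow falls back to the CPU
```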


In [ ]:
# Confirm the variable is set to -1 (no GPUs visible)
print(os.environ['CUDA_VISIBLE_DEVICES'])
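
Printing the variable only confirms its value. As a sketch of a more direct check, assuming a fresh kernel in which `tensorflow` has not been imported yet and CUDA-based GPUs, you can ask `tensorflow` which devices it actually sees:

```python
import os

# Hide all GPUs from CUDA; this must happen before tensorflow is imported
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
cpus = tf.config.list_physical_devices('CPU')
print("Visible GPUs:", gpus)
print("Visible CPUs:", cpus)
```

With the variable set this way, the GPU list should be empty while at least one CPU device is still reported.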

If you instead want to use the CPU for only portions of the code in a notebook, consider the following approach. Here, you specify what to run on `/CPU:0` using a `with` statement. Using `%%timeit` with `-n1 -r1` times a single pass of the cell. As an example, we'll train the following CNN on the **MNIST** dataset of handwritten digits. Feel free to change the code within the statement to test CPU performance!


In [ ]:
# Import data
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

In [ ]:
# Reshape the data
X_train = X_train.reshape((X_train.shape[0],X_train.shape[1],X_train.shape[2],1))
X_test = X_test.reshape((X_test.shape[0],X_test.shape[1],X_test.shape[2],1))

y_train = y_train.reshape((y_train.shape[0],1))
y_test = y_test.reshape((y_test.shape[0],1))
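
The reshape step above adds a trailing channel axis to the images, because `Conv2D` expects input of shape `(batch, height, width, channels)`, and turns the labels into a column. Here is a minimal sketch of the same idea on synthetic arrays. The names `images` and `labels` and the batch size of 4 are assumptions, and the arrays do not contain real MNIST data.

```python
import numpy as np

# Synthetic stand-in for a batch of 4 grayscale 28x28 images (not real MNIST data)
images = np.zeros((4, 28, 28), dtype=np.uint8)
labels = np.array([3, 1, 4, 1])

# Add a trailing channel axis: (4, 28, 28) -> (4, 28, 28, 1)
images_4d = images[..., np.newaxis]
# Turn the flat label vector into a column: (4,) -> (4, 1)
labels_2d = labels.reshape(-1, 1)

print(images_4d.shape, labels_2d.shape)
```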

In [ ]:
%%timeit -n1 -r1
# Building the CNN model and fitting on train data
with tf.device('/CPU:0'):
    model_cpu = Sequential()
    model_cpu.add(Conv2D(input_shape=(28, 28, 1),
                         filters=5,
                         padding='same',
                         kernel_size=(3, 3)
                         ))
    model_cpu.add(MaxPooling2D(pool_size=(2, 2)))
    model_cpu.add(Flatten())
    model_cpu.add(Dense(256, activation='relu'))
    model_cpu.add(Dense(10, activation='softmax'))

    # The output layer already applies softmax, so the loss receives probabilities, not logits
    model_cpu.compile(optimizer='adam',
                      loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                      metrics=['accuracy'])
    
    model_cpu.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5)
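
`%%timeit` is a notebook magic, so it is not available in a plain Python script. As a rough alternative, here is a sketch of a small timing helper built on the standard library. The name `timed` is an assumption, and the toy workload stands in for the model-building and `fit` code.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Print how long the enclosed block took, in wall-clock seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f} s")

# Stand-in workload; in the lab this would be the model-building and `fit` code
with timed("toy workload"):
    total = sum(i * i for i in range(100_000))
```

Like `-n1 -r1`, this measures a single pass, so expect some run-to-run variation.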

## Using GPU


As mentioned above, `tensorflow` automatically searches for GPUs to run on. Let's take a closer look at how we can have more control over that.


### Check Availability


First, you can check the number of GPUs available on the machine. 


In [ ]:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

If you're running this notebook in Skills Network Labs, you can see that it doesn't have any GPUs available for use. However, if your local machine does have GPU(s), you can try the following code to experiment with what runs on the GPU.
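
Beyond counting GPUs, it can help to see whether your `tensorflow` build supports CUDA at all and what each visible GPU is. The following is a minimal sketch, assuming a recent TensorFlow 2 release in which `tf.config.experimental.get_device_details` is available. On a machine without GPUs the loop simply prints nothing.

```python
import tensorflow as tf

# True only if this tensorflow build was compiled with CUDA (GPU) support
print("Built with CUDA support:", tf.test.is_built_with_cuda())

for gpu in tf.config.list_physical_devices('GPU'):
    # Returns a dict that may include keys such as 'device_name'
    details = tf.config.experimental.get_device_details(gpu)
    print(gpu.name, details.get('device_name', 'unknown'))
```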


### Choosing Specific GPUs


In order to specify a particular GPU to run on, we have to first check what units there are in the environment. The following lists out the information of each device, including the device name, type, memory limit, and so on. 


In [ ]:
print(device_lib.list_local_devices())

To run on a specific GPU, again use `tf.device()` with the device `name` as input, just as we did for the CPU case. Inside the `with` statement, write code as usual. Here, we tell `tensorflow` to run on the GPU with index 2 (indices start at 0, so this is the third GPU; change the index to match a device listed above). We also use `%%timeit` here so you can compare the GPU's run time with the CPU's!


In [ ]:
%%timeit -n1 -r1
with tf.device('/device:GPU:2'):
    model_gpu = Sequential()
    model_gpu.add(Conv2D(input_shape=(28, 28, 1),
                         filters=5,
                         padding='same',
                         kernel_size=(3, 3)
                         ))
    model_gpu.add(MaxPooling2D(pool_size=(2, 2)))
    model_gpu.add(Flatten())
    model_gpu.add(Dense(256, activation='relu'))
    model_gpu.add(Dense(10, activation='softmax'))

    # The output layer already applies softmax, so the loss receives probabilities, not logits
    model_gpu.compile(optimizer='adam',
                      loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                      metrics=['accuracy'])
    
    model_gpu.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5)
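
A hard-coded device name only works on machines that actually have that GPU. As a sketch of one way to guard against this, the hypothetical helper `pick_device` below returns the requested GPU's name when it exists and falls back to `/CPU:0` otherwise. The helper and the small matrix workload are assumptions for illustration.

```python
import tensorflow as tf

def pick_device(preferred_gpu_index):
    """Return the name of the requested logical GPU, or '/CPU:0' if it does not exist."""
    gpus = tf.config.list_logical_devices('GPU')
    if 0 <= preferred_gpu_index < len(gpus):
        return gpus[preferred_gpu_index].name
    return '/CPU:0'

device_name = pick_device(2)
print("Running on:", device_name)

with tf.device(device_name):
    x = tf.random.uniform((3, 3))
    y = tf.matmul(x, x)

# The `device` attribute reports where the result tensor was placed
print(y.device)
```

You could pass the returned name to the `with tf.device(...)` statement in the training cell in the same way.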

## Using CPU and GPU jointly


What if we want to use _both_ CPU and GPU for different parts of the same Python script? It turns out we can do that too! Simply use the `tf.device()` function again to specify which unit each code fragment should run on. Below, we show an example of how to run the same matrix operation on multiple GPUs and add up the resulting tensors on the CPU.


In [ ]:
# Enable tensor allocations or operations to be printed
tf.debugging.set_log_device_placement(True)

# Get list of all logical GPUs
gpus = tf.config.list_logical_devices('GPU')

# Check if there are GPUs on this computer
if gpus:
    # Run matrix computation on multiple GPUs
    c = []
    for gpu in gpus:
        with tf.device(gpu.name):
            a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
            b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]) 
            c.append(tf.matmul(a, b))

    # Run on CPU 
    with tf.device('/CPU:0'):
        matmul_sum = tf.add_n(c)

    print(matmul_sum)
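
The cell above prints nothing on a machine without GPUs. Here is a variant sketch of the same computation that falls back to the CPU when no GPU is found, so it can run anywhere. The fallback logic and the variable names `devices`, `products`, and `total` are assumptions added for illustration.

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Use every logical GPU if there are any, otherwise fall back to the CPU
devices = tf.config.list_logical_devices('GPU') or tf.config.list_logical_devices('CPU')

products = []
for device in devices:
    # Compute the same matrix product once per device
    with tf.device(device.name):
        products.append(tf.matmul(a, b))

# Sum the per-device results on the CPU
with tf.device('/CPU:0'):
    total = tf.add_n(products)

print(total.device)
print(total.numpy())
```

Each device contributes one copy of the product, so the printed values scale with the number of devices used.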

## Authors


[Cindy Huang](https://www.linkedin.com/in/cindy-shih-ting-huang/) is a data science associate of the Skills Network team. She has a passion for machine learning to improve user experience, especially in the area of computational linguistics.


### Other Contributors


[Joseph Santarcangelo](https://www.linkedin.com/in/joseph-s-50398b136/) has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.


## Change Log


|Date (YYYY-MM-DD)|Version|Changed By|Change Description|
|-|-|-|-|
|2022-07-11|0.1|Cindy H.|Created Lab|
|2022-07-21|0.2|Joseph S.|Reviewed Lab|
|2022-08-09|0.3|Steve H.|QA Pass|


Copyright © 2022 IBM Corporation. All rights reserved.
