# Binary classification with Keras neural network running on GPU

Original notebook: https://www.kaggle.com/kosovanolexandr/keras-nn-x-ray-predict-pneumonia-86-54  
Dataset: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

### Our Goal

The goal of this notebook is to demonstrate GPU utilization on an OpenShift cluster with Open Data Hub components running on top of it.  
The Jupyter Notebook server was deployed by the Open Data Hub operator, and thanks to the operator this notebook image was built with all the dependencies needed to use the GPU. Specifically, we are working with tensorflow-gpu version 2.7.0 and CUDA version 11.4.2.  
We will demonstrate GPU usage by building a neural network and training it on chest X-rays in order to predict whether a patient suffers from pneumonia.

You can run this notebook cell by cell and watch the resource usage on the following Grafana instance: https://grafana-route-grafana.apps.sno-nvidia-p6.redhat.hpecic.net/dashboards/

### Imports

Import the packages needed by our Python code.

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense
from tensorflow.keras import backend as K
import tensorflow as tf

import os
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline

### Check that GPU is enabled by default

The output of the cell below should list a device named */physical_device:GPU:0*, discovered thanks to the dependencies installed in this notebook image. The type of this device is *GPU*.

In [None]:
print('List of available GPUs: ', tf.config.list_physical_devices('GPU'))
print("Number of GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

### Verify our directory structure

In [None]:
print(os.listdir("/opt/app-root/src/data/chest_xray"))

print(os.listdir("/opt/app-root/src/data/chest_xray/test"))

print(os.listdir("/opt/app-root/src/data/chest_xray/train/"))
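
Beyond listing the directories, it can help to know how many images each class contains. The following is a minimal sketch, assuming each split directory holds one sub-directory per class, as the listing above suggests. The helper name `count_images` and the accepted file extensions are illustrative choices, not part of the original notebook.

```python
import os

def count_images(split_dir, extensions=(".jpeg", ".jpg", ".png")):
    """Return {class_name: image_count} for each sub-directory of split_dir."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files that are not class directories
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts

# Path taken from the notebook; the guard keeps the cell harmless elsewhere.
train_dir = "/opt/app-root/src/data/chest_xray/train"
if os.path.isdir(train_dir):
    print(count_images(train_dir))
```

A strong imbalance between the two counts is worth keeping in mind when reading the accuracy reported at the end of the notebook.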

### Check an image in the "NORMAL" training set

Let's display a sample image from the normal training set, i.e., one tagged as a patient without pneumonia.

In [None]:
img_name = 'NORMAL2-IM-0588-0001.jpeg'
img_normal = load_img('/opt/app-root/src/data/chest_xray/train/NORMAL/' + img_name)

print('NORMAL')
plt.imshow(img_normal)
plt.show()

### Check an image in the PNEUMONIA training set

Let's display a sample image from the pneumonia training set, i.e., one tagged as a patient who suffers from pneumonia.

In [None]:
img_name = 'person63_bacteria_306.jpeg'
img_pneumonia = load_img('/opt/app-root/src/data/chest_xray/train/PNEUMONIA/' + img_name)

print('PNEUMONIA')
plt.imshow(img_pneumonia)
plt.show()

### Initialize variables

We define a few variables.

In [None]:
# dimensions of our images.
img_width, img_height = 150, 150

In [None]:
# Path to the data directories
train_data_dir = '/opt/app-root/src/data/chest_xray/train'
test_data_dir = '/opt/app-root/src/data/chest_xray/test'

nb_train_samples = 5232 # Number of train images
epochs = 20 # Number of times we process the entire dataset
batch_size = 16 # Number of images fed to the network at a time

In [None]:
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

### Create Sequential model

We define the Keras model and add some layers in order to create our neural network

In [None]:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
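
To see what this stack of layers does to a 150x150 RGB image, the shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used above (3x3 kernels with 'valid' padding and stride 1, 2x2 pooling) and the channels-last layout. The helper functions are illustrative and are not Keras APIs.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1: each side shrinks by kernel - 1
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling with stride 2 halves each side (rounding down)
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    # one kernel x kernel x in_channels weight block plus one bias per filter
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_units, out_units):
    # full weight matrix plus one bias per output unit
    return in_units * out_units + out_units

size, channels, total = 150, 3, 0
for filters in (32, 32, 64):
    total += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat = size * size * channels  # length of the vector produced by Flatten
total += dense_params(flat, 64) + dense_params(64, 1)
print("flattened features:", flat)      # 18496
print("trainable parameters:", total)   # 1212513
```

These figures can be compared with the output of `model.summary()`. Most of the parameters sit in the first `Dense` layer, which connects the 18496 flattened features to 64 units.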

### Check information about the model

In [None]:
model.layers

In [None]:
model.input

In [None]:
model.output

### Compile the model

We define the loss that training will minimize, the optimizer that updates the weights, and the metric reported during training and evaluation.

In [None]:
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
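
For a single image with label y (0 or 1) and predicted probability p, binary cross-entropy is -(y·log(p) + (1 - y)·log(1 - p)). The plain-Python sketch below illustrates that formula. The clipping constant `eps` is an assumption added here to avoid log(0), and the function is only an illustration, not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Cross-entropy for one sample; eps is an assumed clipping constant."""
    p = min(max(p_pred, eps), 1 - eps)  # keep p strictly inside (0, 1)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A confident correct prediction is penalized lightly,
# a confident wrong prediction heavily.
print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054
print(round(binary_crossentropy(1, 0.1), 4))  # 2.3026
```

The loss reported during training is the average of this quantity over the images in each batch.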

### Load images from the different sets

In [None]:
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

In [None]:
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

In [None]:
# Process the train data set
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

In [None]:
# Process the test data set
test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')
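
With `class_mode='binary'`, `flow_from_directory` infers the classes from the sub-directory names and assigns label indices in alphanumeric order. The sketch below reproduces that mapping in plain Python for the two class directories of this dataset. The helper name `class_indices_for` is hypothetical.

```python
def class_indices_for(class_names):
    """Map each class name to its index in sorted (alphanumeric) order."""
    return {name: index for index, name in enumerate(sorted(class_names))}

mapping = class_indices_for(["PNEUMONIA", "NORMAL"])
print(mapping)  # {'NORMAL': 0, 'PNEUMONIA': 1}
```

In the notebook, the actual mapping can be read from `train_generator.class_indices`. With this ordering, a sigmoid output close to 1 means the model predicts pneumonia.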

### Fit the model

#### Choose between CPU and GPU

You can choose to use either the GPU or the CPU. Uncomment the *device_path="/cpu:0"* line if you want to force TensorFlow to use the CPU.

In [None]:
# GPU
device_path="/gpu:0"
# CPU
# device_path="/cpu:0"
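
As an alternative to editing the cell by hand, the device can be chosen from the list of GPUs discovered earlier. This is a minimal sketch. The helper `pick_device` is hypothetical and takes the list as an argument so that it can be tried without a GPU.

```python
def pick_device(gpu_devices):
    """Return '/gpu:0' when at least one GPU was discovered, else '/cpu:0'."""
    return "/gpu:0" if len(gpu_devices) > 0 else "/cpu:0"

# In the notebook this would be called as:
# device_path = pick_device(tf.config.list_physical_devices('GPU'))
print(pick_device([]))                          # /cpu:0
print(pick_device(["/physical_device:GPU:0"]))  # /gpu:0
```

Setting `device_path` by hand remains the way to force the CPU run used for the timing comparison.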

Before training the model, you can open the Grafana instance and observe the different dashboards about the resource consumption.  
Here is the [GPU usage dashboard](https://grafana-route-grafana.apps.sno-nvidia-p6.redhat.hpecic.net/d/51c4ef955d49b81689012770a4b1791ba80e9c7a/nvidia-dcgm-exporter-dashboard?orgId=1)  
Here is the [CPU usage dashboard](https://grafana-route-grafana.apps.sno-nvidia-p6.redhat.hpecic.net/d/4ccbbd05fa2622168c09e3b8b92194d2f5825d95/kubernetes-compute-resources-pod?orgId=1&refresh=10s)  
On both dashboards you can check the *Utilization* graph.  
Note that because some of the other graph values are calculated over a 5-minute range, you need to wait a bit after training starts before the peak appears.

#### Train the model
Now we can train the model. Note that training takes around 20 minutes to complete on the GPU and around 33 minutes on the CPU.

In [None]:
with tf.device(device_path):
    model.fit(
        train_generator,
        steps_per_epoch=nb_train_samples // batch_size,
        epochs=epochs)
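
The amount of work requested by this `fit` call follows directly from the variables defined earlier. The sketch below redoes that arithmetic in plain Python, together with the speedup implied by the approximate durations quoted above. The durations are rough figures, so the derived numbers are only indicative.

```python
nb_train_samples = 5232
batch_size = 16
epochs = 20

steps_per_epoch = nb_train_samples // batch_size  # batches per epoch
total_steps = steps_per_epoch * epochs            # weight updates overall
print("steps per epoch:", steps_per_epoch)        # 327
print("total steps:", total_steps)                # 6540

# Approximate durations quoted in this notebook (minutes)
gpu_minutes, cpu_minutes = 20, 33
speedup = cpu_minutes / gpu_minutes
print(f"GPU speedup over CPU: about {speedup:.2f}x")  # about 1.65x
print(f"GPU seconds per step: about {gpu_minutes * 60 / total_steps:.2f}")
```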

### Evaluate the model

In [None]:
# evaluate the model
scores = model.evaluate(test_generator)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
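
The accuracy reported here is obtained by thresholding the sigmoid output at 0.5 and comparing the result with the true label. The sketch below shows that computation in plain Python on made-up labels and probabilities. The values are illustrative only and are not results from this model.

```python
def binary_accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    predicted = [1 if p > threshold else 0 for p in p_pred]
    correct = sum(int(label == guess) for label, guess in zip(y_true, predicted))
    return correct / len(y_true)

# Made-up example: 0 = NORMAL, 1 = PNEUMONIA
labels = [0, 0, 1, 1, 1]
probabilities = [0.10, 0.70, 0.80, 0.40, 0.95]
print(binary_accuracy(labels, probabilities))  # 0.6
```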

## Cleanup

When you finish the lab, please run the following steps to clean up and free compute resources. CAUTION: this will delete all your notebooks and the kernel variables containing your model.

- Please delete the kernel by clicking Kernel->Shut Down Kernel
- Please remove your personal folder by opening a new terminal (File->New->Terminal) and running *rm -rf /opt/app-root/src/notebooks/<FOLDER_NAME>*, where FOLDER_NAME is the name of your personal directory. This will erase all the content of your directory, so please download or push your work first if needed.