Script to test GPU availability in TensorFlow

In [1]:
#============================================================================
#from Diane Fedema
#============================================================================

#!nvidia-smi

#!oc project 

#get the name of the "nvidia-driver-daemonset-xxxx" pod from the output (where xxxx is specific to your driver pod)
#!oc get pods 

#!oc exec -it nvidia-driver-daemonset-xxxx nvidia-smi

In [2]:
#============================================================================
#General script to test GPU availability
#============================================================================

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import tensorflow as tf

In [3]:
#check GPU availability
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    print("Name:", gpu.name, "  Type:", gpu.device_type)
    
#should see Name: /physical_device:GPU:0   Type: GPU

Name: /physical_device:GPU:0   Type: GPU


In [4]:
#List devices, including GPUs, with TensorFlow

from tensorflow.python.client import device_lib

device_lib.list_local_devices()

#should see name: "/device:CPU:0"
 #device_type: "CPU"
 #memory_limit: 268435456
 #locality {
 #}
 #incarnation: 9642244910482137207,
 #name: "/device:XLA_CPU:0"
 #device_type: "XLA_CPU"
 #memory_limit: 17179869184

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 3208460017293559617,
 name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 14403224256
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 1367692923913724399
 physical_device_desc: "device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5"]

In [5]:
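#Check GPU in TensorFlow
#tf.test.is_gpu_available()   #deprecated; returned True or False
tf.config.list_physical_devices('GPU')

#should see a non-empty list of PhysicalDevice entries (an empty list means no GPU was found)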

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
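
The list returned above is empty when TensorFlow cannot see a GPU, so its truthiness gives the same yes/no answer as the commented-out tf.test.is_gpu_available() call. A minimal sketch, assuming a helper named visible_gpu_names (hypothetical, not part of TensorFlow) that also tolerates a missing TensorFlow install:

```python
def visible_gpu_names():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [gpu.name for gpu in tf.config.list_physical_devices('GPU')]

names = visible_gpu_names()
if names is None:
    print("TensorFlow is not installed")
elif names:
    print("GPU available:", names)
else:
    print("No GPU visible to TensorFlow")
```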

In [6]:
#Load MNIST dataset

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

#should see  Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
#11493376/11490434 [==============================] - 0s 0us/step

In [7]:
#Pre-processing of Training and Test Datasets
x_train, x_test = x_train / 255.0, x_test / 255.0
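
MNIST pixels are stored as integers from 0 to 255, so dividing by 255.0 maps them into the range 0.0 to 1.0. A minimal sketch on a made-up 2x2 "image" (the sample values are hypothetical, and plain Python lists stand in for the NumPy arrays used above):

```python
# Hypothetical pixel intensities covering the full 0-255 range.
sample_pixels = [[0, 51], [204, 255]]

# Same scaling as the cell above: divide every pixel by 255.0.
scaled = [[value / 255.0 for value in row] for row in sample_pixels]

print(scaled)  # [[0.0, 0.2], [0.8, 1.0]]
```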


In [8]:
#Create Sequential model using TensorFlow Keras

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

In [9]:
#Run one training image through the untrained model; the output is one logit per digit class
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.8918475 ,  0.14371046,  0.6872673 ,  0.09292346, -0.03083122,
        -0.5006603 ,  0.4896315 ,  0.18308999, -0.28008485, -0.52647316]],
      dtype=float32)
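
The values above are logits, one unnormalized score per digit class, because the final Dense(10) layer has no activation. A minimal sketch of converting them to probabilities with a softmax, written in plain Python and using the numbers printed above (the softmax helper is defined here for illustration):

```python
import math

# Logits copied from the output above (one score per digit class 0-9).
logits = [-0.8918475, 0.14371046, 0.6872673, 0.09292346, -0.03083122,
          -0.5006603, 0.4896315, 0.18308999, -0.28008485, -0.52647316]

def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax(logits)
print(probabilities)
print(sum(probabilities))  # approximately 1.0
```

The largest logit always maps to the largest probability, so the predicted class is the same before and after the softmax.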

In [10]:
#create loss function
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
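
With from_logits=True the loss takes raw scores rather than probabilities: it applies a softmax to the logits and returns the negative log of the probability assigned to the true class. A minimal sketch of that computation for a single example in plain Python (the function name is hypothetical; the Keras class additionally averages over the batch):

```python
import math

def sparse_categorical_crossentropy_from_logits(logits, true_class):
    """Negative log softmax probability of the true class for one example."""
    # Subtract the max logit for numerical stability; the result is unchanged.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[true_class]

# Ten equal logits give every class probability 1/10,
# so the loss is -log(1/10), roughly 2.3, whichever class is correct.
uniform_logits = [0.0] * 10
print(sparse_categorical_crossentropy_from_logits(uniform_logits, true_class=3))
```

An untrained model produces close-to-uniform scores, so an initial loss near 2.3 is a reasonable sanity check for a 10-class problem.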

Compile the Model Designed Earlier
Before the model is ready for training, it needs a few more settings. These are added during the model's compile step:

Loss function: This measures how far the model's predictions are from the true labels during training. You want to minimize this function to "steer" the model in the right direction.

Optimizer: This is how the model is updated based on the data it sees and its loss function.

Metrics: Used to monitor the training and testing steps. The following example uses accuracy, the fraction of the images that are correctly classified.
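
The accuracy metric described above can be sketched in plain Python: take the index of the largest logit as the predicted class and count how often it matches the label. The function name and sample values below are hypothetical.

```python
def accuracy(logits_batch, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    correct = 0
    for logits, label in zip(logits_batch, labels):
        predicted = max(range(len(logits)), key=lambda i: logits[i])
        if predicted == label:
            correct += 1
    return correct / len(labels)

# Made-up logits for four examples and three classes.
logits_batch = [
    [2.0, 0.1, -1.0],   # predicts class 0
    [0.3, 0.2, 1.5],    # predicts class 2
    [-0.5, 0.9, 0.1],   # predicts class 1
    [1.2, 0.4, 0.0],    # predicts class 0
]
labels = [0, 2, 1, 1]

print(accuracy(logits_batch, labels))  # 3 of 4 correct: 0.75
```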


In [11]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

Training and Validation
The Model.fit method adjusts the model parameters to minimize the loss:

In [15]:
model.fit(x_train, y_train, epochs=5)

#should see 
#Train on 60000 samples
#Epoch 1/5
#60000/60000 [==============================] - 5s 81us/sample - loss: 0.2936 - accuracy: 0.9143
#Epoch 2/5
#60000/60000 [==============================] - 5s 77us/sample - loss: 0.1403 - accuracy: 0.9588

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f4549fbd130>

In [16]:
# The Model.evaluate method checks the model's performance, usually on a validation set or test set.

model.evaluate(x_test,  y_test, verbose=2)

#should see
#10000/10000 - 1s - loss: 0.0696 - accuracy: 0.9764
#[0.06958451116532087, 0.9764]

313/313 - 0s - loss: 0.0690 - accuracy: 0.9812


[0.06901837140321732, 0.9811999797821045]