# Taking advantage of Colab Pro



## Faster GPUs

With Colab Pro you have priority access to our fastest GPUs. For example, you may get a T4 or P100 GPU at times when most users of standard Colab receive a slower K80 GPU. You can see what GPU you've been assigned at any time by executing the following cell.

In [2]:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

Sat Jun  6 16:08:03 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   41C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru

In order to use a GPU with your notebook, select the Runtime > Change runtime type menu, and then set the hardware accelerator dropdown to GPU.
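
The GPU check can also be done from plain Python, which is handy when the result should drive later logic instead of only being printed. The sketch below is one possible approach, not part of the original notebook: the helper names `parse_gpu_csv` and `list_gpus` and the sample row are made up for illustration, and it assumes the `nvidia-smi` binary (when present) supports the `--query-gpu` and `--format=csv,noheader` options.

```python
import shutil
import subprocess


def parse_gpu_csv(csv_text):
    """Parse 'name, memory' CSV rows into (name, memory) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or malformed rows
        name, memory = [field.strip() for field in line.rsplit(",", 1)]
        gpus.append((name, memory))
    return gpus


def list_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if none are available."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA tooling on this runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


# Illustrative sample row (hypothetical GPU name), parsed without needing a GPU.
sample_row = "Example-GPU, 16280 MiB\n"
print(parse_gpu_csv(sample_row))

gpus = list_gpus()
print(gpus if gpus else "No GPU attached to this runtime.")
```

An empty list means either that no GPU accelerator is selected or that the driver tooling is missing, so the runtime type is the first thing to check in that case.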

## More memory

With Colab Pro you have the option to access high-memory VMs when they are available. To set your notebook preference to use a high-memory runtime, select the Runtime > 'Change runtime type' menu, and then select High-RAM in the Runtime shape dropdown.

You can see how much memory you have available at any time by running the following code.


In [3]:
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('To enable a high-RAM runtime, select the Runtime > "Change runtime type"')
  print('menu, and then select High-RAM in the Runtime shape dropdown. Then, ')
  print('re-execute this cell.')
else:
  print('You are using a high-RAM runtime!')

Your runtime has 27.4 gigabytes of available RAM

You are using a high-RAM runtime!


In [4]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Longer runtimes

All Colab runtimes are reset after some period of time (sooner if the runtime isn't executing code). Colab Pro subscribers are still subject to these limits, but they will be roughly twice the limits for non-subscribers.

## Resource limits in Colab Pro

Your resources are not unlimited in Colab Pro. To make the most of Colab Pro, please avoid using resources when you don't need them. For example, only use a GPU or high-RAM runtime when required, and close Colab tabs when finished.


## Send us feedback!

If you have any feedback for us, please let us know. The best way to send feedback is by using the Help > 'Send feedback...' menu. If you encounter usage limits in Colab Pro and would be interested in a product with higher usage limits, do let us know.

If you encounter errors or other issues with billing (payments) for Colab Pro, please email colab-billing@google.com.

In [5]:
import os
import random
import numpy as np
import pandas as pd 
import pickle
from skimage import io
from skimage import color
from PIL import Image
from IPython.display import display
import matplotlib.pyplot as plt
import seaborn as sns
from dask.array.image import imread
from dask import bag, threaded
from dask.diagnostics import ProgressBar
import cv2
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings("ignore")



import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image 
from keras.layers import BatchNormalization
from keras import optimizers


Using TensorFlow backend.




In [13]:
train_image = []
image_label = []

for i in range(10):
  path = "/content/drive/My Drive/distdrv/cache/zoommask_r_224_c_224_c_3_class" + str(i) + ".dat"

  print(f'loading pickle files from class = {i}')
  # Each file holds a pickled (images, labels) pair for one class;
  # the context manager closes the file once it has been read.
  with open(path, 'rb') as file:
    images, labels = pickle.load(file)
  train_image.extend(images)
  #image_label = image_label + labels





loading pickle files from class = 0
loading pickle files from class = 1
loading pickle files from class = 2
loading pickle files from class = 3
loading pickle files from class = 4
loading pickle files from class = 5
loading pickle files from class = 6
loading pickle files from class = 7
loading pickle files from class = 8
loading pickle files from class = 9
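
Judging by how `train_image` is unpacked in the later cells, each cache file appears to hold a pickled `(images, labels)` pair in which every entry of `images` is a `(features, label, driver)` tuple. The following is a minimal round-trip sketch under that assumption. The file name, driver IDs, and tiny nested lists are hypothetical stand-ins for the real 224x224x3 arrays.

```python
import os
import pickle
import tempfile

# Hypothetical stand-ins: tiny nested lists instead of real image arrays.
images = [
    ([[0, 0], [0, 0]], 0, "driver_a"),
    ([[1, 1], [1, 1]], 0, "driver_b"),
]
labels = [0, 0]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_class0.dat")  # hypothetical file name

    # Write the (images, labels) pair in the layout the loader expects.
    with open(path, "wb") as file:
        pickle.dump((images, labels), file)

    # Read it back the same way the loading loop does.
    with open(path, "rb") as file:
        loaded_images, loaded_labels = pickle.load(file)

features, label, driver = loaded_images[0]
print(len(loaded_images), label, driver)
```

Keeping the driver ID next to each image is what makes the driver-based validation split further down possible.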


In [0]:
images = []
labels = []
driver_details = []

In [15]:
print(f'train image size = {len(train_image)}')

train image size = 22424


In [0]:
import random

random.shuffle(train_image)



In [17]:
print(f'train image size = {len(train_image)}')

train image size = 22424


In [0]:
import random

## Getting the list of driver names (one entry per image)
D = [drivers for features, labels, drivers in train_image]

## Deduplicating drivers while keeping first-seen order
## (assumes driver IDs are hashable, e.g. strings)
deduped = list(dict.fromkeys(D))

## Selecting 4 random drivers for the validation set
driv_nums = random.sample(range(len(deduped)), 4)
driv_selected = [deduped[i] for i in driv_nums]


In [19]:
driv_nums

[11, 17, 12, 3]

In [20]:
len(deduped)

26

In [21]:
## Splitting into training and held-out (validation) sets by driver

X_train = []
y_train = []
X_test = []
y_test = []

for features, labels, drivers in train_image:
    if drivers in driv_selected:
        # Images of the selected drivers form the held-out set
        X_test.append(features)
        y_test.append(labels)
    else:
        X_train.append(features)
        y_train.append(labels)

# Keep the integer class labels for scoring; y_test is one-hot encoded later
true_test = y_test

print(len(X_train), len(X_test))
print(len(y_train), len(y_test))






18597 3827
18597 3827
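
The split above holds out whole drivers instead of random images, so no person appears in both sets and the validation score is not inflated by the model recognising a familiar driver. The same idea can be packaged as a small function. This is only a sketch: `split_by_driver`, the seed, and the synthetic sample IDs are made up for illustration.

```python
import random


def split_by_driver(samples, n_val_drivers, seed=None):
    """Split (features, label, driver) tuples so no driver is in both sets."""
    rng = random.Random(seed)
    # Unique drivers in first-seen order (driver IDs must be hashable).
    drivers = list(dict.fromkeys(driver for _, _, driver in samples))
    val_drivers = set(rng.sample(drivers, n_val_drivers))
    train = [s for s in samples if s[2] not in val_drivers]
    val = [s for s in samples if s[2] in val_drivers]
    return train, val, val_drivers


# Synthetic data: 60 "images" spread evenly over 6 hypothetical drivers.
samples = [(f"img{i}", i % 10, f"p{i % 6:03d}") for i in range(60)]
train, val, val_drivers = split_by_driver(samples, n_val_drivers=2, seed=0)

train_drivers = {driver for _, _, driver in train}
print(len(train), len(val), sorted(val_drivers))
print("overlap:", train_drivers & val_drivers)
```

Passing a fixed `seed` makes the held-out drivers reproducible between runs, which the unseeded `random.sample` call in the notebook does not guarantee.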


In [22]:
len(X_train)

18597

In [23]:
## Converting the images to NumPy arrays and one-hot encoding the labels

X_train = np.array(X_train).reshape(-1,224,224,3)
X_test = np.array(X_test).reshape(-1,224,224,3)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Pixel scaling is left disabled
#X_train = X_train/255
#X_test = X_test/255


print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)


(18597, 224, 224, 3)
(3827, 224, 224, 3)
(18597, 10)
(3827, 10)
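
The label shapes change from a flat list to `(n, 10)` because each integer class is turned into a one-hot row. A rough NumPy equivalent of that encoding is sketched below. The helper name `one_hot` is made up here, and the sketch is meant to illustrate the transformation, not to reproduce the Keras implementation.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Set a single 1.0 per row, in the column given by the label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


y = [0, 3, 9]
encoded = one_hot(y, num_classes=10)
print(encoded.shape)
print(encoded)

# argmax along each row recovers the original integer labels.
decoded = encoded.argmax(axis=1)
print(decoded)
```

Because the encoding is reversible with `argmax`, the integer labels saved in `true_test` and the one-hot `y_test` describe the same ground truth in two forms.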


In [0]:
X_train[0]

In [0]:
import sys

# Walk over the notebook variables to inspect their memory footprint
# (the size print-out is left disabled)
local_vars = list(locals().items())
for var, obj in local_vars:
  #print(var, sys.getsizeof(obj))
  pass

In [24]:
## Defining the input

from keras.layers import Input
model1_input = Input(shape = (224, 224, 3), name = 'orig_image_input')

## The VGG model

from keras.applications.vgg16 import VGG16, preprocess_input

#Get back the convolutional part of a VGG network trained on ImageNet
model1 = VGG16(weights='imagenet', include_top=False, input_tensor = model1_input)
#model1.trainable = False
model1.summary()



Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
orig_image_input (InputLayer (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128

In [25]:
# Use the generated model
from keras.models import Model
from keras.layers import Dropout, Dense, GlobalAveragePooling2D, BatchNormalization

output_vgg16_conv = model1(model1_input)

# Add the fully connected layers
x = GlobalAveragePooling2D()(output_vgg16_conv)
#x = Flatten()(output_vgg16_conv)

# Dense layers let the model learn more complex functions of the VGG16 features
x = Dense(1024, activation='relu')(x)  # dense layer 1
x = Dropout(0.1)(x)  # reduced dropout
x = Dense(1024, activation='relu')(x)  # dense layer 2
x = BatchNormalization()(x)
x = Dropout(0.35)(x)
x = Dense(512, activation='relu')(x)  # dense layer 3
x = Dense(10, activation='softmax', name='predictions')(x)

singleModel = Model(inputs=model1_input, outputs=x)
singleModel.summary()

# Compile the CNN model
sgd = optimizers.SGD(learning_rate=0.001)
singleModel.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])


Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
orig_image_input (InputLayer (None, 224, 224, 3)       0         
_________________________________________________________________
vgg16 (Model)                (None, 7, 7, 512)         14714688  
_________________________________________________________________
global_average_pooling2d_1 ( (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              525312    
_________________________________________________________________
dropout_1 (Dropout)          (None, 1024)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              1049600   
_________________________________________________________________
batch_normalization_1 (Batch (None, 1024)              4096
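
The parameter counts in the summary can be checked by hand. A dense layer has one weight per input-unit pair plus one bias per unit, and a batch normalisation layer stores four values per feature (gamma, beta, moving mean, moving variance). The short sketch below redoes that arithmetic. The helper names are made up, and the last two counts are derived from the layer definitions in the cell, not read from the truncated summary.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


def batch_norm_params(n_features):
    """gamma, beta, moving mean and moving variance per feature."""
    return 4 * n_features


# VGG16 halves the spatial size in each of its five max-pooling stages.
feature_map_size = 224 // 2 ** 5

counts = {
    "dense_1": dense_params(512, 1024),
    "dense_2": dense_params(1024, 1024),
    "batch_normalization_1": batch_norm_params(1024),
    "dense_3": dense_params(1024, 512),
    "predictions": dense_params(512, 10),
}

print("feature map:", feature_map_size, "x", feature_map_size)
for name, count in counts.items():
    print(f"{name}: {count}")
```

The first three values match the `Param #` column shown above, and the 7 x 7 feature map matches the `(None, 7, 7, 512)` output shape of the `vgg16` block.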

In [0]:
#singleModel.fit(X_train, y_train, epochs=25, validation_data=(X_test, y_test))

In [26]:
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint,EarlyStopping

checkpointer = ModelCheckpoint('/content/drive/My Drive/kaggle/singlemaskModel_aug.h5', verbose=1, save_best_only=True)
earlystopper = EarlyStopping(monitor='val_loss', patience=5, verbose=1)


datagen = ImageDataGenerator(
    height_shift_range=0.5,
    width_shift_range = 0.5,
    zoom_range = 0.5,
    rotation_range=30
        )
#datagen.fit(X_train)
data_generator = datagen.flow(X_train, y_train, batch_size = 64)

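# Fits the model on batches with real-time data augmentation.
# steps_per_epoch is rounded up so the last partial batch is included.
# Note that fit_generator returns a History object, not a model.
vgg16_model = singleModel.fit_generator(data_generator, steps_per_epoch = int(np.ceil(len(X_train) / 64)), callbacks=[checkpointer, earlystopper],
                                                            epochs = 25, verbose = 1, validation_data = (X_test, y_test))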

#vgg16_model = singleModel.fit_generator(data_generator,steps_per_epoch = len(X_train) / 64,
#                                                            epochs = 25, verbose = 1, validation_data = (X_test, y_test))



Epoch 1/25

Epoch 00001: val_loss improved from inf to 1.63070, saving model to /content/drive/My Drive/kaggle/singlemaskModel_aug.h5
Epoch 2/25

Epoch 00002: val_loss improved from 1.63070 to 0.98482, saving model to /content/drive/My Drive/kaggle/singlemaskModel_aug.h5
Epoch 3/25

Epoch 00003: val_loss improved from 0.98482 to 0.68509, saving model to /content/drive/My Drive/kaggle/singlemaskModel_aug.h5
Epoch 4/25

Epoch 00004: val_loss improved from 0.68509 to 0.61490, saving model to /content/drive/My Drive/kaggle/singlemaskModel_aug.h5
Epoch 5/25

Epoch 00005: val_loss improved from 0.61490 to 0.57777, saving model to /content/drive/My Drive/kaggle/singlemaskModel_aug.h5
Epoch 6/25

Epoch 00006: val_loss did not improve from 0.57777
Epoch 7/25

Epoch 00007: val_loss did not improve from 0.57777
Epoch 8/25

Epoch 00008: val_loss did not improve from 0.57777
Epoch 9/25

Epoch 00009: val_loss did not improve from 0.57777
Epoch 10/25

Epoch 00010: val_loss improved from 0.57777 to 0.
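
The log shows why training kept going after epoch 5: with `patience=5`, four epochs in a row without improvement are not enough to trigger a stop, and the improvement at epoch 10 resets the counter. A simplified sketch of that patience logic follows. It is an illustration, not the Keras implementation, and the validation losses for epochs 6 to 9 are hypothetical because the log does not print them.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0  # an improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Epochs 1-5 come from the log above; epochs 6-9 are hypothetical values
# that merely fail to beat 0.57777.
logged = [1.63070, 0.98482, 0.68509, 0.61490, 0.57777]
stalled = [0.60, 0.60, 0.60, 0.60]

print(early_stopping_epoch(logged + stalled, patience=5))
print(early_stopping_epoch(logged + stalled + [0.60], patience=5))
```

The first call returns `None` because only four epochs stalled. The second adds a fifth stalled epoch and stops at epoch 10, which is what would have happened had epoch 10 not improved.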

In [0]:
#vgg16_model.save("/content/drive/My Drive/kaggle/singleModel_aug.h5")
singleModel.save_weights("/content/drive/My Drive/kaggle/singlemaskModel_aug_weights.h5")

In [29]:
from keras.models import load_model
rcModel = load_model('/content/drive/My Drive/kaggle/singlemaskModel_aug.h5')
rcModel.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
orig_image_input (InputLayer (None, 224, 224, 3)       0         
_________________________________________________________________
vgg16 (Model)                (None, 7, 7, 512)         14714688  
_________________________________________________________________
global_average_pooling2d_1 ( (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              525312    
_________________________________________________________________
dropout_1 (Dropout)          (None, 1024)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              1049600   
_________________________________________________________________
batch_normalization_1 (Batch (None, 1024)              4096

In [30]:
# Predict class probabilities for the validation images and score them
# against the integer labels kept in true_test

from sklearn.metrics import accuracy_score, confusion_matrix

model1_prediction = rcModel.predict(X_test)
print('Images Predicted until now:',len(model1_prediction))
print(f'True images: {len(true_test)}')

# The predicted class is the index of the highest probability in each row
model1_pred_class = np.argmax(model1_prediction, axis=1)

print('The accuracy of this model over validation set is:',accuracy_score(true_test,model1_pred_class))
confusion_matrix(true_test,model1_pred_class)

Images Predicted until now: 3827
True images: 3827
The accuracy of this model over validation set is: 0.8860726417559446


array([[336,   3,   0,   0,   3,   3,   0,   2,  57,   7],
       [  5, 387,   0,   0,   0,   0,   4,   0,  11,   0],
       [  0,   0, 380,   0,   0,   0,   1,   4,  25,   0],
       [  0,   0,   0, 389,   6,   0,   1,   0,   0,   2],
       [  0,   0,   0,   0, 389,   0,   2,   0,  12,   2],
       [  3,   0,   0,   0,   0, 379,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0, 382,   1,  28,   0],
       [  0,   0,   0,   0,   0,   0,   0, 272,   0,   0],
       [ 10,   0,  43,   0,  10,   1,   7,   8, 213,   5],
       [166,   0,   0,   0,   0,   3,   0,   0,   1, 264]])
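
The overall accuracy hides large differences between classes, and the confusion matrix makes them visible. The sketch below copies the matrix from the output above and derives per-class recall and the most frequent single confusion. It assumes the usual scikit-learn layout, where rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix copied from the cell output above
# (rows: true class, columns: predicted class).
cm = np.array([
    [336,   3,   0,   0,   3,   3,   0,   2,  57,   7],
    [  5, 387,   0,   0,   0,   0,   4,   0,  11,   0],
    [  0,   0, 380,   0,   0,   0,   1,   4,  25,   0],
    [  0,   0,   0, 389,   6,   0,   1,   0,   0,   2],
    [  0,   0,   0,   0, 389,   0,   2,   0,  12,   2],
    [  3,   0,   0,   0,   0, 379,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0, 382,   1,  28,   0],
    [  0,   0,   0,   0,   0,   0,   0, 272,   0,   0],
    [ 10,   0,  43,   0,  10,   1,   7,   8, 213,   5],
    [166,   0,   0,   0,   0,   3,   0,   0,   1, 264],
])

accuracy = np.trace(cm) / cm.sum()
recall = np.diag(cm) / cm.sum(axis=1)  # correct / all images of that true class

# Largest off-diagonal entry: the most frequent single confusion.
off_diagonal = cm.copy()
np.fill_diagonal(off_diagonal, 0)
true_cls, pred_cls = np.unravel_index(off_diagonal.argmax(), off_diagonal.shape)

print(f"accuracy = {accuracy:.4f}")
for cls, value in enumerate(recall):
    print(f"class {cls}: recall = {value:.3f}")
print(f"most common confusion: true {true_cls} predicted as {pred_cls}")
```

Class 9 has the lowest recall, mostly because 166 of its images are predicted as class 0, and class 8 is the next weakest. Those two classes are where extra data or targeted augmentation would likely help most.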

In [31]:
print(f'prediction={model1_prediction}, shape={model1_prediction.shape}')

prediction=[[6.32527292e-01 3.55592550e-04 2.59549648e-04 ... 2.88804236e-04
  7.14419130e-03 3.39608192e-01]
 [3.68618569e-03 7.37958448e-03 2.35473178e-03 ... 6.01615291e-04
  1.11919525e-03 3.46934277e-04]
 [6.20885839e-06 1.04078817e-05 3.46379547e-09 ... 4.48854394e-08
  8.84906513e-08 1.28789429e-06]
 ...
 [6.09843509e-09 2.65331479e-10 3.79946546e-10 ... 3.75499805e-11
  3.95628819e-09 2.69009259e-09]
 [1.08291976e-01 2.28985492e-03 2.12338357e-03 ... 1.56412106e-02
  2.02571298e-03 8.63696218e-01]
 [8.35100489e-10 1.59526767e-10 1.00000000e+00 ... 1.56469981e-09
  3.78676335e-09 3.51924351e-10]], shape=(3827, 10)


In [32]:
from sklearn.metrics import log_loss

lgloss = log_loss(y_test, model1_prediction)

print('The log loss from this model is:',round(lgloss,2))

The log loss from this model is: 0.41
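
Log loss is the average negative log of the probability the model assigned to the true class, so confident wrong answers are punished much harder than hesitant ones. A simplified NumPy version is sketched below for intuition. The function name and the clipping constant are choices made for this sketch, and scikit-learn's own clipping details vary between versions.

```python
import numpy as np


def multiclass_log_loss(y_true_one_hot, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    y_true = np.asarray(y_true_one_hot, dtype=float)
    # Clip to avoid log(0), then renormalise each row to sum to 1.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    y_prob = y_prob / y_prob.sum(axis=1, keepdims=True)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))


y_true = np.eye(3)  # three samples, one per class

confident = np.array([[0.8, 0.1, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.1, 0.1, 0.8]])
uniform = np.full((3, 3), 1 / 3)

loss_confident = multiclass_log_loss(y_true, confident)
loss_uniform = multiclass_log_loss(y_true, uniform)
print(round(loss_confident, 4), round(loss_uniform, 4))
```

For reference, a model that guesses uniformly over the 10 classes of this task would score ln(10), about 2.30, so the 0.41 reported above is well below that baseline.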
