Kaggle Dogs Vs. Cats Using LeNet  on Google Colab TPU
==================================================

### Required setup

1. Update api_token with kaggle api key for downloading dataset

          - Login to kaggle
          - My Profile > Edit Profile > Createt new API Token
          - Update **api_token** dict below with the values
          
2. Change Notebook runtime to TPU
          
          - In colab notebook menu, Runtime > Change runtime type
          - Select TPU in the list

Install kaggle package, download and extract zip file

In [10]:
!pip install kaggle

api_token = {"username":"xxxxx","key":"xxxxxxxxxxxxxxxxxxxxxxxx"}

import json
import zipfile
import os

os.mkdir('/root/.kaggle')

with open('/root/.kaggle/kaggle.json', 'w') as file:
    json.dump(api_token, file)
!chmod 600 /root/.kaggle/kaggle.json
# !kaggle config path -p /root
!kaggle competitions download -c dogs-vs-cats

Collecting keras
[?25l  Downloading https://files.pythonhosted.org/packages/5e/10/aa32dad071ce52b5502266b5c659451cfd6ffcbf14e6c8c4f16c0ff5aaab/Keras-2.2.4-py2.py3-none-any.whl (312kB)
[K    3% |█                               | 10kB 20.8MB/s eta 0:00:01[K    6% |██                              | 20kB 5.4MB/s eta 0:00:01[K    9% |███▏                            | 30kB 6.0MB/s eta 0:00:01[K    13% |████▏                           | 40kB 5.7MB/s eta 0:00:01[K    16% |█████▎                          | 51kB 5.8MB/s eta 0:00:01[K    19% |██████▎                         | 61kB 6.8MB/s eta 0:00:01[K    22% |███████▍                        | 71kB 6.3MB/s eta 0:00:01[K    26% |████████▍                       | 81kB 6.4MB/s eta 0:00:01[K    29% |█████████▍                      | 92kB 7.1MB/s eta 0:00:01[K    32% |██████████▌                     | 102kB 6.9MB/s eta 0:00:01[K    36% |███████████▌                    | 112kB 7.0MB/s eta 0:00:01[K    39% |████████████▋        

In [0]:
zip_ref = zipfile.ZipFile('/content/train.zip', 'r')
zip_ref.extractall()
zip_ref.close()

Re-arrange classes to 2 separate directories

In [0]:
!mkdir train/cat train/dog
!mv train/*cat*.jpg train/cat
!mv train/*dog*.jpg train/dog

Training configs

In [0]:
BATCH_SIZE   = 64
IMG_DIM      = (256, 256, 3)
NUM_EPOCHS   = 1

Setup generators to provide with train and validation batches

In [15]:
import tensorflow as tf
from tensorflow import keras

print(keras.__version__)
print(tf.__version__)

datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2)

traingen = datagen.flow_from_directory(
    'train',
    batch_size = BATCH_SIZE,
    target_size = IMG_DIM[:-1],
    class_mode = 'categorical',
    subset='training')

valgen = datagen.flow_from_directory(
    'train',
    batch_size = BATCH_SIZE,
    target_size = IMG_DIM[:-1],
    class_mode = 'categorical',
    subset='validation')

2.1.6-tf
1.11.0
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.


Define LeNet model architecture

In [17]:
input = keras.layers.Input(IMG_DIM, name="input")
conv1 = keras.layers.Conv2D(20, kernel_size=(5, 5), padding='same')(input)
pool1 = keras.layers.MaxPooling2D(pool_size=(2,2), strides=(2,2))(conv1)
conv2 = keras.layers.Conv2D(50, kernel_size=(5,5), padding='same')(pool1)
pool2 = keras.layers.MaxPooling2D(pool_size=(2,2), strides=(2,2))(conv1)
flatten1 = keras.layers.Flatten()(pool2)
fc1 = keras.layers.Dense(500, activation='relu')(flatten1)
fc2 = keras.layers.Dense(2, activation='softmax')(fc1)

model = keras.models.Model(inputs=input, outputs=fc2)
model.compile(
    loss='categorical_crossentropy',
    optimizer=keras.optimizers.SGD(lr=0.01),
    metrics=['accuracy'])

print(model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input (InputLayer)           (None, 256, 256, 3)       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 256, 256, 20)      1520      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 128, 128, 20)      0         
_________________________________________________________________
flatten (Flatten)            (None, 327680)            0         
_________________________________________________________________
dense (Dense)                (None, 500)               163840500 
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 1002      
Total params: 163,843,022
Trainable params: 163,843,022
Non-trainable params: 0
______________________________________________________________

Check for TPU availability

In [18]:
import os

try:
  device_name = os.environ['COLAB_TPU_ADDR']
  TPU_ADDRESS = 'grpc://' + device_name
  print('Found TPU at: {}'.format(TPU_ADDRESS))

except KeyError:
  print('TPU not found')

Found TPU at: grpc://10.73.36.106:8470


Convert keras model to TPU model

In [19]:
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)))

INFO:tensorflow:Querying Tensorflow master (b'grpc://10.73.36.106:8470') for TPU system metadata.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 4430250600885613814)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 14860921772671154020)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_GPU:0, XLA_GPU, 17179869184, 10329331434607546216)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 3020471452782936925)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 7058080726911325303)
INFO:tensorflow:*** Available Device: _Device

Run training

In [21]:
tpu_model.fit_generator(
    traingen,
    steps_per_epoch=traingen.n//traingen.batch_size,
    epochs=1,
    validation_data=valgen,
    validation_steps=valgen.n//valgen.batch_size)

Epoch 1/1
INFO:tensorflow:New input shapes; (re-)compiling: mode=train, [TensorSpec(shape=(8, 256, 256, 3), dtype=tf.float32, name='input_20'), TensorSpec(shape=(8, 2), dtype=tf.float32, name='dense_1_target_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for input
INFO:tensorflow:Cloning SGD {'lr': 0.009999999776482582, 'momentum': 0.0, 'decay': 0.0, 'nesterov': False}
INFO:tensorflow:Get updates: Tensor("loss/mul:0", shape=(), dtype=float32)
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 28.781575918197632 secs
INFO:tensorflow:Setting weights on TPU model.
  5/312 [..............................] - ETA: 43:10 - loss: 1.3635 - acc: 0.5250INFO:tensorflow:New input shapes; (re-)compiling: mode=train, [TensorSpec(shape=(4, 256, 256, 3), dtype=tf.float32, name='input_20'), TensorSpec(shape=(4, 2), dtype=tf.float32, name='dense_1_target_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Rema

<tensorflow.python.keras.callbacks.History at 0x7f674345db70>

Save the model weights

In [22]:
tpu_model.save_weights('./lenet-catdog.h5', overwrite=True)

INFO:tensorflow:Copying TPU weights to the CPU


Download model weights locally

In [0]:
from google.colab import files

files.download("lenet-catdog.h5")