<a href="https://colab.research.google.com/github/kylematoba/deeplearning-project/blob/master/gsa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to get a free Tensor Processing Unit and use it to do interesting stuff


  - Deep Learning: Statistical methods that achieve very expressive models of $y$ by composing several nonlinear functions of $x$. Generally characterised by requiring a lot of data and calculation.
  - TensorFlow: Deep learning framework originally developed at Google Brain. Written largely in C++ and most commonly accessed through Python.
  - Keras: Very high-level API for expressing deep learning models, originally developed by François Chollet, who subsequently went to Google. Most commonly backed by TensorFlow, and in turn incorporated into TensorFlow as `tensorflow.keras`.
  - Tensor Processing Unit (TPU): An accelerator chip optimised for deep learning (relative to a GPU) by giving up things it does not need, such as high-precision arithmetic and rasterisation.
  - Colab: Google-hosted Jupyter notebooks that can be trivially changed to have a GPU or TPU backend.
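
To make "composing several nonlinear functions of $x$" concrete, here is a minimal pure-Python sketch of a two-layer model. The weights `W1`, `b1`, `W2`, `b2` and the function names are made up for illustration; a real network would learn the weights from data.

```python
def affine(weights, bias, x):
    """Compute weights @ x + bias for plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Elementwise nonlinearity: negative entries become zero."""
    return [max(0.0, v) for v in values]


# Hypothetical, hand-picked weights (illustration only).
W1 = [[1.0, -1.0], [0.5, 0.5]]
b1 = [0.0, -1.0]
W2 = [[2.0, 4.0]]
b2 = [1.0]


def tiny_network(x):
    """y = f2(f1(x)): an affine map, a ReLU, then another affine map."""
    hidden = relu(affine(W1, b1, x))
    return affine(W2, b2, hidden)


print(tiny_network([1.0, 2.0]))
```

Stacking more such layers is what makes the model more expressive, at the cost of more data and computation to fit it.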
  
  
  
  



https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/shakespeare_with_tpu_and_keras.ipynb


 
  - https://en.wikipedia.org/wiki/CIFAR-10#Research_Papers_Claiming_State-of-the-Art_Results_on_CIFAR-10
    - State of the art entails many GPU-years to fit
  - https://github.com/keras-team/keras/blob/master/examples/cifar10_cnn.py
  - https://dawn.cs.stanford.edu/benchmark/#cifar10
  
NB: The TPU can be a bit touchy; for nontrivial models I recommend giving it a few seconds between compiling and starting to fit.
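
One way to build that pause in is sketched below. The helper name `fit_after_pause` and the five-second default are assumptions for illustration, not values taken from any TPU documentation.

```python
import time


def fit_after_pause(fit_fn, pause_seconds=5.0):
    """Wait for pause_seconds, then call fit_fn and return its result.

    fit_fn is any zero-argument callable, e.g. a lambda wrapping model.fit.
    """
    time.sleep(pause_seconds)
    return fit_fn()
```

In the notebook this would be called as something like `fit_after_pause(lambda: model.fit(x_train, y_train, epochs=epochs))`.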
    

In [3]:
import os
import time

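# Use the Keras that ships with TensorFlow throughout, rather than also
# depending on the standalone `keras` package.
import tensorflow
from tensorflow import keras

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator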

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D

want_tpu = True
use_tpu = want_tpu and ('COLAB_TPU_ADDR' in os.environ)
batch_size = 32
num_classes = 10
epochs = 100

optimizer = tensorflow.keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape: {}'.format(x_train.shape))
print('training set size: {}'.format(x_train.shape[0]))
print('test set size: {}'.format(x_test.shape[0]))

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

# Scale pixel values from [0, 255] to [0, 1].
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

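if use_tpu:
    # NB: `tensorflow.contrib` exists only in TensorFlow 1.x, so this block needs a 1.x runtime.
    # Colab exposes the TPU's address through the COLAB_TPU_ADDR environment variable.
    TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    tpu_cluster_resolver = tensorflow.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)
    using_single_core = False  # False: spread the work over all TPU cores
    strategy = tensorflow.contrib.tpu.TPUDistributionStrategy(tpu_cluster_resolver, using_single_core=using_single_core)
    # Wrap the compiled Keras model so that fit/evaluate run on the TPU.
    model = tensorflow.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)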


x_train shape: (50000, 32, 32, 3)
training set size: 50000
test set size: 10000
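
To see what the layer stack above does to a 32×32×3 image, the shape and parameter arithmetic can be sketched in plain Python. This is a hand-rolled calculation with made-up helper names (`conv2d`, `max_pool`, `dense_params`, `summarize_cifar10_cnn`), not a Keras API; it assumes 3×3 kernels with stride 1 and 2×2 pooling, as in the model definition.

```python
def conv2d(shape, filters, kernel=3, padding='valid'):
    """Return (output shape, parameter count) of a stride-1 convolution."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    if padding == 'valid':  # 'same' keeps height and width unchanged
        height, width = height - kernel + 1, width - kernel + 1
    return (height, width, filters), params


def max_pool(shape, pool=2):
    """Return (output shape, 0): pooling has no trainable parameters."""
    height, width, channels = shape
    return (height // pool, width // pool, channels), 0


def dense_params(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return units_in * units_out + units_out


def summarize_cifar10_cnn(input_shape=(32, 32, 3), num_classes=10):
    """Walk the conv/pool stack, then add the two dense layers."""
    stack = [
        (conv2d, {'filters': 32, 'padding': 'same'}),
        (conv2d, {'filters': 32}),
        (max_pool, {}),
        (conv2d, {'filters': 64, 'padding': 'same'}),
        (conv2d, {'filters': 64}),
        (max_pool, {}),
    ]
    shape, total = input_shape, 0
    for layer, kwargs in stack:
        shape, params = layer(shape, **kwargs)
        total += params
    flat = shape[0] * shape[1] * shape[2]  # size after Flatten()
    total += dense_params(flat, 512) + dense_params(512, num_classes)
    return shape, flat, total


print(summarize_cifar10_cnn())
```

For the CIFAR-10 input this gives a 6×6×64 feature map before flattening (2304 values) and 1,250,858 trainable parameters, most of them in the first dense layer. Activation, Dropout and Flatten layers contribute no parameters.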


In [0]:
# If the TPU misbehaves, uncomment to pause briefly between compiling and fitting.
# time.sleep(1)
data_augmentation = True
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        zca_epsilon=1e-06,  # epsilon for ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        # randomly shift images horizontally (fraction of total width)
        width_shift_range=0.1,
        # randomly shift images vertically (fraction of total height)
        height_shift_range=0.1,
        shear_range=0.,  # set range for random shear
        zoom_range=0.,  # set range for random zoom
        channel_shift_range=0.,  # set range for random channel shifts
        # set mode for filling points outside the input boundaries
        fill_mode='nearest',
        cval=0.,  # value used for fill_mode = "constant"
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False,  # do not flip images vertically
        # set rescaling factor (applied before any other transformation)
        rescale=None,
        # set function that will be applied on each input
        preprocessing_function=None,
        # image data format, either "channels_first" or "channels_last"
        data_format=None,
        # fraction of images reserved for validation (strictly between 0 and 1)
        validation_split=0.0)

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                        epochs=epochs,
                        validation_data=(x_test, y_test))

    

# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

Using real-time data augmentation.
Epoch 1/100
Epoch 2/100
Epoch 3/100
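
The only augmentations actually switched on above are random horizontal flips and random shifts of up to 10% of the width and height. A simplified pure-Python sketch of the horizontal part follows. It is an illustration of the idea rather than the `ImageDataGenerator` implementation: it works on a 2D nested list, shifts by whole pixels only, and the function names are invented here.

```python
import random


def horizontal_flip(image):
    """Mirror each row of a 2D image (a list of rows)."""
    return [row[::-1] for row in image]


def shift_columns(image, shift):
    """Shift the image right by `shift` pixels (negative means left).

    Pixels that would come from outside the image are filled with the
    nearest edge pixel, in the spirit of fill_mode='nearest'.
    """
    width = len(image[0])
    return [[row[min(max(col - shift, 0), width - 1)] for col in range(width)]
            for row in image]


def augment(image, width_shift_range=0.1, flip_probability=0.5, rng=random):
    """Apply a random whole-pixel horizontal shift and a random flip."""
    width = len(image[0])
    max_shift = int(width_shift_range * width)
    shift = rng.randint(-max_shift, max_shift)
    if rng.random() < flip_probability:
        image = horizontal_flip(image)
    return shift_columns(image, shift)
```

Because each epoch sees a freshly perturbed copy of every training image, the network rarely sees exactly the same input twice, which tends to reduce overfitting.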