# FastRNN and FastGRNN in TensorFlow

This notebook illustrates how to use the TensorFlow implementation of FastRNN and FastGRNN on the USPS dataset. Run `fetch_usps.py` first to download and clean up the dataset.

In [1]:
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import helpermethods
import tensorflow as tf
import numpy as np
import sys
import os

#Provide the GPU number to be used; an empty string hides all GPUs so the notebook runs on the CPU
os.environ['CUDA_VISIBLE_DEVICES'] =''

#FastRNN and FastGRNN imports
from edgeml.trainer.fastTrainer import FastTrainer
from edgeml.graph.rnn import FastGRNNCell
from edgeml.graph.rnn import FastRNNCell
from edgeml.graph.rnn import UGRNNLRCell
from edgeml.graph.rnn import GRULRCell
from edgeml.graph.rnn import LSTMLRCell

# Fixing seeds for reproducibility
tf.set_random_seed(42)
np.random.seed(42)

# USPS Data

It is assumed that the USPS data has already been downloaded and processed with [fetch_usps.py](fetch_usps.py) and [process_usps.py](process_usps.py), and is present in the `./usps10` subdirectory.

Note: Even though usps10 is not a time-series dataset, it can be treated as one by feeding each 16x16 image row by row, one row per timestep.
So the number of timesteps = 16 and inputDims = 16.
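
To make the row-as-timestep view concrete, here is a minimal NumPy sketch. The `batch` array is a hypothetical stand-in filled with random values, not the real USPS data, and the reshape assumes the 256 features are stored row by row.

```python
import numpy as np

# Hypothetical stand-in for a batch of 4 flattened 16x16 images (random values).
batch = np.random.rand(4, 256).astype(np.float32)

input_dims = 16                                # features per timestep (one image row)
num_timesteps = batch.shape[1] // input_dims   # 256 / 16 = 16 timesteps

# Each image becomes a sequence of 16 rows, each row a 16-dimensional input.
sequences = batch.reshape(-1, num_timesteps, input_dims)
print(sequences.shape)  # (4, 16, 16)
```

The resulting shape `(batch, timesteps, inputDims)` is the layout an RNN input tensor of shape `[None, 16, 16]` expects.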

In [2]:
#Loading and Pre-processing dataset for FastCells
dataDir = "usps10"
(dataDimension, numClasses, Xtrain, Ytrain, Xtest, Ytest, mean, std) = helpermethods.preProcessData(dataDir)
print("Feature Dimension: ", dataDimension)
print("Num classes: ", numClasses)

Feature Dimension:  256
Num classes:  10


# Model Parameters

FastRNN and FastGRNN work with most of the hyper-parameter settings that achieve decent accuracy on LSTM/GRU. In addition, you can use low-rank parameterisation, sparsity and quantization to reduce the model size by up to 45x compared to LSTM/GRU.

In [3]:
cell = "FastGRNN" # Choose between FastGRNN, FastRNN, UGRNN, GRU and LSTM

inputDims = 16 #features taken in by RNN in one timestep
hiddenDims = 32 #hidden state of RNN

totalEpochs = 300
batchSize = 100

learningRate = 0.01
decayStep = 200
decayRate = 0.1

outFile = None #provide your file, if you need all the logging info in a file

#low-rank parameterisation for weight matrices. None => Full Rank
wRank = None 
uRank = None 

#Sparsity of the weight matrices. x => 100*x % are non-zeros
sW = 1.0 
sU = 1.0

#Non-linearities for the RNN architecture. Can choose from "tanh, sigmoid, relu, quantTanh, quantSigm"
update_non_linearity = "tanh"
gate_non_linearity = "sigmoid"

assert dataDimension % inputDims == 0, "Infeasible per step input, Timesteps have to be integer"
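
As a rough illustration of what `wRank` and `uRank` buy, the sketch below counts stored weights for full-rank and factored matrices. It assumes a rank-r factorisation stores two factors with `r * (rows + cols)` entries in total; the helper `matrix_params` and the rank value 8 are hypothetical and not part of edgeml.

```python
def matrix_params(rows, cols, rank=None):
    """Stored weights for a rows x cols matrix, optionally factored at `rank`."""
    if rank is None:
        return rows * cols          # full-rank: one dense matrix
    return rank * (rows + cols)     # low-rank: two thin factors

input_dims, hidden_dims = 16, 32

# W maps the input to the hidden state, U maps the hidden state to itself.
full = matrix_params(hidden_dims, input_dims) + matrix_params(hidden_dims, hidden_dims)
low_rank = (matrix_params(hidden_dims, input_dims, rank=8)
            + matrix_params(hidden_dims, hidden_dims, rank=8))
print(full, low_rank)  # 1536 896
```

Sparsity compounds with this: a sparsity of `x` keeps roughly `100*x` % of those weights as non-zeros.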

Placeholders for feeding data during training and inference

In [4]:
X = tf.placeholder("float", [None, int(dataDimension / inputDims), inputDims])
Y = tf.placeholder("float", [None, numClasses])

Creating a timestamped directory for the current model inside the data directory

In [5]:
currDir = helpermethods.createTimeStampDir(dataDir, cell)
helpermethods.dumpCommand(sys.argv, currDir)

# FastCell Graph Object

Instantiating the FastCell Graph using modular RNN Cells which will be used for training and inference.

Note: RNN cells in `edgeml.graph.rnn` can be used anywhere in place of LSTM/GRU in a plug-and-play fashion.

In [6]:
#Create appropriate RNN cell object based on choice
if cell == "FastGRNN":
    FastCell = FastGRNNCell(hiddenDims, gate_non_linearity=gate_non_linearity,
                            update_non_linearity=update_non_linearity,
                            wRank=wRank, uRank=uRank)
elif cell == "FastRNN":
    FastCell = FastRNNCell(hiddenDims, update_non_linearity=update_non_linearity,
                           wRank=wRank, uRank=uRank)
elif cell == "UGRNN":
    FastCell = UGRNNLRCell(hiddenDims, update_non_linearity=update_non_linearity,
                           wRank=wRank, uRank=uRank)
elif cell == "GRU":
    FastCell = GRULRCell(hiddenDims, update_non_linearity=update_non_linearity,
                         wRank=wRank, uRank=uRank)
elif cell == "LSTM":
    FastCell = LSTMLRCell(hiddenDims, update_non_linearity=update_non_linearity,
                          wRank=wRank, uRank=uRank)
else:
    sys.exit('Exiting: No Such Cell as ' + cell)
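
For intuition about what a FastGRNN cell computes per timestep, here is a minimal NumPy sketch of the update as described in the FastGRNN paper. It is not the edgeml implementation: the function name, the random weights, and the fixed values of `zeta` and `nu` (trainable scalars in the paper) are assumptions for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def fastgrnn_step(x, h_prev, W, U, b_z, b_h, zeta, nu):
    """One FastGRNN update; W and U are shared by the gate and the candidate state."""
    pre = x @ W + h_prev @ U
    z = sigmoid(pre + b_z)          # gate (gate_non_linearity)
    h_tilde = np.tanh(pre + b_h)    # candidate state (update_non_linearity)
    return (zeta * (1.0 - z) + nu) * h_tilde + z * h_prev

rng = np.random.default_rng(0)
input_dims, hidden_dims, timesteps = 16, 32, 16
W = rng.normal(scale=0.1, size=(input_dims, hidden_dims))
U = rng.normal(scale=0.1, size=(hidden_dims, hidden_dims))
b_z = np.ones(hidden_dims)
b_h = np.zeros(hidden_dims)
zeta, nu = 1.0, 1e-4                # fixed here; assumed values, learned in practice

x_seq = rng.normal(size=(timesteps, input_dims))  # one hypothetical 16-step sequence
h = np.zeros(hidden_dims)
for x_t in x_seq:
    h = fastgrnn_step(x_t, h, W, U, b_z, b_h, zeta, nu)
print(h.shape)  # (32,)
```

Sharing `W` and `U` between the gate and the candidate is what keeps the parameter count close to that of a plain RNN.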

# FastCell Trainer Object

Instantiating the FastCell Trainer, which will be used for the three-phase training

In [7]:
FastCellTrainer = FastTrainer(FastCell, X, Y, sW=sW, sU=sU, learningRate=learningRate, outFile=outFile)
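
The `sW` and `sU` arguments set the target fraction of non-zero weights. One common way to reach such a target is hard thresholding: after dense training, keep only the largest-magnitude entries, then retrain with the zero pattern fixed. The sketch below shows the thresholding step only. The function `hard_threshold` is hypothetical, and that the trainer's sparsification works this way is an assumption. With `sW = sU = 1.0`, as in this notebook, nothing would be pruned.

```python
import numpy as np

def hard_threshold(weights, sparsity):
    """Keep the largest-magnitude fraction `sparsity` of entries and zero the rest."""
    flat = np.abs(weights).ravel()
    keep = int(np.ceil(sparsity * flat.size))
    if keep >= flat.size:
        return weights.copy()               # sparsity 1.0: nothing is pruned
    if keep <= 0:
        return np.zeros_like(weights)       # sparsity 0.0: everything is pruned
    kth = flat.size - keep
    threshold = np.partition(flat, kth)[kth]  # magnitude of the keep-th largest entry
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 16))               # hypothetical dense weight matrix
W_sparse = hard_threshold(W, 0.25)
print(np.count_nonzero(W_sparse), W.size)   # 128 512
```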

Declaring the session and initializing the variables. An `InteractiveSession` doesn't clog the entire GPU.

In [8]:
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

# FastCell Training Routine

This method runs the three-phase training, reports the accuracy of the best early-stopping model, and saves the parameters.

In [9]:
FastCellTrainer.train(batchSize, totalEpochs, sess, Xtrain, Xtest,
                      Ytrain, Ytest, decayStep, decayRate, dataDir, currDir)


Epoch Number: 0

******************** Dense Training Phase Started ********************

Train Loss: 1.3531070024999854 Train Accuracy: 0.565881378744563
Test Loss: 0.8334901 Test Accuracy: 0.7349278

Epoch Number: 1
Train Loss: 0.5264064224615489 Train Accuracy: 0.8227005854044875
Test Loss: 0.52811986 Test Accuracy: 0.83557546

Epoch Number: 2
Train Loss: 0.3170111432467421 Train Accuracy: 0.8997546287432109
Test Loss: 0.41971388 Test Accuracy: 0.87593424

Epoch Number: 3
Train Loss: 0.22838621382435706 Train Accuracy: 0.9285217539904869
Test Loss: 0.37176716 Test Accuracy: 0.8943697

Epoch Number: 4
Train Loss: 0.17584358977332507 Train Accuracy: 0.9436173479850978
Test Loss: 0.3482268 Test Accuracy: 0.9013453

Epoch Number: 5
Train Loss: 0.1554100387921072 Train Accuracy: 0.9503703141865665
Test Loss: 0.36468038 Test Accuracy: 0.8963627

Epoch Number: 6
Train Loss: 0.13128593576791353 Train Accuracy: 0.9591509887616928
Test Loss: 0.36238122 Test Accuracy: 0.9028401

Epoch Number: 


Epoch Number: 62
Train Loss: 0.01902851595364715 Train Accuracy: 0.9946575393415478
Test Loss: 0.43569365 Test Accuracy: 0.918286

Epoch Number: 63
Train Loss: 0.022098659849502402 Train Accuracy: 0.9919178142939529
Test Loss: 0.4453173 Test Accuracy: 0.92575985

Epoch Number: 64
Train Loss: 0.02353779313932747 Train Accuracy: 0.9930001562588835
Test Loss: 0.43414015 Test Accuracy: 0.91429996

Epoch Number: 65
Train Loss: 0.016468530626048986 Train Accuracy: 0.9947809764783676
Test Loss: 0.43052217 Test Accuracy: 0.9217738

Epoch Number: 66
Train Loss: 0.016379667304086257 Train Accuracy: 0.9958904148781136
Test Loss: 0.4004999 Test Accuracy: 0.92825115

Epoch Number: 67
Train Loss: 0.012232361819072026 Train Accuracy: 0.9971232904146795
Test Loss: 0.40298688 Test Accuracy: 0.93273544

Epoch Number: 68
Train Loss: 0.008708359920403806 Train Accuracy: 0.998493152121975
Test Loss: 0.42018083 Test Accuracy: 0.9272546

Epoch Number: 69
Train Loss: 0.009453040786081134 Train Accuracy: 0.99


Epoch Number: 124
Train Loss: 0.0015932139348516189 Train Accuracy: 0.9997260276585409
Test Loss: 0.4208521 Test Accuracy: 0.9342302

Epoch Number: 125
Train Loss: 0.0016064488660697252 Train Accuracy: 0.9998630138292705
Test Loss: 0.4273708 Test Accuracy: 0.93721974

Epoch Number: 126
Train Loss: 0.0015048502509048438 Train Accuracy: 0.9998630138292705
Test Loss: 0.43023735 Test Accuracy: 0.9337319

Epoch Number: 127
Train Loss: 0.0014419755101050824 Train Accuracy: 0.9998630138292705
Test Loss: 0.4389877 Test Accuracy: 0.9362232

Epoch Number: 128
Train Loss: 0.0013684726919028398 Train Accuracy: 0.9997260276585409
Test Loss: 0.44143116 Test Accuracy: 0.9342302

Epoch Number: 129
Train Loss: 0.0013124902181690943 Train Accuracy: 0.9998630138292705
Test Loss: 0.44684827 Test Accuracy: 0.93721974

Epoch Number: 130
Train Loss: 0.001271455863264249 Train Accuracy: 0.9998630138292705
Test Loss: 0.44386175 Test Accuracy: 0.9362232

Epoch Number: 131
Train Loss: 0.0013829727382247642 Trai

Train Loss: 0.0011997123000495875 Train Accuracy: 0.9998630138292705
Test Loss: 0.41469583 Test Accuracy: 0.935725

Epoch Number: 186
Train Loss: 0.0014707065065397741 Train Accuracy: 0.9998630138292705
Test Loss: 0.41498712 Test Accuracy: 0.94020927

Epoch Number: 187
Train Loss: 0.002836034407166203 Train Accuracy: 0.9993150691463523
Test Loss: 0.45857498 Test Accuracy: 0.9247633

Epoch Number: 188
Train Loss: 0.04335543119080671 Train Accuracy: 0.9872467313727288
Test Loss: 0.4854505 Test Accuracy: 0.91081214

Epoch Number: 189
Train Loss: 0.0711721541576904 Train Accuracy: 0.9769863102534045
Test Loss: 0.38205332 Test Accuracy: 0.92575985

Epoch Number: 190
Train Loss: 0.028694037625515093 Train Accuracy: 0.9897260347457781
Test Loss: 0.3982458 Test Accuracy: 0.9217738

Epoch Number: 191
Train Loss: 0.020338214381298125 Train Accuracy: 0.9945205531708182
Test Loss: 0.39259014 Test Accuracy: 0.9317389

Epoch Number: 192
Train Loss: 0.015414321870058265 Train Accuracy: 0.995753428707


Epoch Number: 251
Train Loss: 0.0003565376773372943 Train Accuracy: 1.0
Test Loss: 0.4135833 Test Accuracy: 0.9352267

Epoch Number: 252
Train Loss: 0.0003480334934253845 Train Accuracy: 1.0
Test Loss: 0.41463742 Test Accuracy: 0.9352267

Epoch Number: 253
Train Loss: 0.0003396494167359316 Train Accuracy: 1.0
Test Loss: 0.41570726 Test Accuracy: 0.9352267

Epoch Number: 254
Train Loss: 0.0003313843925540935 Train Accuracy: 1.0
Test Loss: 0.41679344 Test Accuracy: 0.93472844

Epoch Number: 255
Train Loss: 0.00032324064602718165 Train Accuracy: 1.0
Test Loss: 0.4178963 Test Accuracy: 0.9352267

Epoch Number: 256
Train Loss: 0.00031521800582134485 Train Accuracy: 1.0
Test Loss: 0.4190162 Test Accuracy: 0.93472844

Epoch Number: 257
Train Loss: 0.0003073199977859064 Train Accuracy: 1.0
Test Loss: 0.42015436 Test Accuracy: 0.93472844

Epoch Number: 258
Train Loss: 0.00029954845147535896 Train Accuracy: 1.0
Test Loss: 0.42131123 Test Accuracy: 0.9352267

Epoch Number: 259
Train Loss: 0.0002

# Model Quantization

Byte quantization of the trained FastModels reduces the model size by 4x. If you use piece-wise linear approximations of the non-linearities, such as quantTanh for tanh and quantSigm for sigmoid, prediction can benefit greatly from pure integer arithmetic after model quantization.
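
As an illustration of what a piece-wise linear non-linearity looks like, the sketch below uses simple clipping. These definitions are assumptions for illustration; the exact forms of quantTanh and quantSigm in edgeml may differ.

```python
import numpy as np

def quant_tanh(x):
    """Piece-wise linear stand-in for tanh: identity clipped to [-1, 1]."""
    return np.clip(x, -1.0, 1.0)

def quant_sigm(x):
    """Piece-wise linear stand-in for sigmoid: (x + 1) / 2 clipped to [0, 1]."""
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

x = np.array([-3.0, 0.0, 0.5, 3.0])
print(quant_tanh(x))  # [-1.   0.   0.5  1. ]
print(quant_sigm(x))  # [0.   0.5  0.75 1.  ]
```

Because both functions need only additions, shifts and comparisons, they avoid the exponentials that tanh and sigmoid require.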

In [10]:
#Model quantization
model_dir = currDir #the model dir is printed at the end of training; use that here or use currDir

import quantizeFastModels
quantizeFastModels.quantizeFastModels(model_dir)

Bg.npy has max: 4.9833384 min: -0.6077357
Bh.npy has max: 2.8973198 min: -0.16004847
FC.npy has max: 4.9540076 min: -5.963999
FCbias.npy has max: 2.540496 min: -1.7358814
U.npy has max: 2.2965062 min: -2.670992
W.npy has max: 1.3919494 min: -1.2454427


Quantized Model Dir: usps10\FastGRNNResults/23_51_17_15_03_19\QuantizedFastModel
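
The 4x figure follows from storing each 32-bit float as a single byte. Here is a minimal sketch of one generic scheme, symmetric scaling to int8. It is not the scheme used by `quantizeFastModels`, which is not shown here; the function names and the random matrix are hypothetical.

```python
import numpy as np

def quantize_to_int8(weights):
    """Map floats to int8 using one scale derived from the largest magnitude."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.uniform(-1.25, 1.39, size=(16, 32)).astype(np.float32)  # hypothetical weights

q, scale = quantize_to_int8(W)
print(W.nbytes // q.nbytes)                               # 4
print(np.max(np.abs(dequantize(q, scale) - W)) <= scale)  # True
```

The rounding error per weight is bounded by about half the scale, which is why the value range of each matrix matters for the accuracy of the quantized model.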
