# Deep learning software

CPSC 340: Machine Learning and Data Mining

The University of British Columbia

2017 Winter Term 2

Mike Gelbart

## Some popular packages:

| Name | Host language | Released | Comments |
|------|---------------|----------|----------|
| [Torch](http://torch.ch) | Lua | 2002 | Early library, still used |
| [Theano](http://deeplearning.net/software/theano/) | Python | 2007 | From U. de Montréal |
| [Caffe](http://caffe.berkeleyvision.org) | Executable with Python wrapper | 2014 | Specifically for CNNs, from UC Berkeley |
| [TensorFlow](https://www.tensorflow.org) | Python | 2015 | Created by Google for both prototyping and production |
| [Keras](https://keras.io) | Python | 2015 | A front-end on top of Theano or TensorFlow |
| [PyTorch](http://pytorch.org) | Python | 2017 | Flexible, gaining a lot of popularity |
| [Caffe 2](https://caffe2.ai/) | Python or C++ | 2017 | Facebook, open source |

- See also [Comparison of deep learning software](https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software).
- Lots of new software.
- Lots of Python.
  
Some nice things the software can do for you:

- automatic differentiation
- compile/optimize code, especially for GPU 
- numerically stable implementations
- implementation of various regularizers (like dropout) and solvers (like Adam)
- a common standard for a community of users
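
To make "automatic differentiation" concrete, here is a minimal sketch in plain Python. It uses forward-mode differentiation with dual numbers, which is a simpler relative of the reverse-mode approach (backpropagation) that the packages above rely on; it is not how any of them is implemented. The names `Dual`, `derivative`, and `neuron` are made up for this illustration.

```python
import math


class Dual:
    """A value together with its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def tanh(x):
    """tanh for Dual numbers, using d/dx tanh(x) = 1 - tanh(x)**2."""
    t = math.tanh(x.value)
    return Dual(t, (1.0 - t * t) * x.deriv)


def derivative(f, x):
    """Evaluate f(x) and df/dx at x in a single forward pass."""
    out = f(Dual(x, 1.0))  # seed: dx/dx = 1
    return out.value, out.deriv


def neuron(w):
    # Hypothetical one-weight "neuron" with input 2.0 and bias 0.5.
    return tanh(w * 2.0 + 0.5)


value, grad = derivative(neuron, 0.3)
print(value, grad)
```

The point is that `neuron` is written as ordinary code and the derivative comes out for free; nobody had to derive the gradient by hand.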

## GPUs

- A big part of the deep learning revolution
- GPUs were originally designed for graphics --> _matrix multiplication_
  - This is what we need for neural networks
- The leader is NVIDIA; its GPU programming platform is called CUDA.
  - These days we can usually avoid writing in CUDA
  - NVIDIA's share price 2 years ago: \$35. Now: \$220.
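
To see why matrix multiplication is "what we need", the sketch below writes the forward pass of one fully-connected layer in plain Python. The helper names (`matmul`, `dense_relu`) and the toy numbers are made up for illustration; a real library would hand the multiplication to optimized CPU or GPU routines rather than use Python loops.

```python
def matmul(A, B):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def dense_relu(X, W, b):
    """Forward pass of one fully-connected layer: relu(X @ W + b)."""
    Z = matmul(X, W)
    return [[max(0.0, z + bias) for z, bias in zip(row, b)] for row in Z]


# Hypothetical toy numbers: a batch of 2 examples with 3 features each,
# mapped to 2 hidden units.
X = [[1.0, 2.0, 3.0],
     [0.0, -1.0, 1.0]]
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, -1.0]]
b = [0.5, 0.0]

H = dense_relu(X, W, b)
print(H)  # [[4.5, 0.0], [1.5, 0.0]]
```

Every entry of `X @ W` can be computed independently of the others, which is exactly the kind of work a GPU parallelizes well.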
  
From Jeff Bezos' 2017 letter to shareholders:

> Using our pre-packaged versions of popular deep learning frameworks running on P2 compute instances (optimized for this workload), customers are already developing powerful systems ranging everywhere from early disease detection to increasing crop yields. [...] Watch this space. Much more to come.

- More recently: [P3 instances](https://aws.amazon.com/ec2/instance-types/p3/).


## Cloud computing

- CUDA-capable NVIDIA GPUs are [expensive](https://www.amazon.com/Nvidia-Tesla-GDDR5-Cores-Graphic/dp/B00Q7O7PQA). 
- Cloud computing platforms enable easy (and sometimes free) access. 
  - Big players are AWS EC2, Google Cloud, Microsoft Azure

## Demos

The MNIST data is built into Keras, so we load it from there for convenience. If it is not already present, it is downloaded automatically.

Attribution: the code below is adapted from the [Keras MNIST example](https://github.com/fchollet/keras/blob/master/examples/mnist_mlp.py), which is under the [MIT license](https://github.com/fchollet/keras/blob/master/LICENSE).

In [1]:
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D

# the data, shuffled and split between train and test sets
(X_train, y_train_cat), (X_test, y_test_cat) = mnist.load_data()

img_dim = (28,28) 
img_size = img_dim[0]*img_dim[1]
num_classes = 10

X_train = X_train.astype('float32')
X_test  = X_test.astype('float32')
X_train /= 255
X_test  /= 255
# flatten each 28x28 image into a vector of length img_size
X_train_flat = X_train.reshape(X_train.shape[0], img_size)
X_test_flat  = X_test.reshape(X_test.shape[0], img_size)
X_train = X_train[...,None] # add 4th dimension, needed later for convnets
X_test  = X_test[...,None]

# convert class vectors to binary class matrices
y_train = np_utils.to_categorical(y_train_cat, num_classes)
y_test = np_utils.to_categorical(y_test_cat, num_classes)

print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

Using TensorFlow backend.


60000 train samples
10000 test samples
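
The cell above keeps two versions of the labels: the integer labels (`y_train_cat`), which the scikit-learn demo uses, and the "binary class matrices" (`y_train`), which the Keras demos use. As a rough sketch of what that conversion does, here is a plain-Python version; `one_hot` is a made-up helper and not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 per row."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Hypothetical labels for three digit images.
encoded = one_hot([5, 0, 4], num_classes=10)
for row in encoded:
    print(row)
```

Each row has length `num_classes`, which matches the 10 softmax outputs of the Keras models below.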


## Demo 1: scikit-learn

In [2]:
from sklearn.neural_network import MLPClassifier

In [3]:
%%timeit -n1 -r1

nn = MLPClassifier(hidden_layer_sizes=(100,100), max_iter=10, 
                   batch_size=128)
nn.fit(X_train_flat, y_train_cat)

score = nn.score(X_train_flat, y_train_cat)
print('Train accuracy:', score)

score = nn.score(X_test_flat, y_test_cat)
print('Test accuracy:', score)



Train accuracy: 0.9924
Test accuracy: 0.9745
25.9 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
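
To get a feel for how much work those 26 seconds cover, the arithmetic below counts gradient updates. It assumes that training ran for all 10 epochs without stopping early and that the final, smaller batch of each epoch is also used.

```python
import math

n_train = 60000    # training samples, from the output of the first cell
batch_size = 128   # as passed to MLPClassifier
epochs = 10        # max_iter, assuming no early stopping

# 60000 is not a multiple of 128, so the last batch of each epoch is smaller.
batches_per_epoch = math.ceil(n_train / batch_size)
total_updates = batches_per_epoch * epochs
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size

print(batches_per_epoch, total_updates, last_batch_size)  # 469 4690 96
```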


## Demo 2: Keras

### Fully-connected net

**Model definition**

Here we need to specify the input and output size in advance (unlike sklearn) because the model is first _compiled_.

In [8]:
model = Sequential()
model.add(Dense(100, input_shape=(X_train_flat.shape[1],), activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
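
Because the layer sizes are fixed up front, the number of trainable parameters can be worked out by hand. The sketch below does the arithmetic for the model just defined (784 inputs, two hidden layers of 100 units, 10 outputs); `dense_params` is a made-up helper. The total should agree with what `model.summary()` reports.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully-connected layer."""
    return n_in * n_out + n_out


layer_sizes = [784, 100, 100, 10]  # 28*28 inputs, two hidden layers, 10 classes

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [78500, 10100, 1010] 89610
```

Almost all of the parameters sit in the first layer, because that is where the 784-dimensional input is connected.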

**Training and evaluation**

In [9]:
%%timeit -n1 -r1

history = model.fit(X_train_flat, y_train,
                    batch_size=128, 
                    epochs=10,
                    verbose=0)

score = model.evaluate(X_train_flat, y_train, verbose=0)
print('Train accuracy:', score[1])

score = model.evaluate(X_test_flat, y_test, verbose=0)
print('Test accuracy:', score[1])

Train accuracy: 0.9924
Test accuracy: 0.9764
30.3 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
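
The model above was trained with the `categorical_crossentropy` loss on softmax outputs. As a sketch of what that loss computes for a single example, and of the "numerically stable implementations" point from earlier, here is a plain-Python version. It illustrates the standard max-shift trick and is not a copy of the Keras code; the function names and scores are made up.

```python
import math


def log_softmax(scores):
    """Log of the softmax, shifted by the max score so exp() cannot overflow."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy(scores, one_hot_target):
    """Categorical cross-entropy for one example: -sum(target * log(prob))."""
    log_probs = log_softmax(scores)
    return -sum(t * lp for t, lp in zip(one_hot_target, log_probs))


# Hypothetical raw scores for a 3-class problem; the true class is index 0.
loss = cross_entropy([2.0, 1.0, 0.1], [1.0, 0.0, 0.0])

# With a score of 1000, a naive math.exp(1000.0) would raise OverflowError.
# The shifted version handles it and gives a loss of (essentially) zero.
big_loss = cross_entropy([1000.0, 0.0, 0.0], [1.0, 0.0, 0.0])

print(loss, big_loss)
```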


### Convolutional net

Attribution: the code below is adapted from [Deep Learning with Python](https://machinelearningmastery.com/deep-learning-with-python2/) with permission from the author.

**Model definition**

In [6]:
model = Sequential()
model.add(Convolution2D(32, (5, 5), input_shape=img_dim+(1,), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
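
It can help to trace how the tensor shape changes through the layers just defined. The arithmetic below assumes the Keras defaults for this model: stride-1 convolution with no padding, and a pooling stride equal to the pool size. `conv_out` is a made-up helper.

```python
def conv_out(size, kernel):
    """Output width of a stride-1 convolution with no padding."""
    return size - kernel + 1


height, width, channels = 28, 28, 1
filters, kernel = 32, 5

# Convolution2D(32, (5, 5)): 28x28x1 -> 24x24x32
h, w = conv_out(height, kernel), conv_out(width, kernel)
conv_params = kernel * kernel * channels * filters + filters

# MaxPooling2D(pool_size=(2, 2)): 24x24x32 -> 12x12x32, no parameters
h, w = h // 2, w // 2

# Dropout has no parameters; Flatten turns 12x12x32 into one long vector
flat = h * w * filters

dense1_params = flat * 128 + 128
dense2_params = 128 * 10 + 10
total = conv_params + dense1_params + dense2_params

print(flat, conv_params, dense1_params, dense2_params, total)
# 4608 832 589952 1290 592074
```

The convolutional layer itself has very few parameters (832); nearly all of them are in the dense layer that follows the flatten step.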

**Training and evaluation**

In [7]:
%%timeit -n1 -r1

history = model.fit(X_train, y_train,
                    batch_size=128, 
                    epochs=1,
                    verbose=0)

score = model.evaluate(X_train, y_train, verbose=0)
print('Train accuracy:', score[1])

score = model.evaluate(X_test, y_test, verbose=0)
print('Test accuracy:', score[1])

Train accuracy: 0.9781
Test accuracy: 0.9774
1min 4s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


## Demo 3: Training in the cloud

Instructions:

- Go to https://colab.research.google.com/
- Make an account
- Upload notebook or create a new one
- Runtime --> change runtime type
- Select GPU

Typical speedups are around **10x**. This translates into more prototyping, more optimization, and better models.