<div style="text-align: right">INFO 6105 Data Science Eng Methods and Tools, Lecture 9</div>
<div style="text-align: right">Dino Konstantopoulos, 27 March 2019</div>

# Lab: Machine Learning (ML) with Keras

In this notebook, we reuse the professor's trained weights to recognize images downloaded from the web. We start with the image of a bird that we selected in class.

Steps:
- Create a folder `saved_models` in your `C:/Users/<username>` folder. In that folder, place the professor's pretrained neural network weights, file `keras_cifar10_trained_model.h5` in [H5 format](https://en.wikipedia.org/wiki/Hierarchical_Data_Format), which is up on Blackboard
- Download the images used for training, file `cifar-10-batches-py.tar.gz` on Blackboard, into your `C:/Users/<username>/data` folder
- Download the picture of a bird we picked together in class, `yellowbird.jpg`, and place it in folder `C:/Users/<username>/data`
- Run this notebook, which should run fairly quickly
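Before running the cells, it can help to confirm that the files from the steps above are where the notebook expects them. The following is a minimal sketch, assuming the notebook is started from your `C:/Users/<username>` folder so that relative paths resolve there; the helper name `missing_files` is hypothetical, and the image name is the one the prediction cell reads.

```python
import os

# Relative paths taken from the steps above and from the cells that follow.
# Assumption: the notebook's working directory is your user folder.
EXPECTED_FILES = [
    os.path.join("saved_models", "keras_cifar10_trained_model.h5"),
    os.path.join("data", "cifar-10-batches-py.tar.gz"),
    os.path.join("data", "yellowbird.jpg"),
]

def missing_files(base_dir="."):
    """Return the expected files that are not present under base_dir."""
    return [p for p in EXPECTED_FILES
            if not os.path.isfile(os.path.join(base_dir, p))]

# Report anything that still needs to be downloaded or moved.
for path in missing_files():
    print("Missing:", path)
```

If nothing is printed, all three files were found.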

In [1]:
from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import os

Using Theano backend.


In [2]:
batch_size = 32
num_classes = 10
epochs = 100
data_augmentation = True
num_predictions = 20
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_cifar10_trained_model.h5'

In [3]:
# The data, split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples


In [4]:
# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
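To see what "binary class matrices" means, here is a plain-Python sketch of one-hot encoding. It is an illustration only, not the Keras implementation: the helper `one_hot` is hypothetical and takes a flat list of integer labels, whereas `to_categorical` works on NumPy arrays.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 at the label index."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        rows.append(row)
    return rows

# Two samples with labels 2 and 0, out of 4 possible classes.
print(one_hot([2, 0], 4))
# [[0.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
```

With `num_classes = 10`, each CIFAR-10 label becomes a row of ten values with a single 1.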

In [5]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

In [6]:
from keras.models import load_model
#model.load_weights('saved_models/keras_cifar10_trained_model.h5')
model = load_model('saved_models/keras_cifar10_trained_model.h5')

This should take about a minute:

In [7]:
# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

Test loss: 0.7208882711410523
Test accuracy: 0.7705
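A test accuracy of 0.7705 on 10000 test samples corresponds to 7705 correctly classified images. The sketch below shows, in plain Python, how such a number can be computed from predicted probabilities and one-hot labels. It assumes accuracy means "the highest-probability class equals the true class"; the helper names `argmax` and `accuracy` are hypothetical and the data are made up for illustration.

```python
def argmax(values):
    """Index of the largest value (the first one in case of ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(probabilities, one_hot_labels):
    """Fraction of samples whose most probable class matches the true class."""
    correct = sum(argmax(p) == argmax(y)
                  for p, y in zip(probabilities, one_hot_labels))
    return correct / len(one_hot_labels)

# Illustrative values only: three samples, three classes.
probs = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
truth = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
print(accuracy(probs, truth))  # two of the three samples are classified correctly
```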


In [8]:
def setup_load_cifar(verbose=False):
  import os,shutil
  from keras.datasets import cifar10
  from keras.utils import to_categorical
  
  #datadir = os.path.expanduser("~") + "/.keras/datasets/"
  datadir = "data"
  # The archive name Keras looks for; only used by the commented-out copy step below.
  datafile = os.path.join(datadir, "cifar-10-batches-py.tar.gz")
  
  #if not os.path.isfile(datafile):
  #  os.makedirs(datadir)
  #  shutil.copyfile("cifar-10-python.tar.gz", datafile)  # adjust the source path to your download location
  
  # The data, shuffled and split between train and test sets:
  (x_train, y_train), (x_test, y_test) = cifar10.load_data()
  if verbose:
    print("x_train shape: {}, {} train samples, {} test samples.\n".format(
      x_train.shape, x_train.shape[0], x_test.shape[0]))
  
  # Convert class vectors to binary class matrices.
  y_train = to_categorical(y_train, num_classes)
  y_test = to_categorical(y_test, num_classes)
  
  x_train = x_train.astype("float32")
  x_test = x_test.astype("float32")
  x_train /= 255.0
  x_test /= 255.0
  
  # Load label names to use in prediction results
  label_list_path = "datasets/cifar-10-batches-py/batches.meta"
  
  keras_dir = os.path.expanduser(os.path.join("~", ".keras"))
  datadir_base = os.path.expanduser(keras_dir)
  if not os.access(datadir_base, os.W_OK):
    datadir_base = os.path.join("/tmp", ".keras")
  label_list_path = os.path.join(datadir_base, label_list_path)
  
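  # `pickle` is resolved when the function is called: the next cell runs
  # `import dill as pickle` before calling setup_load_cifar().
  with open(label_list_path, mode="rb") as f:
    labels = pickle.load(f)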
  
  return x_train, y_train, x_test, y_test, labels

In [9]:
#!conda install dill
import numpy as np
import dill as pickle  # setup_load_cifar() uses this name to read batches.meta
_,_,_,_,labels = setup_load_cifar()

In [10]:
import tensorflow
import keras.backend.tensorflow_backend as K
# The TensorFlow session is used below only to decode and resize the image.
sess = K.get_session()

Predicting should only take a few seconds!

In [11]:
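# TensorFlow 1.x API; in TensorFlow 2.x these calls are tf.io.read_file and tf.image.resize.
img = tensorflow.read_file('data/yellowbird.jpg')    # raw file bytes
img = tensorflow.image.decode_jpeg(img, channels=3)  # decode to an RGB tensor
img.set_shape([None, None, 3])
img = tensorflow.image.resize_images(img, (32, 32))  # CIFAR-10 input size
img = img.eval(session=sess) # convert to numpy array
img = np.expand_dims(img, 0) # make 'batch' of 1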

pred = model.predict(img)
pred = labels["label_names"][np.argmax(pred)]
pred

'bird'
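Note that the training and test images were divided by 255 before being fed to the model, while the prediction cell above passes the resized image without that scaling. The sketch below shows one way to prepare an image consistently and to map the prediction back to a class name. It is a minimal illustration with stand-in data (no model or image file is needed to run it); the helper names `to_model_batch` and `label_of` are hypothetical, and whether scaling changes the result for the bird image has not been checked here.

```python
import numpy as np

# Standard CIFAR-10 class names, in label-index order.
CIFAR10_LABELS = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "frog", "horse", "ship", "truck"]

def to_model_batch(image):
    """Scale a (32, 32, 3) image with 0-255 values to [0, 1] and add a batch axis."""
    scaled = np.asarray(image, dtype="float32") / 255.0
    return np.expand_dims(scaled, 0)  # shape (1, 32, 32, 3)

def label_of(probabilities, names=CIFAR10_LABELS):
    """Return the class name with the highest predicted probability."""
    return names[int(np.argmax(probabilities))]

# Stand-in data so the sketch runs on its own: an all-white image and a
# made-up prediction vector that favors class index 2.
fake_image = np.full((32, 32, 3), 255, dtype="uint8")
batch = to_model_batch(fake_image)
fake_pred = np.zeros((1, 10))
fake_pred[0, 2] = 0.9
print(batch.shape, label_of(fake_pred))
```

In the notebook, `model.predict(to_model_batch(resized_image))` would take the place of `fake_pred`, where `resized_image` stands for the 32x32 array obtained before the batch axis is added.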

Now download images of cats, dogs, horses, frogs, deer, birds, ships, airplanes, automobiles, or trucks, and test them against the model. Keep score :-)
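One simple way to keep score is to record each image's expected and predicted label and count the matches. The sketch below uses made-up outcomes for illustration; the helper name `score` is hypothetical.

```python
def score(results):
    """results: list of (expected_label, predicted_label) pairs."""
    correct = sum(1 for expected, predicted in results if expected == predicted)
    return correct, len(results)

# Hypothetical outcomes, for illustration only.
results = [("cat", "cat"), ("dog", "cat"), ("ship", "ship")]
correct, total = score(results)
print("{} of {} correct".format(correct, total))
```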