# Image classification using pre-trained image model VGG16

State-of-the-art deep learning image classifiers (pre-trained image models) are fully integrated into the Keras core.
At the time of writing, Keras ships with five Convolutional Neural Networks that have been pre-trained on the ImageNet dataset:
* VGG16 (Visual Geometry Group, University of Oxford)
* VGG19
* ResNet50
* Inception V3 (by Google)
* Xception

Google Goggles marked the beginning of visual search technology.
With this image recognition app, users can take a photo of a physical object, and Google will try to find information about what is pictured.

Take a photo of a landmark and Google Goggles can give you its history.
Snap a pic of a foreign menu, and it can be translated.
The app can recognise and generate information on books, CDs, and virtually anything that is 2D.

Business value:
* another avenue to generate search data
* recommend users to advertisers and retailers

![](img/vgg16_croped.png)

The University of Oxford's Visual Geometry Group developed VGG16 and released its trained weights [(details here)](https://github.com/fchollet/deep-learning-models/releases).

Download the TensorFlow h5 file [vgg16_weights_tf_dim_ordering_tf_kernels.h5](https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5), and save it in the same directory as this notebook.

Note that this file is a little over half a gigabyte, so it will take a while to download.


In [73]:
# check that the above weight file is in the same directory as this notebook
# (weight_file is needed later by the manual VGG_16 definition in Option 1)
import os

weight_file = 'vgg16_weights_tf_dim_ordering_tf_kernels.h5'

if not os.path.exists(weight_file):
    raise FileNotFoundError("No file {} found. Check path again".format(weight_file))

In [4]:
# # Download labels for VGG16
# !curl https://raw.githubusercontent.com/torch/tutorials/master/7_imagenet_classification/synset_words.txt -o synset_words.txt



## A Convolutional Neural Network (CNN) Architecture

In [1]:
from keras import backend as K
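# Use channels-first ('th') image ordering; which call works depends on the installed Keras version
# K.set_image_dim_ordering('th')          # alternative (A): works on some Keras versions
K.common.set_image_dim_ordering('th')     # alternative (B): used here
# K.set_image_data_format('channels_first')   # alternative (C): the equivalent, non-deprecated call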

from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD
import numpy as np
import pandas as pd
import PIL.Image   # import the submodule explicitly so that PIL.Image is always available

Using TensorFlow backend.


In [2]:
def convert_image_to_bgr_numpy_array(image_path, size=(224,224)):
    """The network was trained on BGR images loaded with OpenCV
    (i.e. channel order blue, green, red rather than red, green, blue).
    For why OpenCV uses BGR, see https://stackoverflow.com/questions/14556545/why-opencv-using-bgr-colour-space-instead-of-rgb

    We can use a simpler image library as long as we manually convert
    the data to the expected format.
    """
    # force 3 channels so that grayscale or RGBA files also work
    image = PIL.Image.open(image_path).convert('RGB').resize(size)
    # converting the PIL image directly yields an array of shape (height, width, 3)
    img_data = np.array(image, dtype=np.float32)
    # swap R and B channels
    img_data = np.flip(img_data, axis=2)
    return img_data

def prepare_image(image_path):
    im = convert_image_to_bgr_numpy_array(image_path)

    # mean-center each channel using the known per-channel means of the training data (B, G, R order)
    im[:,:,0] -= 103.939
    im[:,:,1] -= 116.779
    im[:,:,2] -= 123.68

    im = im.transpose((2,0,1))        # adjust from (224, 224, 3) to (3, 224, 224) for keras
    im = np.expand_dims(im, axis=0)   # adjust to (1, 3, 224, 224) for generating keras prediction
    return im
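
To see what these steps do to the array without needing an image file, here is a minimal sketch that applies the same operations to a synthetic RGB array. The names `preprocess_rgb_array` and `BGR_MEANS` are made up for this illustration; the mean values are the ones used in `prepare_image`.

```python
import numpy as np

# Per-channel means in B, G, R order (same values as in prepare_image)
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocess_rgb_array(rgb):
    """Hypothetical helper: (H, W, 3) RGB array -> (1, 3, H, W) mean-centered BGR batch."""
    bgr = rgb.astype(np.float32)[:, :, ::-1]   # reverse the channel axis: RGB -> BGR
    bgr = bgr - BGR_MEANS                      # broadcast subtraction over the last axis
    chw = bgr.transpose((2, 0, 1))             # channels-last -> channels-first
    return np.expand_dims(chw, axis=0)         # add the batch dimension

# a synthetic 224x224 image that is pure red (R=255, G=0, B=0)
red = np.zeros((224, 224, 3), dtype=np.uint8)
red[:, :, 0] = 255
batch = preprocess_rgb_array(red)
print(batch.shape)        # (1, 3, 224, 224)
print(batch[0, :, 0, 0])  # B, G, R values of the top-left pixel after centering
```

After the channel swap, the red value ends up in the last channel, so only that channel is positive once the means are subtracted.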

In [3]:
# Load labels for VGG16
# synset = pd.read_csv('synset_words.txt', skipinitialspace=True, names = ['synset', 'words'])   # simplified classes/labels
synset = pd.read_csv('synset_words.csv', skipinitialspace=True, names = ['synset'])   # full classification classes/labels
synset

| | synset |
|---|---|
| 0 | n01440764 tench, Tinca tinca |
| 1 | n01443537 goldfish, Carassius auratus |
| 2 | n01484850 great white shark, white shark, man-... |
| 3 | n01491361 tiger shark, Galeocerdo cuvieri |
| 4 | n01494475 hammerhead, hammerhead shark |
| 5 | n01496331 electric ray, crampfish, numbfish, t... |
| 6 | n01498041 stingray |
| 7 | n01514668 cock |
| 8 | n01514859 hen |
| 9 | n01518878 ostrich, Struthio camelus |


### Option 1: Manual method to define the VGG16 architecture

The VGG16 model was trained on a large dataset from ImageNet: ~1.2 million training images, with another 50,000 images for validation and 100,000 images for testing. Training it took a huge amount of GPU time and data, and the resulting model can classify an input image into 1,000 separate object categories.

Here are [more examples of Keras transfer learning](https://keras.io/applications/) with modern pre-trained CNNs.

In [75]:
# This network is characterized by its simplicity and is defined manually below:
# - only 3×3 convolutional layers, stacked on top of each other in increasing depth
# - max pooling handles the reduction in volume size
# - two fully-connected layers with 4,096 nodes each, followed by a softmax classifier

def VGG_16(weights_path=None):
    model = Sequential()
    model.add(ZeroPadding2D((1,1),input_shape=(3,224,224)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(256, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(256, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(256, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(512, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(512, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(512, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(512, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(512, (3, 3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Conv2D(512, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))

    if weights_path:
        model.load_weights(weights_path)

    return model

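# define and compile the model; weight_file is the path set in the weight-check cell above
model = VGG_16(weight_file)   # note that we don't actually train/adjust the weights at all here
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
# compiling only matters for training or evaluation; predict() never uses the optimizer
model.compile(optimizer=sgd, loss='categorical_crossentropy')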
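
As a sanity check on the architecture above, the sketch below counts its weights and biases using plain arithmetic, and tracks how the five max-pooling steps shrink the 224×224 input. The names `BLOCKS` and `vgg16_parameter_count` are made up for this illustration, and the calculation assumes what the definition above does: 3×3 kernels with one pixel of zero padding (so convolutions keep the spatial size) and 2×2 pooling with stride 2.

```python
# (number of 3x3 conv layers, output channels) for each of the five blocks
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def vgg16_parameter_count(input_channels=3, input_size=224, num_classes=1000):
    """Hypothetical helper: count weights and biases layer by layer."""
    total = 0
    channels, size = input_channels, input_size
    for n_convs, filters in BLOCKS:
        for _ in range(n_convs):
            total += 3 * 3 * channels * filters + filters   # kernel weights + biases
            channels = filters
        size //= 2            # each 2x2 max pool with stride 2 halves height and width
    features = channels * size * size    # length of the flattened vector
    for units in (4096, 4096, num_classes):
        total += features * units + units   # dense weights + biases
        features = units
    return total, size

total, final_size = vgg16_parameter_count()
print(final_size)   # 7
print(total)        # 138357544
```

Most of the roughly 138 million parameters sit in the first fully-connected layer, which is one reason the weight file is so large.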

In [79]:
# make predictions using compiled VGG16 model
img = prepare_image('img/dog.jpg')
out = model.predict(img)
y_pred = np.argmax(out)
print(y_pred)
print(synset.loc[y_pred].synset)

259
n02112018 Pomeranian


In [80]:
out.shape

(1, 1000)

### Option 2: Loading VGG16 using keras utilities (recommended)

In [4]:
# highly recommended method, instead of manually defining architectures and loading weights
from keras.applications.vgg16 import VGG16
from keras.applications.imagenet_utils import decode_predictions

model = VGG16()   # weights='imagenet' is the default, so this loads the pre-trained weights (downloaded on first use)
# model = VGG16(weights='imagenet')   # equivalent, explicit form



In [5]:
# make predictions using loaded VGG16 model
img = prepare_image('img/dog.jpg')
out = model.predict(img)
y_pred = np.argmax(out)
print(y_pred)
print(synset.loc[y_pred].synset)
print('Predicted:', decode_predictions(out))

259
n02112018 Pomeranian
Predicted: [[('n02112018', 'Pomeranian', 0.5479653), ('n02113023', 'Pembroke', 0.11714048), ('n02115641', 'dingo', 0.07175173), ('n02085620', 'Chihuahua', 0.033746526), ('n02104365', 'schipperke', 0.030352084)]]
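
The output of `model.predict` is one row of 1,000 probabilities per image; `np.argmax` picks the single most likely class, while `decode_predictions` reports the top five. A minimal sketch of that top-k step, using a toy 4-class vector in place of the real output (the helper name `top_k` and the toy labels are assumptions for illustration, not the Keras implementation):

```python
import numpy as np

def top_k(probabilities, labels, k=5):
    """Hypothetical helper: return the k most probable (label, probability) pairs."""
    probabilities = np.asarray(probabilities).ravel()
    order = np.argsort(probabilities)[::-1][:k]   # indices sorted by descending probability
    return [(labels[i], float(probabilities[i])) for i in order]

# toy stand-in for one row of model.predict output (4 classes instead of 1,000)
toy_out = np.array([[0.05, 0.70, 0.05, 0.20]])
toy_labels = ['tench', 'goldfish', 'hen', 'ostrich']
print(int(np.argmax(toy_out)))          # 1
print(top_k(toy_out, toy_labels, k=2))  # [('goldfish', 0.7), ('ostrich', 0.2)]
```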


In [13]:
img = prepare_image('img/dog_2.jpg')
out = model.predict(img)
y_pred = np.argmax(out)
print(y_pred)
print(synset.loc[y_pred].synset)
print('Predicted:', decode_predictions(out))

235
n02106662 German shepherd, German shepherd dog, German police dog, alsatian
Predicted: [[('n02106662', 'German_shepherd', 0.9972421), ('n02105162', 'malinois', 0.0015698465), ('n03803284', 'muzzle', 0.00032143845), ('n04254680', 'soccer_ball', 0.00016304254), ('n02105412', 'kelpie', 0.00015440548)]]


In [14]:
img = prepare_image('img/test.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n04350905 suit, suit of clothes
Predicted: [[('n04350905', 'suit', 0.7036784), ('n04591157', 'Windsor_tie', 0.16369325), ('n03838899', 'oboe', 0.026618008), ('n10148035', 'groom', 0.013540737), ('n02883205', 'bow_tie', 0.011791961)]]


In [15]:
img = prepare_image('img/sloth.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n07930864 cup
Predicted: [[('n07930864', 'cup', 0.6994144), ('n03063599', 'coffee_mug', 0.18904433), ('n04131690', 'saltshaker', 0.020779125), ('n03063689', 'coffeepot', 0.011247833), ('n04423845', 'thimble', 0.0071213855)]]


In [93]:
img = prepare_image('img/sloth2.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n02457408 three-toed sloth, ai, Bradypus tridactylus
Predicted: [[('n02457408', 'three-toed_sloth', 0.98550665), ('n02483362', 'gibbon', 0.0010795437), ('n01622779', 'great_grey_owl', 0.0007896442), ('n02493509', 'titi', 0.0006946715), ('n02500267', 'indri', 0.00065417995)]]


In [94]:
img = prepare_image('img/sloth3.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n02457408 three-toed sloth, ai, Bradypus tridactylus
Predicted: [[('n02457408', 'three-toed_sloth', 0.9983479), ('n02493509', 'titi', 0.00095363497), ('n02483362', 'gibbon', 0.00019791185), ('n02138441', 'meerkat', 0.00017710007), ('n02490219', 'marmoset', 8.666833e-05)]]


In [214]:
img = prepare_image('img/Dog_3.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n02098286 West Highland white terrier
Predicted: [[('n02098286', 'West_Highland_white_terrier', 0.6178925), ('n02085936', 'Maltese_dog', 0.12265609), ('n02094114', 'Norfolk_terrier', 0.07862637), ('n02096177', 'cairn', 0.03849569), ('n02094433', 'Yorkshire_terrier', 0.032624725)]]


In [209]:
img = prepare_image('img/Dog_4.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n02085936 Maltese dog, Maltese terrier, Maltese
Predicted: [[('n02085936', 'Maltese_dog', 0.9560953), ('n02098413', 'Lhasa', 0.018933775), ('n02086240', 'Shih-Tzu', 0.009139382), ('n02086079', 'Pekinese', 0.0061969035), ('n02113624', 'toy_poodle', 0.003968405)]]


In [26]:
img = prepare_image('img/strawberry.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n07745940', 'strawberry', 0.99977857), ('n04332243', 'strainer', 3.446263e-05), ('n07747607', 'orange', 2.406561e-05), ('n07753592', 'banana', 2.256647e-05), ('n07768694', 'pomegranate', 2.1113809e-05)]]


In [27]:
img = prepare_image('img/icecream.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n07614500', 'ice_cream', 0.20130746), ('n03476684', 'hair_slide', 0.057905596), ('n07579787', 'plate', 0.057216298), ('n07745940', 'strawberry', 0.054555055), ('n07714571', 'head_cabbage', 0.050523743)]]


In [28]:
img = prepare_image('img/icecream2.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n07614500', 'ice_cream', 0.87625885), ('n07613480', 'trifle', 0.11704696), ('n07836838', 'chocolate_sauce', 0.004256764), ('n07745940', 'strawberry', 0.0008514527), ('n07579787', 'plate', 0.0003462588)]]


In [35]:
img = prepare_image('img/schnauzer.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n02097047', 'miniature_schnauzer', 0.7287519), ('n02097209', 'standard_schnauzer', 0.24815588), ('n02097130', 'giant_schnauzer', 0.020994069), ('n02093991', 'Irish_terrier', 0.0005178273), ('n02096051', 'Airedale', 0.0003147888)]]


In [36]:
img = prepare_image('img/fan_sil1.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n02085936', 'Maltese_dog', 0.61389536), ('n02086240', 'Shih-Tzu', 0.10318202), ('n02098413', 'Lhasa', 0.09179912), ('n02086079', 'Pekinese', 0.069826424), ('n02113624', 'toy_poodle', 0.017381951)]]


In [126]:
img = prepare_image('img/fan_sil2.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n04162706', 'seat_belt', 0.7091236), ('n02085936', 'Maltese_dog', 0.21847743), ('n02098286', 'West_Highland_white_terrier', 0.040213455), ('n02098413', 'Lhasa', 0.009687599), ('n02094114', 'Norfolk_terrier', 0.002983844)]]


In [38]:
img = prepare_image('img/fan_sil3.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

Predicted: [[('n02085936', 'Maltese_dog', 0.8535949), ('n02098413', 'Lhasa', 0.064508125), ('n02098286', 'West_Highland_white_terrier', 0.04744286), ('n02086240', 'Shih-Tzu', 0.0124242455), ('n04162706', 'seat_belt', 0.0057937186)]]


In [None]:
img = prepare_image('img/ddog_basset_hound.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/ddog_beagle.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/ddog_english_foxhound.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/ddog_walker_hound.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/poodle_miniature.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/poodle_standard.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [185]:
img = prepare_image('img/poodle_toy.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/cat_egyptian_cat.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/cat_persian.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/cat_siamese_cat.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [173]:
img = prepare_image('img/cat_tabby_cat_mackerel.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [174]:
img = prepare_image('img/cat_tiger_cat.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/catt_cheetah.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/catt_cougar.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/catt_jaguar.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/catt_leopard.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/catt_linx_cat.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [184]:
img = prepare_image('img/catt_snow_leopard.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/guinea_pig_abyssinian.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/guinea_pig_american1.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/guinea_pig_himalayan.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/guinea_pig_silkie-hazelnut.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/guinea_pig_skinny.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_black_grouse.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_bulbul.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_chickadee.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_coucal.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_goldfinch.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_hornbill.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_indigo_bunting.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_jay.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_magpie.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_partridge.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_ptarmigan.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [None]:
img = prepare_image('img/bird_quail.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

In [16]:
img = prepare_image('img/street_sign1.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n06794110 street sign
Predicted: [[('n06794110', 'street_sign', 0.97328454), ('n06874185', 'traffic_light', 0.019324293), ('n03710193', 'mailbox', 0.0017053946), ('n02843684', 'birdhouse', 0.0009099932), ('n03976657', 'pole', 0.0007574964)]]


In [20]:
img = prepare_image('img/street_sign2.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n06794110 street sign
Predicted: [[('n06794110', 'street_sign', 0.9851708), ('n04149813', 'scoreboard', 0.00654738), ('n06874185', 'traffic_light', 0.0049419673), ('n03976657', 'pole', 0.0006728825), ('n03710193', 'mailbox', 0.0003602859)]]


In [19]:
img = prepare_image('img/traffic_light1.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n06874185 traffic light, traffic signal, stoplight
Predicted: [[('n06874185', 'traffic_light', 0.94168574), ('n03535780', 'horizontal_bar', 0.017492842), ('n06794110', 'street_sign', 0.016808076), ('n03976657', 'pole', 0.0152156055), ('n04146614', 'school_bus', 0.002145557)]]


In [18]:
img = prepare_image('img/traffic_light2.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n06874185 traffic light, traffic signal, stoplight
Predicted: [[('n06874185', 'traffic_light', 0.99999785), ('n06794110', 'street_sign', 6.8075326e-07), ('n03691459', 'loudspeaker', 5.467894e-07), ('n04146614', 'school_bus', 2.1650672e-07), ('n03891332', 'parking_meter', 1.2415423e-07)]]


In [21]:
img = prepare_image('img/truck_fire.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n03345487 fire engine, fire truck
Predicted: [[('n03345487', 'fire_engine', 0.99140996), ('n04461696', 'tow_truck', 0.004776698), ('n04467665', 'trailer_truck', 0.001807259), ('n04065272', 'recreational_vehicle', 0.0005020286), ('n03776460', 'mobile_home', 0.00048446533)]]


In [22]:
img = prepare_image('img/truck_garbage.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n03417042 garbage truck, dustcart
Predicted: [[('n03417042', 'garbage_truck', 0.9956779), ('n04467665', 'trailer_truck', 0.0015499599), ('n04461696', 'tow_truck', 0.0013962251), ('n03796401', 'moving_van', 0.00066518283), ('n03126707', 'crane', 0.00024041657)]]


In [23]:
img = prepare_image('img/truck_pickup.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n03930630 pickup, pickup truck
Predicted: [[('n03930630', 'pickup', 0.97242695), ('n04461696', 'tow_truck', 0.009773601), ('n03670208', 'limousine', 0.0054782643), ('n03100240', 'convertible', 0.004803099), ('n02814533', 'beach_wagon', 0.0025183233)]]


In [24]:
img = prepare_image('img/truck_tow.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n04461696 tow truck, tow car, wrecker
Predicted: [[('n04461696', 'tow_truck', 0.763738), ('n03126707', 'crane', 0.1105743), ('n04252225', 'snowplow', 0.07156029), ('n03417042', 'garbage_truck', 0.021777483), ('n03384352', 'forklift', 0.018552471)]]


In [27]:
img = prepare_image('img/truck_trailer.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n04467665 trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi
Predicted: [[('n04467665', 'trailer_truck', 0.7626647), ('n03417042', 'garbage_truck', 0.16754861), ('n03796401', 'moving_van', 0.046888623), ('n03126707', 'crane', 0.011043401), ('n04465501', 'tractor', 0.0034897025)]]


In [15]:
img = prepare_image('img/screenshot_10.jpg')
out = model.predict(img)
print(synset.loc[np.argmax(out)].synset)
print('Predicted:', decode_predictions(out))

n02009912 American egret, great white heron, Egretta albus
Predicted: [[('n02009912', 'American_egret', 0.6355606), ('n02006656', 'spoonbill', 0.33667693), ('n02009229', 'little_blue_heron', 0.013474912), ('n02012849', 'crane', 0.007456017), ('n02002556', 'white_stork', 0.0039981715)]]
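
The prediction cells above all repeat the same prepare, predict and decode steps for a different file. One way to avoid the repetition is a small loop; the sketch below is only an illustration, with a made-up helper name `classify_images`, and it takes the three steps as arguments so that it can be demonstrated here with stand-in functions instead of a real model.

```python
def classify_images(image_paths, prepare, predict, decode):
    """Hypothetical helper: run the prepare -> predict -> decode steps for each path.

    prepare, predict and decode are passed in so the loop does not depend on
    a particular model; in this notebook they would be prepare_image,
    model.predict and decode_predictions.
    """
    results = {}
    for path in image_paths:
        try:
            batch = prepare(path)
        except OSError as err:          # missing or unreadable image file
            results[path] = err
            continue
        results[path] = decode(predict(batch))
    return results

# stand-in functions so the sketch runs without a model or image files
def fake_prepare(path):
    if path.endswith('missing.jpg'):
        raise FileNotFoundError(path)
    return path.upper()

demo = classify_images(
    ['img/dog.jpg', 'img/missing.jpg'],
    prepare=fake_prepare,
    predict=lambda batch: len(batch),
    decode=lambda out: 'score={}'.format(out),
)
print(demo['img/dog.jpg'])   # score=11
```

With the loaded model, the call would look like `classify_images(paths, prepare_image, model.predict, decode_predictions)`, where `paths` is a list of image files.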


### Transfer Learning

It turns out that the lower-level features learned by VGG16 on ImageNet are still applicable to other problems with natural images. If we can preserve those lower-level features, we can just train a new model on top of them. (In fact, with a softmax output, we can think of this as training a new multinomial logistic regression on the convolutional features.)
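
To make the logistic-regression view concrete, here is a minimal NumPy sketch that trains a softmax classifier on fixed feature vectors. The features are synthetic stand-ins for convolutional features (three noisy clusters in 8 dimensions), and the function names, learning rate and epoch count are assumptions chosen for the illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax_regression(features, labels, n_classes, lr=0.1, epochs=200):
    """Fit weights W and bias b by gradient descent on the mean cross-entropy loss."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    one_hot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        probs = softmax(features @ W + b)
        grad = (probs - one_hot) / n        # gradient of the loss w.r.t. the logits
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# synthetic "features": class c is centred on 4 * (unit vector c), plus unit Gaussian noise
rng = np.random.default_rng(0)
centers = 4.0 * np.eye(3, 8)
labels = np.repeat(np.arange(3), 50)
features = centers[labels] + rng.normal(size=(150, 8))

W, b = train_softmax_regression(features, labels, n_classes=3)
accuracy = (softmax(features @ W + b).argmax(axis=1) == labels).mean()
print(accuracy)
```

The features themselves are never modified during training; only `W` and `b` change, which mirrors training a new output layer on top of fixed convolutional features.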

Let's just snip off the last layer.

A Caveat

If we just add a new layer with freshly initialized weights, it is going to be very wrong on the first iteration. Because it is so wrong, the gradient will be huge, and because we are using backpropagation those errors will be propagated back into the lower-level features. This can quickly destroy the rest of the network.

To retrain this model we must protect the lower-level features until our new layers have become more stable. We can do this by freezing those layers.
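
Conceptually, freezing just means that the optimizer skips the frozen weights when it applies an update. The toy sketch below illustrates this with plain NumPy arrays; `sgd_step` and the parameter names are made up for the illustration and are not how Keras implements it (in Keras, the flag is `layer.trainable`).

```python
import numpy as np

def sgd_step(params, grads, trainable, lr=0.1):
    """Hypothetical helper: update only the parameters flagged as trainable."""
    for name in params:
        if trainable[name]:
            params[name] = params[name] - lr * grads[name]
    return params

params = {'pretrained_conv': np.ones((2, 2)), 'new_dense': np.zeros((2, 3))}
trainable = {'pretrained_conv': False, 'new_dense': True}
# a deliberately huge gradient, like the one a freshly initialized layer produces at first
grads = {'pretrained_conv': np.full((2, 2), 1e3), 'new_dense': np.full((2, 3), 1e3)}

params = sgd_step(params, grads, trainable)
print(params['pretrained_conv'])   # still all ones: the frozen layer is untouched
print(params['new_dense'][0, 0])   # -100.0
```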

Then we'll add our new layer.

In [59]:
from keras.models import Model

base_model = VGG16(weights='imagenet', include_top=False, input_shape=(3,224,224))
# Freeze convolutional layers
for layer in base_model.layers:
    layer.trainable = False

# include_top=False excludes VGG16's original dense layers, shown here for reference:
#     model.add(Flatten())
#     model.add(Dense(4096, activation='relu'))
#     model.add(Dropout(0.5))
#     model.add(Dense(4096, activation='relu'))
#     model.add(Dropout(0.5))
#     model.add(Dense(1000, activation='softmax'))
# in their place we add the dense layers below, so that we can train them ourselves

x = base_model.output
x = Flatten()(x) # flatten from convolution tensor output
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(3, activation='softmax')(x) # should match # of classes predicted

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

In [60]:
base_model.layers

[<keras.engine.input_layer.InputLayer at 0x2b38182f710>,
 <keras.layers.convolutional.Conv2D at 0x2b38182f780>,
 <keras.layers.convolutional.Conv2D at 0x2b38182f908>,
 <keras.layers.pooling.MaxPooling2D at 0x2b380819518>,
 <keras.layers.convolutional.Conv2D at 0x2b380819668>,
 <keras.layers.convolutional.Conv2D at 0x2b380505a20>,
 <keras.layers.pooling.MaxPooling2D at 0x2b3805eb898>,
 <keras.layers.convolutional.Conv2D at 0x2b3805eb9e8>,
 <keras.layers.convolutional.Conv2D at 0x2b38077ed68>,
 <keras.layers.convolutional.Conv2D at 0x2b38066cba8>,
 <keras.layers.pooling.MaxPooling2D at 0x2b3805ad9e8>,
 <keras.layers.convolutional.Conv2D at 0x2b3805adb38>,
 <keras.layers.convolutional.Conv2D at 0x2b38062dd30>,
 <keras.layers.convolutional.Conv2D at 0x2b381812c88>,
 <keras.layers.pooling.MaxPooling2D at 0x2b381887b38>,
 <keras.layers.convolutional.Conv2D at 0x2b381887c88>,
 <keras.layers.convolutional.Conv2D at 0x2b3806b0080>,
 <keras.layers.convolutional.Conv2D at 0x2b380692f60>,
 <keras.

In [61]:
base_model.output

<tf.Tensor 'block5_pool_1/transpose_1:0' shape=(?, 512, 7, 7) dtype=float32>

In [62]:
x

<tf.Tensor 'dropout_4/cond/Merge:0' shape=(?, 256) dtype=float32>

In [63]:
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
            loss='categorical_crossentropy', metrics=['accuracy'])

Then you would just train as normal:

```python
# i.e. if we had training images and our own labels, we could run
model.fit(X_train,y_train)
```

How much data do you need?

More!

Actually, with this bottleneck approach, you don't need as much: 200-1000 representative images of each class will give good results, because:
* The Visual Geometry Group has already done most of the hard work by pre-training the convolutional layers
* We can use image augmentation to increase our number of training samples
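
As an illustration of the augmentation point, the sketch below produces randomly flipped and shifted variants of a single image array. It is a minimal NumPy version written for this note: the helper name `augment`, the padding size and the synthetic image are all assumptions.

```python
import numpy as np

def augment(image, rng, pad=4):
    """Hypothetical helper: random horizontal flip plus random crop of an (H, W, C) array."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]                       # mirror left-right
    h, w, _ = image.shape
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode='edge')
    top = rng.integers(0, 2 * pad + 1)                  # random offset in [0, 2*pad]
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w, :]        # crop back to the original size

rng = np.random.default_rng(0)
image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)   # stand-in for a real photo
variants = [augment(image, rng) for _ in range(5)]
print([v.shape for v in variants])
```

Each variant keeps the original shape, so it can be fed to the same network. In practice, Keras provides `ImageDataGenerator` in `keras.preprocessing.image` for this kind of on-the-fly augmentation.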

New Architectures are being published every day. So much to read!

* [Curated List of Deep Learning papers](https://github.com/ChristosChristofidis/awesome-deep-learning)
* [Good reddit post for keeping up with the latest research](https://www.reddit.com/r/MachineLearning/comments/6d7nb1/d_machine_learning_wayr_what_are_you_reading_week/)
