# Cats versus dogs example

This was taken from `fast.ai`.

https://github.com/fastai/fastai/blob/master/courses/dl1/lesson1.ipynb

## Image classification with Convolutional Neural Networks
Welcome to the first week of the second deep learning certificate! We're going to use convolutional neural networks (CNNs) to allow our computer to see – something that is only possible thanks to deep learning.

## The first task: 'Dogs vs Cats'

We're going to try to create a model to enter the Dogs vs Cats competition at Kaggle. There are 25,000 labelled dog and cat photos available for training, and 12,500 in the test set that we have to try to label for this competition. According to the Kaggle web-site, when this competition was launched (end of 2013): "State of the art: The current literature suggests machine classifiers can score above 80% accuracy on this task". So if we can beat 80%, then we will be at the cutting edge as of 2013!

> fast.ai from here: https://github.com/fastai/fastai/blob/master/courses/dl1/lesson1.ipynb

In [None]:
# Put these at the top of every notebook, to get automatic reloading and inline plotting
## commented some out for the time being
#%reload_ext autoreload
#%autoreload 2
%matplotlib inline

In [None]:
# import from fast.ai

from fastai.imports import *

In [None]:
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *

Download the data from Kaggle.

`sz` is the size that the images will be resized to in order to ensure that the training runs quickly. We'll be talking about this parameter a lot during the course. We will leave it at 224 for now.

In [None]:
PATH = "/Users/jamespearce/repos/dl/data/dogscats/"
sz=224

An NVIDIA GPU must be available to use `cuda`.

In [None]:
torch.cuda.is_available()

Not on my MacBook! What about the MS DSVM?

In addition, NVidia provides special accelerated functions for deep learning in a package called `CuDNN`. Although not strictly necessary, it will improve training performance significantly, and is included by default in all supported fastai configurations. Therefore, if the following does not return `True`, you may want to look into why.

In [None]:
torch.backends.cudnn.enabled
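
The two checks above can be combined into one small diagnostic. This is a minimal sketch and not part of the fast.ai notebook: `gpu_report` is a hypothetical helper name, and the guarded import is only there so the cell still runs on a machine without PyTorch installed.

```python
# Hypothetical helper: summarise whether CUDA and cuDNN are usable.
try:
    import torch
except ImportError:  # PyTorch is not installed on this machine
    torch = None


def gpu_report():
    """Return a dict of booleans describing GPU support."""
    if torch is None:
        return {"torch": False, "cuda": False, "cudnn": False}
    cuda_ok = bool(torch.cuda.is_available())
    # cuDNN only helps when a CUDA device is actually present
    cudnn_ok = (
        cuda_ok
        and bool(torch.backends.cudnn.is_available())
        and bool(torch.backends.cudnn.enabled)
    )
    return {"torch": True, "cuda": cuda_ok, "cudnn": cudnn_ok}


print(gpu_report())
```

On a laptop without an NVIDIA GPU, the `cuda` and `cudnn` entries should both be `False`.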

## Extra steps if NOT using Crestle or Paperspace or our scripts

The dataset is available at http://files.fast.ai/data/dogscats.zip. You can download it directly on your server by running the following line in your terminal. `wget http://files.fast.ai/data/dogscats.zip`. You should put the data in a subdirectory of this notebook's directory, called `data/`. Note that this data is already available in Crestle and the Paperspace fast.ai template.

> I couldn't get this, so I downloaded direct from Kaggle.

In [None]:
os.listdir(PATH)

# Pivot – using the DSVM

This follows the blog post by Adrian Rosebrock on reaching second place in a Kaggle competition in 22 minutes.

https://blogs.technet.microsoft.com/machinelearning/2018/02/22/22-minutes-to-2nd-place-in-a-kaggle-competition-with-deep-learning-azure/

To get there, we use features from pre-trained Convolutional Neural Networks.

Import the libraries we need, including `ResNet50` from `keras`.



In [2]:
# import the necessary packages
from keras.applications import ResNet50
from keras.applications import imagenet_utils
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import load_img
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from imutils import paths
import numpy as np
import progressbar
import random
import os

Using TensorFlow backend.


A typical flow to classify an image using a CNN is this:

  1. feed the image into the neural network
  2. the image forward propagates through the neural network
  3. the network outputs the final probabilities at the end
  
However, we **can** stop the propagation at an arbitrary layer, such as the activation or pooling layer, to give _feature vectors_ rather than probabilities.

When the network has been trained on a rich data set of images, these feature vectors are extremely useful and can be applied to other computer vision problems. We call this process **transfer learning**.
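
As a toy illustration of stopping the forward pass early, the sketch below runs a two-layer network with random weights in NumPy. It is not ResNet50: the layer sizes, the weights, and the `forward` function are all made up for illustration. The point is only that the same forward pass yields either class probabilities or an intermediate feature vector, depending on where it stops.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up weights: 8 input values -> 4 hidden features -> 2 classes
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(4, 2))


def forward(x, stop_at_features=False):
    """Forward pass; optionally stop at the hidden layer."""
    hidden = np.maximum(x @ W1, 0.0)             # ReLU activation
    if stop_at_features:
        return hidden                            # feature vectors
    logits = hidden @ W2
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)  # softmax probabilities


x = rng.normal(size=(3, 8))                      # a batch of 3 fake "images"
features = forward(x, stop_at_features=True)
probs = forward(x)
print(features.shape, probs.shape)
```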

In [3]:
# since we are not using command line arguments (like we typically
# would inside Deep Learning for Computer Vision with Python), let's
# "pretend" we are by using an `args` dictionary -- this will enable
# us to easily reuse and swap out code depending if we are using the
# command line or Jupyter Notebook
args = {
    "dataset": "/Users/jamespearce/repos/dl/data/dogscats/train",
    "batch_size": 32,
}

# store the batch size in a convenience variable
bs = args["batch_size"]

In [4]:
# grab the list of images in the Kaggle Dogs vs. Cats download and
# shuffle them to allow for easy training and testing splits via
# array slicing during training time
imagePaths = list(paths.list_images(args["dataset"]))
random.shuffle(imagePaths)
print(len(imagePaths))

25000
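
The "splits via array slicing" mentioned in the comment can be sketched as follows. The file names are stand-ins, and the seed and the 75/25 split fraction are assumptions chosen only for illustration.

```python
import random

# Stand-in paths; the real list comes from paths.list_images(...)
image_paths = [f"train/cat.{i}.jpg" for i in range(6)] + \
              [f"train/dog.{i}.jpg" for i in range(6)]

random.seed(42)  # assumed seed, only to make the shuffle repeatable
random.shuffle(image_paths)

# Hold out the last 25% as a test split (the fraction is an assumption)
split = int(len(image_paths) * 0.75)
train_paths = image_paths[:split]
test_paths = image_paths[split:]
print(len(train_paths), len(test_paths))
```

Because the list was shuffled first, both slices contain a random mix of cats and dogs.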


All files in the Dogs vs. Cats dataset have filenames such as `cat.153.jpg` or `dog.4375.jpg`. Since the class labels are baked right into the filenames, we can extract them by taking the text before the first dot.

In [5]:
# extract the class labels from the image paths then encode the
# labels
labels = [p.split(os.path.sep)[-1].split(".")[0] for p in imagePaths]
le = LabelEncoder()
labels = le.fit_transform(labels)
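
To see what the cell above computes, here is a small standard-library sketch on three example filenames. `label_from_path` is a hypothetical helper, and the dictionary-based encoding stands in for `LabelEncoder`, which also numbers the class names in sorted order.

```python
import os


def label_from_path(path):
    """Return the text before the first dot of the filename."""
    return os.path.basename(path).split(".")[0]


example_paths = [
    os.path.join("train", "cat.153.jpg"),
    os.path.join("train", "dog.4375.jpg"),
    os.path.join("train", "cat.7.jpg"),
]
names = [label_from_path(p) for p in example_paths]

# Map the sorted unique class names to integers
classes = sorted(set(names))
to_index = {name: i for i, name in enumerate(classes)}
encoded = [to_index[n] for n in names]
print(names, encoded)
```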

Download the `ResNet50` weights and load the model. This took around half an hour on a slow connection.

Python would not download the weights over this connection, so I downloaded them separately from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5 and stored them in `~/.keras/models`.
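
Before loading the model, it can help to confirm that the manually downloaded file is where Keras will look for it. This is a minimal sketch, assuming the `~/.keras/models` location and the filename from the URL above; `weights_cached` is a hypothetical helper.

```python
from pathlib import Path

# Filename taken from the download URL above
WEIGHTS_FILE = "resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5"


def weights_cached(cache_dir=None):
    """Return True if the ResNet50 weights file is in the cache directory."""
    if cache_dir is None:
        cache_dir = Path.home() / ".keras" / "models"
    return (Path(cache_dir) / WEIGHTS_FILE).is_file()


print(weights_cached())
```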

In order to perform feature extraction, we need a pre-trained network – `ResNet50` is a good choice for this application. Notice how we have set `include_top=False` to leave off the fully-connected layers, enabling us to easily perform feature extraction.



In [6]:
# load the ResNet50 network (i.e., the network we'll be using for
# feature extraction)

model = ResNet50(weights="imagenet", include_top=False)

Once we have all image paths we need to loop over them individually and build batches to pass through the network for feature extraction.
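
The batching itself is just slicing the list in steps of the batch size. A minimal sketch, where `batches` is a hypothetical helper and the list of 70 integers stands in for the image paths:

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


sizes = [len(b) for b in batches(list(range(70)), 32)]
print(sizes)  # the last batch is smaller: [32, 32, 6]
```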


In [7]:
# initialize the progress bar
widgets = ["Extracting Features: ", progressbar.Percentage(), " ",
    progressbar.Bar(), " ", progressbar.ETA()]
pbar = progressbar.ProgressBar(maxval=len(imagePaths),
    widgets=widgets).start()


Extracting Features:   0% |                                    | ETA:  --:--:--

In [22]:
percentage = 0.1
random.seed(2027)

# use the `percentage` variable rather than repeating the literal
k = int(len(imagePaths) * percentage)
indices = random.sample(range(len(imagePaths)), k)

imagePaths_sample = [imagePaths[i] for i in indices]
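
When sampling paths this way, the labels need to be selected with the same indices, otherwise paths and labels no longer line up. A small sketch with stand-in data (the list contents are made up, and `labels_sample` is a name introduced here):

```python
import random

# Stand-in data; in the notebook these are imagePaths and labels
all_paths = [f"img_{i}.jpg" for i in range(20)]
all_labels = [i % 2 for i in range(20)]

random.seed(2027)
k = int(len(all_paths) * 0.1)
indices = random.sample(range(len(all_paths)), k)

# Index both lists with the same indices so each path keeps its label
paths_sample = [all_paths[i] for i in indices]
labels_sample = [all_labels[i] for i in indices]
print(list(zip(paths_sample, labels_sample)))
```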

In [17]:
%%time
# initialize our data matrix (where we will store our extracted
# features)
data = None

# loop over the images in batches
for i in np.arange(0, len(imagePaths_sample), bs):
    # extract the batch of images and labels, then initialize the
    # list of actual images that will be passed through the network
    # for feature extraction
    batchPaths = imagePaths_sample[i:i + bs]
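    # look the labels up via the sampled indices so that they match
    # the sampled paths rather than the first rows of `labels`
    batchLabels = labels[indices[i:i + bs]]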
    batchImages = []
    
    # loop over the images and labels in the current batch
    for (j, imagePath) in enumerate(batchPaths):
        # load the input image using the Keras helper utility
        # while ensuring the image is resized to 224x224 pixels
        image = load_img(imagePath, target_size=(224, 224))
        image = img_to_array(image)

        # preprocess the image by (1) expanding the dimensions and
        # (2) subtracting the mean RGB pixel intensity from the
        # ImageNet dataset
        image = np.expand_dims(image, axis=0)
        image = imagenet_utils.preprocess_input(image)

        # add the image to the batch
        batchImages.append(image)

    # pass the images through the network and use the outputs as
    # our actual features
    batchImages = np.vstack(batchImages)

    features = model.predict(batchImages, batch_size=bs)

    # reshape the features so that each image is represented by
    # a flattened feature vector of the `MaxPooling2D` outputs
    print(features.shape)
    features = features.reshape((features.shape[0], 2048))
    
    # if our data matrix is None, initialize it
    if data is None:
        data = features
    
    # otherwise, stack the data and features together
    else:
        data = np.vstack([data, features])
    
    # update the progress bar
    pbar.update(i)

# finish up the progress bar
pbar.finish()

1
(1, 224, 224, 3)
2
(32, 7, 7, 2048)


ValueError: cannot reshape array of size 3211264 into shape (32,2048)

In [16]:
7*7*32*2048

3211264
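
The arithmetic explains the error: with `include_top=False`, the model returned a 7×7 grid of 2048-channel activations per image, so a batch of 32 holds 32 × 7 × 7 × 2048 = 3,211,264 values, not 32 × 2048. Two common ways to get one vector per image are sketched below on a zero-filled stand-in array. Averaging over the 7×7 grid (global average pooling) gives 2048 features per image. Flattening the whole grid gives 7 × 7 × 2048 = 100,352 features per image. Which of these matches the original blog post is not established here. Keras also accepts a `pooling="avg"` argument to `ResNet50` that applies the averaging inside the model.

```python
import numpy as np

# Stand-in for the model.predict output: (batch, height, width, channels)
features = np.zeros((32, 7, 7, 2048), dtype="float32")

# Option 1: average over the spatial grid -> one 2048-vector per image
pooled = features.mean(axis=(1, 2))

# Option 2: flatten the whole grid -> one 100352-vector per image
flat = features.reshape((features.shape[0], -1))

print(pooled.shape, flat.shape)
```

The pooled version keeps the data matrix about 49 times smaller, which matters when stacking features for all 25,000 images.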