### Homework 6: Homebrew Computer Vision

I didn't have time to refactor this, so it's a standard "run-the-cells-in-order" notebook, with explanations and answers to questions inline in Markdown cells.

### Setup, exploration

In [262]:
%matplotlib inline

import os, glob

import numpy as np
import matplotlib.pyplot as plt

from scipy import stats

from skimage import io

from skimage.filters import gabor
from skimage.feature import hog, canny

from sklearn import model_selection

In [389]:
all_categories = [os.path.split(path)[-1] for path in glob.glob("50_categories/*")]
print("The categories are: ", all_categories)
all_filenames = glob.glob("50_categories/*/*.jpg")

The categories are:  ['gorilla', 'raccoon', 'crab', 'blimp', 'snail', 'airplanes', 'dog', 'dolphin', 'goldfish', 'giraffe', 'bear', 'killer-whale', 'penguin', 'zebra', 'duck', 'conch', 'camel', 'owl', 'helicopter', 'starfish', 'saturn', 'galaxy', 'goat', 'iguana', 'elk', 'hummingbird', 'triceratops', 'porcupine', 'teddy-bear', 'comet', 'hot-air-balloon', 'leopards', 'toad', 'mussels', 'kangaroo', 'speed-boat', 'bat', 'swan', 'octopus', 'frog', 'cormorant', 'unicorn', 'horse', 'skunk', 'mars', 'ostrich', 'goose', 'llama', 'snake', 'elephant']


### Adjust next line to train on all or a subset of files

In [484]:
subset_filenames = all_filenames[::5]
n_files_to_classify = len(subset_filenames)

In [376]:
def read_image(filename):
    img = io.imread(filename)
    #print(f"Image shape: {img.shape}")
    return img

def split_into_rgb_channels(image):
    r = image[:,:,0]
    g = image[:,:,1]
    b = image[:,:,2]
    return r, g, b

### Functions that take an image and return a feature

OK, so let's talk about features. I iterated a few times here but ultimately didn't get features I'm happy with. The stupid/obvious features brought most of the performance boost, and my "smarter" features didn't really seem to help that much, despite taking a bunch of computational and programmer time. I think this is because I'm condensing the smarter features down into single numbers that throw away most of the useful information. I did this because I was looking for a balance between complexity and speed, and thought a few useful statistics of a well-known image transform (e.g. Gabor filters) would do as well as classifying based on the transformed image itself. This seems not to be true, and I'm out of time to build something better.

If I had more time, here's the approach I would take. First, refactor this whole thing so that it's object-oriented and handles the small stuff in one place: multichannel vs. grayscale images, applying several functions to a transformed image without recomputing the transform, and flattening features of different sizes into one vector. That would also eliminate the ugly code repetition across channels and features. You can see I made some progress in this direction: e.g. `feat_gabor` computes the Gabor filter once and then returns several statistics of it, so the expensive filtering only happens once per channel. Then, add more features using additional functions from scikit-image. Finally, use the filtered images themselves as features, rather than computing single numbers from them. I could have added multiprocessing here too. I think ultimately I need a much bigger diversity of features than I have. Oh well, I learned a lot at least!
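
The refactoring described above could start from something like the following. This is only a sketch under assumptions: `as_channels`, `summarize`, and `horizontal_gradient` are hypothetical names that do not appear elsewhere in this notebook, and the gradient is a NumPy-only stand-in for a real transform such as `gabor`, so the example runs without any image files. Grayscale images are repeated across three channels, mirroring the fallback used in the `feat_*` functions, so every image yields a vector of the same length.

```python
import numpy as np

def as_channels(img, n_channels=3):
    """Return n_channels 2-D arrays; a grayscale image is repeated."""
    img = np.asarray(img)
    if img.ndim == 2:
        return [img] * n_channels
    # Assumes the image has at least n_channels channels (e.g. RGB or RGBA).
    return [img[:, :, c] for c in range(n_channels)]

def summarize(img, transform, funcs):
    """Apply `transform` once per channel, then every summary in `funcs`.

    Returns a flat float array of length n_channels * len(funcs).
    """
    feats = []
    for chan in as_channels(img):
        out = np.asarray(transform(chan), dtype=float).ravel()  # computed once
        feats.extend(f(out) for f in funcs)
    return np.asarray(feats, dtype=float)

def horizontal_gradient(chan):
    """Stand-in transform: absolute difference between neighbouring columns."""
    return np.abs(np.diff(chan.astype(float), axis=1))

# Random arrays in place of real images.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)
gray = rng.integers(0, 256, size=(32, 48), dtype=np.uint8)

rgb_feats = summarize(rgb, horizontal_gradient, [np.mean, np.std, np.max])
gray_feats = summarize(gray, horizontal_gradient, [np.mean, np.std, np.max])
print(rgb_feats.shape, gray_feats.shape)
```

Passing a different `transform` (or a longer `funcs` list) adds features without touching the channel handling, which is the repetition the current `feat_all` suffers from.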

In [485]:
def feat_imsize(img):
    return img.shape[:2]

def feat_chanratio(img, c1, c2):
    return np.mean(img[:,:,c1])/np.mean(img[:,:,c2]) if len(img.shape) > 2 else 1

def feat_quadrant(img, channel, func=np.max):
    h, w = img.shape[:2]
    half_h = h//2
    half_w = w//2
    ic = img[:,:,channel] if len(img.shape) > 2 else img
    qs = {1: func(ic[:half_h,:half_w]),
         2: func(ic[half_h:,:half_w]),
         3: func(ic[:half_h,half_w:]),
         4: func(ic[half_h:,half_w:])}
    return max(qs, key=lambda key: qs[key])

def feat_histogram(img, channel, funcs=[stats.kurtosis]):
    ic = img[:,:,channel] if len(img.shape) > 2 else img
    h, fd = np.histogram(ic)
    #print(ic.shape, h.shape, h, fd.shape, fd)
    return [f(h) for f in funcs]
    
# smarter features
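def feat_gabor(img, channel, f, funcs=[np.max]):
    ic = img[:,:,channel] if len(img.shape) > 2 else img
    # gabor returns the real and imaginary responses; only the real part is used
    f_real, f_imag = gabor(ic, frequency=f)
    #print(f_real.shape, f_real)
    # distinct loop name so the frequency argument `f` is not shadowed
    return [func(f_real.flatten()) for func in funcs]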

def feat_canny(img, channel, funcs=[np.count_nonzero]):
    ic = img[:,:,channel] if len(img.shape) > 2 else img
    c = canny(ic)
    #print(c.shape, c)
    return [f(c) for f in funcs]

def feat_hog(img, channel, funcs=[lambda x: x]):
    ic = img[:,:,channel] if len(img.shape) > 2 else img
    h = hog(ic, block_norm="L2-Hys")
    #print(h.shape, h, func(h))
    return [f(h) for f in funcs]
 
def feat_all(img):
    h, w = feat_imsize(img)
    n_ch = img.shape[2] if len(img.shape)>2 else 1
    rgavg = feat_chanratio(img, 0, 1)
    rbavg = feat_chanratio(img, 0, 2)
    gbavg = feat_chanratio(img, 1, 2)
    rquadmax = feat_quadrant(img, 0)
    gquadmax = feat_quadrant(img, 1)
    bquadmax = feat_quadrant(img, 2)
    rquadmin = feat_quadrant(img, 0, np.min)
    gquadmin = feat_quadrant(img, 1, np.min)
    bquadmin = feat_quadrant(img, 2, np.min)
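    # NOTE: the variable names below are historical. With these funcs the three
    # values are (count_nonzero, kurtosis, skew), so *_gabor_max holds the kurtosis.
    gabor_funcs = [np.count_nonzero, stats.kurtosis, stats.skew]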
    r_gabor, r_gabor_max, r_gabor_3 = feat_gabor(img, 0, 0.4, gabor_funcs)
    g_gabor, g_gabor_max, g_gabor_3 = feat_gabor(img, 1, 0.4, gabor_funcs)
    b_gabor, b_gabor_max, b_gabor_3 = feat_gabor(img, 2, 0.4, gabor_funcs)
    
    hist_funcs = [stats.skew, stats.kurtosis, stats.variation]
    rskew, rkurt, rvar = feat_histogram(img, 0, hist_funcs)
    gskew, gkurt, gvar = feat_histogram(img, 1, hist_funcs)
    bskew, bkurt, bvar = feat_histogram(img, 2, hist_funcs)
    
    # The three values are (skew, kurtosis, std) of the HOG descriptor
    hog_funcs = [stats.skew, stats.kurtosis, np.std]
    r_hog, r_hog_k, r_hog3 = feat_hog(img, 0, hog_funcs)
    g_hog, g_hog_k, g_hog3 = feat_hog(img, 1, hog_funcs)
    b_hog, b_hog_k, b_hog3 = feat_hog(img, 2, hog_funcs)
    #print (r_hog, r_hog_k, r_hog3)
    
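    # canny returns a boolean edge map: its median is 0 unless at least half of
    # the pixels are edges, while its mean is the fraction of edge pixels
    canny_funcs = [np.nanmedian, np.nanmean]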
    r_c, r_c2 = feat_canny(img, 0, canny_funcs)
    g_c, g_c2 = feat_canny(img, 1, canny_funcs)
    b_c, b_c2 = feat_canny(img, 2, canny_funcs)
    #print (r_c, g_c, b_c)
    
    return np.array([h, w, n_ch, 
                     rgavg, rbavg, gbavg, 
                     #rquadmax, bquadmax, gquadmax,
                     #rquadmin, gquadmin, bquadmin,
                     rkurt, bkurt, gkurt,
                     rkurt/gkurt, rkurt/bkurt, gkurt/bkurt,
                     rskew, gskew, bskew,
                     rvar, bvar, gvar,
                     r_gabor/b_gabor, r_gabor/g_gabor, b_gabor/g_gabor,
                     r_gabor_max, b_gabor_max, g_gabor_max,
                     r_gabor_3, g_gabor_3, b_gabor_3,
                     r_hog, g_hog, b_hog,
                     #r_hog_k, g_hog_k, b_hog_k,
                     r_hog3, g_hog3, b_hog3,
                     r_c, g_c, b_c,
                     r_c2, b_c2, g_c2
                    ])

### Calculate features of the images

In [486]:
def build_xy(files, all_categories):
    """Build the features and targets for the image files provided,
    where the targets are taken from the folder name and the provided list of categories"""
    n_files = len(files)
    eg_feats = feat_all(read_image(files[np.random.randint(n_files)]))
    n_feats = len(eg_feats)
    print(f"Will calculate {n_feats} features for {n_files} images.")# Feature vectors look like:\n{eg_feats}")
    x = np.empty((n_files, n_feats), dtype="float16")
    y = np.zeros(n_files, dtype="int")
    for i,f in enumerate(files):
        img = read_image(f)
        x_i = feat_all(img)
        head, tail = os.path.split(f)
        _, target = os.path.split(head)
        y[i] = all_categories.index(target)
        if np.any(np.isinf(x_i)):
            print(i, f, x_i)
            x_i[np.isinf(x_i)] = 0
        x[i, :] = x_i
    return x, y
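
One caveat about `build_xy`, offered as a hedged side note: the `isinf` check runs on `x_i` before it is stored in the `float16` array `x`, so a value that only overflows during that cast would not be caught inside the loop, and NaN values (for example from a 0/0 ratio) are not handled at all. The check in the next cell found no infinities for this data set, so this is a latent risk and not an observed failure. The sketch below shows the `float16` limit and one possible cleaning step; `clean_features` is a hypothetical helper, not something the notebook defines.

```python
import numpy as np

# float16 cannot represent values above 65504; larger values overflow to inf.
print(np.finfo(np.float16).max)

def clean_features(x_i):
    """Return a float32 copy of x_i with NaN and +/-inf replaced by 0."""
    x_i = np.asarray(x_i, dtype=np.float32)
    return np.nan_to_num(x_i, nan=0.0, posinf=0.0, neginf=0.0)

raw = np.array([300.0, 70000.0, np.nan, np.inf, -np.inf])

with np.errstate(over="ignore"):
    as_f16 = raw.astype(np.float16)  # 70000.0 becomes inf in float16

print(np.isinf(as_f16[1]))
print(clean_features(raw))
```

Allocating `x` as `float32` and passing each `x_i` through a step like this would be one way to make the feature matrix robust to large or undefined values.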

In [487]:
x, y = build_xy(subset_filenames, all_categories)
print(x.shape, y.shape)
#print(y)

Will calculate 39 features for 849 images.
(849, 39) (849,)


In [488]:
print(np.argwhere(np.isinf(x)))

[]


### Classifier cross-validation (Question #3)

In [489]:
from sklearn.ensemble import RandomForestClassifier

In [490]:
# Create a random forest Classifier. By convention, clf means 'Classifier'
clf = RandomForestClassifier(n_jobs=-1, random_state=0)

In [491]:
xval_scores = model_selection.cross_val_score(clf, x, y, cv=5)
print(f"The cross-validation scores were:\n{xval_scores}\nMean: {np.mean(xval_scores)}.\n\
The chance level with 50 categories is 2% so this is {np.mean(xval_scores)-.02} better.")

The cross-validation scores were:
[ 0.16842105  0.20903955  0.24418605  0.20625     0.24666667]
Mean: 0.21491266276649448.
The chance level with 50 categories is 2% so this is 0.1949126627664945 better.
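
A note on the 2% figure: it assumes all 50 categories are equally likely. The classes here do not look balanced. The airplanes folder yields 107 files after taking every fifth one (see the final validation run), which suggests it holds roughly 530 of the 4244 images, so always answering "airplanes" would likely score around 12 to 13%. A majority-class baseline may therefore be a fairer comparison. The sketch below is an illustration with made-up toy labels; `majority_baseline` is a hypothetical helper.

```python
import numpy as np

def majority_baseline(y):
    """Accuracy obtained by always predicting the most frequent label in y."""
    counts = np.bincount(np.asarray(y))
    return counts.max() / counts.sum()

# Toy labels (made up for illustration): class 0 is over-represented.
y_toy = np.array([0] * 6 + [1] * 2 + [2] * 2)

uniform = 1 / len(np.unique(y_toy))
print(f"Uniform guessing: {uniform:.3f}")
print(f"Majority class:   {majority_baseline(y_toy):.3f}")
```

In the notebook, `majority_baseline(y)` on the real label array would give the number to compare the cross-validation mean against.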


### Train on all data

In [None]:
xall, yall = build_xy(all_filenames, all_categories)
clf.fit(xall, yall)

Will calculate 39 features for 4244 images.


#### Feature importance

In [478]:
importances = clf.feature_importances_
indices = np.argsort(importances)[::-1]
print(indices)

[ 0  1 36 19 20 25  4  3 12 14 37  9 38 15  6 17  5 24 10 16  8 18 27 23 31
 21 22 32 30 13  7 11 28 29 26  2 33 35 34]


So the most important features are the image height and width (indices 0 and 1), followed by index 36. Going by the order of the array returned by `feat_all`, index 36 is `r_c2`, the `nanmean` of the Canny edge map of the red channel, i.e. the fraction of pixels marked as edges. I really think I could have done better with different features, or by not compressing my features down to one number, but I'm out of time. Oh well.

Probably should have pandas-ized this too: as it stands, you have to take each index (0, 1, 36, ...) and count through the array returned by `feat_all` to figure out which feature it is.
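
Short of a full pandas rewrite, a list of names kept in the same order as the array returned by `feat_all` would make the ranking readable. The sketch below assumes the names are copied by hand from that return statement (with the commented-out entries left out); `FEATURE_NAMES` and `rank_features` are hypothetical, and the importances used here are placeholders. Note that the importance cell has a lower execution count than the feature cell, so the order may have differed slightly when the printed ranking was produced.

```python
import numpy as np

# Names in the order of the array returned by feat_all.
FEATURE_NAMES = [
    "h", "w", "n_ch",
    "rgavg", "rbavg", "gbavg",
    "rkurt", "bkurt", "gkurt",
    "rkurt/gkurt", "rkurt/bkurt", "gkurt/bkurt",
    "rskew", "gskew", "bskew",
    "rvar", "bvar", "gvar",
    "r_gabor/b_gabor", "r_gabor/g_gabor", "b_gabor/g_gabor",
    "r_gabor_max", "b_gabor_max", "g_gabor_max",
    "r_gabor_3", "g_gabor_3", "b_gabor_3",
    "r_hog", "g_hog", "b_hog",
    "r_hog3", "g_hog3", "b_hog3",
    "r_c", "g_c", "b_c",
    "r_c2", "b_c2", "g_c2",
]

def rank_features(importances, names=FEATURE_NAMES):
    """Return (name, importance) pairs sorted from most to least important."""
    importances = np.asarray(importances)
    order = np.argsort(importances)[::-1]
    return [(names[i], float(importances[i])) for i in order]

# The top three indices printed above, translated to names.
top_three = [FEATURE_NAMES[i] for i in (0, 1, 36)]
print(top_three)

# Placeholder importances (made up); in the notebook, pass clf.feature_importances_.
fake = np.linspace(0.0, 1.0, len(FEATURE_NAMES))
for name, score in rank_features(fake)[:3]:
    print(f"{name:>16s}  {score:.3f}")
```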

### Final validation

In [492]:
def run_final_classifier(val_dir):
    global clf #clf.fit (above) must have been called!
    # classify every fifth file; keep the subset so filenames line up with predictions
    files = glob.glob(f"{val_dir}/*.jpg")[::5]
    xtest, ytest = build_xy(files, all_categories)
    preds = clf.predict(xtest).astype(int)
    results = (list(zip(files, ytest, preds)))
    print("------------------------------------------------------------")
    for r in results:
        print(f"{r[0]}\t{all_categories[r[2]]}")

#### Put validation directory here

In [493]:
run_final_classifier("50_categories/airplanes/")

Will calculate 39 features for 107 images.


NotFittedError: This RandomForestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

### Final notes

This was a cool assignment! I wish I had spent more time on it, considering how early I started: I got bogged down in one approach, and a couple more refactorings and iterations would have been useful. It would also have been good to run GridSearchCV to tune the classifier's hyperparameters. I still feel like I learned a lot though.
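
For reference, the hyperparameter search mentioned above might look roughly like this. It is a minimal sketch under assumptions: the data is synthetic (from `make_classification`) so the example runs on its own, and the grid values are arbitrary illustrations, not tuned recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the (x, y) arrays returned by build_xy.
x_demo, y_demo = make_classification(
    n_samples=150, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Hypothetical grid; sensible ranges depend on the data.
param_grid = {
    "n_estimators": [10, 30],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation for each parameter combination
)
search.fit(x_demo, y_demo)

print(search.best_params_)
print(f"Best mean CV accuracy: {search.best_score_:.3f}")
```

In the notebook, `x_demo, y_demo` would be replaced by `x, y`, and `search.best_estimator_` could then stand in for `clf`.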
