# Step 1: Compute the texture descriptions for the training images.

For each training image, calculate a vector of GLCM features.  Which GLCM features and which set of displacements you use are up to you (note that displacements for `skimage.feature.graycomatrix()` must be specified as distances and angles in radians rather than as changes in the x and y directions).  Experiment to obtain the best possible classification rate.  Use conservative choices to begin with until everything is working, then come back and experiment.  As described in the Topic 10 lecture notes, use `skimage.feature.graycomatrix()` and `skimage.feature.graycoprops()` to calculate GLCM features.  You'll probably want to use `normed=True` with `graycomatrix`.  Your GLCM features should be stored as a 120-row by m-column array (m depends on how many different features and displacements you use and on whether you combine values for different displacements, e.g., by taking their mean).

_Hint: Pay close attention to the format of the return values of  `graycomatrix()` and `graycoprops()`._
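
If you think of a displacement as a pixel offset (dx, dy), it has to be converted to a distance and an angle before it can be passed to `graycomatrix()`.  The following is a minimal sketch of that conversion using only the standard library.  The helper name `displacement_to_polar` is hypothetical, and the sign convention for the y direction is an assumption that you should check against the library documentation.

```python
import math

def displacement_to_polar(dx, dy):
    """Convert a pixel displacement (dx, dy) to (distance, angle in radians)."""
    distance = math.hypot(dx, dy)   # Euclidean length of the displacement
    angle = math.atan2(dy, dx)      # angle measured from the positive x axis
    return distance, angle

# Example: a diagonal displacement of one pixel in x and one pixel in y.
d, theta = displacement_to_polar(1, 1)
print(d, theta)
```

The resulting values are what would go into the `distances` and `angles` arguments.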

For each training image, calculate the rotationally invariant LBP features using `skimage.feature.local_binary_pattern()`.  You can experiment with parameters `P` and `R` to get a good classification rate, but `P=8` and `R=1` are probably good enough.  For the `method` parameter, use `'uniform'`, which gives you the LBP flavour we talked about in class.  Remember that `skimage.feature.local_binary_pattern()` returns an "LBP Image", which is an image in which each pixel value is between 0 and 9 and corresponds to one of the ten possible pattern labels.  It's up to you to turn the "LBP Image" into a 10-bin histogram, which serves as the feature vector for that image (you can use `numpy.histogram` for this, but again remember to specify the `bins` and `range` parameters, and that it returns two things, of which you only need the first).
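
As a rough illustration of the histogram step, the sketch below bins a synthetic stand-in for an "LBP Image" (random labels 0 to 9, not a real LBP result).  It uses integer-aligned edges, `range=(0, 10)`, so that each label falls into its own unit-width bin; that is a choice of this sketch rather than a requirement.

```python
import numpy as np

# Stand-in for an "LBP Image": labels 0..9 (assuming P=8, so ten labels).
rng = np.random.default_rng(0)
fake_lbp = rng.integers(0, 10, size=(64, 64)).astype(float)

# np.histogram returns (counts, bin_edges); only the counts are needed.
counts, _ = np.histogram(fake_lbp, bins=10, range=(0, 10))

print(counts.shape)  # one count per pattern label
print(counts.sum())  # total number of pixels that were binned
```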

Additionally, calculate the LBP variance feature, again using `skimage.feature.local_binary_pattern()` but with `method='var'` instead.  This is the VAR feature we saw in class.  Use the same `P` and `R` as before.  Build a 16-bin histogram of the resulting 'LBP-VAR' image; use `range=(0,7000)` with `numpy.histogram()` (this is not quite "correct", but it's good enough).  Concatenate these with the rotationally invariant LBP features so that you have a 26-element feature vector for each training image.  These should be stored as a 120-row, 26-column array.
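
The concatenation can be sketched as follows, again with synthetic stand-ins rather than real LBP output.  It assumes that the VAR image may contain non-finite values (for example NaN in perfectly flat neighbourhoods) and drops them before binning; both the assumption and the filtering are choices of this sketch, not part of the assignment.

```python
import numpy as np

# Hypothetical stand-ins for the two per-image outputs (not real LBP results).
rng = np.random.default_rng(1)
lbp_img = rng.integers(0, 10, size=(32, 32)).astype(float)
var_img = rng.uniform(0, 7000, size=(32, 32))
var_img[0, :5] = np.nan  # assumption: some entries may be non-finite

lbp_hist, _ = np.histogram(lbp_img, bins=10, range=(0, 10))

# Drop non-finite values before binning (a choice of this sketch).
finite_var = var_img[np.isfinite(var_img)]
var_hist, _ = np.histogram(finite_var, bins=16, range=(0, 7000))

# 10 LBP bins followed by 16 VAR bins give one 26-element feature vector.
feature_vector = np.hstack((lbp_hist, var_hist))
print(feature_vector.shape)
```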

You can do this all in one loop which builds both feature arrays.



In [16]:
import os
import numpy as np
import skimage.io as io
import skimage.feature as feat
import math as m


glcm_props = ['contrast', 'homogeneity', 'energy', 'correlation']
distances = (8,)
angles = (0, m.pi/4, m.pi/2, 3*m.pi/4)
P=8
R=1
            
training_lbp = []
training_glcm = []
for root, dirs, files in os.walk('./brodatztraining'):
    for f in files:
        if f[-4:] == '.png':
            I = io.imread(os.path.join(root, f))
            
            # Compute the LBP features for the image
            lbp = feat.local_binary_pattern(I, P, R, method='uniform')
            var = feat.local_binary_pattern(I, P, R, method='var')
        
            lbp_hist, bin_edges = np.histogram(lbp, range=(0,9), bins=10)
            var_hist, bin_edges = np.histogram(var, range=(0,7000), bins=16)
            training_lbp.append(np.hstack((lbp_hist, var_hist)))
            
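            # Compute the GLCM features for the image.
            # graycomatrix() returns a 4-D array indexed as
            # [level_i, level_j, distance, angle].
            glcm = feat.graycomatrix(I, distances, angles, normed=True)
            
            glcm_features = []
            for p in glcm_props:
                # graycoprops() returns one value per (distance, angle) pair;
                # average them into a single value per property.
                props = feat.graycoprops(glcm, prop=p)
                glcm_features.append(np.mean(props))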
    
            glcm_features = np.array(glcm_features)
            training_glcm.append(glcm_features)
            
                
                
            
            
training_lbp = np.vstack(training_lbp)
training_glcm = np.vstack(training_glcm)




# Step 2: Compute Test Image Features

Compute exactly the same features as in Step 1 for each of the test images.  Store them in the same way (these arrays will just have more rows: 320, one for each test sample).  For GLCM you'll probably have trouble beating a 65% classification rate.  For LBP you should be able to get 95% or better.

In [17]:
testing_lbp = []
testing_glcm = []
for root, dirs, files in os.walk('./brodatztesting'):
    for f in files:
        if f[-4:] == '.png':
            I = io.imread(os.path.join(root, f))
            
            # Compute the LBP features for the image
            lbp = feat.local_binary_pattern(I, P, R, method='uniform')
            var = feat.local_binary_pattern(I, P, R, method='var')
        
            lbp_hist, bin_edges = np.histogram(lbp, range=(0,9), bins=10)
            var_hist, bin_edges = np.histogram(var, range=(0,7000), bins=16)
            testing_lbp.append(np.hstack((lbp_hist, var_hist)))
            
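            # Compute the GLCM features for the image.
            # graycomatrix() returns a 4-D array indexed as
            # [level_i, level_j, distance, angle].
            glcm = feat.graycomatrix(I, distances, angles, normed=True)
            
            glcm_features = []
            for p in glcm_props:
                # graycoprops() returns one value per (distance, angle) pair;
                # average them into a single value per property.
                props = feat.graycoprops(glcm, prop=p)
                glcm_features.append(np.mean(props))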
    
            glcm_features = np.array(glcm_features)
            testing_glcm.append(glcm_features)
            
                
                
            
            
testing_lbp = np.vstack(testing_lbp)
testing_glcm = np.vstack(testing_glcm)


# Step 3: Generate Label Arrays for the Training and Testing Data

Use label 1 for the first class, label 2 for the second class, etc.  This should be easy to do since the filenames are ordered in blocks of 15 (training) or 40 (testing) images per class.
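
One compact way to build such block-structured labels is `numpy.repeat`.  The sketch below assumes 8 classes with 15 training and 40 testing images each, as stated above; the variable names are hypothetical.

```python
import numpy as np

n_classes = 8
class_ids = np.arange(1, n_classes + 1)  # labels 1..8

# Repeat each class label once per image in its block.
training_labels_alt = np.repeat(class_ids, 15)
testing_labels_alt = np.repeat(class_ids, 40)

print(training_labels_alt.shape, testing_labels_alt.shape)
```

This only produces correct labels if the feature rows really are in class-block order, so it is worth checking the order in which the files are read.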

In [18]:
labels = np.ones(15, dtype='int')
training_labels = np.hstack( [labels * (i+1) for i in range(8) ] )

labels = np.ones(40, dtype='int')
testing_labels = np.hstack( [labels * (i+1) for i in range(8) ] )


# Step 4:  Train a KNN classifier.

Train a KNN classifier using your GLCM features.  Train another one using your LBP features.
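
To make concrete what a nearest-neighbour classifier with one neighbour and Euclidean distance does, here is a small NumPy-only sketch.  It illustrates the idea and is not scikit-learn's implementation; the function name and the toy data are hypothetical.

```python
import numpy as np

def nearest_neighbour_predict(train_X, train_y, test_X):
    """Label each test row with the label of its closest training row."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    diffs = test_X[:, np.newaxis, :] - train_X[np.newaxis, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # The index of the closest training row selects the predicted label.
    return train_y[np.argmin(sq_dists, axis=1)]

train_X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
train_y = np.array([1, 1, 2])
test_X = np.array([[0.2, 0.1], [4.0, 4.5]])
print(nearest_neighbour_predict(train_X, train_y, test_X))
```

Because the prediction depends only on distances, features with large numeric ranges can dominate the result, which is worth keeping in mind when comparing feature sets.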



In [19]:
import sklearn.svm as svm
import sklearn.neighbors as knn

KNN_glcm = knn.KNeighborsClassifier(n_neighbors=1)
KNN_glcm.fit(training_glcm, training_labels)

KNN_lbp = knn.KNeighborsClassifier(n_neighbors=1)
KNN_lbp.fit(training_lbp, training_labels)



KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=None, n_neighbors=1, p=2,
           weights='uniform')

# Step 5:  Predict the classes of the test images

Predict the classes of the test images using both classifiers.

In [20]:
predictions_glcm_knn = KNN_glcm.predict(testing_glcm)
predictions_lbp_knn = KNN_lbp.predict(testing_lbp)

# Step 6:  Display Results

Display results as in the final step of Question 1.  For each classifier, display the filenames of the images that were incorrectly classified, the confusion matrix, and the classification rate.
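
As a small worked example of the two summary quantities, the sketch below builds a confusion matrix (rows are true classes, columns are predicted classes) and the classification rate for a toy set of 1-based labels.  The function name and the toy labels are hypothetical.

```python
import numpy as np

def confusion_and_rate(true_labels, predicted_labels, n_classes):
    """Return (confusion matrix, classification rate) for 1-based labels."""
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    # Rows index the true class, columns the predicted class.
    np.add.at(confusion, (true_labels - 1, predicted_labels - 1), 1)
    # Correct predictions lie on the diagonal.
    rate = np.trace(confusion) / len(true_labels)
    return confusion, rate

true = np.array([1, 1, 2, 2, 3, 3])
pred = np.array([1, 2, 2, 2, 3, 1])
cm, rate = confusion_and_rate(true, pred, n_classes=3)
print(cm)
print(rate)
```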



In [21]:
def display_results(predictions, testing_labels, feat_name):    
    # Construct a boolean array that denotes which images were correctly classified.
    correct_labels = predictions==testing_labels
    
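    # Obtain the filenames of images that were incorrectly classified.
    # NOTE: `files` is the global left over from the last os.walk() iteration in the
    # test-feature cell, so this assumes all test images sit in a single directory.
    incorrectly_classified = [ files[i] for i in range(len(files)) if not correct_labels[i] ]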
    
    # Print out the names of incorrectly classified images.
    for f in incorrectly_classified:
        print(f, 'was incorrectly classified.')
    print()  

    # Compute and print out the confusion matrix.
    confusion = np.zeros((8, 8), dtype='int')
    
    for i in range(len(predictions)):
        confusion[testing_labels[i]-1, predictions[i]-1] += 1
        
    print('The confusion matrix for', feat_name, 'is:')
    for x in confusion:
        print('{:5}, {:5}, {:5}, {:5}, {:5}, {:5}, {:5}, {:5}'.format(x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7]))
    print()
    
    # Compute and print out the classification rate.
    correct_rate = np.sum(correct_labels) / len(predictions)
    print('The classification rate for', feat_name, 'was', correct_rate*100, 'percent.')
    
display_results(predictions_glcm_knn, testing_labels, 'GLCM-KNN')
display_results(predictions_lbp_knn, testing_labels, 'LBP-KNN')


patch-260526.png was incorrectly classified.
patch-717001.png was incorrectly classified.
patch-162809.png was incorrectly classified.
patch-759385.png was incorrectly classified.
patch-423283.png was incorrectly classified.
patch-306264.png was incorrectly classified.
patch-120541.png was incorrectly classified.
patch-425816.png was incorrectly classified.
patch-330503.png was incorrectly classified.
patch-407692.png was incorrectly classified.
patch-658306.png was incorrectly classified.
patch-838954.png was incorrectly classified.
patch-549113.png was incorrectly classified.
patch-243842.png was incorrectly classified.
patch-154835.png was incorrectly classified.
patch-815756.png was incorrectly classified.
patch-802202.png was incorrectly classified.
patch-551206.png was incorrectly classified.
patch-144615.png was incorrectly classified.
patch-159888.png was incorrectly classified.
patch-614325.png was incorrectly classified.
patch-850142.png was incorrectly classified.
patch-5451

# Step 7: Reflections

Answer the following questions right here in this block:

- Discuss the performance difference between the two texture features.  Hypothesize reasons for the observed differences.
	
	_Your answer:_

- For each of your two classifiers, discuss the misclassified images.  Were there any classes that were particularly difficult to distinguish?  Do the misclassified images (over all classes) have anything in common that would cause them to be misclassified?  If so, what do they have in common, and why do you think it is confusing the classifier?

	_Your answer:_