# CSE527 Homework 3
**Due date: 23:59 on 11/06, 2018 (Tuesday)**

## Description
---
In this homework, we will examine the task of scene recognition starting with
very simple methods: tiny images and nearest neighbor classification, and then
move on to more advanced methods: bags of quantized local features and linear
classifiers learned by support vector machines.

Bag of words models are a popular technique for image classification inspired by
models used in natural language processing. The model ignores or downplays word
arrangement (spatial information in the image) and classifies based on a
histogram of the frequency of visual words. The visual word "vocabulary" is
established by clustering a large corpus of local features. See Szeliski chapter
14.4.1 for more details on category recognition with quantized features. In
addition, 14.3.2 discusses vocabulary creation and 14.1 covers classification
techniques.

For this homework you will be implementing a basic bag of words model. You will
classify scenes into one of 15 categories by training and testing on the 15
scene database (introduced in [Lazebnik et al.
2006](http://www.di.ens.fr/willow/pdfs/cvpr06b.pdf), although built on top of
previously published datasets).
[Lazebnik et al. 2006](http://www.di.ens.fr/willow/pdfs/cvpr06b.pdf) is a great
paper to read, although we will be implementing the baseline method the paper
discusses (equivalent to the zero level pyramid) and not the more sophisticated
spatial pyramid. For an excellent survey of
pre-deep-learning feature encoding methods for bag of words models, see
[Chatfield et al, 2011](http://www.robots.ox.ac.uk/~vgg/research/encoding_eval/).

You are required to implement 2 different image representations (tiny images and bags of SIFT features) and 2 different classification techniques (nearest neighbor and linear SVM). There are 3 problems plus a performance report in this homework, for a total of 100 points. One bonus question worth an extra 10 points is provided under Problem 3, so the maximum score for this homework is 100 + 10 = 110 points. Be sure to read **Submission Guidelines** below. They are important.

## Dataset
---
The starter code trains and tests on 100 images from each category (i.e. 1500
training examples total and 1500 test cases total). In a real research paper,
one would be expected to test performance on random splits of the data into
training and test sets, but the starter code does not do this to ease debugging.
Download the dataset
[here](https://drive.google.com/a/cs.stonybrook.edu/file/d/0B446EB1iI6_Qc0Q1NTRTajdUVTg/view?usp=sharing). <br>

Once downloaded, extract it to your root folder Surname_Givenname_SBUID. Under your root folder,
there should be a folder named "data" (i.e. XXX/Surname_Givenname_SBUID/data) containing the images.
**Delete** the data subfolder before submission; otherwise Blackboard will reject the upload because
of its size. There should be only one .ipynb file under your root folder Surname_Givenname_SBUID.


## Starter Code
---
To make your task a little easier, below we provide some starter code which
randomly guesses the category of every test image and achieves about 6.7% accuracy
(on average, 1 out of 15 guesses is correct).

In [4]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
import glob
import itertools
import sys
print(sys.version)
from sklearn import preprocessing 
from sklearn import neighbors
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification


2.7.15 | packaged by conda-forge | (default, Jul 27 2018, 15:07:58) [MSC v.1500 64 bit (AMD64)]


In [26]:
class_names = [name[11:] for name in glob.glob('data/train/*')] 
class_names = dict(zip(xrange(len(class_names)), class_names))

def load_dataset(path, num_per_class=-1):
    data = []
    labels = []
    for id, class_name in class_names.iteritems():
        img_path_class = glob.glob(path + class_name + '/*.jpg')
        if num_per_class > 0:
            img_path_class = img_path_class[:num_per_class]  # e.g. img_path_class[0:100]
        # add one label per loaded image, even when num_per_class is not set
        labels.extend([id]*len(img_path_class)) # labels 0, 0, 0, 0 
        for filename in img_path_class:
            data.append(cv2.imread(filename, 0))
    return data, labels

# load training dataset
train_data, train_label = load_dataset('data/train/', 100)
train_num = len(train_label) # 1500

# load testing dataset
test_data, test_label = load_dataset('data/test/', 100)
test_num = len(test_label) #1500

# feature extraction
def extract_feat(raw_data):
    feat_dim = 1000
    feat = np.zeros((len(raw_data), feat_dim), dtype=np.float32)
    for i in xrange(feat.shape[0]):
        feat[i] = np.reshape(raw_data[i], (raw_data[i].size))[:feat_dim] # dummy implementation
        
    return feat

train_feat = extract_feat(train_data)
test_feat = extract_feat(test_data)

# model training: take feature and label, return model
def train(X, Y):
    return 0 # dummy implementation

# prediction: take feature and model, return label
def predict(model, x):
    return np.random.randint(15) # dummy implementation

# evaluation
predictions = [-1]*len(test_feat)
for i in xrange(test_num):
    predictions[i] = predict(None, test_feat[i])
    
accuracy = sum(np.array(predictions) == test_label) / float(test_num)

print "The accuracy of my dummy model is {:.2f}%".format(accuracy*100)

The accuracy of my dummy model is 6.67%


## Problem 1: Tiny Image Representation + Nearest Neighbor Classifier
{25 points} You will start by implementing the tiny image representation and the nearest neighbor classifier. They are easy to understand, easy to implement, and run very quickly for our experimental setup.

The "tiny image" feature is one of the simplest possible image representations. One simply resizes each image to a small, fixed resolution. You are required to **resize the image to 16x16**. It works slightly better if the tiny image is made to have zero mean and unit length (normalization). This is not a particularly good representation, because it discards all of the high frequency image content and is not especially invariant to spatial or brightness shifts. We are using tiny images simply as a baseline.
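The normalization mentioned above can be sketched as follows. This is a minimal sketch, assuming the image has already been resized to 16x16 (for example with `cv2.resize`); the helper name `normalize_tiny_image` and the `eps` guard are illustrative choices and are not part of the starter code.

```python
import numpy as np

def normalize_tiny_image(tiny, eps=1e-12):
    """Flatten a tiny image and make it zero mean and unit L2 length.

    `tiny` is assumed to be an already-resized 2D array (e.g. 16x16).
    `eps` is a hypothetical guard against division by zero for flat images.
    """
    vec = np.asarray(tiny, dtype=np.float64).ravel()
    vec = vec - vec.mean()          # zero mean
    norm = np.linalg.norm(vec)      # L2 length of the centered vector
    return vec / max(norm, eps)     # unit length (all zeros if the image is flat)

# usage on a synthetic 16x16 "image"
rng = np.random.RandomState(0)
tiny = rng.randint(0, 256, size=(16, 16))
feat = normalize_tiny_image(tiny)
print(feat.shape)
```

Dividing by the L2 norm of the centered vector, rather than by a variance, is what gives the feature unit length.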

The nearest neighbor classifier is equally simple to understand. When tasked with classifying a test feature into a particular category, one simply finds the "nearest" training example (L2 distance is a sufficient metric) and assigns the label of that nearest training example to the test example. The nearest neighbor classifier has many desirable features — it requires no training, it can learn arbitrarily complex decision boundaries, and it trivially supports multiclass problems. It is quite vulnerable to training noise, though, which can be alleviated by voting based on the K nearest neighbors (but you are not required to do so). Nearest neighbor classifiers also suffer as the feature dimensionality increases, because the classifier has no mechanism to learn which dimensions are irrelevant for the decision.
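To make the rule concrete, here is a small NumPy-only sketch of a 1-nearest-neighbor classifier with L2 distance. It is an illustration under simplifying assumptions (all features fit in memory, ties go to the first closest example), and the function name `nearest_neighbor_predict` is hypothetical; in the homework itself the scikit-learn classes from the hints below can play this role.

```python
import numpy as np

def nearest_neighbor_predict(train_X, train_y, test_X):
    """Label each test row with the label of its closest training row (L2)."""
    train_X = np.asarray(train_X, dtype=np.float64)
    test_X = np.asarray(test_X, dtype=np.float64)
    # squared L2 distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2
    d2 = (np.sum(test_X ** 2, axis=1)[:, None]
          - 2.0 * test_X.dot(train_X.T)
          + np.sum(train_X ** 2, axis=1)[None, :])
    nearest = np.argmin(d2, axis=1)   # index of the closest training example
    return np.asarray(train_y)[nearest]

# toy data: two classes in 2D
train_X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
train_y = np.array([0, 0, 1])
test_X = np.array([[0.2, 0.1], [4.0, 4.5]])
pred = nearest_neighbor_predict(train_X, train_y, test_X)
print(pred)
```

Note that there is no training step: the "model" is simply the stored training features and labels.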

Report your classification accuracy on the test sets and time consumption.
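One simple way to collect both numbers is to wrap the prediction step in a timer. The sketch below is only an illustration: `timed`, `accuracy`, and `dummy_predict` are hypothetical helpers, and wall-clock time from `time.time()` is assumed to be precise enough for this report.

```python
import time
import numpy as np

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.time()
    result = fn(*args, **kwargs)
    return result, time.time() - start

def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(pred) == np.asarray(truth)))

def dummy_predict(n):
    # stand-in for a real classifier: always predicts class 0
    return np.zeros(n, dtype=int)

pred, seconds = timed(dummy_predict, 4)
acc = accuracy(pred, [0, 0, 1, 0])
print("accuracy {:.2f}%, time {:.3f}s".format(acc * 100, seconds))
```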

**Hints**:
- Use [cv2.resize()](https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html#resize) to resize the images;
- Use [NearestNeighbors in Sklearn](http://scikit-learn.org/stable/modules/neighbors.html) as your nearest neighbor classifier.

In [27]:
class_names = [name[11:] for name in glob.glob('data/train/*')] 
class_names = dict(zip(xrange(len(class_names)), class_names))

def load_dataset(path, num_per_class=-1):
    data = []
    labels = []
    for id, class_name in class_names.iteritems():
        img_path_class = glob.glob(path + class_name + '/*.jpg')
        if num_per_class > 0:
            img_path_class = img_path_class[:num_per_class]  
        # add one label per loaded image, even when num_per_class is not set
        labels.extend([id]*len(img_path_class)) 
        for filename in img_path_class:
            img=cv2.imread(filename,0)
            resize_img = cv2.resize(img, (16,16), interpolation=cv2.INTER_CUBIC)
            np_img=np.array(resize_img)
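            mean_image=np.mean(np_img)
            # NOTE: the variance is taken over the full-size image `img`, not the
            # 16x16 tiny image, so the feature is zero-mean but not unit-length
            var_image=np.var(img,ddof=1)
            normalized_one=np_img-mean_image
            final_image=np.divide(normalized_one,var_image)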
            data.append(final_image)   
    return data, labels

def flat_input (train_data):
    flat_train_data=[]
    for element in train_data:
        flat_element=element.ravel()
        flat_train_data.append(flat_element)
    X=np.array(flat_train_data)
    return X

def knearst_neighbor_report():
    train_data, train_label = load_dataset('data/train/', 100)
    train_num = len(train_label) 
    test_data, test_label = load_dataset('data/test/', 100)
    test_num = len(test_label) 
    X=flat_input(train_data)
    train_labels_np=np.array(train_label)
    y=np.transpose(train_labels_np)

    test_X=flat_input(test_data)
    test_labels_np=np.array(test_label)
    test_y=np.transpose(test_labels_np)
    
    n_neighbors = 15
    
    for weights in ['uniform', 'distance']:
        clf = neighbors.KNeighborsClassifier(n_neighbors, weights=weights)
        clf.fit(X, y) 
        Z = clf.predict(test_X)
        accuracy = sum(Z == test_labels_np) / float(test_num)
        print("the accuracy of using "+ weights + " nearest neighbor is "+ str(accuracy))
        
knearst_neighbor_report()

the accuracy of using uniform nearest neighbor is 0.195333333333
the accuracy of using distance nearest neighbor is 0.205333333333


## Problem 2: Bag of SIFT Representation + Nearest Neighbor Classifier
{35 points}
After you have implemented a baseline scene recognition pipeline it is time to
move on to a more sophisticated image representation — bags of quantized SIFT
features. Before we can represent our training and testing images as bag of
feature histograms, we first need to establish a vocabulary of visual words. We
will form this vocabulary by sampling many local features from our training set
(tens or hundreds of thousands) and then clustering them with k-means. The number of
k-means clusters is the size of our vocabulary and the size of our features. For
example, you might start by clustering many SIFT descriptors into k=50 clusters.
This partitions the continuous, 128 dimensional SIFT feature space into 50
regions. For any new SIFT feature we observe, we can figure out which region it
belongs to as long as we save the centroids of our original clusters. Those
centroids are our visual word vocabulary. Because it can be slow to sample and
cluster many local features, the starter code saves the cluster centroids and
avoids recomputing them on future runs.
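One way to avoid recomputing the vocabulary is to cache the centroids on disk. The following is a minimal sketch under that assumption; `load_or_build_vocab`, the file name `vocab_50.npy`, and the `fake_build` stand-in are hypothetical names chosen for illustration. In a real run the builder would be the k-means clustering step, for example `KMeans(n_clusters=50).fit(descriptors).cluster_centers_`.

```python
import os
import tempfile
import numpy as np

def load_or_build_vocab(path, build_fn):
    """Return cached centroids from `path`, or build them once and cache them."""
    if os.path.exists(path):
        return np.load(path)
    vocab = np.asarray(build_fn())   # expensive step, e.g. k-means clustering
    np.save(path, vocab)             # `path` should end in .npy
    return vocab

calls = []

def fake_build():
    # stand-in for the real clustering; records how often it is invoked
    calls.append(1)
    return np.zeros((50, 128), dtype=np.float32)

cache_path = os.path.join(tempfile.mkdtemp(), 'vocab_50.npy')
vocab_first = load_or_build_vocab(cache_path, fake_build)   # builds and saves
vocab_second = load_or_build_vocab(cache_path, fake_build)  # loads from disk
print(vocab_second.shape, len(calls))
```

If any parameter that affects the vocabulary changes (number of clusters, sampling density, and so on), the cached file has to be deleted or given a different name.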

Now we are ready to represent our training and testing images as histograms of
visual words. For each image we will densely sample many SIFT descriptors.
Instead of storing hundreds of SIFT descriptors, we simply count how many SIFT
descriptors fall into each cluster in our visual word vocabulary. This is done
by finding the nearest neighbor k-means centroid for every SIFT feature. Thus,
if we have a vocabulary of 50 visual words, and we detect 220 distinct SIFT
features in an image, our bag of SIFT representation will be a histogram of 50
dimensions where each bin counts how many times a SIFT descriptor was assigned
to that cluster. The total of all the bin-counts is 220. The histogram should be
normalized so that image size does not dramatically change the bag of features
magnitude.
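The histogram construction described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the required implementation: the function name `bag_of_words_histogram` is hypothetical, random arrays stand in for real SIFT descriptors and centroids, and each image is assumed to have at least one descriptor.

```python
import numpy as np

def bag_of_words_histogram(descriptors, centroids):
    """Normalized histogram of nearest-centroid assignments for one image."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    centroids = np.asarray(centroids, dtype=np.float64)
    # squared L2 distance from every descriptor to every centroid
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    words = np.argmin(d2, axis=1)                       # visual word per descriptor
    counts = np.bincount(words, minlength=len(centroids))
    return counts / float(counts.sum())                 # bins sum to 1

# synthetic stand-ins: 50 visual words, 220 descriptors of dimension 128
rng = np.random.RandomState(0)
centroids = rng.rand(50, 128)
descriptors = rng.rand(220, 128)
hist = bag_of_words_histogram(descriptors, centroids)
print(hist.shape)
```

Using `minlength` keeps every histogram at the vocabulary size even when some visual words never occur in an image.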

**Note**: 
- Instead of using SIFT to detect invariant keypoints, which is time-consuming,
  we recommend densely sampling keypoints on a grid with a fixed step
  size (sampling density) and scale.
- There are many design decisions and free parameters for the bag of SIFT
  representation (number of clusters, sampling density, sampling scales, SIFT
  parameters, etc.) so accuracy might vary from 50% to 60%.
- Indicate clearly the parameters you use along with the prediction accuracy
  on test set and time consumption.
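The dense sampling recommended in the note above could look roughly like this. The sketch only generates the grid coordinates with NumPy; the function name `dense_grid_points` and the `step` and `margin` values are assumptions for illustration. Each `(x, y)` pair could then be wrapped in a `cv2.KeyPoint` with a chosen size and passed to `sift.compute()`, as suggested in the hints below.

```python
import numpy as np

def dense_grid_points(height, width, step=8, margin=8):
    """Return an (N, 2) array of (x, y) grid locations inside the image.

    `step` is the sampling density in pixels; `margin` keeps points away
    from the border so a descriptor patch fits. Both are illustrative values.
    """
    ys = np.arange(margin, height - margin, step)
    xs = np.arange(margin, width - margin, step)
    grid_x, grid_y = np.meshgrid(xs, ys)
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)

# usage for a 64x96 (rows x cols) image
points = dense_grid_points(64, 96, step=8, margin=8)
print(points.shape)
```

A smaller step gives more descriptors per image, which tends to cost more time in both descriptor computation and clustering.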

**Hints**:
- Use [KMeans in Sklearn](http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html)
  to do clustering and find the nearest cluster centroid for each SIFT feature;
- Use `cv2.xfeatures2d.SIFT_create()` to create a SIFT object;
- Use `sift.compute()` to compute SIFT descriptors given densely sampled keypoints
  ([cv2.Keypoint](https://docs.opencv.org/3.0-beta/modules/core/doc/basic_structures.html?highlight=keypoint#keypoint)).

In [139]:
class_names = [name[11:] for name in glob.glob('data/train/*')] 
class_names = dict(zip(xrange(len(class_names)), class_names))

def load_dataset(path, num_per_class=-1):
    data = []
    labels = []
    row_count=0    
    for id, class_name in class_names.iteritems():
        img_path_class = glob.glob(path + class_name + '/*.jpg')
        if num_per_class > 0:
            img_path_class = img_path_class[:num_per_class]  
        # add one label per loaded image, even when num_per_class is not set
        labels.extend([id]*len(img_path_class)) 
        for filename in img_path_class:
            img=cv2.imread(filename,0)
            row_count=row_count+img.shape[0]
            data.append(img)  
    print(row_count)
    return data, labels

train_data, train_label = load_dataset('data/train/', 10) # load 10 image/class 
train_num = len(train_label)
test_data, test_label = load_dataset('data/test/', 100)
test_num = len(test_label) 

36221
363750


In [140]:
print(train_num)
print(test_num)

150
1500


In [141]:
print(len(train_data))
print(train_data[0])

150
[[159 142 114 ...,  56  49  39]
 [220 234 230 ...,  63  53  46]
 [220 236 238 ...,  67  59  51]
 ..., 
 [ 42  36  37 ...,  31  28  23]
 [ 38  28  38 ...,  26  31  29]
 [ 41  43  45 ...,  31  34  35]]


In [119]:
train_data_image=np.array_split(train_data, 15) # divide to 15 groups each has 10 picture of the 
# test_data_image=np.array_split(test_data, 15) #divide to 15 groups each has 100 pictures of the group

In [120]:
print(type(train_data_image))
print(type(test_data))

<type 'list'>
<type 'list'>


In [121]:
print(len(train_data_image))
print(len(train_data_image[0])) 
print(train_data_image[0].shape)

15
100
(100L,)


In [122]:
def calculate_descriptors_for_Kmeans_train(data):
    sift = cv2.xfeatures2d.SIFT_create()
    all_descriptors=[]
    count=0
    for image in data:
        kp = sift.detect(image,None)  # key points 
        kp,des = sift.compute(image,kp) # calculate the descriptors 
        count=count+des.shape[0]
        all_descriptors.append(des)
    stack_array=all_descriptors[0]
    for i in range(1, len(all_descriptors)):
        stack_array=np.concatenate((stack_array, all_descriptors[i]), axis=0)
    return stack_array
def calculate_descriptors_for_Kmeans_test(data):
    sift = cv2.xfeatures2d.SIFT_create()
    kp = sift.detect(data,None)  # key points 
    kp,des = sift.compute(data,kp) # calculate the descriptors 
    return des

In [123]:
print(len(test_data))

1500


In [124]:
test_data_stack=[]   # each image 
for elements in test_data:
    test_data_stack.append(calculate_descriptors_for_Kmeans_test(elements))

KeyboardInterrupt: 

In [None]:
print(len(test_data_stack))
print(test_data_stack[0].shape)

In [None]:
train_data_stack=[]
for elements in train_data_image:
    train_data_stack.append(calculate_descriptors_for_Kmeans_train(elements))

In [None]:
# for elements in train_data_stack:
#     print(elements.shape)

In [None]:
kmeans_array=[]
kmeans_labels=[]
kmeans_centers=[]
for elements in train_data_stack:
    kmeans = KMeans(n_clusters=50, random_state=0).fit(elements)
    kmeans_array.append(kmeans)
    kmeans_labels.append(kmeans.labels_)
    kmeans_centers.append(kmeans.cluster_centers_)    

In [None]:
print(len(kmeans_array))

In [None]:
# print(kmeans_labels[0])
# print(kmeans_labels[1])

In [None]:
for kmeans in kmeans_array:
    for test_descriptor in test_data_stack:
        # assign every descriptor of this test image to its nearest centroid
        word_ids=kmeans.predict(test_descriptor)

In [None]:
stack_array=calculate_descriptors_for_Kmeans(train_data)

In [None]:
sift = cv2.xfeatures2d.SIFT_create()
kp = sift.detect(test_data[0],None)  # key points 
kp,des = sift.compute(test_data[0],kp) # calculate the descriptors 
print(des.shape)

In [None]:
stack_array_test=calculate_descriptors_for_Kmeans(test_data)

In [None]:
kmeans = KMeans(n_clusters=50, random_state=0).fit(stack_array)

In [None]:
print(type(kmeans))
print(kmeans)

In [None]:
test_predict=kmeans.predict(stack_array_test)
print(len(test_predict))
print(stack_array_test.shape)

In [None]:
def hist_gen(arr, numPerChunk): 
    # Split the flat array of cluster ids into numPerChunk equal pieces and
    # count the ids in each piece. NOTE: equal pieces match images only if
    # every image contributes the same number of descriptors, and np.unique
    # omits empty bins, so the returned count vectors can differ in length.
    single_image=np.array_split(arr, numPerChunk)
    hist=[]
    for elements in single_image:
        unique, counts = np.unique(elements, return_counts=True)
        hist.append(counts)
    return hist

In [None]:
hist=hist_gen(kmeans.labels_,1500)
hist_test=hist_gen(test_predict,1500)

In [110]:
print(hist[0])
test_labels_np=np.array(test_label)

[ 1  3 21  8  2  9  5  6  8  6  5  3  9  2  9  5  5  5  4  6  5  3  1  5  3
  9  4  7  7 16  4  1  2  7  4  2  5  3  3  9  8  2  1  3  4  5  6  1  1  4]


### My original method

In [146]:
class_names = [name[11:] for name in glob.glob('data/train/*')] 
class_names = dict(zip(xrange(len(class_names)), class_names))

def load_dataset(path, num_per_class=-1):
    data = []
    labels = []
    row_count=0    
    for id, class_name in class_names.iteritems():
        img_path_class = glob.glob(path + class_name + '/*.jpg')
        if num_per_class > 0:
            img_path_class = img_path_class[:num_per_class]  
        # add one label per loaded image, even when num_per_class is not set
        labels.extend([id]*len(img_path_class)) 
        for filename in img_path_class:
            img=cv2.imread(filename,0)
            row_count=row_count+img.shape[0]
            data.append(img)  
    print(row_count)
    return data, labels

train_data, train_label = load_dataset('data/train/', 10) # load 10 image/class 
train_num = len(train_label)
test_data, test_label = load_dataset('data/test/', 100)
test_num = len(test_label) 

36221
363750


In [147]:
def calculate_descriptors_for_Kmeans(data):
    sift = cv2.xfeatures2d.SIFT_create()
    all_descriptors=[]
    count=0
    for image in data:
        kp = sift.detect(image,None)  # key points 
        kp,des = sift.compute(image,kp) # calculate the descriptors 
        count=count+des.shape[0]
        all_descriptors.append(des)
    stack_array=all_descriptors[0]
    for i in range(1, len(all_descriptors)):
        stack_array=np.concatenate((stack_array, all_descriptors[i]), axis=0)
    return stack_array


In [148]:
train_data_stack=calculate_descriptors_for_Kmeans(train_data)
# test_data_stack=calculate_descriptors_for_Kmeans(test_data)

In [149]:
kmeans = KMeans(n_clusters=50, random_state=0).fit(train_data_stack) 

In [150]:
def calculate_histgram(data):
    sift = cv2.xfeatures2d.SIFT_create()
    hist=[]
    for image in data:
        kp = sift.detect(image,None)  # key points 
        kp,des = sift.compute(image,kp) # calculate the descriptors 
        cluster_group=np.array(kmeans.predict(des))  # visual word of each descriptor
        unique, counts = np.unique(cluster_group, return_counts=True)
        total_sum=np.sum(counts,dtype=float)        
        nor_hist=np.divide(counts, total_sum)  # normalize so the bins sum to 1
        dictionary = dict(zip(unique, nor_hist))
        # emit the 50 bins in a fixed order instead of relying on dict ordering;
        # visual words that never occur in the image get 0
        hist.append([dictionary.get(i, 0) for i in range(0,50)])
    return hist
        
   

In [151]:
train_data_all, train_label_all = load_dataset('data/train/', 100) # load 100 images/class 
train_hist=calculate_histgram(train_data_all)

364142


In [152]:
test_hist=calculate_histgram(test_data)

In [153]:
print(len(test_hist))
print(test_hist[0])

1500
[0.0546875, 0.018229166666666668, 0, 0.0390625, 0.005208333333333333, 0.020833333333333332, 0.013020833333333334, 0.020833333333333332, 0.0390625, 0.03125, 0.010416666666666666, 0.013020833333333334, 0.0234375, 0.015625, 0.013020833333333334, 0.005208333333333333, 0.015625, 0.005208333333333333, 0.026041666666666668, 0.010416666666666666, 0.010416666666666666, 0.013020833333333334, 0.0078125, 0.0546875, 0.044270833333333336, 0.041666666666666664, 0.010416666666666666, 0.018229166666666668, 0.013020833333333334, 0.015625, 0.010416666666666666, 0.013020833333333334, 0.010416666666666666, 0.010416666666666666, 0.018229166666666668, 0.03125, 0.020833333333333332, 0.03125, 0.010416666666666666, 0.0234375, 0.046875, 0.020833333333333332, 0.018229166666666668, 0, 0.010416666666666666, 0.0234375, 0.036458333333333336, 0.026041666666666668, 0.010416666666666666, 0.018229166666666668]


In [154]:
test_labels_np=np.array(test_label)
n_neighbors=15
X=np.array(train_hist)
print(X.shape)
train_labels_np=np.array(train_label_all)
y=np.transpose(train_labels_np)
print(y.shape)
for weights in ['uniform', 'distance']:
    clf = neighbors.KNeighborsClassifier(n_neighbors, weights=weights)
    clf.fit(X, y) 
    Z = clf.predict(test_hist)
    accuracy = sum(Z == test_labels_np) / float(test_num)
    print("the accuracy of "+ weights + " is "+ str(accuracy))
        

(1500L, 50L)
(1500L,)
the accuracy of uniform is 0.401333333333
the accuracy of distance is 0.409333333333


## Problem 3: Bag of SIFT Representation + one-vs-all SVMs
{20 points}
The last task is to train one-vs-all linear SVMs to operate in the bag of SIFT
feature space. Linear classifiers are one of the simplest possible learning
models. The feature space is partitioned by a learned hyperplane and test cases
are categorized based on which side of that hyperplane they fall on. Despite
this model being far less expressive than the nearest neighbor classifier, it
will often perform better.

You do not have to implement the support vector machine yourself. However, linear
classifiers are inherently binary and we have a 15-way classification problem
(the library handles the multiclass extension for you). To decide which of 15 categories a test
case belongs to, you will train 15 binary, one-vs-all SVMs. One-vs-all means
that each classifier will be trained to recognize 'forest' vs 'non-forest',
'kitchen' vs 'non-kitchen', etc. All 15 classifiers will be evaluated on each
test case and the classifier which is most confidently positive "wins". E.g. if
the 'kitchen' classifier returns a score of -0.2 (where 0 is on the decision
boundary), and the 'forest' classifier returns a score of -0.3, and all of the
other classifiers are even more negative, the test case would be classified as a
kitchen even though none of the classifiers put the test case on the positive
side of the decision boundary. When learning an SVM, you have a free parameter
$\lambda$ (lambda) which controls how strongly regularized the model is (in
scikit-learn's `LinearSVC` the corresponding parameter is `C`, and a smaller `C`
means stronger regularization). Your accuracy will be very sensitive to
$\lambda$, so be sure to try many values.
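The "most confidently positive wins" rule amounts to taking the argmax over the per-class scores. The sketch below illustrates that rule with hand-written scores; `one_vs_all_predict` is a hypothetical helper, and only three of the 15 classes are shown. With scikit-learn, a score matrix of this shape would typically come from the classifier's `decision_function`.

```python
import numpy as np

def one_vs_all_predict(scores, class_names):
    """Pick, for each test case, the class whose binary SVM score is highest.

    `scores` has one row per test case and one column per one-vs-all
    classifier; 0 is the decision boundary.
    """
    scores = np.asarray(scores, dtype=np.float64)
    winners = np.argmax(scores, axis=1)
    return [class_names[i] for i in winners]

# the example from the text: every score is negative, 'kitchen' is least negative
class_names = ['kitchen', 'forest', 'street']
scores = [[-0.2, -0.3, -1.1]]
result = one_vs_all_predict(scores, class_names)
print(result)
```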

Indicate clearly the parameters you use along with the prediction accuracy on
test set and time consumption.

**Bonus {10 points}**: 10 points will be given to students whose accuracy
  ranks in the top 3 for this homework. Don't cheat and don't train your model on
  testing data; a separate testing dataset will be used to evaluate your model.

**Hints**:
- Use SVM in
  [Sklearn](http://scikit-learn.org/stable/modules/classes.html#module-sklearn.svm)
  (recommended) or
  [OpenCV](https://docs.opencv.org/3.0-alpha/modules/ml/doc/support_vector_machines.html)
  to do training and prediction.

In [158]:
test_labels_np=np.array(test_label)
X=np.array(train_hist)
train_labels_np=np.array(train_label_all)
y=np.transpose(train_labels_np)

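clf = LinearSVC(random_state=0, tol=1e-5)  # C is left at its default of 1.0
clf.fit(X, y)
# NOTE: the expression below constructs a second, unfitted LinearSVC with C=0.7
# and discards it; it does not change `clf` or the accuracy reported below.
LinearSVC(C=0.7, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=0, tol=1e-05, verbose=0)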
Z=clf.predict(test_hist)
accuracy = sum(Z == test_labels_np) / float(test_num)
print("the accuracy of linear SVM is "+ str(accuracy))

the accuracy of linear SVM is 0.421333333333


## Performance Report
---
{20 points}
Please report the performance of the following combinations **in the given order**
in terms of the time consumed and classification accuracy. Describe your algorithm,
any decisions you made to write your algorithm in your particular way, and how
different choices you made affect it. Compute and draw a (normalized) [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix), and discuss
where the method performs best and worst for each combination.
Here is an [example](http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py) of how to compute a confusion matrix.
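For reference, a row-normalized confusion matrix can also be computed directly with NumPy. This is a small sketch on toy labels, with the hypothetical helper name `normalized_confusion`; rows are true classes and columns are predicted classes, so the diagonal holds the per-class accuracy.

```python
import numpy as np

def normalized_confusion(true_labels, pred_labels, num_classes):
    """Confusion matrix with each row (true class) scaled to sum to 1."""
    cm = np.zeros((num_classes, num_classes), dtype=np.float64)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    row_sums = cm.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0     # leave rows of unseen classes at zero
    return cm / row_sums

# toy example with 3 classes and 2 test cases per class
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
cm = normalized_confusion(true_labels, pred_labels, 3)
print(cm)
print(np.diag(cm))   # per-class accuracy
```

Large off-diagonal entries point at the pairs of categories the method confuses most often.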


1st: Tiny images representation and nearest neighbor classifier (accuracy of about 18-25%).<br>
2nd: Bag of SIFT representation and nearest neighbor classifier (accuracy of about 50-60%). <br>
3rd: Bag of SIFT representation and linear SVM classifier (accuracy of about 60-70%). <br>

**First combination:** <br>

-- Time consumed and prediction accuracy

-- Algorithm descriptions and discussions

-- Confusion matrix observations

**Second combination:** <br>

-- Time consumed and prediction accuracy

-- Algorithm descriptions and discussions

-- Confusion matrix observations

**Third combination:** <br>

-- Time consumed and prediction accuracy

-- Algorithm descriptions and discussions

-- Confusion matrix observations

In [None]:
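from sklearn.metrics import confusion_matrix

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        # divide each row by its total so every true class sums to 1
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=90)
    plt.yticks(tick_marks, classes)
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()

c_names = [name[11:] for name in glob.glob('data/train/*')]

# pred1/label1, pred2/label2 and pred3/label3 are placeholders for the
# predictions and true test labels of each combination; set them first.

#First combination:
# Confusion matrix (sklearn expects the true labels first)
cm1 = confusion_matrix(label1, pred1)
plt.figure(figsize=(12,12))
plot_confusion_matrix(cm1, c_names, normalize=True)
plt.show()

#Second combination:
# Confusion matrix
cm2 = confusion_matrix(label2, pred2)
plt.figure(figsize=(12,12))
plot_confusion_matrix(cm2, c_names, normalize=True)
plt.show()

#Third combination:
# Confusion matrix
cm3 = confusion_matrix(label3, pred3)
plt.figure(figsize=(12,12))
plot_confusion_matrix(cm3, c_names, normalize=True)
plt.show()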

## Submission guidelines
---
Extract the downloaded .zip file to a folder of your preference. The input and output paths are predefined; **DO NOT** change them. The image read and write functions are already written for you. 

When submitting your .zip file through Blackboard, please <br> 
-- name your .zip file as Surname_Givenname_SBUID (example: Trump_Donald_11113456). <br>
-- DO NOT change the folder structure; just fill in the blanks. <br>

You are encouraged to post and answer questions on Piazza. Because of the volume of email I have received in past years, I unfortunately won't be able to reply to all of your emails. Please ask questions on Piazza and send email only for private matters.

To encourage you to answer questions on Piazza, the three people who answer the most questions will be awarded an extra 5 points at the end of the semester.

If you alter the folder structure, the grading of your homework will be significantly delayed and possibly penalized. I **WILL NOT** reply to any email regarding this matter.

Be aware that your code will be run through a plagiarism checker both vertically and horizontally. Please do your own work.

Late submission penalty: <br>
There will be a 10% penalty per day for late submission. However, you will have 3 days throughout the whole semester to submit late without penalty. Note that the grace period is calculated by days instead of hours. If you submit the homework one minute after the deadline, one late day will be counted. Likewise, if you submit one minute after the deadline without using the grace period, the 10% penalty will be imposed. All late penalties incurred will be applied to your scores at the end of the semester.

Some important things to note: <br>
A correct pipeline for your submitted folder structure: <br>
1) Download the .zip file from blackboard and unzip it (e.g. CSE527-HW1-Fall18.zip) <br>
2) The unzipped folder should have a name like CSE527-HW1-Fall18; rename it to Surname_Givenname_SBUID <br>
3) Write your code in the given .ipynb file <br>
4) Save the visual outputs in the .ipynb file <br>
5) Rezip your Surname_Givenname_SBUID folder and submit <br>

**2 credits will be deducted** from HW2 onwards if: <br>
1) The unzipped folder still has a name like CSE527-HW1-Fall18 <br>
2) There is a nested folder named CSE527-HW1-Fall18 under your Surname_Givenname_SBUID folder <br>
3) You zipped sub-folders and .ipynb directly without providing a root folder called Surname_Givenname_SBUID <br>
4) There is more than one .ipynb file under your folder (people who did this didn't receive a score for HW1) <br>
5) The naming didn't conform to Surname_Givenname_SBUID <br>
6) You didn't save the visual outputs inside your .ipynb file <br>