# AS 6 Image Classification

Richard Yang

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
print(tf.__version__)

2.12.0


## 1. Data Processing: 

In [2]:
# Build the ImageDataGenerator: rescale pixels to [0, 1] and apply random augmentation
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2,
                                   zoom_range=0.2, horizontal_flip=True)



In [3]:
# Build the training set
trainGen = train_datagen.flow_from_directory('C:/Users/Richa/MLcode/week6/dataset_train/dataset_train',  # this is the target directory
                                        target_size=(64, 64),  
                                        batch_size=32,
                                        class_mode='categorical')

Found 88 images belonging to 4 classes.


In [4]:
# What is the image shape of each training observation?

#shape of the image
n_shape = trainGen.image_shape
n_shape

(64, 64, 3)

In [5]:
#classes of the image
n_classes = np.unique(trainGen.classes)
n_classes

array([0, 1, 2, 3])

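Each training observation has shape (64, 64, 3): a 64x64 pixel image with 3 color channels (RGB). The image files store pixel values as uint8, which means the values range from 0 to 255, but because the generator was built with rescale=1./255, the batches it yields are floating-point arrays with values between 0 and 1.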

There are **4** classes in the train data

## 2. Initial Classifier Build: 

In [6]:
# Build the CNN
def build_classifier():
    model = keras.Sequential()
    model.add(keras.layers.Conv2D(filters = 32,
                                    kernel_size = (3,3),
                                    input_shape = n_shape,
                                    activation = 'relu'))
    model.add(keras.layers.MaxPooling2D(pool_size = (2,2)))
    model.add(keras.layers.Conv2D(filters = 64,
                                    kernel_size = (3,3),
                                    activation = 'relu'))
    model.add(keras.layers.MaxPooling2D(pool_size = (2,2)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units = 128, activation = 'relu'))
    model.add(keras.layers.Dense(units = len(n_classes), activation = 'softmax'))
    return model

In [7]:
# build the classifier

model1 = build_classifier()
model1.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

model1.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 62, 62, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 31, 31, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 29, 29, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 14, 14, 64)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 12544)             0         
                                                                 
 dense (Dense)               (None, 128)               1
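
The shapes and parameter counts in the visible rows of the summary can be checked by hand. The sketch below assumes the Keras defaults used in build_classifier (stride 1 and no padding for Conv2D, stride equal to the pool size for MaxPooling2D). The helper names `window_out` and `conv_params` are illustrative and not part of Keras.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Output height/width after sliding a square window over the input."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Weights of a square-kernel Conv2D plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


# (layer name, window size, stride) in the order used by build_classifier
layers = [
    ("conv2d", 3, 1),
    ("max_pooling2d", 2, 2),
    ("conv2d_1", 3, 1),
    ("max_pooling2d_1", 2, 2),
]

size = 64  # input images are 64x64
shapes = []
for name, kernel, stride in layers:
    size = window_out(size, kernel, stride)
    shapes.append((name, size))

flattened = size * size * 64  # the last conv layer has 64 filters

print(shapes)
print("flatten:", flattened)
print("conv2d params:", conv_params(3, 3, 32))
print("conv2d_1 params:", conv_params(3, 32, 64))
```

The values agree with the summary rows for the two convolution layers, the two pooling layers, and the flatten layer.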

## 3. Model Runs: 

In [8]:
# a) Use .fit() with the training set. For the first run, use the following parameters: steps_per_epoch = 3, epochs = 3

# fit() accepts the generator directly; fit_generator is deprecated in TF 2.x
model_my = model1.fit(trainGen, steps_per_epoch = 3, epochs = 3)


Epoch 1/3
Epoch 2/3
Epoch 3/3


In [9]:
# save the model
model1.save('model_my.h5')
print("Saved model")

Saved model


In [10]:
# c) Predict using the model built in step 2.

import os, glob
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import load_model

# load the saved model; load_model returns a compiled model
# identical to the one saved above
model = load_model('model_my.h5')
print("Saved model")

# test data path
img_dir = "C:/Users/Richa/MLcode/week6/dataset_test" # Enter Directory of test set

# collect the test images ('*g' matches file names ending in g, e.g. .png, .jpg)
# sorted() keeps the file order stable so it matches the manually assigned labels
data_path = os.path.join(img_dir, '*g')
files = sorted(glob.glob(data_path))

# print the files in the dataset_test folder 
for f in files:
    print(f)
    
# make a prediction for each test image and add it to results 
data = []
results = []
for f1 in files:
    img = image.load_img(f1, target_size = (64, 64))
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis = 0)  # add a batch dimension: (1, 64, 64, 3)
    data.append(img)
    result = model.predict(img)
    r = np.argmax(result, axis=1)  # index of the class with the highest probability
    results.append(r)

results

Saved model
C:/Users/Richa/MLcode/week6/dataset_test\1022.png
C:/Users/Richa/MLcode/week6/dataset_test\1053.png
C:/Users/Richa/MLcode/week6/dataset_test\4011.png
C:/Users/Richa/MLcode/week6/dataset_test\4053.png
C:/Users/Richa/MLcode/week6/dataset_test\6023.png
C:/Users/Richa/MLcode/week6/dataset_test\6051.png
C:/Users/Richa/MLcode/week6/dataset_test\C014.png
C:/Users/Richa/MLcode/week6/dataset_test\C033.png


[array([0], dtype=int64),
 array([0], dtype=int64),
 array([0], dtype=int64),
 array([2], dtype=int64),
 array([1], dtype=int64),
 array([1], dtype=int64),
 array([1], dtype=int64),
 array([3], dtype=int64)]
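
One thing worth checking in the prediction loop: the training generator was built with rescale=1./255, while the loop passes the raw output of img_to_array (values from 0 to 255) to the model. A minimal sketch of a helper that applies the same scaling before prediction is shown below. The name `to_model_input` is hypothetical, and a random array stands in for a loaded test image, so this only illustrates the preprocessing step and says nothing about how the reported predictions would change.

```python
import numpy as np


def to_model_input(pixels):
    """Scale raw 0-255 pixel values to [0, 1] and add a leading batch axis."""
    arr = np.asarray(pixels, dtype="float32") / 255.0
    return np.expand_dims(arr, axis=0)


# Stand-in for image.img_to_array(img): a random 64x64 RGB image (assumption).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(64, 64, 3))

batch = to_model_input(fake_image)
print(batch.shape, batch.dtype)
print(float(batch.min()), float(batch.max()))
```

In the notebook, the equivalent change would be to divide the array returned by img_to_array by 255 before calling model.predict.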

d) Determine accuracy.

Note: To determine accuracy, you will need to check the labels given to each class in the training data and manually label your test data. This requires looking at the training images in the dataset_train folder and then checking how each category was coded in Keras using the following code:

In [11]:
# check category labels in training_set
trainGen.class_indices

{'category 1': 0, 'category 2': 1, 'category 3': 2, 'category 4': 3}

In [17]:
# According to the results, the predicted labels are:
predicted_label = [0,0,0,2,1,1,1,3]

# Actual labels, found by manually checking the test images
actual_label = [0,0,2,2,1,1,3,3]

# Compare the predicted values to the actual values for the test set and calculate the accuracy score
from sklearn.metrics import accuracy_score
accuracy_score(actual_label, predicted_label)

0.75
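
The score can also be reproduced without scikit-learn, which makes it easy to see which test images were misclassified. This sketch reuses the label lists and the class_indices mapping shown above; the variable names `index_to_class` and `misclassified` are illustrative.

```python
class_indices = {'category 1': 0, 'category 2': 1, 'category 3': 2, 'category 4': 3}
# Invert the mapping so class indices can be shown by name
index_to_class = {idx: name for name, idx in class_indices.items()}

predicted_label = [0, 0, 0, 2, 1, 1, 1, 3]
actual_label = [0, 0, 2, 2, 1, 1, 3, 3]

# Accuracy is the share of positions where prediction and label agree
matches = [p == a for p, a in zip(predicted_label, actual_label)]
accuracy = sum(matches) / len(actual_label)

# (position in the test list, actual class, predicted class) for each error
misclassified = [
    (i, index_to_class[a], index_to_class[p])
    for i, (p, a) in enumerate(zip(predicted_label, actual_label))
    if p != a
]

print(accuracy)
print(misclassified)
```

Six of the eight predictions match, and the two errors are the third and seventh test images.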

e) Run this process for the following combinations:

* (steps_per_epoch: 1, epochs: 1)
* (steps_per_epoch: 1, epochs: 2)
* (steps_per_epoch: 1, epochs: 3)
* (steps_per_epoch: 2, epochs: 4)
* (steps_per_epoch: 2, epochs: 5)
* (steps_per_epoch: 2, epochs: 6)
* (steps_per_epoch: 3, epochs: 7)
* (steps_per_epoch: 3, epochs: 8)
* (steps_per_epoch: 5, epochs: 9)
* (steps_per_epoch: 5, epochs: 10)

In [13]:
steps_per_epoch = [1, 1, 1, 2, 2, 2, 3, 3, 5, 5]
epochs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
score = []

for i in range(0,len(steps_per_epoch)):
    # build and compile a fresh (untrained) model for each combination
    model1 = build_classifier()
    model1.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])
    # fit() accepts the generator directly; fit_generator is deprecated in TF 2.x
    model1.fit(trainGen,steps_per_epoch=steps_per_epoch[i], epochs=epochs[i])
    # save the trained model, then reload it for prediction
    model_name = "model_" +str(steps_per_epoch[i])+"_"+str(epochs[i])
    model1.save(model_name)
    model = load_model(model_name)

    # make a prediction for each test image and add it to results
    data = []
    results = []
    for f1 in files:
        img = image.load_img(f1, target_size = (64, 64))
        img = image.img_to_array(img)
        img = np.expand_dims(img, axis = 0)
        data.append(img)
        result = model.predict(img)
        r = np.argmax(result, axis=1)
        results.append(r)

    # flatten the one-element arrays into a single list of class indices
    results = list(np.concatenate(results))
    
    score.append([steps_per_epoch[i], epochs[i], results])






INFO:tensorflow:Assets written to: model_1_1\assets

Epoch 1/2
Epoch 2/2
INFO:tensorflow:Assets written to: model_1_2\assets

Epoch 1/3
Epoch 2/3
Epoch 3/3
INFO:tensorflow:Assets written to: model_1_3\assets

Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
INFO:tensorflow:Assets written to: model_2_4\assets

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
INFO:tensorflow:Assets written to: model_2_5\assets

Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
INFO:tensorflow:Assets written to: model_2_6\assets

Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
INFO:tensorflow:Assets written to: model_3_7\assets

Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
INFO:tensorflow:Assets written to: model_3_8\assets

Epoch 1/9
INFO:tensorflow:Assets written to: model_5_9\assets

Epoch 1/10
INFO:tensorflow:Assets written to: model_5_10\assets




In [14]:
list_accuracy = []
for idx in range(0,len(steps_per_epoch)):
    accuracy=0
    for j in range(0,len(actual_label)):
        if score[idx][2][j] == actual_label[j]:
            accuracy += 1
    list_accuracy.append(accuracy)

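Explain what the above for loop does:

For each (steps_per_epoch, epochs) combination, the loop compares the predictions stored in score with actual_label, one test image at a time, and counts the matches. The count of correct predictions for each model is appended to list_accuracy; the next cell converts it to a proportion by dividing by the 8 test images.

steps_per_epoch is the number of batches of training images that go through the model before an epoch is considered finished, and epochs is the number of such rounds. An epoch is one full pass over the training set only when steps_per_epoch covers every batch; here a full pass takes 3 batches (88 images at a batch size of 32).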
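
To make the relationship concrete, the sketch below works out how many weight updates each combination performs, using the 88 training images and the batch size of 32 reported earlier. One update per step is how Keras trains by default; the "approximate passes" figure is only a rough guide, since it assumes the generator cycles through its batches in order across epochs.

```python
import math

n_images = 88    # "Found 88 images" reported by flow_from_directory
batch_size = 32  # batch_size passed to flow_from_directory

# Batches needed to see every training image once: 32 + 32 + 24 images
batches_per_full_pass = math.ceil(n_images / batch_size)

steps_per_epoch = [1, 1, 1, 2, 2, 2, 3, 3, 5, 5]
epochs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

updates = []
for steps, n_epochs in zip(steps_per_epoch, epochs):
    total_updates = steps * n_epochs  # one weight update per batch
    updates.append(total_updates)
    approx_passes = total_updates / batches_per_full_pass
    print(f"steps={steps} epochs={n_epochs} "
          f"updates={total_updates} approx_passes={approx_passes:.1f}")
```

The shortest run makes a single update on one batch, while the longest makes 50, so the ten configurations differ widely in how much training the model receives.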

In [15]:
data = {'steps_per_epoch' : steps_per_epoch, 'epoch': epochs,'accuracy' : list_accuracy}
df_accuracy = pd.DataFrame(data)
df_accuracy['accuracy'] = (df_accuracy['accuracy']/8)
df_accuracy

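|   | steps_per_epoch | epoch | accuracy |
|---|-----------------|-------|----------|
| 0 | 1 | 1 | 0.25 |
| 1 | 1 | 2 | 0.625 |
| 2 | 1 | 3 | 0.25 |
| 3 | 2 | 4 | 0.875 |
| 4 | 2 | 5 | 0.75 |
| 5 | 2 | 6 | 0.875 |
| 6 | 3 | 7 | 0.75 |
| 7 | 3 | 8 | 0.875 |
| 8 | 5 | 9 | 0.625 |
| 9 | 5 | 10 | 0.5 |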


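Accuracy does not increase steadily with steps_per_epoch and epochs. The three shortest runs (1 step per epoch) score lowest overall (0.25, 0.625, 0.25), the runs with 2 or 3 steps per epoch score between 0.75 and 0.875, and the two longest runs (5 steps per epoch) drop back to 0.625 and 0.5. More training lets the model learn more from the training data up to a point, and the later drop may reflect overfitting. With only 8 test images, each prediction shifts accuracy by 0.125, and every run starts from a freshly initialized model, so part of the variation is noise.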
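
The pattern in the table can be summarized with a few lines of plain Python. The sketch below copies the values from the table as (steps_per_epoch, epochs, accuracy) tuples; the variable names are illustrative.

```python
# (steps_per_epoch, epochs, accuracy) copied from the table above
results = [
    (1, 1, 0.25), (1, 2, 0.625), (1, 3, 0.25),
    (2, 4, 0.875), (2, 5, 0.75), (2, 6, 0.875),
    (3, 7, 0.75), (3, 8, 0.875),
    (5, 9, 0.625), (5, 10, 0.5),
]

n_test = 8  # number of test images

# Highest accuracy and the configurations that reach it
best = max(acc for _, _, acc in results)
best_configs = [(s, e) for s, e, acc in results if acc == best]

# Convert each proportion back to a count of correct predictions
correct = [round(acc * n_test) for _, _, acc in results]

print("best accuracy:", best, "at", best_configs)
print("correct predictions per run:", correct)
```

Because each test image is worth 0.125, the gap between the best runs and the 0.75 runs is a single image.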

## Conceptual Questions:

4. Discuss the effect of the following on accuracy and loss (train & test): 

- Increasing the steps_per_epoch
- Increasing the number of epochs

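Increasing steps_per_epoch means each epoch processes more batches, so the model makes more weight updates per epoch. Training loss generally falls and training accuracy rises. Once steps_per_epoch exceeds the number of batches in the training set (3 here), the extra steps draw on the same images again, with new random augmentations.

Increasing the number of epochs means the model sees the training data more times. Training loss keeps falling and training accuracy keeps rising, but test accuracy typically improves only up to a point and can then plateau or decline as the model overfits, which is a real risk with only 88 training images. In the table above, test accuracy peaks at 0.875 and falls to 0.5 for the longest run.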

5. Name two uses of zero padding in CNN.

- Preserving spatial size. Padding the input with zeros lets a convolutional layer produce an output with the same height and width as its input, so the feature maps do not shrink at every layer and more layers can be stacked.

- Preserving information at the borders. Without padding, pixels at the edges and corners are covered by fewer kernel positions than pixels in the center, so they contribute less to the output. Zero padding lets the kernel be centered on border pixels as well.
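
Both effects follow from the standard output-size formula, floor((n + 2p - k) / s) + 1, for input size n, kernel size k, padding p, and stride s. The sketch below is a worked example under assumed values (a 64-pixel input and 3x3 kernels with stride 1); the function names are illustrative.

```python
def conv_output_size(n, kernel, padding=0, stride=1):
    """Output length along one axis of a convolution."""
    return (n + 2 * padding - kernel) // stride + 1


def coverage(i, n, kernel, padding=0):
    """How many kernel positions (stride 1) include input position i."""
    out = conv_output_size(n, kernel, padding)
    first = max(0, i + padding - kernel + 1)
    last = min(out - 1, i + padding)
    return last - first + 1


# Use 1: padding of (k - 1) / 2 keeps the size unchanged for odd kernels
size_no_pad = conv_output_size(64, kernel=3)
size_padded = conv_output_size(64, kernel=3, padding=1)

# Ten stacked 3x3 convolutions without padding lose 2 pixels each
stacked = 64
for _ in range(10):
    stacked = conv_output_size(stacked, kernel=3)

# Use 2: an edge position is covered more often once padding is added
edge_no_pad = coverage(0, 64, kernel=3)
edge_padded = coverage(0, 64, kernel=3, padding=1)
center = coverage(32, 64, kernel=3)

print(size_no_pad, size_padded, stacked)
print(edge_no_pad, edge_padded, center)
```

Along one axis, padding doubles the coverage of the edge position; in two dimensions a corner pixel goes from 1 kernel position to 4, compared with 9 for an interior pixel.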

6. What is the use of a 1 x 1 kernel in CNN? 

- Dimensionality reduction. A 1x1 kernel transforms the input feature map from one depth (number of channels) to another while leaving the height and width unchanged, so it can reduce or increase the number of channels. Reducing the channels lowers the computational cost of subsequent layers and adjusts the complexity of the network.

- Non-linearity. A 1x1 convolution followed by an activation function such as ReLU adds a non-linear transformation across channels at each pixel without changing the spatial size, which increases the capacity of the network.

- Improved network architecture. 1x1 convolutional layers give the network more flexibility and expressiveness, for example as bottleneck layers placed before more expensive convolutions with larger kernels.
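
A small sketch can illustrate the first point. A 1x1 convolution mixes channels at each pixel and leaves height and width alone, and placing one before a 3x3 convolution can cut the weight count. The channel counts (256 and 64) and the function names are hypothetical, chosen only for illustration, and biases are ignored.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution to an H x W x C_in nested list.

    weights has shape C_out x C_in; each output channel is a weighted
    sum of the input channels at the same pixel.
    """
    return [
        [[sum(w * x for w, x in zip(w_row, pixel)) for w_row in weights]
         for pixel in row]
        for row in feature_map
    ]


def conv_weights(kernel, in_channels, out_channels):
    """Weight count of a square-kernel convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels


# A 2x2 feature map with 3 channels, reduced to 1 channel by summing them
feature_map = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
reduced = conv1x1(feature_map, weights=[[1, 1, 1]])
print(reduced)

# 3x3 convolution applied directly on 256 channels ...
direct = conv_weights(3, 256, 256)
# ... versus a 1x1 reduction to 64 channels followed by the 3x3 convolution
bottleneck = conv_weights(1, 256, 64) + conv_weights(3, 64, 256)
print(direct, bottleneck)
```

The spatial size stays 2x2 while the depth drops from 3 to 1, and the bottleneck version uses fewer than a third of the weights of the direct 3x3 convolution.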

7. What are the advantages of a CNN over a fully connected DNN for this image classification problem?

- Local receptive fields: CNNs capture spatial relationships by connecting each neuron to a small region of the input image, enabling effective extraction of local patterns.

- Parameter sharing: Sharing weights across regions reduces parameters, making CNNs more efficient and enabling the learning of translation-invariant features.

- Pooling layers: Pooling summarizes important features, reduces spatial dimensions, and enhances robustness to small image transformations.

- Hierarchical representation learning: CNNs learn hierarchical features, progressing from low-level to high-level representations, enabling the modeling of complex concepts.

- Parameter efficiency: CNNs achieve high accuracy with fewer parameters by exploiting local structure, reducing computational complexity, and memory requirements.

In a fully connected layer, every neuron is connected to all neurons in the previous layer, with each connection having its own weight. This type of connection pattern is not specific to any particular features in the data and lacks assumptions about feature relationships. However, it is computationally and memory-intensive due to the large number of weights required to connect every neuron. 

In this problem, the CNN is better suited than a fully connected DNN because it is an image classification problem: an image has many pixels, and neighboring pixels are spatially related. The convolutional layers exploit that structure with far fewer weights, so they need less computational power and memory.
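
The difference in weight counts can be made concrete for the 64x64x3 inputs used here. The sketch below compares the first convolutional layer of the classifier with a hypothetical fully connected layer of 128 units applied directly to the flattened image; that dense layer is an assumption for comparison only and is not part of the model above.

```python
height, width, channels = 64, 64, 3

# Hypothetical dense layer on the raw flattened image (not in the model above)
units = 128
inputs = height * width * channels
dense_params = inputs * units + units  # one weight per input-unit pair, plus biases

# First Conv2D layer of the classifier: 32 filters of size 3x3 over 3 channels
filters, kernel = 32, 3
conv_params = kernel * kernel * channels * filters + filters

print("flattened inputs:", inputs)
print("dense layer parameters:", dense_params)
print("conv layer parameters:", conv_params)
```

The convolutional layer reuses the same 3x3 filters at every position, which is why its parameter count does not depend on the image size.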