## Assignment 12

In this assignment, we will continue working on image classification using PyTorch.
* Download the Intel image dataset from Kaggle.
* We will use the OpenCV image feature extraction library (`conda install -c conda-forge opencv`).


1. [10 pts] Download the dataset, unzip it, and explore the file folders. Load the image dataset 
grouped into training and testing sets. (Note: `cv2` reads and saves images in BGR channel order.)

Display a few images. How many color channels are there?
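
Because of the BGR note above, images loaded with `cv2` appear color-swapped when handed directly to a viewer that expects RGB. The following is a minimal sketch, using a small synthetic array in place of a real image, of reversing the channel axis and reading the channel count from the array shape. The helper name `bgr_to_rgb` is hypothetical and not part of the assignment code.

```python
import numpy as np

def bgr_to_rgb(img):
    """Reverse the last (channel) axis of an H x W x 3 image array."""
    return img[..., ::-1]

# Synthetic 2x2 "image" that is pure blue in BGR order (B=255, G=0, R=0).
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

rgb = bgr_to_rgb(bgr)

# The last axis of the array holds the color channels.
n_channels = bgr.shape[-1]
print("channels:", n_channels)
print("first pixel in RGB order:", rgb[0, 0].tolist())
```

On real images, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same conversion.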


In [1]:
import cv2
import os
import numpy as np

IMGSIZE = (128, 128)
CNAMES = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

def get_images_labels(path):
    """Load every readable image under path/<class_name>/ and return (images, labels)."""
    images, labels = [], []
    # Iterate over CNAMES so that train and test share the same label indices,
    # regardless of the order in which os.listdir() returns the folders.
    for label, class_name in enumerate(CNAMES):
        class_dir = os.path.join(path, class_name)
        if not os.path.isdir(class_dir):
            continue
        for f in sorted(os.listdir(class_dir)):
            img = cv2.imread(os.path.join(class_dir, f))  # BGR channel order
            if img is not None:  # cv2.imread returns None for unreadable files
                images.append(cv2.resize(img, IMGSIZE))
                labels.append(label)
    return np.array(images), np.array(labels)

train_dir = 'seg_train/seg_train/'
test_dir = 'seg_test/seg_test/'
X_tr, y_tr = get_images_labels(train_dir)
X_ts, y_ts = get_images_labels(test_dir)

In [2]:
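# Show five randomly chosen training images instead of looping over the whole set
for i in np.random.choice(len(X_tr), 5, replace=False):
    cv2.imshow(f'Image {i} - label {y_tr[i]}', X_tr[i])
cv2.waitKey(0)  # Wait for a key press, then close all windows
cv2.destroyAllWindows()
print(X_tr.shape)  # The last axis is the number of color channels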


2. [30 pts] Convert the image set to a NumPy array; the training array should have shape
(14034, 128, 128, 3).
Scale the pixel values to [0, 1].
Build a regular fully connected neural network and report its performance on this dataset.


In [3]:
# Scale image to 0-1
X_tr = X_tr.astype('float32') / 255.0
X_ts = X_ts.astype('float32') / 255.0

In [7]:
import tensorflow as tf
from tensorflow.keras import layers, models



# Flatten the images for the fully connected network
X_tr_flat = X_tr.reshape(X_tr.shape[0], -1)
X_ts_flat = X_ts.reshape(X_ts.shape[0], -1)

# Build the neural network
model = models.Sequential([
    layers.Dense(512, activation='relu', input_shape=(128 * 128 * 3,)),
    layers.Dense(256, activation='relu'),
    layers.Dense(len(np.unique(y_tr)), activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X_tr_flat, y_tr, epochs=10, validation_data=(X_ts_flat, y_ts))


Epoch 1/10
439/439 ━━━━━━━━━━━━━━━━━━━━ 60s 135ms/step - accuracy: 0.3585 - loss: 6.0638 - val_accuracy: 0.4883 - val_loss: 1.3569
Epoch 2/10
 33/439 ━━━━━━━━━━━━━━━━━━━━ 53s 132ms/step - accuracy: 0.4892 - loss: 1.3708

KeyboardInterrupt: 

In [6]:
# Build the neural network
neural_model = models.Sequential([
    layers.Dense(512, activation='relu', input_shape=(128 * 128 * 3,)),
    layers.Dense(256, activation='relu'),
    layers.Dense(len(np.unique(y_tr)), activation='softmax')
])
# Flatten the images for the fully connected network
X_tr_flat = X_tr.reshape(X_tr.shape[0], -1)
X_ts_flat = X_ts.reshape(X_ts.shape[0], -1)

build_neural_network(X_tr_flat, X_ts_flat, y_tr, y_ts, neural_model)

Epoch 1/10
439/439 ━━━━━━━━━━━━━━━━━━━━ 0s 129ms/step - accuracy: 0.3447 - loss: 7.5019

ValueError: Data cardinality is ambiguous. Make sure all arrays contain the same number of samples.
'x' sizes: 14034
'y' sizes: 3000
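
The error above reports that a feature array and a label array handed to Keras had different numbers of rows (14034 versus 3000), which can happen, for example, when training and test arrays are mixed in a single call. The following is a minimal sketch of a check to run before training, on small synthetic arrays. The helper `check_same_samples` is hypothetical and not part of Keras or of the assignment code.

```python
import numpy as np

def check_same_samples(x, y, name=""):
    """Raise a ValueError if x and y disagree on the number of samples (axis 0)."""
    if len(x) != len(y):
        raise ValueError(f"{name}: x has {len(x)} samples but y has {len(y)}")
    return len(x)

# Synthetic stand-ins for the flattened images and their labels.
X_train = np.zeros((8, 12), dtype=np.float32)
y_train = np.zeros(8, dtype=np.int64)
X_test = np.zeros((3, 12), dtype=np.float32)
y_test = np.zeros(3, dtype=np.int64)

n_train = check_same_samples(X_train, y_train, "train")
n_test = check_same_samples(X_test, y_test, "test")

# Pairing training features with test labels reproduces the mismatch.
try:
    check_same_samples(X_train, y_test, "swapped")
    mismatch_caught = False
except ValueError as err:
    mismatch_caught = True
    print(err)
```

Running such a check on each `(x, y)` pair before calling `fit` surfaces the mismatch immediately rather than after a full training epoch.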


3. [40 pts] Create a convolutional neural network (CNN), train it, and report its performance on 
the testing portion of the dataset. 95% reclassification (training) accuracy and 75% testing 
accuracy should be achievable without any hyperparameter tuning. (Hint: My model, which is 
similar to the model in the module notebook, took around 10 minutes to train for 10 epochs without a GPU.)


In [None]:
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(len(np.unique(y_tr)), activation='softmax')  # Output layer
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Summary of the model
model.summary()

model.fit(X_tr, y_tr, epochs=10, batch_size=32, validation_data=(X_ts, y_ts))

test_loss, test_accuracy = model.evaluate(X_ts, y_ts)
print(f'Test Loss: {test_loss}')
print(f'Test Accuracy: {test_accuracy * 100:.2f}%')

X_tr shape: (14034, 128, 128, 3)
y_tr unique labels: [0 1 2 3 4 5]
Epoch 1/10


ValueError: Exception encountered when calling Sequential.call().

Invalid input shape for input Tensor("data:0", shape=(None, 49152), dtype=float32). Expected shape (None, 128, 128, 3), but input has incompatible shape (None, 49152)

Arguments received by Sequential.call():
  • inputs=tf.Tensor(shape=(None, 49152), dtype=float32)
  • training=True
  • mask=None
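
The message above says the CNN received flattened rows of length 49152 (= 128 × 128 × 3), whereas `Conv2D` layers expect the 4-D layout `(samples, height, width, channels)`. The following is a minimal sketch, on a small synthetic array, of flattening images for a dense network and restoring the image layout for a CNN. The variable names are illustrative only.

```python
import numpy as np

H, W, C = 128, 128, 3

# Four synthetic "images" already scaled to [0, 1).
imgs = np.random.rand(4, H, W, C).astype(np.float32)

# Flatten each image to one row, as a fully connected network expects.
flat = imgs.reshape(imgs.shape[0], -1)
print(flat.shape)

# Restore the (samples, height, width, channels) layout that a CNN expects.
restored = flat.reshape(-1, H, W, C)
print(restored.shape)
```

Keeping the flattened copies under separate names (as `X_tr_flat` and `X_ts_flat` do earlier) avoids accidentally passing them to the convolutional model.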

4. [20 pts] Add regularization and/or dropout to your CNN. Report your model's best 
performance. A model is deemed more robust as the standard deviation of its performance 
decreases. Why?
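
One way to make the robustness question concrete is to train the same architecture several times with different random seeds and summarize the resulting test accuracies. The following is a minimal sketch, assuming a hypothetical helper `summarize_runs`. The accuracy values are made-up placeholders for illustration, not results from this dataset.

```python
import numpy as np

def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    acc = np.asarray(accuracies, dtype=float)
    return acc.mean(), acc.std(ddof=1)

# Placeholder accuracies for two hypothetical models, five runs each.
runs_a = [0.70, 0.80, 0.75, 0.85, 0.65]
runs_b = [0.74, 0.76, 0.75, 0.77, 0.73]

mean_a, std_a = summarize_runs(runs_a)
mean_b, std_b = summarize_runs(runs_b)

print(f"Model A: {mean_a:.3f} +/- {std_a:.3f}")
print(f"Model B: {mean_b:.3f} +/- {std_b:.3f}")
```

Both placeholder models have the same mean accuracy, but B varies less from run to run, so any single training run of B is a more reliable indicator of its expected performance.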


