First model I am trying out

It uses convolutional layers with an increasing number of filters.

Here's the rationale behind the architecture:

1. **Convolutional Layers**: Convolutional layers are crucial for capturing spatial patterns in images. The initial convolutional layers (with 32 filters) learn low-level features like edges and textures, while the subsequent layers (with 64 filters in both later blocks, as defined in the code) learn higher-level features.

2. **ReLU Activation**: ReLU (Rectified Linear Unit) is used as the activation function after each convolutional layer. ReLU introduces non-linearity to the model and helps in learning complex patterns in the data.

3. **Max Pooling**: Max pooling layers downsample the feature maps, halving their height and width each time (32 → 16 → 8 → 4 here). This reduces the computational cost of the model and makes the features somewhat more tolerant of small translations.

4. **Dropout**: Dropout layers (rate 0.25) are added after each max pooling layer and after the first dense layer to reduce overfitting. Dropout randomly drops a fraction of the units (neurons) during training, forcing the network to learn redundant representations and making it more robust.

5. **Flatten Layer**: The flatten layer converts the 3D feature maps into a 1D vector, which can be input to the dense layers.

6. **Dense Layers**: Dense (fully connected) layers are added at the end of the network to perform classification based on the learned features. The first dense layer consists of 512 units, which allows the model to learn complex combinations of features. The second dense layer has 100 units with softmax activation, which outputs probabilities for each of the 100 classes in CIFAR-100.

7. **Model Complexity**: The chosen architecture is intended to balance model complexity and performance. It should be deep enough to capture complex patterns in the CIFAR-100 dataset without being so complex that it overfits, especially with the dropout layers included.
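
As a rough illustration of item 4, here is a minimal pure-Python sketch of inverted dropout, the scheme the Keras `Dropout` layer applies during training: each unit is zeroed with probability `rate`, and the survivors are scaled by `1 / (1 - rate)` so the expected activation stays the same. The `dropout` helper and the toy `activations` list are hypothetical names written only for this example; they are not part of the notebook's model.

```python
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, scale the rest."""
    keep = 1.0 - rate
    # A unit survives when the random draw is >= rate (probability `keep`).
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
activations = [1.0] * 10000     # toy activations, all equal to 1.0
dropped = dropout(activations, 0.25, rng)

zero_fraction = dropped.count(0.0) / len(dropped)  # close to the rate, 0.25
mean_after = sum(dropped) / len(dropped)           # close to the original mean, 1.0
print(zero_fraction, mean_after)
```

At inference time the layer does nothing, which is why the scaling is done during training instead.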

Overall, this kind of stacked-convolution architecture is a common starting point for image classification tasks like CIFAR-100, and it leaves room for experimentation and further optimization.
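
To make the pooling, flatten, and dense steps concrete, the sketch below traces the feature-map size and counts trainable parameters by hand. It assumes the filter counts used in the model definition in the code cell (32, 64, 64), 3x3 kernels with `'same'` padding, and 2x2 pooling with stride 2. The helper names `conv_params` and `trace_model` are hypothetical, and the totals are arithmetic only, not output copied from `model.summary()`.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per filter for a square-kernel Conv2D."""
    return kernel * kernel * in_ch * out_ch + out_ch


def trace_model(size=32, in_ch=3, blocks=(32, 64, 64),
                dense_units=512, num_classes=100):
    """Return (flattened_length, total_trainable_params) for the stack."""
    total = 0
    for filters in blocks:
        # Two 3x3 'same' convolutions keep the spatial size unchanged.
        total += conv_params(3, in_ch, filters)
        total += conv_params(3, filters, filters)
        in_ch = filters
        # 2x2 max pooling with stride 2 halves height and width.
        size //= 2
    flat = size * size * in_ch                          # Flatten
    total += flat * dense_units + dense_units           # Dense(512)
    total += dense_units * num_classes + num_classes    # Dense(num_classes)
    return flat, total


flat, total = trace_model()
print(flat, total)  # 1024 715524
```

Most of the parameters sit in the first dense layer (1024 x 512 weights), so that layer is the first place to look when shrinking the model.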

In [None]:

import matplotlib.pyplot as plt
import numpy as np
import PIL
# We use the PIL library to load an image from a path, to be consistent with the TensorFlow tutorial. You can use skimage instead, as in previous weeks.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

batch_size = 32
# img_height = 224
# img_width = 224

img_height = 32
img_width = 32

cifar100 = tf.keras.datasets.cifar100 
(train_images, train_labels), (test_images, test_labels) = cifar100.load_data()

# def preprocess(image, label):
#     # Resize images to the specified dimensions
#     image = tf.image.resize(image, [img_height, img_width])
#     # Normalize images to have a mean of 0 and standard deviation of 1
#     image = tf.cast(image, tf.float32) / 255.0
#     return image, label

# # Apply the preprocess function to training and testing data
# train_ds = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
# train_ds = train_ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)

# test_ds = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
# test_ds = test_ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)

# # Cache, shuffle, and batch the datasets
# train_ds = train_ds.cache().shuffle(1000).batch(batch_size).prefetch(buffer_size=tf.data.AUTOTUNE)
# test_ds = test_ds.cache().batch(batch_size).prefetch(buffer_size=tf.data.AUTOTUNE)


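# Light augmentation applied only during training.
data_augmentation = Sequential(
    [
        layers.RandomFlip("horizontal", input_shape=(img_height, img_width, 3)),
        layers.RandomRotation(0.1),  # up to ±10% of a full turn (±36 degrees)
        layers.RandomZoom(0.1),      # zoom in or out by up to 10%
    ]
)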



In [8]:
 
unique_labels = np.unique(train_labels)
print("Unique labels:", unique_labels)
num_classes = len(unique_labels)
vg_net_model = Sequential([
    # Scale raw pixel values from [0, 255] to [0, 1], then augment.
    layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
    data_augmentation,
    # Block 1: 32 filters, 32x32 -> 16x16 after pooling.
    layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
    layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2), strides=(2, 2)),
    layers.Dropout(0.25),
    # Block 2: 64 filters, 16x16 -> 8x8 after pooling.
    layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2), strides=(2, 2)),
    layers.Dropout(0.25),
    # Block 3: 64 filters, 8x8 -> 4x4 after pooling.
    layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2), strides=(2, 2)),
    layers.Dropout(0.25),
    # Classifier head: softmax outputs one probability per class.
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.25),
    layers.Dense(num_classes, activation="softmax"),
])

vg_net_model.summary()

Unique labels: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]


In [10]:

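# The model ends in a softmax layer, so the loss receives probabilities, not logits.
# The run logged below used from_logits=True; that mismatch applies softmax twice
# and is a likely reason the loss stayed near ln(100) ≈ 4.605.
vg_net_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=['accuracy'],
)
epochs = 10
history_bn_model_filter = vg_net_model.fit(
    train_images, train_labels,
    validation_data=(test_images, test_labels),
    epochs=epochs,
    verbose=2,  # 0 = silent, 1 = progress bar, 2 = one line per epoch
)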


Epoch 1/10
1563/1563 - 234s - 150ms/step - accuracy: 0.0085 - loss: 4.6062 - val_accuracy: 0.0100 - val_loss: 4.6052
Epoch 2/10
1563/1563 - 246s - 157ms/step - accuracy: 0.0089 - loss: 4.6060 - val_accuracy: 0.0100 - val_loss: 4.6052
Epoch 3/10
1563/1563 - 236s - 151ms/step - accuracy: 0.0084 - loss: 4.6059 - val_accuracy: 0.0100 - val_loss: 4.6052
Epoch 4/10
1563/1563 - 231s - 148ms/step - accuracy: 0.0083 - loss: 4.6059 - val_accuracy: 0.0100 - val_loss: 4.6052
Epoch 5/10


KeyboardInterrupt: 
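
The logged loss of 4.6052 equals ln(100), the cross-entropy of a uniform guess over 100 classes, and the 1% accuracy is chance level, so the network was not learning in this run. One common cause of a flat loss like this is feeding softmax outputs into a loss configured with `from_logits=True`, which applies softmax a second time. The sketch below is a pure-Python illustration under that assumption, not a confirmed diagnosis of this run; the `softmax` and `cross_entropy` helpers are hypothetical and written only for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


num_classes = 100

# A uniform prediction gives ln(100), about 4.605, matching the log above.
uniform = [1.0 / num_classes] * num_classes
chance_loss = cross_entropy(uniform, 0)

# Even a perfect softmax output (probability 1.0 on the true class) cannot
# drive the loss to zero if softmax is applied again inside the loss.
perfect = [1.0] + [0.0] * (num_classes - 1)
loss_correct = cross_entropy(perfect, 0)            # zero: probabilities used as-is
loss_double = cross_entropy(softmax(perfect), 0)    # roughly 3.62: softmax twice

print(chance_loss, loss_correct, loss_double)
```

Because the second softmax squeezes every output into a narrow band, the loss can only move between about 3.62 and 4.61, and the gradients reaching the network are correspondingly small.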