# Image Classification Project 6
Choose three classes from the Open Images Dataset. Train a neural net that is able to classify images into these three categories.



In [1]:
classes = ['Cat', 'Dog', 'Person']

## Dataset
https://storage.googleapis.com/openimages/web/visualizer/index.html?type=detection

## Base model
VGG19

In [2]:
# imports
from keras.applications import VGG19
from keras.layers import Dense, Flatten
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
from openimages import download

In [3]:
# Path to the directory where the images are stored
base_dir = './dataset'
n_images = 2000  # number of images per class

# Download images for each class using Open Images
download.download_dataset(
    class_labels=classes,
    dest_dir=base_dir,
    limit=n_images
)

2023-06-02  17:17:38 INFO Downloading 2000 train images for class 'cat'
100%|██████████| 2000/2000 [01:18<00:00, 25.44it/s]
2023-06-02  17:18:57 INFO Downloading 2000 train images for class 'dog'
100%|██████████| 2000/2000 [01:23<00:00, 24.07it/s]
2023-06-02  17:20:21 INFO Downloading 2000 train images for class 'person'
100%|██████████| 2000/2000 [01:22<00:00, 24.11it/s]


{'cat': {'images_dir': './dataset\\cat\\images'},
 'dog': {'images_dir': './dataset\\dog\\images'},
 'person': {'images_dir': './dataset\\person\\images'}}

## Task
1. Preparation: Split the dataset 70/30 into train and test sets.


In [4]:
# Define parameters for the loader
batch_size = 20
img_height = 224
img_width = 224

# Training data: rescaling plus augmentation
train_datagen = ImageDataGenerator(rescale=1. / 255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   validation_split=0.3)  # hold out 30 % of the images

# Held-out data: rescaling only, so the evaluation is not distorted by augmentation.
# Using the same validation_split keeps the two subsets disjoint.
val_datagen = ImageDataGenerator(rescale=1. / 255, validation_split=0.3)

train_generator = train_datagen.flow_from_directory(
    base_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training')  # 70 % used for training

# Load the validation data
validation_generator = val_datagen.flow_from_directory(
    base_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation')  # 30 % used for evaluation

Found 4200 images belonging to 3 classes.
Found 1800 images belonging to 3 classes.
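The reported counts (4200 and 1800 of 6000 images) match the requested 70/30 ratio. If an explicit, shuffled split at file level is preferred, a minimal sketch could look like the following. The helper name `split_files`, the seed, and the file names are assumptions made for illustration only.

```python
import random


def split_files(filenames, test_fraction=0.3, seed=42):
    """Shuffle file names reproducibly and split them into (train, test)."""
    files = sorted(filenames)           # sort first so the result ignores input order
    random.Random(seed).shuffle(files)  # seeded shuffle keeps the split reproducible
    n_test = int(round(len(files) * test_fraction))
    return files[n_test:], files[:n_test]


# Hypothetical file names standing in for one class folder with 2000 images
cat_files = [f"cat_{i:04d}.jpg" for i in range(2000)]
train_files, test_files = split_files(cat_files)
print(len(train_files), len(test_files))  # 1400 600
```

Splitting per class, as sketched here, keeps the class balance identical in both subsets.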


2. Train a VGG19 network from scratch (randomly initialized weights) and estimate the test-set accuracy.

In [5]:
# load a vgg19 with random init weights
random_base_vgg19 = VGG19(weights=None, include_top=False, input_shape=(img_height, img_width, 3))

# TODO estimate testset accuracy
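One possible way to complete this step is to put the same dense head on the randomly initialized base that the pretrained run uses, so that both runs differ only in their starting weights. This is a sketch under that assumption; the helper name `build_classifier` is hypothetical, and the training calls are left as comments because they depend on the generators defined earlier.

```python
from keras.applications import VGG19
from keras.layers import Dense, Flatten
from keras.models import Model


def build_classifier(weights, n_classes=3, input_shape=(224, 224, 3)):
    """VGG19 convolutional base followed by a small dense classification head."""
    base = VGG19(weights=weights, include_top=False, input_shape=input_shape)
    x = Flatten()(base.output)
    x = Dense(1024, activation='relu')(x)
    outputs = Dense(n_classes, activation='softmax')(x)
    model = Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model


# weights=None gives random initialization; all layers stay trainable
scratch_model = build_classifier(weights=None)

# Training and evaluation would reuse the generators from the cells above, e.g.:
# scratch_history = scratch_model.fit(train_generator,
#                                     validation_data=validation_generator,
#                                     epochs=10)
# loss, acc = scratch_model.evaluate(validation_generator)
```

A VGG19 trained from scratch on a few thousand images may converge slowly with the default Adam settings, so a lower learning rate might be worth trying.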



3. Transfer learning: Use an ImageNet-pretrained VGG19 network, train the model, and estimate the test-set accuracy. Show the differences in loss and accuracy between the plain and the pretrained network over the first 10 epochs.

In [6]:
# Load the VGG19 model
base_model = VGG19(weights='imagenet', include_top=False, input_shape=(img_height, img_width, 3))

# Freeze the layers of the base model
for layer in base_model.layers:
    layer.trainable = False

# Create a new model on top of the base model
x = Flatten()(base_model.output)
x = Dense(1024, activation='relu')(x)
predictions = Dense(len(classes), activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=10)

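# Save the model (create the target directory first; saving may fail if it is missing)
import os
os.makedirs('models', exist_ok=True)
model.save('models/model.h5')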


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
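To show the differences between the plain and the pretrained network per epoch, the `history` dictionaries returned by `model.fit(...)` can be placed side by side. The sketch below assumes both runs were compiled with `metrics=['accuracy']`, so the dictionaries contain the keys `loss` and `accuracy`. The helper name `compare_histories` is hypothetical and the numbers are placeholders, not measured results.

```python
def compare_histories(plain, pretrained, metrics=("loss", "accuracy")):
    """Return one row per epoch with the values of both runs and their difference."""
    n_epochs = min(len(plain[metrics[0]]), len(pretrained[metrics[0]]))
    rows = []
    for epoch in range(n_epochs):
        row = {"epoch": epoch + 1}
        for metric in metrics:
            row[f"plain_{metric}"] = plain[metric][epoch]
            row[f"pretrained_{metric}"] = pretrained[metric][epoch]
            row[f"delta_{metric}"] = pretrained[metric][epoch] - plain[metric][epoch]
        rows.append(row)
    return rows


# Placeholder values, NOT measured results; they only show the expected dict layout.
dummy_plain = {"loss": [1.0, 0.5], "accuracy": [0.25, 0.5]}
dummy_pretrained = {"loss": [0.5, 0.25], "accuracy": [0.5, 0.75]}

rows = compare_histories(dummy_plain, dummy_pretrained)
for row in rows:
    print(row)
```

In the notebook, the two arguments would be the `.history` attributes of the objects returned by the two `fit` calls.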


4. Data cleansing: Remove “bad” images from the dataset. Which did you remove? How many? Discuss results.
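Part of the cleansing can be automated before the manual review. The sketch below flags files that are suspiciously small and files whose bytes are identical to an earlier file. The helper name `find_suspect_files` and the 1 KiB threshold are assumptions; mislabeled or otherwise "bad" images still need a visual check.

```python
import hashlib
import os


def find_suspect_files(root, min_bytes=1024):
    """Walk `root` and return (too_small, duplicates).

    too_small:  paths of files smaller than `min_bytes`
    duplicates: (path, first_seen_path) pairs with identical file content
    """
    too_small, duplicates, seen = [], [], {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub-directories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) < min_bytes:
                too_small.append(path)
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            if digest in seen:
                duplicates.append((path, seen[digest]))
            else:
                seen[digest] = path
    return too_small, duplicates


# Example call on the download directory used in this notebook:
# too_small, duplicates = find_suspect_files('./dataset')
# print(len(too_small), 'tiny files,', len(duplicates), 'exact duplicates')
```

The returned lists also provide the counts asked for in this step.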

5. Add data augmentation and train again, discuss results
    - Random flip
    - Random contrast
    - Random translation

6. Rebuild VGG19. After layer block4_conv4 (25, 25, 512):
    - Add inception layer with dimensionality reduction (number of output filters should be 512; choose your own values for the filter dimensionality reduction in the 1x1 layers)
    - Add conv layer (kernel 1x1, filters 1024, padding valid, stride 1, activation leaky relu)
    - Add conv layer (kernel 3x3, filters 1024, padding same, stride 1, activation relu)
    - Freeze conv2 layers and before

7. Test a few of your own images and present the results
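The VGG19 rebuild from step 6 could be sketched as follows. Several choices here are assumptions rather than part of the task: a 200x200 input (which yields the stated 25x25x512 output at `block4_conv4`), the branch widths 128 + 192 + 64 + 128 = 512, the pooling-plus-dense head, and reading "conv2 layers and before" as blocks 1 and 2. `weights=None` keeps the sketch offline; `'imagenet'` could be passed instead.

```python
from keras import layers
from keras.applications import VGG19
from keras.models import Model

base = VGG19(weights=None, include_top=False, input_shape=(200, 200, 3))
block4 = base.get_layer('block4_conv4').output  # (25, 25, 512) for a 200x200 input

# Inception module with 1x1 dimensionality reduction before the 3x3 and 5x5 convs
b1 = layers.Conv2D(128, 1, padding='same', activation='relu')(block4)
b3 = layers.Conv2D(96, 1, padding='same', activation='relu')(block4)   # 1x1 reduction
b3 = layers.Conv2D(192, 3, padding='same', activation='relu')(b3)
b5 = layers.Conv2D(32, 1, padding='same', activation='relu')(block4)   # 1x1 reduction
b5 = layers.Conv2D(64, 5, padding='same', activation='relu')(b5)
bp = layers.MaxPooling2D(3, strides=1, padding='same')(block4)
bp = layers.Conv2D(128, 1, padding='same', activation='relu')(bp)      # pool projection
inception = layers.Concatenate()([b1, b3, b5, bp])                     # 512 filters

# 1x1 conv with leaky ReLU, then 3x3 conv with ReLU, as listed in the task
x = layers.Conv2D(1024, 1, strides=1, padding='valid')(inception)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(1024, 3, strides=1, padding='same', activation='relu')(x)

# Classification head (assumption: global average pooling plus softmax)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(3, activation='softmax')(x)
rebuilt_model = Model(inputs=base.input, outputs=outputs)

# Freeze blocks 1 and 2 (assumed meaning of "conv2 layers and before")
for layer in rebuilt_model.layers:
    if layer.name.startswith(('block1', 'block2')):
        layer.trainable = False

rebuilt_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

Because the model is cut at `block4_conv4`, the original block 5 of VGG19 is not part of the rebuilt network.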

8. Answer the following questions:
    - What accuracy can be achieved? What is the accuracy of the train vs. test set?
    - On what infrastructure did you train it? What is the inference time?
    - What is the number of parameters of the model?
    - Which categories are most likely to be confused by the algorithm? Show results in a confusion matrix.
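For the accuracy and confusion questions, a confusion matrix can be computed directly from integer class labels. The sketch below uses plain Python; the helper names are hypothetical and the label lists are placeholders, not results. The class order `cat`, `dog`, `person` assumes the alphabetical indexing that `flow_from_directory` applies to folder names.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[t][p] counts samples of true class t that were predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def most_confused(matrix, labels):
    """Return (true_label, predicted_label, count) for the largest off-diagonal cell."""
    best = None
    for t, row in enumerate(matrix):
        for p, count in enumerate(row):
            if t != p and (best is None or count > best[2]):
                best = (labels[t], labels[p], count)
    return best


labels = ['cat', 'dog', 'person']
# Placeholder labels, NOT model output; they only demonstrate the functions.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 1, 2]

cm = confusion_matrix(y_true, y_pred, len(labels))
print(cm)                         # [[1, 2, 0], [0, 2, 0], [0, 0, 1]]
print(most_confused(cm, labels))  # ('cat', 'dog', 2)
```

With the Keras generators, `y_true` would come from `validation_generator.classes` and `y_pred` from the argmax of `model.predict(...)`; the two only line up if the generator is created with `shuffle=False`.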

Compare the results of the experiments.
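Two of the questions above, parameter count and inference time, can be cross-checked without a trained model. The sketch below recomputes the parameter count of the transfer-learning model (VGG19 base at 224x224 input plus the Flatten/Dense(1024)/Dense(3) head) by hand and provides a generic timing helper. The helper names are hypothetical; in Keras, `model.count_params()` should report the same total.

```python
import time

# (filters, number of 3x3 conv layers) for the five VGG19 blocks
VGG19_CONV_BLOCKS = [(64, 2), (128, 2), (256, 4), (512, 4), (512, 4)]


def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of one square convolution layer."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def vgg19_base_params(in_ch=3):
    """Parameter count of the VGG19 convolutional base (include_top=False)."""
    total = 0
    for filters, n_convs in VGG19_CONV_BLOCKS:
        for _ in range(n_convs):
            total += conv_params(3, in_ch, filters)
            in_ch = filters
    return total


def mean_latency(predict_fn, batch, runs=10, warmup=2):
    """Average wall-clock seconds per call of predict_fn(batch)."""
    for _ in range(warmup):  # warm-up calls are excluded from the measurement
        predict_fn(batch)
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(batch)
    return (time.perf_counter() - start) / runs


base_params = vgg19_base_params()
# A 224x224 input leaves a 7x7x512 feature map after the five pooling layers
head_params = dense_params(7 * 7 * 512, 1024) + dense_params(1024, 3)
print(base_params)                # 20024384
print(head_params)                # 25694211
print(base_params + head_params)  # 45718595
```

With the base frozen, only the 25,694,211 head parameters are trained. For the timing question, `mean_latency(model.predict, images)` could be called with a batch of preprocessed images; the result depends on the hardware and the batch size.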