<img src="header.png" align="left"/>

# Exercise Image classification with augmentation (10 points)

The goal of this example is to explain the organization, import and preparation of image data for classification including augmentation of the images. The following steps are performed:

- Dynamic loading and unpacking of image data from an external source.
- Review of the organization on the file system
- Loading of the data
- Transformations
- Augmentation
- Training
- Analysis
- Enhancement

The dataset used is Caltech 101 [3, 4], with 101 object classes and about 40 to 800 images per class. The images are in color, at a resolution of roughly 200 to 300 pixels per side.

Sources for the examples and data:


- [1] [https://machinelearningmastery.com/how-to-develop-a-cnn-from-scratch-for-cifar-10-photo-classification/](https://machinelearningmastery.com/how-to-develop-a-cnn-from-scratch-for-cifar-10-photo-classification/)
- [2] [https://github.com/bhavul/Caltech-101-Object-Classification](https://github.com/bhavul/Caltech-101-Object-Classification)
- [3] [http://www.vision.caltech.edu/Image_Datasets/Caltech101/](http://www.vision.caltech.edu/Image_Datasets/Caltech101/)


Citation for the Caltech 101 Dataset:

```
[4] L. Fei-Fei, R. Fergus and P. Perona. Learning generative visual models
    from few training examples: an incremental Bayesian approach tested on
    101 object categories. IEEE. CVPR 2004, Workshop on Generative-Model
    Based Vision. 2004
```



**NOTE**

Document your results by simply adding a markdown cell or a python cell (as comment) and writing your statements into this cell. For some tasks the result cell is already available.


[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ditomax/mlexercises/blob/master/08%20Exercise%20Image%20classification%20with%20augmentation.ipynb)


In [None]:
#
# Prepare colab
#
import os

COLAB=False
try:
    %tensorflow_version 2.x
    print("running on google colab")
    COLAB=True
    os.makedirs('data/caltech101',exist_ok=True)    
    os.makedirs('results',exist_ok=True)    
except:
    print("not running on google colab")


#
# Turn off errors and warnings (does not work sometimes)
#
from warnings import simplefilter
simplefilter(action='ignore', category=FutureWarning)
simplefilter(action='ignore', category=Warning)
simplefilter(action='ignore', category=RuntimeWarning)



#
# Import of modules
#
import logging
import tarfile
import operator
import random
from urllib.request import urlretrieve
from PIL import Image

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

#
# Tensorflow and Keras
#
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Input, Dropout, Activation, Dense, MaxPooling2D, Flatten, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adadelta
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.preprocessing.image import ImageDataGenerator

#
# GPU Support
#
tflogger = tf.get_logger()
tflogger.setLevel(logging.ERROR)
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR )
physical_devices = tf.config.list_physical_devices('GPU')
if len(physical_devices) > 0:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)
    print('using GPU support')


#
# Sizes of plots
#
plt.rcParams['figure.figsize'] = [16, 9]


#
# Versions
#
print('working on keras version {} on tensorflow {} using sklearn {}'.format ( tf.keras.__version__, tf.version.VERSION, sklearn.__version__ ) )

# Support functions for loading of data 

In [None]:
urlDataSource = 'http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz'
localExtractionFolder = 'data/caltech101'
localDataArchive = 'data/caltech101/caltech101.tar.gz'

In [None]:
#
# Load data from URL
#
def download_dataset(url,dataset_file_path):
    if os.path.exists(dataset_file_path):
        print("archive already downloaded.")
    else:
        print("started loading archive from url {}".format(url))
        filename, headers = urlretrieve(url, dataset_file_path)
        print("finished loading archive from url {} with filename {}".format(url,filename))

#
# Extract images from archive
#       
def extract_dataset(dataset_file_path, extraction_directory):
    if (not os.path.exists(extraction_directory)):
        os.makedirs(extraction_directory)
    if (dataset_file_path.endswith("tar.gz") or dataset_file_path.endswith(".tgz")):
        tar = tarfile.open(dataset_file_path, "r:gz")
        tar.extractall(path=extraction_directory)
        tar.close()
    elif (dataset_file_path.endswith("tar")):
        tar = tarfile.open(dataset_file_path, "r:")
        tar.extractall(path=extraction_directory)
        tar.close()
    print("extraction of dataset from {} to {} done.".format(dataset_file_path,extraction_directory) )


# Load image data

**Note**

The Caltech 101 archive can no longer be downloaded from Python code. Please download the archive manually from the link given above in ```urlDataSource``` and store it at the location and under the name given in ```localDataArchive```. In Colab you can add the archive by uploading it. Make sure to store it in the correct folder under the correct name.
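
Before running the extraction it can help to confirm that the manually stored archive is where the notebook expects it. The following is a minimal sketch using only the standard library; the helper name `archive_is_ready` is hypothetical and not part of the original notebook, and the path is the value of `localDataArchive` used here.

```python
import os
import tarfile

def archive_is_ready(archive_path):
    """Return True if archive_path exists and is a readable tar archive."""
    return os.path.isfile(archive_path) and tarfile.is_tarfile(archive_path)

# Path used in this notebook for the manually downloaded archive
archive_path = 'data/caltech101/caltech101.tar.gz'
if archive_is_ready(archive_path):
    print("archive found at {}".format(archive_path))
else:
    print("archive missing or not a tar file: {}".format(archive_path))
```

If the second message appears, check the folder and the file name of the uploaded archive before continuing.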

In [None]:
#
# Run loading functions
#
download_dataset(urlDataSource,localDataArchive)

In [None]:
#
# Extract files
#
extract_dataset(localDataArchive,localExtractionFolder)

# Organisation of image data on a file system


- [Brownlee](https://machinelearningmastery.com/how-to-load-large-datasets-from-directories-for-deep-learning-with-keras/) 
- [Sarkar](https://towardsdatascience.com/a-single-function-to-streamline-image-classification-with-keras-bd04f5cfe6df)



<div class="alert alert-block alert-info">

## Task

Read both of the above documents and describe the organization of image data in the file system as it is shown in one of the two linked documents (2 points).
</div>
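
For comparison with the linked documents, the layout produced by extracting the archive in this notebook can be inspected directly. The sketch below assumes one sub-folder per class under `101_ObjectCategories`, as used by the loading functions further down; the helper name `count_images_per_category` is hypothetical.

```python
import os

def count_images_per_category(root_directory, extension='.jpg'):
    """Map each sub-folder name (class label) to its number of image files."""
    counts = {}
    for entry in sorted(os.listdir(root_directory)):
        category_dir = os.path.join(root_directory, entry)
        if not os.path.isdir(category_dir):
            continue  # skip plain files next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(category_dir)
            if name.lower().endswith(extension)
        )
    return counts

# Folder created by extracting the archive in this notebook
root = os.path.join('data/caltech101', '101_ObjectCategories')
if os.path.isdir(root):
    for category, n in count_images_per_category(root).items():
        print('{:25s} {}'.format(category, n))
```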


# Prepare training data

In [None]:
#
# Get images from category folder 
#
def get_images(object_category, data_directory):
    if (not os.path.exists(data_directory)):
        print("data directory not found.")
        return []
    obj_category_dir = os.path.join(os.path.join(data_directory,"101_ObjectCategories"),object_category)
    images = [os.path.join(obj_category_dir,img) for img in os.listdir(obj_category_dir)]
    return images


#
# Get categories
#
def return_categories(data_directory):
    folder = os.path.join(data_directory,"101_ObjectCategories")
    categories=[d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder,d))]
    return categories



#
# Pad to square without cropping
#
def expand2square(pil_img, background_color):
    width, height = pil_img.size
    if width == height:
        return pil_img
    elif width > height:
        result = Image.new(pil_img.mode, (width, width), background_color)
        result.paste(pil_img, (0, (width - height) // 2))
        return result
    else:
        result = Image.new(pil_img.mode, (height, height), background_color)
        result.paste(pil_img, ((height - width) // 2, 0))
        return result


#
# Read image into memory
#
def read_image(image_path):
    im = Image.open(image_path).convert("RGB")
    im = expand2square(im, (0, 0, 0) )
    im = im.resize( (200,200) )
    return np.array(im).astype(np.float32)


#
# Loading images into memory
#
def create_training_data(data_directory,fraction):
    
    i = 0
    X = []
    Y = []
    
    print("started to read dataset from {}.".format(data_directory) )
    
    for category in return_categories(data_directory):
        
        if category == 'BACKGROUND_Google':
            continue
        
        print(".",end='')
        
        for image in get_images(category, data_directory):
            if not image.endswith('.jpg'):
                continue
                
            if random.uniform(0, 1) > fraction:
                continue
                
            X.insert(i, read_image(image) )
            Y.insert(i, category )
            i += 1
            
    print(".")
    print("finished reading dataset.")
    X = np.array(X)
    return X,Y

In [None]:
#
# Load training data into memory. Collect only a random sample of about 70%. The sampling is not stratified, so the class distribution is not preserved exactly.
#
X_raw, Y_raw = create_training_data(localExtractionFolder,fraction=0.7)

In [None]:
#
# Check shapes of data
#
print('X shape {}, Y len {}'.format(X_raw.shape,len(Y_raw)))

In [None]:
#
# Transformation of labels into one-hot encoding
#
label_encoder = LabelEncoder()
Y_integer_encoded = label_encoder.fit_transform(Y_raw)
Y_one_hot = to_categorical(Y_integer_encoded)

In [None]:
Y_integer_encoded

In [None]:
Y_one_hot.shape

In [None]:
#
# Scale image data
#

X_normalized = ( X_raw / 256.0 ) + 0.001
del X_raw

<div class="alert alert-block alert-info">

## Task

Explain why we do a ```del X_raw``` here. (2 points).
</div>

In [None]:
#
# Split data into training and validation sets
#
X_train, X_validation, Y_train, Y_validation = train_test_split(X_normalized, Y_one_hot, test_size=0.25, random_state=42)
del X_normalized


#
# values are now in X_train, X_validation, Y_train, Y_validation, label_encoder, localExtractionFolder
#

# Check the data

In [None]:
#
# Shape of final data
#
print('train: X=%s, y=%s' % (X_train.shape, Y_train.shape))
print('validation: X=%s, y=%s' % (X_validation.shape, Y_validation.shape))

In [None]:
#
# Check the pixel values
#
np.amax(X_train[0])


In [None]:
np.amin(X_train[0])        

In [None]:
np.mean(X_train[0])

In [None]:
#
# Plot some images
#
_, axarr = plt.subplots(4,4)
for row in range(4):
    for column in range(4):
        axarr[row,column].imshow(X_train[row*4+column])        
plt.show()

In [None]:
#
# Check distribution of classes
#
df = pd.DataFrame(Y_integer_encoded,columns=['class'])
counts= df.groupby('class').size()
counts
#np.histogram(Y_integer_encoded, bins=101)

In [None]:
class_pos = np.arange(len(counts))
plt.bar(class_pos, counts, align='center', alpha=0.5)
plt.xlabel('class index')
plt.ylabel('number of samples')
plt.title('samples per class')
plt.show()

<div class="alert alert-block alert-info">

## Task

The current distribution of classes is not balanced. Research online what could be done to improve the class distribution. Write down and describe two possible solutions (2 points).
</div>

# Build a model

In [None]:
#
# Create a simple model
#
def createModel():
    model = Sequential()
    model.add(Conv2D(16, (3,3), activation='relu', input_shape=(200,200,3)))
    model.add(Conv2D(32, (3,3), activation='relu'))
    model.add(MaxPooling2D(pool_size=2, strides=2))
    model.add(Dropout(0.2))
    model.add(Conv2D(64, (3,3), activation='relu'))
    model.add(Conv2D(128, (3,3), activation='relu'))
    model.add(MaxPooling2D(pool_size=2, strides=2))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dense(101, activation='softmax'))
    return model

In [None]:
#
# Create and compile the model
#
model_cnn = createModel()
model_cnn.compile(loss='categorical_crossentropy',optimizer='adam', metrics=['accuracy'])

In [None]:
#
# Callbacks control checkpointing and guard against overfitting; here only early stopping is used.
#
callbacks = [ EarlyStopping(monitor='val_loss', patience=4, verbose=1, mode='auto')]

<div class="alert alert-block alert-info">

## Task

Run the training and try to find out how much memory the training is using (2 points).

**Hint**: look for the memory usage of the python process.

</div>
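
One way to follow the hint from inside the notebook is to ask the operating system for the peak memory of the current Python process. This is a minimal sketch, assuming a Unix-like system such as the Linux machines behind Colab, because the standard library module `resource` is not available on Windows. The helper name `peak_memory_mb` is hypothetical, and the value covers only host memory, not GPU memory.

```python
import resource
import sys

def peak_memory_mb():
    """Return the peak resident set size of this Python process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    divisor = 1024 * 1024 if sys.platform == 'darwin' else 1024
    return peak / divisor

print('peak memory usage: {:.0f} MB'.format(peak_memory_mb()))
```

Calling the function once before and once after the training cell gives a rough impression of how much the peak grows during training.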

In [None]:
history = model_cnn.fit(X_train, Y_train, batch_size=16, epochs=6, verbose=1, validation_data=(X_validation,Y_validation), callbacks=callbacks)

In [None]:
#
# Evaluation of model quality with validation data
#
_, acc = model_cnn.evaluate(X_validation, Y_validation, verbose=0)
print('accuracy {:.3f} '.format(acc) )

In [None]:
#
# Plot training and validation loss and accuracy
#
def summarize_diagnostics(history,modelname):
    plt.subplot(211)
    plt.title('Cross Entropy Loss')
    plt.plot(history.history['loss'], color='blue', label='train')
    plt.plot(history.history['val_loss'], color='lightblue', label='validation')
    plt.legend()
    plt.subplot(212)
    plt.title('Classification Accuracy')
    plt.plot(history.history['accuracy'], color='green', label='train')
    plt.plot(history.history['val_accuracy'], color='lightgreen', label='validation')
    plt.legend()
    plt.subplots_adjust(hspace=0.5)
    plt.savefig( 'results/' + modelname + '_plot.png')
    plt.show()
    plt.close()

In [None]:
summarize_diagnostics(history,'05_model_cnn')

# Optimization using augmentation

Augmentation extends the training dataset with artificially generated variants of the images. This makes the model more robust because it can no longer rely on individual pixel values or positions. Methods of augmentation for images are:

- Shifting the image content horizontally and vertically (width_shift_range, height_shift_range)
- Mirroring (horizontal_flip, vertical_flip)
- Rotation (rotation_range)
- Zooming (zoom_range)
- Brightness (brightness_range)
- Shearing (shear_range)

Adding noise cannot be configured directly in the Keras [ImageDataGenerator](https://keras.io/preprocessing/image/). However, dropout has an approximately similar effect.
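
To make the effect of two of these transformations concrete, the following sketch applies a mirror and a horizontal shift to a tiny array with plain NumPy. It is an illustration only and not how Keras implements augmentation: the helper names `horizontal_flip` and `shift_right` are hypothetical, and the gap left by the shift is filled with zeros here, whereas the `ImageDataGenerator` fills it according to its `fill_mode` setting.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an image of shape (height, width, channels) left to right."""
    return image[:, ::-1, :]

def shift_right(image, pixels):
    """Shift the image content to the right and fill the gap with zeros."""
    shifted = np.zeros_like(image)
    if pixels > 0:
        shifted[:, pixels:, :] = image[:, :-pixels, :]
    else:
        shifted[:] = image
    return shifted

# Tiny 2x3 single-channel image so the effect is easy to read
image = np.arange(6, dtype=np.float32).reshape(2, 3, 1)
print(horizontal_flip(image)[:, :, 0])   # rows become [2, 1, 0] and [5, 4, 3]
print(shift_right(image, 1)[:, :, 0])    # rows become [0, 0, 1] and [0, 3, 4]
```

The label of an image stays the same under both transformations, which is why such variants can be added to the training data without new annotations.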


<img src="info.png" align="left"/> 

<div class="alert alert-block alert-info">

## Task

Experiment with the augmentation settings of the image generator to increase the accuracy by at least 5% (2 points).

</div>

In [None]:
#
# Create generator for image loading
#
datagen = ImageDataGenerator(...)
# prepare iterator
it_train = datagen.flow(X_train, Y_train, batch_size=16)

In [None]:
#
# Training
#
steps = int(X_train.shape[0] / 16)

In [None]:
model_cnn = createModel()
model_cnn.compile(loss='categorical_crossentropy',optimizer='adam', metrics=['accuracy'])

In [None]:
history = model_cnn.fit(it_train, steps_per_epoch=steps, epochs=12, validation_data=(X_validation,Y_validation), verbose=1, callbacks=callbacks)

In [None]:
#
# Evaluation
#
_, acc = model_cnn.evaluate(X_validation, Y_validation, verbose=0)
print('accuracy {:.3f} '.format(acc) )

In [None]:
summarize_diagnostics(history,'08_model_cnn_aug')

# Test your model on a new image

In [None]:
#
# Prepare image (reusing import function defined above)
#
# Apply the same scaling as for the training data (see "Scale image data" above)
image_data_1 = ( read_image('data/test_image_1.png') / 256.0 ) + 0.001
image_data_2 = ( read_image('data/test_image_2.png') / 256.0 ) + 0.001

In [None]:
plt.imshow(image_data_1)

In [None]:
prediction = model_cnn.predict(np.array([image_data_1,image_data_2]))

In [None]:
predicted_classes = np.argmax(prediction,axis=1)
predicted_classes

In [None]:
print(label_encoder.inverse_transform ( predicted_classes ))