<img src="header.png" align="left"/>

# Application example: Import of image data with augmentation and classification

The goal of this example is to explain how to organize, import, and prepare image data for classification. It covers the following steps:

- Dynamically downloading and unpacking the image data from an external source
- Reviewing how the data is organized on the filesystem
- Loading the data
- Transformations
- Augmentation
- Training
- Analysis
- Improvement

The dataset used is Caltech 101 [3], which has 101 classes with roughly 40 to 800 images per class. The images are in color, at a resolution of about 200 to 300 pixels per side.

Sources for the examples and data:

- [1] [https://machinelearningmastery.com/how-to-develop-a-cnn-from-scratch-for-cifar-10-photo-classification/](https://machinelearningmastery.com/how-to-develop-a-cnn-from-scratch-for-cifar-10-photo-classification/)
- [2] [https://github.com/bhavul/Caltech-101-Object-Classification](https://github.com/bhavul/Caltech-101-Object-Classification)
- [3] [http://www.vision.caltech.edu/Image_Datasets/Caltech101/](http://www.vision.caltech.edu/Image_Datasets/Caltech101/)


In [1]:
#
# Import the modules
#
import numpy as np
from urllib.request import urlretrieve
import os
import tarfile
import operator
from PIL import Image

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

from keras.utils import to_categorical
from keras.models import Sequential, Model
from keras.layers import Conv2D, Input, Dropout, Activation, Dense, MaxPooling2D, Flatten, GlobalAveragePooling2D
from keras.optimizers import Adadelta
from keras.callbacks import ModelCheckpoint
from keras.callbacks import EarlyStopping

Using TensorFlow backend.


# Helper functions

In [2]:
urlDataSource = 'http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz'
localExtractionFolder = 'data/caltech101'
localDataArchive = 'data/caltech101/caltech101.tar.gz'

In [7]:
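#
# Download the data from a URL
#
def download_dataset(url, dataset_file_path):
    # Check the target path that was passed in, not the global localDataArchive
    if os.path.exists(dataset_file_path):
        print("archive already downloaded.")
    else:
        # urlretrieve does not create missing parent directories
        os.makedirs(os.path.dirname(dataset_file_path) or ".", exist_ok=True)
        print("started loading archive from url {}".format(url))
        filename, headers = urlretrieve(url, dataset_file_path)
        print("finished loading archive from url {}".format(url))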

def extract_dataset(dataset_file_path, extraction_directory):
    if (not os.path.exists(extraction_directory)):
        os.makedirs(extraction_directory)
    if (dataset_file_path.endswith("tar.gz") or dataset_file_path.endswith(".tgz")):
        tar = tarfile.open(dataset_file_path, "r:gz")
        tar.extractall(path=extraction_directory)
        tar.close()
    elif (dataset_file_path.endswith("tar")):
        tar = tarfile.open(dataset_file_path, "r:")
        tar.extractall(path=extraction_directory)
        tar.close()
    print("extraction of dataset from {} to {} done.".format(dataset_file_path,extraction_directory) )


# Loading the data

In [9]:
#
# Run the download
#
download_dataset(urlDataSource,localDataArchive)

archive already downloaded.


In [10]:
#
# Extract the data
#
extract_dataset(localDataArchive,localExtractionFolder)

extraction of dataset from data/caltech101/caltech101.tar.gz to data/caltech101 done.


# Organizing image data on the filesystem

Jason Brownlee gives a good introduction to the topic at [https://machinelearningmastery.com/how-to-load-large-datasets-from-directories-for-deep-learning-with-keras/](https://machinelearningmastery.com/how-to-load-large-datasets-from-directories-for-deep-learning-with-keras/).
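
As a rough sketch of the layout this notebook relies on: each class is a subdirectory of `101_ObjectCategories`, and the directory name serves as the label. The helper `count_images_per_class` is a hypothetical name, not part of the notebook; it only assumes that layout and counts the `.jpg` files per class.

```python
import os

def count_images_per_class(root_directory, extension=".jpg"):
    """Map each class subdirectory of root_directory to its number of image files."""
    counts = {}
    for entry in sorted(os.listdir(root_directory)):
        class_dir = os.path.join(root_directory, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extension)
        )
    return counts

# Assumed layout (class folder name = label):
#   data/caltech101/101_ObjectCategories/
#       accordion/image_0001.jpg
#       airplanes/image_0001.jpg
#       ...
```

Calling it on `data/caltech101/101_ObjectCategories` after extraction would give a quick check of how unevenly the classes are populated.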


# Creating the training data

In [27]:
#
# Helper functions
#

def get_images(object_category, data_directory):
    if (not os.path.exists(data_directory)):
        print("Data directory not found. Are you sure you downloaded and extracted dataset properly?")
        return
    obj_category_dir = os.path.join(os.path.join(data_directory,"101_ObjectCategories"),object_category)
    images = [os.path.join(obj_category_dir,img) for img in os.listdir(obj_category_dir)]
    return images

def return_images_per_category(data_directory):
    folder = os.path.join(data_directory,"101_ObjectCategories")
    print(folder)
    categories=[d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder,d))]
    print(categories)
    return categories

#
# Read the image data from a file. Resize to 300x200 (width x height) pixels.
#
def read_image(image_path):
    #img = cv2.imread(image_path, cv2.IMREAD_COLOR)
    #img = cv2.resize(img, (300,200), interpolation=cv2.INTER_CUBIC)    
    im = Image.open(image_path).convert("RGB").resize((300,200))
    np_img = np.array(im)
    return np_img




def create_training_data(data_directory):
    i = 0
    # Preallocated for the 8677 images expected after skipping BACKGROUND_Google
    X = np.ndarray((8677, 200, 300, 3), dtype=np.uint8)
    Y = []
    
    print("started to read dataset from {}.".format(data_directory) )
    
    for category in return_images_per_category(data_directory):
        
        if category == 'BACKGROUND_Google':
            continue
        
        print("reading class {}".format(category))
        
        for image in get_images(category, data_directory):
            if not image.endswith('.jpg'):
                continue
            X[i] = read_image(image)
            Y.append(category)
            i += 1
        # i already equals the number of images read so far
        print("processed {} of {}".format(i, len(X)))
    print("finished reading dataset.")
    # Drop rows that were never filled in case fewer images were found
    return X[:i], Y



In [28]:
#
# Create the training data
#
X, Y = create_training_data(localExtractionFolder)

started to read dataset from data/caltech101.
data/caltech101/101_ObjectCategories
['gerenuk', 'hawksbill', 'headphone', 'ant', 'butterfly', 'lamp', 'strawberry', 'water_lilly', 'chandelier', 'dragonfly', 'crab', 'pagoda', 'dollar_bill', 'emu', 'inline_skate', 'platypus', 'dalmatian', 'cup', 'airplanes', 'joshua_tree', 'cougar_body', 'grand_piano', 'trilobite', 'brontosaurus', 'wild_cat', 'pigeon', 'dolphin', 'soccer_ball', 'wrench', 'scorpion', 'flamingo_head', 'nautilus', 'accordion', 'cougar_face', 'pyramid', 'camera', 'barrel', 'schooner', 'cellphone', 'panda', 'revolver', 'lobster', 'menorah', 'lotus', 'stapler', 'crocodile', 'chair', 'helicopter', 'minaret', 'starfish', 'ceiling_fan', 'ketch', 'mayfly', 'wheelchair', 'bass', 'yin_yang', 'crocodile_head', 'saxophone', 'beaver', 'mandolin', 'bonsai', 'Leopards', 'car_side', 'ibis', 'electric_guitar', 'kangaroo', 'stegosaurus', 'ferry', 'snoopy', 'umbrella', 'rhino', 'okapi', 'watch', 'brain', 'gramophone', 'scissors', 'rooster', 'c

In [29]:
#
# Transform the labels into one-hot encoding
#
label_encoder = LabelEncoder()
Y_integer_encoded = label_encoder.fit_transform(Y)
Y_one_hot = to_categorical(Y_integer_encoded)
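
To illustrate what the two calls above produce, here is a NumPy-only sketch of the same idea: the sorted class names are mapped to integer indices, and each index becomes a row containing a single 1. The sample labels and variable names are made up for illustration.

```python
import numpy as np

labels = ["crab", "ant", "crab", "emu"]  # illustrative labels only

# Sorted unique class names plus the integer index of each label.
classes, integer_encoded = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class index i.
one_hot = np.eye(len(classes), dtype=np.float32)[integer_encoded]

# Map a one-hot row (or a softmax output) back to its class name.
decoded = classes[one_hot.argmax(axis=1)]
```

In the notebook itself, `label_encoder.inverse_transform` plays the role of the last line when predictions need to be turned back into class names.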


In [None]:
#
# Normalize the image data
#
X_normalized = ( X.astype(np.float64) / 255 ) + 0.001
del X
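
A quick back-of-the-envelope calculation shows why this step is memory-hungry. The sketch uses only the array shape from `create_training_data` and the standard element sizes of the dtypes; `array_gib` is a hypothetical helper name.

```python
def array_gib(shape, bytes_per_element):
    """Size in GiB of a dense array with the given shape and element size."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / 2**30

shape = (8677, 200, 300, 3)  # images, height, width, channels

for name, size in (("uint8", 1), ("float32", 4), ("float64", 8)):
    print("{:8s} {:6.2f} GiB".format(name, array_gib(shape, size)))
```

The `float64` copy alone needs roughly 11.6 GiB, and the `uint8` original stays allocated until `del X` runs. Casting to `np.float32` instead would halve that figure.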




In [None]:
#
# Split the data into training and test (validation) sets
#
X_train, X_validation, Y_train, Y_validation = train_test_split(X_normalized, Y_one_hot, test_size=0.25, random_state=42)
del X_normalized

#
# Valid values are now in X_train, X_validation, Y_train, Y_validation, label_encoder, data_directory
#

# Checking the data

In [None]:
#
# Shape of the data
#
print('Train: X=%s, y=%s' % (X_train.shape, Y_train.shape))
print('Test: X=%s, y=%s' % (X_validation.shape, Y_validation.shape))

#
# Plot some images
#
from matplotlib import pyplot  # needed for the image grid below

for i in range(9):
    pyplot.subplot(330 + 1 + i)
    pyplot.imshow(X_train[i])
pyplot.show()

In [None]:
#
# Create a simple model
#

model_cnn = Sequential()
model_cnn.add(Conv2D(16, (3,3), activation='relu', input_shape=(200,300,3)))
model_cnn.add(Conv2D(32, (3,3), activation='relu'))
model_cnn.add(MaxPooling2D(pool_size=2, strides=2))
model_cnn.add(Conv2D(64, (3,3), activation='relu'))
model_cnn.add(Conv2D(128, (3,3), activation='relu'))
model_cnn.add(MaxPooling2D(pool_size=2, strides=2))
model_cnn.add(Flatten())
model_cnn.add(Dense(101, activation='softmax'))
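
To see how large this model becomes, the layer output shapes can be traced by hand. The sketch assumes the Keras defaults used above (`padding='valid'` and stride 1 for `Conv2D`) and is plain arithmetic, not a Keras call; the helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_valid(size, pool=2, stride=2):
    """Output length of max pooling with 'valid' padding."""
    return (size - pool) // stride + 1

height, width, channels = 200, 300, 3
layers = [("conv", 16), ("conv", 32), ("pool", None),
          ("conv", 64), ("conv", 128), ("pool", None)]

for kind, filters in layers:
    if kind == "conv":
        height, width, channels = conv_valid(height), conv_valid(width), filters
    else:
        height, width = pool_valid(height), pool_valid(width)

flattened = height * width * channels
dense_params = flattened * 101 + 101  # weights plus biases of the Dense layer
print(height, width, channels, flattened, dense_params)
```

Almost all parameters sit in the final `Dense` layer, about 43.7 million. `GlobalAveragePooling2D`, which is already imported in the first cell, would be one way to shrink that.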



In [None]:
#
# Compile and train the model
#

model_cnn.compile(loss='categorical_crossentropy',optimizer='adam', metrics=['accuracy'])

callbacks = [ModelCheckpoint('model_cnn_weights.h5', monitor='val_acc', save_best_only=True),
             EarlyStopping(monitor='val_loss', patience=4, verbose=1, mode='auto')]

model_cnn.fit(X_train, Y_train, batch_size=64, epochs=30, verbose=1, validation_data=(X_validation,Y_validation), callbacks=callbacks)



# Optimized loading of the images

The loading function used so far reads all images into memory at once, so main memory runs out quickly. We therefore need a function that loads the images into memory one after another and makes them available for training.

Source: [https://towardsdatascience.com/a-single-function-to-streamline-image-classification-with-keras-bd04f5cfe6df](https://towardsdatascience.com/a-single-function-to-streamline-image-classification-with-keras-bd04f5cfe6df)
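
A minimal sketch of this idea, not the approach of the linked article: a Python generator that keeps only file paths in memory and reads one batch of images at a time. `batch_generator` and `load_fn` are hypothetical names; `load_fn` stands in for a loader such as `read_image` above, and the labels are assumed to be one-hot encoded already.

```python
import numpy as np

def batch_generator(image_paths, labels, batch_size, load_fn, shuffle=True, seed=42):
    """Yield (X_batch, Y_batch) forever, loading images from disk per batch."""
    labels = np.asarray(labels)
    indices = np.arange(len(image_paths))
    rng = np.random.RandomState(seed)
    while True:  # generators used for training loop over the data indefinitely
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch_idx = indices[start:start + batch_size]
            # Only this batch is decoded and held in memory as float32.
            X_batch = np.stack([load_fn(image_paths[i]) for i in batch_idx])
            yield X_batch.astype(np.float32) / 255.0, labels[batch_idx]
```

Such a generator could be passed to `fit_generator` in older Keras versions, or to `fit` in recent ones, together with `steps_per_epoch`, since the generator itself never terminates.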
