# Image Processor
This notebook builds a model that classifies images into one of two classes:
- Kaninchen (rabbit)
- Feldhase (European hare)

The notebook is heavily commented, and most of the comments were generated with the help of artificial intelligence. This approach was chosen because it helps both the developers and the users of this notebook understand the code.

# Imports

In [14]:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Limit GPU memory usage

In [15]:
gpus = tf.config.experimental.list_physical_devices('GPU') # Get the list of GPUs
for gpu in gpus: 
    tf.config.experimental.set_memory_growth(gpu, True) # enable tensorflow to allocate memory dynamically -> use only as much GPU memory as needed
gpus

[]

In [16]:
tf.config.list_physical_devices('GPU') # Check if GPU is available

[]

# Load images and transform them into a numpy array

In [17]:
# Create a dataset from the images in the folder (400 images per batch, so all images fit into one batch)
# By default, the images are resized to 256x256 pixels and shuffled
data = tf.keras.utils.image_dataset_from_directory('photos', batch_size=400)

# Create an iterator to iterate over the batches of the dataset (only one batch in our case)
data_iterator = data.as_numpy_iterator()

# Get the first batch (the only one in our case)
images, labels = next(data_iterator)

# First row of pixels of the first image in RGB to show what the data looks like
print(images[0].astype(int)[0])

Found 400 files belonging to 2 classes.
[[ 34  38  15]
 [ 34  38  15]
 [ 34  39  16]
 [ 36  40  17]
 [ 38  42  19]
 [ 39  41  19]
 [ 40  42  20]
 [ 43  41  20]
 [ 43  41  20]
 [ 42  43  21]
 [ 41  43  21]
 [ 43  43  22]
 [ 45  43  22]
 [ 45  44  22]
 [ 47  46  23]
 [ 48  46  25]
 [ 49  46  27]
 [ 55  52  34]
 [ 59  55  38]
 [ 88  81  64]
 [182 175 159]
 [175 168 149]
 [134 125  99]
 [104  94  69]
 [ 95  86  64]
 [ 86  80  57]
 [ 93  87  67]
 [ 91  84  65]
 [ 80  73  53]
 [ 77  68  47]
 [ 78  67  45]
 [ 78  66  42]
 [ 80  65  38]
 [ 81  61  35]
 [ 82  61  34]
 [ 87  63  32]
 [ 90  61  29]
 [ 92  58  24]
 [ 99  59  23]
 [102  58  20]
 [107  61  22]
 [114  66  26]
 [117  66  27]
 [119  63  24]
 [133  74  34]
 [138  81  42]
 [134  77  41]
 [115  64  33]
 [108  64  34]
 [103  69  36]
 [ 99  69  38]
 [ 95  70  40]
 [ 93  77  46]
 [ 89  73  45]
 [ 81  68  43]
 [103  92  67]
 [118 110  87]
 [107  96  76]
 [125 112  95]
 [120 110  97]
 [113 103  93]
 [100  88  78]
 [ 81  70  62]
 [110  97  89]


# Inspect the number of images and classes

In [18]:
num_images = len(images)
num_classes = len(np.unique(labels))
print(f"Number of images: {num_images}")
print(f"Number of classes: {num_classes}")

Number of images: 400
Number of classes: 2
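
The `labels` array holds integer indices rather than class names. `image_dataset_from_directory` infers the labels from the subdirectory names and assigns the indices in alphanumeric order, so the mapping can be sketched as follows. The sketch assumes that the two subdirectories of `photos` are named exactly `Kaninchen` and `Feldhase` (the notebook does not show them), and `example_labels` is made up for illustration.

```python
from collections import Counter

# Assumed subdirectory names inside 'photos' (not shown in the notebook).
class_dirs = ["Kaninchen", "Feldhase"]

# Inferred labels follow the alphanumeric order of the directory names.
class_names = sorted(class_dirs)
index_to_name = dict(enumerate(class_names))
print(index_to_name)

# Hypothetical label values, standing in for the real `labels` array.
example_labels = [0, 1, 1, 0, 1]
counts = Counter(index_to_name[i] for i in example_labels)
print(counts)
```

On the real dataset, `data.class_names` reports the order directly, and `np.bincount(labels)` gives the number of images per class.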


## Scale the images

Each RGB channel value lies between 0 and 255, so dividing by 255 rescales all pixel values to the range 0 to 1.
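
As a quick check of this scaling, the first pixel printed by the loading cell, `[34, 38, 15]`, should map to roughly `[0.1333, 0.1490, 0.0588]`. A minimal sketch in plain Python (the helper name `scale_pixel` is made up for illustration):

```python
def scale_pixel(pixel, max_value=255.0):
    """Map each channel from the range [0, max_value] to [0, 1]."""
    return [channel / max_value for channel in pixel]

first_pixel = [34, 38, 15]  # first RGB pixel printed by the loading cell
scaled = scale_pixel(first_pixel)
print([round(v, 4) for v in scaled])

# Every scaled channel stays within [0, 1].
assert all(0.0 <= v <= 1.0 for v in scaled)
```
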

In [19]:
images = images / 255.0 # scale pixel values to [0, 1]; dividing by a float keeps the result a float

# Inspect scaled first row of pixels of the first image in RGB
print(images[0][0])

[[1.33333340e-01 1.49019614e-01 5.88235296e-02]
 [1.33333340e-01 1.49019614e-01 5.88235296e-02]
 [1.33333340e-01 1.52941182e-01 6.27451017e-02]
 [1.41452208e-01 1.57138482e-01 6.69424012e-02]
 [1.49019614e-01 1.64705887e-01 7.45098069e-02]
 [1.52941182e-01 1.60784319e-01 7.45098069e-02]
 [1.60010785e-01 1.64705887e-01 7.84313753e-02]
 [1.68627456e-01 1.60784319e-01 7.84313753e-02]
 [1.71537995e-01 1.63694859e-01 8.13419148e-02]
 [1.64705887e-01 1.68627456e-01 8.23529437e-02]
 [1.62676170e-01 1.69692099e-01 8.34175870e-02]
 [1.72334552e-01 1.68841913e-01 8.62745121e-02]
 [1.79136023e-01 1.71292886e-01 8.89399499e-02]
 [1.79209217e-01 1.72989845e-01 8.73893574e-02]
 [1.84313729e-01 1.80874690e-01 9.30530056e-02]
 [1.91697299e-01 1.83854163e-01 1.01501226e-01]
 [1.94548190e-01 1.82375923e-01 1.08273678e-01]
 [2.19073728e-01 2.06934735e-01 1.33803606e-01]
 [2.32903570e-01 2.17516437e-01 1.51095286e-01]
 [3.48524272e-01 3.21264595e-01 2.53724575e-01]
 [7.15902805e-01 6.87977731e-01 6.273619

# Split the data into train and test data

In [20]:
X_train, X_test, y_train, y_test = train_test_split(images, labels, test_size=0.2, random_state=42)
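
With `test_size=0.2` and 400 images, the split should leave 320 images for training and 80 for testing. The arithmetic below mirrors how scikit-learn derives the sizes for a fractional `test_size` (test size rounded up, the remainder used for training). `split_sizes` is a hypothetical helper, not part of scikit-learn.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size."""
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test

n_train, n_test = split_sizes(400, 0.2)
print(n_train, n_test)
```

Passing `stratify=labels` to `train_test_split` would additionally keep the class ratio the same in both parts, which can matter for a dataset this small.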

# Create the model

In [21]:
# sequential means that the layers are stacked on top of each other leading to a feed forward neural network
model = tf.keras.models.Sequential([
    # Adding a convolutional layer to detect features in the images
    
    # 16 filters to detect 16 different features like edges, textures, structure etc.
    # 3x3 kernel size to detect features in a 3x3 area
    # 1 stride to move the kernel by 1 pixel
    # relu activation function to introduce non-linearity
    # input shape: height, width, colors/channels (RGB)
    tf.keras.layers.Conv2D(16, (3, 3), 1, activation='relu', input_shape=(256, 256, 3)), # output shape: 254x254x16 (height, width, features)
    
    # Adding a max pooling layer to reduce the dimensionality of the data
    # achieved by taking the maximum value of a 2x2 area as the representative value for that area
    tf.keras.layers.MaxPool2D(), # output shape: 127x127x16
    
    # Adding more layers to continuously learn more complex features
    # Continuous learning of more complex features is achieved by increasing the number of filters
    # A more compact representation of the data is achieved by halving the spatial size after each convolution via max pooling
    tf.keras.layers.Conv2D(32, (3, 3), 1, activation='relu'), # output shape: 125x125x32
    tf.keras.layers.MaxPool2D(), # output shape: 62x62x32
    tf.keras.layers.Conv2D(64, (3, 3), 1, activation='relu'), # output shape: 60x60x64
    tf.keras.layers.MaxPool2D(), # output shape: 30x30x64 (height, width, features)
    
    # Adding a flatten layer to transform the 2D data into a 1D vector
    tf.keras.layers.Flatten(), # output shape: 57600
    
    # Adding a dense layer to learn the classification
    tf.keras.layers.Dense(256, activation='relu'),
    
    # Adding the output layer to classify the images - either as a Kaninchen or as a Feldhase
    tf.keras.layers.Dense(1, activation='sigmoid')
])
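
The output shapes noted in the comments above can be checked by hand. The sketch below assumes unpadded ("valid") 3x3 convolutions with stride 1 and 2x2 max pooling, which are the Keras defaults used in this model. The helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3, stride=1):
    """Output size of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling; a leftover row/column is dropped."""
    return size // pool

size, channels = 256, 3
shapes = []
total_params = 0
for filters in (16, 32, 64):
    # Each filter has 3*3*channels weights plus one bias.
    total_params += (3 * 3 * channels + 1) * filters
    size = conv_out(size)
    shapes.append((size, size, filters))
    size = pool_out(size)
    shapes.append((size, size, filters))
    channels = filters

flat = size * size * channels        # length of the flattened vector
total_params += (flat + 1) * 256     # Dense(256): weights plus biases
total_params += (256 + 1) * 1        # Dense(1): weights plus bias

print(shapes)
print(flat, total_params)
```

Almost all of the parameters sit in the first dense layer, because it connects every one of the 57600 flattened values to 256 units. Calling `model.summary()` prints the same shapes and counts for the actual model.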

# Configure the model

In [22]:
# 'compile' configures the selected neural network model
# 'adam' adapts the learning rate of each parameter individually, aiming to minimize the loss function (binary crossentropy in our case)
# the metrics are used to measure the performance of our model
model.compile('adam', loss=tf.losses.BinaryCrossentropy(), metrics=['accuracy'])
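
To make the loss function more concrete, here is a minimal plain-Python sketch of binary cross-entropy, averaged over a batch. It is not the Keras implementation. The clipping constant `eps` is an assumption chosen for illustration, and the example predictions are made up.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical predictions: confident and right vs. confident and wrong.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The loss grows sharply for confident wrong predictions, so a model can show a fairly high accuracy and still a noticeable loss if a few of its mistakes are made with high confidence.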

# Train the model

In [23]:
# 'fit' trains the model
# 'validation_data' is used to evaluate the model on the test data after each epoch
# 'epochs' is the number of complete passes over the training data
hist = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
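
When `fit` receives NumPy arrays and no `batch_size`, Keras uses batches of 32 samples. Assuming the 320/80 split that follows from `test_size=0.2` on 400 images, the number of batches per epoch can be estimated as below. `steps_per_epoch` is a helper name made up for this sketch.

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

train_steps = steps_per_epoch(320)   # 80% of the 400 images
eval_steps = steps_per_epoch(80)     # remaining 20%
total_updates = train_steps * 20     # 20 epochs

print(train_steps, eval_steps, total_updates)
```

The three evaluation steps are consistent with the `3/3` shown in the evaluation output further down.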


In [24]:
# Evaluate the model based on the test data
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2) # verbose=2 to suppress the progress bar
print(f'Test accuracy: {test_acc}')

3/3 - 0s - loss: 0.6342 - accuracy: 0.8375 - 231ms/epoch - 77ms/step
Test accuracy: 0.8374999761581421
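
The sigmoid output is a single probability per image, so a class label is obtained by thresholding it. The sketch below uses made-up probabilities and labels and assumes a threshold of 0.5. The helper names `predict_labels` and `accuracy` are hypothetical.

```python
def predict_labels(probabilities, threshold=0.5):
    """Turn sigmoid outputs into class indices 0 or 1."""
    return [1 if p > threshold else 0 for p in probabilities]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

probs = [0.92, 0.15, 0.61, 0.48]   # hypothetical model outputs
truth = [1, 0, 0, 1]               # hypothetical true labels
preds = predict_labels(probs)
print(preds, accuracy(truth, preds))

# Number of correct predictions implied by an accuracy of 0.8375 on 80 test images:
print(round(0.8375 * 80))
```

On the actual test set, `model.predict(X_test)` would supply the probabilities. An accuracy of 0.8375 on 80 images means that 67 of them were classified correctly.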
