# Catch Basin Classifier
An image classifier written in Python with TensorFlow. It sorts images of catch basins into three classes.

The three classes are:
* `blocked` 🠊 0
* `clear` 🠊 1
* `partial` 🠊 2
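
The integer labels follow the alphabetical order of the class names. As a minimal sketch of how a prediction could be mapped back to a name, assuming the model returns one score per class (the helper `decode_prediction` and the example scores are hypothetical, not part of the notebook):

```python
# Class names in label order: index 0, 1, 2
CLASS_NAMES = ["blocked", "clear", "partial"]


def decode_prediction(scores):
    """Return (index, name) of the highest-scoring class."""
    if len(scores) != len(CLASS_NAMES):
        raise ValueError("expected one score per class")
    # Index of the largest score; works for logits and probabilities alike
    index = max(range(len(scores)), key=lambda i: scores[i])
    return index, CLASS_NAMES[index]


# Made-up scores for illustration only
print(decode_prediction([0.1, 0.2, 0.7]))  # (2, 'partial')
```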

## Imports

In [2]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from PIL import Image
from glob import glob
import os
import secrets

## Compute Average Image Size
The model needs every input image to have the same width and height, so all images are resized to the average dimensions of the data set. The cells below collect the width and height of each image and then compute the averages.

In [3]:
widths = []
heights = []

for path in glob("data/**/*.JPG"):
    with Image.open(path) as img:
        widths.append(img.width)
        heights.append(img.height)

In [5]:
image_size = round(sum(widths) / len(widths)), round(sum(heights) / len(heights))
image_size

(554, 732)
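
One detail worth checking: the tuple above is ordered (width, height), whereas the `image_size` argument of `image_dataset_from_directory` and the model's `input_shape` are documented as (height, width). The notebook uses the tuple consistently, so it still runs, but the images may be resized to a swapped aspect ratio. A minimal sketch of a helper that returns the average in (height, width) order is shown here; the name `average_image_size` and the sample numbers are assumptions for illustration:

```python
def average_image_size(widths, heights):
    """Return the rounded mean size as (height, width), the order Keras expects."""
    if not widths or len(widths) != len(heights):
        raise ValueError("widths and heights must be non-empty and equally long")
    mean_width = round(sum(widths) / len(widths))
    mean_height = round(sum(heights) / len(heights))
    return mean_height, mean_width


# Two made-up portrait images, 400x800 and 600x1000 (width x height)
print(average_image_size([400, 600], [800, 1000]))  # (900, 500)
```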

## Load and Prepare Data
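Load the images from the `data` directory and split them into two subsets: *training* (80%) and *validation* (20%). Both calls use the same `seed` so that the two subsets do not overlap.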

In [6]:
training_dataset = tf.keras.utils.image_dataset_from_directory("data", validation_split=0.2, subset="training", seed=321, image_size=image_size)
validation_dataset = tf.keras.utils.image_dataset_from_directory("data", validation_split=0.2, subset="validation", seed=321, image_size=image_size)

# Get list of classnames to verify that the class names were interpreted correctly
training_dataset.class_names

Found 60 files belonging to 3 classes.
Using 48 files for training.
Found 60 files belonging to 3 classes.
Using 12 files for validation.


['blocked', 'clear', 'partial']
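
The file counts in the output follow from the `validation_split` value. A minimal sketch of the arithmetic, assuming the validation count is the truncated product of the split fraction and the number of files (the helper `split_counts` is hypothetical):

```python
def split_counts(num_files, validation_split):
    """Return (training, validation) file counts for a fractional split.

    Assumption: the validation share is truncated to an integer and the
    remaining files are used for training.
    """
    if not 0 < validation_split < 1:
        raise ValueError("validation_split must be between 0 and 1")
    num_validation = int(validation_split * num_files)
    return num_files - num_validation, num_validation


print(split_counts(60, 0.2))  # (48, 12)
```

With only 12 validation images, a single misclassified image changes the validation accuracy by roughly 8 percentage points.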

## Construction of the Model
Construct a convolutional neural network. A `Rescaling` layer is added first to normalize the RGB values from the range [0, 255] to [0, 1].

In [15]:
model = Sequential([
    layers.Rescaling(1./255, input_shape=(*image_size, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(3)  # raw logits; the loss below is configured with from_logits=True
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
rescaling_1 (Rescaling)      (None, 554, 732, 3)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 554, 732, 16)      448       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 277, 366, 16)      0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 277, 366, 32)      4640      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 138, 183, 32)      0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 138, 183, 64)      18496     
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 69, 91, 64)        0         
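
The shapes and parameter counts in the summary can be checked by hand. The sketch below assumes 3x3 kernels with `padding='same'` (so a convolution keeps the spatial size) and the default 2x2 max pooling (so each pooling step halves the size, rounding down). The helper names are illustrative only:

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def pooled(size, pool=2):
    """Spatial size after max pooling with a pool x pool window (floor division)."""
    return size // pool


shape = (554, 732)  # input size taken from the summary
channels = 3        # RGB
for filters in (16, 32, 64):
    params = conv2d_params(3, channels, filters)
    shape = (pooled(shape[0]), pooled(shape[1]))
    channels = filters
    print(filters, params, shape)
# 16 448 (277, 366)
# 32 4640 (138, 183)
# 64 18496 (69, 91)
```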

## Training
Train the model on `training_dataset` for 10 epochs, using `validation_dataset` to evaluate it after each epoch.

In [16]:
epochs = 10

history = model.fit(
  training_dataset,
  validation_data=validation_dataset,
  epochs=epochs
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## Training Analysis
Check how the model performed by printing the accuracy and loss from the final epoch for both the training and validation sets.

In [17]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

print('accuracy:', acc[-1], 'val_accuracy:', val_acc[-1])

loss = history.history['loss']
val_loss = history.history['val_loss']

print('loss', loss[-1], 'val_loss:', val_loss[-1])

accuracy: 0.6041666865348816 val_accuracy: 0.3333333432674408
loss 1.7774839401245117 val_loss: 6.487751483917236
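
For three classes, guessing at random gives an accuracy of about 0.33 if the classes are roughly balanced, which is where the validation accuracy sits. Together with a validation loss well above the training loss, this suggests the model has not generalized from the 48 training images. A small sketch of that comparison follows; the helper `summarize_history` and the tolerance are assumptions, and the numbers are copied from the output above:

```python
def summarize_history(history, num_classes, tolerance=1e-6):
    """Compare final-epoch accuracy with chance level and with validation accuracy."""
    acc = history["accuracy"][-1]
    val_acc = history["val_accuracy"][-1]
    chance = 1 / num_classes  # assumes roughly balanced classes
    return {
        "chance_accuracy": chance,
        "accuracy_gap": acc - val_acc,
        # tolerance absorbs float32 rounding in the recorded metrics
        "val_beats_chance": val_acc > chance + tolerance,
    }


# Final-epoch values copied from the recorded output
recorded = {
    "accuracy": [0.6041666865348816],
    "val_accuracy": [0.3333333432674408],
}
summary = summarize_history(recorded, num_classes=3)
print(summary["val_beats_chance"])  # False
```

In the notebook, `history.history` could be passed in place of `recorded`.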


## Save the Model
Save the model so that it can be loaded again for future use.

In [18]:
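# Create the output directory if needed and save the model inside it under a unique name
os.makedirs('saved_models', exist_ok=True)
model.save(os.path.join('saved_models', f'model-{secrets.token_hex(3)}'))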

INFO:tensorflow:Assets written to: saved_model\assets
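
A call to `os.mkdir` raises `FileExistsError` when the cell is run a second time, and a bare file name places the model outside the new directory. One way to handle both is a small helper that creates the directory if needed and returns a unique path inside it. This is a sketch only; the name `unique_model_path` and its defaults are hypothetical:

```python
import os
import secrets
import tempfile


def unique_model_path(base_dir="saved_models", prefix="model"):
    """Create base_dir if it is missing and return a unique path inside it."""
    os.makedirs(base_dir, exist_ok=True)  # safe to call repeatedly
    # token_hex(3) yields 6 random hex characters
    return os.path.join(base_dir, f"{prefix}-{secrets.token_hex(3)}")


# Demonstrated in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    path = unique_model_path(os.path.join(tmp, "saved_models"))
    directory_created = os.path.isdir(os.path.dirname(path))
    print(os.path.basename(path))  # for example: model-1a2b3c
```

In the notebook, the returned path could be passed directly to `model.save`.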
