# Catch Basin Classifier
An image classifier written in Python with TensorFlow. It sorts images of catch basins into three classes.

The classes and their integer labels are:
* `blocked` 🠊 0
* `clear` 🠊 1
* `partial` 🠊 2
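The integer labels follow from the alphabetical order of the class directory names, which is how Keras assigns labels when no explicit class list is given. A minimal sketch of that mapping, assuming the three directories are named after the classes (the `directory_names` list is illustrative):

```python
# Hypothetical directory names, listed in arbitrary order.
directory_names = ["partial", "clear", "blocked"]

# Sorting the names alphabetically gives each class its integer label.
class_names = sorted(directory_names)
label_of = {name: index for index, name in enumerate(class_names)}

print(label_of)  # {'blocked': 0, 'clear': 1, 'partial': 2}
```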

## Imports

In [12]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from PIL import Image
from glob import glob
import numpy as np
import matplotlib.pyplot as plt

## Compute Average Image Size
The network expects every input image to have the same dimensions. The average width and height across the dataset are computed here, and all images are later resized to that average.

In [10]:
widths = []
heights = []

for path in glob("data/**/*.JPG"):
    with Image.open(path) as img:
        widths.append(img.width)
        heights.append(img.height)

In [11]:
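# Rounded average (width, height). Note that Keras' `image_size` argument is
# interpreted as (height, width), so this tuple is read in that order when loading.
image_size = round(sum(widths) / len(widths)), round(sum(heights) / len(heights))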
image_size

(554, 732)
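The averaging step can be checked on a few made-up sizes without opening any files. The sketch below is an illustration only: `average_size` is a hypothetical helper, the sizes are invented, and it returns the result in `(height, width)` order on the assumption that the value is passed straight to a Keras loader.

```python
def average_size(sizes):
    """Return the rounded average size as (height, width).

    `sizes` is a list of (width, height) pairs, the order Pillow reports them.
    """
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    avg_width = round(sum(widths) / len(widths))
    avg_height = round(sum(heights) / len(heights))
    return avg_height, avg_width

# Hypothetical (width, height) pairs for three images.
sample_sizes = [(600, 800), (500, 700), (560, 700)]
print(average_size(sample_sizes))  # (733, 553)
```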

## Load and Prepare Data
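Load the images from the `data` directory and split them into two subsets: 80% for *training* and 20% for *validation*. Both calls use the same `seed` so that the two subsets do not overlap.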

In [62]:
training_dataset = tf.keras.utils.image_dataset_from_directory("data", validation_split=0.2, subset="training", seed=321, image_size=image_size)
validation_dataset = tf.keras.utils.image_dataset_from_directory("data", validation_split=0.2, subset="validation", seed=321, image_size=image_size)

# List the class names to verify that the directory names were interpreted correctly
training_dataset.class_names

Found 47 files belonging to 3 classes.
Using 38 files for training.
Found 47 files belonging to 3 classes.
Using 9 files for validation.


['blocked', 'clear', 'partial']
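The file counts printed above are consistent with the validation share being truncated to a whole number of files. A small sketch of that arithmetic, assuming truncation and the Keras default batch size of 32 (`split_counts` is a hypothetical helper, not a Keras function):

```python
import math

def split_counts(total_files, validation_split):
    """Return (training, validation) file counts, truncating the validation share."""
    num_validation = int(validation_split * total_files)
    return total_files - num_validation, num_validation

train_count, val_count = split_counts(47, 0.2)
print(train_count, val_count)  # 38 9

# With a batch size of 32, the training set is served in this many batches per epoch.
print(math.ceil(train_count / 32))  # 2
```

With only 9 validation images, each one accounts for roughly 11 percentage points of validation accuracy, so that metric is coarse.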

## Construction of the Model
Construct a convolutional neural network. A `Rescaling` layer is added first to normalize the RGB values from the `[0, 255]` range to `[0, 1]`.

In [5]:
model = Sequential([
    layers.Rescaling(1./255, input_shape=(*image_size, 3)),
    layers.Conv2D(16, 3, padding='same', activation='tanh'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='tanh'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='tanh'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(3)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 rescaling (Rescaling)       (None, 554, 732, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 554, 732, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 277, 366, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 277, 366, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 138, 183, 32)     0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 138, 183, 64)      1
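
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes square 3x3 kernels with `padding='same'` (spatial size unchanged by the convolution) and the default 2x2 max pooling, which halves each dimension and rounds down. The helper names are illustrative.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def pooled(size, pool=2):
    """Spatial size after max pooling with a `pool` x `pool` window."""
    return size // pool

height, width, channels = 554, 732, 3
for filters in (16, 32, 64):
    params = conv2d_params(3, channels, filters)
    print(f"Conv2D       -> ({height}, {width}, {filters}), params={params}")
    height, width, channels = pooled(height), pooled(width), filters
    print(f"MaxPooling2D -> ({height}, {width}, {channels})")
```

The first two convolution layers come out at 448 and 4640 parameters, matching the rows listed in the summary.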

## Training
Train the model on `training_dataset` for 15 epochs, evaluating it on `validation_dataset` at the end of each epoch.

In [6]:
epochs = 15

history = model.fit(
  training_dataset,
  validation_data=validation_dataset,
  epochs=epochs
)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


## Training Analysis
Check how the model performed by printing the accuracy and loss from the final epoch for both the training and validation sets.

In [20]:
import matplotlib.pyplot as plt

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

print('accuracy:', acc[-1], 'val_accuracy:', val_acc[-1])

loss = history.history['loss']
val_loss = history.history['val_loss']

print('loss:', loss[-1], 'val_loss:', val_loss[-1])

accuracy: 0.9444444179534912 val_accuracy: 0.5
loss: 0.17278875410556793 val_loss: 8.411188125610352
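
The gap between the training and validation figures suggests the model is overfitting the small training set. The validation loss is large because cross-entropy heavily penalizes confident wrong predictions. The sketch below recomputes the per-example loss from raw logits in plain Python, the same quantity `SparseCategoricalCrossentropy(from_logits=True)` averages over a batch. The logit values are invented for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, logits):
    """Negative log-probability assigned to the true integer label."""
    return -math.log(softmax(logits)[label])

# Hypothetical logits for the classes [blocked, clear, partial].
logits = [6.0, 0.0, 0.0]
confident_right = sparse_categorical_crossentropy(0, logits)  # true class: blocked
confident_wrong = sparse_categorical_crossentropy(1, logits)  # true class: clear

print(f"right: {confident_right:.3f}, wrong: {confident_wrong:.3f}")
```

A single confident mistake contributes a loss above 6 here, while a confident correct prediction contributes almost nothing, so a few such mistakes among 9 validation images can dominate the average.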


## Save the Model
Save the trained model to the `saved_model` directory so that it can be reloaded later without retraining.

In [17]:
model.save("saved_model")

INFO:tensorflow:Assets written to: saved_model/assets
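
Because the final `Dense(3)` layer outputs raw logits, a prediction from the reloaded model still has to be mapped back to a class name. A minimal sketch of that step follows. `decode_prediction` is a hypothetical helper, and the commented-out loading code assumes the same TensorFlow 2 setup used in this notebook.

```python
CLASS_NAMES = ["blocked", "clear", "partial"]

def decode_prediction(logits, class_names=CLASS_NAMES):
    """Return the class name whose logit is largest."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return class_names[best_index]

# Hypothetical usage with the saved model (not run here):
#   model = tf.keras.models.load_model("saved_model")
#   batch = ...  # one image resized to the training size, shape (1, height, width, 3)
#   logits = model.predict(batch)[0]

# Invented logits standing in for a real model output.
print(decode_prediction([0.3, 2.1, -1.0]))  # clear
```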
