# Convolutional vs. Dense Network

This notebook compares two neural-network approaches to semantic segmentation: a convolutional network similar to a U-Net, and a densely connected network of our own design.

Running the full notebook will train the model type that is specified by modelType below, show some basic visualizations of the predictions, and save the predicted video data.

Always run this block first: it seeds the random number generators, loads the utils script and the dataset labels, and defines imageSize and other important variables.

In [None]:
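# The wildcard import is relied on to provide tf, np, random, and Image,
# plus the helper functions used below (loadLabels, loadDataset, createUNet, ...)
from utils import *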

# Seed randomizers
tf.random.set_seed(1234)
random.seed(1234)

labels = loadLabels('CamVid/')

imageSize = (320, 256)

trainSamples = 369
testSamples = 20
imagesPerBatch = 5
# Defines the size of the tiles the densely connected model uses
poolSize = 32


# Choose from "dense" or "convolutional"
modelType = "convolutional"

In [None]:
# Load the training dataset
train_x, train_y = loadDataset('CamVid/', 'train', trainSamples, labels, imageSize)

# Convolutional Network

This code defines the convolutional network.

uNetDepth specifies the number of downsampling steps the model performs.

kernelSize specifies the size of the convolutional window.

filters controls the number of convolutional filters in the first convolutional block. (Note: increasing this very quickly increases the number of trainable parameters.)
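
To make the effect of these three settings concrete, here is a small sketch. createUNet is defined in utils and is not shown here, so the sketch assumes the usual U-Net convention: each downsample halves the height and width and doubles the filter count. The helper names `unet_level_shapes` and `conv_params` are hypothetical and exist only for this illustration.

```python
def unet_level_shapes(height, width, base_filters, depth):
    """Return (height, width, filters) for level 0 through `depth`.

    Assumption: each downsample halves both spatial dimensions and doubles
    the filter count. This is the common U-Net pattern, not a confirmed
    property of createUNet.
    """
    shapes = []
    for level in range(depth + 1):
        scale = 2 ** level
        shapes.append((height // scale, width // scale, base_filters * scale))
    return shapes


def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of one standard 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


# Values from this notebook: 256 x 320 input, filters = 4, uNetDepth = 5
shapes = unet_level_shapes(256, 320, base_filters=4, depth=5)
for level, (h, w, f) in enumerate(shapes):
    print(f"level {level}: {h} x {w} x {f}")

# Doubling the filter count roughly quadruples the parameters of a
# filters -> filters convolution, since both channel counts double
small = conv_params(3, 4, 4)
large = conv_params(3, 8, 8)
print(small, large)
```

Under these assumptions the input height and width should both be divisible by 2 ** uNetDepth, which holds for 256 x 320 with a depth of 5.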

In [None]:
uNetDepth = 5
kernelSize = 3
filters = 4
model = createUNet([imageSize[1], imageSize[0], 3], kernelSize, uNetDepth, filters, len(labels))
batchSize = imagesPerBatch

# Dense Network

This block only has an effect if the dense model is selected.

In [None]:
if modelType == "dense":
    model = createDenseNet([poolSize, poolSize, 3], len(labels))
    
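    # Split each image and its label mask into poolSize x poolSize tiles
    train_x = tileImages(train_x, poolSize)
    train_y = tileImages(train_y, poolSize)
    
    # Scale the batch size by the number of tiles per image
    batchSize = imagesPerBatch * (train_x.shape[0] // trainSamples)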
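
The tiling step and the batch-size arithmetic can be illustrated with plain NumPy. tileImages and undoTiling live in utils and are not shown in this notebook, so the functions below (`tile_images_sketch`, `undo_tiling_sketch`) are hypothetical stand-ins that assume non-overlapping tiles and image dimensions divisible by the tile size. The real helpers may order or pad tiles differently.

```python
import numpy as np


def tile_images_sketch(images, tile):
    """Split (N, H, W, C) images into non-overlapping tile x tile patches."""
    n, h, w, c = images.shape
    assert h % tile == 0 and w % tile == 0, "image size must be a multiple of tile"
    rows, cols = h // tile, w // tile
    # (N, rows, tile, cols, tile, C) -> (N, rows, cols, tile, tile, C)
    tiles = images.reshape(n, rows, tile, cols, tile, c).transpose(0, 1, 3, 2, 4, 5)
    return tiles.reshape(n * rows * cols, tile, tile, c)


def undo_tiling_sketch(tiles, height, width, tile):
    """Inverse of tile_images_sketch: reassemble full (N, H, W, C) images."""
    rows, cols = height // tile, width // tile
    n = tiles.shape[0] // (rows * cols)
    c = tiles.shape[-1]
    images = tiles.reshape(n, rows, cols, tile, tile, c).transpose(0, 1, 3, 2, 4, 5)
    return images.reshape(n, height, width, c)


# Two random images at this notebook's resolution (height 256, width 320)
images = np.random.rand(2, 256, 320, 3)
tiles = tile_images_sketch(images, 32)
tiles_per_image = tiles.shape[0] // images.shape[0]
restored = undo_tiling_sketch(tiles, 256, 320, 32)

# With imagesPerBatch = 5, one batch covers five whole images' worth of tiles
batch_size = 5 * tiles_per_image
print(tiles.shape, tiles_per_image, batch_size)
```

With a 32-pixel tile, each 256 x 320 image yields 8 x 10 = 80 tiles, so the dense model's batch size works out to 5 x 80 = 400 tiles.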

In [None]:
# Compile and train the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_x, train_y, batch_size=batchSize, epochs=60)

In [None]:
# Delete training dataset to free memory
del train_x, train_y

In [None]:
# Save the trained model with the date it was trained
from datetime import datetime
dateString = datetime.now().strftime("%m.%d.%Y.%H")

model.save("Trained/trained " + dateString + ".keras")
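
One detail of the naming scheme is worth seeing in isolation: the format string stops at the hour. The sketch below reuses the same format string with fixed timestamps. The helper name `model_path` is hypothetical.

```python
from datetime import datetime


def model_path(timestamp):
    """Build the save path the same way the cell above does."""
    return "Trained/trained " + timestamp.strftime("%m.%d.%Y.%H") + ".keras"


# Two training runs that finish within the same hour
first = model_path(datetime(2024, 3, 5, 14, 5))
second = model_path(datetime(2024, 3, 5, 14, 55))
print(first)
print(first == second)
```

Because both runs map to the same file name, the later one would replace the earlier one. Appending `.%M` to the format string is one way to avoid that if it matters.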

# Load Model and Perform Visualization

Running the code below prompts the user to select a trained model (found in the Trained folder) to visualize.

In [None]:
from tkinter.filedialog import askopenfilename
from tkinter import Tk

root = Tk()
root.iconify()

fileName = askopenfilename(parent=root, title="Select trained model file", filetypes=[("Model files", "*.keras")])

root.destroy()

model = tf.keras.models.load_model(fileName)

In [None]:
test_x, test_y = loadDataset('CamVid/', 'test', testSamples, labels, imageSize)

This block only has an effect if the dense model is selected.

In [None]:
if modelType == "dense":
    test_x = tileImages(test_x, poolSize)
    test_y = tileImages(test_y, poolSize)

In [None]:
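# Use the same number of images per batch as in training; for the dense model
# this is scaled by the number of tiles per image. (Dividing by testSamples
# again would give a batch size of 0 for the convolutional model.)
batchSize = imagesPerBatch * (test_x.shape[0] // testSamples)
predictions = model.predict(test_x, batch_size=batchSize)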

This block only has an effect if the dense model is selected.

In [None]:
if modelType == "dense":
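    # Reassemble full-size images from the tiles, using the same tile size as tileImages
    test_x = undoTiling(test_x, imageSize, poolSize)
    test_y = undoTiling(test_y, imageSize, poolSize)
    predictions = undoTiling(predictions, imageSize, poolSize)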

In [None]:
import matplotlib.pyplot as plt

colors = np.array([label[1] for label in labels])

def colorizeImage(image):
    return colors[np.argmax(image, axis=2)]
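
colorizeImage combines two NumPy steps: an argmax over the class axis, then a palette lookup through integer indexing. A tiny example with made-up values shows both. The three palette colors and the 2 x 2 score map are hypothetical, not taken from the CamVid labels.

```python
import numpy as np

# Hypothetical palette: one RGB color per class
palette = np.array([[128, 128, 128], [128, 0, 0], [0, 128, 0]])

# Hypothetical per-pixel class scores with shape (height, width, classes)
scores = np.array([
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
])

# Step 1: pick the highest-scoring class for every pixel -> shape (2, 2)
class_ids = np.argmax(scores, axis=2)

# Step 2: index the palette with the class map -> shape (2, 2, 3)
colored = palette[class_ids]

print(class_ids)
print(colored.shape)
```

The same function works for the ground truth because a one-hot mask has its single 1 at the true class, so argmax recovers the class index.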

# Basic Visualization

This block displays several images from the test dataset alongside their ground truth and the model's prediction.

Change totalCheck to change the number of samples visualized (at most testSamples).

In [None]:
totalCheck = 10

fig = plt.figure(figsize=(18, 6 * totalCheck))
for checkNum in range(totalCheck):
    predictionToShow = predictions[checkNum]
    groundTruth = test_y[checkNum]
    
    originalImage = test_x[checkNum]
    groundImage = (colorizeImage(groundTruth) / 255.0 + originalImage) * 0.5
    predictionImage = (colorizeImage(predictionToShow) / 255.0 + originalImage) * 0.5

    fig.add_subplot(totalCheck, 3, checkNum * 3 + 1) 
    plt.axis('off')
    if not checkNum:
        plt.title("Original")
    plt.imshow(originalImage, interpolation='nearest')
    
    fig.add_subplot(totalCheck, 3, checkNum * 3 + 2) 
    plt.axis('off')
    if not checkNum:
        plt.title("Ground Truth")
    plt.imshow(groundImage, interpolation='nearest')
    
    fig.add_subplot(totalCheck, 3, checkNum * 3 + 3)
    plt.axis('off')
    if not checkNum:
        plt.title("Prediction")
    plt.imshow(predictionImage, interpolation='nearest')

plt.show()
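
The overlay arithmetic used above is a plain 50/50 blend. The sketch below repeats it on a single pixel in two forms: the float form used for plotting and the 8-bit form used when exporting frames. It assumes the images are floats in [0, 1], which is what the scaling by 255 elsewhere in the notebook suggests. The pixel values are made up.

```python
import numpy as np

# One hypothetical pixel: a float image in [0, 1] and a mask color in 0..255
image = np.array([[[0.2, 0.4, 0.6]]])
mask_color = np.array([[[128, 0, 255]]])

# Float blend, as used for plotting: the result stays within [0, 1]
overlay_float = (mask_color / 255.0 + image) * 0.5

# 8-bit blend: widen the integer type before adding so the sum cannot wrap
image_uint8 = (image * 255).astype(np.uint8)
overlay_uint8 = ((mask_color.astype(np.int32) + image_uint8) / 2).astype(np.uint8)

# For comparison, adding two uint8 arrays directly wraps around at 256
wrapped = np.array([200], dtype=np.uint8) + np.array([100], dtype=np.uint8)

print(overlay_float, overlay_uint8, wrapped)
```

If the palette array were ever stored as uint8, the 8-bit blend would need the explicit widening shown here to avoid the wrap-around.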

# Generate GIFs
The code below uses the trained model loaded above to predict on the test set, then compiles the video frames into GIFs.

It does not need to be run to test the models.

In [None]:
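labels = loadLabels('CamVid/')
imageSize = [320, 256]
# Number of test frames to load, in order, for the GIFs
gifSamples = 232
test_x, test_y = loadDataset('CamVid/', 'test', gifSamples, labels, imageSize, shuffle=False)


# Must match the architecture of the loaded model
modelType = "dense"


if modelType == "dense":
    test_x = tileImages(test_x, poolSize)

# Predict one image (or one image's worth of tiles) per batch
predictions = model.predict(test_x, batch_size=test_x.shape[0] // gifSamples)

if modelType == "dense":
    test_x = undoTiling(test_x, imageSize, poolSize)
    predictions = undoTiling(predictions, imageSize, poolSize)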

In [None]:
from PIL import Image

originalImg = []
groundTruth = []
predictImgs = []

for i in range(len(test_x)):
    # Scale the image to 8-bit RGB for PIL
    imageData = (test_x[i] * 255).astype(np.uint8)
    image = Image.fromarray(imageData)
    # Blend the colorized masks 50/50 with the original frame
    groundImage = Image.fromarray(((colorizeImage(test_y[i]) + imageData) / 2).astype(np.uint8))
    predictionImage = Image.fromarray(((colorizeImage(predictions[i]) + imageData) / 2).astype(np.uint8))
    # Progress indicator, overwritten in place
    print(i, end='\r')

    originalImg.append(image)
    groundTruth.append(groundImage)
    predictImgs.append(predictionImage)

In [None]:
originalImg[0].save("OriginalImg.gif", save_all=True, append_images=originalImg[1:])
groundTruth[0].save("GroundTruth.gif", save_all=True, append_images=groundTruth[1:])
predictImgs[0].save("Predictions.gif", save_all=True, append_images=predictImgs[1:])
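
The save calls above rely on Pillow's defaults for frame timing and looping. As a minimal sketch, the example below builds three solid-color frames, writes them to an in-memory buffer instead of a file, and sets the `duration` (milliseconds per frame) and `loop` (0 means loop forever) options explicitly. The frame contents and the 100 ms timing are arbitrary choices for illustration.

```python
import io

from PIL import Image

# Three distinct solid-color frames standing in for video frames
frame_colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
frames = [Image.new("RGB", (16, 16), color) for color in frame_colors]

# Write an animated GIF into memory rather than to disk
buffer = io.BytesIO()
frames[0].save(
    buffer,
    format="GIF",
    save_all=True,             # write every frame, not just the first
    append_images=frames[1:],  # the remaining frames, in order
    duration=100,              # milliseconds each frame is shown
    loop=0,                    # 0 = repeat indefinitely
)

# Read the GIF back to confirm all frames were stored
buffer.seek(0)
with Image.open(buffer) as gif:
    frame_count = gif.n_frames

print(frame_count)
```

Passing the same two options to the three save calls above would give the exported GIFs a fixed playback speed.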