# Street Roughness Predictor for Bikes
### Written by: Michael Krebsbach, Braden Michelson, and Oren Erlich

# Imports
We make use of the following libraries and external code sources: 
- **TensorFlow**: defines, trains, and evaluates the neural network (NN)
- **numpy**: for data manipulation
- **matplotlib**: plotting the data and images
- The remaining imports handle file paths, image loading, and mounting Google Drive in Colab

Tutorials used: https://www.tensorflow.org/tutorials/load_data/images

In [None]:
# TensorFlow and tf.keras
import tensorflow as tf

# Allow access to Google Drive
from google.colab import drive

# File and image handling
import os
import pathlib
import PIL
import PIL.Image
import tensorflow_datasets as tfds

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

print(tf.__version__)


2.8.0


# Pre-Process Data

Create a shortcut to the Google Drive folder linked below in your main Google Drive directory ("My Drive"). The resulting path should be `My Drive/Pictures`. If the directory is placed anywhere else, you will need to adjust the `data_dir` string to point to the correct location.

This directory contains the pre-sorted training images of the asphalt at different roughness levels that will be fed into the neural network to generate the model.

https://drive.google.com/drive/folders/1sgRUGI_mXmF1Ue8zNnxHRU0tR5-QgBfU?usp=sharing

In [None]:

drive.mount('/content/gdrive', force_remount=True)
data_dir = "gdrive/My Drive/Pictures"
data_dir = pathlib.Path(data_dir)


Mounted at /content/gdrive


In this step we simply get the number of image files recognized within Colab. The target number, assuming that the above step was done correctly, should be 99.

In [None]:
image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)

99
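Beyond the total count, it can be useful to check how the images are spread across the roughness folders. The following is a minimal sketch, not part of the original notebook; the helper name `count_images_per_class` is hypothetical, and it assumes the same one-level folder layout that the `*/*.jpg` glob above relies on.

```python
import pathlib
from collections import Counter

def count_images_per_class(root, pattern="*/*.jpg"):
    """Return {folder_name: number_of_matching_files} for one level of subfolders."""
    root = pathlib.Path(root)
    # Each matching file is counted under the name of the folder that contains it.
    return dict(Counter(path.parent.name for path in root.glob(pattern)))

# Hypothetical usage inside the notebook, after data_dir has been defined:
# counts = count_images_per_class(data_dir)
# print(sorted(counts.items()))
```

If one folder holds far fewer images than the others, the accuracy reported later may be dominated by the larger classes.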


We get the list of all .jpg files in the mounted directory, then render the first in the list to verify that it is not corrupted. 

The rendering code has been commented out because it is quite slow to run. Uncomment it if you want to try it!

In [None]:
road = list(data_dir.glob('*/*.jpg'))
# PIL.Image.open(str(road[0]))


NameError: ignored

# Initialize the Neural Network

Here we set up the parameters for the initial data set. You may notice that the `img_height` and `img_width` variables do not match the resolution of the input images in the provided `/Pictures/` directory. This is due both to limits on free Google GPU capacity and to the improvement in accuracy we observed when images were resized to this size.

We initialize the training and validation sets of images (`train_ds` and `val_ds`, respectively) using an 80/20 training-to-validation split. The class names are derived from the names of the folders that contain the images.

The number of training images divided by the batch size, rounded up, defines how many training steps (batches) the model runs per epoch. Each step trains on one batch of up to `batch_size` images. We used a batch size of 32, which with 80 training images results in 3 steps per epoch.
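The step count can be checked with a short calculation. This is a sketch, assuming the loader assigns 80 of the 99 images to the training subset (the figure it reports when the cell below is run); the helper name `steps_per_epoch` is ours for illustration and is not a Keras function.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once; the last batch may be smaller."""
    return math.ceil(num_images / batch_size)

num_train = 80    # training images reported by the loader for the 80/20 split
batch_size = 32

print(steps_per_epoch(num_train, batch_size))  # 3 (batches of 32, 32, and 16 images)
```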

The `cache()` call keeps the images in memory after the first epoch, and `AUTOTUNE` lets `tf.data` choose the prefetch buffer size at runtime; together they speed up training.

In [None]:
batch_size = 32
img_height = 256
img_width = 256


train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split= 0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

class_names = train_ds.class_names
num_classes = len(class_names)
print(class_names)

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)


Found 99 files belonging to 10 classes.
Using 80 files for training.
Found 99 files belonging to 10 classes.
Using 19 files for validation.
['1', '10', '2', '3', '4', '5', '6', '7', '8', '9']
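Note that the class names are sorted as strings, so `'10'` comes right after `'1'`. A predicted class index is therefore not the same as the roughness level. The sketch below shows one way to map an index back to a level, assuming the folder names are the integer roughness levels; the helper name `index_to_roughness` is hypothetical.

```python
# Class order as printed above: sorted as strings, not as numbers.
class_names = ['1', '10', '2', '3', '4', '5', '6', '7', '8', '9']

def index_to_roughness(class_index, class_names):
    """Translate a model output index into the roughness level named by its folder."""
    return int(class_names[class_index])

print(index_to_roughness(1, class_names))  # 10
print(index_to_roughness(2, class_names))  # 2
```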


# Build Neural Network

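The neural network is a sequential model made up of a rescaling layer, three 2D convolutional layers each followed by a max pooling layer, a flatten layer, and two dense layers. The rescaling layer maps pixel values from [0, 255] to [0, 1]. Each convolutional layer uses 128 filters with a 3x3 kernel and a ReLU activation function; we chose 128 filters to maximize accuracy without requiring too much GPU usage. The final dense layer has one output per class.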

In [None]:
model = tf.keras.Sequential([
  tf.keras.layers.Rescaling(1./255),
  tf.keras.layers.Conv2D(128, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(128, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(128, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(4096, activation='relu'),
  tf.keras.layers.Dense(num_classes)
])
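To get a feel for the size of this network, the spatial dimensions can be traced by hand. The sketch below assumes the Keras defaults used above: `Conv2D` with `'valid'` padding and stride 1, and `MaxPooling2D` with a 2x2 window. The helper names `conv_out` and `pool_out` are ours for illustration.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of 'valid' max pooling whose stride equals the pool size."""
    return size // pool

size = 256  # img_height == img_width == 256
for _ in range(3):  # three Conv2D + MaxPooling2D pairs
    size = pool_out(conv_out(size))

filters = 128
flat_features = size * size * filters              # length of the Flatten output
dense_units = 4096
dense_params = flat_features * dense_units + dense_units  # weights + biases

print(size, flat_features, dense_params)  # 30 115200 471863296
```

Under these assumptions, nearly all of the model's parameters sit in the first dense layer, which is worth keeping in mind given that the training set has only 80 images.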


# Compile Neural Network

We compile the neural network with the Adam optimizer, which combines momentum with per-parameter adaptive learning rates, and the sparse categorical cross-entropy loss (detailed here: https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy). The loss is created with `from_logits=True` because the final `Dense` layer has no softmax activation and therefore outputs raw logits.


In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
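For intuition about what this loss measures, here is a plain-Python sketch of sparse categorical cross-entropy computed from logits for a single example. It is an illustration of the formula, not the TensorFlow implementation; the function name is ours, and the Keras loss additionally averages over the batch.

```python
import math

def sparse_categorical_crossentropy_from_logits(logits, label):
    """Loss for one example: log-sum-exp of the logits minus the logit of the true class."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# A model with no preference among 10 classes (all logits equal) scores ln(10).
uniform_logits = [0.0] * 10
loss = sparse_categorical_crossentropy_from_logits(uniform_logits, label=3)
print(round(loss, 4))  # 2.3026
```

The value ln(10), about 2.30, is a useful reference point: a loss above it means the model is on average doing worse than an uninformed uniform guess, typically because it is confidently wrong on some images.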


# Train the model

Fit the model for 200 epochs, passing the training set and the validation set. (The epoch count can be increased, which may improve accuracy, though with so few images a gain on the validation set is not guaranteed.)

In [None]:
model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=200
)

Epoch 1/200
...
Epoch 77/200
Epoch 78

<keras.callbacks.History at 0x7ff95e567450>
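The `History` object returned by `model.fit` has a `history` attribute: a dictionary that maps each metric name (here `loss`, `accuracy`, `val_loss`, `val_accuracy`) to a list with one value per epoch. The cell above discards it; assigning it with `history = model.fit(...)` would make it available. The sketch below shows how such a dictionary could be inspected; the helper `best_epoch` is hypothetical and the numbers are made up for illustration, not results from this model.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch number, value) where the given metric is lowest."""
    values = history_dict[metric]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Made-up values for illustration only; in the notebook this would be history.history.
example_history = {"val_loss": [2.4, 2.1, 2.6, 3.9]}
print(best_epoch(example_history))  # (2, 2.1)
```

A validation loss that bottoms out early and then climbs while the training loss keeps falling is a common sign of overfitting.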

# Check Accuracy

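Evaluate the model on the validation set (`val_ds`) and print the resulting loss and accuracy. No separate test set is held out, so the "test" values below are validation metrics.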

In [None]:
test_loss, test_acc = model.evaluate(val_ds, verbose=2)

print('\nTest accuracy:', test_acc)


1/1 - 0s - loss: 5.6272 - accuracy: 0.4737 - 143ms/epoch - 143ms/step

Test accuracy: 0.4736842215061188
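To put this number in context, it helps to convert it back into image counts. The sketch below uses the 19 validation images and 10 classes reported earlier; the chance baseline assumes the classes are roughly balanced, which is our assumption rather than something the notebook verifies.

```python
val_images = 19                          # "Using 19 files for validation"
reported_accuracy = 0.4736842215061188   # value printed above
num_classes = 10

correct = round(reported_accuracy * val_images)  # images classified correctly
chance_accuracy = 1 / num_classes                # uniform guessing over balanced classes

print(correct, val_images, chance_accuracy)  # 9 19 0.1
```

So the model labels 9 of the 19 validation images correctly. That is well above the 10% chance level, but with so few images a single prediction shifts the accuracy by about 5 percentage points.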


test the model