Semantic Segmentation


This project uses a pre-trained VGG16 model and the KITTI Road dataset. On top of VGG16 I add the FCN-8 network described in the paper "Fully Convolutional Networks for Semantic Segmentation".
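To make the FCN-8 structure concrete, the sketch below walks through the feature-map sizes only, without any TensorFlow code. It assumes a 160x576 input (an illustrative size, not stated in this README), 'SAME'-padded stride-2 pooling in VGG16, and the skip pattern from the paper: upsample the final layer 2x and fuse with pool4, upsample 2x and fuse with pool3, then upsample 8x. The helper names are hypothetical.

```python
def pooled(shape, times):
    """Spatial size after `times` stride-2 poolings with 'SAME' padding."""
    h, w = shape
    for _ in range(times):
        h, w = (h + 1) // 2, (w + 1) // 2  # ceil division by 2
    return h, w


def upsampled(shape, stride):
    """Spatial size after a transposed convolution with 'SAME' padding."""
    return shape[0] * stride, shape[1] * stride


image = (160, 576)          # assumed input size, for illustration only
pool3 = pooled(image, 3)    # 1/8 resolution
pool4 = pooled(image, 4)    # 1/16 resolution
layer7 = pooled(image, 5)   # 1/32 resolution

up_to_pool4 = upsampled(layer7, 2)       # fused (added) with pool4
up_to_pool3 = upsampled(up_to_pool4, 2)  # fused (added) with pool3
output = upsampled(up_to_pool3, 8)       # back to the input resolution

print("pool3:", pool3, "pool4:", pool4, "layer7:", layer7)
print("output:", output)
```

The skip connections only work if each upsampled map matches the pooled map it is added to, which is why the input height and width here are multiples of 32.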

This code should make it easy for someone to load a pre-existing model, add layers, train, freeze, optimize, and perform a graph transform to 8 bit. It also includes simple graph, video, and image examples of the inference.


Frameworks and Packages

Make sure you have the following installed:


Download the KITTI Road dataset from here. Extract the dataset in the data folder. This will create the folder data_road with all the training and test images.

To Start


Run the following command to train the network and save the initial model and checkpoint for TensorFlow. This also creates the first set of inference samples. The other commands are not required to get samples:


Run the following command to get extra information about the models and to run the TensorFlow tools that freeze, optimize, and transform the graphs. This also creates the dot files and converts them to images:


Run the following command to see how the various graphs perform on video and images:


Test Environment

All testing was done on my lab system.

OS : Kubuntu 16.04

CPU : AMD Threadripper 1950x

RAM : 32 GB with 15-15-15 latency

TensorFlow Version: 1.4.0-rc1 (compiled from source)

GPU : GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.683

Total GPU Memory: 7.92GiB

Hyperparameters used

Epochs : 15

Batch Size : 5
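As a rough sense of what these two settings mean for training length, the sketch below counts optimizer steps. The training-set size of 289 images is an assumption about the KITTI Road training split, not a number taken from this project, and `batch_indices` is a hypothetical helper rather than the project's own batch generator.

```python
import math

EPOCHS = 15
BATCH_SIZE = 5
NUM_TRAIN_IMAGES = 289  # assumption: size of the KITTI Road training split


def batch_indices(num_items, batch_size):
    """Yield index ranges that cover num_items in chunks of batch_size."""
    for start in range(0, num_items, batch_size):
        yield range(start, min(start + batch_size, num_items))


batches = list(batch_indices(NUM_TRAIN_IMAGES, BATCH_SIZE))
steps_per_epoch = len(batches)
total_steps = steps_per_epoch * EPOCHS

print("steps per epoch:", steps_per_epoch)
print("size of last batch:", len(batches[-1]))
print("total steps:", total_steps)
```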


Cross Entropy Loss

Normal Semantic Segmentation Speeds

In this example the timings seem to suggest that the non-8-bit version was on average about 1 second slower. The test video was about 1700 frames; inference was performed on every frame and took about 0.5 seconds per frame. The timings for the first two frames, in seconds, are below.

Downscale : 0.002460714997141622

SS Test : 1.6782658910015016

Upscale : 0.001284024998312816

1.0 frames

Downscale : 0.001150920994405169

SS Test : 0.6872002170057385

Upscale : 0.0005270979963825084

2.0 frames
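The three numbers per frame correspond to three stages: downscaling the frame, running segmentation, and upscaling the result. A minimal sketch of how such per-stage timings can be collected is shown here. The stage functions are toy stand-ins operating on a list, not the project's real image or TensorFlow code, and the names `timed` and `process_frame` are hypothetical.

```python
import time


def timed(label, fn, *args):
    """Run fn(*args), print the elapsed wall-clock seconds, return both."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print(f"{label} : {elapsed}")
    return result, elapsed


def process_frame(frame, downscale, infer, upscale):
    """Time the three stages for a single frame."""
    small, t_down = timed("Downscale", downscale, frame)
    mask, t_ss = timed("SS Test", infer, small)
    full, t_up = timed("Upscale", upscale, mask)
    return full, {"Downscale": t_down, "SS Test": t_ss, "Upscale": t_up}


# Toy stand-ins so the sketch runs without images or a model.
def fake_downscale(frame):
    return frame[::2]


def fake_infer(frame):
    return [1 if value > 0 else 0 for value in frame]


def fake_upscale(mask):
    return [value for value in mask for _ in range(2)]


frame = list(range(-4, 4))
output, timings = process_frame(frame, fake_downscale, fake_infer, fake_upscale)
```

In a real run the first frame usually includes one-time setup cost, so it is worth reporting it separately from the steady-state frames.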

8 Bit Video Speeds

After freezing the graph, I use a TensorFlow tool called transform_graph to convert it, then perform inference on images and a video. The timings for the first couple of frames are below.

Downscale : 0.002117505995556712

SS Test : 1.8592919129878283

Upscale : 0.001217473007272929

1.0 frames

Downscale : 0.0011298349709250033

SS Test : 1.513521872984711

Upscale : 0.0006270769517868757

2.0 frames
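To put the two listings side by side, the snippet below compares the "SS Test" times copied from this README for the first two frames of each run. Only these two frames are listed here, so this says nothing about the average over the whole video, and the first frame of each run likely includes warm-up cost.

```python
# "SS Test" seconds for frames 1 and 2, copied from the listings in this README.
float_ss = [1.6782658910015016, 0.6872002170057385]
eight_bit_ss = [1.8592919129878283, 1.513521872984711]

ratios = []
for frame_number, (f, q) in enumerate(zip(float_ss, eight_bit_ss), start=1):
    ratio = q / f
    ratios.append(ratio)
    print(f"frame {frame_number}: float {f:.3f}s, 8-bit {q:.3f}s, "
          f"8-bit/float ratio {ratio:.2f}")
```

For these two frames the 8-bit graph takes longer per frame than the original graph, by roughly a factor of 2.2 on the second frame.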

8 bit Example 1

8 bit Example 2

8 bit Example 3


I have not yet been able to discern why some of these TensorFlow tools alter the graphs beyond use. At the moment the altered graphs eat up all available memory, so I could only create about 10 test images before running out. With the 8-bit graph I was able to process many more images, but it still hit the same problem. When I save and restore using TensorFlow's SavedModel structure and restore the checkpoint, I can perform inference on all my images and run over the whole video. I am going to keep working on this issue even though it is somewhat unrelated to the end goals of this project.

Frozen Graph and Examples

Frozen Example 1

Frozen Example 2

Frozen Example 3

Frozen Graph Example

Optimized Graph and Examples

Optimized Example 1

Optimized Example 2

Optimized Example 3

Optimized Graph Example

Eight Bit Graph and Examples

8 bit Example 1

8 bit Example 2

8 bit Example 3

8 bit Graph Example
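For reference, the eight-bit graph comes from TensorFlow's transform_graph tool. The sketch below only assembles a command line for it and prints it; it does not run anything. The flags and transform names are the ones documented for the Graph Transform Tool, but the file paths and the input and output node names are placeholders, and the exact transform list used in this project is not shown in this README, so treat the list as an assumption.

```python
import shlex


def build_transform_graph_cmd(in_graph, out_graph, inputs, outputs, transforms,
                              tool="bazel-bin/tensorflow/tools/graph_transforms/transform_graph"):
    """Assemble the argument list for TensorFlow's transform_graph tool."""
    return [
        tool,
        f"--in_graph={in_graph}",
        f"--out_graph={out_graph}",
        f"--inputs={','.join(inputs)}",
        f"--outputs={','.join(outputs)}",
        f"--transforms={' '.join(transforms)}",
    ]


# Assumed transform list, based on the tool's documented eight-bit recipe.
EIGHT_BIT_TRANSFORMS = [
    "add_default_attributes",
    "fold_constants(ignore_errors=true)",
    "fold_batch_norms",
    "quantize_weights",
    "quantize_nodes",
    "strip_unused_nodes",
    "sort_by_execution_order",
]

cmd = build_transform_graph_cmd(
    in_graph="frozen_graph.pb",        # placeholder path
    out_graph="eight_bit_graph.pb",    # placeholder path
    inputs=["image_input"],            # placeholder node name
    outputs=["logits"],                # placeholder node name
    transforms=EIGHT_BIT_TRANSFORMS,
)

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the argument list in Python keeps the quoting of the multi-word `--transforms` value in one place, and the list can be passed directly to `subprocess.run` once the paths and node names are filled in.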


This is my semantic segmentation project, in which I take a pre-trained model, add additional layers, and run the result on an existing video.


