Deep convolutional networks for semantic segmentation

Parts of the code are based on this project.

Code and data organization (outdated)

semseg
├── storage  % data (not code): datasets, trained models, log-files, ...
├── data  % data loading and preparation: Dataset, MiniBatchReader
|   ├── preparers  % Iccv09Preparer
|   |   ├── abstract_preparer.py
|   |   └── iccv09_preparer.py 
|   ├── dataset_dir.py
|   └── dataset.py
├── models  % AbstractModel, BaselineA
|   ├── preprocessing.py
|   ├── tf_utils
|   |   ├── blocks.py  % higher-level operations: ResNet block, LinkNet encoder/decoder blocks
|   |   ├── layers.py  % elementary operations: conv, max_pool, resize
|   |   └── variables.py 
|   ├── abstract_model.py  % AbstractModel
|   └── baseline_a.py  % BaselineA
├── processing  % image and label processing
|   ├── image_format.py
|   ├── labels.py
|   ├── shape.py  % TODO: test resize
|   └── transform.py  % TODO: use skimage.transform and test
├── test  % unit tests
|   :.
├── util  % helper functions and classes
|   ├── visualizer.py  % Visualizer
|   :.
├── evaluation.py
:.

Tasks

High priority

implement evaluation measures used in LinkNet and modify evaluation.py so that it makes use of numpy/scipy
evaluate our LinkNet
implement inference time measurement depending on mini-batch size (to compare with the results in the LinkNet paper)
write the report
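
As a starting point for the NumPy-based evaluation task above, pixel accuracy and mean IoU can both be derived from a single confusion matrix. The sketch below is an assumption about how this could look, not the contents of evaluation.py; the function names are hypothetical, and class 0 is treated as the ignored "unknown" class, as in BaselineA. Whether this matches the exact measures reported in the LinkNet paper should be checked against the paper.

```python
import numpy as np


def confusion_matrix(labels, predictions, class_count):
    """Counts (true class, predicted class) pairs; rows are true classes."""
    labels = np.asarray(labels).ravel().astype(np.int64)
    predictions = np.asarray(predictions).ravel().astype(np.int64)
    index = labels * class_count + predictions
    counts = np.bincount(index, minlength=class_count ** 2)
    return counts.reshape(class_count, class_count)


def segmentation_scores(labels, predictions, class_count, ignore_class=0):
    """Returns (pixel accuracy, mean IoU), skipping pixels labelled ignore_class.

    Assumes at least one pixel has a label other than ignore_class.
    """
    cm = confusion_matrix(labels, predictions, class_count).astype(np.float64)
    cm[ignore_class, :] = 0  # pixels labelled "unknown" do not count
    true_positive = np.diag(cm)
    labelled = cm.sum(axis=1)   # evaluated pixels per true class
    predicted = cm.sum(axis=0)  # evaluated pixels per predicted class
    union = labelled + predicted - true_positive
    evaluated = (np.arange(class_count) != ignore_class) & (union > 0)
    pixel_accuracy = true_positive.sum() / cm.sum()
    mean_iou = (true_positive[evaluated] / union[evaluated]).mean()
    return pixel_accuracy, mean_iou


labels = [[0, 1, 1], [2, 2, 1]]
predictions = [[1, 1, 2], [2, 2, 1]]
accuracy, miou = segmentation_scores(labels, predictions, class_count=3)
print(accuracy, miou)
```

Confusion matrices of individual mini-batches can be summed before computing the scores, which gives dataset-level values rather than an average of per-batch values.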

Medium priority

nothing

Low priority

use tf.nn.sparse_softmax_cross_entropy_with_logits for more efficient training
make a baseline similar to BaselineA that uses strided convolutions instead of pooling layers (use 3x3 conv with stride 2 instead of pool->conv)
try IoU loss (like here)
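
For the IoU loss item above, one common differentiable relaxation replaces the hard intersection and union with sums over predicted probabilities. The NumPy sketch below only illustrates the arithmetic; it is not necessarily the formulation from the linked source, and the function name, the `eps` smoothing term and the handling of class 0 are assumptions.

```python
import numpy as np


def soft_iou_loss(probabilities, labels, class_count, ignore_class=0, eps=1e-7):
    """1 - mean soft IoU over all classes except ignore_class.

    probabilities: (N, C) per-pixel class probabilities (e.g. softmax outputs).
    labels: (N,) integer class labels.
    """
    probabilities = np.asarray(probabilities, dtype=np.float64)
    labels = np.asarray(labels)
    valid = labels != ignore_class          # drop "unknown" pixels
    p = probabilities[valid]
    y = np.eye(class_count)[labels[valid]]  # one-hot targets
    intersection = (p * y).sum(axis=0)
    union = (p + y - p * y).sum(axis=0)
    classes = np.arange(class_count) != ignore_class
    iou = (intersection[classes] + eps) / (union[classes] + eps)
    return 1.0 - iou.mean()


labels = np.array([1, 2, 1])
perfect = np.eye(3)[labels]
print(soft_iou_loss(perfect, labels, class_count=3))
```

The same expression can be written with TensorFlow reductions such as `tf.reduce_sum` so that it can be used as a training objective.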

Work in progress

finish saving and loading of trained models ~ Ivan
implement LinkNet ~ Josip

Completed

make data loading work (data.preparers.Iccv09Preparer, data.Dataset)
make a simple baseline
complete abstract_model
make the cost function (as well as other used evaluation measures) in BaselineA ignore "unknown" class (class 0)
implement util.Visualizer
improve the colors in util.visualizer.Visualizer
enable usage of util.Visualizer while training (by pressing d followed by ENTER in the console)
add stride:int and dilation:int parameters to tf_utils.layers.conv (use tf.nn.convolution)
add batch normalization to tf_utils.layers, use tf.layers.batch_normalization(input_layer, fused=True, data_format='NCHW')
improve random seeding in Dataset for better reproducibility
add ResNet layers and encoder/decoder blocks used in LinkNet to tf_utils.blocks
implement a textual options menu that can be opened while training is paused, enabling network output visualization, saving/loading of weights, stopping training (after the current epoch) and other actions
add transposed convolution to tf_utils.layers

Current validation results on Stanford Background Dataset

mini-batch size = 16, Pentium 2020M

Model	mIoU	Pixel acc.	#epochs	Inference time [s]	Hardware
BaselineA	-	0.630	150	0.100*	Pentium 2020M
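
The inference time in the table could be measured for several mini-batch sizes with a small harness like the one below. It is a sketch under assumptions: `predict` and `make_batch` are hypothetical callables standing in for the model's forward pass and for batch construction, and the warm-up and repeat counts are arbitrary defaults.

```python
import time


def time_inference(predict, make_batch, batch_sizes, repeats=10, warmup=2):
    """Returns {batch_size: mean seconds per call of predict(batch)}."""
    results = {}
    for batch_size in batch_sizes:
        batch = make_batch(batch_size)
        for _ in range(warmup):
            predict(batch)  # warm-up calls are not timed
        start = time.perf_counter()
        for _ in range(repeats):
            predict(batch)
        results[batch_size] = (time.perf_counter() - start) / repeats
    return results


# Stand-in model: sums the batch instead of running a network.
timings = time_inference(
    predict=lambda batch: sum(batch),
    make_batch=lambda n: [0.0] * n,
    batch_sizes=[1, 4, 16],
)
for batch_size, seconds in timings.items():
    print(batch_size, seconds, seconds / batch_size)
```

The reported value is the time per call; dividing by the mini-batch size gives the time per image. Recording the hardware next to each measurement keeps the numbers comparable.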

"Inference time" - on what hardware?
