# Simplified Keras deeplabV3+ semantic segmentation model using Xception and MobileNetV2 as base models

## This simplified Keras-based deeplabV3+ was developed with reference to [Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1802.02611) and [the corresponding GitHub repository](https://github.com/tensorflow/models/tree/master/research/deeplab).

The deeplabV3+ semantic segmentation model consists mainly of an encoder and a decoder that use atrous spatial pyramid pooling (ASPP) and depthwise separable convolution. [The augmented Pascal VOC 2012 data provided by DrSleep](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0) is used as training data. In this implementation the encoder and decoder are simplified and modularized. Designing ASPP is simpler while staying as flexible as in the original deeplabV3+ model, so you can describe ASPP in JSON format. The boundary refinement layer is also modularized, so you can enable or disable it depending on your model's performance.
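As a rough illustration of why depthwise separable convolution is attractive, the sketch below counts the weights of a standard convolution and of its separable counterpart. The helper names and the 256-channel width are illustrative assumptions, not values taken from this repository's code, and biases are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard 2-D convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer: 3 x 3 kernel, 256 input and 256 output channels.
standard = conv_params(3, 256, 256)
separable = separable_conv_params(3, 256, 256)
print(standard, separable, round(standard / separable, 1))
# 589824 67840 8.7
```

For this layer shape the separable form needs roughly one ninth of the weights, and the dilation rate does not change either count.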

# Tasks

- [x] Develop the encoder.
- [x] Develop the decoder.
- [x] Train and evaluate with the Pascal VOC 2012 dataset.
- [x] Documentation.
- [x] Migrate from the Keras framework to the TensorFlow 2.0 Keras framework.
- [x] Test and optimize the model.
- [x] Second documentation.

# Requirements

The simplified Keras deeplabV3+ semantic segmentation model was developed and tested with TensorFlow 2.4 and Python 3.6, so both must be installed to use it. The OS and GPU environment is the Google Colab GPU environment.

In [1]:
!git clone https://github.com/tonandr/deeplabv3plus_keras

Cloning into 'deeplabv3plus_keras'...
remote: Enumerating objects: 279, done.
remote: Counting objects: 100% (279/279), done.
remote: Compressing objects: 100% (214/214), done.
remote: Total 279 (delta 141), reused 156 (delta 30), pack-reused 0
Receiving objects: 100% (279/279), 19.29 MiB | 11.32 MiB/s, done.
Resolving deltas: 100% (141/141), done.


In [2]:
cd deeplabv3plus_keras

/content/deeplabv3plus_keras


In [3]:
!python setup.py sdist bdist_wheel

running sdist
running egg_info
creating deeplabv3plus_keras.egg-info
writing deeplabv3plus_keras.egg-info/PKG-INFO
writing dependency_links to deeplabv3plus_keras.egg-info/dependency_links.txt
writing requirements to deeplabv3plus_keras.egg-info/requires.txt
writing top-level names to deeplabv3plus_keras.egg-info/top_level.txt
writing manifest file 'deeplabv3plus_keras.egg-info/SOURCES.txt'
writing manifest file 'deeplabv3plus_keras.egg-info/SOURCES.txt'
running check
creating deeplabv3plus-keras-1.0.0
creating deeplabv3plus-keras-1.0.0/.idea
creating deeplabv3plus-keras-1.0.0/.idea/inspectionProfiles
creating deeplabv3plus-keras-1.0.0/analysis
creating deeplabv3plus-keras-1.0.0/analysis/.ipynb_checkpoints
creating deeplabv3plus-keras-1.0.0/bodhi
creating deeplabv3plus-keras-1.0.0/bodhi/deeplabv3plus_keras
creating deeplabv3plus-keras-1.0.0/deeplabv3plus_keras.egg-info
creating deeplabv3plus-keras-1.0.0/pics
creating deeplabv3plus-keras-1.0.0/pics/mobilenetv2
creating deeplabv3plus-ker

In [4]:
!pip install -e ./

Obtaining file:///content/deeplabv3plus_keras
Installing collected packages: deeplabv3plus-keras
  Running setup.py develop for deeplabv3plus-keras
Successfully installed deeplabv3plus-keras


# Preparing data

The augmented Pascal VOC 2012 data is used for training and the original Pascal VOC 2012 data is used for validation, so both datasets must be downloaded and set up.

In [5]:
!mkdir resource

In [6]:
cd resource

/content/deeplabv3plus_keras/resource


In [7]:
!wget http://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar

--2021-01-10 06:30:21--  http://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
Resolving pjreddie.com (pjreddie.com)... 128.208.4.108
Connecting to pjreddie.com (pjreddie.com)|128.208.4.108|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar [following]
--2021-01-10 06:30:22--  https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
Connecting to pjreddie.com (pjreddie.com)|128.208.4.108|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1999639040 (1.9G) [application/octet-stream]
Saving to: ‘VOCtrainval_11-May-2012.tar’


2021-01-10 06:35:13 (6.56 MB/s) - ‘VOCtrainval_11-May-2012.tar’ saved [1999639040/1999639040]



In [8]:
!tar -xvf VOCtrainval_11-May-2012.tar

Streaming output truncated to the last 5000 lines.
VOCdevkit/VOC2012/SegmentationClass/2008_001874.png
VOCdevkit/VOC2012/SegmentationClass/2008_001876.png
VOCdevkit/VOC2012/SegmentationClass/2008_001882.png
VOCdevkit/VOC2012/SegmentationClass/2008_001885.png
VOCdevkit/VOC2012/SegmentationClass/2008_001895.png
VOCdevkit/VOC2012/SegmentationClass/2008_001896.png
VOCdevkit/VOC2012/SegmentationClass/2008_001926.png
VOCdevkit/VOC2012/SegmentationClass/2008_001966.png
VOCdevkit/VOC2012/SegmentationClass/2008_001971.png
VOCdevkit/VOC2012/SegmentationClass/2008_001992.png
VOCdevkit/VOC2012/SegmentationClass/2008_001997.png
VOCdevkit/VOC2012/SegmentationClass/2008_002032.png
VOCdevkit/VOC2012/SegmentationClass/2008_002043.png
VOCdevkit/VOC2012/SegmentationClass/2008_002064.png
VOCdevkit/VOC2012/SegmentationClass/2008_002066.png
VOCdevkit/VOC2012/SegmentationClass/2008_002067.png
VOCdevkit/VOC2012/SegmentationClass/2008_002073.png
VOCdevkit/VOC2012/SegmentationClass/2008_002079.png
VOCdevkit/VOC2

In [9]:
!wget https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip

--2021-01-10 06:35:23--  https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip
Resolving www.dropbox.com (www.dropbox.com)... 162.125.66.18, 2620:100:6022:18::a27d:4212
Connecting to www.dropbox.com (www.dropbox.com)|162.125.66.18|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/oeu149j8qtbs1x0/SegmentationClassAug.zip [following]
--2021-01-10 06:35:24--  https://www.dropbox.com/s/raw/oeu149j8qtbs1x0/SegmentationClassAug.zip
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc29b83228b1ae60978681ca4f77.dl.dropboxusercontent.com/cd/0/inline/BGu3uRRQPq49UFTTaHW521aJooVRDKt9mtWu2Et_axZ8F3GScMzaK-7P4zXSluvz6kY0UY7g0Sbu1Jso1_zs5d5uwdXWKV2o5LwQv6nvAw5GJQ/file# [following]
--2021-01-10 06:35:24--  https://uc29b83228b1ae60978681ca4f77.dl.dropboxusercontent.com/cd/0/inline/BGu3uRRQPq49UFTTaHW521aJooVRDKt9mtWu2Et_axZ8F3GScMzaK-7P4zXSluvz6kY0UY7g0Sbu1Jso1_zs5d5uwdXWK

In [10]:
!unzip SegmentationClassAug.zip -d SegmentationClassAug

Streaming output truncated to the last 5000 lines.
  inflating: SegmentationClassAug/__MACOSX/SegmentationClassAug/._2010_003928.png  
  inflating: SegmentationClassAug/SegmentationClassAug/2010_003929.png  
  inflating: SegmentationClassAug/__MACOSX/SegmentationClassAug/._2010_003929.png  
  inflating: SegmentationClassAug/SegmentationClassAug/2010_003931.png  
  inflating: SegmentationClassAug/__MACOSX/SegmentationClassAug/._2010_003931.png  
  inflating: SegmentationClassAug/SegmentationClassAug/2010_003933.png  
  inflating: SegmentationClassAug/__MACOSX/SegmentationClassAug/._2010_003933.png  
  inflating: SegmentationClassAug/SegmentationClassAug/2010_003936.png  
  inflating: SegmentationClassAug/__MACOSX/SegmentationClassAug/._2010_003936.png  
  inflating: SegmentationClassAug/SegmentationClassAug/2010_003937.png  
  inflating: SegmentationClassAug/__MACOSX/SegmentationClassAug/._2010_003937.png  
  inflating: SegmentationClassAug/SegmentationClassAug/2010_003938.png  
  inflat

In [11]:
!cp -r SegmentationClassAug/SegmentationClassAug VOCdevkit/VOC2012

In [22]:
cd VOCdevkit/VOC2012/ImageSets/Segmentation

/content/deeplabv3plus_keras/resource/VOCdevkit/VOC2012/ImageSets/Segmentation


In [24]:
!wget https://www.dropbox.com/s/vrvelecbhqh2a4g/train_aug_val.txt

--2021-01-10 06:44:25--  https://www.dropbox.com/s/vrvelecbhqh2a4g/train_aug_val.txt
Resolving www.dropbox.com (www.dropbox.com)... 162.125.65.18, 2620:100:6022:18::a27d:4212
Connecting to www.dropbox.com (www.dropbox.com)|162.125.65.18|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/vrvelecbhqh2a4g/train_aug_val.txt [following]
--2021-01-10 06:44:26--  https://www.dropbox.com/s/raw/vrvelecbhqh2a4g/train_aug_val.txt
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc7a2abc74ecb2af47c8f7db3ca1.dl.dropboxusercontent.com/cd/0/inline/BGuSyiRf9ZepTmkIDwBBC3jeu3Cxz5MjbyESYKsxVlFmlHaVwbytIR9R8HpWqM9m8tdqTN1jlijpdDeEtGpFfsxWbMQK7OeOqFiLPlWbGGf7JyRA-qUrgopgfEzXTsJcJ7M/file# [following]
--2021-01-10 06:44:26--  https://uc7a2abc74ecb2af47c8f7db3ca1.dl.dropboxusercontent.com/cd/0/inline/BGuSyiRf9ZepTmkIDwBBC3jeu3Cxz5MjbyESYKsxVlFmlHaVwbytIR9R8HpWqM9m8tdqTN1jlijpdDeEtGpFfsxWbMQK7

In [25]:
cd /content/deeplabv3plus_keras/bodhi/deeplabv3plus_keras/

/content/deeplabv3plus_keras/bodhi/deeplabv3plus_keras
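Before training, it can help to confirm that the preparation steps above left the files where they were copied or downloaded. The following is a minimal sketch: `missing_resources` and `EXPECTED` are hypothetical names, and the list only reflects the standard VOC 2012 archive layout plus the `cp` and `wget` steps shown in this notebook, not a list read from the repository's loader.

```python
from pathlib import Path

# Relative paths the preparation steps above should have produced.
# JPEGImages and SegmentationClass come from the VOC tar archive;
# the other two come from the cp and wget steps.
EXPECTED = [
    "VOCdevkit/VOC2012/JPEGImages",
    "VOCdevkit/VOC2012/SegmentationClass",
    "VOCdevkit/VOC2012/SegmentationClassAug",
    "VOCdevkit/VOC2012/ImageSets/Segmentation/train_aug_val.txt",
]


def missing_resources(resource_path):
    """Return the expected paths that do not exist under resource_path."""
    root = Path(resource_path)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_resources("/content/deeplabv3plus_keras/resource")
    print("missing:", missing if missing else "nothing")
```

An empty result only means the paths exist; it does not verify the contents of the archives.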


# Neural network architecture and training strategy

The neural network configuration, including the architecture and the training strategy expressed as hyper-parameters, is written in JSON format as below. ASPP can be designed in encoder_middle_conf. Set resource_path to match your own environment.

In [27]:
%%writefile semantic_segmentation_deeplabv3plus_conf.json
{
	"mode" : "train",
	"resource_type": "pascal_voc_2012_ext",
	"resource_path" : "/content/deeplabv3plus_keras/resource",
	"model_loading" : false,
	"multi_gpu" : false,
	"num_gpus" : 4,
	"eval_data_mode": 1,
	"eval_result_saving": false,
	"base_model": 0,
	"hps" : {
		"val_ratio": 0.1,
		"lr" : 0.0001,
		"beta_1" : 0.5,
		"beta_2" : 0.99,
		"decay" : 0.0,
		"epochs" : 64,
		"batch_size" : 6,
		"weight_decay": 0.00004,
		"bn_momentum": 0.9,
		"bn_scale": true,
		"reduce_lr_factor": 0.99
	},
	"nn_arch" : {
		"boundary_refinement": true,
		"output_stride": 16,
		"image_size": 512,
		"num_classes": 21,
		"mv2_depth_multiplier": 1,
		"depth_multiplier": 1,
		"conv_rate_multiplier" : 1,
		"reduction_size": 256,
		"dropout_rate": 0.5,
		"concat_channels": 256,
		"encoder_middle_conf": [
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1},
			{"kernel": 3, "rate": [18, 15], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [6, 3], "op": "conv", "input": 1},
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [6, 21], "op": "conv", "input": 0}
		],
		"encoder_middle_conf_xception": [
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1},
			{"kernel": 3, "rate": [6, 6], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [12, 12], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [18, 18], "op": "conv", "input": 0},
			{"kernel": 1, "rate": [1, 1], "op": "pyramid_pooling", "input": 0, "target_size_factor": [1, 1]}
		]
	}
}

Overwriting semantic_segmentation_deeplabv3plus_conf.json
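The notebook does not spell out what the `"input"` field of each ASPP branch means. The sketch below assumes, as a guess from the values used above, that `-1` refers to the base model output and that any other value is the index of an earlier branch in the same list. Under that assumption, a small check can catch a branch that points at itself or at a later branch. `check_branch_inputs` is a hypothetical helper, not part of the repository.

```python
import json

# MobileNetV2 ASPP branches copied from the configuration above.
CONF_TEXT = """
{
  "encoder_middle_conf": [
    {"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1},
    {"kernel": 3, "rate": [18, 15], "op": "conv", "input": 0},
    {"kernel": 3, "rate": [6, 3], "op": "conv", "input": 1},
    {"kernel": 3, "rate": [1, 1], "op": "conv", "input": 0},
    {"kernel": 3, "rate": [6, 21], "op": "conv", "input": 0}
  ]
}
"""


def check_branch_inputs(branches):
    """Return (branch index, input) pairs that break the assumed rule.

    Assumed rule: "input" is -1 or the index of an earlier branch.
    """
    problems = []
    for idx, branch in enumerate(branches):
        src = branch["input"]
        if src != -1 and not (0 <= src < idx):
            problems.append((idx, src))
    return problems


mobilenetv2_branches = json.loads(CONF_TEXT)["encoder_middle_conf"]
print(check_branch_inputs(mobilenetv2_branches))  # [] when every reference is valid
```

If the repository interprets `"input"` differently, the rule inside `check_branch_inputs` would need to change accordingly.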


# Training

In semantic_segmentation_deeplabv3plus_conf.json, mode must be set to "train". Boundary refinement requires considerable computing resources, so keep the batch size at about 6 or less when training and at about 1 when evaluating.
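The batch-size guidance above can be turned into a quick check on a loaded configuration. This is only a sketch: `batch_size_warning` is a hypothetical helper, the limits of 6 and 1 are the approximate figures from the text rather than hard constraints, and whether the script uses `hps.batch_size` during evaluation is not shown in this notebook.

```python
def batch_size_warning(conf):
    """Return a warning string if batch_size exceeds the suggested limit, else None."""
    limits = {"train": 6, "evaluate": 1}  # approximate limits from the text above
    if not conf["nn_arch"]["boundary_refinement"]:
        return None  # the guidance only concerns boundary refinement
    limit = limits.get(conf["mode"])
    batch_size = conf["hps"]["batch_size"]
    if limit is not None and batch_size > limit:
        return (
            f"batch_size {batch_size} exceeds the suggested limit of "
            f"{limit} for mode '{conf['mode']}'"
        )
    return None


# Only the keys the check needs, with the training values used above.
train_conf = {
    "mode": "train",
    "hps": {"batch_size": 6},
    "nn_arch": {"boundary_refinement": True},
}
print(batch_size_warning(train_conf))  # None
```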

## Xception

The encoder middle configuration is as follows.

```
"encoder_middle_conf": [
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1}, 
			{"kernel": 3, "rate": [6, 6], "op": "conv", "input": 0}, 
			{"kernel": 3, "rate": [12, 12], "op": "conv", "input": 0}, 
			{"kernel": 3, "rate": [18, 18], "op": "conv", "input": 0}, 
			{"kernel": 1, "rate": [1, 1], "op": "pyramid_pooling", "input": 0, "target_size_factor": [1, 1]}
		]
```

## MobileNetV2

The encoder middle configuration is as follows.

```
"encoder_middle_conf": [
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1}, 
			{"kernel": 3, "rate": [18, 15], "op": "conv", "input": 0}, 
			{"kernel": 3, "rate": [6, 3], "op": "conv", "input": 1}, 
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": 0}, 
			{"kernel": 3, "rate": [6, 21], "op": "conv", "input": 0}
		]
```
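To get a feel for the `rate` values in the two configurations above, the sketch below computes how far a dilated kernel reaches along one axis, using the standard relation k + (k - 1)(r - 1). The notebook does not say what the two numbers in each `rate` pair stand for, so the sketch simply treats each number on its own. `effective_kernel` is a hypothetical helper name.

```python
def effective_kernel(kernel, rate):
    """Extent covered by a dilated kernel along one axis: k + (k - 1) * (r - 1)."""
    return kernel + (kernel - 1) * (rate - 1)


# Rates of the 3 x 3 "conv" branches in the Xception configuration above.
xception_rates = [[1, 1], [6, 6], [12, 12], [18, 18]]
for rate in xception_rates:
    print(rate, [effective_kernel(3, r) for r in rate])
# [1, 1] [3, 3]
# [6, 6] [13, 13]
# [12, 12] [25, 25]
# [18, 18] [37, 37]
```

With image_size 512 and output_stride 16, the encoder output would be about 32 pixels per side if the output stride is applied as in the original model, so the largest rates reach across most of the feature map.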

In [30]:
!python -V

Python 3.6.9


In [29]:
import tensorflow
tensorflow.__version__

'2.4.0'

In [None]:
!python semantic_segmentation.py

2021-01-10 06:47:51.847245: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
Seed:1024
2021-01-10 06:47:53.069886: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-10 06:47:53.071174: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-01-10 06:47:53.135992: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-10 06:47:53.136617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-01-10 06:47:53.136669: I tensorflow/stream_executor/plat

# Evaluating

In semantic_segmentation_deeplabv3plus_conf.json, mode must be set to "evaluate" and model_loading must be set to true. eval_data_mode selects the evaluation data: 0 (MODE_TRAIN) or 1 (MODE_VAL). To save the evaluation result images, set eval_result_saving to true.
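Instead of editing the file by hand, the evaluation settings can be derived from a training configuration that is already loaded as a dictionary. The following is a minimal sketch: `to_evaluate_conf` is a hypothetical helper, and the two constants only mirror the MODE_TRAIN and MODE_VAL values named above.

```python
import copy

MODE_TRAIN = 0  # evaluate on the training data
MODE_VAL = 1    # evaluate on the validation data


def to_evaluate_conf(train_conf, eval_data_mode=MODE_VAL, save_results=True):
    """Return a copy of train_conf switched to evaluation settings."""
    conf = copy.deepcopy(train_conf)  # leave the caller's dictionary untouched
    conf["mode"] = "evaluate"
    conf["model_loading"] = True      # evaluation needs the trained model
    conf["eval_data_mode"] = eval_data_mode
    conf["eval_result_saving"] = save_results
    return conf


# Only the top-level keys that change; a real configuration has more.
example_train_conf = {
    "mode": "train",
    "model_loading": False,
    "eval_data_mode": 1,
    "eval_result_saving": False,
}
print(to_evaluate_conf(example_train_conf))
```

The result can then be written back with `json.dump` to semantic_segmentation_deeplabv3plus_conf.json before running the script.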

In [None]:
%%writefile semantic_segmentation_deeplabv3plus_conf.json
{
	"mode" : "evaluate",
	"resource_type": "pascal_voc_2012_ext",
	"resource_path" : "/content/deeplabv3plus_keras/resource",
	"model_loading" : true,
	"multi_gpu" : false,
	"num_gpus" : 4,
	"eval_data_mode": 1,
	"eval_result_saving": true,
	"base_model": 0,
	"hps" : {
		"val_ratio": 0.1,
		"lr" : 0.0001,
		"beta_1" : 0.5,
		"beta_2" : 0.99,
		"decay" : 0.0,
		"epochs" : 64,
		"batch_size" : 6,
		"weight_decay": 0.00004,
		"bn_momentum": 0.9,
		"bn_scale": true,
		"reduce_lr_factor": 0.99
	},
	"nn_arch" : {
		"boundary_refinement": true,
		"output_stride": 16,
		"image_size": 512,
		"num_classes": 21,
		"mv2_depth_multiplier": 1,
		"depth_multiplier": 1,
		"conv_rate_multiplier" : 1,
		"reduction_size": 256,
		"dropout_rate": 0.5,
		"concat_channels": 256,
		"encoder_middle_conf": [
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1},
			{"kernel": 3, "rate": [18, 15], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [6, 3], "op": "conv", "input": 1},
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [6, 21], "op": "conv", "input": 0}
		],
		"encoder_middle_conf_xception": [
			{"kernel": 3, "rate": [1, 1], "op": "conv", "input": -1},
			{"kernel": 3, "rate": [6, 6], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [12, 12], "op": "conv", "input": 0},
			{"kernel": 3, "rate": [18, 18], "op": "conv", "input": 0},
			{"kernel": 1, "rate": [1, 1], "op": "pyramid_pooling", "input": 0, "target_size_factor": [1, 1]}
		]
	}
}

In [None]:
!python semantic_segmentation.py