Keras implementation of Deeplabv3+
This repo is no longer maintained. I won't respond to issues, but I will merge PRs.
DeepLab is a state-of-the-art deep learning model for semantic image segmentation.
The model is based on the original TF frozen graph. Pretrained weights can be loaded into this model; they are imported directly from the original TF checkpoint.
Segmentation results of original TF model. Output Stride = 8
Segmentation results of this repo model with loaded weights and OS = 8
Results are identical to the TF model
Segmentation results of this repo model with loaded weights and OS = 16
Results are still good
How to get labels
The model returns a tensor of shape (batch_size, height, width, num_classes). To obtain labels, apply argmax to the logits at the exit layer. Example of predicting on imgs/image1.jpg:
    import numpy as np
    from PIL import Image
    from matplotlib import pyplot as plt
    from model import Deeplabv3

    # Generates labels using most basic setup. Supports various image sizes.
    # Returns image labels in same format as original image.
    # Normalization matches MobileNetV2.
    trained_image_width = 512
    mean_subtraction_value = 127.5
    image = np.array(Image.open('imgs/image1.jpg'))

    # resize to max dimension of images from training dataset
    # (image.shape is (rows, cols, channels), so w holds rows and h holds cols)
    w, h, _ = image.shape
    ratio = float(trained_image_width) / np.max([w, h])
    resized_image = np.array(Image.fromarray(image.astype('uint8')).resize((int(ratio * h), int(ratio * w))))

    # apply normalization for trained dataset images
    resized_image = (resized_image / mean_subtraction_value) - 1.

    # pad array to square image to match training images
    pad_x = int(trained_image_width - resized_image.shape[0])
    pad_y = int(trained_image_width - resized_image.shape[1])
    resized_image = np.pad(resized_image, ((0, pad_x), (0, pad_y), (0, 0)), mode='constant')

    # make prediction
    deeplab_model = Deeplabv3()
    res = deeplab_model.predict(np.expand_dims(resized_image, 0))
    labels = np.argmax(res.squeeze(), -1)

    # remove padding and resize back to original image
    if pad_x > 0:
        labels = labels[:-pad_x]
    if pad_y > 0:
        labels = labels[:, :-pad_y]
    # nearest-neighbor resampling keeps class ids intact instead of interpolating them
    labels = np.array(Image.fromarray(labels.astype('uint8')).resize((h, w), Image.NEAREST))

    plt.imshow(labels)
    plt.waitforbuttonpress()
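The resize-and-pad geometry and the final argmax step from the example above can be isolated into two small helpers. This is only a minimal sketch: the names `resize_and_pad_sizes` and `logits_to_labels` are hypothetical and not part of this repo, the image size is made up, and random values stand in for real model output.

```python
import numpy as np


def resize_and_pad_sizes(height, width, target=512):
    """Return the resized (height, width) and the padding needed to reach target x target."""
    ratio = float(target) / max(height, width)
    new_h, new_w = int(ratio * height), int(ratio * width)
    return (new_h, new_w), (target - new_h, target - new_w)


def logits_to_labels(logits, pad_rows=0, pad_cols=0):
    """Convert (batch, height, width, num_classes) logits to labels, dropping padding."""
    labels = np.argmax(logits, axis=-1)  # shape: (batch, height, width)
    rows = labels.shape[1] - pad_rows
    cols = labels.shape[2] - pad_cols
    return labels[:, :rows, :cols]


# Hypothetical 768x1024 (rows x cols) photo and random "logits" for 21 classes.
(new_h, new_w), (pad_rows, pad_cols) = resize_and_pad_sizes(768, 1024)
fake_logits = np.random.rand(1, 512, 512, 21)
labels = logits_to_labels(fake_logits, pad_rows, pad_cols)
print(new_h, new_w, pad_rows, pad_cols)
print(labels.shape)
```

Each entry of `labels` is the index of the highest-scoring class for that pixel, so its values always fall in the range 0 to num_classes - 1.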
How to use this model with custom input shape and custom number of classes
    from model import Deeplabv3

    deeplab_model = Deeplabv3(input_shape=(384, 384, 3), classes=4)
    # or you can use None as shape
    deeplab_model = Deeplabv3(input_shape=(None, None, 3), classes=4)

After that you get a regular Keras model, which you can train using the standard Keras training methods such as fit().
How to train this model
Useful parameters can be found in the original repository.
- This model doesn't apply weight decay by default; users need to add it themselves.
- Due to the huge memory use with OS=8, the Xception backbone should be trained with OS=16 and only used for inference with OS=8.
- Users can freeze the feature extractor for the Xception backbone (first 356 layers) and fine-tune only the decoder. As of March 2019, there is a problem with fine-tuning Keras models with BN. You can read more about it here.
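To see why OS=8 costs so much more memory than OS=16: the output stride is the ratio between the input resolution and the resolution of the final encoder feature map, so halving the stride quadruples the number of spatial positions. The sketch below is rough arithmetic only. The helper name is hypothetical, it assumes 'same'-style padding (sizes round up), and it ignores channel counts and layer depth, which also affect real memory use.

```python
def feature_map_size(input_size, output_stride):
    """Spatial side length of the encoder output for a square input, rounding up."""
    return -(-input_size // output_stride)  # ceiling division


for os_value in (16, 8):
    side = feature_map_size(512, os_value)
    print(f"OS={os_value}: {side}x{side} = {side * side} positions")

ratio = feature_map_size(512, 8) ** 2 / feature_map_size(512, 16) ** 2
print(f"OS=8 has {ratio:.0f}x as many positions as OS=16")
```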
This model can be retrained; check this notebook. Fine-tuning is tricky because of the confusion between training and trainable in Keras. See this issue for a discussion and possible alternatives.
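The freezing step mentioned above (the first 356 layers for the Xception backbone) can be sketched as follows. The helper name `freeze_first_layers` is hypothetical, and the stand-in model exists only so the snippet runs without TensorFlow installed. Note that this only sets the trainable flag; it does not by itself resolve the BN issue discussed above.

```python
from types import SimpleNamespace


def freeze_first_layers(model, num_layers):
    """Mark the first num_layers layers as non-trainable; return how many were frozen."""
    frozen = 0
    for layer in model.layers[:num_layers]:
        layer.trainable = False
        frozen += 1
    return frozen


# Stand-in object exposing the same .layers / .trainable interface as a Keras model.
dummy_model = SimpleNamespace(
    layers=[SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(400)]
)

frozen = freeze_first_layers(dummy_model, 356)
trainable = sum(layer.trainable for layer in dummy_model.layers)
print(f"frozen: {frozen}, still trainable: {trainable}")
```

With a real Keras model you would pass the model in directly and call compile() afterwards, since Keras picks up changes to trainable at compile time.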
How to load model
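In order to load the model after using model.save(), use this code:

    from tensorflow.keras.models import load_model
    from model import relu6

    # pass relu6 so Keras can resolve it when deserializing the saved model
    deeplab_model = load_model('example.h5', custom_objects={'relu6': relu6})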
Xception vs MobileNetv2
There are two available backbones. The Xception backbone is more accurate but has 25 times more parameters than MobileNetv2.
For MobileNetv2, pretrained weights are available only for alpha=1. However, you can initialize the model with different values of alpha.
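For context, alpha is MobileNetV2's width multiplier: it scales the number of filters in each layer, and implementations typically round the result to a multiple of 8. The sketch below illustrates that rounding. It mirrors the `_make_divisible` helper commonly found in MobileNetV2 implementations, but treat the exact details and the list of base filter counts as assumptions rather than this repo's code.

```python
def make_divisible(value, divisor=8, min_value=None):
    """Round value to the nearest multiple of divisor, never dropping more than 10%."""
    if min_value is None:
        min_value = divisor
    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if new_value < 0.9 * value:
        new_value += divisor
    return new_value


# Filter counts of the standard MobileNetV2 stages at alpha=1 (assumed here).
base_filters = [32, 16, 24, 32, 64, 96, 160, 320]
for alpha in (0.5, 1.0, 1.4):
    scaled = [make_divisible(f * alpha) for f in base_filters]
    print(f"alpha={alpha}: {scaled}")
```

Because layer shapes change with alpha, weights trained at alpha=1 would not fit a model built with a different width, which is consistent with pretrained weights being offered only for alpha=1.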
The latest version of this repo uses TF Keras, so you only need TF 2.0+ installed.
If you want to use an older version, use the following commands:

    git clone https://github.com/bonlime/keras-deeplab-v3-plus/
    cd keras-deeplab-v3-plus/
    git checkout 714a6b7d1a069a07547c5c08282f1a706db92e20