In this paper, a recurrent method of semantic segmentation is proposed where convolutional LSTM is used to generate a mask of objects present in the image recurrently. The proposed method only generates a mask of objects that are only present in the image. Therefore, requiring less output tensor space. The method uses Unet architecture and convolutional LSTM in the bottleneck and is trained using the PASCAL VOC dataset.
- PyTorch==1.8.1
- Weights:
Download the pre-trained weights
file of the recurrent semantic segmentation model and put theweights
folder in the working directory.
The proposed method is implemented using the following network architecture where encoder and decoder are used from U-net architecture and convolutional LSTM is used in the bottleneck.
The experimental results are reported in terms of accuracy and mean interaction over union (IOU) between ground truth and predicted mask. The mean accuracy among all the classes is 30.02%
and the mean IOU achieved among all the classes is 43.77%
, and the maximum accuracy and IOU achieved are 70%
and 71.54%
, subsequently.