Index | Semantic Seg | All |
---|---|---|
21 | 19 | 50 |
DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation
CVPR 2019
Hanchao Li, Pengfei Xiong, Haoqiang Fan, Jian Sun
Megvii
NO
CVPR 2019 | Megvii's real-time semantic segmentation DFANet: HD background blur without a dual camera
- Structure a) learns features on a high-resolution image, which increases the computation cost, and communication between branches is lacking. (PS: Why do we need the communication???)
- Structure b) is usually time-consuming. (PS: It also cannot focus on small objects, because it can only enlarge the receptive field, never reduce it.)
- In structure c), the details are lost.
- Structure d) is the one used in this paper: low-level features are passed through to the next backbone.
It is basically an RNN-like structure: the output of the last module becomes the input of the next module, and the features from the hidden layers of the last module are also passed to the same level of the next module. But an RNN has a forget gate to erase useless features, and an RNN reuses a single module repeatedly.
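The cascade idea above can be sketched in a few lines of numpy. This is a toy illustration, not DFANet's actual network: each "backbone" just halves resolution per stage, fusion is plain addition instead of concatenation, and the upsampling factor is chosen so that same-level shapes match across backbones.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling over a (H, W, C) feature map; H and W assumed even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def backbone(x, prev_feats=None, stages=3):
    """Toy backbone: each stage halves resolution.
    If same-level features from the previous backbone are given,
    they are fused in by addition (a stand-in for concatenation)."""
    feats = []
    for i in range(stages):
        x = downsample(x)
        if prev_feats is not None:
            x = x + prev_feats[i]  # fuse the same-level feature from the previous backbone
        feats.append(x)
    return feats

def cascade(x, num_backbones=3):
    """Structure d): the (upsampled) output of one backbone is the input of the
    next, and hidden features also flow across, like an unrolled RNN."""
    feats = None
    for _ in range(num_backbones):
        feats = backbone(x, feats)
        # the deepest output is upsampled back and becomes the next backbone's input
        x = upsample(feats[-1], x.shape[0] // feats[-1].shape[0])
    return feats
```

Unlike an RNN, there is no forget gate here, and each backbone in the real network has its own weights rather than one shared module.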
The most valuable part for me is the experiment section.
Cascade
The SPP module can enrich features with multi-scale context, but it leads to a sharp increase in computation cost.
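For reference, a minimal spatial pyramid pooling sketch in numpy: pool the feature map onto progressively coarser grids and concatenate the results. The grid levels `(1, 2, 4)` are an illustrative choice, not the paper's; the cost the note mentions comes from the extra per-level processing that real networks attach to each pooled map.

```python
import numpy as np

def spp(x, levels=(1, 2, 4)):
    """Spatial pyramid pooling over a (H, W, C) feature map:
    average-pool onto an n x n grid for each level and concatenate
    the flattened results. H and W are assumed divisible by every level."""
    h, w, c = x.shape
    pooled = []
    for n in levels:
        grid = x.reshape(n, h // n, n, w // n, c).mean(axis=(1, 3))
        pooled.append(grid.reshape(-1))
    return np.concatenate(pooled)
```

The output length grows as C * (1 + 4 + 16 + ...), which is one way to see why adding pyramid levels is expensive.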
2. [Contribution / Method] What's new in this paper? / How does this paper solve the above problems?
A cascade structure is proposed to fuse multi-level features without increasing much computation cost.
- CityScapes
- CamVid
- mini-batch stochastic gradient descent (SGD) with batch size 48, momentum 0.9, and weight decay 1e-5
- "poly" learning rate policy
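The "poly" policy decays the learning rate as a power of the remaining training fraction. A minimal sketch, with `power=0.9` as the commonly used default (the note does not state the paper's value):

```python
def poly_lr(base_lr, iteration, max_iter, power=0.9):
    """"poly" learning rate policy: lr = base_lr * (1 - iter/max_iter)^power."""
    return base_lr * (1 - iteration / max_iter) ** power
```

The rate starts at `base_lr` and decays smoothly to zero at `max_iter`.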
mIoU
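mIoU (mean intersection-over-union) averages the per-class IoU between the predicted and ground-truth label maps. A minimal numpy sketch (classes absent from both prediction and target are skipped, one common convention):

```python
import numpy as np

def miou(pred, target, num_classes):
    """Mean IoU over classes present in either pred or target.
    pred and target are integer label arrays of the same shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```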
- The kernel number of the hidden layer is important: the more kernels, the better the performance.
- Adding a backbone is not always helpful: as the features go deeper, the feature maps become very small, which decreases performance.
- Although the third backbone is not very good at segmentation on its own, it is helpful when fusing features from different layers, and the result is better with a deeper backbone.
- The FC layer is really important, increasing performance by 4~6%.
This table is very detailed. Listing the input size is very commendable here!
Fast speed and pretty good performance, thanks to an excellent backbone and fusing method.
Many helpful details are shown in the paper. 👍
- This fusing approach is useful, but it really looks like an RNN, and the FC attention seems to have helped a lot, yet the ablation study for it is lacking.