This repository has been archived by the owner on Nov 3, 2022. It is now read-only.

DenseNetFCN not training to expected performance #63

Closed
ahundt opened this issue Mar 31, 2017 · 17 comments

Comments

@ahundt
Collaborator

ahundt commented Mar 31, 2017

I'm training and testing DenseNetFCN on Pascal VOC2012

Could I get advice on next steps to take to debug and improve the results?

To do so, I'm using the train.py training script in my fork of Keras-FCN together with the DenseNetFCN implementation in keras-contrib with #46 applied. For DenseNetFCN, that PR mostly changes formatting for pep8, though the regular DenseNet is more heavily modified.

I use Keras-FCN because we don't have an FCN training script here in keras-contrib yet, though I plan to adapt and submit one here once things work properly. Neither paper publishes results on Pascal VOC, but the original DenseNet achieves close to state-of-the-art results on ImageNet and CIFAR-10/CIFAR-100, and DenseNetFCN performed well on CamVid and the Gatech dataset. Given that, I expected DenseNetFCN might fall short of state of the art, but figured that in the worst case it should still exceed 50% mIOU and reach around 70-80% pixel accuracy, since it shares many similarities with ResNet and performed quite well on the much smaller CamVid dataset.

DenseNet FCN configuration I'm using

    return densenet.DenseNetFCN(input_shape=(320, 320, 3),
                                weights=None, classes=classes,
                                nb_layers_per_block=4,
                                growth_rate=13,
                                dropout_rate=0.2)

This is very close to the configuration FC-DenseNet56 from the 100 layers tiramisu aka DenseNetFCN paper.

Sparse training accuracy

Here is what I'm seeing as I train with the Adam optimizer and a learning rate of 0.1:

lr: 0.100000
Epoch 2/450
366/366 [==============================] - 287s - loss: 62.5131 - sparse_accuracy_ignoring_last_label: 0.3657
[...snip...]
lr: 0.100000
Epoch 148/450
366/366 [==============================] - 286s - loss: 82.2138 - sparse_accuracy_ignoring_last_label: 0.3375
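For reference, sparse_accuracy_ignoring_last_label is the Keras-FCN metric; a rough pure-Python sketch of the quantity it computes (assuming, as in Keras-FCN, that the void/ignore label is the index one past the last real class) might look like:

```python
def sparse_accuracy_ignoring_last_label(y_true, y_pred_classes, num_classes):
    """Pixel accuracy over flattened labels, skipping the void/ignore label.

    Hypothetical reference implementation; the real metric operates on
    tensors inside the Keras graph rather than Python lists.
    """
    correct = total = 0
    for t, p in zip(y_true, y_pred_classes):
        if t == num_classes:  # void label (e.g. 21 for 21-class Pascal VOC)
            continue
        total += 1
        correct += (t == p)
    return correct / total if total else 0.0
```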

Similar networks and training scripts verified for a baseline

  1. I've successfully trained AtrousFCNResNet50_16s from scratch up to around 7x% pixel accuracy on the Pascal VOC2012 test set.
  2. AtrousFCNResNet50_16s, initialized with fchollet's pretrained Keras ResNet weights (downloaded via the Keras-FCN get_weights_path and transfer_FCN scripts), trained without issue to 0.56025 mIOU and around 8x% pixel accuracy.
  3. I've been able to train plain DenseNet on cifar 10 up to expected levels of accuracy published in the original papers.
  4. I've been able to train DenseNetFCN on a single image, and predict pixels with 99% accuracy on the image itself with all augmentation disabled, and mid 9x% accuracy on augmented versions of the training image.
    • This should at least demonstrate that the network can be trained, and the labels aren't saved incorrectly. I know (4) isn't a valid experiment for final conclusions, it just helps eliminate a variety of possible bugs.
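The single-image test in (4) is essentially an overfitting sanity check, which can be wrapped into a small debugging harness; train_step here is a hypothetical callable that runs one update on the fixed image/label pair and returns training accuracy:

```python
def can_overfit_one_sample(train_step, max_steps=500, target_acc=0.99):
    """Return the step at which a single sample is (nearly) perfectly fit,
    or None. If a segmentation model cannot overfit one image, suspect the
    loss function, label encoding, or data pipeline before the architecture.
    """
    for step in range(1, max_steps + 1):
        if train_step() >= target_acc:
            return step
    return None
```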

Verifying training scripts with AtrousFCNResNet50_16s

For comparison, here are AtrousFCNResNet50_16s test-set training results (this model can be brought to 0.661 mIOU with augmented Pascal VOC). I also trained AtrousFCNResNet50_16s from scratch; the results below are from the pretrained-weights run:

PASCAL VOC trained with pretrained imagenet weights
IOU:
[ 0.90873648  0.74772504  0.44416247  0.57239141  0.50728778  0.51896323
  0.69891196  0.66111323  0.64380596  0.19145411  0.49733934  0.32720705
  0.5488089   0.49649298  0.6157158   0.75780816  0.35492963  0.57446371
  0.32721105  0.63200183  0.53067634]
meanIOU: 0.550343
pixel acc: 0.896132
150.996609926s used to calculate IOU.
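For reference, meanIOU above is the average of the per-class IOUs. A minimal sketch of the standard computation from a confusion matrix (not the timed evaluation code Keras-FCN actually runs) looks like:

```python
def mean_iou(confusion):
    """Mean intersection-over-union.

    confusion[i][j] = number of pixels with true class i predicted as class j.
    Per class: IOU = TP / (TP + FP + FN); classes absent from both ground
    truth and predictions are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp
        fn = sum(confusion[c]) - tp
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0
```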

Download links

Pascal VOC
http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar  

Augmented 
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz

Automated VOC download script

Thanks

@titu1994 and @aurora95, since I'm using your respective DenseNetFCN and Keras-FCN implementations, could I get any comments or advice you might have on this? I'd appreciate your thoughts.

All, thanks for giving this a look, as well as your consideration and advice!

@ahundt ahundt changed the title DenseNetFCN not training to performance reported in reference paper DenseNetFCN not training to expected performance Mar 31, 2017
@ahundt
Collaborator Author

ahundt commented Mar 31, 2017

Update: One obvious problem is that I had the wrong learning rate for Adam, which I changed from 0.1 to the default 0.001. Training is still in progress, but perhaps this will get to where I expect.

lr: 0.001000
Epoch 8/450
366/366 [==============================] - 264s - loss: 3.8106 - sparse_accuracy_ignoring_last_label: 0.5282
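A pure-Python sketch of one textbook Adam update (not code from either repository) shows why 0.1 was so damaging: thanks to bias correction, the very first step moves each weight by roughly lr regardless of how small its gradient is.

```python
def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + eps)  # step ~= lr * sign(grad) at t=1
    return param, m, v
```

At t=1 the bias-corrected ratio m_hat / sqrt(v_hat) is approximately sign(grad), so with lr=0.1 every weight jumps by about 0.1 on the first step.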

@titu1994
Contributor

titu1994 commented Apr 4, 2017

@ahundt How was the training with the corrected learning rate? In my case, it seems to perform well enough, but does not beat the benchmark yet. Perhaps deeper and wider DenseNetFCNs will do the trick.

@ahundt
Collaborator Author

ahundt commented Apr 4, 2017

@titu1994 I've done a few things since the last update. In your case, did you use the same scripts I reference here or did you try something else?

Here is what I've done:

  1. When I ran training after my last post, it appeared to quickly reach 65% pixel accuracy. However, when I looked at the actual results, it turned out the network had fallen into a local minimum where it essentially labels all pixels as the background class.

  2. I'm going to add some class weighting to see if it addresses the problem.

  3. There was also a bug with my new parameters to select a top for segmentation or classification, fixed in 2a2c176.

  4. I'm also going to try training on COCO to see if using pretrained weights can improve results. The enet repository looks like it has a nice script for this purpose, and my data_coco.py script can be used to download and extract the dataset with the command python data_coco.py coco_setup.

  5. I've verified that the resnet training with pre-trained weights works quite well!

atrous_resnet_prediction2007_000129
groundtruth2007_000129
original2007_000129

  6. Performance may also benefit from employing an Atrous DenseNet in the same way as the Atrous ResNet from Keras-FCN, and converting ImageNet-based DenseNet pretrained weights. Here are a bunch of links that may be useful for that purpose:
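For the class weighting in (2), one common recipe is median-frequency balancing (Eigen & Fergus); this sketch takes per-class pixel counts and is illustrative rather than code from either repository:

```python
def median_frequency_weights(pixel_counts):
    """Class weights via median-frequency balancing: weight_c = median_freq / freq_c.

    Rare classes (e.g. small foreground objects vs. the dominant background)
    receive weights above 1, pushing the loss away from the all-background
    local minimum.
    """
    total = sum(pixel_counts)
    freqs = [c / total for c in pixel_counts]
    ordered = sorted(freqs)
    n = len(ordered)
    median = ordered[n // 2] if n % 2 else (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    return [median / f for f in freqs]
```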

@titu1994
Contributor

titu1994 commented Apr 4, 2017

No, I used a private dataset, one that normal networks like UNet have difficulty with.

DenseNet seems to perform better than UNet, but then it plateaus, and improvement beyond that point is negligible. In my case, though, there was no single-class problem.

@aurora95

aurora95 commented Apr 4, 2017

I'm not sure but it shouldn't label all pixels as background, there must be something wrong in the code or training settings. Could you please give a link to the code you are using?

@ahundt
Collaborator Author

ahundt commented Apr 4, 2017

@titu1994 sorry, I didn't realize I had submitted my most recent post before editing it; there is now a lot of additional information above.

@aurora95 Here is my Keras-FCN fork where I'm training with DenseNet, with options for Atrous_DenseNet and DenseNet_FCN as generated after cloning keras-contrib with #46 applied. Most of the modification is in models.py, where I import this keras-contrib repository, and train.py, where I changed the file paths, switched from SGD to Adam, and changed the learning rate appropriately. A few additional minor changes were also needed for compatibility of image and batch dimensions.

@titu1994
Contributor

titu1994 commented Apr 4, 2017

Hmm, I don't think there are weights for DenseNet trained on ImageNet. I'll have to do a more thorough search to be sure.

Atrous DenseNet seems nice, could you try implementing it? I'll give it a look, but I don't think just adding the atrous rate parameter will translate to better performance.

@ahundt
Collaborator Author

ahundt commented Apr 4, 2017

@titu1994 they are trained on ImageNet with DenseNet-Caffe and the original DenseNet repository. See my densenet + imagenet links two posts up for those, along with some Caffe-to-Keras conversion scripts that might help; I haven't had a chance to try it all out yet.

@titu1994
Contributor

titu1994 commented Apr 4, 2017

This is great news! I'll be sure to look at it in some time, and if possible port the weights to Keras.

However, if it's in Caffe, I won't be able to convert it, since I'm on Windows.

@ahundt
Collaborator Author

ahundt commented Apr 4, 2017

@titu1994 Also, Atrous DenseNet is already implemented in #46; the next steps would be converting the pretrained ImageNet weights or training from scratch.

If the pretrained weights don't work, I have access to a distributed GPU cluster... but it will take quite some time before I have all of that implemented, integrated, and tested. tf-slim or tensorpack could help there, once paired with a script to copy weights between TF and Keras models.

@ahundt
Collaborator Author

ahundt commented Apr 10, 2017

Found another bug that came from combining the two scripts: the loss function softmax_sparse_crossentropy_ignoring_last_label applies softmax a second time. The Keras-FCN models never apply softmax themselves; the loss function applies it for them. The keras-contrib densenet models apply softmax by default, so combining keras-contrib models with the Keras-FCN loss results in softmax being applied twice.

I'm now training a new model with the bugfix on Pascal VOC 2012.
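The effect of the bug described above is easy to see in isolation: softmax applied to an already-softmaxed vector pushes the distribution toward uniform, shrinking both the confidence and the gradient signal.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

logits = [2.0, 0.0, 0.0]
once = softmax(logits)   # peaked:    ~[0.79, 0.11, 0.11]
twice = softmax(once)    # flattened: ~[0.50, 0.25, 0.25]
```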

@ahundt
Collaborator Author

ahundt commented Apr 19, 2017

Other optimizers & hyperparameters may help; see tensorflow/tensorflow#9175.

It also appears that https://github.com/0bserver07/One-Hundred-Layers-Tiramisu is an independent Keras implementation, and it has run into similar training limitations.

@ahundt
Collaborator Author

ahundt commented Apr 20, 2017

Update: my fork of Keras-FCN has been merged to master, with instructions in the README.md.
It looks like a potential DenseNet weight conversion process is in https://github.com/nicolov/segmentation_keras, which uses github.com/ethereon/caffe-tensorflow, though I got an error when trying to run the conversion script (nicolov/segmentation_keras#13).

@ahundt
Collaborator Author

ahundt commented Apr 20, 2017

It seems the original authors explain in SimJeg/FC-DenseNet#10, on their FC-DenseNet repository, that they found DenseNetFCN performance isn't very good on Pascal VOC. @0bserver07 @titu1994 you will be interested in this info.

@titu1994
Contributor

I have seen similarly poor performance on a private dataset. It seems the model learns rapidly up to a certain point, then cannot improve at all.

UNet performs well on that dataset, far better than DenseNetFCN.

Perhaps the implementation is correct, but the model is simply not able to learn properly on all datasets?

@ahundt
Collaborator Author

ahundt commented Apr 20, 2017

@titu1994 that seems likely; CamVid is a much simpler dataset than Pascal VOC 2012. The real test would likely be to try training on CamVid itself.

@ahundt
Collaborator Author

ahundt commented May 8, 2017

Another interesting difference is the use of a ceil mode in pooling:
ethereon/caffe-tensorflow#112

I'm a bit doubtful that this is the key cause of the performance difference between the paper and the Keras implementations, however, especially considering the tiramisu paper didn't use pretrained weights.
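For context, ceil versus floor only matters when the pooling window doesn't tile the input evenly; a minimal sketch of the standard output-size formula (Caffe pools with ceil, TensorFlow's 'VALID' pooling uses floor):

```python
import math

def pool_out_size(in_size, pool, stride, ceil_mode=False):
    """Spatial output length of a pooling layer along one dimension."""
    f = math.ceil if ceil_mode else math.floor
    return f((in_size - pool) / stride) + 1

# A 7-pixel input with 2x2/stride-2 pooling: floor gives 3, ceil gives 4,
# so downsample/upsample paths can end up off by one pixel per stage.
```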
