Test Accuracy Stagnates #4

Closed
jain-avi opened this issue Jun 15, 2018 · 7 comments

Comments

@jain-avi

Can you tell me whether your training and testing accuracies always tracked each other? I am implementing a smaller, modified version of the network you coded, and my test accuracy seems to have stagnated at 81%.
Also, I think you have coded a different architecture: you add the output of the pool layer as well as the output of the pool+conv layers to the upsampled input, while the original architecture only adds the pool+conv output to the upsampled layer. Is that making all the difference?
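To make the difference concrete, here is a minimal sketch of the two variants I mean, assuming a PyTorch-style soft-mask down/up path; all class and argument names here are my own, not taken from this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskBranchFragment(nn.Module):
    """Toy fragment of a soft-mask down/up path, only to illustrate the add."""

    def __init__(self, channels, extra_skip_add=True):
        super().__init__()
        self.extra_skip_add = extra_skip_add
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # stand-in for the residual/conv block applied after pooling
        self.block = nn.Sequential(
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        pooled = self.pool(x)            # output of the pool layer
        processed = self.block(pooled)   # output of pool + conv
        up = F.interpolate(processed, size=x.shape[2:],
                           mode='bilinear', align_corners=False)
        if self.extra_skip_add:
            # variant I see in this repo: the pooled path is added back as well
            skip = F.interpolate(pooled, size=x.shape[2:],
                                 mode='bilinear', align_corners=False)
            return up + skip
        # variant I read from the paper / Caffe prototxt: only the pool+conv path
        return up

# quick shape check
x = torch.randn(1, 16, 32, 32)
print(MaskBranchFragment(16, extra_skip_add=True)(x).shape)   # torch.Size([1, 16, 32, 32])
print(MaskBranchFragment(16, extra_skip_add=False)(x).shape)  # torch.Size([1, 16, 32, 32])
```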

@tengshaofeng
Owner

@Neo96Mav , which network did you use? Or did you modify the network yourself based on my code?

@jain-avi
Author

I used your network and the official Caffe network for reference, and implemented my own small network. I am not using attention modules at 4x4 because I feel those feature maps are too small, and I am only using one attention module at 8x8. My network is relatively small, and it's for CIFAR images only.
Can you let me know the intuition behind this?
[screenshot of the relevant code]
You have added the output of the residual block, as well as the output of the skip connection, to the upsampled layer!

@tengshaofeng
Owner

@Neo96Mav , this follows the Caffe network; I think the extra add is there to carry more detailed information forward. You can remove it to test its effectiveness.
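(In terms of the sketch posted above, constructing the fragment with extra_skip_add=False would correspond to removing that add.)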

@josianerodrigues

Hi @Neo96Mav,
Did you test the model using only one 8x8 Attention module? Was the accuracy better?

@jain-avi
Author

jain-avi commented Jul 5, 2018

Hi @josianerodrigues , I added the 4x4 attention module as well. I am stuck at 89.5% accuracy. Maybe my model is not big enough, or I am not using the exact same configuration, but I feel that should not have affected it so much. @tengshaofeng Do you have any idea why we can't match the authors' performance?

@tengshaofeng
Owner

@Neo96Mav , the paper only gives the architecture details of Attention-92 for ImageNet with 224x224 input, not for CIFAR-10, so I built the net ResidualAttentionModel_92_32input following my own understanding.
I have tested it on the CIFAR-10 test set; the result is as follows:
Accuracy of the model on the test images: 0.9354

Maybe some details are not right. You can refer to the data preprocessing in the paper and keep it the same as the authors', or you can tune the hyperparameters for better performance. You can also remove the add operation to test the network.
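As a concrete starting point, here is a minimal sketch of the common CIFAR-10 preprocessing (pad-4 random crop, horizontal flip, per-channel normalization); the mean/std values below are the widely used CIFAR-10 statistics, not numbers taken from the paper, so substitute the authors' exact preprocessing if it differs:

```python
import torchvision.transforms as transforms

# Commonly used CIFAR-10 channel statistics (assumed, not from the paper).
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # pad 4 pixels, random 32x32 crop
    transforms.RandomHorizontalFlip(),      # random left-right flip
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])
```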

@tengshaofeng
Owner

@Neo96Mav @josianerodrigues
The result is now 0.954.
