
Shufflenet as backbone #67

Closed
YellowKyu opened this issue Feb 11, 2019 · 4 comments
@YellowKyu

Hey guys,

Has anyone tried to replace the backbone with something like ShuffleNet or MobileNet?
Since the Xception model is not released, it could be a good alternative to improve inference speed!
I'm trying to plug architecture.py from https://github.com/TropComplique/shufflenet-v2-tensorflow into network_desp.py, but during training the rpn_cls_loss oscillates between 0.5 and 0.9 without decreasing further...

Thanks for your help!
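
For reference, a minimal sketch of how such a backbone swap could be wired up, with plain strided convs standing in for the real ShuffleNet-v2 stages from architecture.py (the backbone_endpoints helper, stage names, and channel widths here are illustrative, not the repo's actual API):

import tensorflow as tf

def backbone_endpoints(images):
    # Stand-in stages (plain strided convs) just to show the wiring; in
    # practice these would be the ShuffleNet-v2 stages from architecture.py.
    end_points = {}
    net = tf.layers.conv2d(images, 24, 3, strides=2, padding='same', name='Conv1')
    net = tf.layers.max_pooling2d(net, 3, strides=2, padding='same')               # stride 4
    net = tf.layers.conv2d(net, 116, 3, strides=2, padding='same', name='Stage2')  # stride 8
    end_points['Stage2'] = net
    net = tf.layers.conv2d(net, 232, 3, strides=2, padding='same', name='Stage3')  # stride 16
    end_points['Stage3'] = net
    net = tf.layers.conv2d(net, 464, 3, strides=2, padding='same', name='Stage4')  # stride 32
    end_points['Stage4'] = net
    return end_points

# network_desp.py would then pick an endpoint to feed the RPN, e.g.:
# rpn_features = backbone_endpoints(images)['Stage3']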

@karansomaiah

Hey @YellowKyu
I did try mobilenet_v1, but I hit NaN loss very early in training. Reducing the learning rate didn't help. I also tried experimenting with the Xception-like network mentioned in the paper and faced the same issue. Let me know if you find something.

  • Karan
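
A common first check when the loss goes NaN this early, and a lower learning rate doesn't cure it, is gradient explosion. A minimal TF1-style sketch of global-norm gradient clipping, on a toy loss rather than the repo's actual training graph:

import tensorflow as tf

# Toy regression graph standing in for the detector's total loss.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
pred = tf.layers.dense(x, 1)
total_loss = tf.reduce_mean(tf.squared_difference(pred, y))

optimizer = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9)
grads_and_vars = optimizer.compute_gradients(total_loss)
grads, variables = zip(*grads_and_vars)
clipped, _ = tf.clip_by_global_norm(grads, clip_norm=10.0)  # cap the global gradient norm
train_op = optimizer.apply_gradients(list(zip(clipped, variables)))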

@YellowKyu
Author

@karansomaiah Hi there! Which feature maps are you feeding to the RPN and the large separable convolution? I get a high loss (around 10-15) with ShuffleNet, but no NaN...

@karansomaiah

Have you solved it? @YellowKyu

These are the blocks:

# Each tuple configures one bottleneck unit; following the repo's
# resnet_utils convention these appear to be (depth, depth_bottleneck, stride, rate).
blocks = [
    resnet_utils.Block('block1', bottleneck,
                       [(144, 24, 2, 1)] + [(144, 24, 1, 1)] * 3),
    resnet_utils.Block('block2', bottleneck,
                       [(288, 144, 2, 1)] + [(288, 144, 1, 1)] * 7),
    resnet_utils.Block('block3', bottleneck,
                       [(576, 288, 1, 1)] + [(576, 288, 1, 1)] * 3),
]

And I was passing the block2 features to the RPN.
Also, digging into the PSAlign code, I suspect the loss is high because of the hard-coded spatial scale for resnet101 in the original code. Scaling appropriately for the reduced size of the feature maps should fix the issue.
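
To make that concrete, a minimal sketch of deriving the scale from the backbone's output stride instead of hard-coding it (ps_roi_align and its spatial_scale argument below are illustrative stand-ins for the repo's PSAlign op):

def spatial_scale_for(feature_stride):
    # The original code assumes stride-16 resnet101 features (scale 1/16);
    # a stride-32 map would need 1/32, and so on.
    return 1.0 / feature_stride

# Hypothetical call site:
# psroi_features = ps_roi_align(features, rois,
#                               spatial_scale=spatial_scale_for(16))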

@YellowKyu
Author

Hi @karansomaiah,

For MobileNet, I fed Conv8_pointwise to the RPN and Conv11_pointwise to the large separable conv, and it converged nicely.
For ShuffleNet, I also managed to make it converge, but I only used Stage3 for both the RPN and the large separable conv. I noticed the issue is related to the resolution of my feature maps, which matches what you discovered with PSAlign. Did you try to modify PSAlign?
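
For anyone wiring this up with the tf-slim MobileNet (assuming nets/mobilenet_v1.py from tensorflow/models, where the pointwise layers are named 'Conv2d_<n>_pointwise'), the endpoint selection could look like:

import tensorflow as tf
from nets import mobilenet_v1  # slim nets from the tensorflow/models repo

images = tf.placeholder(tf.float32, [1, 800, 1200, 3])
_, end_points = mobilenet_v1.mobilenet_v1_base(
    images, final_endpoint='Conv2d_13_pointwise')

rpn_features = end_points['Conv2d_8_pointwise']    # fed to the RPN
head_features = end_points['Conv2d_11_pointwise']  # fed to the large separable conv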
