Reason to have a fixed inference size (473x473) #103

Open
TouqeerAhmad opened this issue Feb 13, 2020 · 1 comment

Comments

@TouqeerAhmad

Hi, thank you for sharing the code and trained models!
I have a question specific to the demo in your PyTorch implementation. As I understand it, any incoming image is resized to base_size = 512, regardless of its original dimensions, while maintaining the aspect ratio. Inference is then run in a grid fashion over crops of 473x473.

My question is: why have a fixed crop_size or base_size at all? There is no fully connected layer in the architecture, so why is a fixed size needed? Is this an arbitrary choice of numbers, or is there a solid reason for it?

Thank you for your time!
Best,
Touqeer
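
For concreteness, here is a minimal PyTorch sketch of the resize-and-grid-crop inference described above. The function name `grid_inference`, its arguments, and the averaging of overlapping logits are illustrative assumptions rather than the repo's actual API; a full demo would typically also handle input normalization, flipping, and multi-scale testing.

```python
import math
import torch
import torch.nn.functional as F

def grid_inference(model, image, base_size=512, crop_size=473, num_classes=21):
    # image: (1, 3, H, W) float tensor; model is assumed to be in eval mode
    # and to return per-pixel logits of shape (1, num_classes, h, w).
    _, _, h, w = image.shape

    # Resize the long side to base_size, keeping the aspect ratio.
    scale = base_size / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    image = F.interpolate(image, size=(new_h, new_w), mode='bilinear',
                          align_corners=False)

    # Pad (right/bottom) so the image is at least one crop in each dimension.
    pad_h, pad_w = max(crop_size - new_h, 0), max(crop_size - new_w, 0)
    image = F.pad(image, (0, pad_w, 0, pad_h))
    _, _, ph, pw = image.shape

    logits = torch.zeros(1, num_classes, ph, pw)
    counts = torch.zeros(1, 1, ph, pw)
    stride = int(crop_size * 2 / 3)  # overlapping windows

    # Enough grid positions to cover the whole image; the last window in
    # each direction is clamped to the image boundary.
    grid_h = math.ceil(max(ph - crop_size, 0) / stride) + 1
    grid_w = math.ceil(max(pw - crop_size, 0) / stride) + 1
    for i in range(grid_h):
        for j in range(grid_w):
            y0 = min(i * stride, ph - crop_size)
            x0 = min(j * stride, pw - crop_size)
            crop = image[:, :, y0:y0 + crop_size, x0:x0 + crop_size]
            with torch.no_grad():
                logits[:, :, y0:y0 + crop_size, x0:x0 + crop_size] += model(crop)
            counts[:, :, y0:y0 + crop_size, x0:x0 + crop_size] += 1

    # Average overlapping predictions, drop padding, restore original size.
    logits = (logits / counts)[:, :, :new_h, :new_w]
    return F.interpolate(logits, size=(h, w), mode='bilinear', align_corners=False)
```

Because every pixel falls in at least one 473x473 window, dividing by the per-pixel window count yields an averaged prediction over the full image at its original resolution.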

@qizhuli

qizhuli commented Feb 21, 2020

I think the way Caffe is written, each layer's Reshape() method is called exactly once during layer construction (see the SetUp() method in layer.hpp), and afterwards the shapes of the blobs remain the same throughout training/testing. So by default, you can't have different input sizes for different images.

With that said, it's possible to implement your own layer that calls its Reshape() method inside each forward pass. I am not 100% certain, but I have a feeling that it could negatively impact network speed.
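
To illustrate the contrast with PyTorch: since PSPNet has no fully connected layer (as the question notes), a fully convolutional module accepts varying input sizes at runtime, so the fixed 473x473 crop is a framework and training convention rather than an architectural requirement. A minimal sketch with a toy network (the layer sizes here are illustrative, not PSPNet's):

```python
import torch
import torch.nn as nn

# A toy fully convolutional head: no fully connected layer anywhere,
# so the architecture itself imposes no fixed input size.
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, 21, kernel_size=1),  # per-pixel logits for 21 classes
).eval()

with torch.no_grad():
    for h, w in [(473, 473), (512, 384), (600, 800)]:
        out = net(torch.randn(1, 3, h, w))
        print(out.shape)  # torch.Size([1, 21, h, w]) -- tracks the input size
```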
