Reason to have a fixed inference size (473x473) #103

Open
TouqeerAhmad opened this issue Feb 13, 2020 · 1 comment

Comments

@TouqeerAhmad

Hi, thank you for sharing the code and trained models!
I have a question specific to the demo in your PyTorch implementation. As I understand it, any incoming image is resized to base_size = 512, regardless of its original dimensions, while maintaining the aspect ratio. Inference is then run in a grid fashion over crops of 473x473.

My question is: why have a fixed crop_size or base_size at all? There is no fully connected layer in the architecture, so why is a fixed size needed? Is this an arbitrary choice of numbers, or is there a solid reason for it?

Thank you for your time!
Best,
Touqeer
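
For concreteness, here is a minimal PyTorch sketch of the resize-and-grid-crop inference described above. The function name `grid_inference`, its arguments, and the averaging of overlapping logits are illustrative assumptions rather than the repo's actual API; a full demo would typically also handle input normalization, flipping, and multi-scale testing.

```python
import math
import torch
import torch.nn.functional as F

def grid_inference(model, image, base_size=512, crop_size=473, num_classes=21):
    # image: (1, 3, H, W) float tensor; model is assumed to be in eval mode
    # and to return per-pixel logits of shape (1, num_classes, h, w).
    _, _, h, w = image.shape

    # Resize the long side to base_size, keeping the aspect ratio.
    scale = base_size / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    image = F.interpolate(image, size=(new_h, new_w), mode='bilinear',
                          align_corners=False)

    # Pad (right/bottom) so the image is at least one crop in each dimension.
    pad_h, pad_w = max(crop_size - new_h, 0), max(crop_size - new_w, 0)
    image = F.pad(image, (0, pad_w, 0, pad_h))
    _, _, ph, pw = image.shape

    logits = torch.zeros(1, num_classes, ph, pw)
    counts = torch.zeros(1, 1, ph, pw)
    stride = int(crop_size * 2 / 3)  # overlapping windows

    # Enough grid positions to cover the whole image; the last window in
    # each direction is clamped to the image boundary.
    grid_h = math.ceil(max(ph - crop_size, 0) / stride) + 1
    grid_w = math.ceil(max(pw - crop_size, 0) / stride) + 1
    for i in range(grid_h):
        for j in range(grid_w):
            y0 = min(i * stride, ph - crop_size)
            x0 = min(j * stride, pw - crop_size)
            crop = image[:, :, y0:y0 + crop_size, x0:x0 + crop_size]
            with torch.no_grad():
                logits[:, :, y0:y0 + crop_size, x0:x0 + crop_size] += model(crop)
            counts[:, :, y0:y0 + crop_size, x0:x0 + crop_size] += 1

    # Average overlapping predictions, drop padding, restore original size.
    logits = (logits / counts)[:, :, :new_h, :new_w]
    return F.interpolate(logits, size=(h, w), mode='bilinear', align_corners=False)
```

Because every pixel falls in at least one 473x473 window, dividing by the per-pixel window count yields an averaged prediction over the full image at its original resolution.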

@qizhuli

qizhuli commented Feb 21, 2020

I think the way Caffe is written, each layer's Reshape() method is called exactly once during layer construction (see the SetUp() method in layer.hpp), and afterwards the shapes of the blobs remain the same throughout training/testing. So by default, you can't have different input sizes for different images.

With that said, it's possible to implement your own layer that calls its Reshape() method inside each forward pass. I am not 100% certain, but I have a feeling that it could negatively impact network speed.
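
To illustrate the contrast with PyTorch: since PSPNet has no fully connected layer (as the question notes), a fully convolutional module accepts varying input sizes at runtime, so the fixed 473x473 crop is a framework and training convention rather than an architectural requirement. A minimal sketch with a toy network (the layer sizes here are illustrative, not PSPNet's):

```python
import torch
import torch.nn as nn

# A toy fully convolutional head: no fully connected layer anywhere,
# so the architecture itself imposes no fixed input size.
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, 21, kernel_size=1),  # per-pixel logits for 21 classes
).eval()

with torch.no_grad():
    for h, w in [(473, 473), (512, 384), (600, 800)]:
        out = net(torch.randn(1, 3, h, w))
        print(out.shape)  # torch.Size([1, 21, h, w]) -- tracks the input size
```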
