input size and crop #22
Thanks a lot for your awesome performing model!
I'm wondering about scaling and random crop: for training you first scale and then crop to 288x288, so the tensor has this size (288). What role does scaling play here, and why do you talk about a 320x320 input size instead of 288x288?

```python
RescaleT(320),
RandomCrop(288),
```

With your latest model update, upscaling seems to support different ratios, as far as I can tell. Or is only square input supported, or would e.g. 640x480 work as well?
Thanks for your insightful question. In the training process, resizing the images to 320x320 is done so we can form training batches (within each batch, the images' spatial resolutions have to be the same). Randomly cropping the 320x320 images to 288x288 is a data augmentation technique for obtaining position-invariant models. In the testing process, we need the same scale as in the training stage; therefore, only the resizing operation is used there.
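In code, that train/test split of responsibilities looks roughly like the following. RescaleT and RandomCrop are the repo's own transform classes (from data_loader.py); the ToTensorLab step and the exact Compose wiring are assumptions based on typical usage, so treat this as a sketch rather than the verbatim training and test scripts:

```python
from torchvision import transforms
from data_loader import RescaleT, RandomCrop, ToTensorLab  # repo classes

# Training: fix a common scale so images can be batched, then randomly
# crop to 288x288 as position-invariance augmentation.
train_transform = transforms.Compose([
    RescaleT(320),
    RandomCrop(288),
    ToTensorLab(flag=0),
])

# Testing: keep the same scale as training, but do not crop, since we
# need a prediction for the whole image.
test_transform = transforms.Compose([
    RescaleT(320),
    ToTensorLab(flag=0),
])
```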
In the latest updates, we changed the bilinear upsampling by replacing the upsampling factor (e.g. 2 or 4) with an upsampling target size (height, width). That means the upsampling operation will upsample the source feature maps to exactly the same size as the target feature maps, regardless of the target's dimensions (which could be odd or even numbers), and this avoids errors in concatenation. Therefore, the model is now able to handle arbitrary input sizes. You can give it a try by changing RescaleT(320) to whatever you want, but the performance is not guaranteed, as we mentioned in the tips: the modification is meant to facilitate retraining our network on different datasets.
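A minimal sketch of that change in PyTorch (the helper name and the shapes are illustrative, not the repo's exact code): instead of upsampling by a fixed factor, interpolate straight to the target feature map's spatial size, so the subsequent concatenation always lines up even when repeated pooling has produced odd dimensions.

```python
import torch
import torch.nn.functional as F

def upsample_like(src, tar):
    # Resize src to exactly tar's (H, W) instead of using a fixed factor.
    return F.interpolate(src, size=tar.shape[2:], mode='bilinear',
                         align_corners=False)

src = torch.randn(1, 64, 45, 60)      # odd height from repeated 2x pooling
tar = torch.randn(1, 32, 90, 120)
up = upsample_like(src, tar)
print(up.shape)                       # torch.Size([1, 64, 90, 120])
merged = torch.cat((up, tar), dim=1)  # concatenation now always lines up
```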
Thanks for your detailed answer. I'm starting to be afraid I completely miss the point here, sorry if this question is too dumb. As far as I can see, it doesn't matter what size RescaleT is called with during training: the input to net() is always the crop size (288) in net(inputs_v), since the crop happens after the scale. I understood the crop is for data augmentation, but shouldn't the test size then be the same as the one used for the crop?
"but should't be then the test size the same as used for crop?"
RES: No. The networks are usually (theoretically) translation invariant but
not scale invariant. The cropping mainly changes the translation. But it
doesn't change the receptive fields. In both training and test, keeping the
scaling consistent is necessary, while cropping isn't. Because most of the
networks are not scale invariant. Besides, cropping in testing will
introduce another problem. How can we achieve the complete prediction map
of the whole input image in the testing process.
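A toy example of the distinction, with purely illustrative numbers: a crop shifts where an object sits, while a rescale changes how many pixels the object occupies, and the latter is what the network's fixed receptive fields actually respond to.

```python
import numpy as np

img = np.zeros((320, 320), dtype=np.float32)
img[100:200, 100:200] = 1.0  # a 100x100-pixel "object"

# RandomCrop(288)-style crop: the object only moves; its scale is intact.
crop = img[16:304, 16:304]
print(crop.sum())            # 10000.0 -> still a ~100x100 px object

# Naive 2x downscale (a stand-in for rescaling): the object now spans
# ~50x50 px, so the same receptive field sees it at a different scale.
small = img[::2, ::2]
print(small.sum())           # 2500.0
```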
Thanks, now everything is clear, and I could reproduce good results with arbitrary sizes!
Thanks for your great work! I am using this in an iOS application and converted it to an MLModel. When I use the MLModel, I want to allow arbitrary input sizes, but it seems to support only square sizes; for portrait or landscape inputs like 240x320 or 320x300, I am getting an error. What is the solution? Is there a problem with the conversion?
It is safer to resize all inputs to 320x320, which should theoretically give better results. Since there are several downsampling and upsampling operations, your sizes may trigger errors in those parts. So it would be good to show the error; otherwise we can't give an exact solution.
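For reference, a hedged sketch of that workflow in PyTorch (net stands for a loaded U-2-Net; taking the first returned output as the fused saliency map and min-max normalizing follow the repo's test script, but the function itself is illustrative): resize any portrait or landscape input to the 320x320 training resolution, run the network, then resize the predicted map back to the original resolution.

```python
import torch
import torch.nn.functional as F

def predict_mask(net, image):
    # image: float tensor of shape (1, 3, H, W) with arbitrary H and W
    h, w = image.shape[2:]
    x = F.interpolate(image, size=(320, 320), mode='bilinear',
                      align_corners=False)
    with torch.no_grad():
        d0 = net(x)[0]  # first side output = fused saliency map
    pred = F.interpolate(d0, size=(h, w), mode='bilinear',
                         align_corners=False)
    # Min-max normalize to [0, 1], as the repo's test script does.
    return (pred - pred.min()) / (pred.max() - pred.min() + 1e-8)
```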
Thanks for your reply. Is it better to use a square image than a portrait or landscape image with one side (height or width) set to 320? Is it difficult to support that?
It is not difficult to support arbitrary resolutions, but our model was trained on 320x320 and should give the best performance at that size.
Hey, first of all, thank you for your work. I'm eagerly waiting to see your new paper (and model). Regarding the fact that the input sizes are different for training (288x288 after cropping) and testing (320x320 after resizing), you say that the scaling has to be consistent since models are generally not scale-invariant.
Once again, thanks a lot for your work. This model is a godsend.
@xuebinqin I have the same doubts.