Confusing Resizing in the Pipelines #8254

Closed
sarmientoj24 opened this issue Jun 23, 2022 · 2 comments

sarmientoj24 commented Jun 23, 2022

I would like to understand how MMDetection resizes and pads images, and how keep_ratio comes into play.

Consider the config

transforms = [
    dict(type='Resize', img_scale=(640, 640), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle', keys=['img']),
    dict(type='Collect', keys=['img'])
]

If I have an image that is 1400x1900, what happens here?

  1. Does it resize it to Mx640 where M is just the short side converted from the ratio?
  2. Does it get padded?
  3. What is the output here?

Is it possible to mimic YOLOv5-style square training, where the image is resized to fit the 640x640 target with its aspect ratio kept and the remainder filled by padding?

Additional

It seems that pytorch2onnx, for both the ONNX and PyTorch results, resizes the input image to the test_cfg image size, and that the predictions are given for that image size rather than rescaled to the original image resolution. It also seems that it doesn't pad the image.

Meanwhile, the image_demo provides the outputs rescaled to the original input shape.

Could you please elaborate on this? It is rather confusing.

sarmientoj24 (Author) commented

@RangiLyu any update on this?

RangiLyu (Member) commented

Sorry for the late reply.

> Does it resize it to Mx640 where M is just the short side converted from the ratio?

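Yes. With keep_ratio=True the long side is scaled to 640 and the short side follows from the aspect ratio.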

> Does it get padded?

Resize won't pad the image. The Pad transform, dict(type='Pad', size_divisor=32), will.

> What is the output here?

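640 / 1900 * 1400 ≈ 472 (short side after Resize, so the image is 472x640)
472 + 8 (padding up to the next multiple of size_divisor=32) = 480, so the output is 480x640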
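
The arithmetic above can be checked with a small pure-Python sketch. It does not call MMDetection; the helper names `rescale_keep_ratio` and `pad_to_divisor` are hypothetical, and the round-to-nearest rule is an assumption about how the resized side is computed.

```python
import math


def rescale_keep_ratio(h, w, scale):
    """Shape after a keep-ratio resize that fits (h, w) inside `scale`.

    Hypothetical helper: the long side is limited by max(scale) and the
    short side by min(scale); the smaller of the two factors is used.
    """
    max_long, max_short = max(scale), min(scale)
    factor = min(max_long / max(h, w), max_short / min(h, w))
    # Assumed rounding rule: round to the nearest integer.
    return int(h * factor + 0.5), int(w * factor + 0.5)


def pad_to_divisor(h, w, divisor):
    """Round each side up to the next multiple of `divisor`."""
    return (math.ceil(h / divisor) * divisor,
            math.ceil(w / divisor) * divisor)


resized = rescale_keep_ratio(1400, 1900, (640, 640))
padded = pad_to_divisor(*resized, 32)
print(resized, padded)  # (472, 640) (480, 640)
```

The 640 side is already a multiple of 32, so only the 472 side grows.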

> Is it possible to mimic YOLOv5-style square training, where the image is resized to fit the 640x640 target with its aspect ratio kept and the remainder filled by padding?

Just use square padding: dict(type='Pad', pad_to_square=True).
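
As a rough sketch of that suggestion, the pipeline from the question would change only in its Pad entry. The `square_pad_shape` helper is hypothetical and only illustrates the resulting shape. It assumes the padded side equals the longer side of the resized image, and that the padding is added on the bottom/right rather than centered as in YOLOv5's letterbox, so it is worth verifying against the installed version.

```python
# Pipeline from the question with the Pad entry swapped for square padding.
transforms = [
    dict(type='Resize', img_scale=(640, 640), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', pad_to_square=True),  # replaces size_divisor=32
    dict(type='DefaultFormatBundle', keys=['img']),
    dict(type='Collect', keys=['img']),
]


def square_pad_shape(h, w):
    """Hypothetical helper: shape after padding to a square.

    Assumes the square side is the longer side of the input.
    """
    side = max(h, w)
    return side, side


# The 1400x1900 example is 472x640 after Resize, so it pads to 640x640.
print(square_pad_shape(472, 640))  # (640, 640)
```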
