
[QUESTION] How to predict bounding boxes on big images with a model trained on small images? #125

Closed
MichaelMonashev opened this issue Nov 13, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@MichaelMonashev
Contributor

I have efficientdet-d1 trained on 640x640 images, and I am trying to predict bboxes on a 2048x2048 image.

I loaded the model from a snapshot and tried to change image_size in model.config to recreate the anchors, but got the error "omegaconf.errors.ReadonlyConfigError: Cannot change read-only config container".

I looked at reset_head(), but it does not change the anchors.

I think I am going about this the wrong way.

How do I predict bounding boxes on big images with a model trained on small images?
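For context on why the image size matters: the anchors are generated from `config.image_size`, with each pyramid level laying out anchors on an `(image_size / stride)` grid, so changing the input size changes the whole anchor set. A rough sketch of the count, assuming the common EfficientDet defaults of 3 scales x 3 aspect ratios per cell and strides 8 through 128 (`anchor_count` is a hypothetical helper, not part of the library):

```python
def anchor_count(image_size, strides=(8, 16, 32, 64, 128),
                 num_scales=3, num_ratios=3):
    """Count anchors for a square input of side image_size.

    Each feature level has an (image_size // stride)^2 grid of cells,
    and each cell gets num_scales * num_ratios anchors.
    """
    per_cell = num_scales * num_ratios
    return sum((image_size // s) ** 2 * per_cell for s in strides)

print(anchor_count(640))   # -> 76725 anchors for a 640x640 input
print(anchor_count(2048))  # -> 785664 anchors for a 2048x2048 input
```

This is why the model has to be rebuilt with the new `image_size` rather than just fed a bigger tensor.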

@MichaelMonashev MichaelMonashev added the bug Something isn't working label Nov 13, 2020
@MichaelMonashev MichaelMonashev changed the title How to predict bounding boxes on big images with a model trained on small images? [QUESTION] How to predict bounding boxes on big images with a model trained on small images? Nov 13, 2020
@MichaelMonashev
Contributor Author

MichaelMonashev commented Nov 13, 2020

I found a solution:

        config = get_efficientdet_config(model_name)
        config.image_size = (2048, 2048)

        self.model = create_model_from_config(
            config,
            bench_task='predict',
            num_classes=num_classes,
            pretrained=False,
            pretrained_backbone=False,
            checkpoint_path='/snapshots/efficientdet-d1.pth',
        )

Is it correct?

@Ekta246

Ekta246 commented Nov 17, 2020

Correct me if I am wrong!

So the EfficientDet-D0 model requires (512, 512) as the input image size.
My dataset consists of images of size (600, 800) (H×W).

So 1) I resize the original image (600×800) to (512, 512) and scale the ground-truth bounding boxes accordingly.
While observing the bbox prediction output, I see bbox coordinates higher than 512.

For example, with bboxes in [x1, y1, x2, y2] format, I observe an output bbox like [0, 0, 700, 600].
This means the output bbox is in the coordinates of the original image and not the resized (512, 512) image.

I see that out of 100 boxes, 80 are in the way discussed above. The surprising part is that these boxes have high classification scores, above 0.77.

Am I understanding the concept correctly?
Any help appreciated!
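(If I read the bench code right, the predict bench rescales boxes back to the original image frame using the loader's image scale, which would explain coordinates above 512 and is expected behavior, not a bug. The mapping between the two frames is just a per-axis scale; `boxes_to_original` below is a hypothetical helper sketching it, assuming [x1, y1, x2, y2] boxes and (H, W) sizes:)

```python
def boxes_to_original(boxes, resized=(512, 512), original=(600, 800)):
    """Scale [x1, y1, x2, y2] boxes from the resized frame back to the
    original image frame. `resized` and `original` are (H, W)."""
    sy = original[0] / resized[0]
    sx = original[1] / resized[1]
    return [[x1 * sx, y1 * sy, x2 * sx, y2 * sy] for x1, y1, x2, y2 in boxes]

# A box covering the whole 512x512 input maps to the whole 600x800 image:
print(boxes_to_original([[0, 0, 512, 512]]))  # -> [[0.0, 0.0, 800.0, 600.0]]
```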


@Ekta246

Ekta246 commented Nov 17, 2020


I have one more question: when you input a (2048, 2048) image while predicting, create_loader takes you to the ResizePad transform, where the image is resized to (640, 640) because efficientdet-d1 requires a 640×640 input, and the predictions are then calculated correspondingly, right?
So I don't think the larger image would cause a problem.

@MichaelMonashev
Contributor Author

@Ekta246 , I am using https://albumentations.ai/ for data augmentation. It can crop and resize images and their bboxes together.

> While observing the bbox prediction output, I see bbox coordinates higher than 512.

I am clipping the bboxes.
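(Clipping here just means limiting each coordinate to the image bounds. A minimal sketch, with `clip_boxes` as a hypothetical helper assuming [x1, y1, x2, y2] boxes and an (H, W) image size:)

```python
def clip_boxes(boxes, hw):
    """Clip [x1, y1, x2, y2] boxes to the image bounds (H, W)."""
    h, w = hw
    return [[min(max(x1, 0), w), min(max(y1, 0), h),
             min(max(x2, 0), w), min(max(y2, 0), h)]
            for x1, y1, x2, y2 in boxes]

print(clip_boxes([[-4, 10, 700, 600]], (512, 512)))  # -> [[0, 10, 512, 512]]
```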
