Upsample size mismatch in segmentation models #803

Closed
davidtvs opened this issue Mar 28, 2023 · 3 comments

@davidtvs

Describe the bug

Depending on the input image size, feature maps upsampled with nn.Upsample don't always match the size of the skip connection. This is a known issue; some reference links:

Replacing nn.Upsample with torch.nn.functional.interpolate seems to be the recommended solution.
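
For illustration, a minimal standalone sketch of the mismatch and of the size-aware fix; the channel counts and spatial sizes below are made up, not taken from PP-LiteSeg:

import torch
import torch.nn.functional as F
from torch import nn

# Odd spatial sizes, as produced when the input is not divisible by the backbone stride.
skip = torch.randn(1, 64, 65, 129)  # skip connection at stride 16
x = torch.randn(1, 64, 33, 65)      # deeper feature map at stride 32

up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
print(up(x).shape)  # torch.Size([1, 64, 66, 130]) -- one pixel larger than the skip

# Interpolating to the skip's exact size avoids the mismatch.
x_up = F.interpolate(x, size=skip.shape[-2:], mode="bilinear", align_corners=False)
print(x_up.shape)   # torch.Size([1, 64, 65, 129])
out = torch.cat([x_up, skip], dim=1)  # concatenation now works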

To Reproduce

Here's a snippet using PP-LiteSeg. The dataset is Cityscapes, but that's not important; the image size is the key factor. I imagine the issue affects all models that use nn.Upsample and concatenate with skip connections:

from super_gradients.training import models, dataloaders, Trainer
from super_gradients.common.object_names import Models
from super_gradients.training.metrics import IoU


trainer = Trainer(experiment_name="eval-pp-liteseg-b75")
# long_size=1025 gives image sides that are not divisible by 32
val_loader = dataloaders.cityscapes_stdc_seg75_val(
    dataset_params={"transforms": [{"SegRescale": {"long_size": 1025}}]},
    dataloader_params={"batch_size": 1},
)
model = models.get(
    Models.PP_LITE_B_SEG75,
    pretrained_weights="cityscapes",
)
metric = IoU(num_classes=20, ignore_index=19)
miou = trainer.test(
    model=model,
    test_loader=val_loader,
    test_metrics_list=[metric],
    metrics_progress_verbose=False
)[0].cpu().item()
print(f"mIoU: {miou}")

Results in an error:

  File ".../src/super_gradients/training/models/segmentation_models/ppliteseg.py", line 52, in forward
    atten = torch.cat([*self._avg_max_spatial_reduce(x, use_concat=False), *self._avg_max_spatial_reduce(skip, use_concat=False)], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 66 but got size 65 for tensor number 2 in the list.
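
For what it's worth, the 66 vs 65 numbers can be reproduced with simple size bookkeeping, assuming the backbone halves the spatial size with padded stride-2 convolutions (which round odd sizes up) and the decoder upsamples by a fixed factor of 2:

# Width of the long side after SegRescale(long_size=1025) on a 2048x1024 Cityscapes image.
w = 1025
sizes = [w]
for _ in range(5):                      # backbone strides 2, 4, 8, 16, 32
    sizes.append((sizes[-1] + 1) // 2)  # ceil halving, as a k=3, s=2, p=1 conv does
print(sizes)                            # [1025, 513, 257, 129, 65, 33]
print(sizes[-1] * 2, "vs", sizes[-2])   # 66 vs 65 -> the reported mismatch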

Expected behavior

Fully convolutional segmentation models should work for all input image sizes.

Environment:

  • Ubuntu
  • super-gradients v3.0.7
  • PyTorch 1.11
@BloodAxe
Collaborator

BloodAxe commented Apr 3, 2023

Hi! Thanks for raising this issue.

TL;DR: one cannot feed an arbitrarily sized image to the model.

I believe the root cause of the problem is that the input image has a size that is not evenly divisible by the maximum stride of the backbone (32).
In that case the backbone produces feature maps whose sizes are no longer exact halvings of one another, so a fixed 2x upsample overshoots the skip connection by one pixel.

Indeed, explicitly specifying the output size for the upsample operations could patch this.
However, that would only work for interpolation-based upsampling, not for nn.PixelShuffle or nn.ConvTranspose2d upsampling.

We will definitely look into it, but for now I suggest preprocessing the input images so that their size is divisible by 32.
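
A minimal sketch of that workaround, padding the input on the bottom/right up to the next multiple of 32 and cropping the prediction back afterwards (the helper below is an illustration, not a super-gradients utility):

import torch
import torch.nn.functional as F

def pad_to_multiple(image: torch.Tensor, multiple: int = 32):
    """Zero-pad an NCHW tensor on the bottom/right so H and W are divisible by `multiple`."""
    h, w = image.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return F.pad(image, (0, pad_w, 0, pad_h)), (h, w)

image = torch.randn(1, 3, 513, 1025)       # sides not divisible by 32
padded, (h, w) = pad_to_multiple(image)
print(padded.shape)                        # torch.Size([1, 3, 544, 1056])
# prediction = model(padded)[..., :h, :w]  # crop the logits back to the original size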

@davidtvs
Author

davidtvs commented Apr 7, 2023

At least for nn.ConvTranspose2d there's output_padding to address this issue, see: https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html#convtranspose2d

Looks like there's no way around it for nn.PixelShuffle though; maybe that would be a good feature request for PyTorch.
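
For the record, here's a toy example of what I mean (made-up shapes, not super-gradients code). nn.ConvTranspose2d also accepts output_size in forward, which computes the output_padding needed to hit the skip connection's size:

import torch
from torch import nn

# A learned 2x upsampling; output_padding=1 makes it double even sizes exactly.
deconv = nn.ConvTranspose2d(64, 64, kernel_size=3, stride=2, padding=1, output_padding=1)

x = torch.randn(1, 64, 33, 65)      # stride-32 feature map from an odd-sized input
skip = torch.randn(1, 64, 65, 129)  # stride-16 skip connection

print(deconv(x).shape)                               # torch.Size([1, 64, 66, 130]) -- off by one
print(deconv(x, output_size=skip.shape[-2:]).shape)  # torch.Size([1, 64, 65, 129]) -- matches the skip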

ranrubin added the 🐛 Bug (Something isn't working) label on May 4, 2023