semantic segmentation #4
Comments
We process each image in a sliding-window manner. As mentioned in the paper, we use the UperNet head as our decoder head.
Right, thanks.
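For readers unfamiliar with sliding-window evaluation, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the crop size, and the stride are assumptions chosen for illustration, and `predict` is a hypothetical stand-in for the backbone plus UperNet head. Overlapping windows are averaged per pixel, which is one common choice.

```python
import math


def window_starts(length, crop, stride):
    """Start offsets of windows of size `crop` that cover [0, length)."""
    if length <= crop:
        return [0]
    n = math.ceil((length - crop) / stride) + 1
    # Clamp the last window so it ends exactly at the image border.
    return [min(i * stride, length - crop) for i in range(n)]


def sliding_windows(height, width, crop, stride):
    """Return (top, left, bottom, right) boxes that tile the image."""
    return [
        (top, left, min(top + crop, height), min(left + crop, width))
        for top in window_starts(height, crop, stride)
        for left in window_starts(width, crop, stride)
    ]


def slide_inference(image, predict, crop, stride):
    """Average per-pixel scores over overlapping square windows.

    image:   2D nested list (one value per pixel, for simplicity).
    predict: hypothetical model call; maps a 2D crop to a 2D list
             of scores with the same shape as the crop.
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top, left, bottom, right in sliding_windows(height, width, crop, stride):
        tile = [row[left:right] for row in image[top:bottom]]
        scores = predict(tile)
        for dy, score_row in enumerate(scores):
            for dx, score in enumerate(score_row):
                total[top + dy][left + dx] += score
                count[top + dy][left + dx] += 1
    # Every pixel is covered at least once, so count is never zero.
    return [[t / c for t, c in zip(trow, crow)]
            for trow, crow in zip(total, count)]
```

With an assumed crop of 1024 and stride of 768, `sliding_windows(1024, 2048, 1024, 768)` yields three square tiles across a 1024x2048 Cityscapes frame, with the last tile clamped to the right border.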
@Andrew-Qibin I'm seeing rather poor Cityscapes segmentation results right out of the box with a volo_d2 trunk initialized from the ImageNet-pretrained weights: training starts at 20 IoU in the first epoch and only gets to 53 IoU. I probably have a tensor ordering wrong somewhere, but is there any trick to adapting the code to higher resolution? I had to override the positional encodings from the checkpoint with new higher-resolution positional encodings (1024), and I created a new forward() that supplies the features in N, C, H, W form. I'm not sure what I've got wrong right now.
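One thing that may be worth checking here: replacing the pretrained positional encodings with freshly initialized ones discards what they learned, and a common alternative in ViT-style models is to resample the pretrained grid to the new resolution. The sketch below shows bilinear resampling in plain Python. It is an illustration, not the VOLO authors' method: the function name, the (H, W, C) layout, and the align-corners convention are all assumptions.

```python
def resize_pos_embed(grid, new_h, new_w):
    """Bilinearly resample a positional-embedding grid.

    grid: nested list with assumed layout (H, W, C).
    Returns a nested list with layout (new_h, new_w, C).
    Uses the align-corners convention, so corner vectors are preserved.
    """
    old_h, old_w = len(grid), len(grid[0])

    def source_coord(i, new, old):
        # Map an output index to a fractional input index.
        return i * (old - 1) / (new - 1) if new > 1 else 0.0

    out = []
    for y in range(new_h):
        fy = source_coord(y, new_h, old_h)
        y0 = int(fy)
        y1 = min(y0 + 1, old_h - 1)
        wy = fy - y0
        row = []
        for x in range(new_w):
            fx = source_coord(x, new_w, old_w)
            x0 = int(fx)
            x1 = min(x0 + 1, old_w - 1)
            wx = fx - x0
            # Blend the four neighbouring embedding vectors channel by channel.
            row.append([
                (1 - wy) * ((1 - wx) * a + wx * b) + wy * ((1 - wx) * c + wx * d)
                for a, b, c, d in zip(grid[y0][x0], grid[y0][x1],
                                      grid[y1][x0], grid[y1][x1])
            ])
        out.append(row)
    return out
```

In a PyTorch model the same idea is usually expressed with `torch.nn.functional.interpolate` on the embedding permuted to N, C, H, W and then permuted back to whatever layout the checkpoint uses.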
Hello, thanks very much for sharing the code for your tremendous research!
For semantic segmentation, did you just run evaluation with multiple square tiles to handle the non-square resolution of Cityscapes? Can you share any details, like decoder head architecture?