Training procedure cropping and resizing for semantic segmentation #41

ksagoog · 2021-07-23T18:28:20Z

Hi,

Thanks for your great paper. For the semantic segmentation model on ADE20K, you state the following:

"""Images are resized to 520 pixels side length.
We use random horizontal flipping and random rescaling in
the range ∈ (0.5, 2.0) for data augmentation. We train on
square random crops of size 480."""

I think I am misunderstanding the procedure: randomly rescaling a 520-pixel image by a factor in the range (0.5, 2.0) will produce some images whose side length is smaller than your random crop size of 480. Could you please clarify the order of operations and any details that are missing here? Thank you!

ranftlr · 2021-07-25T10:11:25Z

We used pytorch-encoding as our training framework. See here for the transform that is applied to the input during training: https://github.com/zhanghang1989/PyTorch-Encoding/blob/331ecdd5306104614cb414b16fbcd9d1a8d40e1e/encoding/datasets/base.py#L64

We used this class with base_size=520 and crop_size=480.
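
For readers who want the arithmetic spelled out, here is a minimal sketch of one common way a transform like this can reconcile the two sizes. It is not copied from the linked file. It assumes that the long edge is rescaled to a random length between 0.5 and 2.0 times base_size, and that any image smaller than crop_size is padded on the right and bottom before the random crop is taken. The function name `plan_sync_transform` and its return format are hypothetical, and it only computes sizes rather than touching pixels; please check the linked source for the actual behavior.

```python
import random


def plan_sync_transform(w, h, base_size=520, crop_size=480, rng=None):
    """Compute the sizes a flip/rescale/pad/crop pipeline would produce.

    Hypothetical helper for illustration only; it works on dimensions,
    not on image data.
    """
    rng = rng or random.Random()

    # Random horizontal flip with probability 0.5.
    flip = rng.random() < 0.5

    # Assumption: the long edge is resized to a random length in
    # [0.5 * base_size, 2.0 * base_size], keeping the aspect ratio.
    long_size = rng.randint(int(base_size * 0.5), int(base_size * 2.0))
    if h > w:
        oh = long_size
        ow = int(w * long_size / h + 0.5)
    else:
        ow = long_size
        oh = int(h * long_size / w + 0.5)

    # Assumption: if a side is shorter than the crop, pad it up to crop_size.
    pad_w = max(crop_size - ow, 0)
    pad_h = max(crop_size - oh, 0)
    padded_w, padded_h = ow + pad_w, oh + pad_h

    # Random square crop of crop_size inside the (possibly padded) image.
    x1 = rng.randint(0, padded_w - crop_size)
    y1 = rng.randint(0, padded_h - crop_size)

    return {
        "flip": flip,
        "resized": (ow, oh),
        "pad": (pad_w, pad_h),
        "crop_box": (x1, y1, x1 + crop_size, y1 + crop_size),
    }


if __name__ == "__main__":
    # Example: a 683x512 image with the settings mentioned above.
    print(plan_sync_transform(683, 512, rng=random.Random(0)))
```

Under these assumptions the crop is always 480 by 480: when the rescaled image is too small, the padding makes up the difference, which would resolve the apparent conflict raised in the question.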

ksagoog · 2021-07-26T04:10:45Z

Thanks!

ksagoog closed this as completed Jul 26, 2021
