-
Notifications
You must be signed in to change notification settings - Fork 25.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve semantic segmentation models #14355
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this, I left w few comments on the names.
@@ -76,6 +77,26 @@ def prepare_feat_extract_dict(self): | |||
} | |||
|
|||
|
|||
def prepare_semantic_single_inputs(): | |||
ds = load_dataset("hf-internal-testing/fixtures_ade20k", split="test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's give a better name than just ds
, we try to avoid such short names in the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks! Agree with both of you to keep the data augmentation part out of the feature extractor. LGTM!
36cd4fa
to
ed14f3e
Compare
* Improve tests * Improve documentation * Add ignore_index attribute * Add semantic_ignore_index to BEiT model * Add segmentation maps argument to BEiTFeatureExtractor * Simplify SegformerFeatureExtractor and corresponding tests * Improve tests * Apply suggestions from code review * Minor docs improvements * Streamline segmentation map tests of SegFormer and BEiT * Improve reduce_labels docs and test * Fix code quality * Fix code quality again
What does this PR do?
SegformerFeatureExtractor
trying to access non-existent.ndim
attribute #14332) for SegFormersemantic_ignore_index
, which defaults to 255. The loss functions of semantic segmentation models typically use 255 instead of -100. The reason for this is that some datasets include a 0 in the annotated segmentation maps to indicate "background", however it can be that background is not included in any of the labels of the dataset. e.g. ADE20k has 150 labels, but "background" is not included. Therefore, one reduces the labels of all segmentation maps by 1 value, and replaces the 0 by 255 as shown here. It's only after that that images get resized using PIL. However, if we replace values by -100, PIL can't read these images, and will thrown an error.segmentation_maps
toBeitFeatureExtractor
, and add corresponding tests.To do:
SegformerFeatureExtractor
currently includes thealign
,do_random_crop
anddo_pad
arguments at initialization, however I wonder whether it's maybe better to remove those, and only include the bare minimum in the feature extractors (similar toViTFeatureExtractor
) to get started (i.e. resizing, center cropping, normalizing). Things like random cropping and padding is maybe already a bit too much and makes the feature extractors more complex. It's also not easy to determine good default values for this feature extractor; should it randomly crop + pad by default, or not?SegformerFeatureExtractor
, if one agrees on this.preprocessor_config.json
of the semantic segmentation models on the hub.