Skip to content

Conversation

@FrancescoSaverioZuppichini
Copy link
Contributor

What does this PR do?

Following a discussion with @Narsil , argument is_thing_map in FeatureExtractor.post_process_panoptic_segmentation will default to consider all instances thing. Thus, it won't perform instances merging.

The user should always provide a correct is_thing_map if he wants to merge instances, currently, the code will default to COCO that may be unwanted.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Copy link
Contributor

@Narsil Narsil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not particularly fond of the names thing vs stuff but I imagine this is the standards ?

@FrancescoSaverioZuppichini
Copy link
Contributor Author

Not particularly fond of the names thing vs stuff but I imagine this is the standards ?

More or less :)

Copy link
Contributor

@Narsil Narsil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will approve since thing and stuff seem to be standard in that case.

A more descriptive name seems more suitable from a pure code standpoint so I'll let you decide @FrancescoSaverioZuppichini

What I had in mind was more along the lines of:

ids_to_fuse: Set[int], *optional* : If the id is in the set, then only 1 instance of them can be outputted, for instance the class sky could not be part of 2 separate instances. If no map is passed, no fusion occurs.

Other name options that come to mind:

  • class_ids_to_fuse
  • merge might be better than fuse.
  • single_instance_classes/ids
  • singleton_ids

No need to use a Dict imo, since the default is not fusing, not belonging to the dict would mean not fusing, and so the dict values become redundant.
And the name really describes how it is used.

@FrancescoSaverioZuppichini
Copy link
Contributor Author

FrancescoSaverioZuppichini commented Mar 7, 2022

Thanks @Narsil for the very nice feedback. Following your amazing set of examples, I believe class_ids_to_fuse (or label_ids_to_fuse) is the most descriptive name we can use.

As you said, a Dict is not necessary. My implementation followed more or less what was done by the authors, but we can improve it.

I'll update the code.

"id": current_segment_id,
"category_id": pred_class,
"is_thing": not is_stuff,
"label_id": pred_class,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change will break pipelines actually.

Tagging @sgugger for visibility of this BC.

The pipeline addition is so new that I am OK aligning here for sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know, I changed them to reflect the new naming

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pipeline for Maskformer was not included in the release I believe, so ok to break IMO!

# check if pred_class should be fused. For example, class "sky" cannot have more then one instance
should_fuse = False
if label_ids_to_fuse is not None:
should_fuse = pred_class in label_ids_to_fuse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forcing label_ids_to_fuse early on to be a set (maybe empty) is IMO better, since you can start relying on the variable type instead of having if ... None everywhere.

  if label_ids_to_fuse is None:
            logger.warning("`label_ids_to_fuse` unset. No instance will be fused.")
            label_ids_to_fuse = set()

And then here.:

should_fuse = pred_class in label_ids_to_fuse

No check necessary, you already changed label_ids_to_fuse to be a set.

Again the only reason why you don't use my_function(....label_ids_to_fuse=set()) is because set is mutable.
With immutable types (like tuple) it would be preferable to use a default argument imo.

Note: Technically you can use frozenset as the default argument but my personal take is that keeping things simple with None as default argument is easy enough and understandable too frozenset is not that widely used.
To be fair, set itself is probably not used that much either, but it does have a generator comprehension at least myset = {letter for letter in "hello"})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree, thanks for pointing that out! Updated

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

"id": current_segment_id,
"category_id": pred_class,
"is_thing": not is_stuff,
"label_id": pred_class,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pipeline for Maskformer was not included in the release I believe, so ok to break IMO!

@FrancescoSaverioZuppichini FrancescoSaverioZuppichini merged commit e9fa7cd into master Mar 7, 2022
@FrancescoSaverioZuppichini FrancescoSaverioZuppichini deleted the maskformer_feature_extractor_is_thing_map branch March 7, 2022 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants