
Adding support for multiple mask tokens. #14716

Merged 6 commits into huggingface:master on Dec 14, 2021

Conversation

@Narsil (Contributor) commented Dec 10, 2021

What does this PR do?

When presented with multiple masks, it's impossible to retrieve the joint probabilities.
Instead of trying to work around that (see discussions in the previous PR), this PR
simply outputs the raw top_k propositions at each locus, since it gets tricky to find a good
proxy for "joint probabilities". Instead of trying to solve this impossible problem, we simply
show exactly what the model outputs.

@naveenjafer is mentioned as co-author, since much of this PR was pulled from his earlier work.

This PR was resurrected partly because Perceiver (a byte-level model) needs this type of masking to be useful.
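As a rough illustration of the behavior described above (plain-Python sketch, not the pipeline's actual code), each masked position gets its own independent top_k list, and no attempt is made to combine per-position scores into a joint probability:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_per_mask(logits_per_mask, vocab, k=2):
    """For each masked position, independently return the k most probable
    tokens with their per-position probabilities. Deliberately no joint
    probability is computed, mirroring the PR's "show the raw output" choice."""
    results = []
    for logits in logits_per_mask:
        probs = softmax(logits)
        ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
        results.append(ranked[:k])
    return results

# Toy vocabulary and made-up logits for two [MASK] positions.
vocab = ["paris", "london", "cat"]
logits = [[4.0, 2.0, 0.1],   # first mask
          [1.0, 3.5, 0.2]]   # second mask
for i, candidates in enumerate(top_k_per_mask(logits, vocab)):
    print(i, candidates)
```

The key point is that each inner list is ranked on its own; multiplying per-position scores would not yield a true joint probability, because the model scores each mask conditioned on the others still being masked.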

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@Narsil Narsil force-pushed the fill_mask_multi_mask branch 2 times, most recently from 8068d62 to 61686e2 Compare December 14, 2021 10:07
Narsil and others added 2 commits December 14, 2021 11:10

- Original implem: huggingface#10222 (Co-authored-by: njafer <naveen.jafer@oracle.com>)
- We add information to the tasks to specify, for tasks where we know for sure, whether the tokenizer/feature_extractor is needed or not.
@LysandreJik (Member) left a comment

This looks good to me, cool work @Narsil.

Regarding the conversations in #10222, should this be marked as experimental for the first few weeks so that conversation may occur on this feature?

Could you add a note about the support of this feature in the documentation?

Comment on lines +271 to +279
NO_FEATURE_EXTRACTOR_TASKS = set()
NO_TOKENIZER_TASKS = set()
for task, values in SUPPORTED_TASKS.items():
    if values["type"] == "text":
        NO_FEATURE_EXTRACTOR_TASKS.add(task)
    elif values["type"] in {"audio", "image"}:
        NO_TOKENIZER_TASKS.add(task)
    elif values["type"] != "multimodal":
        raise ValueError(f"SUPPORTED_TASK {task} contains invalid type {values['type']}")
Member:

I like this approach, it should make the pipelines more robust to models with different capabilities in terms of preprocessors.
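To make the mechanism concrete, here is a self-contained sketch of how such sets could drive preprocessor loading; the `SUPPORTED_TASKS` entries and the `needs()` helper are hypothetical, not the library's actual factory code:

```python
# Hypothetical sketch: classify tasks by modality, then decide which
# preprocessors (tokenizer / feature extractor) each task needs.
SUPPORTED_TASKS = {
    "fill-mask": {"type": "text"},
    "image-classification": {"type": "image"},
    "visual-question-answering": {"type": "multimodal"},
}

NO_FEATURE_EXTRACTOR_TASKS = set()
NO_TOKENIZER_TASKS = set()
for task, values in SUPPORTED_TASKS.items():
    if values["type"] == "text":
        NO_FEATURE_EXTRACTOR_TASKS.add(task)
    elif values["type"] in {"audio", "image"}:
        NO_TOKENIZER_TASKS.add(task)
    elif values["type"] != "multimodal":
        raise ValueError(f"SUPPORTED_TASK {task} contains invalid type {values['type']}")

def needs(task):
    # Returns (load_tokenizer, load_feature_extractor) for a task.
    # Multimodal tasks fall in neither "NO_*" set, so they load both.
    return (task not in NO_TOKENIZER_TASKS,
            task not in NO_FEATURE_EXTRACTOR_TASKS)

print(needs("fill-mask"))
print(needs("image-classification"))
print(needs("visual-question-answering"))
```

Multimodal tasks land in neither exclusion set, which is what lets optionally multimodal models like Perceiver keep both preprocessors available.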

Contributor:

Looks good to me as well!

src/transformers/pipelines/fill_mask.py (outdated; resolved)
@@ -340,3 +338,27 @@ def fill_mask_with_duplicate_targets_and_top_k(self, model, tokenizer):
# The target list contains duplicates, so we can't output more
# than them
self.assertEqual(len(outputs), 3)

def fill_mask_with_multiple_masks(self, model, tokenizer):
@NielsRogge (Contributor) commented Dec 14, 2021

Can we perhaps add a test for Perceiver (similar to the image classification models)?

Or is this not required here?

@Narsil (Contributor, Author) replied:

I think this PR is pretty orthogonal to Perceiver.

We could add a slow test for sure, but it doesn't have to be Perceiver specific.

In fact, I'll add something on the random model (it just needs to be consistent, actual values are less important)
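A consistency-only check of the kind described (validating output structure rather than exact values) could be sketched as follows; the helper and the mocked output are hypothetical stand-ins for a real pipeline call:

```python
# Hypothetical structure-only check: with N masks, the pipeline is
# expected to return a list of N lists, each holding top_k candidate
# dicts. Here we validate a mocked output rather than a real model's.
def check_multi_mask_output(outputs, num_masks, top_k):
    assert isinstance(outputs, list) and len(outputs) == num_masks
    for per_mask in outputs:
        assert isinstance(per_mask, list) and len(per_mask) == top_k
        for candidate in per_mask:
            # Each candidate carries the usual fill-mask fields.
            assert {"sequence", "token", "token_str", "score"} <= set(candidate)

mocked = [
    [{"sequence": "...", "token": 1, "token_str": "a", "score": 0.5},
     {"sequence": "...", "token": 2, "token_str": "b", "score": 0.3}],
    [{"sequence": "...", "token": 3, "token_str": "c", "score": 0.6},
     {"sequence": "...", "token": 4, "token_str": "d", "score": 0.2}],
]
check_multi_mask_output(mocked, num_masks=2, top_k=2)
```

Because only the shape and keys are asserted, such a test stays stable across random-weight models, matching the point that consistency matters more than actual values here.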

@LysandreJik (Member) left a comment

Ok this looks good to me! Looking forward to the additional test, feel free to merge whenever.

@Narsil Narsil merged commit e7ed7ff into huggingface:master Dec 14, 2021
@Narsil Narsil deleted the fill_mask_multi_mask branch December 14, 2021 15:46
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* Adding support for multiple mask tokens.

- Original implem: huggingface#10222

Co-authored-by: njafer <naveen.jafer@oracle.com>

* In order to accommodate optionally multimodal models like Perceiver

we add information to the tasks to specify tasks where we know for sure
if we need the tokenizer/feature_extractor or not.

* Adding info in the documentation about multi masks.

+ marked as experimental.

* Add a copy() to prevent overriding the same tensor over and over.

* Fixup.

* Adding small test for multi mask with real values.

Co-authored-by: njafer <naveen.jafer@oracle.com>