fill-mask target for full words not enabled? #17374

Description

@i-am-neo

System Info

- `transformers` version: 4.19.2
- Platform: Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.7.13
- Huggingface_hub version: 0.6.0
- PyTorch version (GPU?): 1.11.0+cu113 (False)
- Tensorflow version (GPU?): 2.8.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

Who can help?

@Narsil and @LysandreJik (?)

How can one use RoBERTa with the fill-mask pipeline to get a full-word candidate and its "full" score for roberta-large? Open to workaround solutions.

My example:
sentence = f"Nitzsch argues against the doctrine of the annihilation of the wicked, regards the teaching of Scripture about eternal {nlp.tokenizer.mask_token} as hypothetical."
Notebook here.

Using the pipeline, the output I get is:

The specified target token `damnation` does not exist in the model vocabulary. Replacing with `Ġdamn`.
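For reference, a minimal sketch of the pipeline call (the exact notebook may differ; the `targets=["damnation"]` argument is inferred from the warning above):

```python
from transformers import pipeline

nlp = pipeline("fill-mask", model="roberta-large")

sentence = (
    "Nitzsch argues against the doctrine of the annihilation of the wicked, "
    f"regards the teaching of Scripture about eternal {nlp.tokenizer.mask_token} "
    "as hypothetical."
)

# Restricting candidates to "damnation" triggers the vocabulary warning,
# because RoBERTa stores no single token for that word.
print(nlp(sentence, targets=["damnation"]))
```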

Thanks.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

See notebook above.
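To reproduce the failure without the notebook, here is a minimal check (my sketch, not from the notebook) of why the target is rejected: RoBERTa's byte-level BPE marks a word-initial space with `Ġ` and splits any word it does not store whole, so there is no single vocabulary entry to match against.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")

# A mid-sentence word carries a leading space; the BPE splits it into
# subwords, so "damnation" has no single-token form to use as a target.
print(tokenizer.tokenize(" damnation"))  # presumably ['Ġdamn', 'ation']
```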

Expected behavior

I expect to see "damnation" with its score.
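One possible workaround, as a rough sketch rather than an official API: expand the word into its subword tokens, insert that many mask tokens, and multiply the per-position probabilities. RoBERTa scores each mask independently, so this is only an approximation of the joint full-word score; `full_word_score` is a hypothetical helper name.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
model.eval()

def full_word_score(template: str, word: str) -> float:
    """Approximate the probability of `word` filling the single mask.

    Replaces the mask with one mask per subword of `word` and multiplies
    the per-position probabilities; positions are scored independently,
    so this is an estimate, not a true joint probability.
    """
    # Leading space so the BPE produces the mid-sentence ("Ġ...") form.
    word_ids = tokenizer(" " + word, add_special_tokens=False)["input_ids"]
    text = template.replace(
        tokenizer.mask_token, tokenizer.mask_token * len(word_ids)
    )
    inputs = tokenizer(text, return_tensors="pt")
    mask_positions = (
        inputs["input_ids"][0] == tokenizer.mask_token_id
    ).nonzero(as_tuple=True)[0].tolist()

    with torch.no_grad():
        probs = model(**inputs).logits[0].softmax(dim=-1)

    score = 1.0
    for pos, tok_id in zip(mask_positions, word_ids):
        score *= probs[pos, tok_id].item()
    return score

sentence = (
    "Nitzsch argues against the doctrine of the annihilation of the wicked, "
    f"regards the teaching of Scripture about eternal {tokenizer.mask_token} "
    "as hypothetical."
)
print(full_word_score(sentence, "damnation"))
```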
