Using an example from the BART docs:
https://huggingface.co/transformers/model_doc/bart.html#bartforconditionalgeneration
```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
TXT = "My friends are <mask> but they eat too many carbs."
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')

input_ids = tokenizer([TXT], return_tensors='pt')['input_ids']
logits = model(input_ids)[0]

masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
probs = logits[0, masked_index].softmax(dim=0)
values, predictions = probs.topk(5)

print(tokenizer.decode(predictions).split())
```
gives:
```
Some weights of the model checkpoint at facebook/bart-large were not used when initializing BartForConditionalGeneration: ['encoder.version', 'decoder.version']
- This IS expected if you are initializing BartForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BartForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
test:9: UserWarning: This overload of nonzero is deprecated:
    nonzero()
Consider using one of the following signatures instead:
    nonzero(*, bool as_tuple) (Triggered internally at /opt/conda/conda-bld/pytorch_1597302504919/work/torch/csrc/utils/python_arg_parser.cpp:864.)
  masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
['good', 'great', 'all', 'really', 'very']
```
There is also a secondary issue: the doc example uses a deprecated `nonzero()` invocation, which since PyTorch 1.5 comes with the (rather poorly documented) requirement to pass the `as_tuple` argument explicitly: pytorch/pytorch#43425
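For reference, one way to silence that warning in the example above (a sketch, not necessarily how the docs should be fixed) is to pass `as_tuple` explicitly:

```python
# Sketch only: passing as_tuple explicitly avoids the deprecation warning
# while keeping the same behaviour as the old nonzero() overload.
masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=False).item()
```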
We have `authorized_missing_keys`:

```python
authorized_missing_keys = [r"final_logits_bias", r"encoder\.version", r"decoder\.version"]
```

https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_bart.py#L942

which correctly updates `missing_keys`.

Should there also be an `authorized_unexpected_keys`, which would clean up `unexpected_keys` in the same way?
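To make the suggestion concrete, here is a rough sketch of what that could look like on the BART class (the `authorized_unexpected_keys` name and its patterns are hypothetical, just mirroring the existing `authorized_missing_keys`):

```python
# In modeling_bart.py (hypothetical; authorized_unexpected_keys does not exist yet):
class BartForConditionalGeneration(PretrainedBartModel):
    authorized_missing_keys = [r"final_logits_bias", r"encoder\.version", r"decoder\.version"]
    # from_pretrained() would filter these patterns out of unexpected_keys,
    # the same way authorized_missing_keys filters missing_keys
    authorized_unexpected_keys = [r"encoder\.version", r"decoder\.version"]
```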
(note: I re-edited this issue once I understood it better, to save readers' time; the history is there if someone needs it)
And I found another variety of it, for `['model.encoder.version', 'model.decoder.version']`:
```
tests/test_modeling_bart.py::BartModelIntegrationTests::test_mnli_inference
Some weights of the model checkpoint at facebook/bart-large-mnli were not used when initializing BartForSequenceClassification: ['model.encoder.version', 'model.decoder.version']
- This IS expected if you are initializing BartForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BartForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
PASSED
```
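If it helps, here is a tiny, self-contained sketch of the filtering such an attribute could perform (illustrative names only; this is not existing library code):

```python
import re

# Illustrative only: drop authorized patterns from unexpected_keys before warning,
# mirroring how authorized_missing_keys is applied to missing_keys.
def filter_unexpected_keys(unexpected_keys, authorized_unexpected_keys):
    for pattern in authorized_unexpected_keys:
        unexpected_keys = [k for k in unexpected_keys if re.search(pattern, k) is None]
    return unexpected_keys

print(filter_unexpected_keys(
    ["model.encoder.version", "model.decoder.version", "some.other.weight"],
    [r"encoder\.version", r"decoder\.version"],
))
# -> ['some.other.weight']
```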