Thanks for releasing the Russian BERT model! However, I found that the released model does not include the parameters for masked language modeling (i.e. the `BertOnlyMLMHead` layers). I manually checked the weight file downloaded from Hugging Face and found that it only contains weights for the 12 transformer layers. As a result, every time I load the model using the following code, the `BertOnlyMLMHead` is randomly initialized and the prediction differs.
```python
import torch
from transformers import AutoTokenizer, AutoModelWithLMHead  # transformers 2.4.1

tokenizer = AutoTokenizer.from_pretrained('DeepPavlov/rubert-base-cased')
model = AutoModelWithLMHead.from_pretrained('DeepPavlov/rubert-base-cased')

inp = 'Он [MASK] человек.'
inp = tokenizer.encode(inp)
print(tokenizer.convert_ids_to_tokens(inp))  # print the tokenized input
# ['[CLS]', 'он', '[MASK]', 'человек', '.', '[SEP]']

out = model(torch.tensor([inp]))[0]
# The most plausible prediction for the [MASK] position (index 2) differs
# every time the model is reloaded
print(tokenizer.convert_ids_to_tokens([out[0, 2].argmax().item()]))
```
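For reference, here is a minimal sketch of how I checked the weight file: load the raw checkpoint and list its parameter names. The path `pytorch_model.bin` is a placeholder for wherever the file from Hugging Face was cached locally; in transformers' `BertForMaskedLM`, the MLM head parameters are stored under the `cls.predictions` prefix.

```python
# Minimal sketch: inspect the parameter names in the downloaded checkpoint.
# 'pytorch_model.bin' is a placeholder path for the downloaded weight file.
import torch

state_dict = torch.load('pytorch_model.bin', map_location='cpu')
# BertForMaskedLM saves the MLM head under the 'cls.predictions' prefix
mlm_keys = [k for k in state_dict if k.startswith('cls.predictions')]
print(f'{len(state_dict)} tensors in checkpoint')
print('MLM-head tensors:', mlm_keys or 'none found')
```

For the released rubert checkpoint, the second line prints `none found`, which is why the head gets randomly initialized on every load.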
Actually, I found exactly the same issue for Greek BERT (here), and they managed to fix it (here). I guess you can follow the same method.
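In case it helps, here is a hedged sketch of what that method could look like, assuming the original TensorFlow checkpoint is still available (the file names below are placeholders, not the actual rubert files): convert the checkpoint into a `BertForMaskedLM`, which carries the `BertOnlyMLMHead`, and re-save the full model so the `cls.predictions` weights end up in the uploaded file.

```python
# Hedged sketch of a possible fix: re-convert the original TF checkpoint into
# a BertForMaskedLM (which includes BertOnlyMLMHead) and save the full model.
# 'bert_config.json' and 'bert_model.ckpt' are placeholder file names.
from transformers import BertConfig, BertForMaskedLM, load_tf_weights_in_bert

config = BertConfig.from_json_file('bert_config.json')
model = BertForMaskedLM(config)
load_tf_weights_in_bert(model, config, 'bert_model.ckpt')
# pytorch_model.bin written here should now include the cls.predictions.* tensors
model.save_pretrained('rubert-base-cased')
```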