Parameters of BertOnlyMLMHead missing from the released DeepPavlov/rubert-base-cased model #1148

Closed
jzbjyb opened this issue Mar 8, 2020 · 2 comments

jzbjyb commented Mar 8, 2020

Thanks for releasing the Russian BERT model! However, I found that the released model does not include the parameters for masked language modeling (i.e. the BertOnlyMLMHead layers). I manually checked the weight file downloaded from Hugging Face and found that it only contains the weights for the 12 transformer layers. As a result, every time I load the model with the following code, the BertOnlyMLMHead is randomly initialized and the prediction differs from run to run.

import torch
from transformers import AutoTokenizer, AutoModelWithLMHead  # transformers 2.4.1

tokenizer = AutoTokenizer.from_pretrained('DeepPavlov/rubert-base-cased')
model = AutoModelWithLMHead.from_pretrained('DeepPavlov/rubert-base-cased')
inp = 'Он [MASK] человек.'
inp = tokenizer.encode(inp)
print(tokenizer.convert_ids_to_tokens(inp))  # print the tokenized input
# ['[CLS]', 'он', '[MASK]', 'человек', '.', '[SEP]']
out = model(torch.tensor([inp]))[0]  # prediction scores, shape (1, seq_len, vocab_size)
# index 2 is the [MASK] position; the most plausible prediction differs when I reload the model
print(tokenizer.convert_ids_to_tokens(out[0, 2].argmax().item()))
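
The manual check of the weight file mentioned above could be sketched roughly as follows. This is only an illustration, not necessarily how the check was actually done: it assumes the weights were saved from a Hugging Face BERT model, where the BertOnlyMLMHead parameters carry the `cls.predictions.` prefix, and `find_mlm_head_params`, `inspect_checkpoint` and `weights_path` are hypothetical names introduced here.

```python
from typing import Iterable, List

# In Hugging Face's BertForMaskedLM the BertOnlyMLMHead module is stored under
# the attribute `cls`, so its parameters are named `cls.predictions.*`.
MLM_HEAD_PREFIX = "cls.predictions."


def find_mlm_head_params(param_names: Iterable[str]) -> List[str]:
    """Return the sorted parameter names that belong to the MLM head."""
    return sorted(name for name in param_names if name.startswith(MLM_HEAD_PREFIX))


def inspect_checkpoint(weights_path: str) -> List[str]:
    """Load a PyTorch weight file and list its MLM head parameters.

    `weights_path` is a hypothetical local path to the downloaded weight file.
    """
    import torch  # imported lazily so the helper above works without PyTorch

    state_dict = torch.load(weights_path, map_location="cpu")
    head = find_mlm_head_params(state_dict.keys())
    print(f"{len(state_dict)} tensors in total, {len(head)} in the MLM head")
    return head
```

If `inspect_checkpoint` returns an empty list, the file has no MLM head weights, which would explain why the head is randomly initialized on every load.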

Actually, I found exactly the same issue for Greek BERT (here), and they managed to fix it (here). I guess you can follow the same method.

@yurakuratov
Contributor

Thank you!

We have the weights for the LM head in the TensorFlow checkpoint. If you still need them, you can find the link on this page: http://docs.deeppavlov.ai/en/master/features/pretrained_vectors.html#downloads and then convert the checkpoint to PyTorch.
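
For reference, a conversion along these lines might look like the sketch below. It is not the exact procedure anyone in this thread used: it assumes both TensorFlow and PyTorch are installed, that the downloaded archive contains a TF checkpoint and a BERT config JSON file, and that the checkpoint's variables match what `BertForPreTraining` expects. The function name and all three path arguments are hypothetical.

```python
def convert_tf_checkpoint_to_pytorch(tf_checkpoint_path, bert_config_file, pytorch_dump_path):
    """Load BERT weights from a TF checkpoint and save them as a PyTorch state dict.

    All three arguments are hypothetical local paths chosen by the caller.
    """
    # Imported lazily: the conversion needs torch, transformers and tensorflow.
    import torch
    from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

    config = BertConfig.from_json_file(bert_config_file)
    # BertForPreTraining has both the MLM head and the next-sentence head,
    # so LM head variables in the checkpoint have a place to be loaded into.
    model = BertForPreTraining(config)
    load_tf_weights_in_bert(model, config, tf_checkpoint_path)
    torch.save(model.state_dict(), pytorch_dump_path)
```

The saved state dict should then contain the `cls.predictions.*` entries, provided they are present in the TF checkpoint.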

We will update our models in Transformers.

@jzbjyb
Author

jzbjyb commented May 17, 2020

Thanks for updating the tf checkpoint! I converted it to PyTorch.
