different embedding weights for base-uncased with different transformers versions #8866
Comments
Facing the same issue. A reply on this is highly appreciated.
Can this be your solution? Hope it helps...
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I had the same issue for
Now check the values of these two entries:
I found that they were different. However, they should be the same.
And check the following values:
You will find that they are the same. So that's the cause of the problem. To get the same behavior in Hugging Face v4.x as you get in Hugging Face v3.x, I manually set both equal to
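The exact state-dict keys in the comment above did not survive the page export, so the following is only an illustrative sketch of the manual fix described: copying one checkpoint entry over another so both hold the same values. The function and key names here are placeholders, not the ones from the comment.

```python
import copy

def tie_entries(state_dict, src_key, dst_key):
    """Return a copy of state_dict in which dst_key holds the same
    values as src_key, mirroring the manual fix described above."""
    fixed = dict(state_dict)                        # shallow copy of the mapping
    fixed[dst_key] = copy.deepcopy(fixed[src_key])  # duplicate the values
    return fixed

# Demo with plain lists standing in for weight tensors
# (in a real PyTorch checkpoint the values would be torch.Tensors):
sd = {"embeddings.weight": [0.1, 0.2, 0.3], "decoder.weight": [9.9, 9.9, 9.9]}
fixed = tie_entries(sd, "embeddings.weight", "decoder.weight")
print(fixed["decoder.weight"] == fixed["embeddings.weight"])  # True
```

For a real checkpoint, the same assignment would be done on the dict returned by `torch.load` before passing it to the model's `load_state_dict`.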
As a further comment: for models saved under Hugging Face v4.x, … For models saved under Hugging Face v3.x, …
Environment info
transformers version: 4.0.0, 3.4.0 and 2.9.0

Information
Model I am using: Bert
The problem arises when using my own scripts. I trained a LayoutLM model using the original Unilm repo (https://github.com/microsoft/unilm/tree/master/layoutlm) and obtained pretty good results (± 0.9 F1 score). When the Huggingface implementation came out, I retrained the model with the same dataset, parameters and seed and got rubbish results (less than 0.2 F1 score). After investigating, I found that the weights of the pretrained model's embeddings, loaded at the beginning of training, differ across transformers versions. The weights of the final trained model also differ: a model trained with the original implementation gives different predictions for the same data when run through the Huggingface implementation, because the weights are different after loading.
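The weight discrepancy described above can be checked mechanically. As a hedged, version-agnostic sketch (the helper and key names are mine, not from the issue), the function below compares two state dicts, for example one saved under each transformers version, and reports where they disagree:

```python
def diff_state_dicts(sd_a, sd_b):
    """Compare two checkpoint state dicts.

    Returns (keys whose values differ, keys only in sd_a, keys only in sd_b).
    Values here are plain lists of floats for the demo; for real PyTorch
    tensors, replace `sd_a[k] != sd_b[k]` with something like
    `(sd_a[k] - sd_b[k]).abs().max().item() > 0`.
    """
    differing = sorted(k for k in set(sd_a) & set(sd_b) if sd_a[k] != sd_b[k])
    only_a = sorted(set(sd_a) - set(sd_b))
    only_b = sorted(set(sd_b) - set(sd_a))
    return differing, only_a, only_b

# Demo: the embedding weights disagree, and one entry is missing entirely.
v3 = {"embeddings.weight": [0.1, 0.2], "decoder.bias": [0.0]}
v4 = {"embeddings.weight": [0.3, 0.4]}
print(diff_state_dicts(v3, v4))
# (['embeddings.weight'], ['decoder.bias'], [])
```

An empty result in all three positions would mean the two environments loaded identical weights.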
To reproduce
Steps to reproduce the behavior:
Huggingface code:
With original Layoutlm implementation, transformers 2.9.0:
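The reproduction snippets referenced above did not survive the page export. As an illustrative sketch only (the key names and values are assumptions, not the issue's code), one reproducible way to compare weights across environments is to fingerprint the state dict once under each transformers version and compare the resulting strings:

```python
import hashlib
import json

def fingerprint(state_dict, digits=6):
    """Deterministic hash of a state dict's values.

    Rounding absorbs float-printing noise. Values are flat lists of
    floats here; a real PyTorch tensor would contribute
    `t.flatten().tolist()` instead.
    """
    payload = {k: [round(float(x), digits) for x in v]
               for k, v in state_dict.items()}
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

# Identical weights give identical fingerprints; any drift changes them.
sd_env_a = {"embeddings.weight": [0.1, 0.2, 0.3]}
sd_env_b = {"embeddings.weight": [0.1, 0.2, 0.3]}
sd_env_c = {"embeddings.weight": [0.1, 0.2, 0.35]}
print(fingerprint(sd_env_a) == fingerprint(sd_env_b))  # True
print(fingerprint(sd_env_a) == fingerprint(sd_env_c))  # False
```

Running such a fingerprint once under transformers 2.9.0 and once under 4.0.0 on the same pretrained checkpoint would make the reported mismatch directly visible as two different hash strings.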
Expected behavior
Get the same weights regardless of the transformers version used.