fail to load structbert.en.large while trying to reproduce the result of GLUE #27
Comments
It seems that the real error occurred in run_classifier_multi_task.py, Line 1056. Could you please comment out the |
If I use init_checkpoint to store the path and keep only line 1056, the error becomes: RuntimeError: Error(s) in loading state_dict for BertModel: |
I guess it may be you use the |
I use the following command
I have tried both |
We have tested the case that uses |
The unchanged code always runs successfully, but the result is only half of what the paper reports. If you print the log, you will find that it executes the except branch, not the try branch.
nohup: ignoring input
Epoch: 0%| | 0/3 [00:00<?, ?it/s]
Iteration: 100%|██████████| 59359/59359 [53:25<00:00, 14.81it/s]
Epoch: 33%|███▎ | 1/3 [59:51<1:59:43, 3591.88s/it]
Iteration: 100%|██████████| 59359/59359 [53:50<00:00, 14.70it/s]
Epoch: 67%|██████▋ | 2/3 [2:00:25<1:00:04, 3604.43s/it]
Iteration: 100%|██████████| 59359/59359 [54:27<00:00, 14.53it/s]
Epoch: 100%|██████████| 3/3 [3:01:22<00:00, 3620.24s/it]
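The report above that the run goes through the except branch rather than the try branch suggests the fallback happens silently. A minimal sketch of how such a fallback could be made visible in the log follows. The function and loader names are hypothetical and are not taken from run_classifier_multi_task.py; the actual try/except in that script may look different.

```python
import logging

logger = logging.getLogger("checkpoint_loading")


def load_with_visible_fallback(primary_loader, fallback_loader):
    """Try the primary loader; on failure, log the full traceback and fall back.

    Both loaders are hypothetical zero-argument callables standing in for
    whatever the training script does in its try and except branches.
    Returns the loaded object and the name of the branch that produced it.
    """
    try:
        return primary_loader(), "primary"
    except Exception:
        # logger.exception records the message together with the traceback,
        # so the fallback is no longer silent.
        logger.exception("Primary checkpoint load failed; using fallback")
        return fallback_loader(), "fallback"


def failing_loader():
    # Stand-in for a load that raises the state_dict error from this thread.
    raise RuntimeError("Error(s) in loading state_dict")


def empty_loader():
    # Stand-in for whatever the except branch would load instead.
    return {}


state, branch = load_with_visible_fallback(failing_loader, empty_loader)
print("branch used:", branch)
```

Logging the exception this way makes it easier to tell whether a low score comes from the checkpoint not being loaded as intended.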
The performance of multi-task mode depends on the similarity between tasks. For example, MNLI and STS-B are similar, so STS-B results are better when MNLI and STS-B are run together than when STS-B is run alone. CoLA, however, is not similar to the other tasks. You can get normal accuracy by running the CoLA task alone (--task_name CoLA).
Thank you very much for your patient reply. It seems that installing apex is necessary to reproduce the correct result. without apex
with apex
|
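Since the comment above reports that results differ with and without apex, it may help to check whether apex is importable before starting a long run. The following is a small sketch using only the standard library; the helper name is hypothetical, and the check only tells you whether the package can be found, not whether the script actually uses it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("apex"):
    print("apex found")
else:
    # Per the report in this thread, results without apex did not match.
    print("apex not found; results may not match the reported numbers")
```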
Hi,
I downloaded structbert.en.large through the given link (https://alice-open.oss-cn-zhangjiakou.aliyuncs.com/StructBERT/en_model), but the error below occurred at runtime.
RuntimeError: Error(s) in loading state_dict for BertForSequenceClassificationMultiTask:
Missing key(s) in state_dict: "classifier.0.weight", "classifier.0.bias".
Unexpected key(s) in state_dict: "lm_bias", "linear.weight", "linear.bias", "LayerNorm.gamma", "LayerNorm.beta", "classifier.weight", "classifier.bias".
Do you have any idea why this happens? Thank you very much.
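To make the error message above easier to read, here is a minimal sketch of how the "Missing" and "Unexpected" lists relate to the two sets of keys. It uses plain Python sets instead of a real model, and the helper name is hypothetical. The key lists are copied from the error message and cover only the mismatched keys, not the full state_dict.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare parameter names expected by a model with those in a checkpoint.

    missing:    keys the model expects that the checkpoint does not provide
    unexpected: keys in the checkpoint that the model has no parameter for
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Mismatched keys only, taken from the error message in this issue.
model_keys = ["classifier.0.weight", "classifier.0.bias"]
checkpoint_keys = [
    "lm_bias",
    "linear.weight",
    "linear.bias",
    "LayerNorm.gamma",
    "LayerNorm.beta",
    "classifier.weight",
    "classifier.bias",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("Missing key(s):", missing)
print("Unexpected key(s):", unexpected)
```

Read this way, the message says that the model expects a per-task classifier head (classifier.0.*) that the downloaded checkpoint does not contain, while the checkpoint carries extra parameters that the multi-task model does not define.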