Hi, there are a number of bugs that prevent me from running `train_supernet.py` successfully.

Model settings issue in train_supernet.py:

In addition, the model is imported with `from model.supermodel import BertForSequenceClassification`, and then both `teacher_model` and `super_model` are created with the same SuperModel (SuperEmbedding, SuperLinear, SuperLayerNorm) settings. Is that intended?
The `teacher_model` is only there for further distillation; in our current implementation we do not actually use it, as you can see in the code, so please ignore it. We set the config of the superBERT with the function `set_sample_config`, so the two models end up different. I will make the code clearer in the next few days.
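To make the distinction concrete: the idea behind `set_sample_config` is that a "super" layer holds max-size weights and slices out a sub-layer at run time, so two models built from the same SuperModel classes behave differently once different sample configs are set. The snippet below is a toy illustration of that mechanism, not the repo's actual `SuperLinear` implementation; the class name and method signature here are assumptions for demonstration.

```python
import torch
import torch.nn as nn


class SuperLinear(nn.Linear):
    """Toy super-layer: stores max-size weights, slices a sub-layer at run time."""

    def set_sample_config(self, in_dim, out_dim):
        # Record which slice of the full weight matrix to use.
        self.sample_in, self.sample_out = in_dim, out_dim

    def forward(self, x):
        # Use only the sampled sub-matrix of the full (max-size) weights.
        w = self.weight[: self.sample_out, : self.sample_in]
        b = self.bias[: self.sample_out]
        return nn.functional.linear(x, w, b)


layer = SuperLinear(8, 8)
layer.set_sample_config(4, 2)   # sample a 4->2 sub-layer from the 8->8 super-layer
out = layer(torch.randn(3, 4))
print(out.shape)                # torch.Size([3, 2])
```

So even if the teacher and the supernet were built from the same classes, calling `set_sample_config` with different dimensions gives them different effective architectures.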
OK, I see that now. Thank you for the quick reply.
A few more minor mistakes in this code:

- `torch.nn.DataParallel` is called twice, once in the main function and again in `train()`, so `super_model` is doubly wrapped; as a result, line 253 `super_model.module.set_sample_config()` is ineffective, because `.module` is still a `DataParallel`, not the supernet.
- In multi-GPU training this raises an error: `RuntimeError: Input, output and indices must be on the current device.`
- `train_dataset` is loaded twice.

Hope this helps you speed things up.
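The double-wrapping pitfall above is easy to reproduce in isolation. In the sketch below, `SuperModel` is a stand-in for the actual supernet (only the `set_sample_config` method name is taken from the issue); it shows why `.module` no longer reaches the supernet after wrapping twice, and how a single wrap fixes it.

```python
import torch.nn as nn


class SuperModel(nn.Module):
    """Stand-in for the supernet; set_sample_config is the method train_supernet.py calls."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def set_sample_config(self, hidden_size=4):
        self.sampled_hidden = hidden_size


model = SuperModel()

# Wrapping twice nests DataParallel: .module is the inner DataParallel,
# not the SuperModel, so .module.set_sample_config() raises AttributeError.
twice = nn.DataParallel(nn.DataParallel(model))
print(hasattr(twice.module, "set_sample_config"))   # False

# Wrap once: .module is the SuperModel and set_sample_config works as intended.
once = nn.DataParallel(model)
once.module.set_sample_config(hidden_size=3)
print(once.module.sampled_hidden)                   # 3
```

Dropping one of the two `DataParallel` calls (keeping the one in the main function is the usual convention) should make the `set_sample_config` call at line 253 take effect again.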