#TinyBert Training Pipeline Problems #153

Closed
mexiQQ opened this issue Nov 8, 2021 · 2 comments

Comments


mexiQQ commented Nov 8, 2021

Hi Huawei team:

Sorry to disturb you; could you please answer the following questions?

Why does the TinyBERT training pipeline in "general_distill.py" use DDP to initialize only the teacher model, and not the student model? And why is there no synchronization of the normalization layers?

[screenshot of the model initialization code]
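For reference, here is a minimal sketch of what I expected (my own code, not from the repo), assuming `torch.distributed.init_process_group()` has already been called and `local_rank` is the device index:

```python
# Minimal sketch of what I expected (not the repo's code): wrap the
# trainable student in DDP; the frozen teacher needs no gradient sync.
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_for_distillation(teacher_model, student_model, local_rank):
    # Assumes torch.distributed.init_process_group() has already run.
    device = torch.device("cuda", local_rank)
    teacher_model.to(device).eval()  # teacher is frozen: no all-reduce needed
    student_model.to(device)
    # Only the student's gradients are all-reduced across ranks.
    student_model = DDP(student_model, device_ids=[local_rank])
    return teacher_model, student_model
```

(I realize BERT uses LayerNorm, which normalizes each sample over the hidden dimension rather than across the batch, so it may not need cross-device synchronization the way BatchNorm does, but I would like to confirm.)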

Also, when mixed precision is enabled, where can I find the "backward" function on the "optimizer" object?

[screenshot of the fp16 training code]
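For reference, I believe older NVIDIA Apex releases exposed an `FP16_Optimizer` wrapper with a `backward()` method; a sketch under that assumption (not code from this repo):

```python
# Sketch assuming the old NVIDIA Apex API (FP16_Optimizer was removed
# from recent Apex releases, which may be why backward() can't be found).
from apex.optimizers import FusedAdam, FP16_Optimizer  # old Apex only

optimizer = FusedAdam(model.parameters(), lr=5e-5, bias_correction=False)
optimizer = FP16_Optimizer(optimizer, dynamic_loss_scale=True)

loss = distillation_loss(student_out, teacher_out)  # hypothetical helper
optimizer.backward(loss)  # scales the loss, then runs loss.backward()
optimizer.step()
```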

Thanks!

mexiQQ changed the title from "#TinyBert" to "#TinyBert Training Pipeline Problems" on Nov 8, 2021
zwjyyc (Contributor) commented Nov 9, 2021

Hi,
This code does not support fp16 or DDP training, so the relevant parts are redundant. Please refer to the AutoTinyBERT code, which supports both fp16 and DDP training.
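If you only need mixed precision, the maintained `torch.cuda.amp` API is an alternative to the removed Apex wrapper; a rough sketch (not code from TinyBERT or AutoTinyBERT, and the loss helper is hypothetical):

```python
# Rough sketch with the maintained torch.cuda.amp API (an alternative to
# the removed Apex FP16_Optimizer; not code from either repo).
import torch

scaler = torch.cuda.amp.GradScaler()
optimizer = torch.optim.AdamW(student_model.parameters(), lr=5e-5)

for batch in dataloader:  # hypothetical training loop
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = compute_distill_loss(student_model, teacher_model, batch)  # hypothetical
    scaler.scale(loss).backward()  # scaled backward to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()
```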

mexiQQ (Author) commented Nov 10, 2021

Thanks for your reply.

mexiQQ closed this as completed Nov 10, 2021