add temperature in kd doc classification #199

Closed · wants to merge 1 commit into base: master

Conversation


liaimi commented Jan 10, 2019

Summary:
Reference implementation from https://arxiv.org/pdf/1503.02531.pdf (Hinton et al., "Distilling the Knowledge in a Neural Network").
I'm actually not sure whether we need to re-train the teacher using the same temperature. Quoting the paper: "In the simplest form of distillation, knowledge is transferred to the distilled model by training it on a transfer set and using a soft target distribution for each case in the transfer set that is produced by using the cumbersome model with a high temperature in its softmax. The same high temperature is used when training the distilled model, but after it has been trained it uses a temperature of 1." A minimal sketch of the loss this describes is included below.

Differential Revision: D13605113
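
For context, here is a minimal sketch of the temperature-scaled distillation loss the quoted passage describes. This is not the code in this diff: the function name `distillation_loss`, the `alpha` soft/hard weighting, and the default `temperature=2.0` are illustrative assumptions; only stock `torch.nn.functional` calls are used.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_targets,
                      temperature=2.0, alpha=0.5):
    """Hypothetical KD loss: soft cross-entropy at temperature T plus hard CE.

    Teacher and student logits are softened with the *same* temperature, as in
    the quoted passage; `alpha` (an assumption, not from this diff) weights the
    soft term against the ordinary hard-label term.
    """
    # Teacher's soft target distribution at temperature T.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    # Student's log-probabilities at the same temperature T.
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the softened distributions; the T^2 factor keeps
    # the soft term's gradient magnitude comparable as T varies (Hinton et al.).
    soft_loss = F.kl_div(log_probs, soft_targets,
                         reduction="batchmean") * temperature ** 2
    # Ordinary cross-entropy against ground-truth labels (temperature 1).
    hard_loss = F.cross_entropy(student_logits, hard_targets)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Illustrative shapes: a batch of 4 documents over 3 document classes.
student_logits = torch.randn(4, 3)
teacher_logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
loss = distillation_loss(student_logits, teacher_logits, labels)
```

After training, the distilled model serves predictions at temperature 1, i.e. plain `F.softmax(student_logits, dim=-1)`, exactly as the quote says.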

Commit: add temperature in kd doc classification

fbshipit-source-id: ba39ff26f165259e17fd3d156bd9485c788dd1f1