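Hello, I have a few questions. When you use cosine similarity for distillation, isn't the distribution of tea_logit too smooth? Almost all of its values are 0.0002 or 0.0003. Also, during distillation the student model is trained with an L2 loss, which does not seem consistent with cosine similarity. Finally, why is logit_y=True used? Is it for scaling? That does not seem right either.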
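To illustrate the smoothness concern, here is a minimal sketch. It assumes that tea_logit is produced by a softmax over raw cosine similarities, which is a guess about the repository's code rather than something confirmed here. The candidate count of 4000 and the temperature of 0.05 are hypothetical values chosen only for illustration. Because cosine similarity is bounded in [-1, 1], the largest and smallest softmax probabilities can differ by at most a factor of e^2 (about 7.4), so with thousands of candidates every probability stays close to 1/N.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)

# Hypothetical setup: 4000 candidates with cosine similarities in [-1, 1].
num_candidates = 4000
cos_sims = [random.uniform(-1.0, 1.0) for _ in range(num_candidates)]

# Softmax over raw cosine similarities: nearly flat, since the
# max/min probability ratio is bounded by exp(1 - (-1)) = e^2.
plain = softmax(cos_sims)

# Dividing by a small temperature (assumed value) sharpens the distribution.
sharp = softmax(cos_sims, temperature=0.05)

print("uniform baseline 1/N:", 1.0 / num_candidates)
print("raw cosine   max/min:", max(plain), min(plain))
print("temperature  max    :", max(sharp))
```

If the teacher distribution really is computed this way, scaling the similarities by a temperature before the softmax would be one way to make the targets less flat. Whether logit_y=True is meant to do that is exactly what the question asks.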