Distillation #8

Open
gzoftju opened this issue Apr 21, 2024 · 0 comments

Comments

gzoftju commented Apr 21, 2024

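Hello, I have a few questions about the distillation step. When you compute the teacher side with cosine similarity, isn't the resulting tea_logit distribution too smooth? Almost every value is 0.0002 or 0.0003. Also, during distillation the student model uses an L2 computation, which does not really match the cosine similarity used for the teacher. Finally, why is logit_y=True set? Is it meant for scaling? That does not seem right either.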
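To illustrate the smoothness concern, here is a minimal sketch that is not taken from the repository. It assumes tea_logit is a softmax over cosine similarities, which the question implies but does not confirm. The candidate count, the random stand-in similarities, and the temperature value are all hypothetical. Because cosine similarities are bounded in [-1, 1], an unscaled softmax over them cannot produce a max/min probability ratio above e^2, so every entry stays close to 1/N.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
num_candidates = 4000  # hypothetical; chosen so that 1/N is about 0.00025

# Random stand-in for teacher cosine similarities, bounded in [-1, 1].
cos_sims = [random.uniform(-1.0, 1.0) for _ in range(num_candidates)]

plain = softmax(cos_sims)                    # no temperature scaling
sharp = softmax(cos_sims, temperature=0.05)  # hypothetical temperature

# Unscaled: the max/min ratio is bounded by e^2, so the output is nearly flat.
print("plain max/min ratio:", max(plain) / min(plain))
# Scaled: dividing by a small temperature widens the score range and sharpens the output.
print("sharp max:", max(sharp), "plain max:", max(plain))
```

If the teacher distribution really is computed this way, dividing the similarities by a small temperature before the softmax is one common way to sharpen it. Whether that matches the intent of the original code is for the maintainers to confirm.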
