
question in Evaluate #10

Closed
muzhaohui opened this issue Jul 19, 2021 · 1 comment

@muzhaohui
Hello there!

First of all, thank you for your outstanding work! I ran into a problem while reproducing it.

The GT is generated by the teacher network, so whenever the teacher network's performance changes, the GT changes with it. Do you have a more accurate GT? Or could you explain how to measure the student model's performance more accurately?

Thanks!

@avalada
Contributor

avalada commented Jul 20, 2021

You may have misunderstood the goal of the approach. The teachers are first trained with GT data on disjoint, modality-specific datasets. The student is then trained to match the predictions of the teacher; paired GT data for the teacher and the student is not available. If you had paired GT labels for the teacher and the student, there would be no point in using knowledge distillation in this case: you could simply train the student on the GT directly, without any teacher.
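
To make the training scheme concrete, here is a minimal sketch of one distillation step, assuming a PyTorch setup. The names (`teacher`, `student`, the paired modality inputs) and the KL-divergence loss are hypothetical illustrations of the idea, not this repository's actual code; the key point is that the student is supervised only by the teacher's predictions, with no GT labels involved.

```python
# Hypothetical sketch of teacher-student distillation: the teacher is
# frozen after being trained on its own labeled dataset, and the student
# learns only from the teacher's predictions on paired, unlabeled inputs.
import torch
import torch.nn.functional as F

def distill_step(teacher, student, optimizer, batch):
    """One distillation step: the student matches the teacher's output.

    `batch` is assumed to hold paired inputs for the two modalities;
    no ground-truth labels are used at this stage.
    """
    teacher_input, student_input = batch  # hypothetical paired modalities

    teacher.eval()
    with torch.no_grad():                 # teacher weights stay frozen
        teacher_logits = teacher(teacher_input)

    student_logits = student(student_input)

    # KL divergence between the student's and the teacher's soft
    # predictions; temperature-scaled variants are also common.
    loss = F.kl_div(
        F.log_softmax(student_logits, dim=1),
        F.softmax(teacher_logits, dim=1),
        reduction="batchmean",
    )

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```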

avalada closed this as completed Jul 20, 2021