First of all, thank you for the great PyTorch implementation.
However, I can't see how the optimization procedure for the teacher network (Eq. 3 of the paper) is implemented.
As I understand it, the teacher update requires a second-order derivative during training, but I can't find that computation in the current version of the code.
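To make concrete what I mean by a second-order term, here is a minimal toy sketch. It is not taken from this repository or from the paper: it assumes scalar student and teacher parameters (`w`, `t`), a squared-error loss, and a single plain SGD step, and every name in it is hypothetical. The point is only that the teacher's gradient has to flow through the student's gradient, which in PyTorch needs `create_graph=True`.

```python
import torch

def teacher_grad_through_student_step(w, t, x_u, x_l, y_l, lr):
    """Toy example (hypothetical names): gradient of the updated student's
    labeled loss with respect to the teacher parameter t."""
    # Student loss on an unlabeled input, with the teacher's output as target.
    student_loss = (w * x_u - t * x_u) ** 2
    # create_graph=True keeps the graph of the gradient itself, so the
    # updated student weight stays differentiable w.r.t. the teacher.
    (g_w,) = torch.autograd.grad(student_loss, w, create_graph=True)
    w_new = w - lr * g_w  # one differentiable SGD step
    # Loss of the updated student on a labeled example.
    val_loss = (w_new * x_l - y_l) ** 2
    # This backward pass goes through g_w, i.e. a mixed second derivative.
    (g_t,) = torch.autograd.grad(val_loss, t)
    return g_t

w = torch.tensor(1.0, requires_grad=True)   # student parameter
t = torch.tensor(0.5, requires_grad=True)   # teacher parameter
g_t = teacher_grad_through_student_step(
    w, t,
    x_u=torch.tensor(2.0), x_l=torch.tensor(1.0), y_l=torch.tensor(3.0),
    lr=0.1,
)
print(g_t)  # analytic value for these toy numbers: -3.84
```

Without `create_graph=True`, `g_w` would be detached from `t`, and the second `torch.autograd.grad` call would have no path back to the teacher. I couldn't find an equivalent step in the training loop, which is why I'm asking.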
Would you check the code again and let me know whether there is something I'm missing?
Thank you!