Reproduction of "Distilling the Knowledge in a Neural Network" (Hinton, Vinyals, and Dean, 2015)
Student Network
architecture : two hidden layers with 800 rectified linear units each, no regularization (sketched below)
performance : 0.9697 accuracy on the MNIST test set
source code : Student_NN.ipynb
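A minimal sketch of this baseline, assuming the tf.keras API (the frozen .pb file suggests the repo uses TensorFlow); the layer sizes follow the description above, while the optimizer and names are illustrative, not taken from the notebook:

```python
import tensorflow as tf

# Baseline student: two hidden layers of 800 ReLU units, no dropout or
# weight decay, trained directly on the hard MNIST labels.
student = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),   # 28x28 MNIST images
    tf.keras.layers.Dense(800, activation="relu"),
    tf.keras.layers.Dense(800, activation="relu"),
    tf.keras.layers.Dense(10),                       # raw logits
])
student.compile(
    optimizer="adam",  # assumption; the notebook may use a different optimizer
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```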
Teacher Network
architecture : a convolutional network (a representative sketch is given below)
performance : 0.9936 accuracy on the MNIST test set
source code : mnist_distilling.ipynb
weights file : frozen_model.pb
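The entry above only says the teacher is a ConvNet, so the exact architecture is not specified here; the following is a hypothetical LeNet-style stack of the kind that typically reaches accuracy in this range on MNIST, again written against tf.keras:

```python
import tensorflow as tf

# Hypothetical teacher ConvNet; the real architecture lives in
# mnist_distilling.ipynb and may differ in depth and filter counts.
teacher = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 5, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dense(10),  # logits; these are what get distilled
])
```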
Distilled Student Network
architecture : identical to the Student Network above
performance : 0.9635 accuracy on the MNIST test set
source code : Student_Distilling.ipynb
weights file : frozen_model.pb (the trained teacher model)
hyperparameters : T = 1, alpha = 0.1 (the paper does not specify alpha, so 0.1 is a guess)
Note: with the temperature raised to 20, performance becomes very poor and training fails to converge, and I do not understand why the paper uses such a high temperature. The loss these two hyperparameters control is sketched below.
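For reference, this is the distillation objective from the paper with the T and alpha above; the function name and the tf.keras calls are illustrative rather than the notebook's actual code, and the teacher logits would come from the frozen graph in frozen_model.pb:

```python
import tensorflow as tf

T = 1.0      # temperature used here; the paper's experiments use higher values
ALPHA = 0.1  # weight on the hard-label loss; not specified in the paper

def distillation_loss(labels, teacher_logits, student_logits):
    # Hard term: ordinary cross-entropy against the true labels.
    hard = tf.keras.losses.sparse_categorical_crossentropy(
        labels, student_logits, from_logits=True)
    # Soft term: cross-entropy between the temperature-softened teacher
    # and student distributions, scaled by T^2 so its gradient magnitude
    # stays comparable as T changes, as the paper prescribes.
    soft_targets = tf.nn.softmax(teacher_logits / T)
    soft = tf.keras.losses.categorical_crossentropy(
        soft_targets, student_logits / T, from_logits=True)
    return ALPHA * hard + (1.0 - ALPHA) * (T ** 2) * soft
```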