Keras implementation of distilling a Bert (12 layers) teacher model into CNN, BiLSTM, and Bert (3 layers) student models

wangbq18/distillation_model_keras_bert

Knowledge Distillation

The code was written in a Kaggle kernel (ipynb format) and has not been debugged since being migrated here, so running it directly may raise errors.
Kernel source: https://www.kaggle.com/duolaaa/weibo-distil-student-layers3?scriptVersionId=29557259
The code implements distillation of a Bert (12 layers) teacher model into CNN, BiLSTM, and Bert (3 layers) students. Three modes are provided: distillation, patient, and patient.full, which distill the teacher's logits, output features, and hidden features, respectively.
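
To illustrate the distillation mode, below is a minimal Keras sketch; it is not the repository's exact code, and the helper name distillation_loss and its parameters (num_classes, temperature, alpha) are assumptions for illustration. The idea, following "Distilling the Knowledge in a Neural Network", is to train the student on a weighted mix of hard-label cross-entropy and cross-entropy against the teacher's temperature-softened logits. The patient and patient.full modes additionally match intermediate representations of student and teacher (e.g. via an MSE term on selected hidden states), as described in the Patient Knowledge Distillation paper.

```python
from tensorflow import keras
import tensorflow.keras.backend as K

def distillation_loss(num_classes, temperature=2.0, alpha=0.5):
    """Soft-label distillation loss (hypothetical helper, for illustration only).

    Assumes y_true packs [one-hot labels, teacher logits] along the last axis
    and that the student model outputs raw logits (no final softmax).
    """
    def loss(y_true, y_pred):
        hard_labels = y_true[:, :num_classes]
        teacher_logits = y_true[:, num_classes:]

        # Hard-label cross-entropy on the student's logits.
        hard_loss = keras.losses.categorical_crossentropy(
            hard_labels, y_pred, from_logits=True)

        # Soft-label term: match the teacher's temperature-softened distribution.
        soft_teacher = K.softmax(teacher_logits / temperature)
        soft_loss = keras.losses.categorical_crossentropy(
            soft_teacher, y_pred / temperature, from_logits=True)

        # T^2 keeps the soft-label gradients comparable across temperatures.
        return alpha * hard_loss + (1.0 - alpha) * (temperature ** 2) * soft_loss
    return loss

# Usage sketch:
# student.compile(optimizer="adam", loss=distillation_loss(num_classes=2))
```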

References

Patient Knowledge Distillation for BERT Model Compression
Distilling the Knowledge in a Neural Network
