# bert2dnn

Large Scale BERT Distillation

Code for the paper "BERT2DNN: BERT Distillation with Massive Unlabeled Data for Online E-Commerce Search".

## TODOs

- BERT2DNN model implementation
- SST/Amazon data pipeline
- BERT/ERNIE fine-tuning

## Requirements

- Python 3
- TensorFlow 1.15

## Quickstart

### Training data

The SST-2 dataset is in a tab-separated format:

```
sentence	label
hide new secretions from the parental units	0
```
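
For reference, the file can be read with Python's `csv` module. A minimal sketch; the path `train.tsv` is just an example:

```python
import csv

# Load the tab-separated SST-2 training file into (sentence, label) pairs.
with open("train.tsv", newline="", encoding="utf-8") as f:  # example path
    reader = csv.reader(f, delimiter="\t")
    next(reader)  # skip the "sentence<TAB>label" header row
    rows = [(sentence, int(label)) for sentence, label in reader]

print(rows[0])  # ('hide new secretions from the parental units', 0)
```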

After fine-tuning BERT/ERNIE on this data, we obtain the teacher model, which can be used to predict scores on the transfer dataset:

```
sentence	label	logits	prob	prob_t2
hide new secretions from the parental units	0	-1.2881309986114502	0.024137031017202534	0.13589785133992555
```
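
The `prob_t2` column is consistent with a distillation temperature of 2 applied to the logits before the softmax: the soft target gets smoother as the temperature grows. A minimal sketch of that computation, using illustrative logit values rather than ones taken from the repo:

```python
import numpy as np

def soft_target(logits, temperature=1.0):
    """Class probabilities from logits scaled by a distillation temperature."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z -= z.max()                # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Two-class example: probability of the positive class.
logits = [1.85, -1.85]                # illustrative values
print(soft_target(logits)[1])         # ~0.0241 (T=1, sharp target)
print(soft_target(logits, 2.0)[1])    # ~0.1359 (T=2, softened target)
```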

The following script generates TF examples containing pairs of text and label for training. The text is tokenized with a unigram and bigram tokenizer, and the label is a soft target computed with a selected temperature.

```sh
python gen_tfrecord.py \
  --input_file INPUT_TSV_FILE \
  --output_file OUTPUT_TFRECORD \
  --idx_text 0 --idx_label 3
```
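
To make the record layout concrete, here is a rough sketch of how such an example could be assembled in TensorFlow 1.15. The feature names, the hashed n-gram vocabulary, and the `make_example` helper are illustrative assumptions, not the actual schema of `gen_tfrecord.py`:

```python
import zlib
import tensorflow as tf  # TensorFlow 1.15

VOCAB_SIZE = 1 << 20  # hashed n-gram vocabulary size (illustrative)

def ngram_ids(text):
    """Map unigrams and bigrams to integer ids with a stable hash."""
    tokens = text.split()
    grams = tokens + [" ".join(p) for p in zip(tokens, tokens[1:])]
    return [zlib.crc32(g.encode("utf-8")) % VOCAB_SIZE for g in grams]

def make_example(text, soft_label):
    """Serialize one (token ids, soft target) pair as a tf.train.Example."""
    feature = {
        "token_ids": tf.train.Feature(
            int64_list=tf.train.Int64List(value=ngram_ids(text))),
        "label": tf.train.Feature(
            float_list=tf.train.FloatList(value=[soft_label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

with tf.io.TFRecordWriter("train.tfrecord") as writer:  # example path
    ex = make_example("hide new secretions from the parental units", 0.1359)
    writer.write(ex.SerializeToString())
```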

### Model training

```sh
python run.py --do_train True --do_eval True
```
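
For context, the BERT2DNN student distills the teacher into a simple feed-forward network over averaged unigram/bigram embeddings, trained against the teacher's soft targets. A minimal sketch under those assumptions; layer sizes and hyperparameters are illustrative, not the configuration used by `run.py`:

```python
import tensorflow as tf  # TensorFlow 1.15

VOCAB_SIZE = 1 << 20  # hashed unigram/bigram vocabulary (illustrative)
EMBED_DIM = 128
MAX_LEN = 64          # padded sequence length

# Bag-of-n-grams student: embed token ids, average them, then a small MLP.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Distillation objective: cross-entropy against the teacher's soft
# probabilities (the float "label" written to the TFRecords).
model.compile(optimizer="adam", loss="binary_crossentropy")
```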

### Transfer Dataset

Our experiments use two public datasets:

1. Stanford Sentiment Treebank: SST-2 download
2. Amazon review dataset: download
