# ProphetNet

A pre-trained language model for sequence-to-sequence learning with a novel self-supervised objective called future n-gram prediction.
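
For reference, future n-gram prediction trains the model to predict the next n tokens at every position instead of only the next one. A sketch of the loss as described in the ProphetNet paper, where each future step j is weighted by a coefficient α_j (the exact normalization may differ from the paper):

$$
\mathcal{L} = -\sum_{j=0}^{n-1} \alpha_j \sum_{t=1}^{T-j} \log p_\theta\left(y_{t+j} \mid y_{<t}, x\right)
$$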

## Speedup by using FastSeq

- CNN/DailyMail validation data, NVIDIA-V100-16GB

  | BatchSize       | 32            | 64            | 128            |
  |-----------------|---------------|---------------|----------------|
  | prophetnet      | 2.4 samples/s | 2.8 samples/s | OOM            |
  | above + fastseq | 6.0 samples/s | 7.6 samples/s | 10.7 samples/s |

## Model

ProphetNet-large-160GB (fine-tuned on CNN/DailyMail for 9 epochs) link

## Task

CNN/DM validation data

## Setting

```bash
$ fastseq-generate-for-fairseq \
      cnn_dm_bert/len-512.bin \
      --path prophetnet/model.pt \
      --fp16 \
      --task translation_prophetnet \
      --batch-size BATCH_SIZE \
      --beam 4 \
      --num-workers 4 \
      --min-len 55 \
      --max-len-b 140 \
      --no-repeat-ngram-size 3 \
      --lenpen 2.0 \
      --remove-bpe \
      --gen-subset valid
```

To get the baseline speed numbers without the FastSeq optimizations, replace `fastseq-generate-for-fairseq` with `fairseq-generate`.
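
A minimal sketch of the full benchmark, assuming the paths from the Setting section: it runs both commands at each batch size from the speedup table (the `echo` labeling is just for readability).

```bash
# A minimal sketch: benchmark FastSeq and the fairseq baseline at each batch
# size from the speedup table. Paths follow the Setting section above; the
# batch-size-128 baseline run is expected to OOM on a 16GB V100.
for GEN in fastseq-generate-for-fairseq fairseq-generate; do
    for BATCH_SIZE in 32 64 128; do
        echo "== $GEN, batch size $BATCH_SIZE =="
        $GEN \
            cnn_dm_bert/len-512.bin \
            --path prophetnet/model.pt \
            --fp16 \
            --task translation_prophetnet \
            --batch-size "$BATCH_SIZE" \
            --beam 4 \
            --num-workers 4 \
            --min-len 55 \
            --max-len-b 140 \
            --no-repeat-ngram-size 3 \
            --lenpen 2.0 \
            --remove-bpe \
            --gen-subset valid
    done
done
```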

## Code Example

Refer to the example file.

## Generate the binary data

```bash
bash generate_binary_data_for_prophetnet.sh INPUT_DATA_DIR
```
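
The script name and `INPUT_DATA_DIR` come from this repository. What follows is only a hypothetical sketch of the kind of `fairseq-preprocess` call such a script typically wraps; the language suffixes, split prefixes, and output directory here are assumptions, so check the script itself for the exact arguments.

```bash
# Hypothetical sketch only: binarize tokenized CNN/DM splits with
# fairseq-preprocess into the cnn_dm_bert/len-512.bin directory used above.
# The src/tgt suffixes and the train/valid/test prefixes are assumptions,
# not necessarily the script's actual arguments.
INPUT_DATA_DIR=$1
fairseq-preprocess \
    --source-lang src --target-lang tgt \
    --trainpref "$INPUT_DATA_DIR/train" \
    --validpref "$INPUT_DATA_DIR/valid" \
    --testpref "$INPUT_DATA_DIR/test" \
    --destdir cnn_dm_bert/len-512.bin \
    --workers 4
```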