Skip to content
This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Latest commit

History

History
54 lines (31 loc) 路 2.63 KB

summarization.rst

File metadata and controls

54 lines (31 loc) 路 2.63 KB

Summarization

The Task

The Summarization task requires the model to summarize a document into a shorter sentence.

Datasets

Currently supports the CNN/DailyMail and XSUM dataset or custom input text files.

In the CNN/Daily Mail dataset, this involves taking long articles and summarizing them.

document: "The car was racing towards the tunnel, whilst blue lights were flashing behind it. The car entered the tunnel and vanished..."

Model answer: "Police are chasing a car entering a tunnel."

Training

To use this task, we must select a Seq2Seq Encoder/Decoder based model, such as T5 or BART. Conventional models like GPT/BERT will not work as they are encoder only.

python train.py task=nlp/summarization dataset=nlp/summarization/cnn_dailymail backbone.pretrained_model_name_or_path=t5-base # dataset can be swapped to xsum

Summarization Inference Pipeline (experimental)

By default we use the summarization pipeline, which requires an input document as text.

For Hydra to correctly parse your input argument, if your input contains any special characters you must either wrap the entire call in single quotes like '+x="my, sentence"' or escape special characters. See escaped characters in unquoted values.

python predict.py task=nlp/summarization +checkpoint_path=/path/to/model.ckpt '+x="The car was racing towards the tunnel, whilst blue lights were flashing behind it. The car entered the tunnel and vanished..."'

You can also run prediction using a default HuggingFace pre-trained model:

python predict.py task=nlp/summarization '+x="The car was racing towards the tunnel, whilst blue lights were flashing behind it. The car entered the tunnel and vanished..."'

Or run prediction on a specified HuggingFace pre-trained model:

python predict.py task=nlp/summarization backbone.pretrained_model_name_or_path=t5-base '+x="The car was racing towards the tunnel, whilst blue lights were flashing behind it. The car entered the tunnel and vanished..."'