# Fine-Tuning BigBirdPegasus for Summarization

In [1]:
from transformers import PegasusTokenizer, BigBirdPegasusForConditionalGeneration
from datasets import load_dataset, load_metric

DATASET_NAME = "pubmed"  # configuration of the scientific_papers dataset
DEVICE = "cuda"
CACHE_DIR = DATASET_NAME
MODEL_ID = "google/bigbird-pegasus-large-arxiv"  # arXiv checkpoint, applied here to PubMed articles



In [2]:
# Despite the variable name, this loads the first 5% of the training split.
test_dataset = load_dataset("scientific_papers", DATASET_NAME, split='train[:5%]')



Downloading and preparing dataset scientific_papers/pubmed (download: 4.20 GiB, generated: 2.33 GiB, post-processed: Unknown size, total: 6.53 GiB) to /root/.cache/huggingface/datasets/scientific_papers/pubmed/1.1.1/306757013fb6f37089b6a75469e6638a553bd9f009484938d8f75a4c5e84206f...


Dataset scientific_papers downloaded and prepared to /root/.cache/huggingface/datasets/scientific_papers/pubmed/1.1.1/306757013fb6f37089b6a75469e6638a553bd9f009484938d8f75a4c5e84206f. Subsequent calls will reuse this data.


In [3]:
tokenizer = PegasusTokenizer.from_pretrained(MODEL_ID)
model = BigBirdPegasusForConditionalGeneration.from_pretrained(MODEL_ID).to(DEVICE)



In [4]:
!pip install rouge_score

Collecting rouge_score
  Downloading rouge_score-0.0.4-py2.py3-none-any.whl (22 kB)
Installing collected packages: rouge_score
Successfully installed rouge_score-0.0.4

In [5]:
rouge = load_metric("rouge")

model.config.attention_type, model.config.block_size

('block_sparse', 64)

In [6]:
def generate_answer(batch):
    # Dataset.map is called without batched=True, so `batch` is a single example.
    # Truncate or pad the article to exactly 256 tokens.
    inputs_dict = tokenizer(batch["article"], padding="max_length", max_length=256, return_tensors="pt", truncation=True)
    inputs_dict = {k: v.to(DEVICE) for k, v in inputs_dict.items()}
    # Beam search with 5 beams; the summary is capped at 128 tokens.
    predicted_abstract_ids = model.generate(**inputs_dict, max_length=128, num_beams=5, length_penalty=0.8)
    batch["predicted_abstract"] = tokenizer.decode(predicted_abstract_ids[0], skip_special_tokens=True)
    print(batch["predicted_abstract"])
    return batch

In [7]:
dataset_small = test_dataset.select(range(2))
result_small = dataset_small.map(generate_answer)

rouge.compute(predictions=result_small["predicted_abstract"], references=result_small["abstract"])

Attention type 'block_sparse' is not possible if sequence_length: 256 <= num global tokens: 2 * config.block_size + min. num sliding tokens: 3 * config.block_size + config.num_random_blocks * config.block_size + additional buffer: config.num_random_blocks * config.block_size = 704 with config.block_size = 64, config.num_random_blocks = 3. Changing attention type to 'original_full'...
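
The warning above states the minimum input length that block-sparse attention needs. The following is a minimal sketch that reproduces that arithmetic from the terms listed in the message; the function names `min_block_sparse_length` and `uses_block_sparse` are hypothetical helpers written for illustration, not part of the transformers API.

```python
def min_block_sparse_length(block_size: int, num_random_blocks: int) -> int:
    """Threshold quoted in the warning; inputs must be strictly longer than this."""
    global_tokens = 2 * block_size                   # "num global tokens"
    sliding_tokens = 3 * block_size                  # "min. num sliding tokens"
    random_tokens = num_random_blocks * block_size   # random attention blocks
    buffer_tokens = num_random_blocks * block_size   # "additional buffer"
    return global_tokens + sliding_tokens + random_tokens + buffer_tokens


def uses_block_sparse(sequence_length: int, block_size: int = 64, num_random_blocks: int = 3) -> bool:
    """True if the sequence is long enough to avoid the fallback to full attention."""
    return sequence_length > min_block_sparse_length(block_size, num_random_blocks)


print(min_block_sparse_length(64, 3))  # 704
print(uses_block_sparse(256))          # False
```

With `max_length=256` in `generate_answer`, every input sits below this threshold, so the model runs with `original_full` attention for this evaluation rather than the `block_sparse` setting reported by `model.config.attention_type`.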


school free food program ( nffp ) is implemented in elementary schools of deprived areas to cover all poor students. however, this is underweight program in iran a recent study among 752 high school students in sistan and baluchestan showed prevalence of 16.2%, underweight, overweight and obesity respectively. in this study, we report a systematic investigation of school free food program ( nffp ) in elementary schools of deprived areas in iran.<n> we have carried out a systematic investigation of the school free food program ( nffp ) in elementary schools of iran 
* purpose : * to present a systematic study of the effects of caa on quality of life ( qol ) and performance status in patients with cancer.<n> * methods : * the purpose of the study is to investigate the effects of caa on physical functioning, qol, and performance status in patients with cancer.<n> * results : * the results show that the effects of caa on physical functioning, qol, and performance status in patients with ca

{'rouge1': AggregateScore(low=Score(precision=0.4606741573033708, recall=0.11890243902439024, fmeasure=0.18978102189781018), mid=Score(precision=0.46527683768783, recall=0.14705805711903272, fmeasure=0.2218254954690289), high=Score(precision=0.46987951807228917, recall=0.1752136752136752, fmeasure=0.25386996904024767)),
 'rouge2': AggregateScore(low=Score(precision=0.06818181818181818, recall=0.02575107296137339, fmeasure=0.03738317757009346), mid=Score(precision=0.11335920177383592, recall=0.03275321232166529, fmeasure=0.05047642986084135), high=Score(precision=0.15853658536585366, recall=0.039755351681957186, fmeasure=0.06356968215158924)),
 'rougeL': AggregateScore(low=Score(precision=0.21348314606741572, recall=0.07621951219512195, fmeasure=0.11764705882352941), mid=Score(precision=0.2573439826722621, recall=0.07870804669585157, fmeasure=0.1196507800200372), high=Score(precision=0.30120481927710846, recall=0.0811965811965812, fmeasure=0.12165450121654502)),
 'rougeLsum': AggregateS
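
To make the precision, recall, and F-measure fields above easier to read, here is a simplified sketch of ROUGE-1 for a single prediction and reference. It assumes lowercased whitespace tokenization and no stemming, so it will not reproduce the exact numbers from the `rouge` metric; `rouge1_sketch` is a hypothetical name used only for this illustration.

```python
from collections import Counter


def rouge1_sketch(prediction: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F-measure (simplified ROUGE-1)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    pred_total = sum(pred_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / pred_total if pred_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    fmeasure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}


scores = rouge1_sketch("the cat sat", "the cat sat on the mat")
print(scores["precision"], scores["recall"])  # 1.0 0.5
```

A short prediction scored against a longer reference gives high precision and low recall, which is the same pattern as in the output above. One plausible contributor is that the generated summaries are capped at 128 tokens while the reference abstracts can be longer.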

In [8]:
test_dataset = test_dataset.select(range(100))

In [9]:
result = test_dataset.map(generate_answer)

school free food program ( nffp ) is implemented in elementary schools of deprived areas to cover all poor students. however, this is underweight program in iran a recent study among 752 high school students in sistan and baluchestan showed prevalence of 16.2%, underweight, overweight and obesity respectively. in this study, we report a systematic investigation of school free food program ( nffp ) in elementary schools of deprived areas in iran.<n> we have carried out a systematic investigation of the school free food program ( nffp ) in elementary schools of iran 
* purpose : * to present a systematic study of the effects of caa on quality of life ( qol ) and performance status in patients with cancer.<n> * methods : * the purpose of the study is to investigate the effects of caa on physical functioning, qol, and performance status in patients with cancer.<n> * results : * the results show that the effects of caa on physical functioning, qol, and performance status in patients with ca

In [10]:
rouge.compute(predictions=result["predicted_abstract"], references=result["abstract"])

{'rouge1': AggregateScore(low=Score(precision=0.423224049533702, recall=0.1824497401150509, fmeasure=0.24597267177383014), mid=Score(precision=0.46460792392593764, recall=0.2014373014541887, fmeasure=0.26738916446184985), high=Score(precision=0.5052367545848682, recall=0.22040843431551263, fmeasure=0.28930457276654126)),
 'rouge2': AggregateScore(low=Score(precision=0.1319506343861715, recall=0.05572035455043805, fmeasure=0.0756337867993005), mid=Score(precision=0.16862482459003766, recall=0.07023722302672819, fmeasure=0.0949997990024753), high=Score(precision=0.2122420393086512, recall=0.08764764979727753, fmeasure=0.11806673826945982)),
 'rougeL': AggregateScore(low=Score(precision=0.2745787506136185, recall=0.11784207956862924, fmeasure=0.15838482038371202), mid=Score(precision=0.31321052100590374, recall=0.13204833048918513, fmeasure=0.1763579963257519), high=Score(precision=0.3512861819301715, recall=0.14830057804180544, fmeasure=0.1968155573567783)),
 'rougeLsum': AggregateScore(
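
The nested result is hard to scan, so a small helper can reduce it to one number per metric. This sketch assumes the returned objects expose `low`, `mid`, and `high` attributes, each with `precision`, `recall`, and `fmeasure`, as the printed output suggests. The namedtuples below are stand-ins so the example runs without the `rouge_score` package, and `mid_fmeasures` is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-ins that mirror the field names printed above (assumption: the real
# objects returned by rouge.compute expose the same attributes).
Score = namedtuple("Score", ["precision", "recall", "fmeasure"])
AggregateScore = namedtuple("AggregateScore", ["low", "mid", "high"])


def mid_fmeasures(scores: dict) -> dict:
    """Return the mid F-measure of each metric as a percentage, rounded to 2 places."""
    return {name: round(agg.mid.fmeasure * 100, 2) for name, agg in scores.items()}


# Values copied from the 100-example output above, rounded to four decimals.
example_scores = {
    "rouge1": AggregateScore(
        low=Score(0.4232, 0.1824, 0.2460),
        mid=Score(0.4646, 0.2014, 0.2674),
        high=Score(0.5052, 0.2204, 0.2893),
    ),
    "rouge2": AggregateScore(
        low=Score(0.1320, 0.0557, 0.0756),
        mid=Score(0.1686, 0.0702, 0.0950),
        high=Score(0.2122, 0.0876, 0.1181),
    ),
    "rougeL": AggregateScore(
        low=Score(0.2746, 0.1178, 0.1584),
        mid=Score(0.3132, 0.1320, 0.1764),
        high=Score(0.3513, 0.1483, 0.1968),
    ),
}

summary = mid_fmeasures(example_scores)
print(summary)  # {'rouge1': 26.74, 'rouge2': 9.5, 'rougeL': 17.64}
```

In the notebook, the same helper could be applied directly to the dictionary returned by `rouge.compute(...)`, provided the attribute names match this assumption.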