<a href="https://colab.research.google.com/github/danielsaggau/deep_unsupervised_learning/blob/main/big_patent_Bigbird_Pegasus_Evaluation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Evaluate 🤗's BigBirdPegasus on BigPatent**

In [None]:
!nvidia-smi

Tue Sep 14 11:35:23 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               

Let's first install `transformers`, `datasets`, `rouge_score` and `sentencepiece`, plus the `bleurt` and `bert_score` metric packages.

In [None]:
%%capture
!pip3 install datasets
!pip3 install rouge_score
!pip3 install git+https://github.com/huggingface/transformers
!pip3 install sentencepiece
!pip install git+https://github.com/google-research/bleurt.git
!pip install bert_score

In [None]:
from datasets import load_dataset, load_metric
import torch
from transformers import BigBirdPegasusForConditionalGeneration, AutoTokenizer

In [None]:
DATASET_NAME = "big_patent" # arxiv
DEVICE = "cuda"
CACHE_DIR = DATASET_NAME
# The Hub checkpoint is named "bigpatent" (no underscore), so it cannot be built from DATASET_NAME.
MODEL_ID = "google/bigbird-pegasus-large-bigpatent"

In [None]:
test_dataset = load_dataset(DATASET_NAME, "all", split="test", cache_dir=CACHE_DIR)
test_dataset

Downloading and preparing dataset big_patent/all (download: 6.01 GiB, generated: 24.17 GiB, post-processed: Unknown size, total: 30.17 GiB) to big_patent/big_patent/all/1.0.0/efa16ff728ce0a1726ef8a0faeb0376331093f8fff41cf4cfaccc11d9cdb442d...



Dataset big_patent downloaded and prepared to big_patent/big_patent/all/1.0.0/efa16ff728ce0a1726ef8a0faeb0376331093f8fff41cf4cfaccc11d9cdb442d. Subsequent calls will reuse this data.


Dataset({
    features: ['description', 'abstract'],
    num_rows: 67072
})

In [None]:
tokenizer = AutoTokenizer.from_pretrained('google/bigbird-pegasus-large-bigpatent')
model = BigBirdPegasusForConditionalGeneration.from_pretrained('google/bigbird-pegasus-large-bigpatent').to(DEVICE)

In [None]:
rouge = load_metric('rouge')

In [None]:
bleurt = load_metric('bleurt')

`BigBirdPegasus` makes use of *block sparse attention*. Let's verify the `config`'s attention type and the `block_size`.

In [None]:
model.config.attention_type, model.config.block_size

('block_sparse', 64)
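To get a feel for what a `block_size` of 64 means at the 4096-token input length used in this notebook, the sketch below counts attention blocks. It is a rough estimate, not the library's implementation: the helper name `attention_block_counts` is made up for illustration, and the defaults of 3 sliding-window blocks, 2 global blocks and 3 random blocks per query block are assumptions, not values read from this model's config.

```python
def attention_block_counts(seq_len, block_size, window_blocks=3, global_blocks=2, random_blocks=3):
    """Return (num_blocks, full_block_pairs, sparse_block_pairs) for one attention layer.

    The window/global/random defaults are illustrative assumptions.
    """
    if seq_len % block_size != 0:
        raise ValueError("seq_len must be a multiple of block_size")
    num_blocks = seq_len // block_size
    # Full attention: every query block attends to every key block.
    full_pairs = num_blocks * num_blocks
    # Block sparse attention: each query block attends to a fixed number of key blocks.
    per_query = min(num_blocks, window_blocks + global_blocks + random_blocks)
    sparse_pairs = num_blocks * per_query
    return num_blocks, full_pairs, sparse_pairs


num_blocks, full_pairs, sparse_pairs = attention_block_counts(seq_len=4096, block_size=64)
print(num_blocks, full_pairs, sparse_pairs)  # 64 4096 512
```

Under these assumptions each query block attends to 8 key blocks rather than all 64, which is what keeps 4096-token inputs tractable.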

In [None]:
def generate_answer(batch):
  # Tokenize one patent description, padded/truncated to the 4096-token input length.
  inputs_dict = tokenizer(batch["description"], padding="max_length", max_length=4096, return_tensors="pt", truncation=True)
  inputs_dict = {k: v.to(DEVICE) for k, v in inputs_dict.items()}
  # Note: top_p only affects sampling, so it has no effect unless do_sample=True is also passed.
  predicted_abstract_ids = model.generate(**inputs_dict, max_length=256, top_p=0.95, repetition_penalty=1.1, length_penalty=0.8)
  batch["predicted_abstract"] = tokenizer.decode(predicted_abstract_ids[0], skip_special_tokens=True)
  print(batch["predicted_abstract"])
  return batch
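The `repetition_penalty=1.1` argument discourages the decoder from repeating tokens it has already produced. Below is a minimal sketch of the commonly used rule (divide positive scores, multiply negative ones), written in plain Python. The helper name `apply_repetition_penalty` and the toy logits are hypothetical, and the sketch only illustrates the idea rather than reproducing what `generate` does internally.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Return a copy of `logits` where already-generated token ids are made less likely."""
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Positive scores shrink and negative scores grow in magnitude,
        # so the token becomes less likely in both cases when penalty > 1.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [2.0, -1.0, 0.5]  # toy scores for a 3-token vocabulary
adjusted = apply_repetition_penalty(logits, generated_ids=[0, 1, 1], penalty=1.1)
print(adjusted)
```

A penalty of 1.0 leaves the scores unchanged, so 1.1 is a fairly mild setting.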

In [None]:
dataset_small = test_dataset.select(range(2))
result_small = dataset_small.map(generate_answer)

rouge.compute(predictions=result_small["predicted_abstract"], references=result_small["abstract"])



A method and system for cleaning pet appendages including feet, hooves, and limbs using a plurality of flow-through type brushes that are readily transported and stored between uses, readily adapts to specific uses, and environments proximate that treatment surface is not limited.
A method for preparing an oatmeal composition is disclosed. The method includes the steps of hydrating steel cut oats, adding oat bran to the hydrated steel cut oats, granulating the oat bran, adding rolled oats to the granulated oat bran mixture, cooking the mixture, transferring the cooked mixture to a holding reservoir, heating the mixture in the reservoir to cook the rolled oats, and transferring the cooked mixture to a container.


{'rouge1': AggregateScore(low=Score(precision=0.3953488372093023, recall=0.1559633027522936, fmeasure=0.2236842105263158), mid=Score(precision=0.5849983622666229, recall=0.20956059874456784, fmeasure=0.3082706766917293), high=Score(precision=0.7746478873239436, recall=0.2631578947368421, fmeasure=0.3928571428571428)),
 'rouge2': AggregateScore(low=Score(precision=0.19047619047619047, recall=0.07407407407407407, fmeasure=0.10666666666666667), mid=Score(precision=0.24523809523809523, recall=0.08751780626780627, fmeasure=0.1288729016786571), high=Score(precision=0.3, recall=0.10096153846153846, fmeasure=0.1510791366906475)),
 'rougeL': AggregateScore(low=Score(precision=0.32558139534883723, recall=0.12844036697247707, fmeasure=0.1842105263157895), mid=Score(precision=0.416311824434982, recall=0.15034458540011414, fmeasure=0.22067669172932333), high=Score(precision=0.5070422535211268, recall=0.1722488038277512, fmeasure=0.2571428571428572)),
 'rougeLsum': AggregateScore(low=Score(precision
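Each ROUGE variant is reported as an `AggregateScore` with `low`, `mid` and `high` entries, each holding precision, recall and F-measure. The `mid` F-measure is usually the number that gets quoted. The sketch below pulls it out of such a result; the two namedtuples are stand-ins that only mirror the field names printed above, the helper `mid_fmeasures` is a made-up name, and the values are copied from the two-example output above.

```python
from collections import namedtuple

# Stand-ins with the same field names as the objects printed above.
Score = namedtuple("Score", ["precision", "recall", "fmeasure"])
AggregateScore = namedtuple("AggregateScore", ["low", "mid", "high"])


def mid_fmeasures(rouge_result, digits=4):
    """Map each ROUGE variant to its rounded mid F-measure."""
    return {name: round(agg.mid.fmeasure, digits) for name, agg in rouge_result.items()}


example = {
    "rouge1": AggregateScore(
        low=Score(0.3953488372093023, 0.1559633027522936, 0.2236842105263158),
        mid=Score(0.5849983622666229, 0.20956059874456784, 0.3082706766917293),
        high=Score(0.7746478873239436, 0.2631578947368421, 0.3928571428571428),
    ),
    "rouge2": AggregateScore(
        low=Score(0.19047619047619047, 0.07407407407407407, 0.10666666666666667),
        mid=Score(0.24523809523809523, 0.08751780626780627, 0.1288729016786571),
        high=Score(0.3, 0.10096153846153846, 0.1510791366906475),
    ),
}
summary = mid_fmeasures(example)
print(summary)  # {'rouge1': 0.3083, 'rouge2': 0.1289}
```

With only two examples the gap between `low` and `high` is wide, so these numbers are a smoke test rather than a meaningful score.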

In [None]:
bleurt.compute(predictions=result_small["predicted_abstract"], references=result_small["abstract"])

{'scores': [-0.6322981715202332, 0.19446244835853577]}
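BLEURT returns one score per prediction rather than a single aggregate, so some summary is needed to compare runs. A minimal sketch, assuming a plain mean is an acceptable summary; the helper name `summarize_bleurt` is hypothetical and the input is the two-example output shown above.

```python
from statistics import mean


def summarize_bleurt(bleurt_result):
    """Collapse the per-example BLEURT scores into a few summary numbers."""
    scores = bleurt_result["scores"]
    if not scores:
        raise ValueError("no scores to summarize")
    return {"mean": mean(scores), "min": min(scores), "max": max(scores), "n": len(scores)}


bleurt_summary = summarize_bleurt({"scores": [-0.6322981715202332, 0.19446244835853577]})
print(bleurt_summary)
```

The large spread between the two examples suggests reporting the mean together with the number of examples it was computed on.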

In [None]:
test_dataset = test_dataset.select(range(600))

# Generate Summaries

In [None]:
result = test_dataset.map(generate_answer)

  0%|          | 0/600 [00:00<?, ?ex/s]

A method and system for cleaning pet appendages including feet, hooves, and limbs using a plurality of flow-through type brushes that are readily transported and stored between uses, readily adapts to specific uses, and environments proximate that treatment surface is not limited.
A method for preparing an oatmeal composition is disclosed. The method includes the steps of hydrating steel cut oats, adding oat bran to the hydrated steel cut oats, granulating the oat bran, adding rolled oats to the granulated oat bran mixture, cooking the mixture, transferring the cooked mixture to a holding reservoir, heating the mixture in the reservoir to cook the rolled oats, and transferring the cooked mixture to a container.
The trunk rotation conditioning device of this invention provides the following. the user is in a weight bearing position that simulates a stance in many sports. the angle of the inclination is adjustable about a pivot to accommodate individual variation in the standing position

# Evaluation via ROUGE

In [None]:
result

Dataset({
    features: ['article', 'abstract', 'section_names', 'predicted_abstract'],
    num_rows: 600
})

In [None]:
rouge_result = rouge.compute(predictions=result["predicted_abstract"], references=result["abstract"])

# Evaluation via BLEURT

In [None]:
score = bleurt.compute(predictions=result["predicted_abstract"], references=result["abstract"])

In [None]:
import pandas as pd
dataframe = pd.DataFrame(rouge_result)
dataframe.to_csv('/content/bigpatent_rouge_result.csv', index=False)

In [None]:
dataframe = pd.DataFrame(score)
dataframe.to_csv('/content/bigpatent_bleurt_result.csv', index=False)

In [None]:
def generate_answer(batch):
  # Same as the earlier generate_answer, but without repetition_penalty.
  # big_patent stores the input text in "description" (it has no "article" column).
  inputs_dict = tokenizer(batch["description"], padding="max_length", max_length=4096, return_tensors="pt", truncation=True)
  inputs_dict = {k: v.to(DEVICE) for k, v in inputs_dict.items()}
  predicted_abstract_ids = model.generate(**inputs_dict, max_length=256, top_p=0.95, length_penalty=0.8)
  batch["predicted_abstract"] = tokenizer.decode(predicted_abstract_ids[0], skip_special_tokens=True)
  print(batch["predicted_abstract"])
  return batch

In [None]:
result = test_dataset.map(generate_answer)

  0%|          | 0/600 [00:00<?, ?ex/s]

the problem of the existence of the 155-day periodicity in the daily sunspot areas, the mean sunspot areas per carrington rotation, the monthly sunspot numbers and their fluctuations, which are obtained after removing the 11-year cycle is considered.<n> two methods of the power spectrum analysis are used : the fast fourier transformation algorithm with the hamming window function ( fft ) and the blackman - tukey ( bt ) method.<n> the fft method is used for the diagnosis of the reasons of the existence of peaks, which are computed by the fft method.<n> the bt method is used for the diagnosis of the reasons of the existence of peaks, which are obtained by the fft method.<n> numerical results of the new method of the diagnosis of an echo - effect for sunspot area data are discussed.<n> it is shown that the sunspot data from cycle 16 present the 155-day periodicity, which is characteristic for one of the solar hemispheres ( the southern hemisphere for cycles 1215 and the northern hemispher

In [None]:
# "nopen" = no repetition penalty, matching the output file names below.
nopen_rouge_result = rouge.compute(predictions=result["predicted_abstract"], references=result["abstract"])
nopen_bleurt_score = bleurt.compute(predictions=result["predicted_abstract"], references=result["abstract"])

In [None]:
dataframe = pd.DataFrame(nopen_rouge_result)
dataframe.to_csv('/content/bigpatent_rouge_result_nopen.csv', index=False)
dataframe = pd.DataFrame(nopen_bleurt_score)
dataframe.to_csv('/content/bigpatent_bleurt_score_nopen.csv', index=False)