In [1]:
!pip install transformers torch

Defaulting to user installation because normal site-packages is not writeable
Collecting transformers
  Using cached transformers-4.57.1-py3-none-any.whl.metadata (43 kB)
Collecting huggingface-hub<1.0,>=0.34.0 (from transformers)
  Using cached huggingface_hub-0.36.0-py3-none-any.whl.metadata (14 kB)
Collecting tokenizers<=0.23.0,>=0.22.0 (from transformers)
  Using cached tokenizers-0.22.1-cp39-abi3-win_amd64.whl.metadata (6.9 kB)
Using cached transformers-4.57.1-py3-none-any.whl (12.0 MB)
Using cached huggingface_hub-0.36.0-py3-none-any.whl (566 kB)
Using cached tokenizers-0.22.1-cp39-abi3-win_amd64.whl (2.7 MB)
Installing collected packages: huggingface-hub, tokenizers, transformers


In [2]:
from transformers import pipeline



In [3]:
bart_summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Device set to use cpu


In [4]:
text = """
The red-billed quelea (Quelea quelea) is a small, migratory, sparrow-like bird of the weaver family, Ploceidae, native to sub-Saharan Africa. It is approximately 12 cm (4.7 in) long and weighs 15 to 26 g (0.53 to 0.92 oz) Non-breeding birds have light underparts, striped brown upper parts, yellow-edged flight feathers and a reddish bill. Breeding females attain a yellowish bill. Breeding males have a black (or rarely white) facial mask, surrounded by a purplish, pinkish, rusty or yellowish wash on the head and breast. The species avoids forests, deserts and colder areas. It constructs oval roofed nests woven from strips of grass hanging from thorny branches, sugar cane or reeds. It breeds in very large colonies. The quelea feeds primarily on seeds of annual grasses, but also causes extensive damage to cereal crops. It is regarded as the most numerous undomesticated bird on earth, with the population sometimes peaking at an estimated 1.5 billion individuals.
"""

In [5]:
summary = bart_summarizer(text, max_length=150, min_length=50, do_sample=False)
print("BART Summary:\n", summary[0]['summary_text'])

BART Summary:
 The red-billed quelea is a small, migratory, sparrow-like bird of the weaver family, Ploceidae. It is approximately 12 cm (4.7 in) long and weighs 15 to 26 g (0.53 to 0.92 oz) Non-breeding birds have light underparts, striped brown upper parts, yellow-edged flight feathers and a reddish bill.
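The summary above reproduces the opening sentences of the input almost verbatim, so it can be useful to check how much shorter it really is. The helper below is a minimal sketch, not part of the original notebook: `compression_ratio` is a hypothetical name, and it counts whitespace-separated words rather than model tokens.

```python
def compression_ratio(source: str, summary: str) -> float:
    """Return summary length divided by source length, in whitespace-separated words."""
    source_words = source.split()
    if not source_words:
        raise ValueError("source text is empty")
    return len(summary.split()) / len(source_words)


# Illustrative strings only (assumption); in the notebook you would pass
# `text` and summary[0]['summary_text'] instead.
demo_source = "one two three four five six seven eight"
demo_summary = "one two"
print(f"Compression ratio: {compression_ratio(demo_source, demo_summary):.2f}")
```

A ratio close to 1.0 would suggest that `max_length` is too generous for the input.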


In [6]:
from transformers import set_seed

In [7]:
gpt2_refiner = pipeline("text-generation", model="gpt2")
set_seed(42)

Device set to use cpu


In [8]:
prompt = (
    "Summarize this paragraph in ONE short sentence only. "
    "Do not explain or add examples.\n\n"
    f"{summary[0]['summary_text']}"
)
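output = gpt2_refiner(
    prompt,
    max_new_tokens=20,     # limits only the generated tokens; max_length would also count the prompt
    do_sample=True,        # temperature only has an effect when sampling is enabled
    temperature=0.4,
    num_return_sequences=1
)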

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [9]:
print("Refined Summary:\n", output[0]['generated_text'])

Refined Summary:
 Summarize this paragraph in ONE short sentence only. Do not explain or add examples.

The red-billed quelea is a small, migratory, sparrow-like bird of the weaver family, Ploceidae. It is approximately 12 cm (4.7 in) long and weighs 15 to 26 g (0.53 to 0.92 oz) Non-breeding birds have light underparts, striped brown upper parts, yellow-edged flight feathers and a reddish bill. The yellow-edged wing feathers are round, long and round. The red-billed quele
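The output above begins with the prompt itself because the text-generation pipeline returns the prompt plus the continuation by default. One option is to pass `return_full_text=False` to the pipeline call. The sketch below instead shows a plain-Python alternative; `strip_prompt` is a hypothetical helper, and the demo strings are made up for illustration.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the continuation when the generated text begins with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if the prompt was not echoed verbatim.
    return generated_text.strip()


# Made-up strings (assumption); in the notebook you would pass
# output[0]['generated_text'] and prompt.
demo_prompt = "Summarize this paragraph in ONE short sentence only.\n\nSome paragraph."
demo_generated = demo_prompt + " A made-up continuation."
print(strip_prompt(demo_generated, demo_prompt))
```

Removing the prompt does not make the continuation a summary: GPT-2 is not instruction-tuned, so it tends to keep writing in the style of the paragraph rather than follow the instruction.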


In [10]:
from transformers import pipeline, set_seed

set_seed(42)
refiner = pipeline("text-generation", model="gpt2")

text = """The red-billed quelea is a small, migratory, sparrow-like bird of the weaver family..."""

# Truncate to the first 700 characters so the prompt stays well within GPT-2's 1024-token context window
text = text[:700]

prompt = (
    "Summarize this paragraph in ONE short sentence only. "
    "Do not explain or add examples.\n\n"
    f"{text}\n\nSummary:"
)

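output = refiner(
    prompt,
    max_new_tokens=50,     # limits only the generated tokens; max_length would also count the prompt
    do_sample=True,        # temperature only has an effect when sampling is enabled
    temperature=0.4,
    num_return_sequences=1
)

# The pipeline returns prompt + continuation: keep the text after the last
# "Summary:" marker and cut it at the first period.
generated = output[0]['generated_text']
result = generated.split("Summary:")[-1].split('.')[0].strip() + '.'
print("Refined Summary:", result)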


Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Refined Summary: The red-billed quelea is a small, migratory, sparrow-like bird of the weaver family.
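Cutting at the first period with `split('.')[0]` works for this input, but it would truncate a sentence containing a decimal such as "4.7 in". The sketch below is one possible alternative, not the notebook's method: `first_sentence` is a hypothetical helper that only treats `.`, `!` or `?` as a sentence end when it is followed by whitespace or the end of the string. Abbreviations such as "e.g. " would still be split.

```python
import re


def first_sentence(text: str) -> str:
    """Return the text up to the first terminator followed by whitespace or end of string."""
    text = text.strip()
    # The lookahead keeps the period in "4.7" from counting as a sentence boundary.
    match = re.search(r"[.!?](?=\s|$)", text)
    return text[: match.end()] if match else text


# Illustrative input (assumption); in the notebook you would pass the text
# that follows the "Summary:" marker.
demo = "It is approximately 12 cm (4.7 in) long. It breeds in very large colonies."
print(first_sentence(demo))
```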
