Abstractive text summary generation

In [23]:
# !pip install -U -q transformers datasets sentencepiece

In [24]:
# Load the pre-trained T5 model and tokenizer
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

In [25]:
text = """
Abstractive summarization is the process of shortening a set of data computationally to create a summary that conveys the most important information.
It can generate entirely new phrases, reword, or interpret the content in its own way.
"""

In [26]:
# Define the user's word limit (here, 120 words)
word_limit = 120
# Convert words to a token budget, assuming roughly 1.3 tokens per word
token_limit = int(word_limit * 1.3)
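The word-to-token conversion above, together with the 0.5 minimum-length ratio used later in `summarize_text`, can be collected in one place. This is a minimal sketch: the helper name `generation_bounds` and its parameter names are hypothetical, and the 1.3 tokens-per-word factor is the notebook's own rough assumption, not a measured property of the T5 tokenizer.

```python
def generation_bounds(word_limit, tokens_per_word=1.3, min_ratio=0.5):
    """Hypothetical helper: turn a word limit into (min_length, max_length) in tokens.

    tokens_per_word=1.3 and min_ratio=0.5 mirror the constants used in this notebook.
    """
    max_length = int(word_limit * tokens_per_word)
    min_length = int(max_length * min_ratio)
    return min_length, max_length


min_len, max_len = generation_bounds(120)
print(min_len, max_len)  # 78 156
```

Both values are token counts, so the number of words in the generated summary will only approximate the requested limit.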


In [27]:
# Preprocess input: T5 selects the task from the "summarize: " prefix.
# Inputs longer than 512 tokens are truncated.
input_text = "summarize: " + text.strip()
input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)

In [28]:
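def summarize_text(text, word_limit):
    """Summarize `text` with T5, aiming for roughly `word_limit` words."""
    # Convert the word limit to a token budget (assumes ~1.3 tokens per word)
    token_limit = int(word_limit * 1.3)
    # T5 selects the task from the "summarize: " prefix
    input_text = "summarize: " + text.strip()
    input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)

    # Beam search; max_length and min_length are counted in tokens, not words
    summary_ids = model.generate(
        input_ids,
        max_length=token_limit,
        min_length=int(token_limit * 0.5),
        length_penalty=2.0,
        num_beams=4,
        early_stopping=True
    )

    # Decode the highest-scoring beam back to plain text
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary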
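Because `max_length` limits tokens and not words, a generated summary can still run past the requested word count. One possible way to enforce a hard cap is to trim the decoded text afterwards. This is a sketch only: `truncate_to_words` is a hypothetical helper that is not part of the notebook, and it cuts on whitespace without regard to sentence boundaries.

```python
def truncate_to_words(summary: str, word_limit: int) -> str:
    """Hypothetical helper: keep at most `word_limit` whitespace-separated words."""
    words = summary.split()
    # Re-joining with single spaces also normalizes stray whitespace
    return " ".join(words[:word_limit])


print(truncate_to_words("T5 rewrites the input in its own words .", 4))
# T5 rewrites the input
```

It could be applied to the return value of `summarize_text`, at the cost of occasionally ending the summary mid-sentence.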

In [29]:
import gradio as gr

In [30]:
gr.Interface(
    fn=summarize_text,
    inputs=[
        gr.Textbox(label="Enter Text", lines=10, placeholder="Paste text here..."),
        gr.Slider(20, 150, value=60, step=10, label="Word Limit")
    ],
    outputs=gr.Textbox(label="Generated Summary"),
    title="Abstractive Text Summarizer",
    description="Enter a long paragraph and choose your desired summary word limit."
).launch()

It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://663a62d7eb8ac79875.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


