# **Text Summarization Using Transformers**



Install the required packages (`T5Tokenizer` depends on `sentencepiece`)

In [None]:
!pip install sentencepiece
!pip install transformers



### Run text summarization with a pre-trained model. The model is trained on English-language text.

In [None]:
from transformers import T5ForConditionalGeneration, T5Tokenizer

def summarize_text(text, model_name="t5-small", max_length=150, min_length=40):
    # Initialize the tokenizer and model
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    # Preprocess the text
    preprocess_text = text.strip().replace("\n", " ")
    t5_prepared_text = "summarize: " + preprocess_text

    # Tokenize the prompt; anything beyond 512 tokens is truncated
    tokenized_text = tokenizer.encode(
        t5_prepared_text, return_tensors="pt", max_length=512, truncation=True
    )

    # Generate the summary with beam search, then decode the token IDs back to text
    summary_ids = model.generate(
        tokenized_text,
        max_length=max_length,
        min_length=min_length,
        length_penalty=2.0,
        num_beams=4,
        early_stopping=True,
    )
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    return summary

# Example text
large_text = """
Since 2017, Google Colab has been the easiest way to start programming in Python.
Over 7 million people, including students, already use Colab to access these
powerful computing resources, free of charge, without having to install or
manage any software. It’s a great tool for machine learning, data analysis,
and education — and now it’s getting even better with advances in AI.
Today, we’re announcing that Colab will soon add AI coding features like code
completions, natural language to code generation and even a code-assisting chatbot.
Colab will use Codey, a family of code models built on PaLM 2, which was
just announced at I/O last week. Codey was fine-tuned on a large dataset of
high quality, permissively licensed code from external sources to improve
performance on coding tasks. Plus, the versions of Codey being used to power
Colab have been customized especially for Python and for Colab-specific uses.
"""

summary = summarize_text(large_text)
large_text

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


'\nSince 2017, Google Colab has been the easiest way to start programming in Python. Over 7 million people, including students, already use Colab to access these powerful computing resources, free of charge, without having to install or manage any software. It’s a great tool for machine learning, data analysis, and education — and now it’s getting even better with advances in AI.\nToday, we’re announcing that Colab will soon add AI coding features like code completions, natural language to code generation and even a code-assisting chatbot. Colab will use Codey, a family of code models built on PaLM 2, which was just announced at I/O last week. Codey was fine-tuned on a large dataset of high quality, permissively licensed code from external sources to improve performance on coding tasks. Plus, the versions of Codey being used to power Colab have been customized especially for Python and for Colab-specific uses.\n'

In [None]:
summary

'since 2017, Google Colab has been the easiest way to start programming in Python. over 7 million people, including students, already use Colab to access these powerful computing resources, free of charge. Colab will soon add AI coding features like code completions, natural language to code generation and even a code-assisting chatbot.'
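
`summarize_text` encodes its input with `max_length=512, truncation=True`, so anything past 512 tokens is dropped before the model sees it. One possible workaround for longer documents is to split the text into chunks, summarize each chunk, and join the partial summaries. The sketch below is only an illustration under assumptions: `split_into_chunks`, `summarize_long_text`, and the `max_words=300` budget are hypothetical names and values that are not part of the notebook. Word count is used as a rough stand-in for token count, and the sentence splitter is deliberately naive.

```python
import re


def split_into_chunks(text, max_words=300):
    """Group sentences into chunks of at most max_words words each.

    A single sentence longer than max_words is kept whole as its own chunk.
    The word budget is an assumed, rough proxy for the 512-token limit.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words

    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long_text(text, summarize_fn, max_words=300):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    chunks = split_into_chunks(text, max_words)
    return " ".join(summarize_fn(chunk) for chunk in chunks)
```

With the function defined earlier, this could be called as `summarize_long_text(large_text, summarize_text)`. Because `summarize_text` reloads the tokenizer and model on every call, loading them once outside the function would likely be worthwhile when many chunks are involved.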