In [1]:
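import torch
import json
from transformers import T5Tokenizer, T5ForConditionalGeneration, T5Config

# Load the pretrained T5-large checkpoint and its matching tokenizer.
model = T5ForConditionalGeneration.from_pretrained('t5-large')
tokenizer = T5Tokenizer.from_pretrained('t5-large')
# Run on CPU; move the model to the same device the inputs are sent to.
device = torch.device('cpu')
model = model.to(device)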

Some weights of T5ForConditionalGeneration were not initialized from the model checkpoint at t5-large and are newly initialized: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [2]:
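def generate_summary(text):
    """Summarize `text` with T5 and return the summary with each sentence capitalized."""
    # Collapse newlines to spaces so words on adjacent lines are not fused together.
    preprocess_text = text.strip().replace("\n", " ")
    # T5 selects the task from a text prefix; "summarize: " requests summarization.
    t5_prepared_text = "summarize: " + preprocess_text
    tokenized_text = tokenizer.encode(t5_prepared_text, return_tensors="pt").to(device)

    # Beam search with 8 beams; the summary is constrained to 75-100 tokens.
    summary_ids = model.generate(tokenized_text,
                                 num_beams=8,
                                 no_repeat_ngram_size=10,
                                 min_length=75,
                                 max_length=100,
                                 early_stopping=True)

    output = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    # Capitalize each sentence. Note that str.capitalize() also lowercases the
    # remaining characters, so proper nouns such as "Gaussian" come out lowercase.
    output_sentences = output.split('. ')
    result = [sentence.capitalize() for sentence in output_sentences]

    return ". ".join(result)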

In [3]:
sample_text = """An approach to semi-supervised learning is proposed that is based on a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The learning problem is then formulated in terms of a Gaussian random field on this graph, where the mean of the field is characterized in terms of harmonic functions, and is efficiently obtained using matrix methods or belief propagation. The resulting learning algorithms have intimate connections with random walks, electric networks, and spectral graph theory. We discuss methods to incorporate class priors and the predictions of classifiers obtained by supervised learning. We also propose a method of parameter learning by entropy minimization, and show the algorithm\u2019s ability to perform feature selection. Promising experimental results are presented for synthetic data, digit classification, and text classification tasks."""

summary = generate_summary(sample_text)

print('Summary')
print('-'*50)
print(summary)

Summary
--------------------------------------------------
Semi-supervised learning is based on a gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph. The learning problem is then formulated in terms of a gaussian random field. The mean of the field is characterized in terms of harmonic functions.
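
The lowercase "gaussian" in the summary above is a side effect of `str.capitalize()`, which uppercases the first character and lowercases everything after it. The sketch below shows one possible way to keep proper nouns and acronyms intact; `capitalize_first` and `recapitalize` are hypothetical helper names, not part of the notebook, and the example string is made up for illustration.

```python
def capitalize_first(sentence: str) -> str:
    """Uppercase only the first character; leave the rest unchanged."""
    # Slicing keeps this safe for empty strings.
    return sentence[:1].upper() + sentence[1:]


def recapitalize(text: str, sep: str = ". ") -> str:
    """Apply capitalize_first to every `sep`-delimited sentence in `text`."""
    return sep.join(capitalize_first(s) for s in text.split(sep))


example = "the mean of a Gaussian random field. results on MS COCO."

# Built-in behaviour: everything after the first character is lowercased.
print(example.capitalize())
# -> The mean of a gaussian random field. results on ms coco.

# Sketch: only the first character of each sentence is changed.
print(recapitalize(example))
# -> The mean of a Gaussian random field. Results on MS COCO.
```

Swapping `sentence.capitalize()` for a helper like this inside `generate_summary` would change the printed summaries, so the outputs recorded in this notebook reflect the original `capitalize()` call.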


In [4]:
sample_text = "Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images. We describe how we can train this model in a deterministic manner using standard backpropagation techniques and stochastically by maximizing a variational lower bound. We also show through visualization how the model is able to automatically learn to fix its gaze on salient objects while generating the corresponding words in the output sequence. We validate the use of attention with state-of-the-art performance on three benchmark datasets: Flickr9k, Flickr30k and MS COCO."

summary = generate_summary(sample_text)

print('Summary')
print('-'*50)
print(summary)

Summary
--------------------------------------------------
Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images. The model is able to automatically learn to fix its gaze on salient objects while generating the corresponding words in the output sequence. We validate the use of attention with state-of-the-art performance on three benchmark datasets: flickr9k, flickr30k and ms coco.
