## 🤗Transformers - Generating Titles from Paper Abstracts using the T5 Model
This notebook uses T5, a sequence-to-sequence model that treats NLP tasks as text-to-text problems: the model takes a text string as input and produces a text string as output. This framing covers tasks such as translation and summarization.

In this notebook, we use each paper's abstract as the input text and its title as the target text, and fine-tune T5 on these pairs. So, let's dive in...



We will install the dependencies and work with PyTorch 1.6 (the latest stable release at the time of writing).

In [1]:
# !pip install -U simpletransformers  

In [9]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [10]:
import os, psutil  

def cpu_stats():
    pid = os.getpid()
    py = psutil.Process(pid)
    memory_use = py.memory_info()[0] / 2. ** 30
    return 'memory GB:' + str(np.round(memory_use, 2))

In [11]:
cpu_stats()

'memory GB:2.29'

In [12]:
import json

data_file = '../dataset/arxiv-metadata-oai-snapshot.json'

def get_metadata():
    with open(data_file, 'r') as f:
        for line in f:
            yield line

In [13]:
metadata = get_metadata()
for paper in metadata:
    paper_dict = json.loads(paper)
    print('Title: {}\n\nAbstract: {}\nRef: {}'.format(paper_dict.get('title'), paper_dict.get('abstract'), paper_dict.get('journal-ref')))
#     print(paper)
    break

Title: Calculation of prompt diphoton production cross sections at Tevatron and
  LHC energies

Abstract:   A fully differential calculation in perturbative quantum chromodynamics is
presented for the production of massive photon pairs at hadron colliders. All
next-to-leading order perturbative contributions from quark-antiquark,
gluon-(anti)quark, and gluon-gluon subprocesses are included, as well as
all-orders resummation of initial-state gluon radiation valid at
next-to-next-to-leading logarithmic accuracy. The region of phase space is
specified in which the calculation is most reliable. Good agreement is
demonstrated with data from the Fermilab Tevatron, and predictions are made for
more detailed tests with CDF and DO data. Predictions are shown for
distributions of diphoton pairs produced at the energy of the Large Hadron
Collider (LHC). Distributions of the diphoton pairs from the decay of a Higgs
boson are contrasted with those produced from QCD processes at the LHC, showing
tha

**We will keep only arXiv papers whose `journal-ref` ends in a year from 2011 to 2015 (the filter below is `2010 < year < 2016`), to stay within Kaggle's compute limits**

In [14]:
titles = []
abstracts = []
years = []
metadata = get_metadata()
for paper in metadata:
    paper_dict = json.loads(paper)
    ref = paper_dict.get('journal-ref')
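    try:
        # Treat the last four characters of journal-ref as the publication year
        year = int(ref[-4:])
    except (TypeError, ValueError):
        # journal-ref is missing (None) or does not end in a 4-digit number
        continue
    if 2010 < year < 2016:
        years.append(year)
        titles.append(paper_dict.get('title'))
        abstracts.append(paper_dict.get('abstract'))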

len(titles), len(abstracts), len(years)

(22705, 22705, 22705)
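
The cell above reads the last four characters of `journal-ref` as the year, so any reference that does not end in four digits is silently skipped. As a rough sketch of a more tolerant filter, the snippet below searches the whole string for a year-like token. The helper names `extract_year` and `in_range` and the regular expression are illustrative assumptions, not part of the original notebook.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption): a standalone 4-digit number
# starting with 19 or 20, not embedded in a longer run of digits.
YEAR_PATTERN = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def extract_year(journal_ref: Optional[str]) -> Optional[int]:
    """Return the last year-like token in journal_ref, or None."""
    if not journal_ref:
        return None
    matches = YEAR_PATTERN.findall(journal_ref)
    return int(matches[-1]) if matches else None


def in_range(year: Optional[int], low: int = 2010, high: int = 2016) -> bool:
    """Mirror the notebook's exclusive bounds: low < year < high."""
    return year is not None and low < year < high


for ref in ["Phys.Rev.D76:013009,2007", "J. Math. Phys. 53 (2012)", None]:
    year = extract_year(ref)
    print(ref, year, in_range(year))
```

A pattern like this can also mistake a four-digit page or article number for a year, so it trades one kind of error for another; it is worth spot-checking the extracted years before training on them.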

In [15]:
papers = pd.DataFrame({
    'title': titles,
    'abstract': abstracts,
    'year': years
})
papers.head()

| | title | abstract | year |
|---|---|---|---|
| 0 | The World as Evolving Information | This paper discusses the benefits of describ... | 2012 |
| 1 | A unified analysis of the reactor neutrino pro... | We present in this article a detailed quanti... | 2013 |
| 2 | Heat Equations and the Weighted $\bar\partial$... | The purpose of this article is to establish ... | 2012 |
| 3 | The KATRIN sensitivity to the neutrino mass an... | The aim of the KArlsruhe TRItium Neutrino ex... | 2011 |
| 4 | Penguin-mediated B_(d,s)->VV decays and the Bs... | In this letter, we propose three different s... | 2011 |


In [9]:
del titles, abstracts, years

In [10]:
cpu_stats()

'memory GB:0.13'

**We will use the `simpletransformers` library to train a T5 model**

In [19]:
import logging
logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)


**The Simpletransformers implementation of T5 expects the data as a dataframe with 3 columns:**
`<prefix>, <input_text>, <target_text>`
* `<prefix>`: A string indicating the task to perform. (E.g. "question", "stsb")
* `<input_text>`: The input text sequence (we will use the paper's abstract as `input_text`)
* `<target_text>`: The target sequence (we will use the paper's title as `target_text`)


You can read more about the data format here: https://github.com/ThilinaRajapakse/simpletransformers#t5-transformer
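
To make the three-column layout concrete, here is a minimal sketch that builds a tiny dataframe in that shape and checks the column names. The two rows are made-up placeholder texts, not arXiv records, and the `required` list is only an illustrative check.

```python
import pandas as pd

# Toy rows standing in for arXiv records; the texts are invented for illustration.
toy = pd.DataFrame({
    "prefix": ["summarize", "summarize"],
    "input_text": ["We study X and show Y.", "We propose a method for Z."],
    "target_text": ["A study of X", "A method for Z"],
})

# Verify that the frame carries the three columns described above.
required = ["prefix", "input_text", "target_text"]
missing = [col for col in required if col not in toy.columns]
assert not missing, f"missing columns: {missing}"

print(toy)
```

In this notebook the `prefix` column is added later, just before training, by assigning the constant string `"summarize"` to both the training and evaluation frames.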

In [16]:
papers = papers[['title','abstract']]
papers.columns = ['target_text', 'input_text']
papers = papers.dropna()

In [17]:
eval_df = papers.sample(frac=0.2, random_state=101)
train_df = papers.drop(eval_df.index)

In [18]:
train_df.shape, eval_df.shape

((18164, 2), (4541, 2))
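
The shapes above add up as expected: 20% of 22705 rows is 4541, leaving 18164 for training. The sketch below repeats the same `sample` then `drop` pattern on a ten-row toy frame and confirms the two parts do not overlap. The frame contents are placeholders.

```python
import pandas as pd

# Ten placeholder rows with a default integer index (0..9).
toy = pd.DataFrame({
    "input_text": [f"abstract {i}" for i in range(10)],
    "target_text": [f"title {i}" for i in range(10)],
})

# Same pattern as the notebook: sample 20% for evaluation, drop those rows for training.
eval_part = toy.sample(frac=0.2, random_state=101)
train_part = toy.drop(eval_part.index)

# The two index sets should be disjoint and together cover every row.
overlap = set(train_part.index) & set(eval_part.index)
print(len(train_part), len(eval_part), len(overlap))
```

This pattern relies on the index labels being unique, because `drop` removes every row that shares a dropped label. If a frame has been filtered or concatenated, calling `reset_index(drop=True)` before sampling avoids that pitfall.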

In [None]:
import logging

import pandas as pd
from simpletransformers.t5 import T5Model

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

train_df['prefix'] = "summarize"
eval_df['prefix'] = "summarize"


model_args = {
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "max_seq_length": 512,
    "train_batch_size": 16,
    "num_train_epochs": 4,
}

# Create T5 Model
# model = T5Model(model_name="t5-small", model_type="t5", args=model_args, use_cuda=True)

import pickle

# Load the previously pickled model; the context manager closes the file handle
with open('../models/title-generator-t5-arxiv-16-4.pkl', 'rb') as f:
    model = pickle.load(f)

# Train T5 Model on new task
model.train_model(train_df)

# Evaluate T5 Model on new task
results = model.eval_model(eval_df)

# Predict with trained T5 model
#print(model.predict(["convert: four"]))

**We will train our T5 model with a bare-minimum configuration, `num_train_epochs=4` and `train_batch_size=16`, to fit within Kaggle's compute limits**

In [20]:
import logging

import pandas as pd
from simpletransformers.t5 import T5Model

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

train_df['prefix'] = "summarize"
eval_df['prefix'] = "summarize"


model_args = {
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "max_seq_length": 512,
    "train_batch_size": 16,
    "num_train_epochs": 4,
}

# Create T5 Model
model = T5Model(model_name="t5-small", model_type="t5", args=model_args, use_cuda=True)

# Train T5 Model on new task
model.train_model(train_df)

# Evaluate T5 Model on new task
results = model.eval_model(eval_df)

# Predict with trained T5 model
#print(model.predict(["convert: four"]))

INFO:simpletransformers.t5.t5_utils: Creating features from dataset file at cache_dir/
  0%|          | 0/18164 [00:27<?, ?it/s]


In [16]:
from torch import cuda
cuda.is_available()

True

In [17]:
results

{'eval_loss': 2.1558538724017398}
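
A raw loss value is hard to read on its own. One common way to interpret it, sketched below, is to exponentiate it into a perplexity. This assumes `eval_loss` is an average cross-entropy measured in nats per target token, which the notebook does not state explicitly.

```python
import math

# Value reported by eval_model in the cell above.
eval_loss = 2.1558538724017398

# Assumption: eval_loss is a mean cross-entropy in nats per target token,
# so exp(loss) can be read as a perplexity.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the value works out to roughly 8.6, meaning the model is on average about as uncertain as if it were choosing uniformly among eight or nine tokens at each step of the title.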

## And We're Done!
**Let's see how our model performs at generating paper titles**

In [18]:
random_num = 350
actual_title = eval_df.iloc[random_num]['target_text']
actual_abstract = ["summarize: "+eval_df.iloc[random_num]['input_text']]
predicted_title = model.predict(actual_abstract)

print(f'Actual Title: {actual_title}')
print(f'Predicted Title: {predicted_title}')
print(f'Actual Abstract: {actual_abstract}')


`prepare_seq2seq_batch` is deprecated and will be removed in version 5 of HuggingFace Transformers. Use the regular
`__call__` method to prepare your inputs and the tokenizer under the `as_target_tokenizer` context manager to prepare
your targets.

Here is a short example:

model_inputs = tokenizer(src_texts, ...)
with tokenizer.as_target_tokenizer():
    labels = tokenizer(tgt_texts, ...)
model_inputs["labels"] = labels["input_ids"]

See the documentation of your specific tokenizer for more details on the specific arguments to the tokenizer of choice.
For a more complete example, see the implementation of `prepare_seq2seq_batch`.

Generating outputs: 100%|██████████| 1/1 [00:00<00:00, 12.00it/s]
Decoding outputs: 100%|██████████| 1/1 [00:04<00:00,  4.94s/it]


Actual Title: Integral Hodge conjecture for Fermat varieties
Predicted Title: ['Lattice of Hodge cycles']
Actual Abstract: ['summarize:   We describe an algorithm which verifies whether linear algebraic cycles of\nthe Fermat variety generate the lattice of Hodge cycles. A computer\nimplementation of this confirms the integral Hodge conjecture for quartic and\nquintic Fermat fourfolds. Our algorithm is based on computation of the list of\nelementary divisors of both the lattice of linear algebraic cycles, and the\nlattice of Hodge cycles written in terms of vanishing cycles, and observing\nthat these two lists are the same.\n']


In [19]:
random_num = 478
actual_title = eval_df.iloc[random_num]['target_text']
actual_abstract = ["summarize: "+eval_df.iloc[random_num]['input_text']]
predicted_title = model.predict(actual_abstract)

print(f'Actual Title: {actual_title}')
print(f'Predicted Title: {predicted_title}')
print(f'Actual Abstract: {actual_abstract}')

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  6.50it/s]
Decoding outputs: 100%|██████████| 1/1 [00:04<00:00,  4.96s/it]


Actual Title: Construction of genuine multipartite entangled states
Predicted Title: ['A Novel Product for Real Multipartite Entanglement']
Actual Abstract: ['summarize:   Genuine multipartite entanglement is of great importance in quantum\ninformation, especially from the experimental point of view. Nevertheless, it\nis difficult to construct genuine multipartite entangled states systematically,\nbecause the genuine multipartite entanglement is unruly. We propose another\nproduct based on the Kronecker product in this paper. The Kronecker product is\na common product in quantum information with good physical interpretation. We\nmainly investigate whether the proposed product of two genuine multipartite\nentangled states is still a genuine entangled one. We understand the\nentanglement of the proposed product better by characterizing the entanglement\nof the Kronecker product. Then we show the proposed product is a genuine\nmultipartite entangled state in two cases. The results provide

In [20]:
random_num = 999
actual_title = eval_df.iloc[random_num]['target_text']
actual_abstract = ["summarize: "+eval_df.iloc[random_num]['input_text']]
predicted_title = model.predict(actual_abstract)

print(f'Actual Title: {actual_title}')
print(f'Predicted Title: {predicted_title}')
print(f'Actual Abstract: {actual_abstract}')

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  7.47it/s]
Decoding outputs: 100%|██████████| 1/1 [00:04<00:00,  4.86s/it]


Actual Title: Automating Motion Correction in Multishot MRI Using Generative
  Adversarial Networks
Predicted Title: ['GAN: Generative Adversarial Network for Multishot Magnetic Resonance Imaging']
Actual Abstract: ['summarize:   Multishot Magnetic Resonance Imaging (MRI) has recently gained popularity as\nit accelerates the MRI data acquisition process without compromising the\nquality of final MR image. However, it suffers from motion artifacts caused by\npatient movements which may lead to misdiagnosis. Modern state-of-the-art\nmotion correction techniques are able to counter small degree motion, however,\ntheir adoption is hindered by their time complexity. This paper proposes a\nGenerative Adversarial Network (GAN) for reconstructing motion free\nhigh-fidelity images while reducing the image reconstruction time by an\nimpressive two orders of magnitude.\n']
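
Eyeballing three examples gives only a rough impression. As a sketch of a quick numeric check, the snippet below scores word overlap between an actual and a predicted title with a unigram F1. The function `token_f1` is a hypothetical helper, not something the notebook or `simpletransformers` provides, and it is a crude proxy that ignores word order and synonyms.

```python
from collections import Counter


def token_f1(reference: str, prediction: str) -> float:
    """Unigram F1 between two strings, case-insensitive, whitespace-tokenized."""
    ref = Counter(reference.lower().split())
    pred = Counter(prediction.lower().split())
    common = sum((ref & pred).values())  # words shared by both, with multiplicity
    if common == 0:
        return 0.0
    precision = common / sum(pred.values())
    recall = common / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Title pairs copied from the outputs above.
pairs = [
    ("Integral Hodge conjecture for Fermat varieties",
     "Lattice of Hodge cycles"),
    ("Construction of genuine multipartite entangled states",
     "A Novel Product for Real Multipartite Entanglement"),
]
for ref, pred in pairs:
    print(f"{token_f1(ref, pred):.2f}  {pred}")
```

Both pairs share only a single word, so the scores are low even though the second prediction is semantically close to the real title. Established summarization metrics such as ROUGE would be a more standard choice for a full evaluation.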


In [6]:
actual_abstract = ["summarize: "+"""Rococo Rococo (/rəˈkoʊkoʊ/, also US: /ˌroʊkəˈkoʊ/), less commonly Roccoco or Late Baroque, is an exceptionally ornamental and theatrical style of architecture, art and decoration which combines asymmetry, scrolling curves, gilding, white and pastel colors, sculpted molding, and trompe-l'œil frescoes to create surprise and the illusion of motion and drama. 8][9] In the late 17th and early 18th century rocaille became the term for a kind of decorative motif or ornament that appeared in the late Style Louis XIV, in the form of a seashell interlaced with acanthus leaves."""]

In [None]:
print(model.predict(actual_abstract))

In [1]:
import pickle

In [None]:
# Use a context manager so the file is flushed and closed after writing
with open('../models/title-generator.pkl', 'wb') as f:
    pickle.dump(model, f)

In [2]:
with open('../models/title-generator-t5-arxiv-16-4.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

In [7]:
loaded_model.predict(actual_abstract)

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  4.69it/s]
Decoding outputs: 100%|██████████| 1/1 [00:14<00:00, 14.72s/it]


['Rococo Rococo: A Novel Design and a Novel Design']

In [8]:
loaded_model.predict("Industrial Training Programme offered by the Faculty of Computing and Information Technology, Tunku Abdul Rahman University College (TAR UC) The main objective of the industrial training programme is to provide students with practical training opportunities in one or more of the following areas . We believe with the expert guidance and experience of your esteemed organisation, our students will acquire relevant practical skills and experience which would be valuable to the students later in their working life .")

Generating outputs: 100%|██████████| 65/65 [00:06<00:00, 10.34it/s]
Decoding outputs: 100%|██████████| 65/65 [00:15<00:00,  4.29it/s]


['Industri',
 'Al Train al Train',
 'Prog ing Prog ing Prog ing Prog ing Prog',
 'Ramme of ramme of ramme of',
 'Fered by fered by fered by fered by fered by',
 'Facing the Facc Fac',
 'ulty of ulty of ulty of',
 'Computin Computin',
 'In g and Infrared',
 'Formatio',
 'Technological n Technological n Technological',
 'logy, Tu Tu logy, Tu Tu Tu Tu Tu Tu Tu Tu Tu Tu Tu',
 'Abdudu nku Abdu Abdu nku Abdu Abdu',
 'Rahman Rahman',
 "Universit's Universit'es Universit'e",
 'ity Coll Coll Coll Coll',
 'Ege (TAR): a ege (TAR)',
 'UC Theorems',
 'Main obstructor ob',
 'jective jective jive',
 'I',
 'ndustria ndustria',
 'Traini l traini',
 'ng progr ng progr ng progr ng prog',
 'Amme isotropic and amme isotropic',
 'Proviate Proviant Proviant Proviant',
 'De stude em es em es em',
 'nts with nts with nts',
 'Practic practicum',
 'Al train',
 'Oppospos oppos oppos oppos oppos oppos',
 'Rtunitie rtunitie',
 'One s s s a s a s',
 'Or more: A note on the number of adobes',
 'The',
 'Followin follo
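
The fragments in this output ("Industri", "al Train", "ing Prog", ...) line up with consecutive 8-character slices of the input, and the progress bar reports 65 items rather than 1. That is consistent with `predict` receiving a bare string instead of a list and slicing it into batches character by character. The sketch below reproduces the slicing effect in plain Python and shows a small guard. The batch size of 8 is inferred from the fragment lengths, and `batches` and `as_text_list` are hypothetical helpers, not library functions.

```python
from typing import List, Union


def batches(items, batch_size: int = 8):
    """Slice any sequence into consecutive chunks, as a batching loop would."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def as_text_list(texts: Union[str, List[str]]) -> List[str]:
    """Hypothetical guard: wrap a bare string so it counts as one document."""
    return [texts] if isinstance(texts, str) else list(texts)


text = "Industrial Training Programme offered by the Faculty"

# A bare string is sliced character-wise into 8-character pieces.
print(batches(text)[:3])

# A one-item list stays a single batch holding the whole text.
print(len(batches(as_text_list(text))))
```

Passing `[text]` (optionally with the `"summarize: "` prefix used elsewhere in this notebook) should therefore yield a single title for the whole passage instead of one output per fragment.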

In [17]:
actual_abstract = ["summarize: "+"""Venetian commodes imitated the curving lines and carved ornament of the French rocaille, but with a particular Venetian variation; the pieces were painted, often with landscapes or flowers or scenes from Guardi or other painters, or Chinoiserie, against a blue or green background, matching the colours of the Venetian school of painters whose work decorated the salons. 24] Ceiling of church of Santi Giovanni e Paolo in Venice, by Piazzetta (1727) Juno and Luna by Giovanni Battista Tiepolo (1735–45) Murano glass chandelier at the Ca Rezzonico (1758) Ballroom ceiling of the Ca Rezzonico with ceiling by Giovanni Battista Crosato (1753) In church construction, especially in the southern German-Austrian region, gigantic spatial creations are sometimes created for practical reasons alone, which, however, do not appear monumental, but are characterized by a unique fusion of architecture, painting, stucco, etc. ,."""]

loaded_model.predict(actual_abstract)

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  4.97it/s]
Decoding outputs: 100%|██████████| 1/1 [00:09<00:00,  9.09s/it]


['The Venetian Commodes']

In [14]:
actual_abstract = ["summarize: "+"""Many adults put off talking to young children about ‘race’ and racism until they’re five or older. Research shows that three-month-olds have racial preferences. By the age of three, children in the U.S. have negative associations of ‘low-status racial groups. By ‘racist’ I mean that which upholds, reinforces, reproduces, or perpetuates racism."""]

loaded_model.predict(actual_abstract)

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  5.95it/s]
Decoding outputs: 100%|██████████| 1/1 [00:13<00:00, 13.18s/it]


["Race and Rasism in Children's Families"]

In [13]:
actual_abstract = ["""Many adults put off talking to young children about ‘race’ and racism until they’re five or older. Research shows that three-month-olds have racial preferences. By the age of three, children in the U.S. have negative associations of ‘low-status racial groups. By ‘racist’ I mean that which upholds, reinforces, reproduces, or perpetuates racism."""]

loaded_model.predict(actual_abstract)

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  4.02it/s]
Decoding outputs: 100%|██████████| 1/1 [00:15<00:00, 15.32s/it]


['Rasism in the U.S.: Race and Rasism']

In [18]:
actual_abstract = ["""Covid-19: Worry about risk of infection, not effects of vaccine, for children, Malaysian parents told Friday, 11 Feb 2022 09:17 AM MYT A child gets his Covid-19 jab at the Ideal Convention Centre in Shah Alam January 3, 2022. — The two doses of the Covid-19 vaccine, when completed, can generate a strong immune response in children in that (five to 11) age group, which can prevent hospital admission and death due to Covid-19,” she told Bernama."""]

loaded_model.predict(actual_abstract)

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  3.45it/s]
Decoding outputs: 100%|██████████| 1/1 [00:12<00:00, 12.94s/it]


['The Covid-19 vaccine vaccine can generate strong immune response in children']

## Retrain Using Different Parameters

In [29]:
import logging

import pandas as pd
from simpletransformers.t5 import T5Model

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

train_df['prefix'] = "summarize"
eval_df['prefix'] = "summarize"


model_args = {
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "max_seq_length": 512,
    "train_batch_size": 36,
    "num_train_epochs": 4,
}

# Create T5 Model
model = T5Model(model_name="t5-small", model_type="t5", args=model_args, use_cuda=True)

# Train T5 Model on new task
model.train_model(train_df)

# Evaluate T5 Model on new task
results = model.eval_model(eval_df)

# Predict with trained T5 model
#print(model.predict(["convert: four"]))

INFO:simpletransformers.t5.t5_utils: Creating features from dataset file at cache_dir/
100%|██████████| 14855/14855 [00:22<00:00, 647.40it/s] 
INFO:simpletransformers.t5.t5_utils: Saving features into cached file cache_dir/t5-small_cached_51214855
INFO:simpletransformers.t5.t5_model: Training started
Epochs 0/4. Running Loss:    2.5037: 100%|██████████| 413/413 [04:25<00:00,  1.56it/s]
Epochs 1/4. Running Loss:    2.7283: 100%|██████████| 413/413 [04:25<00:00,  1.56it/s]
Epochs 2/4. Running Loss:    1.5897: 100%|██████████| 413/413 [04:25<00:00,  1.55it/s]
Epochs 3/4. Running Loss:    1.8895: 100%|██████████| 413/413 [04:25<00:00,  1.56it/s]
Epoch 4 of 4: 100%|██████████| 4/4 [17:43<00:00, 265.80s/it]
INFO:simpletransformers.t5.t5_model: Training of t5-small model complete. Saved to outputs/.
INFO:simpletransformers.t5.t5_utils: Creating features from dataset file at cache_dir/
100%|██████████| 3714/3714 [00:07<00:00, 469.86it/s]
INFO:simpletransformers.t5.t5_utils: Saving features int
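
The progress bars above show 413 steps per epoch for 14855 cached training examples at `train_batch_size=36`. That matches a simple ceiling division, sketched below under the assumption that the final partial batch is kept rather than dropped. The helper name `steps_per_epoch` is illustrative.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


num_examples = 14855  # from the "Creating features" progress bar above

steps_36 = steps_per_epoch(num_examples, 36)
steps_32 = steps_per_epoch(num_examples, 32)
print(steps_36, steps_32)

# Total steps for the 4-epoch run at batch size 36.
print(4 * steps_36)
```

The same arithmetic gives 465 steps for a batch size of 32, which is the count reported by the later run in this notebook. A larger batch means fewer steps per epoch, so changing the batch size also changes how many weight updates the model receives for a fixed number of epochs.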

In [30]:
actual_abstract = ["summarize: "+"""Rococo Rococo (/rəˈkoʊkoʊ/, also US: /ˌroʊkəˈkoʊ/), less commonly Roccoco or Late Baroque, is an exceptionally ornamental and theatrical style of architecture, art and decoration which combines asymmetry, scrolling curves, gilding, white and pastel colors, sculpted molding, and trompe-l'œil frescoes to create surprise and the illusion of motion and drama. 8][9] In the late 17th and early 18th century rocaille became the term for a kind of decorative motif or ornament that appeared in the late Style Louis XIV, in the form of a seashell interlaced with acanthus leaves."""]

print(model.predict(actual_abstract))

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  2.67it/s]
Decoding outputs: 100%|██████████| 1/1 [00:11<00:00, 11.73s/it]


['Rococo Rococo: Roccoco, Late Baroque, and']


In [33]:
actual_abstract = ["summarize: "+"""Venetian commodes imitated the curving lines and carved ornament of the French rocaille, but with a particular Venetian variation; the pieces were painted, often with landscapes or flowers or scenes from Guardi or other painters, or Chinoiserie, against a blue or green background, matching the colours of the Venetian school of painters whose work decorated the salons. 24] Ceiling of church of Santi Giovanni e Paolo in Venice, by Piazzetta (1727) Juno and Luna by Giovanni Battista Tiepolo (1735–45) Murano glass chandelier at the Ca Rezzonico (1758) Ballroom ceiling of the Ca Rezzonico with ceiling by Giovanni Battista Crosato (1753) In church construction, especially in the southern German-Austrian region, gigantic spatial creations are sometimes created for practical reasons alone, which, however, do not appear monumental, but are characterized by a unique fusion of architecture, painting, stucco, etc. ,."""]

model.predict(actual_abstract)

Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  4.42it/s]
Decoding outputs: 100%|██████████| 1/1 [00:07<00:00,  7.66s/it]


['The Venetian Commodes and the Renaissance']

In [34]:
with open('../models/title-generator-t5-arxiv-36-4.pkl', 'wb') as f:
    pickle.dump(model, f)

In [18]:
import logging

import pandas as pd
from simpletransformers.t5 import T5Model

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

train_df['prefix'] = "summarize"
eval_df['prefix'] = "summarize"


model_args = {
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "max_seq_length": 512,
    "train_batch_size": 36,
    "num_train_epochs": 8,
}

# Create T5 Model
model = T5Model(model_name="t5-small", model_type="t5", args=model_args, use_cuda=True)

# Train T5 Model on new task
model.train_model(train_df)

# Evaluate T5 Model on new task
results = model.eval_model(eval_df)

# Predict with trained T5 model
#print(model.predict(["convert: four"]))

actual_abstract = ["summarize: "+"""Rococo Rococo (/rəˈkoʊkoʊ/, also US: /ˌroʊkəˈkoʊ/), less commonly Roccoco or Late Baroque, is an exceptionally ornamental and theatrical style of architecture, art and decoration which combines asymmetry, scrolling curves, gilding, white and pastel colors, sculpted molding, and trompe-l'œil frescoes to create surprise and the illusion of motion and drama. 8][9] In the late 17th and early 18th century rocaille became the term for a kind of decorative motif or ornament that appeared in the late Style Louis XIV, in the form of a seashell interlaced with acanthus leaves."""]

print(model.predict(actual_abstract))

actual_abstract = ["summarize: "+"""Venetian commodes imitated the curving lines and carved ornament of the French rocaille, but with a particular Venetian variation; the pieces were painted, often with landscapes or flowers or scenes from Guardi or other painters, or Chinoiserie, against a blue or green background, matching the colours of the Venetian school of painters whose work decorated the salons. 24] Ceiling of church of Santi Giovanni e Paolo in Venice, by Piazzetta (1727) Juno and Luna by Giovanni Battista Tiepolo (1735–45) Murano glass chandelier at the Ca Rezzonico (1758) Ballroom ceiling of the Ca Rezzonico with ceiling by Giovanni Battista Crosato (1753) In church construction, especially in the southern German-Austrian region, gigantic spatial creations are sometimes created for practical reasons alone, which, however, do not appear monumental, but are characterized by a unique fusion of architecture, painting, stucco, etc. ,."""]

print(model.predict(actual_abstract))

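# Name the file after the batch size and epoch count used for this run
model_path = '../models/title-generator-t5-arxiv-{}-{}.pkl'.format(model_args['train_batch_size'], model_args['num_train_epochs'])
with open(model_path, 'wb') as f:
    pickle.dump(model, f)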

INFO:simpletransformers.t5.t5_utils: Creating features from dataset file at cache_dir/
100%|██████████| 14855/14855 [00:16<00:00, 923.34it/s] 
INFO:simpletransformers.t5.t5_utils: Saving features into cached file cache_dir/t5-small_cached_51214855
INFO:simpletransformers.t5.t5_model: Training started
Epochs 0/8. Running Loss:    2.3553: 100%|██████████| 413/413 [04:23<00:00,  1.57it/s]
Epochs 1/8. Running Loss:    2.2879: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Epochs 2/8. Running Loss:    2.1958: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Epochs 3/8. Running Loss:    2.2356: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Epochs 4/8. Running Loss:    1.8977: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Epochs 5/8. Running Loss:    1.6054: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Epochs 6/8. Running Loss:    1.7475: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Epochs 7/8. Running Loss:    1.1424: 100%|██████████| 413/413 [04:07<00:00,  1.67it/s]
Ep

['Rococo Rococo: a classical and classical style of architecture, art and decoration']


Generating outputs: 100%|██████████| 1/1 [00:00<00:00, 10.45it/s]
Decoding outputs: 100%|██████████| 1/1 [00:04<00:00,  4.86s/it]


['The Ca Rezzonico: a Venetian Renaissance']


In [14]:
import logging

import pandas as pd
from simpletransformers.t5 import T5Model

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

train_df['prefix'] = "summarize"
eval_df['prefix'] = "summarize"


model_args = {
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "max_seq_length": 512,
    "train_batch_size": 32,
    "num_train_epochs": 4,
}

# Create T5 Model
model = T5Model(model_name="t5-small", model_type="t5", args=model_args, use_cuda=True)

# Train T5 Model on new task
model.train_model(train_df)

# Evaluate T5 Model on new task
results = model.eval_model(eval_df)

# Predict with trained T5 model
#print(model.predict(["convert: four"]))

actual_abstract = ["summarize: "+"""Rococo Rococo (/rəˈkoʊkoʊ/, also US: /ˌroʊkəˈkoʊ/), less commonly Roccoco or Late Baroque, is an exceptionally ornamental and theatrical style of architecture, art and decoration which combines asymmetry, scrolling curves, gilding, white and pastel colors, sculpted molding, and trompe-l'œil frescoes to create surprise and the illusion of motion and drama. 8][9] In the late 17th and early 18th century rocaille became the term for a kind of decorative motif or ornament that appeared in the late Style Louis XIV, in the form of a seashell interlaced with acanthus leaves."""]

print(model.predict(actual_abstract))

actual_abstract = ["summarize: "+"""Venetian commodes imitated the curving lines and carved ornament of the French rocaille, but with a particular Venetian variation; the pieces were painted, often with landscapes or flowers or scenes from Guardi or other painters, or Chinoiserie, against a blue or green background, matching the colours of the Venetian school of painters whose work decorated the salons. 24] Ceiling of church of Santi Giovanni e Paolo in Venice, by Piazzetta (1727) Juno and Luna by Giovanni Battista Tiepolo (1735–45) Murano glass chandelier at the Ca Rezzonico (1758) Ballroom ceiling of the Ca Rezzonico with ceiling by Giovanni Battista Crosato (1753) In church construction, especially in the southern German-Austrian region, gigantic spatial creations are sometimes created for practical reasons alone, which, however, do not appear monumental, but are characterized by a unique fusion of architecture, painting, stucco, etc. ,."""]

print(model.predict(actual_abstract))

INFO:simpletransformers.t5.t5_utils: Creating features from dataset file at cache_dir/
100%|██████████| 14855/14855 [00:15<00:00, 961.38it/s] 
INFO:simpletransformers.t5.t5_utils: Saving features into cached file cache_dir/t5-small_cached_51214855
INFO:simpletransformers.t5.t5_model: Training started
Epochs 0/4. Running Loss:    2.6198: 100%|██████████| 465/465 [04:33<00:00,  1.70it/s]
Epochs 1/4. Running Loss:    2.2277: 100%|██████████| 465/465 [04:29<00:00,  1.72it/s]
Epochs 2/4. Running Loss:    1.9162: 100%|██████████| 465/465 [04:30<00:00,  1.72it/s]
Epochs 3/4. Running Loss:    1.7281: 100%|██████████| 465/465 [04:29<00:00,  1.72it/s]
Epoch 4 of 4: 100%|██████████| 4/4 [18:04<00:00, 271.07s/it]
INFO:simpletransformers.t5.t5_model: Training of t5-small model complete. Saved to outputs/.
INFO:simpletransformers.t5.t5_utils: Creating features from dataset file at cache_dir/
100%|██████████| 3714/3714 [00:15<00:00, 244.44it/s]
INFO:simpletransformers.t5.t5_utils: Saving features int

['Rococo Rococo Rococo: a classic style of architecture, art and']


Generating outputs: 100%|██████████| 1/1 [00:00<00:00,  6.90it/s]
Decoding outputs: 100%|██████████| 1/1 [00:04<00:00,  4.92s/it]


['The commodes of the Venetian painters, the cosmosphere,']


In [17]:
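# Name the file after the batch size and epoch count used for this run
model_path = '../models/title-generator-t5-arxiv-{}-{}.pkl'.format(model_args['train_batch_size'], model_args['num_train_epochs'])
with open(model_path, 'wb') as f:
    pickle.dump(model, f)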
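
The save cells in this notebook encode the batch size and epoch count in the file name. Below is a minimal sketch of that naming convention together with a pickle round trip that closes its file handles. A plain dict stands in for the trained model, and `checkpoint_path`, `save_pickle` and `load_pickle` are hypothetical helpers introduced only for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def checkpoint_path(root: Path, args: dict) -> Path:
    """Build the file name from batch size and epoch count, as in the cells above."""
    name = "title-generator-t5-arxiv-{}-{}.pkl".format(
        args["train_batch_size"], args["num_train_epochs"]
    )
    return root / name


def save_pickle(obj, path: Path) -> None:
    """Write obj to path; the context manager closes the file afterwards."""
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_pickle(path: Path):
    """Read a pickled object back from path."""
    with path.open("rb") as f:
        return pickle.load(f)


args = {"train_batch_size": 32, "num_train_epochs": 4}
with tempfile.TemporaryDirectory() as tmp:
    path = checkpoint_path(Path(tmp), args)
    save_pickle(args, path)      # a plain dict stands in for the model object
    restored = load_pickle(path)

print(path.name, restored == args)
```

Pickling a whole model object ties the file to the exact library versions used to create it, and unpickling runs arbitrary code, so only load files you trust. The training logs above also show that the model is saved to `outputs/`, which may be a more portable artifact to keep.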