In [39]:
import pickle

import pandas as pd
from tqdm import tqdm
from transformers import pipeline, set_seed

pd.set_option('display.max_colwidth', None)  # show full generated text in DataFrame output
set_seed(42)  # make sampling reproducible

In [49]:
help(pipeline)

Help on function pipeline in module transformers.pipelines:

pipeline(task: str, model: Optional = None, config: Union[str, transformers.configuration_utils.PretrainedConfig, NoneType] = None, tokenizer: Union[str, transformers.tokenization_utils.PreTrainedTokenizer, NoneType] = None, framework: Union[str, NoneType] = None, revision: Union[str, NoneType] = None, use_fast: bool = True, use_auth_token: Union[str, bool, NoneType] = None, model_kwargs: Dict[str, Any] = {'use_auth_token': None}, **kwargs) -> transformers.pipelines.base.Pipeline
    Utility factory method to build a :class:`~transformers.Pipeline`.
    
    Pipelines are made of:
    
        - A :doc:`tokenizer <tokenizer>` in charge of mapping raw textual input to token.
        - A :doc:`model <model>` to make predictions from the inputs.
        - Some (optional) post processing for enhancing model's output.
    
    Args:
        task (:obj:`str`):
            The task defining which pipeline will be returned. Currently

In [2]:
generator = pipeline('text-generation', model='EleutherAI/gpt-neo-2.7B')
generatorgpt2 = pipeline('text-generation', model='gpt2')
df = pd.read_csv('Ontology_final.csv', sep=';')

In [52]:
# device=0 places the model on the first GPU; it is set here, when the pipeline is built
generator3 = pipeline('text-generation', model='xlnet-base-cased', device=0)

Downloading:   0%|          | 0.00/760 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/467M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/798k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.38M [00:00<?, ?B/s]

In [53]:
xlnet_sentence_list = []
for keyword in tqdm(df['keyword']):
    # device is a pipeline() constructor argument, so it is not passed per call
    gen_sen = generator3(keyword, min_length=100, num_return_sequences=20)
    xlnet_sentence_list.append(gen_sen)

100%|███████████████████████████████████████████████████████████████████████████| 4122/4122 [98:55:19<00:00, 86.39s/it]
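
The loop above ran for roughly 99 hours over 4,122 keywords, and the results were only pickled after it had finished. A minimal sketch of a resumable variant follows. The helper name `generate_with_checkpoints`, its `save_every` parameter, and the checkpoint file name are hypothetical and not part of the original notebook; `generate_fn` stands in for any callable that wraps a pipeline such as `generator3`.

```python
import os
import pickle


def generate_with_checkpoints(keywords, generate_fn, checkpoint_path, save_every=50):
    """Run generate_fn over keywords, saving partial results so a crash can resume.

    Hypothetical helper, not from the original notebook. keywords is a plain
    list of strings; generate_fn is any callable taking one keyword.
    """
    results = []
    # Resume from an earlier partial run if a checkpoint file exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, 'rb') as f:
            results = pickle.load(f)

    # Skip the keywords that already have results.
    for idx in range(len(results), len(keywords)):
        results.append(generate_fn(keywords[idx]))
        # Persist progress periodically and once more at the very end.
        if (idx + 1) % save_every == 0 or idx + 1 == len(keywords):
            with open(checkpoint_path, 'wb') as f:
                pickle.dump(results, f)
    return results
```

Under these assumptions it could be called as `generate_with_checkpoints(df['keyword'].tolist(), lambda k: generator3(k, min_length=100, num_return_sequences=20), 'xlnet_checkpoint.pkl')`. Rerunning the same call after an interruption picks up from the last saved index.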


In [38]:
# gptneo_sentence_list = []
# for idx in tqdm(range(len(df))):
#     keyword = df['keyword'][idx]
#     gen_sen = generator(keyword, min_length=100,
#           num_return_sequences=20, device = 0)
#     gptneo_sentence_list.append(gen_sen)

In [40]:
# gpt2_sentence_list = []
# for idx in tqdm(range(len(df))):
#     keyword = df['keyword'][idx]
#     gpt2_sentence_list.append(generatorgpt2(keyword, min_length=100,
#                                             num_return_sequences=20, device = 0))

In [29]:
# df['gpt2_sentences'] = gpt2_sentence_list
# df['gptneo_sentences'] = gptneo_sentence_list

In [54]:
filename3 = 'UNDP_SDG_Sentences.sav'
# Use a context manager so the file handle is closed after writing
with open(filename3, 'wb') as f:
    pickle.dump(xlnet_sentence_list, f)

In [55]:
df['xlnet_sentences'] = xlnet_sentence_list

In [57]:
df.to_json('UNDP_SDG_SentencesGPT2-NEO-Xlnet.json')

In [32]:
# Save the generated sentences (not the models) to disk
filename = 'UNDP_SDG_SentencesGPT-NEO.sav'
filename2 = 'UNDP_SDG_SetencesGPT-2.sav'
with open(filename, 'wb') as f:
    pickle.dump(gptneo_sentence_list, f)
with open(filename2, 'wb') as f:
    pickle.dump(gpt2_sentence_list, f)

In [34]:
df.to_json('UNDP_Sentence_GPT_Generation.json')

In [59]:
df['gpt2_sentences'][1]

[{'generated_text': "absolute poverty.\n\nWith its population of only 13,000 people, the tiny southern city of Potsdam is one of the largest in Europe and the only capital of the former East Timor nation. It is still home to the world's"},
 {'generated_text': 'absolute poverty for its population, the cost of an entire city will come at least three times its initial level of cost. The value of this initial cost will be measured in terms of how far it will go in terms of total cost (it will be'},
 {'generated_text': 'absolute poverty" from the lack of jobs and education in the developed and less developed region (Tobin 1988). Of course, this is all part of the "new" neoliberal strategy. It\'s the same old thing, except different people are working on'},
 {'generated_text': 'absolute poverty rate among the elderly, however.\n\n"It was not an issue for the young people who have been able to move in and move to the city centre for a longer period and the young people who haven\'t," Rolf say

In [60]:
df['gptneo_sentences'][1]

[{'generated_text': 'absolute poverty,” as the children are unable to attend school because they lack the money to feed themselves.\n\n“I would think, after seeing the way the children have been forced into poverty, they would have to want to go to'},
 {'generated_text': "absolute poverty', which he also felt, but added that he had not yet been invited to read all his books, he would have to make sure of not letting other people tell him what to do.\n\nHe was to die five years later,"},
 {'generated_text': 'absolute poverty of the soul.” And so, in this way, the doctrine of\nChrist begins to be applied in practice to the poor and destitute. The\ndoctrine has already begun to be applied to the needy and poor of'},
 {'generated_text': "absolute poverty' (cough). In some countries it can even be used to mean 'being poor'. Thus, in New Zealand one could be 'a millionaire' and 'be poor but not in poverty'.\n\nThe adjective 'low-income"},
 {'generated_text': 'absolute poverty, the people of 

In [61]:
df['xlnet_sentences'][1]

[{'generated_text': 'absolute poverty to the very lowest of the subsistence level of India (, a state where the population of some 169 million has risen to about 330 million by the decade of the last year). ((The National Government of India): "... a nation that suffers from extreme poverty is a country with great tolerance and stability.") <4. In the year 2010–2011'},
 {'generated_text': 'absolute poverty <s/> the poverty of the most people". In addition to absolute poverty<s/> absolute poverty <s/> absolute poverty<s/> absolute poverty <s/> absolute poverty <s/> absolute poverty<s/> absolute poverty<s/> absolute poverty<s/> absolute poverty<s/> absolute poverty<s/> absolute poverty<s/> absolute poverty'},
 {'generated_text': 'absolute poverty <</ </<< < < <><< << < <<< < <> <   < *<< < <<<< <=  << = <.000'},
 {'generated_text': 'absolute poverty (in the main (for export)) In an extreme poverty (save for the local economy) in Ukraine (except for the largest market), (except for export
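
Each entry printed above is a list of dicts with a single `generated_text` key, and some of the XLNet samples degenerate into repeated phrases or runs of `<` characters. The sketch below shows one way to flatten such an entry into plain strings and drop highly repetitive samples. The helper names and the 0.5 threshold are illustrative assumptions, not part of the original notebook, and the distinct-token ratio is only a rough heuristic.

```python
def extract_texts(generations):
    """Pull the plain strings out of one pipeline result (a list of dicts)."""
    return [g['generated_text'] for g in generations]


def distinct_token_ratio(text):
    """Share of whitespace-separated tokens that are unique (1.0 = no repeats)."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def filter_repetitive(generations, min_ratio=0.5):
    """Keep only samples whose distinct-token ratio is at least min_ratio."""
    return [
        text for text in extract_texts(generations)
        if distinct_token_ratio(text) >= min_ratio
    ]
```

For example, `filter_repetitive(df['xlnet_sentences'][1])` would return only the samples for the second keyword that pass the assumed threshold.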