In [13]:
import torch
# only the PyTorch classes are used below; the unused TensorFlow import (TFGPT2LMHeadModel) is dropped
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# add the EOS token as PAD token to avoid warnings
model = GPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id)

Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [15]:
# beam search with 5 beams; no_repeat_ngram_size=2 prevents any 2-gram from appearing twice
input_ids = tokenizer.encode('When AI was first introduced into business processes,', return_tensors='pt')

beam_output = model.generate(
    input_ids, 
    max_length=100, 
    num_beams=5, 
    no_repeat_ngram_size=2, 
    early_stopping=True
)

print("Output:\n" + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
When AI was first introduced into business processes, it was thought that AI could be used to solve complex problems. However, in the last few years, AI has become more and more popular.

AI has been used in a number of industries, such as healthcare, education, finance, health care, and many other fields. It has also become a major player in many industries. In fact, there are many companies that are using AI to help them solve problems in their business. For example,
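
To make the effect of `no_repeat_ngram_size` concrete, here is a simplified, pure-Python sketch of the idea: given the tokens generated so far, find which next tokens would complete an n-gram that has already occurred. The function name `banned_next_tokens` is hypothetical and this is not the library's implementation, only an illustration of the rule it applies at each generation step.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would repeat an n-gram already present in `generated`.

    Illustrative sketch only (hypothetical helper, not the transformers code).
    """
    if ngram_size < 1 or len(generated) < ngram_size:
        return set()
    prefix_len = ngram_size - 1
    # The last (n-1) tokens are the prefix of the n-gram about to be completed.
    current_prefix = tuple(generated[len(generated) - prefix_len:])
    banned = set()
    # Scan every n-gram seen so far; if its prefix matches, its last token is banned.
    for start in range(len(generated) - ngram_size + 1):
        ngram = generated[start:start + ngram_size]
        if tuple(ngram[:prefix_len]) == current_prefix:
            banned.add(ngram[-1])
    return banned


tokens = ["the", "dog", "ran", "and", "the"]
print(banned_next_tokens(tokens, 2))
```

With `ngram_size=2`, the sequence above ends in "the", and "the dog" has already occurred, so "dog" is the one token excluded at the next step.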


In [4]:
# same decoding settings as above, with a different prompt
input_ids = tokenizer.encode('I enjoy walking with my cute dog', return_tensors='pt')

beam_output = model.generate(
    input_ids, 
    max_length=100, 
    num_beams=5, 
    no_repeat_ngram_size=2, 
    early_stopping=True
)

print("Output:\n" + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with him again.

I've been thinking about this for a while now, and I think it's time for me to take a step back and think about what I want to do next. I've always wanted to be a writer, so I thought I'd share my thoughts on how I would go about writing a book about my love of writing. Here are some of the things I


In [8]:
!python run_language_modeling.py \
    --output_dir=output \
    --model_type=gpt2 \
    --model_name_or_path=gpt2 \
    --do_train \
    --train_data_file=nlp_finetuning/data/posts.txt \
    --do_eval \
    --eval_data_file=nlp_finetuning/data/posts.txt

07/02/2020 11:55:55 - INFO - transformers.training_args -   PyTorch: setting up devices
07/02/2020 11:55:55 - INFO - __main__ -   Training/evaluation parameters TrainingArguments(output_dir='output', overwrite_output_dir=False, do_train=True, do_eval=True, do_predict=False, evaluate_during_training=False, per_device_train_batch_size=8, per_device_eval_batch_size=8, per_gpu_train_batch_size=None, per_gpu_eval_batch_size=None, gradient_accumulation_steps=1, learning_rate=5e-05, weight_decay=0.0, adam_epsilon=1e-08, max_grad_norm=1.0, num_train_epochs=3.0, max_steps=-1, warmup_steps=0, logging_dir='runs/Jul02_11-55-55_PAR-ZM1RJLVDQ', logging_first_step=False, logging_steps=500, save_steps=500, save_total_limit=None, no_cuda=False, seed=42, fp16=False, fp16_opt_level='O1', local_rank=-1, tpu_num_cores=None, tpu_metrics_debug=False, debug=False, dataloader_drop_last=False, eval_steps=1000, past_index=-1)
07/02/2020 11:55:55 - INFO - transformers.configuration_utils -   loading configuration

Iteration:  90%|████████████████████████████▊   | 9/10 [16:31<01:46, 106.98s/it]
Iteration: 100%|███████████████████████████████| 10/10 [17:36<00:00, 105.60s/it]
Epoch:  33%|████████████                        | 1/3 [17:36<35:12, 1056.00s/it]
Iteration:   0%|                                         | 0/10 [00:00<?, ?it/s]
Iteration:  10%|███▏                            | 1/10 [01:44<15:39, 104.40s/it]
Iteration:  20%|██████▍                         | 2/10 [03:21<13:37, 102.15s/it]
Iteration:  30%|█████████▌                      | 3/10 [04:58<11:43, 100.57s/it]
Iteration:  40%|█████████████▏                   | 4/10 [06:36<09:58, 99.79s/it]
Iteration:  50%|████████████████▌                | 5/10 [08:10<08:11, 98.26s/it]
Iteration:  60%|███████████████████▊             | 6/10 [09:47<06:30, 97.64s/it]
Iteration:  70%|███████████████████████          | 7/10 [11:22<04:50, 96.85s/it]
Iteration:  80%|██████████████████████████▍      | 8/10 [12:58<03:13, 96.87s/it
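
Because the command above passes `--do_eval`, the run should end with an evaluation pass. Language-modeling example scripts of this kind typically report perplexity, which is the exponential of the mean cross-entropy loss; whether this exact script version prints it is an assumption, since the log above is cut off before that point. The helper name `perplexity` below is hypothetical, and the loss value is made up purely to show the arithmetic.

```python
import math


def perplexity(mean_cross_entropy_loss):
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_loss)


# Hypothetical loss value, for illustration only (not taken from the run above).
example_loss = 3.0
print(f"loss={example_loss:.2f} -> perplexity={perplexity(example_loss):.2f}")
```

Note that the command uses the same file for training and evaluation, so any perplexity it reports measures fit to the training data rather than generalization.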

In [11]:
tokenizer = GPT2Tokenizer.from_pretrained("output")
# add the EOS token as PAD token to avoid warnings
model = GPT2LMHeadModel.from_pretrained("output", pad_token_id=tokenizer.eos_token_id)

In [21]:
# same decoding settings as before, now using the fine-tuned model loaded from "output"
input_ids = tokenizer.encode('Demand forecasting is critical to businesses across almost all industries. It can seem easy,', return_tensors='pt')

beam_output = model.generate(
    input_ids, 
    max_length=100, 
    num_beams=5, 
    no_repeat_ngram_size=2, 
    early_stopping=True
)

print("Output:\n" + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))


Output:
----------------------------------------------------------------------------------------------------
Demand forecasting is critical to businesses across almost all industries. It can seem easy, if not impossible, to predict how a company will perform in a given period of time. However, it is important to understand that forecasting can be done in many different ways.

In this article, we will take a look at some of the most commonly used forecasting tools and how they can help you predict the future of your business. We will also show you how you can use these tools to make better decisions about how to spend your time and money. If you have any questions or comments, please let us know in the comments below.
