In [1]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

In [2]:
prompt = "It was a dark and stormy"
input_ids = tokenizer(prompt).input_ids
input_ids

[1026, 373, 257, 3223, 290, 6388, 88]

In [3]:
for t in input_ids:
    print(f"{tokenizer.decode(t)} \t: {t}")

It 	: 1026
 was 	: 373
 a 	: 257
 dark 	: 3223
 and 	: 290
 storm 	: 6388
y 	: 88


In [4]:
from transformers import AutoModelForCausalLM

gpt2 = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

In [11]:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

In [18]:
outputs = gpt2(input_ids)
outputs.logits.shape

torch.Size([1, 7, 50257])

In [26]:
# The logits at the last position score every vocabulary token as the next token.
final_logits = gpt2(input_ids).logits[0, -1]
print("max = ", final_logits.argmax())
print("min = ", final_logits.argmin())

max =  tensor(1755)
min =  tensor(15272)


In [28]:
print("max : ", tokenizer.decode(1755))
print("min : ", tokenizer.decode(15272))

max :   night
min :   pione


In [29]:
import torch

# topk returns the 10 largest logits together with their vocabulary indices.
top10_logits = torch.topk(final_logits, 10)
for index in top10_logits.indices:
    print(tokenizer.decode(index))

 night
 day
 evening
 morning
 afternoon
 summer
 time
 winter
 weekend
,
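The ranking above is over raw logits. A softmax turns such scores into probabilities, which is what the sampling cells further down draw from. The following is a minimal sketch in plain Python on made-up toy logits (`toy_logits` and the `softmax` helper are illustrative assumptions, not part of the notebook or of `transformers`):

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (not real GPT-2 values).
toy_logits = [2.0, 1.0, 0.1]
probs = softmax(toy_logits)
print(probs)
```

The order of the tokens is preserved: the largest logit always receives the largest probability.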


In [31]:
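# With no sampling or beam arguments, generate() decodes greedily:
# it appends the highest-scoring token at every step.
output_ids = gpt2.generate(input_ids, max_new_tokens=20)
decoded_text = tokenizer.decode(output_ids[0])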

print("input Ids", input_ids[0])
print("output Ids", output_ids)
print("decoded text", decoded_text)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


input Ids tensor([1026,  373,  257, 3223,  290, 6388,   88])
output Ids tensor([[ 1026,   373,   257,  3223,   290,  6388,    88,  1755,    13,   383,
          2344,   373, 19280,    11,   290,   262, 15114,   547,  7463,    13,
           383,  2344,   373, 19280,    11,   290,   262]])
decoded text It was a dark and stormy night. The wind was blowing, and the clouds were falling. The wind was blowing, and the


In [39]:
# Beam search: keep the 5 highest-scoring partial sequences at each step.
beam_output = gpt2.generate(input_ids, max_new_tokens=30, num_beams=5)
print(beam_output)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


tensor([[1026,  373,  257, 3223,  290, 6388,   88, 1755,   13,  198,  198,    1,
         1026,  373, 3223,  290, 6388,   88,  553,  339,  531,   13,  198,  198,
            1, 1026,  373, 3223,  290, 6388,   88,  553,  339,  531,   13,  198,
          198]])


In [40]:
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

It was a dark and stormy night.

"It was dark and stormy," he said.

"It was dark and stormy," he said.




In [49]:
# Try varying repetition_penalty and num_beams to see how the output changes.
beam_output = gpt2.generate(input_ids, repetition_penalty=1.2, max_new_tokens=38, num_beams=5)
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy night.

"I don't think I've ever seen anything like it before," he said. "There's no doubt in my mind that this is going to be one of the

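The cell above shows `repetition_penalty=1.2` breaking the loop seen in the plain beam search output. As a rough sketch of the idea, assuming the CTRL-style rule (positive logits of already-seen tokens are divided by the penalty, negative ones multiplied), the adjustment can be written in plain Python. The helper name and toy values are hypothetical, and the library's actual processor may differ in detail:

```python
def apply_repetition_penalty(logits, seen_ids, penalty):
    """Lower the score of every token id that already appears in the sequence."""
    penalized = list(logits)
    for token_id in set(seen_ids):
        score = penalized[token_id]
        # Dividing a positive score or multiplying a negative one both reduce it
        # when penalty > 1.
        penalized[token_id] = score * penalty if score < 0 else score / penalty
    return penalized

# Hypothetical logits over a 4-token vocabulary; tokens 0 and 1 were already produced.
toy_logits = [3.0, -1.0, 0.5, 2.0]
penalized = apply_repetition_penalty(toy_logits, seen_ids=[0, 1, 0], penalty=1.2)
print(penalized)
```

Tokens that have not appeared yet keep their original scores, so they become relatively more likely.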

In [12]:
from transformers import set_seed

set_seed(70)

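# Pure sampling: top_k=0 disables top-k filtering, so tokens are drawn from the
# full distribution. Note that max_length counts the prompt tokens too.
sampling_output = gpt2.generate(input_ids, max_length=34, do_sample=True, top_k=0)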

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


It was a dark and stormy day until it broke down the big canvas on my sleep station, making me money dilapidated, and, with a big soothing mug


In [13]:
# temperature < 1 sharpens the distribution toward high-probability tokens.
sampling_output = gpt2.generate(input_ids, max_length=34, temperature=0.4, do_sample=True, top_k=0)

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy night, and I was alone. I was in the middle of the night, and I was suddenly awakened bygoodness, and I


In [14]:
# A temperature near 0 makes sampling behave almost like greedy decoding.
sampling_output = gpt2.generate(input_ids, max_length=34, temperature=0.001, do_sample=True, top_k=0)

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy night. The wind was blowing, and the clouds were falling. The wind was blowing, and the clouds were falling. The wind was


In [15]:
# temperature > 1 flattens the distribution, so unlikely tokens are sampled more often.
sampling_output = gpt2.generate(input_ids, max_length=34, temperature=3.0, do_sample=True, top_k=0)

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy radiant. Orb drillsgarPosition serv preserving Imperial licenseium Botiping Fuji complimentbeitlake amended ChurchillFlying set crou 175 dualKing Bucc statue
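The three temperature runs above go from near-greedy text to word salad. A small sketch of why, assuming temperature is applied by dividing the logits before the softmax (the toy logits and helper name are illustrative, not taken from the notebook):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits that have been divided by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
toy_logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(toy_logits, 0.4)  # low temperature
flat = softmax_with_temperature(toy_logits, 3.0)   # high temperature
print("T=0.4:", sharp)
print("T=3.0:", flat)
```

At low temperature nearly all probability mass sits on the top token, while at high temperature the three tokens end up much closer to equally likely.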


In [16]:
# Top-k sampling: sample only among the 10 highest-scoring tokens at each step.
sampling_output = gpt2.generate(input_ids, max_length=40, do_sample=True, top_k=10)

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy morning. I had a long time to spare and I had to be careful about going too far. I went back to the bathroom and tried to take a shower.
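Top-k sampling, as used in the cell above, can be pictured as masking every logit outside the k largest before the softmax. The following is a plain-Python sketch on toy values (the helper names and logits are assumptions for illustration only):

```python
import math

def top_k_filter(logits, k):
    """Keep the k largest logits and set the rest to -inf (ties at the cutoff are kept)."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits over a 4-token vocabulary.
toy_logits = [2.0, 1.0, 0.1, -1.0]
filtered = top_k_filter(toy_logits, k=2)
probs = softmax(filtered)
print(filtered)
print(probs)
```

Masked tokens end up with probability exactly zero, so they can never be sampled, and the surviving probabilities are renormalized among the top k.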


In [26]:
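# Top-p (nucleus) sampling: sample from the smallest set of tokens whose
# cumulative probability reaches 0.94; top_k=0 turns top-k filtering off.
sampling_output = gpt2.generate(input_ids, max_length=40, do_sample=True, top_k=0, top_p=0.94)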

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy night. The dog stretched slightly out to the side while my friend cried "A little boy..." It was at this point that the last bass unearthing started,
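Nucleus (top-p) sampling, used in the cell above, keeps a variable number of tokens: the most likely ones are added until their cumulative probability reaches `p`. A minimal sketch in plain Python, with a hypothetical helper and toy probabilities rather than real model output:

```python
def top_p_filter(probs, p):
    """Keep the most likely tokens until their cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    # Renormalize the kept tokens and zero out everything else.
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

# Hypothetical next-token probabilities over a 4-token vocabulary.
toy_probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(toy_probs, p=0.94)
print(nucleus)
```

With these values three tokens are needed to reach 0.94, so only the least likely token is dropped. A more peaked distribution would keep fewer tokens for the same `p`.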


In [30]:
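# penalty_alpha is the contrastive-search setting. As far as we know, transformers
# only selects contrastive search with do_sample=False and top_k > 1, so with
# do_sample=True this call likely falls back to ordinary sampling.
sampling_output = gpt2.generate(input_ids, max_length=40, do_sample=True, penalty_alpha=0.6)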

print(tokenizer.decode(sampling_output[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a dark and stormy night, so the only thing that can really explain the strange silence after they had finished eating lunch was just that there were other guests.

The first guest was
