In [1]:
gpt_2_english_path = "/data/pretrained_models/gpt2"

In [3]:
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained(gpt_2_english_path)

model = GPT2LMHeadModel.from_pretrained(gpt_2_english_path, pad_token_id=tokenizer.eos_token_id)
# pad_token_id is set manually here to suppress a warning. The same effect can be achieved by passing `pad_token_id=50256` or `pad_token_id=model.config.eos_token_id` to `generate()`.

In [4]:
toks = tokenizer("I want to say", padding=False, return_tensors="pt")

In [19]:
# Greedy search: always pick the single most likely next token.
greedy_output = model.generate(**toks, max_length=50)
tokenizer.decode(greedy_output[0], skip_special_tokens=True)

'I want to say that I\'m not a fan of the idea of a "big-budget" movie. I\'m not a fan of the idea of a "big-budget" movie. I\'m not a fan of the idea of a "'

In [20]:
# Beam search: keep the 5 most likely hypotheses at each step instead of only the best token.
beam_output = model.generate(
    **toks, 
    max_length=50, 
    num_beams=5, 
    early_stopping=True
)
tokenizer.decode(beam_output[0], skip_special_tokens=True)

'I want to say thank you to all of you who have supported me over the years. I want to say thank you to all of you who have supported me over the years. I want to say thank you to all of you who have supported me'
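
The difference between the greedy and beam search cells above can be sketched on a toy model. This is a minimal illustration, not the `transformers` implementation: the `TOY_MODEL` table and its probabilities are made up, and `beam_search` is a hypothetical helper. With `num_beams=1` the search reduces to greedy decoding.

```python
import math

# Hypothetical next-token probabilities, invented purely for illustration.
TOY_MODEL = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.5, "drives": 0.3, "turns": 0.2},
}


def beam_search(start, steps, num_beams):
    """Return the best (tokens, log_prob) pair found with `num_beams` hypotheses."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            # Extend every surviving hypothesis by every possible next token.
            for nxt, p in TOY_MODEL[tokens[-1]].items():
                candidates.append((tokens + [nxt], score + math.log(p)))
        # Keep only the `num_beams` highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0]


greedy_tokens, greedy_score = beam_search("The", steps=2, num_beams=1)
beam_tokens, beam_score = beam_search("The", steps=2, num_beams=2)
print(greedy_tokens, round(math.exp(greedy_score), 2))
print(beam_tokens, round(math.exp(beam_score), 2))
```

In this toy table, greedy decoding commits to "nice" (0.5) and ends with a sequence of probability 0.2, while two beams keep "dog" alive long enough to find "The dog has" with probability 0.36.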

In [21]:
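# no_repeat_ngram_size=2 forbids any 2-gram from appearing twice, which breaks the repetition loop seen above.
beam_output = model.generate(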
    **toks, 
    max_length=50, 
    num_beams=5, 
    early_stopping=True,
    no_repeat_ngram_size=2
)
tokenizer.decode(beam_output[0], skip_special_tokens=True)

'I want to say thank you to all of you who have supported me over the years. I am so grateful for all the support you have given me, and I hope you will continue to do the same.\n\nThank you all for your support'
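
The effect of `no_repeat_ngram_size` can be sketched as a per-step ban list: any token that would complete an n-gram already present in the output is excluded. The helper below is a hypothetical illustration written for this note. It uses token strings for readability, whereas the library works on token ids.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would complete an n-gram already in `generated`."""
    if ngram_size < 1 or len(generated) + 1 < ngram_size:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# "I want" already occurred, so after a second "I" the token "want" is banned.
print(banned_next_tokens(["I", "want", "to", "say", "I"], ngram_size=2))
```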

In [22]:
# num_return_sequences=5 returns the 5 highest-scoring beams (it must not exceed num_beams).
beam_outputs = model.generate(
    **toks, 
    max_length=50, 
    num_beams=5, 
    early_stopping=True,
    no_repeat_ngram_size=2,
    num_return_sequences=5
)
for i, beam_output in enumerate(beam_outputs):
    print(f"{i}: {tokenizer.decode(beam_output, skip_special_tokens=True)}")

0: I want to say thank you to all of you who have supported me over the years. I am so grateful for all the support you have given me, and I hope you will continue to do the same.

Thank you all for your support
1: I want to say thank you to all of you who have supported me over the years. I am so grateful for all the support you have given me, and I hope you will continue to do the same.

Thank you for your support.
2: I want to say thank you to all of you who have supported me over the years. I am so grateful for all the support you have given me, and I hope you will continue to do the same.

Thank you for your continued support
3: I want to say thank you to all of you who have supported me over the years. I am so grateful for all the support you have given me, and I hope you will continue to do the same. Thank you.

Thank you for
4: I want to say thank you to all of you who have supported me over the years. I am so grateful for all the support you have given me, and I hope you will 

In [24]:
import torch

# Fix the seed so that the sampled outputs are reproducible.
torch.random.manual_seed(20240315)

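# Pure sampling: top_k=0 disables top-k filtering, so the next token is drawn from the full distribution.
sample_output = model.generate(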
    **toks, 
    do_sample=True, 
    max_length=50, 
    top_k=0
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

I want to say kid this — I am talking about you, right? So, yeah — Tappy Charlie? Help me with this work! We're going to get a place in your joint. We're going to get a place.




In [26]:
# temperature=0.7 sharpens the distribution, making high-probability tokens even more likely.
sample_output = model.generate(
    **toks, 
    do_sample=True, 
    max_length=50, 
    top_k=0,
    temperature=0.7
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

I want to say I would be a good president if I could."

Follow @politico
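
As a rough sketch of what `temperature` does, the logits are divided by the temperature before the softmax. The function name and the toy logits below are assumptions made for illustration. They are not taken from the library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
probs_default = softmax_with_temperature(toy_logits, 1.0)
probs_sharp = softmax_with_temperature(toy_logits, 0.7)
print(probs_default)
print(probs_sharp)
```

A temperature below 1 moves probability mass toward the top-scoring token, which is why the sampled text tends to become more coherent and less surprising.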


In [27]:
# Top-k sampling: sample only from the 50 most likely next tokens.
sample_output = model.generate(
    **toks, 
    do_sample=True, 
    max_length=50, 
    top_k=50
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

I want to say 'Thank you for listening to me talk,'" he says. "I was really pleased that I received the letter."
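
Top-k filtering can be sketched as: keep the `k` most probable tokens, discard the rest, and renormalise before sampling. The helper and the toy distribution below are hypothetical and only meant to show the idea. The library applies the filter to logits over the full vocabulary.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalise their probabilities."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}


# Made-up next-token distribution.
toy_probs = {"the": 0.5, "a": 0.25, "cat": 0.15, "zebra": 0.1}
filtered_k = top_k_filter(toy_probs, k=2)
print(filtered_k)
```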


In [28]:
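# Top-p (nucleus) sampling: sample from the smallest set of tokens whose cumulative probability reaches 0.92.
# top_k=0 disables top-k filtering so that only top-p applies.
sample_output = model.generate(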
    **toks, 
    do_sample=True, 
    max_length=50, 
    top_p=0.92, 
    top_k=0
)

print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

I want to say how honored I am to attend your farewell lecture, honor some of my colleagues who will come to Earth in the next few years to lead the next phase of our strategic work here at our company. And do not hesitate to come here
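
Top-p (nucleus) filtering differs from top-k in that the number of kept tokens adapts to the shape of the distribution. A minimal sketch, assuming a hypothetical helper and a made-up toy distribution rather than the library's own code:

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


# Made-up next-token distribution.
toy_probs = {"the": 0.5, "a": 0.25, "cat": 0.15, "zebra": 0.1}
nucleus_small = top_p_filter(toy_probs, top_p=0.7)
nucleus_large = top_p_filter(toy_probs, top_p=0.92)
print(nucleus_small)
print(nucleus_large)
```

With `top_p=0.7` only two tokens survive in this toy example, while `top_p=0.92` keeps all four, because the first three sum to only 0.9.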


In [29]:
# Combine top-k and top-p filtering and draw three independent samples.
sample_outputs = model.generate(
    **toks,
    do_sample=True,
    max_length=50,
    top_k=50,
    top_p=0.95,
    num_return_sequences=3
)

print("Output:\n" + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    print(f"{i}: {tokenizer.decode(sample_output, skip_special_tokens=True)}")

Output:
----------------------------------------------------------------------------------------------------
0: I want to say my name to you! I'm sure you already know what it is, but I can't help but hear the same from all of you. I wanted to speak to you about my family in particular. I know it is just
1: I want to say, if you haven't been paying attention, I should ask you this:

Did you see this? The following image of this image (a post titled The War of the World) shows how China is now fighting an armed
2: I want to say it's something that has had a huge impact on the football team for me as a kid, so I'll take it as a sign of respect for our coaches.

"Obviously I've always loved this game. I've
