# [How to generate text: using different decoding methods for language generation with Transformers](https://huggingface.co/blog/how-to-generate)

Let's generate text using greedy search, beam search, Top-K sampling, and Top-p sampling.

In [1]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# add the EOS token as PAD token to avoid warnings
model = GPT2LMHeadModel.from_pretrained('gpt2', pad_token_id=tokenizer.eos_token_id)





## Greedy Search

At every step, greedy search selects the candidate token with the highest probability.

![greedy](https://huggingface.co/blog/assets/02_how-to-generate/greedy_search.png)

In [2]:
# encode context the generation is conditioned on
input_ids = tokenizer.encode('I enjoy walking with my cute dog', return_tensors='pt')

# generate text until the output length (which includes the context length) reaches 50
greedy_output = model.generate(input_ids, max_length=50)
#print(greedy_output.size())
print('Output:\n' + 100*'-')
print(tokenizer.decode(greedy_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my dog.

I'm not sure if I'll


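Besides repeating itself, greedy search can miss the most probable overall sequence, because a high-probability token may be hidden behind a lower-probability one.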
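To make this concrete, here is a minimal sketch in plain Python. The `NEXT_TOKEN_PROBS` table is a hypothetical toy distribution chosen for illustration (it is not produced by GPT2), and `greedy_decode` and `exhaustive_decode` are illustrative helper names, not part of `transformers`.

```python
# Hypothetical next-token probabilities, keyed by the previous token.
NEXT_TOKEN_PROBS = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.3, "drives": 0.5, "turns": 0.2},
}


def greedy_decode(start, steps):
    """Pick the single most probable next token at every step."""
    tokens, prob = [start], 1.0
    for _ in range(steps):
        candidates = NEXT_TOKEN_PROBS[tokens[-1]]
        best = max(candidates, key=candidates.get)
        prob *= candidates[best]
        tokens.append(best)
    return tokens, prob


def exhaustive_decode(start, steps):
    """Score every possible continuation and return the most probable one."""
    paths = [([start], 1.0)]
    for _ in range(steps):
        paths = [
            (tokens + [tok], prob * p)
            for tokens, prob in paths
            for tok, p in NEXT_TOKEN_PROBS[tokens[-1]].items()
        ]
    return max(paths, key=lambda path: path[1])


greedy_tokens, greedy_prob = greedy_decode("The", steps=2)
best_tokens, best_prob = exhaustive_decode("The", steps=2)
print(" ".join(greedy_tokens), round(greedy_prob, 2))  # The nice woman 0.2
print(" ".join(best_tokens), round(best_prob, 2))      # The dog has 0.36
```

Greedy search commits to "nice" (0.5) and never sees that "dog" (0.4) leads to "has" (0.9), which gives a more probable sequence overall. Exhaustive scoring is only feasible on a toy table like this one.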

## Beam Search

At every step, beam search keeps the `num_beams` most probable candidate sequences instead of just one.

![beam](https://huggingface.co/blog/assets/02_how-to-generate/beam_search.png)
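The idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the `transformers` implementation: the `NEXT_TOKEN_PROBS` table is hypothetical, `beam_search` is an illustrative helper name, and raw probabilities are multiplied where a real implementation would typically sum log-probabilities.

```python
# Hypothetical next-token probabilities, keyed by the previous token.
NEXT_TOKEN_PROBS = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.3, "drives": 0.5, "turns": 0.2},
}


def beam_search(start, steps, num_beams):
    """Keep only the `num_beams` most probable partial sequences per step."""
    beams = [([start], 1.0)]
    for _ in range(steps):
        # Expand every surviving beam by every possible next token.
        expanded = [
            (tokens + [tok], prob * p)
            for tokens, prob in beams
            for tok, p in NEXT_TOKEN_PROBS[tokens[-1]].items()
        ]
        # Prune back down to the best `num_beams` candidates.
        expanded.sort(key=lambda beam: beam[1], reverse=True)
        beams = expanded[:num_beams]
    return beams


beams = beam_search("The", steps=2, num_beams=2)
for tokens, prob in beams:
    print(" ".join(tokens), round(prob, 2))
# The dog has 0.36
# The nice woman 0.2
```

With `num_beams=1` the function reduces to greedy search and returns "The nice woman"; with two beams, "The dog" survives the first step and wins at the second.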

In [7]:
# activate beam search and early_stopping
beam_output = model.generate(
    input_ids,
    max_length=50,
    num_beams=5,
    early_stopping=True
)

print("Output:\n" + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with him again.

I'm not sure if I'll ever be able to walk with him again. I'm not sure if I'll


This is still not very satisfying, since the output repeats the same phrases, so let's prevent n-grams from appearing more than once.

In [9]:
# set no_repeat_ngram_size to 2
beam_output = model.generate(
    input_ids,
    max_length=50,
    num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True
)

print("Output:\n" + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with him again.

I've been thinking about this for a while now, and I think it's time for me to take a break


Much better, but n-gram blocking must be used with care: with a 2-gram limit, a name such as `New York` could appear only once in the whole text.
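A rough sketch of how such blocking can work, and why `New York` becomes a problem. The helper name `banned_next_tokens` is hypothetical and operates on plain word lists; it only mirrors the idea behind `no_repeat_ngram_size`, not the library's actual code.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return the tokens that would complete an n-gram already seen in `tokens`."""
    if ngram_size < 1 or len(tokens) < ngram_size:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


generated = "I love New York and New".split()
print(banned_next_tokens(generated, ngram_size=2))  # {'York'}
```

Because the 2-gram `New York` already occurred, `York` is banned right after the second `New`, even though it is the natural continuation.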

The `num_return_sequences` option is also useful for comparing the top beams. Just make sure that `num_return_sequences <= num_beams`!

In [11]:
# set num_return_sequences > 1
beam_outputs = model.generate(
    input_ids,
    max_length=50,
    num_beams=5,
    no_repeat_ngram_size=2,
    num_return_sequences=5,
    early_stopping=True
)

# now we have 5 output sequences
print("Output:\n" + 100 * '-')
for i, beam_output in enumerate(beam_outputs):
  print("{}: {}".format(i, tokenizer.decode(beam_output, skip_special_tokens=True)))

Output:
----------------------------------------------------------------------------------------------------
0: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with him again.

I've been thinking about this for a while now, and I think it's time for me to take a break
1: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with him again.

I've been thinking about this for a while now, and I think it's time for me to get back to
2: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with her again.

I've been thinking about this for a while now, and I think it's time for me to take a break
3: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with her again.

I've been thinking about this for a while now, and I think it's time for me to get back to
4: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with him again.

I've been thinking about 

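As you can see, the beams barely differ from one another... The following figure explains why: beam search keeps choosing high-probability words, whereas human-written text regularly uses less predictable ones.

![random](https://blog.fastforwardlabs.com/images/2019/05/Screen_Shot_2019_05_08_at_3_06_36_PM-1557342561886.png)

So let's introduce some randomness.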

## Sampling

In [12]:
# set seed to reproduce result
torch.manual_seed(0)

# activate sampling and deactivate top_k by setting top_k to 0
sample_output = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_k=0
)

print(sample_output.size())
print("Output:\n" + 100 * '-')
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

torch.Size([1, 50])
Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog Nimmy the Cleveland GP because I want to know what's good for her. The day after birth, she's wanted to see your doctor but you couldn't get to her brain by taking bath and she's suddenly


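It looks plausible at first glance, but on closer reading many of the sentences make no sense. So let's introduce temperature to sharpen the distribution, pushing likely tokens toward higher probability and unlikely tokens toward lower probability.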
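The effect of temperature can be sketched as dividing the logits by `temperature` before the softmax. The logits below are hypothetical values picked for illustration, and `softmax_with_temperature` is an illustrative helper, not a `transformers` function.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; temperature < 1 sharpens the result."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
plain = softmax_with_temperature(logits, temperature=1.0)
sharp = softmax_with_temperature(logits, temperature=0.7)
print([round(p, 3) for p in plain])
print([round(p, 3) for p in sharp])
```

With `temperature=0.7` the most likely token gains probability mass and the least likely one loses it, so sampling picks odd words less often.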

In [17]:
# set seed to reproduce result
torch.manual_seed(1)

# activate sampling, deactivate top_k by setting it to 0, and sharpen the distribution with temperature=0.7
sample_output = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_k=0,
    temperature=0.7
)


print("Output:\n" + 100 * '-')
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, and I also love to play with my dog. I'd also like to give a special shout out at our Irish Pub, as well as at this year's Edinburgh Fashion Show for the free food.




## Top-K Sampling

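Let's try Top-K sampling, which first keeps only the K most likely next tokens and then redistributes the probability mass among them. GPT2 adopted this sampling scheme.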
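A minimal sketch of that filter-then-renormalize step, using a hypothetical four-word distribution. `top_k_filter` is an illustrative helper name, and the final draw uses Python's `random` module rather than anything from `transformers`.

```python
import random


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}


# Hypothetical next-token distribution.
probs = {"nice": 0.5, "dog": 0.25, "car": 0.15, "house": 0.1}
filtered = top_k_filter(probs, k=2)
print(filtered)  # only "nice" and "dog" remain, rescaled to sum to 1

# Sample the next token from the filtered distribution.
rng = random.Random(0)
sampled = rng.choices(list(filtered), weights=list(filtered.values()), k=1)[0]
print(sampled)
```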

In [20]:
# set seed to reproduce result
torch.manual_seed(0)

# activate sampling and set top_k to 50
sample_output = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_k=50,
)


print("Output:\n" + 100 * '-')
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog – all in less than 30 minutes. He's been to the vet with me and I'm confident I'll be able to keep him after about 2 weeks and is on his best behavior and health.

So


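Not bad, but Top-K sampling has a drawback: K is fixed, so the number of candidate words cannot adapt dynamically to how flat or how peaked the next-word distribution is. This motivates Top-p sampling.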

## Top-p (nucleus) sampling

Top-p sampling samples from the smallest set of most likely words whose cumulative probability exceeds p. It is enabled through the `top_p` argument.
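As a rough sketch, the nucleus can be built by adding words in order of decreasing probability until the running total exceeds p. The two distributions below are hypothetical, and `top_p_filter` is an illustrative helper rather than the library's implementation.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability exceeds p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative > p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {tok: prob / cumulative for tok, prob in kept}


# Hypothetical distributions: one flat, one sharply peaked.
flat = {"a": 0.3, "b": 0.25, "c": 0.25, "d": 0.2}
peaked = {"a": 0.9, "b": 0.05, "c": 0.03, "d": 0.02}
print(len(top_p_filter(flat, 0.92)))    # 4
print(len(top_p_filter(peaked, 0.92)))  # 2
```

Unlike a fixed K, the candidate set grows when the distribution is flat and shrinks when one word dominates.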

In [22]:
# set seed to reproduce result
torch.manual_seed(0)

# activate sampling, deactivate top_k by setting it to 0, and sample only from the 92% most likely words
sample_output = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_p=0.92,
    top_k=0,
)


print("Output:\n" + 100 * '-')
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog Yang from time to time as I make sure I don't need to talk to her. She knows I'm in love and she wants to try and figure out when to stop crying and play with me.




Now the output reads quite plausibly. Finally, let's increase the number of returned sequences.

In [23]:
# set seed to reproduce result
torch.manual_seed(0)

# activate sampling with top_p=0.92, deactivate top_k by setting it to 0, and return 3 sequences
sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_p=0.92,
    top_k=0,
    num_return_sequences=3
)


print("Output:\n" + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    print(f'{i}: {tokenizer.decode(sample_output, skip_special_tokens=True)}')

Output:
----------------------------------------------------------------------------------------------------
0: I enjoy walking with my cute dog Yang inside a space that's ready for you in between the walls. When you squeeze his tender flesh out between your teeth, he'll pop out of his thong and jump onto your lap. We should practice this
1: I enjoy walking with my cute dog at least for a week. One of my dogs even takes me to the playground where I play, and if she plays well, I just give her money and go to the park. This is fun for a friend
2: I enjoy walking with my cute dog in our garden, but I also love playing with them because they are very much part of my family.

Once, I will go out to dinner and while enjoying his fragrance and his breath, I got a


## The End!