
Support min_new_tokens generation config in pytorch engine #1096

Merged
13 commits merged, Feb 29, 2024

Conversation

grimoire
Collaborator

@grimoire grimoire commented Feb 1, 2024

tested on chat and benchmark_pytorch_throughput

Collaborator

@AllentDan AllentDan left a comment


Is this expected behavior of the model for min_new_tokens?
[screenshot]

@grimoire
Collaborator Author

@AllentDan Fixed: eos is now ignored until min_new_tokens is reached.

@AllentDan
Collaborator

[screenshot]

Got this.

@grimoire
Collaborator Author

[screenshot]

Got this.

@lvhan028 is this OK?

@lvhan028
Collaborator

@irexyc What is the behavior of transformers and turbomind when min_new_tokens is set?

@irexyc
Collaborator

irexyc commented Feb 19, 2024

what are the behavior of transformers and turbomind if min_new_token is set?

They set the score of the eos token to -inf while the generated token length is less than min_new_tokens:

https://github.com/huggingface/transformers/blob/main/src/transformers/generation/logits_process.py#L160
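The behavior described above can be sketched as follows. This is a simplified, framework-free sketch operating on a plain list of scores; the actual transformers processor linked above works on tensors and handles multiple eos ids.

```python
import math

def mask_eos_before_min_new_tokens(scores, generated_len, min_new_tokens,
                                   eos_token_id):
    """Set the eos token's score to -inf until min_new_tokens tokens
    have been generated, so eos can never be sampled or win argmax.
    Simplified sketch, not the actual transformers implementation."""
    if generated_len < min_new_tokens:
        scores = scores.copy()
        scores[eos_token_id] = -math.inf
    return scores

# With fewer than min_new_tokens generated, eos (id 2 here) is suppressed
# even though it has the highest raw score.
scores = [0.1, 0.3, 5.0, 0.2]
masked = mask_eos_before_min_new_tokens(scores, generated_len=3,
                                        min_new_tokens=100, eos_token_id=2)
```

Once `generated_len` reaches `min_new_tokens`, the scores pass through unchanged and the model is free to emit eos.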

@lvhan028
Collaborator

@AllentDan @irexyc I think PyTorch engine did the right thing. When min_new_tokens is set, eos_id and stop_words tokens should be banned when the generated token number is less than min_new_tokens

@AllentDan
Collaborator

@AllentDan @irexyc I think PyTorch engine did the right thing. When min_new_tokens is set, eos_id and stop_words tokens should be banned when the generated token number is less than min_new_tokens

Is it possible that stop_words are banned but other candidate tokens are still generated when n>1? To me, min_new_tokens is the minimum number of newly generated tokens, not counting tokens like <eoa> or <Bot> that are not shown to users.

@lvhan028
Collaborator

stop_words should be banned. They cannot be generated.
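Under that reading, banning stop words works the same way as banning eos: suppress every banned token id until the minimum length is reached. A sketch, assuming a flat list of banned ids (eos plus the lead token of each stop word); the token ids and this representation are illustrative assumptions, not lmdeploy's actual data structures.

```python
import math

def ban_tokens(scores, generated_len, min_new_tokens, banned_token_ids):
    """Suppress eos and stop-word lead tokens while the generated
    length is below min_new_tokens. Hedged sketch only."""
    if generated_len < min_new_tokens:
        scores = scores.copy()
        for tok in banned_token_ids:
            scores[tok] = -math.inf
    return scores

# eos_id=2 plus stop-word lead tokens 0 and 3 are all suppressed,
# leaving only ordinary tokens eligible.
out = ban_tokens([4.0, 1.0, 9.0, 6.0], generated_len=10,
                 min_new_tokens=100, banned_token_ids=[2, 0, 3])
```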

@lvhan028
Collaborator

from lmdeploy import pipeline, GenerationConfig, PytorchEngineConfig

# test min_new_tokens with the pytorch engine
pipe = pipeline('/workspace/models-140/InternLM/internlm2-chat-7b/',
                backend_config=PytorchEngineConfig())

response = pipe('hi', gen_config=GenerationConfig(min_new_tokens=100))
print(response)

The generated response is:

Response(text='你好!有什么我可以帮助你的吗?', generate_token_len=8, input_token_len=103, session_id=0, finish_reason='stop')

I think generate_token_len (8 here) shouldn't be less than min_new_tokens (100).

@grimoire
Collaborator Author

@lvhan028
https://github.com/grimoire/lmdeploy/blob/637cc512027b41986fcfabba8f2011e682aa37e5/lmdeploy/messages.py#L87

The gen_config conversion does not include min_new_tokens, so the setting is silently dropped before it reaches the engine.
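A minimal illustration of this kind of omission: when the conversion from the user-facing config to the engine config skips a field, the engine silently falls back to its default. Class and field names here are hypothetical, not lmdeploy's actual definitions.

```python
from dataclasses import dataclass

@dataclass
class UserGenConfig:          # hypothetical user-facing config
    max_new_tokens: int = 512
    min_new_tokens: int = 0

@dataclass
class EngineGenConfig:        # hypothetical engine-side config
    max_new_tokens: int = 512
    min_new_tokens: int = 0

def convert_buggy(cfg: UserGenConfig) -> EngineGenConfig:
    # min_new_tokens is dropped, so the engine never sees it
    return EngineGenConfig(max_new_tokens=cfg.max_new_tokens)

def convert_fixed(cfg: UserGenConfig) -> EngineGenConfig:
    # forward every generation field, including min_new_tokens
    return EngineGenConfig(max_new_tokens=cfg.max_new_tokens,
                           min_new_tokens=cfg.min_new_tokens)

user = UserGenConfig(min_new_tokens=100)
```

With the buggy conversion the engine sees `min_new_tokens=0` and stops at eos early, which matches the short response observed above.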

@lvhan028 lvhan028 requested a review from irexyc February 29, 2024 12:03
@lvhan028 lvhan028 changed the title Support torch min new tokens Support min_new_tokens generation config in pytorch engine Feb 29, 2024
@lvhan028 lvhan028 merged commit 16da6ae into InternLM:main Feb 29, 2024
4 checks passed