
fix returning logits in prefill phase of pytorch engine #1209

Merged
8 commits merged into InternLM:main on Mar 4, 2024

Conversation

grimoire (Collaborator)

Fix pytorch engine decode.

lvhan028 requested review from irexyc and lvhan028, March 1, 2024 07:20
lvhan028 (Collaborator) commented Mar 1, 2024

Please resolve the conflicts.

irexyc (Collaborator) commented Mar 1, 2024

from lmdeploy.pytorch.engine import Engine
import numpy as np
m = Engine('/nvme/shared/llama2/huggingface/llama-2-7b-chat')
g = m.create_instance()

# sequence_start seems to have no effect; the result differs on every call.
g.decode([[11,12,13,14,15]], sequence_end=False)

grimoire (Collaborator, Author) commented Mar 1, 2024

@irexyc What exactly should the behavior of sequence_start be? Should it use a new session_id, or clear the old cache and reuse the existing one?

irexyc (Collaborator) commented Mar 1, 2024

# First call: session_id=0, session not ended.
g.decode([[11,12,13,14,15]], sequence_end=False)

# On the second call, session_id=0: the previous request was never ended and the current request has sequence_start=True, so the session_id conflicts. In this case turbomind clears the history for session_id=0, but the pytorch engine does not seem to clear it (see the sketch after this snippet).
g.decode([[11,12,13,14,15]], sequence_end=False)
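
For clarity, a minimal sketch of the conflict handling described above. The names sessions and start_session are hypothetical and only illustrate the intended behavior; this is not lmdeploy's actual implementation.

# Hypothetical sketch of session-conflict handling: when a new sequence starts
# under an id that still has un-ended history, drop that history first.
def start_session(sessions, session_id, sequence_start=True):
    if sequence_start and session_id in sessions:
        # The previous request with this session_id was never ended, so its
        # cached history would otherwise leak into the new sequence. This is
        # the cleanup turbomind performs and the pytorch engine was missing.
        del sessions[session_id]
    sessions.setdefault(session_id, [])
    return sessions[session_id]

cache = {}
start_session(cache, 0)  # first request, fresh cache
start_session(cache, 0)  # conflicting restart: old history is dropped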

grimoire (Collaborator, Author) commented Mar 1, 2024

Fixed. Now when sequence_start=True, the engine tries to end the existing session first.
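
For reference, a minimal sketch of how a continued session is expected to look after this change. Passing sequence_start and sequence_end together in one decode call is an assumption inferred from the snippets above, not a documented signature.

# Open a new session for the prefill and keep it alive.
r1 = g.decode([[11, 12, 13, 14, 15]], sequence_start=True, sequence_end=False)
# Continue the same session with the next tokens, then close it.
r2 = g.decode([[16, 17, 18, 19, 20]], sequence_start=False, sequence_end=True)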

irexyc (Collaborator) commented Mar 1, 2024

r1 = g.decode([[11,12,13,14,15]], sequence_end=False)
r2 = g.decode([[16,17,18,19,20]], sequence_start=False)

# r2 differs on every call.
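
A hedged way to check this regression once fixed: run the same prefill twice in independent, fully ended sessions and compare the returned logits. That decode's return value can be converted with np.asarray is an assumption; the thread does not show the return type.

# Same prompt in two independent sessions that are ended immediately;
# the prefill logits should match if decode is deterministic.
a = g.decode([[11, 12, 13, 14, 15]], sequence_end=True)
b = g.decode([[11, 12, 13, 14, 15]], sequence_end=True)
assert np.allclose(np.asarray(a), np.asarray(b))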

lvhan028 changed the title from "fix decode" to "fix returning logits in prefill phase of pytorch engine" on Mar 4, 2024
lvhan028 merged commit 5aaeb5c into InternLM:main on Mar 4, 2024
4 of 5 checks passed
3 participants