
fix local variable 'response' referenced before assignment in async_engine.generate #1513

Merged
1 commit merged into InternLM:main on Apr 28, 2024

Conversation

@irexyc (Collaborator) commented on Apr 28, 2024

Motivation

When the input is long or the chat template doesn't match, the first output token id produced by the turbomind backend may be eos_id or one of the stop_words. In this case, async_engine.generate raises the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/chenxin/ws3/vl/lmdeploy/serve/async_engine.py", line 325, in __call__
    return self.batch_infer(prompts,
  File "/home/chenxin/ws3/vl/lmdeploy/serve/async_engine.py", line 441, in batch_infer
    _get_event_loop().run_until_complete(gather())
  File "/home/chenxin/miniconda3/envs/38/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/chenxin/ws3/vl/lmdeploy/serve/async_engine.py", line 436, in gather
    await asyncio.gather(*[
  File "/home/chenxin/ws3/vl/lmdeploy/serve/async_engine.py", line 423, in _inner_call
    async for out in generator:
  File "/home/chenxin/ws3/vl/lmdeploy/serve/async_engine.py", line 648, in generate
    if not response.endswith('�'):
UnboundLocalError: local variable 'response' referenced before assignment
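
The root cause is a classic Python pitfall: response is only bound inside the async for body, so when the backend finishes before yielding any text, the check after the loop references an unbound name. A minimal self-contained sketch of the failure shape (token_stream and generate are hypothetical stand-ins, not LMDeploy's actual code):

import asyncio

async def token_stream(n):
    # hypothetical backend: yields n decoded chunks, possibly zero when
    # the first token is already eos_id or one of the stop_words
    for i in range(n):
        yield f'token {i}'

async def generate(n):
    async for out in token_stream(n):
        response = out                  # only bound if the loop body runs
    if not response.endswith('�'):      # UnboundLocalError when n == 0
        pass

asyncio.run(generate(1))  # fine
asyncio.run(generate(0))  # UnboundLocalError: local variable 'response' referenced before assignment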

Reproduce

from lmdeploy import pipeline
pipe = pipeline('/mnt/140/Qwen/Qwen1.5-7B-Chat')
pipe('hello ' * 7500)  # very long prompt: the first generated token is already a stop token
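
The one-commit fix presumably just gives response a safe initial binding before the streaming loop. Continuing the hypothetical sketch above (again, not the actual diff):

async def generate(n):
    response = ''                       # bound up front, so the name always exists
    async for out in token_stream(n):
        response = out
    if not response.endswith('�'):      # well-defined even when nothing was yielded
        pass

With that change the empty-output case falls through cleanly and the engine can return a Response with empty text, as in the verification below.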

@zhulinJulia24 (Collaborator) commented:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[WARNING] gemm_config.in is not found; using default GEMM algo
Response(text='', generate_token_len=1, input_token_len=7520, session_id=0, finish_reason='stop', token_ids=[], logprobs=None)

fixed

@zhulinJulia24 (Collaborator) left a comment:


lgtm

@lvhan028 merged commit d72432e into InternLM:main on Apr 28, 2024
5 checks passed