
BUG: qwen1.5 gptq int8 errored #1046

Closed
qinxuye opened this issue Feb 28, 2024 · 4 comments
Labels: bug (Something isn't working), stale
Milestone

Comments

@qinxuye
Contributor

qinxuye commented Feb 28, 2024

Describe the bug

A clear and concise description of what the bug is.

To Reproduce

To help us reproduce this bug, please provide the information below:

  1. Your Python version.
  2. The version of xinference you use.
  3. Versions of crucial packages.
  4. Full stack of the error.
  5. Minimized code to reproduce the error.
2024-02-28 03:45:45,757 xinference.api.restful_api 188628 ERROR    Chat completion stream got an error: [address=0.0.0.0:43203, pid=188725] probability tensor contains either `inf`, `nan` or element < 0
Traceback (most recent call last):
  File "/new_data2/xuyeqin-data/projects/inference/xinference/api/restful_api.py", line 1257, in stream_results
    async for item in iterator:
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 431, in __xoscar_next__
    raise e
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 417, in __xoscar_next__
    r = await asyncio.to_thread(_wrapper, gen)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
      ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 402, in _wrapper
    return next(_gen)
  File "/new_data2/xuyeqin-data/projects/inference/xinference/core/model.py", line 257, in _to_json_generator
    for v in gen:
  File "/new_data2/xuyeqin-data/projects/inference/xinference/model/llm/utils.py", line 470, in _to_chat_completion_chunks
    for i, chunk in enumerate(chunks):
    ^^^^^^^^^^^^^^^^^
  File "/new_data2/xuyeqin-data/projects/inference/xinference/model/llm/pytorch/core.py", line 253, in generator_wrapper
    for completion_chunk, completion_usage in generate_stream(
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
    ^^^^^^^^^^^^^^^^^
  File "/new_data2/xuyeqin-data/projects/inference/xinference/model/llm/pytorch/utils.py", line 214, in generate_stream
    indices = torch.multinomial(probs, num_samples=2)
    ^^^^^^^^^^^^^^^^^
RuntimeError: [address=0.0.0.0:43203, pid=188725] probability tensor contains either `inf`, `nan` or element < 0
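
For reference, `torch.multinomial` rejects any probability tensor that contains `inf`/`nan` values or negative entries, which is what the message above reports. The sketch below is hypothetical and is not the xinference `generate_stream` code: a guard that sanitizes the softmax output before sampling, assuming the bad values come from the quantized forward pass on torch 2.2.0.

import torch

# Hypothetical guard, not the xinference implementation: sanitize the
# distribution before sampling so torch.multinomial does not raise
# "probability tensor contains either `inf`, `nan` or element < 0".
def safe_multinomial(probs: torch.Tensor, num_samples: int = 2) -> torch.Tensor:
    if not torch.isfinite(probs).all() or (probs < 0).any():
        # Replace inf/nan with 0, clamp negatives, and renormalize.
        # This masks the upstream numerical problem rather than fixing it.
        probs = torch.nan_to_num(probs, nan=0.0, posinf=0.0, neginf=0.0).clamp(min=0)
        probs = probs / probs.sum(dim=-1, keepdim=True)
    return torch.multinomial(probs, num_samples=num_samples)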

Expected behavior

A clear and concise description of what you expected to happen.

Additional context

Add any other context about the problem here.

@XprobeBot XprobeBot added the bug Something isn't working label Feb 28, 2024
@XprobeBot XprobeBot added this to the v0.9.1 milestone Feb 28, 2024
@qinxuye
Contributor Author

qinxuye commented Feb 28, 2024

qwen1.5 gptq int8 worked with torch == 2.1.2; the error only shows up with torch == 2.2.0.
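
Given that observation, a practical workaround is to pin torch below 2.2.0 (for example torch == 2.1.2) until the regression in the quantized path is understood. A trivial, illustrative check to confirm which torch build the serving process actually imports:

import torch

# Illustrative only: print the torch build seen by the worker process.
# The report above says the GPTQ int8 path works on 2.1.2 and fails on 2.2.0.
print(torch.__version__, torch.version.cuda)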

@qinxuye
Contributor Author

qinxuye commented Feb 28, 2024

Similar issue: #733.

@XprobeBot XprobeBot modified the milestones: v0.9.1, v0.9.2, v0.9.3 Mar 1, 2024
@XprobeBot XprobeBot modified the milestones: v0.9.3, v0.9.4, v0.9.5 Mar 15, 2024
@XprobeBot XprobeBot modified the milestones: v0.10.0, v0.10.1 Mar 29, 2024
@XprobeBot XprobeBot modified the milestones: v0.10.1, v0.10.2 Apr 12, 2024
@XprobeBot XprobeBot modified the milestones: v0.10.2, v0.10.3, v0.11.0 Apr 19, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1, v0.11.2 May 11, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.3, v0.11.4, v0.12.0, v0.12.1 May 31, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.1, v0.12.2 Jun 14, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4, v0.13.0, v0.13.1 Jun 28, 2024
@XprobeBot XprobeBot modified the milestones: v0.13.1, v0.13.2 Jul 12, 2024
@XprobeBot XprobeBot modified the milestones: v0.13.2, v0.13.4 Jul 26, 2024

github-actions bot commented Aug 7, 2024

This issue is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Aug 7, 2024

This issue was closed because it has been inactive for 5 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Aug 12, 2024