[BUG] The fetch request of the rerank model cannot be handled correctly #1432
Comments
> This will launch the model to CPU.

Thanks, the code works, but the model still runs on the GPU and takes up 1749 MB of video memory.
Python 3.10.14
inference 0.10.3+11.gda1b62c
Description: I need to run the rerank model on the CPU, so I added the 'device' parameter to the fetch request. It works for the embedding model, but the rerank model fails to launch with it.
-------------------error----------------------
2024-05-07 10:24:07,813 xinference.core.worker 69017 ERROR Failed to load model bge-reranker-large-1-0
Traceback (most recent call last):
File "/root/Xinference/custom_packages/inference/xinference/core/worker.py", line 707, in launch_builtin_model
await model_ref.load()
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
return self._process_result_message(result)
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/pool.py", line 659, in send
result = await self._run_coro(message.message_id, coro)
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 524, in xoscar.core._BaseActor.on_receive
result = func(*args, **kwargs)
File "/root/Xinference/custom_packages/inference/xinference/core/model.py", line 239, in load
self._model.load()
File "/root/Xinference/custom_packages/inference/xinference/model/rerank/core.py", line 134, in load
self._model = CrossEncoder(
TypeError: [address=0.0.0.0:45176, pid=70382] sentence_transformers.cross_encoder.CrossEncoder.CrossEncoder() got multiple values for keyword argument 'device'
2024-05-07 10:24:07,888 xinference.api.restful_api 68848 ERROR [address=0.0.0.0:45176, pid=70382] sentence_transformers.cross_encoder.CrossEncoder.CrossEncoder() got multiple values for keyword argument 'device'
Traceback (most recent call last):
File "/root/Xinference/custom_packages/inference/xinference/api/restful_api.py", line 741, in launch_model
model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
return self._process_result_message(result)
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/pool.py", line 659, in send
result = await self._run_coro(message.message_id, coro)
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
result = await result
File "/root/Xinference/custom_packages/inference/xinference/core/supervisor.py", line 892, in launch_builtin_model
await _launch_model()
File "/root/Xinference/custom_packages/inference/xinference/core/supervisor.py", line 856, in _launch_model
await _launch_one_model(rep_model_uid)
File "/root/Xinference/custom_packages/inference/xinference/core/supervisor.py", line 838, in _launch_one_model
await worker_ref.launch_builtin_model(
File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
async with lock:
File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
result = await result
File "/root/Xinference/custom_packages/inference/xinference/core/utils.py", line 45, in wrapped
ret = await func(*args, **kwargs)
File "/root/Xinference/custom_packages/inference/xinference/core/worker.py", line 707, in launch_builtin_model
await model_ref.load()
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
return self._process_result_message(result)
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/pool.py", line 659, in send
result = await self._run_coro(message.message_id, coro)
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/root/Xinference/venv/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 524, in xoscar.core._BaseActor.on_receive
result = func(*args, **kwargs)
File "/root/Xinference/custom_packages/inference/xinference/core/model.py", line 239, in load
self._model.load()
File "/root/Xinference/custom_packages/inference/xinference/model/rerank/core.py", line 134, in load
self._model = CrossEncoder(
TypeError: [address=0.0.0.0:45176, pid=70382] sentence_transformers.cross_encoder.CrossEncoder.CrossEncoder() got multiple values for keyword argument 'device'
-----------------------end-----------------------
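For context, this TypeError is the generic error Python raises when a keyword argument is supplied both explicitly and through `**kwargs` unpacking. A minimal sketch (with a hypothetical stand-in for `CrossEncoder`, not the actual xinference or sentence-transformers code) showing how the conflict arises and one way a caller can resolve it:

```python
# Hypothetical stand-in for sentence_transformers.CrossEncoder.__init__,
# used only to illustrate the duplicate-keyword failure mode.
def cross_encoder(model_name, device=None, **kwargs):
    return {"model_name": model_name, "device": device, **kwargs}

# The launch request puts "device" into the model kwargs...
model_kwargs = {"device": "cpu"}

# ...so if the loader ALSO passes device explicitly, Python raises:
# TypeError: cross_encoder() got multiple values for keyword argument 'device'
try:
    cross_encoder("bge-reranker-large", device="cuda", **model_kwargs)
except TypeError as e:
    print(e)

# One way to resolve the conflict: pop the duplicate so a single value
# (here, the user-supplied one) wins before unpacking.
device = model_kwargs.pop("device", None)
model = cross_encoder("bge-reranker-large", device=device, **model_kwargs)
print(model["device"])  # cpu
```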
--------------------fetch code-------------------
fetch("http://192.168.100.172:12009/v1/models", {
  "headers": {
    "accept": "*/*",
    "accept-language": "en-US,en;q=0.9",
    "content-type": "application/json"
  },
  "referrer": "http://192.168.100.172:12009/ui/",
  "referrerPolicy": "strict-origin-when-cross-origin",
  "body": JSON.stringify({
    "model_uid": null,
    "model_name": "bge-reranker-large",
    "model_type": "rerank",
    "device": "cpu",
    "replica": 1
  }),
  "method": "POST",
  "mode": "cors",
  "credentials": "include"
});
------------------------end-----------------------
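For comparison, the same launch request can be built and sent from Python with only the standard library. The payload below mirrors the fetch body above; the actual POST is left commented out since it needs a running xinference server at that address:

```python
import json

# Payload equivalent to the fetch body above.
payload = {
    "model_uid": None,
    "model_name": "bge-reranker-large",
    "model_type": "rerank",
    "device": "cpu",
    "replica": 1,
}
body = json.dumps(payload)
print(body)

# To actually send it (requires the xinference server to be running):
# import urllib.request
# req = urllib.request.Request(
#     "http://192.168.100.172:12009/v1/models",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
#     method="POST",
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```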