bin /app/venv/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so
CUDA SETUP: Loading binary /app/venv/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
{"asctime": "2024-02-01 16:51:03,775", "name": "torch.distributed.nn.jit.instantiator", "levelname": "INFO", "message": "Created a temporary directory at /tmp/tmp7h8azrb7"}
{"asctime": "2024-02-01 16:51:03,775", "name": "torch.distributed.nn.jit.instantiator", "levelname": "INFO", "message": "Writing /tmp/tmp7h8azrb7/_remote_module_non_scriptable.py"}
/app/venv/lib/python3.8/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
{"asctime": "2024-02-01 16:52:11,342", "name": "root", "levelname": "INFO", "message": "PATH_TO_MODEL is set to ==>/mnt/models/"}
{"asctime": "2024-02-01 16:52:11,342", "name": "root", "levelname": "INFO", "message": "Registering model: llama-2-7b-chat-hf"}
{"asctime": "2024-02-01 16:52:11,343", "name": "root", "levelname": "INFO", "message": "Setting max asyncio worker threads as 8"}
{"asctime": "2024-02-01 16:52:11,344", "name": "root", "levelname": "INFO", "message": "Starting uvicorn with 2 workers"}
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:48<00:48, 48.85s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [01:06<00:00, 30.56s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [01:06<00:00, 33.31s/it]
{"asctime": "2024-02-01 16:52:11,368", "name": "root", "levelname": "INFO", "message": "Starting gRPC server on [::]:8081"}
{"asctime": "2024-02-01 16:52:11,369", "name": "root", "levelname": "ERROR", "message": "uncaught exception", "exc_info": "Traceback (most recent call last):\n File \"serve.py\", line 56, in <module>\n kserve.ModelServer(workers=NUM_WORKERS).start([model])\n File \"/app/venv/lib/python3.8/site-packages/kserve/model_server.py\", line 167, in start\n asyncio.run(servers_task())\n File \"/usr/lib/python3.8/asyncio/runners.py\", line 43, in run\n return loop.run_until_complete(main)\n File \"/usr/lib/python3.8/asyncio/base_events.py\", line 608, in run_until_complete\n return future.result()\n File \"/app/venv/lib/python3.8/site-packages/kserve/model_server.py\", line 165, in servers_task\n await asyncio.gather(*servers)\n File \"/app/venv/lib/python3.8/site-packages/kserve/model_server.py\", line 154, in serve\n multiprocessing.set_start_method('fork')\n File \"/usr/lib/python3.8/multiprocessing/context.py\", line 243, in set_start_method\n raise RuntimeError('context has already been set')\nRuntimeError: context has already been set"}
Exception ignored in: <function Server.__del__ at 0x7f3b818b5ca0>
Traceback (most recent call last):
File "/app/venv/lib/python3.8/site-packages/grpc/aio/_server.py", line 170, in __del__
File "src/python/grpcio/grpc/_cython/_cygrpc/aio/common.pyx.pxi", line 118, in grpc._cython.cygrpc.schedule_coro_threadsafe
File "src/python/grpcio/grpc/_cython/_cygrpc/aio/common.pyx.pxi", line 110, in grpc._cython.cygrpc.schedule_coro_threadsafe
File "/usr/lib/python3.8/asyncio/base_events.py", line 425, in create_task
File "/usr/lib/python3.8/asyncio/base_events.py", line 504, in _check_closed
RuntimeError: Event loop is closed
/kind bug
I'm trying to start a KServe model server with 2 workers like this:
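A minimal sketch of the entrypoint, reconstructed from the traceback in the log (the call kserve.ModelServer(workers=NUM_WORKERS).start([model]) appears there verbatim; the Model subclass, its method bodies, and the model name are illustrative placeholders, not the actual serve.py):

```python
# serve.py -- sketch reconstructed from the traceback; the Model subclass
# and its load/predict bodies are placeholders for illustration only.
import kserve

NUM_WORKERS = 2

class LlamaModel(kserve.Model):
    def __init__(self, name: str):
        super().__init__(name)
        self.load()

    def load(self):
        # load tokenizer and checkpoint shards here
        self.ready = True

    def predict(self, payload, headers=None):
        # run inference here
        return {"predictions": []}

if __name__ == "__main__":
    model = LlamaModel("llama-2-7b-chat-hf")
    kserve.ModelServer(workers=NUM_WORKERS).start([model])
```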
But I get the error shown in the log above.
I would think kserve should handle multiprocessing gracefully, since it supports passing more workers to kserve.ModelServer().start(), but maybe I'm wrong?
What steps did you take and what happened:
[A clear and concise description of what the bug is.]
What did you expect to happen:
What's the InferenceService yaml:
[To help us debug please run kubectl get isvc $name -n $namespace -oyaml and paste the output]

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
Environment:
- Kubernetes version: (use kubectl version):
- OS (e.g. from /etc/os-release):