
bug: 503 for /readyz with model-id facebook/opt-125m #79

Closed
VfBfoerst opened this issue Jun 27, 2023 · 14 comments

Comments

@VfBfoerst

Describe the bug

First of all: thank you very much, OpenLLM looks awesome so far 💯

This issue relates to #47. We tried to start an OpenLLM server with the command:

openllm start opt --model-id facebook/opt-125m

The server starts successfully and the web interface is reachable, but we cannot generate anything: the examples given on the web interface do not work, and openllm query does not work either.

To reproduce

  1. openllm start opt --model-id facebook/opt-125m
  2. Try the examples from the web interface (screenshot omitted).
  3. Or: openllm query "Tell me the truth!"

Logs

`openllm query "Tell me the truth!"`   

Timed out while connecting to localhost:3000:
Timed out waiting 30 seconds for server at 'localhost:3000' to be ready.
Traceback (most recent call last):
  File "/home/openllm/openllm_environment/bin/openllm", line 8, in <module>
    sys.exit(cli())
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm/cli.py", line 380, in wrapper
    return func(*args, **attrs)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm/cli.py", line 353, in wrapper
    return_value = func(*args, **attrs)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm/cli.py", line 328, in wrapper
    return f(*args, **attrs)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm/cli.py", line 1345, in query
    openllm[client.framework],  # type: ignore (internal API)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm_client/runtimes/http.py", line 52, in framework
    return self._metadata["framework"]
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm_client/runtimes/base.py", line 102, in _metadata
    return self.call("metadata")
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm_client/runtimes/base.py", line 143, in call
    return self._cached.call(f"{name}_{self._api_version}", *args, **attrs)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/openllm_client/runtimes/base.py", line 151, in _cached
    self._client_class.wait_until_server_ready(self._host, int(self._port), timeout=self._timeout)
  File "/home/openllm/openllm_environment/lib64/python3.9/site-packages/bentoml/_internal/client/http.py", line 67, in wait_until_server_ready
    raise TimeoutError(
TimeoutError: Timed out waiting 30 seconds for server at 'localhost:3000' to be ready.

  
Examples and /readyz from the web GUI:
```
2023-06-27T08:14:33+0200 [WARNING] [cli] No known supported resource available for <class 'types.OptRunnable'>, falling back to using CPU.
2023-06-27T08:14:34+0200 [INFO] [cli] Environ for worker 0: set CPU thread count to 16
2023-06-27T08:14:34+0200 [WARNING] [cli] No known supported resource available for <class 'types.OptRunnable'>, falling back to using CPU.
2023-06-27T08:14:34+0200 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-06-27T08:14:34+0200 [INFO] [cli] Starting production HTTP BentoServer from "_service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-06-27T08:14:43+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60081 (scheme=http,method=GET,path=/,type=,length=) (status=200,type=text/html; charset=utf-8,length=2859) 0.470ms (trace=9a0ba0a3ad009c89aebddd40fa94efa0,span=f5801bc407cca64c,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60081 (scheme=http,method=GET,path=/static_content/swagger-ui.css,type=,length=) (status=200,type=text/css; charset=utf-8,length=143980) 6.529ms (trace=a044f5c38e8f8b15777f10f6a61e5e5b,span=d7759d1a32791968,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60081 (scheme=http,method=GET,path=/static_content/index.css,type=,length=) (status=200,type=text/css; charset=utf-8,length=1125) 1.547ms (trace=13289048526d6668c8c59767576296b6,span=7480ec06dd86230a,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60083 (scheme=http,method=GET,path=/static_content/swagger-initializer.js,type=,length=) (status=200,type=application/javascript,length=383) 1.395ms (trace=0f6357f294a61675d8b97a2d388116a6,span=e07593e79162735e,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60081 (scheme=http,method=GET,path=/static_content/swagger-ui-bundle.js,type=,length=) (status=304,type=,length=) 0.877ms (trace=d076ac9766e7be7e4b142a7305daa54e,span=133caf1d609a19aa,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60081 (scheme=http,method=GET,path=/static_content/swagger-ui-standalone-preset.js,type=,length=) (status=304,type=,length=) 0.881ms (trace=1d5078f1aee1173ee967829168d0ac01,span=0b4cd579c15c02f6,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60081 (scheme=http,method=GET,path=/static_content/favicon-96x96.png,type=,length=) (status=200,type=image/png,length=5128) 3.940ms (trace=f141d17fbcbba97061c02f688df3659b,span=95f42383d374830e,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60083 (scheme=http,method=GET,path=/static_content/favicon-32x32.png,type=,length=) (status=200,type=image/png,length=1912) 4.628ms (trace=766eb0b7b95b86767e67bafdc784f69f,span=8cd6fe4667b2d25d,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:44+0200 [INFO] [api_server:llm-opt-service:15] 123.123.123.123:60084 (scheme=http,method=GET,path=/docs.json,type=,length=) (status=200,type=application/json,length=8166) 16.775ms (trace=36ee0fd098a50c360588732a6eff754e,span=92bfc9de114573d5,sampled=1,service.name=llm-opt-service)
2023-06-27T08:14:57+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 1.229ms (trace=825cdd8b43c097cf0fe9e7a89664d8c7,span=bf06eb4b65e76b31,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:14:57+0200 [INFO] [api_server:llm-opt-service:16] 123.123.123.123:60109 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 77.489ms (trace=825cdd8b43c097cf0fe9e7a89664d8c7,span=023a43fb11c48835,sampled=1,service.name=llm-opt-service)
2023-06-27T08:26:51+0200 [INFO] [api_server:llm-opt-service:15] 123.123.123.123:60622 (scheme=http,method=POST,path=/v1/metadata,type=text/plain,length=4) (status=200,type=application/json,length=731) 5.742ms (trace=7214af978f9398a2d7b8a9feaebc215e,span=b9bf604323a0daa2,sampled=1,service.name=llm-opt-service)
2023-06-27T08:27:13+0200 [INFO] [api_server:llm-opt-service:15] 123.123.123.123:60651 (scheme=http,method=POST,path=/v1/metadata,type=text/plain,length=20) (status=200,type=application/json,length=731) 4.030ms (trace=e7360b8ce2bbd984fbe1a93dc0bc3b18,span=ac391d391a530d3b,sampled=1,service.name=llm-opt-service)
2023-06-27T08:27:59+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.848ms (trace=3d67a35c7cae0ea522a4752ee78ad8c9,span=37de79502ec15faf,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:27:59+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58740 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 4.934ms (trace=3d67a35c7cae0ea522a4752ee78ad8c9,span=f05679dc029016fd,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:00+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.706ms (trace=6feded7ab0a1ebb5850e8e07205137ce,span=545366f670aa6670,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:00+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58750 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.470ms (trace=6feded7ab0a1ebb5850e8e07205137ce,span=bfcf97403b38166f,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:01+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.727ms (trace=56d62f117e9c36e4601564f3b4546e8b,span=80f35138b90a4d11,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:01+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58764 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.528ms (trace=56d62f117e9c36e4601564f3b4546e8b,span=2b0e2872be7a99f0,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:02+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.692ms (trace=29a97e0ce492e2e54d0e47792360f091,span=2a83ddb3c9fec1f0,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:02+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58770 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.434ms (trace=29a97e0ce492e2e54d0e47792360f091,span=6b1596a336c1d33f,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:03+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.698ms (trace=c63dee9e650759297b537c42d531bf9f,span=148454642c31920d,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:03+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58784 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.481ms (trace=c63dee9e650759297b537c42d531bf9f,span=3685ee5030f131ba,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:04+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.703ms (trace=3d4f393d8e9d87c9489b80a2602c3d72,span=76da5185f188a13f,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:04+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58786 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.489ms (trace=3d4f393d8e9d87c9489b80a2602c3d72,span=1f55db959f3f0e8c,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:05+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.688ms (trace=a79e3934f5765460d9453d5d22c70244,span=484c8de3837265e3,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:05+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58790 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.438ms (trace=a79e3934f5765460d9453d5d22c70244,span=6708a4fbbd796615,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:06+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.693ms (trace=6ab794b317bb28b325ce65004f5adf95,span=bf26e25355405222,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:06+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58792 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.643ms (trace=6ab794b317bb28b325ce65004f5adf95,span=76c919c82c4d3ded,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:07+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.728ms (trace=1f0affee83d5baa143853575680021b8,span=403ddab11fe64cdc,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:07+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58808 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 4.121ms (trace=1f0affee83d5baa143853575680021b8,span=b4981d0ffb1e79d5,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:08+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.684ms (trace=79bee6a0c93b14dc8a0d52298af710a8,span=a2a6853a038532b1,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:08+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:58820 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.425ms (trace=79bee6a0c93b14dc8a0d52298af710a8,span=63935df1dd7a4b6f,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:09+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.713ms (trace=3a17ccfc409db03a76f4c1e99d2f7abd,span=bf36173e51d2114f,sampled=1,service.name=llm-opt-runner)
2023-06-27T08:28:09+0200 [INFO] [api_server:llm-opt-service:16] 127.0.0.1:42736 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 3.536ms (trace=3a17ccfc409db03a76f4c1e99d2f7abd,span=54f8a583b1e7f02f,sampled=1,service.name=llm-opt-service)
2023-06-27T08:28:10+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.711ms (trace=08a437207ba29e28532062be8544627c,span=f66c358bbd8aa2eb,sampled=1,service.name=llm-opt-runner)
```


### Environment

#### Environment variable

```bash
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

```

System information

bentoml: 1.0.22
python: 3.9.2
platform: Linux-4.18.0-305.76.1.el8_4.x86_64-x86_64-with-glibc2.28
uid_gid: 8007:8008

pip_packages
accelerate==0.20.3
aiohttp==3.8.4
aiosignal==1.3.1
anyio==3.7.0
appdirs==1.4.4
asgiref==3.7.2
async-timeout==4.0.2
attrs==23.1.0
bentoml==1.0.22
build==0.10.0
cattrs==23.1.2
certifi==2023.5.7
charset-normalizer==3.1.0
circus==0.18.0
click==8.1.3
click-option-group==0.5.6
cloudpickle==2.2.1
cmake==3.26.4
coloredlogs==15.0.1
contextlib2==21.6.0
datasets==2.13.1
deepmerge==1.1.0
Deprecated==1.2.14
dill==0.3.6
exceptiongroup==1.1.1
filelock==3.12.2
filetype==1.2.0
frozenlist==1.3.3
fs==2.4.16
fsspec==2023.6.0
grpcio==1.56.0
grpcio-health-checking==1.48.2
h11==0.14.0
httpcore==0.17.2
httpx==0.24.1
huggingface-hub==0.15.1
humanfriendly==10.0
idna==3.4
importlib-metadata==6.0.1
inflection==0.5.1
Jinja2==3.1.2
lit==16.0.6
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
networkx==3.1
numpy==1.25.0
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
openllm==0.1.14
opentelemetry-api==1.17.0
opentelemetry-instrumentation==0.38b0
opentelemetry-instrumentation-aiohttp-client==0.38b0
opentelemetry-instrumentation-asgi==0.38b0
opentelemetry-instrumentation-grpc==0.38b0
opentelemetry-sdk==1.17.0
opentelemetry-semantic-conventions==0.38b0
opentelemetry-util-http==0.38b0
optimum==1.8.8
orjson==3.9.1
packaging==23.1
pandas==2.0.2
pathspec==0.11.1
Pillow==9.5.0
pip-requirements-parser==32.0.1
pip-tools==6.13.0
prometheus-client==0.17.0
protobuf==3.20.3
psutil==5.9.5
pyarrow==12.0.1
pydantic==1.10.9
Pygments==2.15.1
pynvml==11.5.0
pyparsing==3.1.0
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3
PyYAML==6.0
pyzmq==25.1.0
regex==2023.6.3
requests==2.31.0
rich==13.4.2
safetensors==0.3.1
schema==0.7.5
sentencepiece==0.1.99
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.28.0
sympy==1.12
tabulate==0.9.0
tokenizers==0.13.3
tomli==2.0.1
torch==2.0.1
torchvision==0.15.2
tornado==6.3.2
tqdm==4.65.0
transformers==4.30.2
triton==2.0.0
typing_extensions==4.6.3
tzdata==2023.3
urllib3==2.0.3
uvicorn==0.22.0
watchfiles==0.19.0
wcwidth==0.2.6
wrapt==1.15.0
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
  • transformers version: 4.30.2
  • Platform: Linux-4.18.0-305.76.1.el8_4.x86_64-x86_64-with-glibc2.28
  • Python version: 3.9.2
  • Huggingface_hub version: 0.15.1
  • Safetensors version: 0.3.1
  • PyTorch version (GPU?): 2.0.1+cu117 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

System information (Optional)

memory: 240 GB
CPU: 16 vCPU
Platform: VMWare ESXi 7.0 U 3
OS: RHEL 8.4

@VfBfoerst
Author

We updated our packages with
pip3 install openllm --upgrade
and are now using openllm-0.1.16. The behavior did not change.

@aarnphm
Member

aarnphm commented Jun 27, 2023

I have addressed this on main. It comes down to an internal bug in how I first implemented model loading in the Bento.

I will release a new patch.

@aarnphm
Member

aarnphm commented Jun 27, 2023

Can you try with 0.1.17?

@VfBfoerst
Author

Can you try with 0.1.17?

We upgraded to 0.1.17, but the behavior did not change.

@VfBfoerst
Author

Thank you for your effort btw 💯

@VfBfoerst
Author

After enabling debug mode, we found that the browser seems to replace the colons in the URL, leading to a 404 status code:
2023-06-28T11:19:28+0200 [INFO] [runner:llm-opt-runner:1] - "GET http%3A//127.0.0.1%3A8000/readyz HTTP/1.1" 404 (trace=174c629a1c188d1c14ab37cfaddd151c,span=2142bdde3c143280,sampled=1,service.name=llm-opt-runner)
Maybe related?
In the following message, the URL also seems incorrect:
2023-06-28T11:19:28+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 1.595ms (trace=174c629a1c188d1c14ab37cfaddd151c,span=b4058afcd2732c6f,sampled=1,service.name=llm-opt-runner)
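The mangled request line in the first log entry can be reproduced with plain URL percent-encoding. This is only an illustration of what the log shows (some hop percent-encoding an absolute proxy-style URL), not a claim about where OpenLLM itself does the encoding:

```python
from urllib.parse import quote

# Percent-encoding an absolute URL, as a proxy-aware intermediary might,
# turns every colon into %3A, matching the garbled path in the runner log.
encoded = quote("http://127.0.0.1:8000/readyz", safe="/")
print(encoded)  # http%3A//127.0.0.1%3A8000/readyz
```

A server that receives the encoded form cannot match the /readyz route, which would explain the 404.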

Is there anything else I can do to help? 😃

@VfBfoerst
Author

VfBfoerst commented Jul 6, 2023

Hi @aarnphm, we upraded to version openllm-0.1.20. Currently the bug seems to be fixed, we get an status code 200 from the /readyz endpoint. We will further investigate the other endpoints as well. 馃惐

@VfBfoerst
Author

I don't know why, but it stopped working again. In the meantime we tried other models, but we did not change any configuration at all. /readyz again says

Runners are not ready.
2023-07-06T13:46:31+0200 [INFO] [runner:llm-opt-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 1.260ms (trace=a40d30effa29a800dda013f411636681,span=c1e0f53b8a8740f4,sampled=1,service.name=llm-opt-runner)
2023-07-06T13:46:31+0200 [INFO] [api_server:llm-opt-service:9] 172.30.54.84:54409 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 80.519ms (trace=a40d30effa29a800dda013f411636681,span=309b08f0f05da281,sampled=1,service.name=llm-opt-service)

@aarnphm
Member

aarnphm commented Jul 6, 2023

Can you walk me through how you run this again?

Are you just running openllm start opt?

@aarnphm
Member

aarnphm commented Jul 6, 2023

What is the resource you are running on?

@VfBfoerst
Author

Can you walk me through how you run this again?

Are you just running openllm start opt?

Yeah, it's a virtual environment, and we run openllm start opt --model-id facebook/opt-125m.

@VfBfoerst
Author

VfBfoerst commented Jul 6, 2023

What is the resource you are running on?

It's a virtual machine running on VMware ESXi, without a GPU; it is CPU only.

@aarnphm
Member

aarnphm commented Jul 7, 2023

I can successfully run OPT without any hiccups on a Mac. OPT shouldn't require a GPU to run at all.

@VfBfoerst
Author

Hi, we found out it was the proxy that caused the issue. We needed to add 127.0.0.1,localhost to the no_proxy variable, e.g. (temporarily):
export no_proxy=127.0.0.1,localhost
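For reference, the standard library's proxy-bypass check illustrates why the no_proxy entry matters. This is a sketch of the general mechanism HTTP clients use, not OpenLLM's exact code path:

```python
import urllib.request

# Without a no_proxy entry, loopback hosts are not exempt from the proxy,
# so health checks against 127.0.0.1 get routed through it and can fail.
assert not urllib.request.proxy_bypass_environment("127.0.0.1:8000", {})

# Listing loopback hosts under no_proxy makes clients connect directly.
proxies = {"no": "127.0.0.1,localhost"}
assert urllib.request.proxy_bypass_environment("127.0.0.1:8000", proxies)
assert urllib.request.proxy_bypass_environment("localhost:3000", proxies)
```

Passing the dict explicitly mirrors what happens when the no_proxy environment variable is set; many tools also read the uppercase NO_PROXY variant.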
