
bug: openllm start opt or openllm start dolly-v2 failed #47

Closed

hurner opened this issue Jun 21, 2023 · 12 comments

Comments

hurner commented Jun 21, 2023

Describe the bug

openllm start opt and openllm start dolly-v2 both appear to start OK.
When I made a query, the output below came out.

2023-06-21T16:45:40+0800 [INFO] [runner:llm-dolly-v2-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.574ms (trace=100fa96d33433a772259d444a0006ca9,span=4caf6df55eb0c67e,sampled=1,service.name=llm-dolly-v2-runner)
2023-06-21T16:45:40+0800 [INFO] [api_server:llm-dolly-v2-service:9] 127.0.0.1:63613 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 5.220ms (trace=100fa96d33433a772259d444a0006ca9,span=7ec016176efc036d,sampled=1,service.name=llm-dolly-v2-service)
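
A query of the kind described here would typically be sent to the server's HTTP API with curl. As a rough sketch only: the /v1/generate endpoint and JSON payload below are assumptions based on OpenLLM's documented API rather than something captured in this report, and port 3000 is the default HTTP port shown in the startup logs further down.

curl -X POST http://localhost:3000/v1/generate \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "What is the meaning of life?"}'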

To reproduce

No response

Logs

2023-06-21T16:45:40+0800 [INFO] [runner:llm-dolly-v2-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.574ms (trace=100fa96d33433a772259d444a0006ca9,span=4caf6df55eb0c67e,sampled=1,service.name=llm-dolly-v2-runner)
2023-06-21T16:45:40+0800 [INFO] [api_server:llm-dolly-v2-service:9] 127.0.0.1:63613 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 5.220ms (trace=100fa96d33433a772259d444a0006ca9,span=7ec016176efc036d,sampled=1,service.name=llm-dolly-v2-service)

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.0.22
python: 3.8.16
platform: macOS-13.4-arm64-arm-64bit
uid_gid: 501:20
conda: 23.3.1
in_conda_env: True

conda_packages
name: openllm
channels:
  - defaults
dependencies:
  - ca-certificates=2023.05.30=hca03da5_0
  - libcxx=14.0.6=h848a8c0_0
  - libffi=3.4.4=hca03da5_0
  - ncurses=6.4=h313beb8_0
  - openssl=3.0.8=h1a28f6b_0
  - pip=23.1.2=py38hca03da5_0
  - python=3.8.16=hb885b13_4
  - readline=8.2=h1a28f6b_0
  - setuptools=67.8.0=py38hca03da5_0
  - sqlite=3.41.2=h80987f9_0
  - tk=8.6.12=hb8d0fd4_0
  - wheel=0.38.4=py38hca03da5_0
  - xz=5.4.2=h80987f9_0
  - zlib=1.2.13=h5a0b063_0
  - pip:
      - accelerate==0.20.3
      - aiohttp==3.8.4
      - aiosignal==1.3.1
      - anyio==3.7.0
      - appdirs==1.4.4
      - asgiref==3.7.2
      - async-timeout==4.0.2
      - attrs==23.1.0
      - bentoml==1.0.22
      - build==0.10.0
      - cattrs==23.1.2
      - certifi==2023.5.7
      - charset-normalizer==3.1.0
      - circus==0.18.0
      - click==8.1.3
      - click-option-group==0.5.6
      - cloudpickle==2.2.1
      - coloredlogs==15.0.1
      - contextlib2==21.6.0
      - cpm-kernels==1.0.11
      - datasets==2.13.0
      - deepmerge==1.1.0
      - deprecated==1.2.14
      - dill==0.3.6
      - exceptiongroup==1.1.1
      - filelock==3.12.2
      - filetype==1.2.0
      - frozenlist==1.3.3
      - fs==2.4.16
      - fsspec==2023.6.0
      - grpcio==1.54.2
      - grpcio-health-checking==1.48.2
      - h11==0.14.0
      - httpcore==0.17.2
      - httpx==0.24.1
      - huggingface-hub==0.15.1
      - humanfriendly==10.0
      - idna==3.4
      - importlib-metadata==6.0.1
      - inflection==0.5.1
      - jinja2==3.1.2
      - markdown-it-py==3.0.0
      - markupsafe==2.1.3
      - mdurl==0.1.2
      - mpmath==1.3.0
      - multidict==6.0.4
      - multiprocess==0.70.14
      - networkx==3.1
      - numpy==1.24.3
      - openllm==0.1.8
      - opentelemetry-api==1.17.0
      - opentelemetry-instrumentation==0.38b0
      - opentelemetry-instrumentation-aiohttp-client==0.38b0
      - opentelemetry-instrumentation-asgi==0.38b0
      - opentelemetry-instrumentation-grpc==0.38b0
      - opentelemetry-sdk==1.17.0
      - opentelemetry-semantic-conventions==0.38b0
      - opentelemetry-util-http==0.38b0
      - optimum==1.8.8
      - orjson==3.9.1
      - packaging==23.1
      - pandas==2.0.2
      - pathspec==0.11.1
      - pillow==9.5.0
      - pip-requirements-parser==32.0.1
      - pip-tools==6.13.0
      - prometheus-client==0.17.0
      - protobuf==3.20.3
      - psutil==5.9.5
      - pyarrow==12.0.1
      - pydantic==1.10.9
      - pygments==2.15.1
      - pynvml==11.5.0
      - pyparsing==3.1.0
      - pyproject-hooks==1.0.0
      - python-dateutil==2.8.2
      - python-json-logger==2.0.7
      - python-multipart==0.0.6
      - pytz==2023.3
      - pyyaml==6.0
      - pyzmq==25.1.0
      - regex==2023.6.3
      - requests==2.31.0
      - rich==13.4.2
      - safetensors==0.3.1
      - schema==0.7.5
      - sentencepiece==0.1.99
      - simple-di==0.1.5
      - six==1.16.0
      - sniffio==1.3.0
      - starlette==0.28.0
      - sympy==1.12
      - tabulate==0.9.0
      - tokenizers==0.13.3
      - tomli==2.0.1
      - torch==2.0.1
      - torchvision==0.15.2
      - tornado==6.3.2
      - tqdm==4.65.0
      - transformers==4.30.2
      - typing-extensions==4.6.3
      - tzdata==2023.3
      - urllib3==2.0.3
      - uvicorn==0.22.0
      - watchfiles==0.19.0
      - wcwidth==0.2.6
      - wrapt==1.15.0
      - xxhash==3.2.0
      - yarl==1.9.2
      - zipp==3.15.0
prefix: /Users/tim/anaconda3/envs/openllm
pip_packages
accelerate==0.20.3
aiohttp==3.8.4
aiosignal==1.3.1
anyio==3.7.0
appdirs==1.4.4
asgiref==3.7.2
async-timeout==4.0.2
attrs==23.1.0
bentoml==1.0.22
build==0.10.0
cattrs==23.1.2
certifi==2023.5.7
charset-normalizer==3.1.0
circus==0.18.0
click==8.1.3
click-option-group==0.5.6
cloudpickle==2.2.1
coloredlogs==15.0.1
contextlib2==21.6.0
cpm-kernels==1.0.11
datasets==2.13.0
deepmerge==1.1.0
Deprecated==1.2.14
dill==0.3.6
exceptiongroup==1.1.1
filelock==3.12.2
filetype==1.2.0
frozenlist==1.3.3
fs==2.4.16
fsspec==2023.6.0
grpcio==1.54.2
grpcio-health-checking==1.48.2
h11==0.14.0
httpcore==0.17.2
httpx==0.24.1
huggingface-hub==0.15.1
humanfriendly==10.0
idna==3.4
importlib-metadata==6.0.1
inflection==0.5.1
Jinja2==3.1.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
networkx==3.1
numpy==1.24.3
openllm==0.1.8
opentelemetry-api==1.17.0
opentelemetry-instrumentation==0.38b0
opentelemetry-instrumentation-aiohttp-client==0.38b0
opentelemetry-instrumentation-asgi==0.38b0
opentelemetry-instrumentation-grpc==0.38b0
opentelemetry-sdk==1.17.0
opentelemetry-semantic-conventions==0.38b0
opentelemetry-util-http==0.38b0
optimum==1.8.8
orjson==3.9.1
packaging==23.1
pandas==2.0.2
pathspec==0.11.1
Pillow==9.5.0
pip-requirements-parser==32.0.1
pip-tools==6.13.0
prometheus-client==0.17.0
protobuf==3.20.3
psutil==5.9.5
pyarrow==12.0.1
pydantic==1.10.9
Pygments==2.15.1
pynvml==11.5.0
pyparsing==3.1.0
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3
PyYAML==6.0
pyzmq==25.1.0
regex==2023.6.3
requests==2.31.0
rich==13.4.2
safetensors==0.3.1
schema==0.7.5
sentencepiece==0.1.99
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.28.0
sympy==1.12
tabulate==0.9.0
tokenizers==0.13.3
tomli==2.0.1
torch==2.0.1
torchvision==0.15.2
tornado==6.3.2
tqdm==4.65.0
transformers==4.30.2
typing_extensions==4.6.3
tzdata==2023.3
urllib3==2.0.3
uvicorn==0.22.0
watchfiles==0.19.0
wcwidth==0.2.6
wrapt==1.15.0
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
  • transformers version: 4.30.2
  • Platform: macOS-13.4-arm64-arm-64bit
  • Python version: 3.8.16
  • Huggingface_hub version: 0.15.1
  • Safetensors version: 0.3.1
  • PyTorch version (GPU?): 2.0.1 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?:

System information (Optional)

apple m1 max

hurner commented Jun 21, 2023

openllm start dolly-v2 itself starts OK, as shown below:
2023-06-21T16:59:44+0800 [INFO] [cli] Environ for worker 0: set CPU thread count to 10
2023-06-21T16:59:45+0800 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-06-21T16:59:45+0800 [INFO] [cli] Starting production HTTP BentoServer from "_service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-06-21T16:59:49+0800 [WARNING] [runner:llm-dolly-v2-runner:1] The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.

hurner commented Jun 21, 2023

http://localhost:3000/readyz
shows:
Runners are not ready.
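
The same readiness check can be run from the command line; this is just a plain curl against the default port from the startup log above, and it should return HTTP 200 once the runners are healthy:

curl -i http://localhost:3000/readyz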

aarnphm commented Jun 21, 2023

Are you seeing this with dolly, or with both?

I have tested OPT on my end, on both Linux and macOS, and it starts up just fine.

aarnphm commented Jun 24, 2023

Hey, I have fixed this issue on main and will release a patch version soon.

aarnphm closed this as completed Jun 24, 2023
hurner commented Jun 26, 2023

OK, thanks.

VfBfoerst commented
Hey @aarnphm, I ran into the exact same behavior.
I tried to deploy openllm within a podman container, with registry.redhat.io/ubi8/python-39:latest as the base image. Are there plans for containerizing openllm, or to support running it in a rootless container?

VfBfoerst commented
I tried it on the system (RHEL 8.4) outside of the container with a venv (Python 3.9); the readyz endpoint also indicates "Runners are not ready."
Start command:
openllm start opt --model-id facebook/opt-125m
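
For completeness, a minimal sketch of that reproduction, assuming a stock Python 3.9 venv and the openllm package from PyPI (the venv name is arbitrary; the model id is the one given above):

python3.9 -m venv openllm-venv
source openllm-venv/bin/activate
pip install openllm
openllm start opt --model-id facebook/opt-125m   # the start command quoted above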

aarnphm commented Jun 26, 2023

Can you dump the whole stack trace in a new issue?

VfBfoerst commented
> Hey, I have fixed this issue on main and will release a patch version soon.

Is there a commit or a branch where you can see these changes? I did not find anything.

aarnphm commented Jun 27, 2023

Containerizing Bento with podman should already be supported.

See https://docs.bentoml.com/en/latest/guides/containerization.html#containerization-with-different-container-engines

bentoml containerize llm-bento --backend podman --opt ...
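
Once built, the image can then be run rootless with podman. The tag llm-bento:latest below is a placeholder, so substitute whatever tag bentoml containerize prints, and this assumes the image's default command serves the Bento on BentoML's default port 3000:

podman run --rm -p 3000:3000 llm-bento:latest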

aarnphm commented Jun 27, 2023

Though there is an internal bug that I just discovered recently with regard to running within the container. I will post updates about this soon.

aarnphm reopened this Jun 27, 2023
aarnphm commented Jun 27, 2023

> Hey @aarnphm, I ran into the exact same behavior. I tried to deploy openllm within a podman container, with registry.redhat.io/ubi8/python-39:latest as the base image. Are there plans for containerizing openllm or to support it running in a rootless container?

I believe this is related to the container deployment. Can you create a new issue? Thanks.
