
Failure of bento serve in production with AnyIO error #2271

Closed
andreea-anghel opened this issue Feb 14, 2022 · 29 comments
Labels
bug Something isn't working

Comments


Describe the bug

The sklearn example available here https://docs.bentoml.org/en/latest/quickstart.html#installation fails at inference time with an AnyIO error. The bento service is deployed in production mode with bentoml serve iris_classifier:latest --production. When deployed in development mode, inference works as expected.

To Reproduce

Steps to reproduce the issue:

  1. Install BentoML: pip install bentoml --pre
  2. Train and save bento sklearn model:
import bentoml

from sklearn import svm
from sklearn import datasets

# Load predefined training set to build an example model
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)

# Call to bentoml.<FRAMEWORK>.save(<MODEL_NAME>, model)
# In order to save to BentoML's standard format in a local model store
bentoml.sklearn.save("iris_clf", clf)
  3. Create BentoML service
# bento.py
import bentoml
import bentoml.sklearn
import numpy as np

from bentoml.io import NumpyNdarray

# Load the runner for the latest ScikitLearn model we just saved
iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")

# Create the iris_classifier service with the ScikitLearn runner
# Multiple runners may be specified if needed in the runners array
# When packaged as a bento, the runners here will be included
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

# Create API function with pre- and post- processing logic with your new "svc" annotation
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_ndarray: np.ndarray) -> np.ndarray:
    # Define pre-processing logic
    result = iris_clf_runner.run(input_ndarray)
    # Define post-processing logic
    return result
  4. Create BentoML configuration file
# bentofile.yaml
service: "bento.py:svc"  # A convention for locating your service: <YOUR_SERVICE_PY>:<YOUR_SERVICE_ANNOTATION>
include:
 - "*.py"  # A pattern for matching which files to include in the bento
python:
  packages:
   - scikit-learn  # Additional libraries to be included in the bento
  5. Build the BentoML service: bentoml build
  6. Run the bento in production: bentoml serve iris_classifier:latest --production
  7. Send an inference request:
import requests
response = requests.post(
    "http://127.0.0.1:5000/predict",
    headers={"content-type": "application/json"},
    data="[5,4,3,2]").text
print(response)

Expected behavior

The response should be the classification result, namely 1.

Screenshots/Logs

The error generated by the server is the following:

 Exception on /predict [POST]
                       ╭──────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────╮
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/anyio/from_thread.py:31 in run                                                                                              │
                       │                                                                                                                                                                    │
                       │    28 │                                                                                                                                                            │
                       │    29 │   """                                                                                                                                                      │
                       │    30 │   try:                                                                                                                                                     │
                       │ ❱  31 │   │   asynclib = threadlocals.current_async_module                                                                                                         │
                       │    32 │   except AttributeError:                                                                                                                                   │
                       │    33 │   │   raise RuntimeError('This function can only be run from an AnyIO worker thread')                                                                      │
                       │    34                                                                                                                                                              │
                       ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                       AttributeError: '_thread._local' object has no attribute 'current_async_module'

                       During handling of the above exception, another exception occurred:

                       ╭──────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────╮
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/bentoml/_internal/server/service_app.py:356 in api_func                                                                     │
                       │                                                                                                                                                                    │
                       │   353 │   │   │   │   │   if isinstance(api.input, Multipart):                                                                                                     │
                       │   354 │   │   │   │   │   │   output: t.Any = await run_in_threadpool(api.func, **input_data)                                                                      │
                       │   355 │   │   │   │   │   else:                                                                                                                                    │
                       │ ❱ 356 │   │   │   │   │   │   output: t.Any = await run_in_threadpool(api.func, input_data)                                                                        │
                       │   357 │   │   │   │   response = await api.output.to_http_response(output)                                                                                         │
                       │   358 │   │   │   except BentoMLException as e:                                                                                                                    │
                       │   359 │   │   │   │   log_exception(request, sys.exc_info())                                                                                                       │
                       │ /usr/local/lib/python3.8/dist-packages/starlette/concurrency.py:40 in run_in_threadpool                                                                            │
                       │                                                                                                                                                                    │
                       │   37 │   elif kwargs:  # pragma: no cover                                                                                                                          │
                       │   38 │   │   # loop.run_in_executor doesn't accept 'kwargs', so bind them in here                                                                                  │
                       │   39 │   │   func = functools.partial(func, **kwargs)                                                                                                              │
                       │ ❱ 40 │   return await loop.run_in_executor(None, func, *args)                                                                                                      │
                       │   41                                                                                                                                                               │
                       │   42                                                                                                                                                               │
                       │   43 class _StopIteration(Exception):                                                                                                                              │
                       │                                                                                                                                                                    │
                       │ /usr/lib/python3.8/concurrent/futures/thread.py:57 in run                                                                                                          │
                       │                                                                                                                                                                    │
                       │    54 │   │   │   return                                                                                                                                           │
                       │    55 │   │                                                                                                                                                        │
                       │    56 │   │   try:                                                                                                                                                 │
                       │ ❱  57 │   │   │   result = self.fn(*self.args, **self.kwargs)                                                                                                      │
                       │    58 │   │   except BaseException as exc:                                                                                                                         │
                       │    59 │   │   │   self.future.set_exception(exc)                                                                                                                   │
                       │    60 │   │   │   # Break a reference cycle with the exception 'exc'                                                                                               │
                       │                                                                                                                                                                    │
                       │ /root/bentoml/bentos/iris_classifier/fgzarmenwoh6jsyx/src/bento.py:20 in predict                                                                                   │
                       │                                                                                                                                                                    │
                       │   17 @svc.api(input=NumpyNdarray(), output=NumpyNdarray())                                                                                                         │
                       │   18 def predict(input_ndarray: np.ndarray) -> np.ndarray:                                                                                                         │
                       │   19 │   # Define pre-processing logic                                                                                                                             │
                       │ ❱ 20 │   result = iris_clf_runner.run(input_ndarray)                                                                                                               │
                       │   21 │   # Define post-processing logic                                                                                                                            │
                       │   22 │   return result                                                                                                                                             │
                       │   23                                                                                                                                                               │
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/bentoml/_internal/runner/runner.py:141 in run                                                                               │
                       │                                                                                                                                                                    │
                       │   138 │   │   return await self._impl.async_run_batch(*args, **kwargs)                                                                                             │
                       │   139 │                                                                                                                                                            │
                       │   140 │   def run(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                                   │
                       │ ❱ 141 │   │   return self._impl.run(*args, **kwargs)                                                                                                               │
                       │   142 │                                                                                                                                                            │
                       │   143 │   def run_batch(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                             │
                       │   144 │   │   return self._impl.run_batch(*args, **kwargs)                                                                                                         │
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/bentoml/_internal/runner/remote.py:111 in run                                                                               │
                       │                                                                                                                                                                    │
                       │   108 │   def run(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                                   │
                       │   109 │   │   import anyio                                                                                                                                         │
                       │   110 │   │                                                                                                                                                        │
                       │ ❱ 111 │   │   return anyio.from_thread.run(self.async_run, *args, **kwargs)                                                                                        │
                       │   112 │                                                                                                                                                            │
                       │   113 │   def run_batch(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                             │
                       │   114 │   │   import anyio                                                                                                                                         │
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/anyio/from_thread.py:33 in run                                                                                              │
                       │                                                                                                                                                                    │
                       │    30 │   try:                                                                                                                                                     │
                       │    31 │   │   asynclib = threadlocals.current_async_module                                                                                                         │
                       │    32 │   except AttributeError:                                                                                                                                   │
                       │ ❱  33 │   │   raise RuntimeError('This function can only be run from an AnyIO worker thread')                                                                      │
                       │    34 │                                                                                                                                                            │
                       │    35 │   return asynclib.run_async_from_thread(func, *args)                                                                                                       │
                       │    36                                                                                                                                                              │
                       ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                       RuntimeError: This function can only be run from an AnyIO worker thread
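The guard that raises this error can be illustrated with a stdlib-only sketch (a simplified analogue with hypothetical names, not anyio's actual implementation): anyio.from_thread.run looks up the current async backend in a module-level threading.local that is populated only for threads anyio spawns itself. Starlette's run_in_threadpool uses a plain executor thread, so the lookup fails and the RuntimeError above is raised.

```python
import threading

# Simplified analogue of anyio's internals: the running event loop's backend
# module is stored in a thread-local that only anyio worker threads populate.
threadlocals = threading.local()

def run_from_thread():
    """Mimic the guard clause shown at anyio/from_thread.py:30-33 above."""
    try:
        asynclib = threadlocals.current_async_module
    except AttributeError:
        # The attribute is missing in any thread anyio did not start itself
        raise RuntimeError(
            "This function can only be run from an AnyIO worker thread"
        )
    return asynclib

# Threads created by a plain ThreadPoolExecutor (which is what Starlette's
# run_in_threadpool uses) never set the attribute, so the call fails:
try:
    run_from_thread()
except RuntimeError as exc:
    print(exc)  # This function can only be run from an AnyIO worker thread
```

This matches the traceback: predict() runs in an executor thread via run_in_threadpool, and the runner's remote implementation then calls anyio.from_thread.run from that thread, which anyio rejects.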

Environment:

  • OS: Ubuntu 20.04
  • Python Version: 3.8
  • BentoML Version: 1.0.0.a3
  • AnyIO version: 3.5.0

Additional context

@andreea-anghel andreea-anghel added the bug Something isn't working label Feb 14, 2022
parano (Member) commented Feb 14, 2022

Is this a duplicate of #2270? Could you try it again with pip install -U aiohttp==3.8.1?

andreea-anghel (Author) commented Feb 14, 2022

The setup is similar to #2270, which was indeed solved by running pip install -U aiohttp==3.8.1. In this issue I'm using an s390x system and Python 3.8 (in #2270 I was using an x86 system and Python 3.9), and I get an AnyIO error as shown above, not an aiohttp error as in #2270. aiohttp is already at v3.8.1. Any idea on how to solve this issue is appreciated - thanks.

andreea-anghel (Author)

Issue solved by downgrading bentoml to v1.0.0a2

aarnphm (Member) commented Feb 14, 2022

Can you try installing bentoml with pip install git+https://github.com/bentoml/BentoML.git?

andreea-anghel (Author)

Thanks @aarnphm, but it does not work. The only solution was to downgrade bentoml.

andreea-anghel (Author)

After sending multiple curl requests, I see the following error when running the bento service in production mode:

> bentoml serve iris_classifier:latest --production
...
[03:43:32 AM] INFO     Application startup complete.
[03:43:40 AM] INFO      - "POST /run HTTP/1.1" 200 OK
[03:43:40 AM] INFO     127.0.0.1:37420 - "POST /predict HTTP/1.1" 200 OK
ERROR:asyncio:Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x3ffb0a736d0>
ERROR:asyncio:Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x3ffb044f040>, 175623.222113761)]']
connector: <aiohttp.connector.UnixConnector object at 0x3ffb0a73970>
[03:43:44 AM] INFO      - "POST /run HTTP/1.1" 200 OK                                                                                                                         

Basically, only the first request completes cleanly; starting with the second request I get the asyncio errors above. The curl request is still successful, but I suspect these errors could affect the performance of the service. Any idea why they happen? I'm using 1.0.0a2.

aarnphm (Member) commented Feb 16, 2022

I believe that the new 1.0.0a4 might solve this issue. Can you try using 1.0.0a4?

andreea-anghel (Author)

When upgrading to 1.0.0a4, I get the initial error posted in the description of this issue:

RuntimeError: This function can only be run from an AnyIO worker thread

parano (Member) commented Feb 16, 2022

@andreea-anghel could you share your service/runner definition code?

parano (Member) commented Feb 17, 2022

@andreea-anghel are you running the example from the gallery project? I can't seem to reproduce it on my end. Could you share the detailed error log?

andreea-anghel (Author) commented Feb 17, 2022

Here is the service definition:

# bento.py
import bentoml
import bentoml.sklearn
import numpy as np

from bentoml.io import NumpyNdarray

# Load the runner for the latest ScikitLearn model we just saved
iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")

# Create the iris_classifier service with the ScikitLearn runner
# Multiple runners may be specified if needed in the runners array
# When packaged as a bento, the runners here will be included
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

# Create API function with pre- and post- processing logic with your new "svc" annotation
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_ndarray: np.ndarray) -> np.ndarray:
    # Define pre-processing logic
    result = iris_clf_runner.run_batch(input_ndarray)
    # Define post-processing logic
    return result

and here is the client:

import requests
import numpy as np

num_rep = 5
data = [[5,4,3,2],[5,4,3,2]]

for req_index in range(num_rep):
    response = requests.post(
           "http://127.0.0.1:5000/predict",
            headers={"content-type": "application/json"},
            data=str(data))
    print(response.text)

When I run the client I get the expected predictions. However, on the server side I see errors:

ERROR:asyncio:Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x3ff9cbed730>
ERROR:asyncio:Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x3ff9cbdef40>, 262468.915262691)]']
connector: <aiohttp.connector.UnixConnector object at 0x3ff9cbed9a0>
              INFO      - "POST /run_batch HTTP/1.1" 200 OK        h11_impl.py:429

This is how I start the server: bentoml serve iris_classifier:latest --production
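As a side note on the "Unclosed client session" messages: aiohttp logs those from a destructor-time check that fires when a ClientSession is garbage-collected before close() was awaited. A minimal stdlib analogue of that pattern (hypothetical code, not aiohttp's or BentoML's actual implementation):

```python
import warnings

class Session:
    """Minimal analogue of aiohttp.ClientSession's unclosed-session check."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __del__(self):
        # aiohttp emits its "Unclosed client session" error from a finalizer
        # hook like this when the session was never closed
        if not self.closed:
            warnings.warn("Unclosed client session", ResourceWarning)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    s = Session()
    del s  # collected without close() having been called -> warning fires

print([str(w.message) for w in caught])
```

On the calling side, the usual remedy with aiohttp is to manage the session with `async with aiohttp.ClientSession() as session:`, which closes the session and releases its connector when the block exits; here the session is created internally by the runner client, so the fix has to land in BentoML itself.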

andreea-anghel (Author)

This error does not show up in the ~/bentoml/logs/active.log file, only in the terminal where the server was started. Is there another log file I should check?

andreea-anghel (Author)

If you cannot reproduce it, could you please share your env details (the list of python packages + their versions)? Thanks!

aarnphm (Member) commented Feb 17, 2022

Everything works on my end with 1.0.0a4, on both conda 3.9.7 and pyenv 3.9.7. It also works with the main branch.

Can you send the version of anyio: pip list | grep anyio?

andreea-anghel (Author)

Thanks for checking @aarnphm and sorry for my late reply. I'm using anyio v3.5.0.

aarnphm (Member) commented Mar 15, 2022

Can you try with the latest release?

timliubentoml (Collaborator)

@andreea-anghel I tested your code on a clean Ubuntu 20 machine in AWS with the exact same Python and anyio versions. I did not see the issue with the latest build or with a4.

How long do you have to wait before that error log message appears? What environment are you running this in? You can definitely send your pip dependencies over, but maybe it's something about the way you're running it.

andreea-anghel (Author) commented Apr 5, 2022

@timliubentoml thank you for looking into this issue. The problem shows up very quickly, after the second HTTP request is received. Here is a list of the Python packages installed in my environment: pythonenv.txt

timliubentoml (Collaborator)

@andreea-anghel Just to confirm, you're running on Ubuntu 20.04 right?

andreea-anghel (Author)

@timliubentoml yes, that's correct

timliubentoml (Collaborator)

@andreea-anghel I tried again and couldn't reproduce your issue. I took another look at your pythonenv.txt, and you are running a very old version (at this point) of bentoml and its dependencies. It looks like you're at a2. We are now at a7.

I would really recommend that you update your bentoml to the latest version by doing this:
pip install bentoml --pre -U

If you need the quickstart code without having to recreate it yourself, you can also get it from here along with instructions:
https://github.com/bentoml/gallery/tree/main/quickstart

@bojiang bojiang removed their assignment Apr 8, 2022
bojiang (Member) commented Apr 8, 2022

I believe it was fixed in #2223.

andreea-anghel (Author)

@timliubentoml Thanks for trying to reproduce the error. Have you tried running the code on an s390x system?

timliubentoml (Collaborator)

@andreea-anghel Do you have any suggestions for running bentoml on the s390x system? I tried using the docker container here: https://hub.docker.com/r/s390x/ubuntu/

But I ran into a few issues getting the necessary libraries to build. Any idea how I could try to reproduce this?

If you've already got it set up and the issue is reproducible on your side, I would say you should try the latest bentoml version. We upgraded a bunch of stuff with regard to anyio.

andreea-anghel (Author) commented Apr 26, 2022

@timliubentoml I've installed bentoml v1.0.0a7 and reinstalled my Python env (see the attached pip freeze output: pip-env.txt). Now it works without any error. Thank you.

By the way, it would be great to have a docker image for s390x.

timliubentoml (Collaborator)

@andreea-anghel That's awesome! Good to hear.

Let's talk about the docker image for s390x. Are you using docker right now for prod deployment?

andreea-anghel (Author)

@timliubentoml Unfortunately I'm not using Docker at the moment. Otherwise, I would have been happy to contribute.

ajms commented May 23, 2022

I am experiencing the exact same bug with:

Environment:

  • OS: Ubuntu 20.04
  • Python Version: 3.8
  • BentoML Version: 1.0.0a7
  • AnyIO version: 3.5.0

ajms commented May 23, 2022

Addendum: after containerizing, running the bento as a docker container works.
