Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Latest Docker Image Error Python failed to call 'asyncio.get_event_loop' #815

Closed
treyrich opened this issue Dec 20, 2022 · 38 comments
Closed
Assignees

Comments

@treyrich
Copy link

Upgrading our docker image from 1.28.0-python3.10 to 1.29.0-python3.11 results in the following log output:

2022/12/20 18:38:19 [info] 21#21 unit 1.29.0 started
2022/12/20 18:38:19 [info] 28#28 discovery started
2022/12/20 18:38:19 [notice] 28#28 module: python 3.11.1 "/usr/lib/unit/modules/python3.unit.so"
2022/12/20 18:38:19 [info] 24#24 controller started
2022/12/20 18:38:19 [notice] 24#24 process 28 exited with code 0
2022/12/20 18:38:19 [info] 33#33 router started
2022/12/20 18:38:19 [info] 33#33 OpenSSL 1.1.1n  15 Mar 2022, 101010ef
{
	"certificates": {},
	"config": {
		"listeners": {},
		"applications": {}
	},

	"status": {
		"connections": {
			"accepted": 0,
			"active": 0,
			"idle": 0,
			"closed": 0
		},

		"requests": {
			"total": 0
		},

		"applications": {}
	}
}
/usr/local/bin/docker-entrypoint.sh: Looking for certificate bundles in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Looking for configuration snippets in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Applying configuration /docker-entrypoint.d/config.json
2022/12/20 18:38:19 [info] 44#44 "webapp" prototype started
2022/12/20 18:38:19 [info] 46#46 "webapp" application started
2022/12/20 18:38:19 [alert] 0#46 [unit] Python failed to call 'asyncio.get_event_loop'
2022/12/20 18:38:20 [notice] 44#44 app process 46 exited with code 1
2022/12/20 18:38:20 [warn] 33#33 failed to start application "webapp"
2022/12/20 18:38:20 [alert] 33#33 failed to apply new conf
2022/12/20 18:38:20 [notice] 24#24 process 44 exited with code 0
/usr/local/bin/docker-entrypoint.sh: Error: HTTP response status code is '500'
 }error": "Failed to apply new configuration."

I've confirmed that this is a bug by replicating it with the simplest python application (copied from Unit docs):

async def application(scope, receive, send):

    await send({"type": "http.response.start", "status": 200})

    await send({"type": "http.response.body", "body": b"Hello, ASGI\n"})

Configuration is:

{
  "listeners": {
    "*:8080": {
      "pass": "applications/webapp"
    }
  },

  "applications": {
    "webapp": {
      "type": "python",
      "path": "/www/",
      "module": "nunittest.asgi",
      "callable": "application"
    }
  }
}

There appears to be no path forward to using 1.29.0-python3.11 for an asgi application at all at this point, but perhaps I'm missing something?

@thresheek
Copy link
Member

Hi @treyrich !

Our internal tests for docker images did not catch that issue, however we are testing an wsgi app there.

We're going to investigate the possible causes.

Many thanks

@treyrich
Copy link
Author

Great, thanks for the quick reply. Do let me know if there's anything I can do to help.

@thresheek
Copy link
Member

thresheek commented Dec 20, 2022

It also seems to work with Python 3.11 as shipped with Fedora 37 (that's where our automated tests run, which cover asgi).

@andrew-nuwber
Copy link

Our team has the same issue running python app on quart

@ac000
Copy link
Member

ac000 commented Dec 21, 2022

Trying to reproduce this locally (current unit master, Python 3.11).

With the above config I get

2022/12/21 15:13:58 [alert] 9398#9398 Python failed to import module "nunittest.asgi"
ModuleNotFoundError: No module named 'nunittest'

With the following

{
  "listeners": {
    "[::1]:8080": {
      "pass": "applications/webapp"
    }
  },

  "applications": {
    "webapp": {
      "type": "python",
      "path": "/srv/unit/python/app/",
      "module": "asgi"
    }
  }
}

$ cat /srv/unit/python/app/asgi.py

async def application(scope, receive, send):
    await send({"type": "http.response.start", "status": 200})

    await send({"type": "http.response.body", "body": b"Hello, ASGI\n"})

The app seems to start, but not really...

{
        "success": "Reconfiguration done."
}

From the log

2022/12/21 15:28:06 [info] 9642#9642 "webapp" application started
2022/12/21 15:28:06 [warn] 0#9642 [unit] Got invalid state transition on lifespan protocol
2022/12/21 15:28:06 [info] 0#9642 [unit] ASGI Lifespan processing exception
Traceback (most recent call last):
  File "/srv/unit/python/app/asgi.py", line 2, in application
    await send({"type": "http.response.start", "status": 200})
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Got invalid state transition on lifespan protocol

And when trying to access it

$ curl http://localhost:8080/
<!DOCTYPE html><title>Error 503</title><p>Error 503.
2022/12/21 15:28:45 [alert] 9642#9642 [unit] #8: Python failed to create 'client' pair
2022/12/21 15:28:45 [alert] 9642#9642 [unit] Python failed to call 'loop.call_soon'
ValueError: invalid literal for int() with base 10: ':1'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 762, in call_soon
    handle = self._call_soon(callback, args, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 778, in _call_soon
    handle = events.Handle(callback, args, self, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SystemError: <class 'asyncio.events.Handle'> returned a result with an exception set

@treyrich
Copy link
Author

Here, perhaps this will help. This is the simplest recreation I can come up with. Create a Dockerfile with the following contents:

# Uncomment 1.28 to see working version
# FROM public.ecr.aws/nginx/unit:1.28.0-python3.10
FROM public.ecr.aws/nginx/unit:1.29.0-python3.11

WORKDIR /www
# port used by the listener in nginx-unit-config.json
EXPOSE 8080
# Copy our Nginx configuration file to the container
RUN echo '{"listeners":{"*:8080":{"pass":"applications/webapp"}},"applications":{"webapp":{"type":"python 3","path":"/www/","module":"nunittest.asgi"}}}' > /docker-entrypoint.d/config.json
# Copy our application code to the container
RUN mkdir /www/nunittest
RUN echo '' > /www/nunittest/__init__.py
RUN echo 'async def application(scope, receive, send):\n\
    await send({"type": "http.response.start", "status": 200})\n\
    \n\
    await send({"type": "http.response.body", "body": b"Hello, ASGI\\n"})\n' > /www/nunittest/asgi.py

Then run it with this command docker run --publish 8080:8080 --rm $(docker build -q .) which will build and run the dockerfile in the current directory.

You'll see that with 1.29.0 the server crashes, but if you uncomment the 1.28.0 image and comment out 1.29.0 that the server launches without issue.

@hawiliali
Copy link

hawiliali commented Dec 26, 2022

Hi,

Using the sample fastapi app (ASGI) in your documentation https://unit.nginx.org/howto/fastapi/
After updating to 1.29.0 with python 3.11 , yielded same error in this issue

2022/12/26 12:20:07 [info] 8422#8422 "fastapi" prototype started
2022/12/26 12:20:08 [info] 8427#8427 "fastapi" application started
2022/12/26 12:20:08 [alert] 0#8427 [unit] Python failed to call 'asyncio.get_event_loop'
2022/12/26 12:20:09 [notice] 8422#8422 app process 8427 exited with code 1
2022/12/26 12:20:09 [warn] 30882#30882 failed to start application "fastapi"
2022/12/26 12:20:09 [alert] 30882#30882 failed to apply new conf
2022/12/26 12:20:09 [notice] 30879#30879 process 8422 exited with code 0

using same sample application with python 3.10 module compiled to work on unit 1.29.0 is working without any issue

@mamashin
Copy link

mamashin commented Jan 9, 2023

Same error with python 3.11.1 and unit 1.29.0 :(

@tippexs
Copy link
Contributor

tippexs commented Jan 11, 2023

Hi Team - I am working with the Dev Team on an analysis about the root cause. Will let you all know once we have something to share.

Atm. it looks like it is something new in the combination of Unit 1.29 and Python 3.11.1 as Python 3.10 on 1.29 works just fine.

@andrew-nuwber
Copy link

@tippexs hi, thank you for working on that case!

Just like temporary workaround would be great to have docker image with unit 1.29.0 and python 3.10

@tippexs
Copy link
Contributor

tippexs commented Jan 13, 2023

@andrew-nuwber this sounds like a plan. Let me share a Dockerfile that will let you build a Image with Unit1.29 as well as Python 3.10 and a language module.

@tippexs
Copy link
Contributor

tippexs commented Jan 13, 2023

Or if this would help in the meantime:

http://unit.nginx.org/installation/#inst-lang-docker

git clone https://github.com/nginx/unit
cd unit
git checkout 1.29.0  # Optional; use to choose a specific Unit version
cd pkg/docker/
make build-python3.10 VERSION_python=3.10

It will build you the Image using Unit 1.29.0 and Python 3.10.

docker run --rm -it unit:1.29.0-python3.10 /bin/bash

root@89cea2af82af:/# python --version
Python 3.10.9
root@89cea2af82af:/#
root@89cea2af82af:/# which unitd
/usr/sbin/unitd
root@89cea2af82af:/# unitd --version
unit version: 1.29.0
configured as ./configure --prefix=/usr --state=/var/lib/unit --control=unix:/var/run/control.unit.sock --pid=/var/run/unit.pid --log=/var/log/unit.log --tmp=/var/tmp --user=unit --group=unit --openssl --libdir=/usr/lib/x86_64-linux-gnu --cc-opt='-g -O2 -ffile-prefix-map=/unit=. -specs=/usr/share/dpkg/no-pie-compile.specs -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --ld-opt='-specs=/usr/share/dpkg/no-pie-link.specs -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie' --modules=/usr/lib/unit/modules

@tippexs
Copy link
Contributor

tippexs commented Jan 13, 2023

I have just tested Unit 1.29 using Python 3.10 and I get the same error as with Python 3.11.

However even Unit 1.28.0 with Python3.10 shows me

tippexs@beaker:/tmp$ docker run --publish 8080:8080 --rm $(docker build -q .)
/usr/local/bin/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, launching Unit daemon to perform initial configuration...
2023/01/13 13:33:40 [info] 10#10 unit 1.28.0 started
2023/01/13 13:33:40 [info] 12#12 discovery started
2023/01/13 13:33:40 [notice] 12#12 module: python 3.10.9 "/usr/lib/unit/modules/python3.unit.so"
2023/01/13 13:33:40 [info] 11#11 controller started
2023/01/13 13:33:40 [notice] 11#11 process 12 exited with code 0
2023/01/13 13:33:40 [info] 15#15 router started
2023/01/13 13:33:40 [info] 15#15 OpenSSL 1.1.1n  15 Mar 2022, 101010ef
{
        "certificates": {},
        "config": {
                "listeners": {},
                "applications": {}
        },

        "status": {
                "connections": {
                        "accepted": 0,
                        "active": 0,
                        "idle": 0,
                        "closed": 0
                },

                "requests": {
                        "total": 0
                },

                "applications": {}
        }
}
/usr/local/bin/docker-entrypoint.sh: Looking for certificate bundles in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Looking for configuration snippets in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Applying configuration /docker-entrypoint.d/config.json
2023/01/13 13:33:40 [info] 19#19 "webapp" prototype started
2023/01/13 13:33:40 [info] 20#20 "webapp" application started
2023/01/13 13:33:41 [alert] 0#20 [unit] Python failed to call 'asyncio.get_event_loop'
2023/01/13 13:33:41 [notice] 19#19 app process 20 exited with code 1
2023/01/13 13:33:41 [warn] 15#15 failed to start application "webapp"
2023/01/13 13:33:41 [alert] 15#15 failed to apply new conf
2023/01/13 13:33:41 [notice] 11#11 process 19 exited with code 0
/usr/local/bin/docker-entrypoint.sh: Error: HTTP response status code is '500'
{
        "error": "Failed to apply new configuration."
}

So for me 1.28 AND 1.29 are failing regardless of the Python Versions 3.10 or 3.11.

Using the Images from the AWS ECR Registry the Unit 1.28 Python 3.10 Demo Application is working.

We are looking into it. I need to check if the custom build for our images is working as we expect.

@tippexs
Copy link
Contributor

tippexs commented Jan 13, 2023

Okay scratch that - It was an issue on my end! I am now trying to play around with the images.

@tippexs
Copy link
Contributor

tippexs commented Jan 13, 2023

Updates! I was able to make it work with Unit 1.29 and Python 3.10.7. As soon as I am trying to use the latest Python 3.10 version which is 3.10.9 the app starts to crash on boot.

To build a container with Unit 1.29 and Python 3.10.7 issue

git clone https://github.com/nginx/unit
cd unit
git checkout 1.29.0  # Optional; use to choose a specific Unit version
cd pkg/docker/
make build-python3.10 VERSION_python=3.10.7                                                                                                                                                                                                         

This will create a Docker image tagged as unit:1.29.0-python3.10. Feel free to test your apps with this image. In the meantime we are looking into the issue with Python Versions > 3.10.7

@88Ocelot
Copy link

i had managed to run django asgi app with python3.11 using asyncio runner instead of using get_event_loop and new_event_loop. i dont think its the perfect solution, but i havent seen any memory leaks or leaks after applying load testing for an hour.
#826

@ac000
Copy link
Member

ac000 commented Jan 18, 2023

Hi @88Ocelot. First of all thank you for taking the time and effort to implement a fix for this. However, and I'm sure you'll agree, that it's a rather large patch and fundamentally changes how we do the event loop.

So I would rather propose the following more minimal change

diff --git a/src/python/nxt_python_asgi.c b/src/python/nxt_python_asgi.c
index 587a17cf..9fe5b56e 100644
--- a/src/python/nxt_python_asgi.c
+++ b/src/python/nxt_python_asgi.c
@@ -201,14 +201,53 @@ nxt_python_asgi_init(nxt_unit_init_t *init, nxt_python_proto_t *proto)
 }
 
 
+static PyObject *
+nxt_python_asgi_get_event_loop(PyObject *asyncio, const char *event_loop_func)
+{
+    PyObject  *event_loop, *loop;
+
+    event_loop = PyDict_GetItemString(PyModule_GetDict(asyncio),
+                                      event_loop_func);
+    if (nxt_slow_path(event_loop == NULL)) {
+        nxt_unit_alert(NULL, "Python failed to get '%s' from module 'asyncio'",
+                       event_loop_func);
+        return NULL;
+    }
+
+    if (nxt_slow_path(PyCallable_Check(event_loop) == 0)) {
+        nxt_unit_alert(NULL, "'asyncio.%s' is not a callable object",
+                       event_loop_func);
+        return NULL;
+    }
+
+    loop = PyObject_CallObject(event_loop, NULL);
+    if (nxt_slow_path(loop == NULL)) {
+        if (strcmp(event_loop_func, "get_running_loop") != 0) {
+            nxt_unit_alert(NULL, "Python failed to call 'asyncio.%s'",
+                           event_loop_func);
+        }
+
+        return NULL;
+    }
+
+    return loop;
+}
+
+
 static int
 nxt_python_asgi_ctx_data_alloc(void **pdata, int main)
 {
     uint32_t                i;
-    PyObject                *asyncio, *loop, *event_loop, *obj;
+    PyObject                *asyncio, *loop, *obj;
     const char              *event_loop_func;
     nxt_py_asgi_ctx_data_t  *ctx_data;
 
+#if PY_VERSION_HEX < NXT_PYTHON_VER(3, 7)
+    static const char       *main_event_loop_func = "get_event_loop";
+#else
+    static const char       *main_event_loop_func = "get_running_loop";
+#endif
+
     ctx_data = nxt_unit_malloc(NULL, sizeof(nxt_py_asgi_ctx_data_t));
     if (nxt_slow_path(ctx_data == NULL)) {
         nxt_unit_alert(NULL, "Failed to allocate context data");
@@ -241,29 +280,25 @@ nxt_python_asgi_ctx_data_alloc(void **pdata, int main)
         goto fail;
     }
 
-    event_loop_func = main ? "get_event_loop" : "new_event_loop";
+    event_loop_func = main ? main_event_loop_func : "new_event_loop";
 
-    event_loop = PyDict_GetItemString(PyModule_GetDict(asyncio),
-                                      event_loop_func);
-    if (nxt_slow_path(event_loop == NULL)) {
-        nxt_unit_alert(NULL,
-                       "Python failed to get '%s' from module 'asyncio'",
-                       event_loop_func);
+    loop = nxt_python_asgi_get_event_loop(asyncio, event_loop_func);
+    if (loop == NULL) {
+#if PY_VERSION_HEX < NXT_PYTHON_VER(3, 7)
         goto fail;
-    }
+#else
+        if (main) {
+            PyErr_Clear();
 
-    if (nxt_slow_path(PyCallable_Check(event_loop) == 0)) {
-        nxt_unit_alert(NULL,
-                       "'asyncio.%s' is not a callable object",
-                       event_loop_func);
-        goto fail;
-    }
+            loop = nxt_python_asgi_get_event_loop(asyncio, "new_event_loop");
+            if (nxt_slow_path(loop == NULL)) {
+                goto fail;
+            }
 
-    loop = PyObject_CallObject(event_loop, NULL);
-    if (nxt_slow_path(loop == NULL)) {
-        nxt_unit_alert(NULL, "Python failed to call 'asyncio.%s'",
-                       event_loop_func);
-        goto fail;
+        } else {
+            goto fail;
+        }
+#endif
     }
 
     for (i = 0; i < nxt_nitems(handlers); i++) {

which AFAICT has the same behaviour as previously. So if you are able, please do give this patch a spin. I have tested it against several versions of Python but only as far as ensuring the asyncio.get_event_loop error doesn't show up and that the (very basic) application starts.

@alejandro-colomar
Copy link
Contributor

diff --git a/src/python/nxt_python_asgi.c b/src/python/nxt_python_asgi.c
index 587a17cf..9fe5b56e 100644
--- a/src/python/nxt_python_asgi.c
+++ b/src/python/nxt_python_asgi.c

[...]

Please, split that into a refactoring patch that adds the function, and then one that does the fix, to keep my sanity when reviewing :)

@ac000
Copy link
Member

ac000 commented Jan 23, 2023

I have pushed a branch that contains the above fix, if that facilitates easier testing.

@luckydonald
Copy link

When will this be merged?
And if that's not yet planned, what are the current workarounds, now that the problem is understood?

I'm using FROM nginx/unit:1.29.0-python3.11 in my docker file, as I require py3.11.

@luckydonald
Copy link

I think a workaround is the following for now:

config:

        "backend": {
            "type": "python3",
            <your custom stuff stays as before>
            "protocol": "asgi",
-           "module": "app",
+           "module": "workaround-issue815",
            "callable": "app"
        }

workaround-issue815.py:

import asyncio

# workaround for [unit] Python failed to call 'asyncio.get_event_loop'
# See https://github.com/nginx/unit/issues/815 for the bug status,
# see also: https://github.com/nginx/unit/commit/7dcbd778ac21fe2cc76ea7f1761e41e91a398230 for an implementation
try:
    asyncio.get_running_loop()
except RuntimeError:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
# end def

# noinspection PyUnresolvedReferences
from app import app  # basically the old setting: from "module" import "callable"

@ac000
Copy link
Member

ac000 commented Jan 28, 2023

@luckydonald Rather than workarounds you could test the proposed fix... that might speed things along...

Either the patch or the linked branch.

@luckydonald
Copy link

Sure, could help me out with an docker image or Dockerfile?
I'd gladly will test it out.

@rafsaf
Copy link

rafsaf commented Feb 5, 2023

Hello, thanks for working on it, waiting patiently here for this bug to be fixed, I think everything is already clear.

@lcrilly
Copy link
Contributor

lcrilly commented Feb 6, 2023

waiting patiently here for this bug to be fixed

@rafsaf we have a fix. Are you in a position to test it?
https://github.com/ac000/unit/tree/asyncio

@rafsaf
Copy link

rafsaf commented Feb 6, 2023

@lcrilly sure thing I can do it, hope I can find time today and try it on evening with an app that broke after image upgrade

@tippexs
Copy link
Contributor

tippexs commented Feb 6, 2023

Hi @rafsaf
I have just tested the build process with the Patch from @ac000 and to build the Docker Image using the Pkg-Makefile you can simply change the template.Dockerfile in `pkg/docker/.

Currently it is (

&& hg clone https://hg.nginx.org/unit \
)

    && hg clone https://hg.nginx.org/unit \
    && cd unit \
    && hg up @@VERSION@@ \

For your tests it should be

    && git clone -b asyncio https://github.com/ac000/unit.git \
    && cd unit \

This will pull down the correct sources. I am about to make this configurable in the make process to accept git, hg [default] and local sources.

After changing the template.Dockerfile you can run

make build-python3.10 VERSION_python=3.10

Let me know if that works.

@rafsaf
Copy link

rafsaf commented Feb 6, 2023

@tippexs Awesome, probably there would a bit of headache if not you ;) Not a bad idea to clone repo in Dockerfile btw.

But to the point, I was able to build locally both unit:1.30.0-python3.10 using make build-python3.10 VERSION_python=3.10 and unit:1.30.0-python3.11 using make build-python3.11 VERSION_python=3.11 just as you described it.

Both are working alright with my setup where current official image nginx/unit:1.29.0-python3.11 fails.

Steps to reproduce results if anybody is concerned (python 3.10/3.11 fastapi asgi app) can be found here https://github.com/rafsaf/nginx_unit_issue_815

If there is anything I could do, don't hesitate, glad I could help.

@ac000
Copy link
Member

ac000 commented Feb 6, 2023

@rafsaf Thanks for taking the time and effort to test this!

I will look at getting the patches reviewed and merged.

@ac000
Copy link
Member

ac000 commented Feb 7, 2023

The fix for this has been merged.

@ac000 ac000 closed this as completed Feb 7, 2023
@micaelmalta
Copy link

Thanks @ac000 !

When can we expect an official docker release for this fix?

@thresheek
Copy link
Member

Next Unit release will have the fixes.

@micaelmalta
Copy link

micaelmalta commented Feb 8, 2023

Do we have any ETA as NGINX-UNIT is currently completely useless with Python 3.11 ?

@lcrilly
Copy link
Contributor

lcrilly commented Feb 10, 2023

Hi @micaelmalta

We're making plans for the next release now. Will update you in the next week or so. In the meantime, we can provide a Dockerfile that will produce a working Unit 1.29 + Python build if that will help you?

@rafsaf
Copy link

rafsaf commented Feb 10, 2023

Thanks for eta

It could be more scalable (also for situations like that in the future) for unit to have nightly releases of docker images, something that for example rust does and many others nowadays https://doc.rust-lang.org/book/appendix-07-nightly-rust.html.

This way you wouldn't need to rush with new stable releases or deal with one time images 😀

@0x6f677548
Copy link

Just for the sake of others: this seems resolved in docker image 1.29.1-python3.11

@lcrilly
Copy link
Contributor

lcrilly commented Mar 21, 2023

Thanks @0x6f677548 👍

Sorry I dropped the ball on the comms on this one. It was indeed fixed with this fix and shipped with 1.29.1.

@luckydonald
Copy link

luckydonald commented Apr 13, 2023

Works there. At least does boot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests