"cheat" env behavior causes an application crash with SystemError #1408
Comments
Hi @a-feld, thanks for your analysis. I think it is time to start removing nowadays-useless optimizations in the core, like the one you mentioned. In 2.1 I have just changed the default to the holy one. Now we need to find a workaround for 2.0 without changing the default. Your solution in python_call would be a bit 'brutal', so before accepting it I would like to find a better one. Eventually we could consider setting the default allocator for Python 3 to the 'holy' one (that is probably even better).
Hi @unbit! Thanks for the quick response!
What about this? da68b9d
Decrementing the reference count on the tuple without deleting the object referencing the tuple can cause the tuple to be deallocated while there is still a reference to the tuple. If the tuple is deallocated in that case, it will eventually result in undefined interpreter behavior. What about this solution? #1411 Since
Fix SystemError exception when async_args PyTuple is reused. #1408
I think your patch is less invasive than mine. Decref'ing blindly as I did could lead to side problems. Merged it. @xrmx safe for backport
Thanks for the speedy response and resolution @unbit! I'll go ahead and close this ticket.
In the case where the uwsgi_python_create_env_cheat env behavior is used, the async_args PyTuple is reused on every request. In some cases, the reference count on async_args may be > 1 at the end of a request/response cycle. If PyTuple_SetItem is called with a reference count > 1, a SystemError is raised. When the cheat env behavior is used, the environ dictionary is also reused between requests. This enables the elimination of redundant PyTuple_SetItem calls by first checking to see if the environ dictionary is already in the async_args tuple.
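The guard described above can be illustrated with a toy model. This is only a sketch in Python, not the C code from the patch: `ToyArgsTuple`, `prepare_unguarded` and `prepare_guarded` are hypothetical names, and the `refcount` attribute is a hand-set stand-in for the real reference count that `PyTuple_SetItem` checks.

```python
class ToyArgsTuple:
    """Toy model of a reused one-slot args tuple; NOT the CPython tuple type."""

    def __init__(self):
        self.item = None
        self.refcount = 1  # the owner's own reference (simulated by hand)

    def set_item(self, value):
        # Mirrors the precondition described above: refuse to mutate a
        # tuple that something else still references.
        if self.refcount != 1:
            raise SystemError("bad argument to internal function")
        self.item = value


def prepare_unguarded(args, environ):
    # Always writes slot 0, like calling the setter on every request.
    args.set_item(environ)


def prepare_guarded(args, environ):
    # Skip the write when the slot already holds this exact environ object.
    if args.item is not environ:
        args.set_item(environ)


environ = {}                     # reused between requests in "cheat" mode
args = ToyArgsTuple()
prepare_guarded(args, environ)   # first request: slot is filled
args.refcount = 2                # e.g. a lingering frame still holds the tuple

try:
    prepare_unguarded(args, environ)
    unguarded_failed = False
except SystemError:
    unguarded_failed = True

prepare_guarded(args, environ)   # no write attempted, so no error
guarded_ok = args.item is environ
```

Because the environ dictionary is the same object on every request, the guarded version never needs to touch the tuple after the first request, so an elevated reference count no longer matters.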
Backported, thanks!
Hi!
My name is Allan and I work at New Relic.
Overview
We've had various customer reports of application crashes when using uwsgi + Python3 with our agent.
(Example: https://github.com/Mobeye/NewRelic)
The reported error is:
After some investigation, we discovered (under certain circumstances) using the uwsgi_python_create_env_cheat env behavior can cause the underlying WSGI application to cease communicating with uwsgi.
This error can appear and cause application crashes even when the New Relic agent is not used.
Although uwsgi consistently causes an error to be set with uwsgi_python_create_env_cheat independent of the cpython version, this error only exposes a crash in Python 3.5 and greater. As more developers move to Python 3.5, this issue will start to occur more frequently.
Debug Summary
The following versions were used for reproduction of this issue:
uwsgi_python_create_env_cheat re-uses the async_args tuple between transactions (request/response).
As part of its re-use it calls PyTuple_SetItem(wsgi_req->async_args, 0, wsgi_req->async_environ) on every request.
If the reference count on the wsgi_req->async_args tuple != 1, then Python raises a SystemError and causes PyEval_CallObject to return NULL.
To understand why, in some cases, the reference count on wsgi_req->async_args never returns to 1, consider the following example:
In this case, the wrapt library adds a function between uwsgi and the __call__ method of WSGIHandler. The reference count for async_args starts at 1 and the subsequent call to the wrapper(wrapped, instance, args, kwargs) passes async_args as args which brings the reference count to 2.
When a traceback is generated in any version of cpython, a reference to the frame which generated the traceback is created and the frame's reference count is subsequently incremented. This ensures that the frame which generated the traceback is not deallocated before the traceback is handled.
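The frame-retention behavior can be observed from pure Python. The following is a minimal standalone sketch, not the wrapt-based example from this report: `Payload`, `failing_app` and `call_and_keep_traceback` are hypothetical names, and a weak reference is used as a probe instead of reading the reference count directly.

```python
import gc
import weakref


class Payload:
    """Stand-in for an object reachable from a call's arguments."""


def failing_app(payload):
    # The frame of this call keeps `payload` as a local variable.
    raise Exception("BAD APPLICATION")


def call_and_keep_traceback():
    payload = Payload()
    probe = weakref.ref(payload)  # lets us observe when payload is freed
    try:
        failing_app(payload)
    except Exception as exc:
        tb = exc.__traceback__    # the traceback references the failing frame
    del payload                   # drop this function's own reference
    return probe, tb


probe, tb = call_and_keep_traceback()

# While the traceback is held, the failing frame (and its locals) stays alive.
alive_while_traceback_held = probe() is not None

# Releasing the traceback (and collecting any frame/traceback cycles)
# finally lets the frame and its locals be deallocated.
del tb
gc.collect()
alive_after_release = probe() is not None
```

In this sketch `alive_while_traceback_held` is true and `alive_after_release` is false: anything that keeps the traceback around also keeps every argument of the failing call alive.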
In the above example, the traceback is generated from the raise Exception('BAD APPLICATION'). The exception is subsequently handled in uwsgi through a PyErr_Print() call in uwsgi_python_exception_log here.
According to the C-API documentation, this call sets the sys.last_traceback variable and holds a reference to the traceback which consequently holds a reference to the frame. Since the frame is never deallocated, the reference count of wsgi_req->async_args remains at 2 for the next transaction.
When the next request arrives, the call to PyTuple_SetItem(wsgi_req->async_args, 0, wsgi_req->async_environ) (plugins/python/wsgi_subhandler.c:237) causes a SystemError exception with the application not being able to communicate with uwsgi.
To further demonstrate that this is the case, changing PyErr_Print to PyErr_PrintEx(0); eliminates the SystemError above.
In some cases, sys.last_traceback is not the only Python object that holds references to the frame. There are examples relating to tracebacks / frames being held due to cyclical references which are garbage collected on the next request/response. In those cases, even when sys.last_traceback is not stored, the frame is not deallocated at the end of the request/response.
Possible Solutions
uwsgi can use uwsgi_python_create_env_holy as the default env creation process. This creates a new Tuple for every request and therefore resolves the SystemError issue.
uwsgi can check for PyErr_Occurred as part of python_call and choose to clear certain types of errors before the underlying function call is made. Printing the error may be a reasonable compromise rather than keeping the error set for the function call (which causes the app to cease communications with uwsgi). An example of this code might be:
I would be happy to submit a pull request with a fix, but I thought I'd ask the project what the preferred solution is before doing so.
Thanks in advance for your help!