uWSGI process got Segmentation Fault #330

Closed
maximekl opened this issue Nov 10, 2022 · 9 comments

@maximekl

Hi,
It looks like one of our dependencies bumped greenlet to 2.0.1.
Last night we had a failure on our server:

!!! uWSGI process 2603100 got Segmentation Fault !!!
*** backtrace of 2603100 ***
uWSGI worker [634941: backend](uwsgi_backtrace+0x2a) [0x558b2fb529ca]
uWSGI worker [634941: backend](uwsgi_segfault+0x23) [0x558b2fb52db3]
/lib/x86_64-linux-gnu/libc.so.6(+0x37840) [0x7fdd9c614840]
.../lib/python3.7/site-packages/greenlet/_greenlet.cpython-37m-x86_64-linux-gnu.so(_ZN24ThreadState_DestroyNoGIL19DestroyQueueWithGILEPv+0x$
.../3.7.7/lib/libpython3.7m.so.1.0(Py_MakePendingCalls+0x112) [0x7fdd9c187662]
.../3.7.7/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x6b79) [0x7fdd9c145779]
.../3.7.7/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x11a) [0x7fdd9c094c6a]
.../3.7.7/lib/libpython3.7m.so.1.0(_PyObject_CallMethodId+0x2de) [0x7fdd9c11548e]
.../3.7.7/lib/libpython3.7m.so.1.0(+0x21ff04) [0x7fdd9c1a1f04]
.../3.7.7/lib/libpython3.7m.so.1.0(Py_FinalizeEx+0x1f) [0x7fdd9c1bb98f]
uWSGI worker [634941: backend](uwsgi_plugins_atexit+0x71) [0x558b2fb50d51]
/lib/x86_64-linux-gnu/libc.so.6(+0x39d8c) [0x7fdd9c616d8c]
/lib/x86_64-linux-gnu/libc.so.6(+0x39eba) [0x7fdd9c616eba]
uWSGI worker [634941: backend](+0x2db5f) [0x558b2fb05b5f]
uWSGI worker [634941: backend](end_me+0x25) [0x558b2fb4f8d5]
uWSGI worker [634941: backend](uwsgi_ignition+0x138) [0x558b2fb52fa8]
uWSGI worker [634941: backend](uwsgi_worker_run+0x280) [0x558b2fb56650]
uWSGI worker [634941: backend](uwsgi_run+0x454) [0x558b2fb56bd4]
uWSGI worker [634941: backend](+0x2a4ee) [0x558b2fb024ee]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fdd9c60109b]
uWSGI worker [634941: backend](_start+0x2a) [0x558b2fb0251a]
*** end of backtrace ***

This reminds me of #325, because of _ZN24ThreadState_DestroyNoGIL19DestroyQueueWithGILEPv.
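
For readers unfamiliar with C++ name mangling, the symbol above decodes to a method on a class. The following is a minimal sketch of a demangler that handles only simple nested names with builtin parameter types; the function name `demangle_nested` and the small type table are illustrative assumptions, and a real tool such as `c++filt` covers the full Itanium C++ ABI.

```python
import re

# Builtin type codes from the Itanium C++ ABI. Only a small subset is handled.
_BUILTINS = {"v": "void", "b": "bool", "c": "char", "i": "int", "l": "long", "d": "double"}


def demangle_nested(symbol):
    """Demangle a simple nested name such as _ZN3Foo3barEPv (hypothetical helper)."""
    if not symbol.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos = 3
    names = []
    # Each name component is a decimal length followed by that many characters.
    while pos < len(symbol) and symbol[pos].isdigit():
        digits = re.match(r"\d+", symbol[pos:]).group()
        pos += len(digits)
        names.append(symbol[pos:pos + int(digits)])
        pos += int(digits)
    if not names or pos >= len(symbol) or symbol[pos] != "E":
        raise ValueError("unsupported mangled name")
    pos += 1
    params = []
    # Parameters: zero or more 'P' (pointer) prefixes, then one builtin code.
    while pos < len(symbol):
        stars = 0
        while pos < len(symbol) and symbol[pos] == "P":
            stars += 1
            pos += 1
        if pos >= len(symbol) or symbol[pos] not in _BUILTINS:
            raise ValueError("unsupported parameter encoding")
        params.append(_BUILTINS[symbol[pos]] + "*" * stars)
        pos += 1
    # A lone 'v' means the function takes no parameters.
    if params == ["void"]:
        params = []
    return "::".join(names) + "(" + ", ".join(params) + ")"


print(demangle_nested("_ZN24ThreadState_DestroyNoGIL19DestroyQueueWithGILEPv"))
```

The crashing frame is therefore `DestroyQueueWithGIL`, a method of `ThreadState_DestroyNoGIL` that takes a single `void*` argument.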

@maximekl changed the title from "uWSGI process got Segmentation Fault !!!" to "uWSGI process got Segmentation Fault" on Nov 10, 2022
@jamadden
Contributor

Thank you for the report. From the backtrace, this appears to be happening while the process is shutting down (Py_Finalize has been called).

I haven't been able to reproduce it, though, so any details that would assist in that would be appreciated.

@maximekl
Author

Using uWSGI 2.0.20 (64bit)

Logs for one process:

spawned uWSGI worker 1 (pid: 3573452, cores: 1)
spawned 1 offload threads for uWSGI worker 1
WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x5631c694ff70 pid: 3573452 (default app)
[busyness] 5s average busyness is at 92%, will spawn 1 new worker(s)
spawned uWSGI worker 2 (pid: 3574020, cores: 1)
spawned 1 offload threads for uWSGI worker 2
WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x5631c694ff70 pid: 3574020 (default app)
[busyness] 5s average busyness is at 1%, cheap one of 2 running workers
[HERE SEGFAULT]
worker 1 killed successfully (pid: 3573452)
uWSGI worker 1 cheaped.

From the logs, I think this happens when we spawn more workers and then remove one.

@jamadden
Contributor

Thank you, but there's a lot more that I need to know to reproduce this (trust me, I tried for two days and could not reproduce a crash with any combination of uWSGI args and app code). Things like:

  • uWSGI has numerous settings that change how it works. How are you configuring uWSGI?
  • What is your app doing in the worker? Can you produce a minimal app that exhibits the same problem?
  • If you can't provide any of that, can you compile and run a version of greenlet with debugging (minimally, the -g -Og compiler arguments) so we perhaps get a slightly more useful backtrace when it crashes for you?
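
As a rough sketch of the last suggestion, the script below assembles a pip command that rebuilds greenlet from source with the `-g -Og` flags. It assumes the build honors the `CFLAGS` and `CXXFLAGS` environment variables, which is typical for setuptools on Linux but is not confirmed anywhere in this thread; the helper name `debug_build_command` is hypothetical.

```python
import os
import shlex
import sys


def debug_build_command(package="greenlet", flags="-g -Og"):
    """Return (command, environment) for a from-source debug build (hypothetical helper)."""
    env = dict(os.environ)
    # Assumption: the extension build reads these variables for compiler flags.
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    cmd = [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",      # replace the already-installed wheel
        "--no-cache-dir",         # avoid reusing a previously built wheel
        "--no-binary", package,   # force a source build for this package only
        package,
    ]
    return cmd, env


cmd, env = debug_build_command()
# Only print the command here; run it with subprocess.run(cmd, env=env, check=True).
print("CFLAGS=%r CXXFLAGS=%r %s" % (env["CFLAGS"], env["CXXFLAGS"], shlex.join(cmd)))
```

A compiler toolchain must be available in the environment where this runs, as in the `build-essential` install step used by the Dockerfile later in the thread.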

@maximekl
Author

Hi, I will try my best to give you more details. Here are some new traces:

WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x55dc54f91f70 pid: 1552112 (default app)
[busyness] 5s average busyness is at 0%, cheap one of 4 running workers
terminate called after throwing an instance of 'greenlet::TypeError'
  what():  Expected a greenlet
worker 2 killed successfully (pid: 1539874)

@icu0755

icu0755 commented Feb 23, 2023

Hi all!
I got a similar backtrace in my app and could narrow it down to the following code.

# wsgi.py
from boolean_parser.parsers import Parser

def application(env, start_response):
    start_response('200 OK', [('Content-Type','text/html')])
    return [b"Hello World"]

I start it with the following uwsgi settings (Dockerfile, then build and run commands):

# Dockerfile
FROM python:3.8-slim AS app
RUN apt-get update \
    && apt-get install --no-install-recommends -y \
    build-essential
RUN pip install uwsgi boolean-parser
ADD wsgi.py /code/
CMD ["uwsgi", "--http", ":8000", "--wsgi-file", "/code/wsgi.py", "--processes", "2", "--strict", "--max-requests", "1", "--master", "--lazy-apps"]

# Build and run
docker build . -t foo
docker run -it --rm -p 8000:8000 foo

--max-requests 1 makes uwsgi shut down the worker after 1 request, which triggers the error.
I've noticed that it only reproduces with --lazy-apps. I'm not sure if it is a greenlet issue.

Here are the logs

...The work of process 8 is done. Seeya!
!!! uWSGI process 8 got Segmentation Fault !!!
*** backtrace of 8 ***
uwsgi(uwsgi_backtrace+0x2f) [0x5562b3af009f]
uwsgi(uwsgi_segfault+0x23) [0x5562b3af0463]
/lib/x86_64-linux-gnu/libc.so.6(+0x38d60) [0x7fc291ac1d60]
/usr/local/lib/python3.8/site-packages/greenlet/_greenlet.cpython-38-x86_64-linux-gnu.so(_ZN24ThreadState_DestroyNoGIL19DestroyQueueWithGILEPv+0x230) [0x7fc2909f1640]
/usr/local/lib/libpython3.8.so.1.0(+0xe9f2e) [0x7fc291d47f2e]
/usr/local/lib/libpython3.8.so.1.0(+0xe74a2) [0x7fc291d454a2]
/usr/local/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x1ca) [0x7fc291db073a]
/usr/local/lib/libpython3.8.so.1.0(+0x152552) [0x7fc291db0552]
/usr/local/lib/libpython3.8.so.1.0(_PyObject_CallMethodId+0xb3) [0x7fc291e02d23]
/usr/local/lib/libpython3.8.so.1.0(+0x1cb0c8) [0x7fc291e290c8]
/usr/local/lib/libpython3.8.so.1.0(Py_FinalizeEx+0x20) [0x7fc291e28f20]
uwsgi(uwsgi_plugins_atexit+0x71) [0x5562b3aed401]
/lib/x86_64-linux-gnu/libc.so.6(+0x3b4d7) [0x7fc291ac44d7]
/lib/x86_64-linux-gnu/libc.so.6(+0x3b67a) [0x7fc291ac467a]
uwsgi(+0x3795f) [0x5562b3aa695f]
uwsgi(simple_goodbye_cruel_world+0x59) [0x5562b3aefa19]
uwsgi(+0x80a58) [0x5562b3aefa58]
uwsgi(uwsgi_close_request+0x54f) [0x5562b3aa767f]
uwsgi(simple_loop_run+0xd0) [0x5562b3aec280]
uwsgi(simple_loop+0x6f) [0x5562b3aec34f]
uwsgi(uwsgi_ignition+0x279) [0x5562b3af0799]
uwsgi(uwsgi_worker_run+0x25e) [0x5562b3af4d7e]
uwsgi(uwsgi_run+0x434) [0x5562b3af52d4]
uwsgi(+0x3470e) [0x5562b3aa370e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7fc291aacd0a]
uwsgi(_start+0x2a) [0x5562b3aa373a]
*** end of backtrace ***

I hope I could help :)
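
The backtraces in this thread all use the `module(symbol+offset) [address]` frame format, so the interesting frames can be pulled out mechanically. Below is a minimal sketch; the helper names and the `site-packages` heuristic for spotting third-party extension frames are assumptions for illustration, and the sample lines are copied from the log above.

```python
import re

# One frame per line: module(symbol+0xoffset) [0xaddress]; the symbol may be empty.
FRAME_RE = re.compile(
    r"^(?P<module>.+?)\((?P<symbol>[^()+]*)\+(?P<offset>0x[0-9a-f]+)\) "
    r"\[(?P<address>0x[0-9a-f]+)\]$"
)

SAMPLE = """\
*** backtrace of 8 ***
uwsgi(uwsgi_segfault+0x23) [0x5562b3af0463]
/lib/x86_64-linux-gnu/libc.so.6(+0x38d60) [0x7fc291ac1d60]
/usr/local/lib/python3.8/site-packages/greenlet/_greenlet.cpython-38-x86_64-linux-gnu.so(_ZN24ThreadState_DestroyNoGIL19DestroyQueueWithGILEPv+0x230) [0x7fc2909f1640]
/usr/local/lib/libpython3.8.so.1.0(Py_FinalizeEx+0x20) [0x7fc291e28f20]
*** end of backtrace ***
"""


def parse_frames(backtrace):
    """Return a list of dicts, one per line that matches the frame format."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def first_third_party_frame(frames):
    """Heuristic (assumption): the first frame loaded from site-packages."""
    for frame in frames:
        if "site-packages" in frame["module"]:
            return frame
    return None


def crashed_during_finalize(frames):
    """True if Py_FinalizeEx is on the stack, i.e. the interpreter was shutting down."""
    return any(frame["symbol"] == "Py_FinalizeEx" for frame in frames)


frames = parse_frames(SAMPLE)
print(first_third_party_frame(frames)["symbol"])
print(crashed_during_finalize(frames))
```

Lines that do not match the frame format, such as the `***` banners or a truncated frame, are skipped rather than treated as errors.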

@alexmv

alexmv commented Mar 1, 2023

This bisects to c7841e3, and can be simplified to:

from greenlet import getcurrent

getcurrent()


def application(env, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"Hello World"]

I found that it reproduces more reliably when stopping the Docker container, rather than making a request:

#!/usr/bin/env bash
set -eux

cd "$(dirname "$0")"
docker build . -t uwsgi-greenlet-crash
(
        sleep 3
        docker stop "$(docker ps -ql)" -t 1
) &
docker run -it --rm -p 8000:8000 uwsgi-greenlet-crash

@alexmv

alexmv commented Mar 20, 2023

I lied -- that commit happened to pass the race condition, and wasn't the actual culprit.

AFAICT a421362 is. During ThreadStateCreator_Destroy the state->borrow_main_greenlet() gets pushed onto the cleanup queue, but when _ThreadStateCreator_DestroyAll is called immediately after, it pops off a null.

rectalogic added a commit to rectalogic/greenlet that referenced this issue May 10, 2023
@rectalogic
Contributor

Working from latest master, this issue repros every time for me. It looks like the tricks it is playing aren't working:

// We play some tricks with placement new to be able to allocate this
// object statically still, so that references to its members don't
// incur an extra pointer indirection.

After adding some debugging, I see that the GreenletGlobals constructor is called twice, and then the destructor is called while thread_states_to_destroy still contains a thread state. It then attempts to destroy that thread state, which is now gone (and was never destroyed).

This is the debugging I added: master...rectalogic:greenlet:crash#diff-456554c7ddb2c4b7e56ce6326b9f03dedb03a89a243b924f7fc312737f83b22c

And this is what it prints (see the lines prefixed DEBUG):

spawned uWSGI master process (pid: 35)
spawned uWSGI worker 1 (pid: 36, cores: 1)
DEBUG GreenletGlobals(dummy) -197907296
DEBUG GreenletGlobals() -197907296
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x555cd6ff2ba0 pid: 36 (default app)
^CSIGINT/SIGTERM received...killing workers...
DEBUG queue_to_destroy -195593936
DEBUG ~GreenletGlobals() -197907296 thread_states_to_destroy.size()==1
DEBUG take_next_to_destroy 0
DEBUG DestroyWithGIL 0
!!! uWSGI process 36 got Segmentation Fault !!!
*** backtrace of 36 ***
uwsgi(uwsgi_backtrace+0x2f) [0x555cd5cc10bf]
uwsgi(uwsgi_segfault+0x23) [0x555cd5cc1483]
/lib/x86_64-linux-gnu/libc.so.6(+0x38d60) [0x7f12f4791d60]
/usr/local/lib/python3.10/site-packages/greenlet/_greenlet.cpython-310-x86_64-linux-gnu.so(_ZN24ThreadState_DestroyNoGIL19DestroyQueueWithGILEPv+0x62) [0x7f12f4339132]
/usr/local/lib/libpython3.10.so.1.0(+0x99e11) [0x7f12f49c6e11]
/usr/local/lib/libpython3.10.so.1.0(+0x67851) [0x7f12f4994851]
/usr/local/lib/libpython3.10.so.1.0(+0x165d74) [0x7f12f4a92d74]
/usr/local/lib/libpython3.10.so.1.0(+0x160bb6) [0x7f12f4a8dbb6]
/usr/local/lib/libpython3.10.so.1.0(+0x1c1813) [0x7f12f4aee813]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_CallMethodIdObjArgs+0x107) [0x7f12f4af2be7]
/usr/local/lib/libpython3.10.so.1.0(PyImport_ImportModuleLevelObject+0x293) [0x7f12f4aa4b83]
/usr/local/lib/libpython3.10.so.1.0(+0x1c99e8) [0x7f12f4af69e8]
/usr/local/lib/libpython3.10.so.1.0(+0x17068e) [0x7f12f4a9d68e]
/usr/local/lib/libpython3.10.so.1.0(_PyObject_MakeTpCall+0x80) [0x7f12f4a9a580]
/usr/local/lib/libpython3.10.so.1.0(+0x1607d7) [0x7f12f4a8d7d7]
/usr/local/lib/libpython3.10.so.1.0(PyObject_CallFunction+0xaf) [0x7f12f4aeb74f]
/usr/local/lib/libpython3.10.so.1.0(PyImport_Import+0xdc) [0x7f12f4af682c]
/usr/local/lib/libpython3.10.so.1.0(PyImport_ImportModule+0x19) [0x7f12f4af6739]
uwsgi(uwsgi_python_atexit+0x9a) [0x555cd5ccc3fa]
uwsgi(uwsgi_plugins_atexit+0x71) [0x555cd5cbe421]
/lib/x86_64-linux-gnu/libc.so.6(+0x3b4d7) [0x7f12f47944d7]
/lib/x86_64-linux-gnu/libc.so.6(+0x3b67a) [0x7f12f479467a]
uwsgi(+0x3797f) [0x555cd5c7797f]
uwsgi(end_me+0x25) [0x555cd5cbe465]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13140) [0x7f12f4db8140]
/lib/x86_64-linux-gnu/libc.so.6(epoll_wait+0x16) [0x7f12f4854d16]
uwsgi(event_queue_wait+0x25) [0x555cd5cb4675]
uwsgi(wsgi_req_accept+0x10a) [0x555cd5c7523a]
uwsgi(simple_loop_run+0xb6) [0x555cd5cbd286]
uwsgi(simple_loop+0x6f) [0x555cd5cbd36f]
uwsgi(uwsgi_ignition+0x279) [0x555cd5cc17b9]
uwsgi(uwsgi_worker_run+0x25e) [0x555cd5cc5d9e]
uwsgi(uwsgi_run+0x434) [0x555cd5cc62f4]
uwsgi(+0x3472e) [0x555cd5c7472e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f12f477cd0a]
uwsgi(_start+0x2a) [0x555cd5c7475a]
*** end of backtrace ***

rectalogic added a commit to rectalogic/greenlet that referenced this issue May 11, 2023
The destructor runs before the last take_next_to_destroy - and so it
gets a NULL pointer and crashes.

Fixes python-greenlet#330
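
The ordering problem described in the commit message can be modeled in a few lines of Python. This is only a toy model under stated assumptions: `CleanupQueue`, `destroy_unchecked`, and `destroy_checked` are hypothetical names, `None` stands in for the C++ null pointer, and the guard shown is one possible mitigation rather than a copy of the actual patch.

```python
class CleanupQueue:
    """Toy stand-in for the queue of thread states awaiting destruction."""

    def __init__(self):
        self._items = []

    def push(self, state):
        self._items.append(state)

    def clear(self):
        # Models the static destructor emptying the queue at process exit.
        self._items.clear()

    def take_next(self):
        # Returns None when empty, like the "take_next_to_destroy 0" debug line.
        return self._items.pop() if self._items else None


def destroy_unchecked(queue):
    state = queue.take_next()
    # Subscripting None raises TypeError: the Python analog of a null dereference.
    return state["main_greenlet"]


def destroy_checked(queue):
    state = queue.take_next()
    if state is None:
        return None  # nothing left to destroy, so do nothing
    return state["main_greenlet"]


def simulate(destroy):
    queue = CleanupQueue()
    queue.push({"main_greenlet": "main"})  # a thread exits and queues its state
    queue.clear()                          # the destructor runs first
    return destroy(queue)                  # the pending call runs afterwards


print(simulate(destroy_checked))
```

Calling `simulate(destroy_unchecked)` raises `TypeError`, which corresponds to the segmentation fault in the C++ extension, while the checked variant returns without touching the missing state.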
jamadden added a commit that referenced this issue Jun 20, 2023
alexmv added a commit to alexmv/zulip that referenced this issue Aug 23, 2023
alexmv added a commit to alexmv/zulip that referenced this issue Aug 24, 2023
alexmv added a commit to alexmv/zulip that referenced this issue Aug 30, 2023
timabbott pushed a commit to zulip/zulip that referenced this issue Aug 30, 2023
@jamadden
Contributor

I apologize, it looks like I never said thanks for all the detective work in this thread.

Vector73 pushed a commit to Vector73/zulip that referenced this issue Sep 7, 2023
freshpex pushed a commit to freshpex/zulip that referenced this issue Sep 14, 2023
andersk pushed a commit to andersk/zulip that referenced this issue Sep 15, 2023
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Oct 29, 2023
Switch to wheel.mk.

3.0.1 (2023-10-25)
==================

- Fix a potential crash on Python 3.8 at interpreter shutdown time.
  This was a regression from earlier 3.0.x releases. Reported by Matt
  Wozniski in `issue 376 <https://github.com/python-greenlet/greenlet/issues/376>`_.



3.0.0 (2023-10-02)
==================

- No changes from 3.0rc3 aside from the version number.


3.0.0rc3 (2023-09-12)
=====================

- Fix an intermittent error during process termination on some
  platforms (GCC/Linux/libstdc++).


3.0.0rc2 (2023-09-09)
=====================

- Fix some potential bugs (assertion failures and memory leaks) in
  previously-untested error handling code. In some cases, this means
  that the process will execute a controlled ``abort()`` after severe
  trouble when previously the process might have continued for some
  time with a corrupt state. It is unlikely those errors occurred in
  practice.
- Fix some assertion errors and potential bugs with re-entrant
  switches.
- Fix a potential crash when certain compilers compile greenlet with
  high levels of optimization. The symptom would be that switching to
  a greenlet for the first time immediately crashes.
- Fix a potential crash when the callable object passed to the
  greenlet constructor (or set as the ``greenlet.run`` attribute) has
  a destructor attached to it that switches. Typically, triggering
  this issue would require an unlikely subclass of
  ``greenlet.greenlet``.
- Python 3.11+: Fix rare switching errors that could occur when a
  garbage collection was triggered during the middle of a switch, and
  Python-level code in ``__del__`` or weakref callbacks switched to a
  different greenlet and ultimately switched back to the original
  greenlet. This often manifested as a ``SystemError``: "switch
  returned NULL without an exception set."

For context on the fixes, see `gevent issue #1985
<https://github.com/gevent/gevent/issues/1985>`_.

3.0.0rc1 (2023-09-01)
=====================

- Windows wheels are linked statically to the C runtime in an effort
  to prevent import errors on systems without the correct C runtime
  installed. It's not clear if this will make the situation better or
  worse, so please share your experiences in `issue 346
  <https://github.com/python-greenlet/greenlet/issues/346>`_.

  Note that this only applies to the binary wheels found on PyPI.
  Building greenlet from source defaults to the shared library. Set
  the environment variable ``GREENLET_STATIC_RUNTIME=1`` at build time
  to change that.
- Build binary wheels for Python 3.12 on macOS.
- Fix compiling greenlet on a debug build of CPython 3.12. There is
  `one known issue
  <https://github.com/python-greenlet/greenlet/issues/368>`_ that
  leads to an interpreter crash on debug builds.
- Python 3.12: Fix walking the frame stack of suspended greenlets.
  Previously accessing ``glet.gr_frame.f_back`` would crash due to
  `changes in CPython's undocumented internal frame handling <https://github.com/python/cpython/commit/1e197e63e21f77b102ff2601a549dda4b6439455>`_.

Platforms
---------
- Now, greenlet *may* compile and work on Windows ARM64 using
  llvm-mingw, but this is untested and unsupported. See `PR
  <https://github.com/python-greenlet/greenlet/pull/224>`_ by Adrian
  Vladu.
- Now, greenlet *may* compile and work on LoongArch64 Linux systems,
  but this is untested and unsupported. See `PR 257
  <https://github.com/python-greenlet/greenlet/pull/257/files>`_ by merore.

Known Issues
------------

- There may be (very) subtle issues with tracing on Python 3.12, which
  has redesigned the entire tracing infrastructure.

3.0.0a1 (2023-06-21)
====================

- Build binary wheels for S390x Linux. See `PR 358
  <https://github.com/python-greenlet/greenlet/pull/358>`_ from Steven
  Silvester.
- Fix a rare crash on shutdown seen in uWSGI deployments. See `issue
  330 <https://github.com/python-greenlet/greenlet/issues/330>`_ and `PR 356
  <https://github.com/python-greenlet/greenlet/pull/356>`_ from Andrew
  Wason.
- Make the platform-specific low-level C/assembly snippets stop using
  the ``register`` storage class. Newer versions of standards remove
  this storage class, and it has been generally ignored by many
  compilers for some time. See `PR 347
  <https://github.com/python-greenlet/greenlet/pull/347>`_ from Khem
  Raj.
- Add initial support for Python 3.12. See `issue
  <https://github.com/python-greenlet/greenlet/issues/323>`_ and `PR
  <https://github.com/python-greenlet/greenlet/pull/327>`_; thanks go
  to (at least) Michael Droettboom, Andreas Motl, Thomas A Caswell,
  raphaelauv, Hugo van Kemenade, Mark Shannon, and Petr Viktorin.
- Remove support for end-of-life Python versions, including Python
  2.7, Python 3.5 and Python 3.6.
- Require a compiler that supports ``noinline`` directives. See
  `issue 271
  <https://github.com/python-greenlet/greenlet/issues/266>`_.
- Require a compiler that supports C++11.