Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mod_wsgi child thread segfaults when something is printed to stdout #147

Closed
bkabrda opened this issue Aug 15, 2016 · 31 comments
Closed

mod_wsgi child thread segfaults when something is printed to stdout #147

bkabrda opened this issue Aug 15, 2016 · 31 comments

Comments

@bkabrda
Copy link

bkabrda commented Aug 15, 2016

Hi, so I've been using mod_wsgi express successfully for couple of months now. Today I updated to 4.5.4 and part of my app that did a forgotten debug print() started to fail:

[Mon Aug 15 09:25:41.253064 2016] [core:notice] [pid 1:tid 140002568149184] AH00052: child pid 2511 exit signal Segmentation fault (11)

When I downgraded to 4.5.3, the issue disappeared, which makes me think this is a regression in 4.5.4.

I'm using Python 3.5.1 and httpd 2.4.23; this is how I start my application:

mod_wsgi-express start-server --entry-point myapp --application-type module --callable-object app --port 5000 --user myapp --group myapp --processes 4 --threads 10 --reload-on-changes --access-log --access-log-format "%h %l %u %t \"%r\" %>s %b %{Referer}i \"%{User-agent}i\"" --log-to-terminal --python-eggs /home/myapp --include-file /etc/httpd/conf.d/myapp-httpd.conf --modules-directory /usr/lib64/httpd/modules
@GrahamDumpleton
Copy link
Owner

Most likely a regression. The person who asked for the changes related to logging never responded when asked to test them prior to release. Eventually came back complaining that couldn't download official release with changes already in it. After pointing out once again they need to test from repo, still didn't get an acknowledgement that okay and didn't cause any issues. My own quick testing didn't pick up an issue, so couldn't really keep holding it back. Obviously missed something. Will look into it.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

No problem. I would have looked into this myself but unfortunately have no time to do that ATM. Feel free to ping me if you need some more information about my environment or how to reproduce this.

@GrahamDumpleton
Copy link
Owner

Where would the print() have been called from? Global scope in module when being imported? In the context of a handler for a request?

What made you determine that it was the print() if the process was crashing? Did the print() cause the process to crash the very first time was called, or at some random call?

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

  • I tried calling the print from a flask view function, so handler for a request.
  • The thread crashed every time when the print was there and I hit the endpoint that contained it. It never crashed when I removed the print. (I didn't do any more sophisticated debugging that that :))

@GrahamDumpleton
Copy link
Owner

Was there anything fancy about the print() call, or was just plain print("string")?

@GrahamDumpleton
Copy link
Owner

What OS are you on? Linux?

If you have gdb installed, can you try adding the options:

--debug-mode --enable-gdb

and if that doesn't cause startup itself to fail, see if that gives you a back trace in gdb for where it crashed.

If you are on MacOS X this may well not work. Not sure why but trying to use gdb on MacOS X causes Apache to crash in a strange way.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

It's Fedora 24 docker container, I didn't realize that I should mention that as well. I can try using --debug-mode and --enable-gdb, but I'm not sure what that will do in docker container... I'll try and see.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Oh, and to answer the previous question, it was just a plain print("something"), no fancy stuff.

@GrahamDumpleton
Copy link
Owner

Is unlikely you would have gdb in Docker container. Even if did, would likely need to run container manually and with -it option in an interactive terminal.

@GrahamDumpleton
Copy link
Owner

Hmm, wondering whether issue may be Python 3. I only tested with Python 2.7 as wouldn't have expected Python 3 to be different as far as changes made.

@GrahamDumpleton
Copy link
Owner

Yep, can crash it with Python 3.4.

@GrahamDumpleton
Copy link
Owner

Location found. Due to differences in file type objects in Python 3. Not sure what yet.

Program received signal SIGSEGV, Segmentation fault.
0x000000010000af33 in ap_log_rerror_ ()
(gdb) where
#0  0x000000010000af33 in ap_log_rerror_ ()
#1  0x000000010131c017 in Log_call ()
   from /Users/graham/Python/mod_wsgi-py34/lib/python3.4/site-packages/mod_wsgi/server/mod_wsgi-py34.so
#2  0x000000010131c0a1 in Log_flush ()
   from /Users/graham/Python/mod_wsgi-py34/lib/python3.4/site-packages/mod_wsgi/server/mod_wsgi-py34.so
#3  0x0000000101347a08 in PyObject_Call ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#4  0x0000000101348206 in callmethod ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#5  0x00000001013491f7 in _PyObject_CallMethodId_SizeT ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#6  0x0000000101347a08 in PyObject_Call ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#7  0x0000000101348206 in callmethod ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#8  0x00000001013491f7 in _PyObject_CallMethodId_SizeT ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#9  0x00000001014a085a in textiowrapper_close ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#10 0x0000000101347a08 in PyObject_Call ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#11 0x0000000101415d97 in PyEval_CallObjectWithKeywords ()
   from /Library/Frameworks/Python.framework/Versions/3.4/Python
#12 0x0000000101314183 in wsgi_execute_script ()
   from /Users/graham/Python/mod_wsgi-py34/lib/python3.4/site-packages/mod_wsgi/server/mod_wsgi-py34.so
#13 0x00000001013095ae in wsgi_hook_handler ()
   from /Users/graham/Python/mod_wsgi-py34/lib/python3.4/site-packages/mod_wsgi/server/mod_wsgi-py34.so
#14 0x0000000100001a85 in ap_run_handler ()
#15 0x000000010000232e in ap_invoke_handler ()
#16 0x0000000100063bd5 in ap_process_async_request ()
#17 0x0000000100063cb4 in ap_process_request ()
#18 0x000000010005e428 in ap_process_http_sync_connection ()
#19 0x000000010005e002 in ap_process_http_connection ()
#20 0x00000001000209d5 in ap_run_process_connection ()
#21 0x000000010002102e in ap_process_connection ()
#22 0x000000010007542d in child_main ()
#23 0x000000010007409e in make_child ()
#24 0x00000001000725f0 in prefork_run ()
#25 0x0000000100024515 in ap_run_mpm ()
#26 0x000000010000f308 in main ()

@GrahamDumpleton
Copy link
Owner

@bkabrda Can you try your application with:

pip install -U https://codeload.github.com/GrahamDumpleton/mod_wsgi/tar.gz/develop#mod_wsgi-4.5.5

and see if that eliminates the problem?

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Sure, give me like an hour and I'll post back the results... I'll have to rebuild a pretty big image and then try testing thoroughly.

@GrahamDumpleton
Copy link
Owner

I am going to take a punt and release it, it can't be worse. My own Docker images are a pain to rebuild based on a branch.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Hmm, it seems that this doesn't fix the problem. The issue is still the same.

@GrahamDumpleton
Copy link
Owner

Very important to make sure is now running 4.5.5. The #mod_wsgi-4.5.5 tag on end is really important if running pip manually to install correct version.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Yup, I copied the whole string and also pip3 lists mod_wsgi as 4.5.5 in the running container:

$ docker exec -ti c45427b7041e /bin/bash
[root@c45427b7041e /]# pip3 list | grep mod-wsgi
mod-wsgi (4.5.5)

Here's something interesting - I tried using faulthandler.enable() in the global context and it raised as well - I'm guessing it's trying to log something:

[Mon Aug 15 12:43:17.939477 2016] [wsgi:error] [pid 408:tid 139673613674688] Traceback (most recent call last):
[Mon Aug 15 12:43:17.939512 2016] [wsgi:error] [pid 408:tid 139673613674688]   File "/tmp/mod_wsgi-localhost:5000:0/handler.wsgi", line 94, in <module>
[Mon Aug 15 12:43:17.939517 2016] [wsgi:error] [pid 408:tid 139673613674688]     recorder_directory=recorder_directory)
[Mon Aug 15 12:43:17.939525 2016] [wsgi:error] [pid 408:tid 139673613674688]   File "/usr/lib64/python3.5/site-packages/mod_wsgi/server/__init__.py", line 1275, in __init__
[Mon Aug 15 12:43:17.939527 2016] [wsgi:error] [pid 408:tid 139673613674688]     __import__(entry_point)
[Mon Aug 15 12:43:17.939533 2016] [wsgi:error] [pid 408:tid 139673613674688]   File "/usr/lib/python3.5/site-packages/myapp/__init__.py", line 78, in <module>
[Mon Aug 15 12:43:17.939555 2016] [wsgi:error] [pid 408:tid 139673613674688]     app = create_app()
[Mon Aug 15 12:43:17.939563 2016] [wsgi:error] [pid 408:tid 139673613674688]   File "/usr/lib/python3.5/site-packages/myapp/__init__.py", line 24, in create_app
[Mon Aug 15 12:43:17.939566 2016] [wsgi:error] [pid 408:tid 139673613674688]     from .api_v1 import api_v1
[Mon Aug 15 12:43:17.939573 2016] [wsgi:error] [pid 408:tid 139673613674688]   File "/usr/lib/python3.5/site-packages/myapp/api_v1.py", line 45, in <module>
[Mon Aug 15 12:43:17.939577 2016] [wsgi:error] [pid 408:tid 139673613674688]     faulthandler.enable()
[Mon Aug 15 12:43:17.939592 2016] [wsgi:error] [pid 408:tid 139673613674688] OSError: Apache/mod_wsgi log object is not associated with a file descriptor.

@GrahamDumpleton
Copy link
Owner

The fault handler is being naughty. It is assuming that stdout/stderr streams are associated with a file descriptor that it can write to directly. That isn't the case under mod_wsgi.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Ah, ok, sorry for the confusion then. Anything else I can try to help you debugging this?

@GrahamDumpleton
Copy link
Owner

I have pushed a further change. Not confident that this will solve it. I can't replicate with this change, but after first change I also wasn't seeing issue until you came back and said it was still failing, at which point it started showing for me again.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Ok, I'll test again, give me couple of minutes (it seems that an hour was an overestimation :))

@GrahamDumpleton
Copy link
Owner

BTW. Instead of that URL gave before, can use:

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

I tried rebuilding the image with the develop.zip archive, but it's still failing for me with the same error message.

@GrahamDumpleton
Copy link
Owner

Yep it will fail. I have been getting caught over the trap of using python setup.py install and pip into same Python installation. Hasn't always been giving me the correct version.

@GrahamDumpleton
Copy link
Owner

Giving up for the night. Getting memory corruption all over the place now and can’t even get sane stack traces from gdb. Nothing making sense. Will start over tomorrow.

@bkabrda
Copy link
Author

bkabrda commented Aug 15, 2016

Sure. Give me a shout when you need testing again. Good night! :)

@GrahamDumpleton
Copy link
Owner

Reason for problem found. Under Python 3 file like objects when used to replace sys.stdout and sys.stder have to be wrapped in a TextIOWrapper instance. The new code wasn't taking this into consideration and I was casting the TextIOWrapper instance to my file like object type (implemented as C). So expecting a completely different object layout and was blowing up.

@bkabrda
Copy link
Author

bkabrda commented Aug 16, 2016

4.5.6 works for me, thanks!

@GrahamDumpleton
Copy link
Owner

Closing this issue as presume all fixed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants