Segfault in 1.2.0 #916

Closed
nedbat opened this Issue Dec 25, 2016 · 15 comments

5 participants
@nedbat

nedbat commented Dec 25, 2016

  • gevent version: 1.2.0
  • Python version: 2.7.13
  • Operating System: OS X 10.10.5

Description:

The coverage.py test suite includes concurrency tests using gevent. They pass with gevent 1.1.2 but segfault with 1.2.0.

What I've run:

from gevent import monkey
monkey.patch_thread()
import threading
import gevent.queue as queue

class Producer(threading.Thread):
    def __init__(self, limit, q):
        threading.Thread.__init__(self)
        self.limit = limit
        self.q = q

    def run(self):
        for i in range(self.limit):
            self.q.put(i)
        self.q.put(None)

class Consumer(threading.Thread):
    def __init__(self, q, qresult):
        threading.Thread.__init__(self)
        self.q = q
        self.qresult = qresult

    def run(self):
        sum = 0
        while True:
            i = self.q.get()
            if i is None:
                break
            sum += i
        self.qresult.put(sum)

def sum_range(limit):
    q = queue.Queue()
    qresult = queue.Queue()
    c = Consumer(q, qresult)
    p = Producer(limit, q)
    c.start()
    p.start()

    p.join()
    c.join()
    return qresult.get()

print(sum_range(1000))

Under 1.1.2, this prints 499500. Under 1.2.0, it segfaults. I've looked through the changelog, but nothing jumps out at me as a change that would affect this.

@jamadden

Member

jamadden commented Dec 25, 2016

Thanks for the report! Are you using the pre-built binary wheels from PyPI or did you build gevent yourself? If the former, could you see if it still happens if you install from source?

@jamadden jamadden added the bug label Dec 25, 2016

@nedbat

nedbat commented Dec 25, 2016

I simply used "pip install gevent==1.2.0".

$ .tox/py27/bin/pip install --no-cache-dir gevent==1.2.0
Collecting gevent==1.2.0
  Downloading gevent-1.2.0-cp27-cp27m-macosx_10_6_intel.whl (1.1MB)
    100% |████████████████████████████████| 1.1MB 4.4MB/s
Requirement already satisfied: greenlet>=0.4.10 in ./.tox/py27/lib/python2.7/site-packages (from gevent==1.2.0)
Installing collected packages: gevent
Successfully installed gevent-1.2.0

BTW, it also segfaults with 3.4.5, 3.5.2, and 3.6.0. It works with 3.3.6.

I'll try installing from source.

@nedbat

nedbat commented Dec 25, 2016

When I install into 2.7 from source, it works.

@nedbat

nedbat commented Dec 25, 2016

Installing from source in 3.6 also works.

@jamadden

Member

jamadden commented Dec 25, 2016

Thanks, I had that suspicion. (On 3.3, I haven't been able to post wheels in a while because Cython is broken for that version, so it probably got installed from source.)

The wheels were generated by the python.org Python distribution's standard distutils with no CFLAGS, etc., set in the environment (because we've run into issues with that before). They were generated on 10.12, the current OS X/macOS release, with the current Xcode, which is probably the source of the problem. But it's not clear to me why there would be binary compatibility issues with that, because the python.org build sets 10.6 as the deployment target. I'll have to see if I can dig up some info.
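
For anyone who wants to check their own interpreter, the deployment target that distutils inherits can be read from the standard library's `sysconfig` module. This is only a minimal sketch: `describe_build_target` is a name made up for this example, and on platforms other than macOS the deployment target is simply `None`.

```python
import platform
import sysconfig


def describe_build_target():
    """Collect the platform facts that influence binary compatibility."""
    return {
        # Platform string used when building extensions and wheels.
        "build_platform": sysconfig.get_platform(),
        # Deployment target baked into this interpreter (None off macOS).
        "deployment_target": sysconfig.get_config_var("MACOSX_DEPLOYMENT_TARGET"),
        # OS release actually running (empty string off macOS).
        "running_macos": platform.mac_ver()[0],
    }


if __name__ == "__main__":
    for key, value in describe_build_target().items():
        print(key, "=", value)
```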

Can you post the crash report with the segfault stacktrace? That might help.

@monitorius

monitorius commented Dec 25, 2016

I've got the same problem on OSX 10.11.6, Python 3.6.0

~/temp/geventbug$ pip list
gevent (1.2.0)
greenlet (0.4.11)
pip (9.0.1)
setuptools (28.8.0)

~/temp/geventbug$ python -c "from gevent.pool import Pool; Pool(1)"
Segmentation fault: 11
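
When a check like this has to live inside a test suite, it may help to run the suspect code in a child interpreter so the crash does not take the test runner down with it. The following is a sketch under assumptions: `run_isolated` is a hypothetical helper, it needs Python 3.5+ for `subprocess.run`, and it relies on the POSIX convention that a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_isolated(code):
    """Run `code` in a fresh interpreter and describe how it ended."""
    proc = subprocess.run([sys.executable, "-c", code])
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        sig = signal.Signals(-proc.returncode)
        return "killed by " + sig.name
    return "exited with status %d" % proc.returncode


if __name__ == "__main__":
    # The one-liner from the report above; it needs gevent in this environment.
    print(run_isolated("from gevent.pool import Pool; Pool(1)"))
```

With an affected wheel installed, the child would be expected to die from SIGSEGV while the parent keeps running.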

And here is the backtrace from the core dump (I've never done this before; I hope it's what you need):

~/temp/geventbug$ lldb -c /cores/core.38078
(lldb) target create --core "/cores/core.38078"
warning: (x86_64) /cores/core.38078 load command 96 LC_SEGMENT_64 has a fileoff + filesize (0x2490f000) that extends beyond the end of the file (0x2490e000), the segment will be truncated to match
Core file '/cores/core.38078' (x86_64) was loaded.
(lldb) bt
* thread #1: tid = 0x0000, 0x0000000000000000, stop reason = signal SIGSTOP
  * frame #0: 0x0000000000000000
    frame #1: 0x000000010beda27e corecext.cpython-36m-darwin.so`loop_init + 62
    frame #2: 0x000000010beddd37 corecext.cpython-36m-darwin.so`__pyx_pw_6gevent_5libev_8corecext_4loop_1__init__ + 743
    frame #3: 0x000000010b6e48d9 python`type_call + 313
    frame #4: 0x000000010b684f55 python`_PyObject_FastCallDict + 309
    frame #5: 0x000000010b685385 python`_PyObject_FastCallKeywords + 197
    frame #6: 0x000000010b755889 python`call_function + 217
    frame #7: 0x000000010b752721 python`_PyEval_EvalFrameDefault + 25761
    frame #8: 0x000000010b756899 python`_PyEval_EvalCodeWithName + 3641
    frame #9: 0x000000010b75743b python`_PyFunction_FastCallDict + 923
    frame #10: 0x000000010b684f88 python`_PyObject_FastCallDict + 360
    frame #11: 0x000000010b6850a5 python`_PyObject_Call_Prepend + 149
    frame #12: 0x000000010b684cd5 python`PyObject_Call + 101
    frame #13: 0x000000010b6e869e python`slot_tp_init + 158
    frame #14: 0x000000010b6e48d9 python`type_call + 313
    frame #15: 0x000000010b684cd5 python`PyObject_Call + 101
    frame #16: 0x000000010b75299f python`_PyEval_EvalFrameDefault + 26399
    frame #17: 0x000000010b756899 python`_PyEval_EvalCodeWithName + 3641
    frame #18: 0x000000010b757086 python`fast_function + 742
    frame #19: 0x000000010b755969 python`call_function + 441
    frame #20: 0x000000010b752692 python`_PyEval_EvalFrameDefault + 25618
    frame #21: 0x000000010b7572c4 python`_PyFunction_FastCallDict + 548
    frame #22: 0x000000010b684f88 python`_PyObject_FastCallDict + 360
    frame #23: 0x000000010b6850a5 python`_PyObject_Call_Prepend + 149
    frame #24: 0x000000010b684cd5 python`PyObject_Call + 101
    frame #25: 0x000000010b6e869e python`slot_tp_init + 158
    frame #26: 0x000000010b6e48d9 python`type_call + 313
    frame #27: 0x000000010b684f55 python`_PyObject_FastCallDict + 309
    frame #28: 0x000000010b755889 python`call_function + 217
    frame #29: 0x000000010b752692 python`_PyEval_EvalFrameDefault + 25618
    frame #30: 0x000000010b756899 python`_PyEval_EvalCodeWithName + 3641
    frame #31: 0x000000010b757086 python`fast_function + 742
    frame #32: 0x000000010b755969 python`call_function + 441
    frame #33: 0x000000010b752692 python`_PyEval_EvalFrameDefault + 25618
    frame #34: 0x000000010b756899 python`_PyEval_EvalCodeWithName + 3641
    frame #35: 0x000000010b75743b python`_PyFunction_FastCallDict + 923
    frame #36: 0x000000010b684f88 python`_PyObject_FastCallDict + 360
    frame #37: 0x000000010b6850a5 python`_PyObject_Call_Prepend + 149
    frame #38: 0x000000010b684cd5 python`PyObject_Call + 101
    frame #39: 0x000000010b6e869e python`slot_tp_init + 158
    frame #40: 0x000000010b6e48d9 python`type_call + 313
    frame #41: 0x000000010b684f55 python`_PyObject_FastCallDict + 309
    frame #42: 0x000000010b755889 python`call_function + 217
    frame #43: 0x000000010b752692 python`_PyEval_EvalFrameDefault + 25618
    frame #44: 0x000000010b756899 python`_PyEval_EvalCodeWithName + 3641
    frame #45: 0x000000010b74c1c4 python`PyEval_EvalCode + 100
    frame #46: 0x000000010b780bb7 python`PyRun_StringFlags + 151
    frame #47: 0x000000010b780ae5 python`PyRun_SimpleStringFlags + 69
    frame #48: 0x000000010b798f30 python`Py_Main + 2528
    frame #49: 0x000000010b679d8c python`main + 236
    frame #50: 0x00007fff8e6d05ad libdyld.dylib`start + 1

Also, there is no segfault with gevent==1.2a1

@jamadden

Member

jamadden commented Dec 27, 2016

Thanks, the stack trace could be helpful. I haven't had a chance to investigate any further on this yet, though.

> Also, there is no segfault with gevent==1.2a1

I'm pretty sure those were built with an older Xcode and/or OS X version, but they were built with the same environment and CPython versions (except for 3.6, obviously). So that is pointing the finger more strongly in the Xcode/OS X direction.

@trollknurr

trollknurr commented Dec 28, 2016

After playing with a debugger, I found the call where the segfault happens.

Maybe it will help.

python 2.7.12
gevent==1.2.0
OSX 10.11.2

@faith0811

faith0811 commented Dec 30, 2016

Same issue here on Python 3.5.1, OS X 10.11.2.
After I uninstalled 1.2.0 and reinstalled 1.2a2, the issue was still there.
But in another environment, where I installed 1.2a2 from scratch, it works fine.

harti2006 pushed a commit to harti2006/zmon-slo-metrics that referenced this issue Jan 3, 2017

André Hartmann
pin gevent to version <1.2.0
On Mac I ran into "segmentation fault" issues.
The issue is already reported on github gevent/gevent#916
@jamadden

Member

jamadden commented Jan 4, 2017

I believe I've tracked this down to the addition of the clock_gettime family of calls in macOS 10.12. libev will use these if they are available at build time, but these symbols aren't defined on earlier releases (good luck finding any documentation online about this, though). Python extensions are built with -undefined dynamic_lookup, though, which means that all symbols not explicitly linked to are looked up weakly at runtime. The net result is that on 10.11 and earlier, the binary contains a call to clock_gettime, which, when the extension is loaded, is a NULL pointer; hence the segfault.

$ python -c 'import platform, gevent; print(platform.mac_ver(), gevent.get_hub())'
Segmentation fault: 11
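
To check whether a given machine actually provides the symbol, one rough approach is to ask the dynamic loader through `ctypes`. This is only a sketch: `has_c_symbol` is a hypothetical helper, and it assumes a POSIX system where `ctypes.CDLL(None)` exposes the symbols already loaded into the Python process.

```python
import ctypes


def has_c_symbol(name):
    """Return True if the running process can resolve the C symbol `name`."""
    # CDLL(None) is dlopen(NULL) on POSIX: it searches the main program and
    # the libraries it already has loaded, including the C library.
    process = ctypes.CDLL(None)
    # Attribute lookup performs the symbol lookup; a missing symbol raises
    # AttributeError, which hasattr() turns into False.
    return hasattr(process, name)


if __name__ == "__main__":
    print("clock_gettime available:", has_c_symbol("clock_gettime"))
```

If the explanation above is right, this should report the symbol as unavailable on 10.11 and earlier and available on 10.12.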

We can see the embedded symbol reference in the PyPI binary:

$ wget https://pypi.python.org/packages/58/af/d84caa4fe355c51874ef85f6f2a5a5868575def5a6c3a3ce87d1deea08fc/gevent-1.2.0-cp27-cp27m-macosx_10_6_intel.whl#md5=892a48c25bb2f48c863ac46d58159dc7
$ unzip gevent*whl
$ otool -I -V gevent/libev/corecext.so  | grep clock
0x000000000003b568  3647 _clock_gettime
0x0000000000045488  3647 _clock_gettime
$
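
Where `otool` is not at hand, a cruder check is to search the extension modules inside the wheel for the symbol name as raw bytes. This is a sketch, not a replacement for `otool -I -V`: a byte match cannot tell an indirect symbol reference from any other occurrence of the string, and `find_symbol_mentions` is a hypothetical helper.

```python
import sys
import zipfile


def find_symbol_mentions(wheel_file, symbol=b"_clock_gettime"):
    """Return names of extension modules in the wheel whose bytes contain `symbol`.

    `wheel_file` may be a filesystem path or a binary file object.
    """
    hits = []
    with zipfile.ZipFile(wheel_file) as wheel:
        for name in wheel.namelist():
            # Only compiled extension modules are of interest here.
            if name.endswith(".so") and symbol in wheel.read(name):
                hits.append(name)
    return hits


if __name__ == "__main__":
    # Usage: python scan_wheel.py path/to/some.whl [more wheels...]
    for path in sys.argv[1:]:
        print(path, find_symbol_mentions(path))
```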

I believe this can be worked around to produce builds that work on 10.11 and before (when building on 10.12) by setting the cpp flag "-D_DARWIN_FEATURE_CLOCK_GETTIME=0" (based on reading /usr/include/time.h). If I do this, the symbol is no longer referenced:

$ cd dist
$ unzip gevent*whl
$ otool -I -V gevent/libev/corecext.so  | grep clock
$

And loading that wheel into the system python running on 10.11 produces no crash (whereas the PyPI build crashes immediately):

$ python -c 'import platform, gevent; print(platform.mac_ver(), gevent.get_hub())'
(('10.11', ('', '', ''), 'x86_64'), <Hub at 0x1x... ref=0>)
$

I'm attaching the wheel built this way. I would appreciate it if others could verify that it works for them, especially on releases and Pythons other than the 10.11 system Python.

gevent-1.2.0-cp27-cp27m-macosx_10_6_intel.whl.zip

jamadden added a commit that referenced this issue Jan 4, 2017

Build OS X wheels with -D_DARWIN_FEATURE_CLOCK_GETTIME=0 for compatibility with pre-10.12 releases. Fixes #916. [skip ci]
@jamadden

Member

jamadden commented Jan 4, 2017

I've replaced the affected binary wheels on PyPI with versions using the macosx_10_12 tag so that they should only be installed on that version now.
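
The effect of the retagging can be illustrated by comparing the minimum OS version encoded in a wheel's platform tag with the version of the running system. This is a simplified sketch: pip's real compatibility check also considers architecture, Python, and ABI tags; the function names are hypothetical; and the `_intel` suffix on the 10.12 tag in the usage note is an assumption, since the thread only gives `macosx_10_12`.

```python
import re


def macos_minimum(platform_tag):
    """Parse a tag like 'macosx_10_6_intel' into a (major, minor) tuple."""
    match = re.match(r"macosx_(\d+)_(\d+)_", platform_tag)
    if match is None:
        raise ValueError("not a macOS platform tag: %r" % platform_tag)
    return int(match.group(1)), int(match.group(2))


def tag_allows(platform_tag, running_version):
    """True if a wheel with this tag may be installed on `running_version`.

    `running_version` is a string such as '10.11.6'; only the first two
    components are compared.
    """
    running = tuple(int(part) for part in running_version.split(".")[:2])
    return macos_minimum(platform_tag) <= running
```

Under this model, `tag_allows("macosx_10_6_intel", "10.11.6")` is true while `tag_allows("macosx_10_12_intel", "10.11.6")` is false, so a 10.11 machine skips the retagged wheel and falls back to the source distribution.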

@monitorius

monitorius commented Jan 5, 2017

There is no segfault on OSX 10.11.6, Python 3.6.0 anymore, thank you! (installed 1.2.0 from PyPI)

@jamadden

Member

jamadden commented Jan 5, 2017

> There is no segfault on OSX 10.11.6, Python 3.6.0 anymore, thank you! (installed 1.2.0 from PyPI)

Ok, so that means you installed from source, right (since there are no 10.11 compatible wheels on PyPI anymore)?

Has anyone had a chance to try the wheel I posted in the comment above?

@monitorius

monitorius commented Jan 5, 2017

Well, no, not from source. That's how it looks:

# .env_bckp is a copy of virtualenv with segfaulting gevent
~/temp/geventbug$ cp -r .env_bckp .env 

~/temp/geventbug$ . .env/bin/activate

~/temp/geventbug$ python -c 'import platform, gevent; print(platform.mac_ver(), gevent.get_hub())'
Segmentation fault: 11

~/temp/geventbug$ pip list
gevent (1.2.0)
greenlet (0.4.11)
pip (9.0.1)
setuptools (28.8.0)

~/temp/geventbug$ pip uninstall gevent
Uninstalling gevent-1.2.0:
...
Proceed (y/n)? y
  Successfully uninstalled gevent-1.2.0

~/temp/geventbug$ pip install gevent==1.2.0 --no-cache-dir
Collecting gevent==1.2.0
  Downloading gevent-1.2.0.tar.gz (2.8MB)
    100% |████████████████████████████████| 2.8MB 12.0MB/s
Requirement already satisfied: greenlet>=0.4.10 in ./.env/lib/python3.6/site-packages (from gevent==1.2.0)
Installing collected packages: gevent
  Running setup.py install for gevent ... done
Successfully installed gevent-1.2.0

~/temp/geventbug$ python -c 'import platform, gevent; print(platform.mac_ver(), gevent.get_hub())'
('10.11.6', ('', '', ''), 'x86_64') <Hub at 0x1072e9df0 select default pending=0 ref=0>

But, as I discovered, I had originally installed gevent==1.2.0 from a PyPI mirror (my company has an internal mirror), so I switched to the global PyPI in pip.conf to get your new version. If you're saying I can't get anything new from the global PyPI, then it all sounds weird. More feedback is needed, I suppose.

@jamadden

Member

jamadden commented Jan 5, 2017

There is no new version of a wheel on PyPI for 10.11 or earlier (only for 10.12). The .tar.gz file you're downloading is the source. Whenever you see "Downloading ...tar.gz" and "Running setup.py ", that means you're installing from source.
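
The same rule of thumb can be written down as a tiny classifier over the filename pip reports downloading. This is an illustrative sketch only; `install_kind` is a hypothetical helper, and it looks at nothing but the file extension.

```python
def install_kind(filename):
    """Classify a file pip reports downloading as a wheel or a source archive."""
    if filename.endswith(".whl"):
        return "binary wheel"
    if filename.endswith((".tar.gz", ".tar.bz2", ".zip")):
        return "source distribution"
    return "unknown"


if __name__ == "__main__":
    # Both filenames appear in the pip output quoted earlier in this thread.
    for name in ("gevent-1.2.0-cp27-cp27m-macosx_10_6_intel.whl",
                 "gevent-1.2.0.tar.gz"):
        print(name, "->", install_kind(name))
```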
