Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BytesWarning in execnet when deserializing float #596

Closed
The-Compiler opened this issue Sep 7, 2020 · 15 comments
Closed

BytesWarning in execnet when deserializing float #596

The-Compiler opened this issue Sep 7, 2020 · 15 comments

Comments

@The-Compiler
Copy link
Member

The-Compiler commented Sep 7, 2020

Behold, this is probably one of the strangest issues I've ever had the joy (?) of debugging. I'd really appreciate some help, as I'm completely stumped here.

I run tests in qutebrowser with python -bb, which enables byte warnings and turns them into errors.

When I run the following in the qutebrowser repository:

  • tox -e py38-pyqt515 --notest
  • .tox/py38-pyqt515/bin/pip install pytest-xdist
  • tox -e py38-pyqt515 -- -n 2

I get crashing gateways with:

Traceback (most recent call last):
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 855, in _local_receive
    data = loads_internal(data, channel, strconfig)
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1368, in loads_internal
    return Unserializer(io, channelfactory, strconfig).load()
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1177, in load
    loader(self)
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1218, in load_float
    self.stack.append(struct.unpack(FLOAT_FORMAT, binary)[0])
BytesWarning: Comparison between bytes and string

This happens when trying to deserialize the test duration (?). When I print self.stack before that line I see e.g.:

['testreport', {}, 'data', {'nodeid': 'tests/end2end/features/test_backforward_bdd.py::test_going_backforward', 'location': ('.tox/py38-pyqt515/lib/python3.8/site-packages/pytest_bdd/scenario.py', 197, 'test_going_backforward'), 'keywords': {'pytestmark': 1, 'usefixtures': 1, 'end2end': 1, 'tests/end2end/features/test_backforward_bdd.py': 1, 'git': 1, 'test_going_backforward': 1, 'gui': 1, '__pytest_bdd_counter__': 1, 'skip': 1, '__scenario__': 1}, 'outcome': 'skipped', 'longrepr': ('/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pytest_bdd/scenario.py', 198, 'Skipped: unconditional skip'), 'when': 'setup', 'user_properties': [], 'sections': []}, 'duration']

As you might guess, there's no comparison between bytes and strings involved, this is a completely normal struct.unpack call:

(Pdb) interact
*interactive*
>>> FLOAT_FORMAT
'!d'
>>> binary
b'?\xb7\xd8x\x89t\x00\x00'

Yet I can reproduce the error when I run the same thing by hand, inside the pdb environment:

>>> struct.unpack(FLOAT_FORMAT, binary)                                                                              
Traceback (most recent call last):
  File "<console>", line 1, in <module>
BytesWarning: Comparison between bytes and string
 
>>> struct.unpack('!d', b'?\xb7\xd8x\x89t\x00\x00')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
BytesWarning: Comparison between bytes and string

No funny things are going on, as far as I can see:

>>> struct.unpack
<built-in function unpack>
>>> struct
<module 'struct' from '/usr/lib/python3.8/struct.py'>
>>> type(FLOAT_FORMAT)
<class 'str'>
>>> type(binary)
<class 'bytes'>

Here's where things are getting weird: Both according to the docs and according to the cpython code, both > and ! mean the same - but using > instead works!

>>> struct.unpack('>d', b'?\xb7\xd8x\x89t\x00\x00')
(0.09314683299817261,)

And because this wasn't weird enough, I can not reproduce the issue in a Python console with the same Python version, on the same machine:

$ PYTHONSTARTUP= python3 -bb
Python 3.8.5 (default, Sep  5 2020, 10:50:12)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
 
>>> struct.unpack('!d', b'?\xb7\xd8x\x89t\x00\x00')
(0.09314683299817261,)

Any idea what kind of black magic xdist/execnet could be doing to cause this? I haven't been able to find a reproducer outside of qutebrowser so far, but I guess that's not too surprising with how strange this issue is. Did I find a bug in cpython or something?

@The-Compiler
Copy link
Member Author

The-Compiler commented Sep 7, 2020

NOTE: This was most likely a red herring


So I think the Python warning stack is just a complete lie. When I use gdb and break in PyErr_WarnEx where the warning is emitted, I get this C stacktrace.

Using the Python gdb macros I was able to find out that this is apparently happening with a comparison to 'PATH':

Breakpoint 1, PyErr_WarnEx (category=0x7ffff7f637c0 <_PyExc_BytesWarning>, 
    text=0x7ffff7e65d20 "Comparison between bytes and string", stack_level=1) at Python/_warnings.c:1152
1152	   PyObject *message = PyUnicode_FromString(text);
(gdb) up
#1  0x00007ffff7c9198c in bytes_richcompare (a=<optimized out>, b=0x7fffe22082f0, op=2) at Objects/bytesobject.c:1581
1581	               if (PyErr_WarnEx(PyExc_BytesWarning,
(gdb) pyo b
object address  : 0x7fffe22082f0
object refcount : 2
object type     : 0x7ffff7f71e20
object type name: str
object repr     : 'PATH'

and getting the python stack from gdb tells me something much more plausible:

/usr/lib/python3.8/os.py (640): get_exec_path
/usr/lib/python3.8/subprocess.py (5474): _execute_child
/usr/lib/python3.8/subprocess.py (2390): __init__
/usr/lib/python3.8/subprocess.py (489): run
/usr/lib/python3.8/subprocess.py (667): check_output
/usr/lib/python3.8/platform.py (896): _syscmd_file
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pluggy/callers.py (443): _multicall
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pluggy/manager.py (340): <lambda>
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pluggy/hooks.py (564): call_historic
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pluggy/callers.py (443): _multicall
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pluggy/manager.py (340): <lambda>
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pluggy/hooks.py (1054): __call__
/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pytest/__main__.py (7): <module>
/usr/lib/python3.8/runpy.py (343): _run_code
/usr/lib/python3.8/runpy.py (450): _run_module_as_main

It remains a mystery why Python thinks this is triggered by the struct.unpack and why Python thinks the stacktrace would point there. It almost smells like I found a CPython bug, perhaps with some kind of memory corruption going on?

Next steps:

  1. Check the real cause of the warning
  2. Try with different Python/pytest/xdist versions (as I think this is a recent thing?)
  3. Maybe run Python under valgrind, at least if 1. allows me to get a reproducer which doesn't involve qutebrowser.

@nicoddemus
Copy link
Member

Hi @The-Compiler,

Ouch, yeah that's a doozy. I don't know execnet internals at all, so I'm afraid I don't have any suggestions to make here. 😕

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

Hey folks, I was intrigued by this and took some time to debug it, here are my findings.

There's an internal cache in struct implementations (such as struct.calcsize and struct.unpack), and what's happening is that the "string type" (bytes or str) of the first call to that function is being stored, and subsequent calls will try to coerce need to compare the arguments with the stored type:

>>> import struct
>>> struct.calcsize(b'!d')  # cache for '!d' uses bytes
8
>>> struct.calcsize('!d')  # so there's a warning when trying to use str
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
>>> struct.calcsize('>d')  # cache for '>d' uses str
8
>>> struct.calcsize(b'>d')  # so now the warning is inverted, it shows up when trying to use bytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
>>> struct.calcsize('>d')  # no problem when using str
8

In this case, hypothesis is to blame, since it's the first one to populate the cache (via struct.unpack).

@graingert
Copy link
Member

graingert commented Sep 13, 2020

Can we fix this by wrapping struct calls with a retry that attempts with both types of string?

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

Some possible ways to fix it:

  • As a workaround, call struct.calcsize('!d') before hypothesis is imported. This will trigger the warning in hypothesis, though.
  • Change hypothesis to stop using bytes in that module. Hypothesis have frequent releases, so this could be quick.
  • Report and change this in cpython so that the cache uses different entries for bytes and str, for example. Seems like a proper solution, although it could take way too long :)

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

Can we fix this by wrapping struct calls with a retry that attempts with both types of string?

Probably, or, as a workaround, just disable this warning in the wrapped struct calls.

@The-Compiler
Copy link
Member Author

@tadeu Amazing work! I guess my stack trace above was a red herring then, and probably caused by some other BytesWarning which is already ignored somewhere.

This indeed seems like a CPython bug. Do you want to report it there, or should I do so?

I'm closing this for now as I don't think this is something that should/can be fixed in xdist or execnet. I can ignore the warning via pytest.ini I presume.

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

I'll report there and post a link here 👍

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

It seems to be a big problem with python -bb specifically (and python -b):

$ python
Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> d={}
>>> d['a']=1
>>> d[b'a']=2
>>> d['a']
1
>>> d[b'a']
2


$ python -bb
Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> d={}
>>> d['a']=1
>>> d[b'a']=2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
>>> d['a']
1
>>> d[b'a']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string

I don't see why dict should try to compare 'a' with b'a', since they are separate keys... (update: explained in a comment below)

@nicoddemus
Copy link
Member

Ra! @tadeu is amazing! 😁

@The-Compiler
Copy link
Member Author

The-Compiler commented Sep 13, 2020

The problem is that their hashes are equal:

>>> hash('a')
-8194812638379419632
>>> hash(b'a')
-8194812638379419632

And since hashes are never a guarantee of equality, a dict internally will use == on any objects where the hashes are equal. But yeah, agreed, this is probably a problem in Python with dicts in general then, not just with struct!

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

I've posted it here: https://bugs.python.org/issue41777

@tadeu
Copy link
Member

tadeu commented Sep 13, 2020

The problem is that their hashes are equal:

>>> hash('a')
-8194812638379419632
>>> hash(b'a')
-8194812638379419632

And since hashes are never a guarantee of equality, a dict internally will use == on any objects where the hashes are equal. But yeah, agreed, this is probably a problem in Python with dicts in general then, not just with struct!

Yeah, this makes sense, dicts are type-agnostic (reason why d[1] and d[1.0] are the same), it was just a sad coincidence in the bytes/str case, since they compare False, but with a warning/error with -b/-bb :)

@The-Compiler
Copy link
Member Author

Yup - I guess Python should internally somehow suppress that warning when doing equality comparisons for dicts, or perhaps special-case this case to always assume they're not equal. No idea who to ping in the Python bugtracker about that, so for now I just added some people involved in the struct module (you can type a module name in the "nosy list" to see people who might be interested in the bug, also see https://devguide.python.org/triaging/#nosy-list)

@The-Compiler
Copy link
Member Author

With HypothesisWorks/hypothesis#2620 hypothesis now uses str instead of bytes for struct, which means this should disappear - thanks @graingert and @Zac-HD!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants