Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

struct.unpack('?', '\x02') returns (False,) on Mac OSX #66211

Open
wayedt mannequin opened this issue Jul 19, 2014 · 13 comments
Open

struct.unpack('?', '\x02') returns (False,) on Mac OSX #66211

wayedt mannequin opened this issue Jul 19, 2014 · 13 comments
Labels
build The build process and cross-build stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@wayedt
Copy link
Mannequin

wayedt mannequin commented Jul 19, 2014

BPO 22012
Nosy @gpshead, @ronaldoussoren, @mdickinson, @ned-deily
Files
  • issue-22012.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-07-19.18:30:27.947>
    labels = ['type-bug', 'library', 'build']
    title = "struct.unpack('?', '\\x02') returns (False,) on Mac OSX"
    updated_at = <Date 2015-12-12.00:36:30.051>
    user = 'https://bugs.python.org/wayedt'

    bugs.python.org fields:

    activity = <Date 2015-12-12.00:36:30.051>
    actor = 'gregory.p.smith'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Build', 'Library (Lib)']
    creation = <Date 2014-07-19.18:30:27.947>
    creator = 'wayedt'
    dependencies = []
    files = ['36010']
    hgrepos = []
    issue_num = 22012
    keywords = ['patch', 'needs review']
    message_count = 13.0
    messages = ['223470', '223472', '223480', '223481', '223487', '223501', '223502', '223503', '223511', '223565', '223566', '224030', '224710']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'ronaldoussoren', 'mark.dickinson', 'ned.deily', 'wayedt']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue22012'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5', 'Python 3.6']

    @wayedt
    Copy link
    Mannequin Author

    wayedt mannequin commented Jul 19, 2014

    On Mac OSX, struct.unpack incorrectly handles bools.

    Python 3.4.1 (default, May 19 2014, 13:10:29)
    [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import struct
    >>> struct.unpack('????', b'\x00\x01\x02\x03')
    (False, True, False, True)

    @wayedt wayedt mannequin assigned ronaldoussoren Jul 19, 2014
    @wayedt wayedt mannequin added OS-mac type-bug An unexpected behavior, bug, or error labels Jul 19, 2014
    @ned-deily
    Copy link
    Member

    Doing a quick look, the results vary. Using current python.org 2.7.8 and 3.4.1 installers, the results are correct. These interpreters are built with Apple gcc-4.2 (non-LLVM) from Xcode 3.2.6. Other 2.7 and 3.4.x builds I have available at the moment, including the Apple-supplied Python 2.7, were built with clang LLVM and they all give the incorrect result.

    $ /usr/bin/python2.7 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    ((False, True, False, True), '2.7.5 (default, Mar 9 2014, 22:15:05) \n[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]')

    $ arch -i386 /usr/bin/python2.7 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    ((False, True, False, True), '2.7.5 (default, Mar  9 2014, 22:15:05) \n[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]')
    
    $ /usr/local/bin/python2.7 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    ((False, True, True, True), '2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]')
    
    $ /usr/local/bin/python2.7-32 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    ((False, True, True, True), '2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]')
    
    $ /usr/local/bin/python3.4 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    (False, True, True, True) 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
    
    $ /usr/local/bin/python3.4-32 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    (False, True, True, True) 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
    
    $ ./bin/python3.4 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
    (False, True, False, True) 3.4.1+ (3.4:d6b71971b228, Jul 16 2014, 16:30:56)
    [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]

    @mdickinson
    Copy link
    Member

    The relevant piece of code in the struct module looks like this:

    static PyObject *
    nu_bool(const char *p, const formatdef *f)
    {
        BOOL_TYPE x;
        memcpy((char *)&x, p, sizeof x);
        return PyBool_FromLong(x != 0);
    }

    Is it possible that BOOL_TYPE is a bitfield of length 1, and that clang is somehow making use of that fact?

    One thing I don't understand is that this shouldn't affect *standard* packing and unpacking (as opposed to native), but I still see the problem for a format of "<????". However, it's fine for ">????". Some debugging shows that we're calling 'nu_bool' even for "<????", which is a bit odd.

    @mdickinson
    Copy link
    Member

    Some debugging shows that we're calling 'nu_bool' even for "<????", which is a bit odd.

    Ah, I see. There's an optimisation that uses the native table functions instead of the big/little-endian ones if the size and byte order match.

    @ned-deily
    Copy link
    Member

    FTR, the problem also shows up with the current clang-3.4 (3.4.2) and clang3.5 (svn213451-1) packages in Debian testing (on i386), building --with-pydebug and without, so the issue is not confined to OS X.

    @ned-deily ned-deily added build The build process and cross-build and removed OS-mac labels Jul 19, 2014
    @ned-deily ned-deily changed the title struct.unpack('?', '\x02') returns (False,) on Mac OSX struct.unpack('?', '\x02') returns (False,) when Python is built with clang Jul 19, 2014
    @ronaldoussoren
    Copy link
    Contributor

    On 19 jul. 2014, at 23:22, Mark Dickinson <report@bugs.python.org> wrote:

    Mark Dickinson added the comment:

    The relevant piece of code in the struct module looks like this:

    static PyObject *
    nu_bool(const char *p, const formatdef *f)
    {
    BOOL_TYPE x;
    memcpy((char *)&x, p, sizeof x);
    return PyBool_FromLong(x != 0);
    }

    Is it possible that BOOL_TYPE is a bitfield of length 1, and that clang is somehow making use of that fact?

    I haven't found a definitive source yet, but it seems that the only valid values for _Bool, which BOOL_TYPE expands to, are 0 and 1. Clang might make use of that restriction.

    Ronald

    @ronaldoussoren ronaldoussoren changed the title struct.unpack('?', '\x02') returns (False,) when Python is built with clang struct.unpack('?', '\x02') returns (False,) on Mac OSX Jul 20, 2014
    @mdickinson
    Copy link
    Member

    For native struct packing / unpacking, it's not even clear to me that this should be considered a bug (though for standard unpacking it definitely is): the primary use-case for native unpacking is unpacking a collection of bytes that was written (from Python, or C, or ...) on the same machine, and in that case we're only going to be unpacking 0s and 1s.

    FWIW, the docs say: "Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be True when unpacking.", so for native packing either the docs or the code need to be fixed.

    The simplest solution would be to replace nu_bool with something identical to bu_bool. In theory that would break on any platform where "sizeof(_Bool)" is greater than 1. In practice, I doubt that Python's going to meet such platforms in a hurry.

    @mdickinson
    Copy link
    Member

    In practice, I doubt that Python's going to meet such platforms in a hurry.

    Hrm; looks like that's not a particularly safe assumption, especially with older compilers that might do a "typedef int _Bool".

    From http://msdn.microsoft.com/en-us/library/tf4dy80a.aspx:

    "That means that for Visual C++ 4.2, a call of sizeof(bool) yields 4, while in Visual C++ 5.0 and later, the same call yields 1."

    @ned-deily ned-deily changed the title struct.unpack('?', '\x02') returns (False,) on Mac OSX struct.unpack('?', '\x02') returns (False,) when Python is built with clang Jul 20, 2014
    @ronaldoussoren
    Copy link
    Contributor

    On 20 jul. 2014, at 10:32, Mark Dickinson <report@bugs.python.org> wrote:

    Mark Dickinson added the comment:

    > In practice, I doubt that Python's going to meet such platforms in a hurry.

    Hrm; looks like that's not a particularly safe assumption, especially with older compilers that might do a "typedef int _Bool"

    I'm not sure about the actual definition, but at least for OSX on PPC sizeof(bool) is not 1. I ran into this with pyobjc when I tried to treat BOOL and bool the same.

    All of this is with Objective-C, but the same should be true for plain C.

    @ronaldoussoren ronaldoussoren changed the title struct.unpack('?', '\x02') returns (False,) when Python is built with clang struct.unpack('?', '\x02') returns (False,) on Mac OSX Jul 20, 2014
    @ronaldoussoren
    Copy link
    Contributor

    I just confirmed that clang only uses the LSB for _Bool values by looking at the assembly generated for the following code:

    <quote>
    #include <stdbool.h>
    #include <stdio.h>
    
    int main(void)
    {
    	_Bool x;
    	*(unsigned char*)&x = 42;
    	printf("%d\n", (int)x);
    	return 0;
    }
    </quote>

    The attached patch fixes the issue for me. The new testcase fails without the changes to _struct.c and passes with those changes.

    @ronaldoussoren
    Copy link
    Contributor

    The last draf of ISO C '11: <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf\>.

    This says that _Bool is large enough to store 0 and 1, and that conversion of any integer data to _Bool results in 0 or 1. If I interpret the document correctly there is no conforming way to create a value of _Bool that has a value other than 0 or 1, and that should mean clang's behavior is not a bug.

    BTW. I haven't tested my patch on a PPC system yet (where sizeof(bool) != 1), and won't be able to do so until I'm back home (I'm currently at EuroPython)

    @ronaldoussoren
    Copy link
    Contributor

    Does anyone have feedback for my proposed patch (other the bug in test code when sizeof(bool) != 1, the test values for big and little endian are in the wrong order)?

    @ronaldoussoren
    Copy link
    Contributor

    BTW. There is also an argument to be made against my patch and for a documentation update: it is unclear to me if clang ever creates _Bool false values where the bit pattern isn't exactly the same as the zero value of an integer of the same size (for example due to generated code only updating the least significant bit and starting with a uninitialised value that has some of the other bits set). If it does so depends on what's more efficient to do in machine code, and my knowledge of assembly is too rusty to have anything useful to say about that (although I'd suspect that overwriting the complete _Bool value would be more efficient than loading the current value, updating the LSB and storing it again).

    If it can do so the current behavior of struct.unpack would be more correct than the version you get after applying my patch.

    @gpshead gpshead added the stdlib Python modules in the Lib dir label Dec 12, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    build The build process and cross-build stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants