Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

abort when passing certain structs by value using ctypes #66469

Closed
weeble mannequin opened this issue Aug 25, 2014 · 38 comments
Closed

abort when passing certain structs by value using ctypes #66469

weeble mannequin opened this issue Aug 25, 2014 · 38 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes topic-ctypes type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@weeble
Copy link
Mannequin

weeble mannequin commented Aug 25, 2014

BPO 22273
Nosy @vsajip, @amauryfa, @abalkin, @vstinner, @meadori, @eryksun, @Kentzo
PRs
  • bpo-22273: Update ctypes to correctly handle arrays in small structur… #15839
  • [3.7] bpo-22273: Update ctypes to correctly handle arrays in small structur… (GH-15839) #16369
  • [3.8] bpo-22273: Update ctypes to correctly handle arrays in small structur… (GH-15839) #16370
  • bpo-22273: Disabled tests while investigating buildbot failures on ARM7L/PPC64. #16377
  • bpo-22273: Changed conditions for ctypes array-in-struct handling. #16381
  • bpo-22273: Re-enabled ctypes test on ARM machines. #16388
  • bpo-22273: Removed temporary test skipping on PPC platforms. #16399
  • [3.7] bpo-22273: Changed conditions for ctypes array-in-struct handling. (GH-16381) #16400
  • [3.8] bpo-22273: Changed conditions for ctypes array-in-struct handling. (GH-16381) #16401
  • bpo-38321: Fix PyCStructUnionType_update_stgdict() warning #16492
  • Files
  • fix-22273-01.diff: Test for structure with array seems to work on Windows, abort on LInux
  • fix-22273-02.diff: First cut patch to address issue.
  • fix-22273-03.diff: Patch updated to address Eryk's comments.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-09-26.06:55:39.781>
    created_at = <Date 2014-08-25.17:33:22.934>
    labels = ['ctypes', '3.7', '3.8', 'type-crash']
    title = 'abort when passing certain structs by value using ctypes'
    updated_at = <Date 2019-10-01.11:52:03.574>
    user = 'https://bugs.python.org/weeble'

    bugs.python.org fields:

    activity = <Date 2019-10-01.11:52:03.574>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-09-26.06:55:39.781>
    closer = 'vinay.sajip'
    components = ['ctypes']
    creation = <Date 2014-08-25.17:33:22.934>
    creator = 'weeble'
    dependencies = []
    files = ['46652', '46660', '46663']
    hgrepos = []
    issue_num = 22273
    keywords = ['patch']
    message_count = 38.0
    messages = ['225881', '226045', '235598', '287881', '288142', '288143', '288163', '288221', '288237', '288239', '288241', '288244', '288298', '288341', '288385', '288426', '288437', '288448', '288452', '288461', '288471', '288482', '288491', '288493', '309113', '309126', '353135', '353138', '353139', '353150', '353176', '353200', '353225', '353227', '353228', '353238', '353589', '353679']
    nosy_count = 9.0
    nosy_names = ['vinay.sajip', 'amaury.forgeotdarc', 'belopolsky', 'vstinner', 'weeble', 'meador.inge', 'eryksun', 'Ilya.Kulakov', 'alexei.romanov']
    pr_nums = ['15839', '16369', '16370', '16377', '16381', '16388', '16399', '16400', '16401', '16492']
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue22273'
    versions = ['Python 3.7', 'Python 3.8']

    @weeble
    Copy link
    Mannequin Author

    weeble mannequin commented Aug 25, 2014

    I'm not 100% certain this is a bug yet, but I'm beginning to think it's likely.

    On 64-bit Linux, I can't pass a struct like this:

        struct S { uint8_t data[16]; };

    ...to a function declared like this:

        void f(struct S);

    From experimentation with various integer types and array sizes, it seems this causes an abort somewhere in libffi any time the array is between 9 and 16 bytes in size. If the array is smaller or larger than that, the calls work as expected.

    I've asked about this here: http://stackoverflow.com/questions/25487928/is-this-the-correct-way-to-pass-a-struct-by-value-in-ctypes

    Here's some test code:

    ## sum.cpp

        #include <cstdint>
    using std::size_t;
    
        struct ArrayStruct {
            // We'll define ARRAY_TYPE and ARRAY_SIZE on the
            // command-line when we compile.
            std::ARRAY_TYPE data[ARRAY_SIZE];
        };
    
        extern "C" int64_t sum(struct ArrayStruct array)
        {
            int64_t acc=0;
            for (size_t i=0; i!=ARRAY_SIZE; ++i)
            {
                acc+=array.data[i];
            }
            return acc;
        }
    ## sum.py
        import ctypes
        import sys
    
        def main():
            array_size = int(sys.argv[1])
            array_type = sys.argv[2]
    
            libsum = ctypes.cdll.LoadLibrary('./libsum.so')
    
            ArrType = getattr(ctypes, 'c_' + array_type) * array_size
    
            class MyStruct(ctypes.Structure):
                _fields_ = [("data", ArrType)]
    
            m=MyStruct()
            for i in range(array_size):
                m.data[i]=i
    
            print(libsum.sum(m))
    
        if __name__ == '__main__':
            main()

    ## Build/run

        $ g++ -g -shared -Wall -fPIC sum.cpp -o libsum.so -std=c++11 -D ARRAY_SIZE=16 -D ARRAY_TYPE=uint8_t && python3 sum.py 16 uint8
        Aborted (core dumped)

    I poked around a little bit in gdb. It's aborting in libffi's "ffi_call" function: https://github.com/atgreen/libffi/blob/v3.0.13/src/x86/ffi64.c#L516

    (gdb) bt
    #0 0x00007ffff782cf79 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
    #1 0x00007ffff7830388 in __GI_abort () at abort.c:89
    #2 0x00007ffff67134f5 in ffi_call (cif=0x7fffffffd7b0, fn=0x7ffff650c625 <sum(ArrayStruct)>, rvalue=0x7fffffffd6f0, avalue=0x7fffffffd6d0) at ../src/x86/ffi64.c:516
    #3 0x00007ffff691fee3 in _ctypes_callproc () from /usr/lib/python3.4/lib-dynload/_ctypes.cpython-34m-x86_64-linux-gnu.so
    #4 0x00007ffff6920578 in ?? () from /usr/lib/python3.4/lib-dynload/_ctypes.cpython-34m-x86_64-linux-gnu.so
    #5 0x000000000043810a in PyObject_Call ()
    #6 0x0000000000579f45 in PyEval_EvalFrameEx ()
    #7 0x000000000057d3d3 in PyEval_EvalCodeEx ()
    #8 0x000000000057bfaa in PyEval_EvalFrameEx ()
    #9 0x000000000057d3d3 in PyEval_EvalCodeEx ()
    #10 0x000000000060ba83 in PyRun_FileExFlags ()
    #11 0x000000000060bc85 in PyRun_SimpleFileExFlags ()
    #12 0x000000000060d3ac in Py_Main ()
    #13 0x000000000041ec0d in main ()
    (gdb) frame 2
    #2 0x00007ffff67134f5 in ffi_call (cif=0x7fffffffd7b0, fn=0x7ffff650c625 <sum(ArrayStruct)>, rvalue=0x7fffffffd6f0, avalue=0x7fffffffd6d0) at ../src/x86/ffi64.c:516
    516 abort();
    (gdb) info args
    cif = 0x7fffffffd7b0
    fn = 0x7ffff650c625 <sum(ArrayStruct)>
    rvalue = 0x7fffffffd6f0
    avalue = 0x7fffffffd6d0
    (gdb) info locals
    a = <optimized out>
    j = <optimized out>
    size = 8
    n = <optimized out>
    classes = {X86_64_INTEGER_CLASS, X86_64_NO_CLASS, 4294956784, 32767}
    stack = 0x7fffffffd4f0 ""
    argp = 0x7fffffffd5a0 "\001"
    arg_types = 0x7fffffffd6b0
    gprcount = 1
    ssecount = <optimized out>
    ngpr = 1
    nsse = 0
    i = <optimized out>
    avn = <optimized out>
    ret_in_memory = <optimized out>
    reg_args = 0x7fffffffd4f0
    (gdb) print *cif
    $2 = {abi = FFI_UNIX64, nargs = 1, arg_types = 0x7fffffffd6b0, rtype = 0x7ffff6b5e228, bytes = 0, flags = 10}

    It looks like we're trying to pass the struct in two registers, which I think is what's supposed to happen, but something is going wrong with the second register. It aborted because it has class X86_64_NO_CLASS and that's not handled by the switch.

    I don't know if this is a bug in libffi, or if ctypes is feeding it bad information, or if I'm feeding ctypes bad information. I hope this information is useful for anyone investigating.

    I get the same abort in both Python 2.7.6 and 3.4.0.

    I originally stumbled across this issue trying to use PySDL2:

    http://pysdl2.readthedocs.org/en/rel_0_9_3/

    I was trying to call SDL_JoystickGetGUIDString, which uses a similar struct-by-value call:

    http://hg.libsdl.org/SDL/file/92ca74200ea5/include/SDL_joystick.h

    @weeble weeble mannequin added topic-ctypes type-crash A hard crash of the interpreter, possibly with a core dump labels Aug 25, 2014
    @weeble
    Copy link
    Mannequin Author

    weeble mannequin commented Aug 28, 2014

    I had a closer look at the cif object in gdb. The ffi_type of the argument in question has size 16, alignment 1, type FFI_TYPE_STRUCT and elements contains a single nested ffi_type, of size 8, alignment 8, type FFI_TYPE_POINTER.

    I think this pointer type is wrong. The struct should indeed be size 16, but its contents (in this case) should be 16 bytes of uint8s, rather than a single pointer. I'm not certain how to correctly describe an array to libffi. While you might be able to hack it with a nested struct filled with 16 individual integers, I have no idea if that would work consistently across platforms.

    @Kentzo
    Copy link
    Mannequin

    Kentzo mannequin commented Feb 9, 2015

    The structure hack does not work on Windows 8, x64.

    @eryksun eryksun added the 3.7 (EOL) end of life label Feb 15, 2017
    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 15, 2017

    ctypes defines arrays as a pointer FFI type because they degenerate as pointers in C calls. But it's generally wrong to set a pointer FFI type for an array in the elements of a struct's FFI type. An integer array in a struct that's 16 bytes or less should be packed in one or two general-purpose registers (rdi, rsi, rdx, rcx, r8, r9).

    For the example 16-byte struct, classify_argument() in ffi64.c expects to classify two 8-byte words. But the struct's FFI type only has one element, which we've incorrectly defined as a pointer element. Thus the second word is left at the default classification X86_64_NO_CLASS. Back in ffi_call() it expects two classified words, so it aborts when it sees X86_64_NO_CLASS.

    I think we can special-case small arrays in PyCStructUnionType_update_stgdict when assigning the elements of the FFI type of a struct or union. If we have an array that's 32 bytes or less, unpack it as individual FFI elements, e.g. a c_ushort * 8 array would be stored as 8 ffi_type_uint16 elements in the struct's FFI type.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 19, 2017

    I think we can special-case small arrays in PyCStructUnionType_update_stgdict

    Is that definitely the right place? And is doing it only for small arrays going to be enough? Currently, PyCStructUnionType_update_stgdict does

    dict = PyType_stgdict(desc);

    and then

    stgdict->ffi_type_pointer.elements[ffi_ofs + i] = &dict->ffi_type_pointer;

    where dict is the ctypes object for the field type. If the ffi_type_pointer is used all over the place because arrays usually degenerate to pointers, and changing it would cause breakage elsewhere, maybe the answer is to have a new ffi_type_array field which is NULL for non-array types and set correctly for array types; then the above code can check for a non-NULL ffi_type_array and use that instead of the ffi_type_pointer? Or am I talking nonsense?

    Oddly (or perhaps not), this failure doesn't seem to occur on Windows - no crash happens and the correct value is returned from a function which sums the array, as in this example. See attached patch.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 19, 2017

    I've not marked it "patch review" as the patch isn't complete. Just wanted to see if anyone can reproduce/explain the working on Windows/failing on Linux.

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 19, 2017

    Structs that are larger than 32 bytes get copied to the stack (see classify_argument in ffi64.c), so we don't have to worry about classifying their elements for register passing. Thus if a new field is added for this in StgDictObject, then PyCArrayType_new should only allocate it for array types that are 32 bytes or less. Using it for larger array types would serve no point.

    explain the working on Windows/failing on Linux.

    In the Windows libffi we don't have examine_argument() and classify_argument(). The Win64 ABI is fairly simple 1. A struct that's 8 bytes or less gets passed as an integer, so if it's in the first four arguments it gets passed in rcx, rdx, r8, or r9. Otherwise it gets copied and passed by reference. Unlike the 64-bit Unix ABI, we don't have to worry about packing struct elements across multiple registers or passing floating-point elements in vector registers.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 20, 2017

    Thanks for spelling it out for me, that's helpful. But I'm still confused about a couple of things: I can't find classify_argument in the Python source tree other than in

    Modules/_ctypes/libffi_osx/x86/x86-ffi64.c

    Is that the file you referred to as ffi64.c? I assumed this is only used on OS X. Do we just use the system libffi on Linux?

    I also note that if I use the following:

    typedef struct {
        int foo;
        int bar;
        unsigned char data[8];
    } Test;

    which is certainly the same size of struct, there's no abort and the sum is correctly calculated and returned as 28, which is printed by the Python script. If I swap things around so that the array comes first in the structure, that also works. If I increase the array size back to 16 (giving a total structure size of 24), that also works. If I then comment out the 'int foo' and 'int bar' fields in both C and Python, the abort reappears.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 20, 2017

    Possibly also relevant: bpo-16575 and

    libffi/libffi#33

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 20, 2017

    classify_argument is in Modules/_ctypes/libffi/src/x86/ffi64.c. This file is for the 64-bit Unix ABI. libffi doesn't use it for 64-bit Windows.

    Some (all?) Linux distros link ctypes to the system libffi. In my experience, building 3.6 on Linux defaults to the system libffi, and I don't personally know how to override this. Configuring "--without-system-ffi" seems to be ignored.

    Regarding your Test struct example, it's ok in this particular case to classify an 8-byte integer array the same as an 8-byte pointer. It doesn't abort() because the 2nd word isn't left unclassified (i.e. X86_64_NO_CLASS).

        import ctypes
    
        class Test(ctypes.Structure):
            _fields_ = (('foo', ctypes.c_int),
                        ('bar', ctypes.c_int),
                        ('data', ctypes.c_uint8 * 8))
    
        @ctypes.CFUNCTYPE(None, Test)
        def func(t):
            print('foo:', t.foo)
            print('bar:', t.bar)
            print('data:', t.data[:])
    
        t = Test(5, 10, tuple(range(8)))
        >>> hex(id(Test))
        '0x9d8ad8'

    The ctypes Structure has 3 elements. The first two are ffi_type_sint32 (FFI_TYPE_SINT32 == 10), and the third is ffi_type_pointer (FFI_TYPE_POINTER == 14):

    (gdb) set $dict = (StgDictObject *)(((PyTypeObject *)0x9d8ad8)->tp_dict)
    (gdb) p *$dict->ffi_type_pointer->elements[0]
    $1 = {size = 4, alignment = 4, type = 10, elements = 0x0}
    (gdb) p *$dict->ffi_type_pointer->elements[1]
    $2 = {size = 4, alignment = 4, type = 10, elements = 0x0}
    (gdb) p *$dict->ffi_type_pointer->elements[2]
    $3 = {size = 8, alignment = 8, type = 14, elements = 0x0}
    (gdb) p $dict->ffi_type_pointer->elements[3]
    $4 = (struct _ffi_type *) 0x0
    

    classify_argument() recursively classifies and merges these elements. The first two get merged as X86_64_INTEGER_CLASS, and the 'pointer' (actually an array) is X86_64_INTEGER_CLASS.

    >>> func(t)
    
    Breakpoint 1, ffi_call (cif=cif@entry=0x7fffffffd570, fn=fn@entry=0x7ffff7fee010,
        rvalue=rvalue@entry=0x7fffffffd630, avalue=avalue@entry=0x7fffffffd610)
        at ../src/x86/ffi64.c:424
    
    [...snip...]
    
    458	  for (i = 0; i < avn; ++i)
    (gdb) 
    462	      n = examine_argument (arg_types[i], classes, 0, &ngpr, &nsse);
    (gdb) 
    463	      if (n == 0
    (gdb) p n
    $6 = 2
    (gdb) p ngpr
    $7 = 2
    (gdb) p classes[0]
    $8 = X86_64_INTEGER_CLASS
    (gdb) p classes[1]
    $9 = X86_64_INTEGER_CLASS
    

    The struct is passed in two general-purpose integer registers, rdi and rsi:

    Breakpoint 2, ffi_call_unix64 () at ../src/x86/unix64.S:49
    49		movq	(%rsp), %r10		/* Load return address.  */
    
    [...snip...]
    76		call	*%r11
    
    (gdb) p/x $rdi
    $10 = 0xa00000005
    (gdb) p/x $rsi
    $11 = 0x706050403020100
    (gdb) c
    Continuing.
    
    foo: 5
    bar: 10
    data: [0, 1, 2, 3, 4, 5, 6, 7]
    

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 20, 2017

    I see now why you couldn't find ffi64.c. I've been using a 3.6 worktree. The libffi sources have been removed from master.

    For the union and bitfield problem, also see the crash reported in bpo-26628.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 20, 2017

    I'm learning a bit about Linux calling conventions :-)

    But it also works when a 16-byte array is followed by 2 ints; if the two ints are removed, then it fails again.

    ctypes sets elements up in the first case to be a FFI_TYPE_POINTER slot followed by two slots of FFI_TYPE_SINT32, and classify_argument seemingly does the right thing. But remove the two integers, and classify_argument seems to not do the right thing. Isn't this looking like a problem in classify_argument? In the first case:

    p *stgdict->ffi_type_pointer.elements[0]
    $3 = {size = 8, alignment = 8, type = 14, elements = 0x0}
    p *stgdict->ffi_type_pointer.elements[1]
    $4 = {size = 4, alignment = 4, type = 10, elements = 0x0}
    p *stgdict->ffi_type_pointer.elements[2]
    $5 = {size = 4, alignment = 4, type = 10, elements = 0x0}
    p stgdict->ffi_type_pointer.elements[3]
    $6 = (struct _ffi_type *) 0x0

    and the second case:

    p *stgdict->ffi_type_pointer.elements[0]
    $2 = {size = 8, alignment = 8, type = 14, elements = 0x0}
    p stgdict->ffi_type_pointer.elements[1]
    $3 = (struct _ffi_type *) 0x0

    It's like this on the way into ffi_call (can't step into it at the moment).

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 21, 2017

    The 24-byte struct gets passed on the stack, as it should be. In this case ffi_call doesn't abort() because examine_argument returns 0, which is due to the following code in classify_argument:

    if (words > 2)
    {
        /* When size > 16 bytes, if the first one isn't
           X86_64_SSE_CLASS or any other ones aren't
           X86_64_SSEUP_CLASS, everything should be passed in
           memory.  */
        if (classes[0] != X86_64_SSE_CLASS)
            return 0;
    
            for (i = 1; i < words; i++)
                if (classes[i] != X86_64_SSEUP_CLASS)
                    return 0;
        }

    It looks like X86_64_SSEUP_CLASS is never actually assigned by classify_argument(), in which case libffi never uses registers to pass structs that are larger than 16 bytes.

    Regarding floating-point values, we get a similar abort for passing a struct containing an array of two doubles because ctypes passes one ffi_type_pointer element instead of two ffi_type_double elements.

    Also, a struct with an array of one double (weird but should be supported) doesn't abort, but instead gets passed incorrectly like a pointer, i.e. as an integer in register rdi, instead of in the expected xmm0 register. The call thus uses whatever garbage value is currently in xmm0. You have to use a test lib to reproduce this. It's not apparent with a ctypes callback because ffi_closure_unix64 (unix64.S) and ffi_closure_unix64_inner (ffi64.c) use the same incorrect classification before calling ctypes closure_fcn and _CallPythonObject.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 22, 2017

    Patch attached, including tests. If it looks halfway sensible, I can work up a PR.

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 22, 2017

    Notes on fix-22273-02.diff:

    In the second pass over _fields_, you can (should) use dict->length and dict->proto for array types instead of the _length_ and _type_ attributes.

    When reassigning stgdict->ffi_type_pointer.elements, if use_broken_old_ctypes_semantics is false, then you also have to allocate space for and copy the elements from the base class if any (i.e. if basedict && basedict->length > 0).

    Regarding structs with bitfields and unions, we could add an stgdict flag to prevent passing them as arguments in the Unix X86_64 ABI -- e.g. add a flag named TYPEFLAG_NONARGTYPE (0x400). ConvParam (callproc.c) and converters_from_argtypes (_ctypes.c) would raise an ArgumentError or TypeError in this case. Subclasses of structs and unions would inherit this flag value in StructUnionType_new.

    The first pass in PyCStructUnionType_update_stgdict can set arrays_seen and bitfields_seen. Also, per the above suggestion, isArgType can be added. Moreover, since we don't have to worry about bitfields if we forbid passing structs with bitfields in this ABI, then MAX_ELEMENTS can be reduced to 8. For example:

        #ifdef X86_64
        #define MAX_ELEMENTS 8
        isArgType = (!(stgdict->flags & TYPEFLAG_NONARGTYPE) &&
                     isStruct && !bitfields_seen);
        if (!isArgType) {
            stgdict->flags |= TYPEFLAG_NONARGTYPE;
        } else if (size <= 16 && arrays_seen) {
            ffi_type *actual_types[MAX_ELEMENTS + 1];
            int actual_type_index = 0;
        /* second pass over _fields_ */
    
    }
    

    This is speculative based on how we address passing unions and structs with bitfields in the 64-bit Unix ABI. Raising a descriptive exception is at least an improvement over abruptly aborting the process.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 23, 2017

    Thanks for the comments. Using your suggestions simplifies things quite a bit. Still finding my way around :-)

    Regarding structs with bitfields and unions, we could add an stgdict flag to prevent passing them as arguments in the Unix X86_64 ABI

    Is this to deal with issues bpo-16575, bpo-16576 and bpo-26628? Issue bpo-26628 seems to be a duplicate of bpo-16575. I can add the flag and set it here, but the checks should probably be in a separate patch.

    if we forbid passing structs with bitfields in this ABI, then MAX_ELEMENTS can be reduced to 8.

    Not sure that's the case. For example, if we have to handle

        typedef struct {
            unsigned char data[12];
        } Test;

    that would use up 12 slots for the unrolled array. Perhaps you mean 16 rather than 8?

    Updated patch attached.

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 23, 2017

    Perhaps you mean 16 rather than 8?

    Sorry, that was a misfire. It should be 16.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 23, 2017

    Just a thought - the TYPEFLAG_NONARGTYPE needs to be copied from the base class if set there, right?

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 23, 2017

    I had suggested inheriting the TYPEFLAG_NONARGTYPE flag in StructUnionType_new. It requires a minor change to get basedict unconditionally, and then assign

    if (basedict)
        dict->flags |= basedict->flags & TYPEFLAG_NONARGTYPE;
    

    We need more feedback on this suggested flag, especially to stay consistent with CFFI if possible. Do you know whether CFFI supports passing unions and structs with bitfields in its ABI mode for 64-bit Unix?

    @vsajip
    Copy link
    Member

    vsajip commented Feb 23, 2017

    We need more feedback on this suggested flag, especially to stay consistent with CFFI if possible.

    Undoubtedly, more feedback would be very helpful. I'm not sure using this flag impacts on consistency with CFFI particularly, since it's an internal implementation detail. Its main purpose would be for ctypes to raise exceptions rather than leading to crashes or undefined behaviour, as we have at the moment.

    Do you know whether CFFI supports passing unions and structs with bitfields in its ABI mode for 64-bit Unix?

    I don't believe so, but I'm relatively new to this area. I'm not sure if things have changed recently, but an analogous CFFI issue was closed as WONTFIX in 2015, citing lack of support in libffi:

    https://bitbucket.org/cffi/cffi/issues/150/structs-with-bit-fields-as-arguments

    Also, the latest CFFI documentation, near the bottom of this section:

    https://cffi.readthedocs.io/en/latest/using.html#function-calls

    says:

    "The limitations are that you cannot pass directly as argument or return type:

    • a union (but a pointer to a union is fine);
    • a struct which uses bitfields (but a pointer to such a struct is fine);"

    The documentation applies these limitations regardless of any specific ABI (presumably to provide consistency).

    So, I would guess that, as with ctypes, lack of libffi support is the main obstacle. I suppose one would have to seriously consider contributing there to make much headway here. In this still-open issue from 2013:

    libffi/libffi#33

    Anthony Green of libffi said he'd welcome a patch, in response to a question by Eli Bendersky. Of course, it may be hard for individual contributors to support this for the range of architectures that libffi covers.

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 23, 2017

    I'm not sure using this flag impacts on consistency with CFFI

    I meant consistency with respect to supported argument types. If someone contributed a workaround to CFFI, then I would rather port it, but it looks like they're also waiting for this to be addressed upstream.

    It occurs to me that in the 1st pass, it also needs to propagate the non-argument flag from any field that has it set. This should be done for all platforms, not just X86_64.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 23, 2017

    It occurs to me that in the 1st pass, it also needs to propagate the non-argument flag from any field that has it set.

    So does that mean disallowing a structure which contains a union? What about if the final structure is large enough to require passing in memory rather than registers, so that libffi doesn't need to do any clever marshalling, even if some part of the structure wouldn't by itself be able to be passed as an argument in a call? Won't that end up being too restrictive?

    @eryksun
    Copy link
    Contributor

    eryksun commented Feb 23, 2017

    Perhaps it should instead use two specific flags, TYPEFLAG_HASBITFIELD and TYPEFLAG_HASUNION, which are propagated unconditionally from the base class and fields. As a base case, a union itself is flagged TYPEFLAG_HASUNION. Arrays are unrolled on X86_64 only if neither flag is present and the size is 16 bytes or less.

    If ConvParam or converters_from_argtypes see either flag on X86_64 and the size is 16 bytes or less, then they raise an exception. As before, this rejects some call signatures that would actually succeed. We're not accounting for the case in which the limited number of registers forces an argument to be passed on the stack even though it's small enough to be passed in registers.

    @vsajip
    Copy link
    Member

    vsajip commented Feb 23, 2017

    Perhaps it should instead use two specific flags, TYPEFLAG_HASBITFIELD and TYPEFLAG_HASUNION

    This seems better at first sight. It's not making any suitability decisions (apart from doing the unrolling), and the meaning of these flags will be less volatile than TYPEFLAG_NONARGTYPE because that assessment depends on current limitations, and those limitations might change over time.

    I'm going into a period of two weeks where I may not have much time to work on this due to other time commitments, so if you want to press on with it, go right ahead :-)

    @Kentzo
    Copy link
    Mannequin

    Kentzo mannequin commented Dec 28, 2017

    Is there anything to be done for this patch to get merged?

    @vsajip
    Copy link
    Member

    vsajip commented Dec 28, 2017

    Yes, the patch needs improving as per the suggestion in msg288493 (not had the time since to do any work on it), followed by a review of the changes.

    @csabella csabella added the 3.8 only security fixes label May 13, 2019
    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset 12f209e by Vinay Sajip in branch 'master':
    bpo-22273: Update ctypes to correctly handle arrays in small structur… (GH-15839)
    12f209e

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset ce62dcc by Vinay Sajip (Miss Islington (bot)) in branch '3.8':
    bpo-22273: Update ctypes to correctly handle arrays in small structur… (GH-15839) (GH-16370)
    ce62dcc

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset 16c0f6d by Vinay Sajip (Miss Islington (bot)) in branch '3.7':
    bpo-22273: Update ctypes to correctly handle arrays in small structur… (GH-15839) (GH-16369)
    16c0f6d

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset 57dc7d5 by Vinay Sajip in branch 'master':
    bpo-22273: Disabled tests while investigating buildbot failures on ARM7L/PPC64. (GH-16377)
    57dc7d5

    @vstinner
    Copy link
    Member

    Please check if bpo-38272 regression is caused by this issue.

    "FAIL: test_array_in_struct (ctypes.test.test_structures.StructureTestCase)" on ARMv7.

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset 417089e by Vinay Sajip in branch 'master':
    bpo-22273: Re-enabled ctypes test on ARM machines. (GH-16388)
    417089e

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    Please check if bpo-38272 regression is caused by this issue.

    Yes, it is. Being worked on right now. Sorry for the noise.

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset cc28ed2 by Vinay Sajip in branch 'master':
    bpo-22273: Removed temporary test skipping on PPC platforms. (GH-16399)
    cc28ed2

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset d015714 by Vinay Sajip in branch '3.7':
    [3.7] bpo-22273: Changed conditions for ctypes array-in-struct handling. (GH-16381) (GH-16400)
    d015714

    @vsajip
    Copy link
    Member

    vsajip commented Sep 25, 2019

    New changeset b92b8c5 by Vinay Sajip in branch '3.8':
    [3.8] bpo-22273: Changed conditions for ctypes array-in-struct handling. (GH-16381) (GH-16401)
    b92b8c5

    @vsajip vsajip closed this as completed Sep 26, 2019
    @vsajip
    Copy link
    Member

    vsajip commented Sep 30, 2019

    New changeset c9a413e by Vinay Sajip (Victor Stinner) in branch 'master':
    bpo-38321: Fix PyCStructUnionType_update_stgdict() warning (GH-16492)
    c9a413e

    @vstinner
    Copy link
    Member

    vstinner commented Oct 1, 2019

    New changeset bfe1f74 by Victor Stinner in branch '3.8':
    [3.8] bpo-3832: Fix compiler warnings (GH-16518)
    bfe1f74

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes topic-ctypes type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants