ctypes change made clang fail to build #81321
Comments
Hello,
On Fedora, "clang" fails to build with Python 3.8, probably due to this change (which was supposed to be Windows-only):
According to serge_sans_paille: if
Fedora discussion with link to logs: https://bugzilla.redhat.com/show_bug.cgi?id=1715016 |
The change causing the build failure comes from this commit. The commit comment says: "Fix copy of structures when passed by value." Steve, do you recall what this change fixed? |
If I undo the changes to StructUnionType_paramfunc then test_pass_by_value (ctypes.test.test_structures.StructureTestCase) fails on x64 on Windows. Looking at the code I don't think this is specific to Windows. This is a test for fixing this issue: https://bugs.python.org/issue29565 |
Reading the related bugs more carefully I think the struct/union passing conventions are different on Windows x64 and Linux. I have a fix which works for Windows but preserves the prior code for Linux. |
Would you mind elaborating? It works for me on Fedora 30 with clang 8.0.0:
$ ./configure CC=clang
$ make
$ ./python -m test -v test_ctypes
...
Tests result: SUCCESS
I tested the master branch of Python. |
The issue is with building clang using Python 3.8; not building Python 3.8 using clang :) |
@vstinner: to reproduce the issue
|
It would be great to be able to write a reproducer and then convert it into a proper unit test. |
I added a unit test to the PR that illustrates the problem. It doesn't pass yet. |
I did some reading about parameter passing and it's still not clear to me whether https://bugs.python.org/issue37140 is a bug in CPython or whether the clang bindings were relying on incorrect parameter passing behavior to work. The change in #13796 restores the previous behavior where Windows and non-Windows builds pass structs differently. |
We probably need a second parg->obj to keep self alive for as long as copied_self. Or pack it into a tuple. Having a repro test for this would be ideal, especially if we can make it happen (even crash) on all platforms. The double-free issue would seem to be real, and I don't want it to crash on Windows either. |
Ok, I'm able to reproduce the crash using:
git clone https://github.com/llvm/llvm-project.git
The crash occurs in the test_access_specifiers() method defined in tests/cindex/test_access_specifiers.py. This test uses ctypes to call functions of the libclang.so dynamic library. I reduced the test case from 4350 lines of Python code to 131 lines of Python which only depend on the standard library and libclang.so.
$ cd llvm-project/clang/bindings/python
$ wc -l tests/cindex/test_access_specifiers.py tests/cindex/util.py clang/cindex.py
41 tests/cindex/test_access_specifiers.py
90 tests/cindex/util.py
4219 clang/cindex.py
4350 total
$ wc -l bug.py
131 bug.py
See attached bug.py script. Python 3.7 is fine, whereas Python 3.8 crashes:
$ python3.7 bug.py
spelling
spelling = None
_CXString del: obj id=7f0904a6dcb0
spelling = None
---
$ python3.8 bug.py
spelling
spelling = None
_CXString del: obj id=7f4fa6180c40
_CXString del: obj id=7f4fa6180ac0
free(): double free detected in tcache 2
Aborted (core dumped)
Notice that _CXString.__del__() is called only once in Python 3.7, but twice in Python 3.8. |
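The doubled `__del__` call can be illustrated without libclang. The following is a minimal sketch, not the attached bug.py: `FakeCXString`, its fields, and the `freed` list are hypothetical stand-ins, and the byte copy is made explicitly with `from_buffer_copy()` rather than by ctypes internals. It only shows why two objects of the same type holding the same bytes lead to two "free" calls on one pointer value.

```python
import ctypes

# Records every pointer value handed to the pretend "free" below.
freed = []


class FakeCXString(ctypes.Structure):
    """Hypothetical stand-in for a struct that owns a C pointer."""

    _fields_ = [("data", ctypes.c_void_p), ("flags", ctypes.c_uint)]

    def __del__(self):
        # A real binding would call a C dispose function here.
        freed.append(self.data)


original = FakeCXString(0x1234, 0)

# Byte-for-byte copy that keeps the same Python type, finalizer included.
duplicate = FakeCXString.from_buffer_copy(bytes(original))

del duplicate  # finalizer runs on the copy
del original   # finalizer runs again, with the same pointer value

print(freed)
```

Under CPython's reference counting both `del` statements run the finalizer immediately, so the same pointer value is recorded twice, which is the pattern behind the double free reported above.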
It's a Python 3.8 regression, so I'm marking it as a release blocker. |
I was surprised by the following test in StructUnionType_paramfunc():
if ((size_t)self->b_size > sizeof(void*)) {
... copy the structure ...
} else {
... pass by reference (?) ...
}
So I wrote a simple C library with the structure: typedef struct Point { int x; int y; } and functions modifying this structure, with the structure passed either by copy or by reference. To my surprise, ctypes works as expected :-) Try the attached point.c and point.py:
$ gcc -c point.c -fpic && gcc -shared -o libpoint.so point.o && python3 point.py
p = <Point x=1 y=2>
p = <Point x=1 y=2>
p = <Point x=0 y=0>
ok
sizeof(Point) = 16 bytes
Modify CoordType in point.c and point.py to test different sizes. On x86-64, I get:
|
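The size comparison quoted from StructUnionType_paramfunc() can be explored from Python alone, without building libpoint.so. This is only a sketch: `make_point()` and `takes_copy_branch()` are hypothetical helpers, not part of the attached point.py, and the second one merely mirrors the `b_size > sizeof(void*)` condition. It does not call any C function or reproduce the measured results.

```python
import ctypes


def make_point(coord_type):
    """Build a two-field Point structure using the given coordinate type."""

    class Point(ctypes.Structure):
        _fields_ = [("x", coord_type), ("y", coord_type)]

    return Point


def takes_copy_branch(struct_type):
    """Mirror the quoted condition: is the structure larger than a pointer?"""
    return ctypes.sizeof(struct_type) > ctypes.sizeof(ctypes.c_void_p)


for coord_type in (ctypes.c_int8, ctypes.c_int16, ctypes.c_int32, ctypes.c_int64):
    point_type = make_point(coord_type)
    print(
        f"{ctypes.sizeof(coord_type)}-byte coordinates: "
        f"sizeof(Point) = {ctypes.sizeof(point_type)} bytes, "
        f"copy branch = {takes_copy_branch(point_type)}"
    )
```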
Ok, it took me a while to understand the subtle ctypes internals. serge-sans-paille and I read the C code and ran a debugger (gdb) on it to understand how passing a structure by copy works in ctypes.
The root function is StructUnionType_paramfunc(): it copies the structure if it's larger than sizeof(void*) bytes. StructUnionType_paramfunc() creates a new internal object which is stored in parg->obj. The only purpose of this object is to release the memory copy allocated by PyMem_Malloc().
The problem lies with this internal object: if the structure has a finalizer, the finalizer is called twice. First, it is called on the internal object, which is more or less a full copy of the structure. Second, it is called on the structure itself (once the last reference to the structure is removed). The code behaves as if the finalizer were called twice on the same structure. Even though these are two separate Python objects, they contain the same structure values. For example, if the structure contains a pointer to a memory block and the finalizer calls free(ptr), free(ptr) will be called twice with the same ptr value.
This surprising behavior comes from this code:
void *new_ptr = PyMem_Malloc(self->b_size);
if (new_ptr == NULL)
    return NULL;
memcpy(new_ptr, self->b_ptr, self->b_size);
copied_self = (CDataObject *)PyCData_AtAddress(
    (PyObject *)Py_TYPE(self), new_ptr);
copied_self->b_needsfree = 1;
copied_self reuses the exact same type as the structure. If the structure has a finalizer defined in Python, it will be called. The finalizer of copied_self is called at the "cleanup:" label of _ctypes_callproc():
for (i = 0; i < argcount; ++i)
    Py_XDECREF(args[i].keep);
--
Trying to duplicate self isn't needed. All we need is to call PyMem_Free(ptr) when _ctypes_callproc() exits. My PR 15612 introduces a new internal StructParam_Type type which has exactly one purpose: to call PyMem_Free(ptr) in its deallocator. The type has no finalizer. |
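The idea behind the fix can be modelled in pure Python. The sketch below is an analogy only, not the C implementation from the PR: `Owner`, `BufferKeeper`, and the `finalized` list are hypothetical names. `BufferKeeper` plays the role described for StructParam_Type: it owns a raw copy of the bytes and has no connection to the structure's Python type, so releasing it never runs the structure's finalizer.

```python
import ctypes

# Records the pointer value each time the structure finalizer runs.
finalized = []


class Owner(ctypes.Structure):
    """Hypothetical structure whose finalizer releases a C pointer."""

    _fields_ = [("ptr", ctypes.c_void_p), ("length", ctypes.c_size_t)]

    def __del__(self):
        finalized.append(self.ptr)


class BufferKeeper:
    """Holds a raw byte copy of a structure; defines no finalizer."""

    def __init__(self, struct):
        size = ctypes.sizeof(struct)
        # Plain char array: same bytes as the structure, different type.
        self.buffer = (ctypes.c_char * size).from_buffer_copy(bytes(struct))


owner = Owner(0xBEEF, 4)
keeper = BufferKeeper(owner)

del keeper  # the copy is released; Owner.__del__ does not run
after_keeper_release = list(finalized)

del owner   # the only finalizer call
print(after_keeper_release, finalized)
```

Because the copy is no longer an instance of the structure's own type, only the original object triggers the finalizer, so the pointer is released once.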
Ok, the regression is now fixed. I'm closing the issue. Thanks to Petr Viktorin for the bug report, Paul Monson for the proposed fix, and serge-sans-paille for helping investigate this complex bug. |