gh-76961: Fix the PEP3118 format string for ctypes.Structure #5561

eric-wieser · 2018-02-06T05:33:46Z

The summary of this diff is that it:

adds a _ctypes_alloc_format_padding function to append strings like 37x to a format string to indicate 37 padding bytes
removes the branches that amount to "give up on producing a valid format string if the struct is packed"
combines the resulting adjacent if (isStruct) {s now that neither is if (isStruct && !isPacked) {
invokes _ctypes_alloc_format_padding to add padding between structure fields, and after the last structure field. The computation of the total size is unchanged from the one ctypes already used.

This patch does not affect any existing alignment computation; all it does is use subtraction to deduce the amount of padding introduced by the existing code.
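The subtraction idea can be sketched in pure Python. This is only an illustration, not the C code from the patch: `padded_format` and the `codes` mapping are hypothetical names, and the sketch relies on the `offset` and `size` attributes that ctypes exposes on field descriptors.

```python
import ctypes


def padded_format(struct_type, codes):
    """Hypothetical helper: emit a PEP 3118-style string with explicit padding.

    `codes` maps each field name to its format code (for example "<B").
    """
    parts = []
    end_of_previous = 0
    for name, _ctype in struct_type._fields_:
        field = getattr(struct_type, name)
        # ctypes already chose field.offset; the gap is found by subtraction.
        gap = field.offset - end_of_previous
        if gap > 0:
            parts.append(f"{gap}x")
        parts.append(f"{codes[name]}:{name}:")
        end_of_previous = field.offset + field.size
    # Trailing padding: total size minus the end of the last field.
    tail = ctypes.sizeof(struct_type) - end_of_previous
    if tail > 0:
        parts.append(f"{tail}x")
    return "T{" + "".join(parts) + "}"


class Example(ctypes.Structure):
    # A uint8 followed by a uint32; the gap depends on uint32 alignment.
    _fields_ = [("x", ctypes.c_uint8), ("y", ctypes.c_uint32)]


print(padded_format(Example, {"x": "<B", "y": "<I"}))
```

On platforms where a uint32 is 4-byte aligned, the gap between `x` and `y` is 3 bytes, so the string contains `3x` between the two fields.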

Without this fix, it would never include padding bytes - an assumption that was only
valid in the case when _pack_ was set - and this case was explicitly not implemented.

This should allow conversion from ctypes structs to numpy structs

Fixes numpy/numpy#10528

https://bugs.python.org/issue32780

Fixes #76961

Issue: ctypes: memoryview gives incorrect PEP3118 format strings for both packed and unpacked structs #76961

the-knights-who-say-ni · 2018-02-06T05:33:48Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again for your contribution and we look forward to looking at it!

mattip

LGTM. The PR is sufficient to fix the problems described. Perhaps judicious use of spacing between fields could make the format clearer, especially when anonymous padding is added. For instance, I would find "<b:x:7x<Q:y:" clearer as "<b:x: 7x <Q:y:". Not critical as the format is probably going to be machine-parsed anyway.

eric-wieser · 2018-05-23T15:45:02Z

That looks like a reasonable suggestion, but possibly out of scope for this PR.

Less intrusively, I could add whitespace in the test expectations, and remove it before comparing.

Let's see how the core devs feel.

Thanks for the review!
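A minimal sketch of the "whitespace in the test expectations" idea, assuming a hypothetical helper named `strip_whitespace` (the real test suite may do this differently):

```python
import re


def strip_whitespace(fmt):
    """Hypothetical test helper: drop all whitespace from a format string."""
    return re.sub(r"\s", "", fmt)


readable = "T{ <b:x: 7x <Q:y: }"   # spaced out for human readers
compact = "T{<b:x:7x<Q:y:}"        # what a machine-generated string looks like
print(strip_whitespace(readable) == compact)
```

This keeps the generated format string untouched while letting the expected values in the tests be written in the more readable spaced form.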

eric-wieser · 2018-10-21T07:25:03Z

@abalkin: Any chance you could take a look at this?

eric-wieser · 2019-04-14T06:35:52Z

@brettcannon: I'd argue this is type-bugfix, not type-enhancement - the implementation in master produces invalid buffers in almost all cases. This isn't just adding support for _pack_ (an enhancement), but also fixing the behavior for when _pack_ is absent.

abalkin · 2022-08-14T09:31:39Z

Modules/_ctypes/stgdict.c

+    Py_ssize_t log_n = 0;
+    while (n > 0) {
+        log_n++;
+        n /= 10;


Multiplication is often faster than division. Can this be rewritten by computing powers of 10 until n is exceeded?

Better yet, just inline linear search.

if (n < 10ULL) return 1; if (n < 100ULL) return 2; ... if (n < 10_000_000_000_000_000_000ULL) return 20;

On second thought, I would be surprised if this has not been implemented elsewhere in the CPython code base. Off the top of my head, I cannot recall where it could be, but I will try to search. If someone beats me to it - please leave a note.

Can this be rewritten by computing powers of 10 until n is exceeded?

This is risky because the power can overflow
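The two digit-counting approaches discussed here can be modeled in Python. This is a sketch only: the function names are hypothetical, and since Python integers do not overflow, the 64-bit wraparound is simulated with an explicit modulus.

```python
def digits_by_division(n):
    """Mirror of the C loop in the diff; returns 0 for n == 0."""
    count = 0
    while n > 0:
        count += 1
        n //= 10
    return count


def digits_by_thresholds(n):
    """Linear search over powers of ten, as suggested in review (n >= 1)."""
    for count in range(1, 21):
        if n < 10 ** count:
            return count
    raise ValueError("n does not fit in 64 bits")


UINT64_MODULUS = 2 ** 64
# Multiplying 10**19 by 10 in 64-bit unsigned arithmetic wraps around,
# so a running power of ten can become smaller instead of larger.
wrapped = (10 ** 19 * 10) % UINT64_MODULUS
print(wrapped < 10 ** 19)
```

A fixed table of thresholds sidesteps the wraparound because no power of ten is computed at run time; a loop that keeps multiplying a 64-bit value by 10 does not.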

I would be surprised if this has not been implemented elsewhere in the CPython code base. Off the top of my head, I cannot recall where it could be, but I will try to search.

It took me a little longer than I expected to get back to this, but the code that I was looking for is in Python/dtoa.c.

@mdickinson - Is the approximation used in dtoa applicable to the problem at hand? If so, do you think that code can be factored out and called here?

@abalkin Looks like the relevant code is no longer part of the PR. But the Python/dtoa.c code is for floats, and assumes IEEE 754 format; it's not clear to me how it could be used here. (In general, I'm reluctant to add floating-point dependence to code that doesn't need it.)

abalkin · 2022-08-14T09:59:45Z

Modules/_ctypes/stgdict.c

+    }
+
+    /* decimal characters + x + null */
+    buf = PyMem_Malloc(clog10(padding) + 2);


I left a comment about log10 implementation above, but looking at the actual use, I don't see why it is needed. Can't we just make buf

char buf[20];

and not allocate it on the heap?

I wanted to avoid risking introducing a buffer overflow by accident by choosing too short a buffer; especially since I don't think we care about performance here. The whole framework for building the format strings here consists of repeated heap allocations, so one more allocation doesn't seem like a big deal.

I think 20 isn't actually enough, as an int64 can need up to 19 digits, and then we need the x and the null.
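The buffer-size arithmetic can be checked quickly, assuming the padding count is a signed 64-bit value as stated above:

```python
INT64_MAX = 2 ** 63 - 1

max_digits = len(str(INT64_MAX))   # decimal digits in the largest int64
needed = max_digits + 1 + 1        # digits + 'x' + terminating null
print(max_digits, needed)          # prints: 19 21
```

So a 20-byte buffer would be one byte short in the worst case.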

~~I could ask PyOS_snprintf to compute the size for me if you'd prefer? Although I can't see any evidence that PyOS_snprintf is actually called with a null buffer anywhere in CPython.~~ Never mind, PyOS_snprintf does not support this feature of snprintf (#95993).

I've pushed the version with stack allocation as requested

eric-wieser · 2023-01-10T14:32:00Z

@mdickinson, I really appreciate that you were able to review #5576 a few weeks ago. Do you think you'd be able to take a look at this one too?

mdickinson · 2023-01-14T18:02:19Z

@eric-wieser Yes, I'll take a look. It won't be this weekend, though, I'm afraid.

mdickinson

This LGTM, and behaves as expected in my manual testing; the code looks good. I left a couple of nitpick-level comments / suggestions.

@abalkin This is still assigned to you; is there anything you're aware of that would mean this shouldn't be merged?

The issue is marked as a "bug" (which makes some sense), but I think this is sufficiently new-feature'y that the changes shouldn't be backported to Python <3.12. @abalkin: thoughts?

The way we build up the format string doesn't seem ideal - it looks as though it would take time quadratic in the number of fields of the struct. But AFAICT that's a pre-existing issue, not introduced in this PR. Presumably for the sort of things for which this is used in practice this hasn't yet been a real issue.
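A rough cost model for the quadratic-time concern, under the assumption (not verified against the C source here) that every append allocates a fresh buffer and copies the whole string built so far. `chars_copied` is a hypothetical name.

```python
def chars_copied(num_fields, piece_len):
    """Count characters copied if every append rebuilds the whole string."""
    total = 0
    current_len = 0
    for _ in range(num_fields):
        current_len += piece_len
        total += current_len  # fresh allocation copies prefix plus new piece
    return total


for n in (10, 100, 1000):
    print(n, chars_copied(n, piece_len=8))
```

Under that assumption the total is `piece_len * n * (n + 1) / 2`, so ten times as many fields costs roughly a hundred times as much copying.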

Observation: it seems there's no documentation of the "data format description" language outside PEP 3118 itself, unless I'm missing something. I'd expect the ctypes documentation to at least have a pointer to PEP 3118 (though it would be better to have a description in the docs themselves at some point, since PEPs are historical documents that shouldn't generally be relied upon to be up to date with implementations.) Anyway, that's off-topic for this PR.

mdickinson · 2023-02-05T15:21:56Z

@eric-wieser: Changes LGTM; thank you.

@abalkin Python 3.12a5 is due tomorrow, which means that if we leave this PR open for another couple of days we'll end up having to nudge the news entry again. (That was poor timing on my part - sorry about that.) But I think this is ready, so I'll take the EAFP approach: I'll merge, and we can tweak later if necessary.

abalkin · 2023-02-05T17:03:32Z

@mdickinson - my only concern was with the floor log10 computation that I felt could be done better. If it looks good enough to you, I have no objections to the merge.

bedevere-bot · 2023-02-05T17:56:01Z

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot x86 Gentoo Installed with X 3.x has failed when building commit 90d85a9.

What do you need to do:

Don't panic.
Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/464/builds/3771) and take a look at the build logs.
Check if the failure is related to this commit (90d85a9) or if it is a false positive.
If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/464/builds/3771

Failed tests:

test_ctypes

Failed subtests:

test_native_types - test.test_ctypes.test_pep3118.Test.test_native_types

Summary of the results of the build (if available):

== Tests result: FAILURE then FAILURE ==

413 tests OK.

1 test failed:
test_ctypes

17 tests skipped:
test_asdl_parser test_check_c_globals test_clinic test_devpoll
test_gdb test_ioctl test_kqueue test_launcher test_msilib
test_peg_generator test_perf_profiler test_startfile
test_winconsoleio test_winreg test_winsound test_wmi
test_zipfile64

1 re-run test:
test_ctypes

Total duration: 22 min

Click to see traceback logs

Traceback (most recent call last):
  File "/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.12/test/test_ctypes/test_pep3118.py", line 27, in test_native_types
    self.assertEqual(normalize(v.format), normalize(fmt))
AssertionError: 'T{<b:x:3x<Q:y:}' != 'T{<b:x:7x<Q:y:}'
- T{<b:x:3x<Q:y:}
?        ^
+ T{<b:x:7x<Q:y:}
?        ^

mdickinson · 2023-02-05T17:58:51Z

Urgh. That looks like a legitimate failure on 32-bit platforms, introduced by this PR. @eric-wieser I think this is a problem with the test rather than the core logic. Do you agree?

eric-wieser · 2023-02-05T18:07:12Z

Yes, you're right; the cases that don't set _pack_ at all are platform-dependent.

I think a test written for 32-bit platforms would pass on both, so replacing the uint64 with uint32s should make the problem go away.
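The platform dependence can be observed directly from ctypes. A small sketch, with hypothetical class names, that prints the padding after a one-byte field for both element types:

```python
import ctypes


class WithUint64(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int8), ("y", ctypes.c_uint64)]


class WithUint32(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int8), ("y", ctypes.c_uint32)]


# Padding after the one-byte field x is the offset of y minus 1.
print("uint64 padding:", WithUint64.y.offset - 1)
print("uint32 padding:", WithUint32.y.offset - 1)
print("uint64 alignment:", ctypes.alignment(ctypes.c_uint64))
```

The uint64 case follows the platform's alignment for 64-bit integers, which is why the buildbot saw `3x` where the test expected `7x`; the uint32 case is expected to give 3 bytes on both kinds of platform.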

bedevere-bot · 2023-02-05T18:24:58Z

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot x86 Gentoo Non-Debug with X 3.x has failed when building commit 90d85a9.

What do you need to do:

Don't panic.
Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/58/builds/3813) and take a look at the build logs.
Check if the failure is related to this commit (90d85a9) or if it is a false positive.
If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/58/builds/3813

Failed tests:

test_ctypes

Failed subtests:

test_native_types - test.test_ctypes.test_pep3118.Test.test_native_types

Summary of the results of the build (if available):

== Tests result: FAILURE then FAILURE ==

419 tests OK.

10 slowest tests:

test_math: 4 min 3 sec
test_multiprocessing_spawn: 3 min 45 sec
test_asyncio: 2 min 46 sec
test_concurrent_futures: 2 min 45 sec
test_tokenize: 1 min 46 sec
test_multiprocessing_forkserver: 1 min 32 sec
test_unparse: 1 min 24 sec
test_multiprocessing_fork: 1 min 15 sec
test_signal: 1 min 12 sec
test_compileall: 1 min 2 sec

1 test failed:
test_ctypes

14 tests skipped:
test_check_c_globals test_devpoll test_ioctl test_kqueue
test_launcher test_msilib test_peg_generator test_perf_profiler
test_startfile test_winconsoleio test_winreg test_winsound
test_wmi test_zipfile64

1 re-run test:
test_ctypes

Total duration: 26 min 53 sec

Click to see traceback logs

Traceback (most recent call last):
  File "/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.nondebug/build/Lib/test/test_ctypes/test_pep3118.py", line 27, in test_native_types
    self.assertEqual(normalize(v.format), normalize(fmt))
AssertionError: 'T{<b:x:3x<Q:y:}' != 'T{<b:x:7x<Q:y:}'
- T{<b:x:3x<Q:y:}
?        ^
+ T{<b:x:7x<Q:y:}
?        ^

mdickinson · 2023-02-05T19:16:37Z

I think a test written for 32 bit platforms would pass on both [...]

I'll give that a go; PR shortly. I'm wondering whether I'm going to end up hitting problems with ILP32 platforms not being consistent about whether uint32 is aliased to unsigned int or to unsigned long, though.

mdickinson · 2023-02-05T19:26:40Z

PR shortly

#101587

This PR fixes the buildbot failures introduced by the merge of #5561, by restricting the relevant tests to something that should work on both 32-bit and 64-bit platforms. It also silences some compiler warnings introduced in that PR.

the-knights-who-say-ni added the CLA not signed label Feb 6, 2018

bedevere-bot added the awaiting review label Feb 6, 2018

eric-wieser force-pushed the ctypes-padding branch from fd7e46e to 95157de Compare February 6, 2018 05:37

eric-wieser mentioned this pull request Feb 6, 2018

BUG: Cannot convert ctypes struct into numpy array numpy/numpy#10528

Closed

eric-wieser force-pushed the ctypes-padding branch 5 times, most recently from d851b6e to 9dbc70a Compare February 6, 2018 07:35

eric-wieser mentioned this pull request Feb 6, 2018

Numpy does not recognize ctypes arrays with c_wchar field numpy/numpy#10100

Closed

eric-wieser force-pushed the ctypes-padding branch from 9dbc70a to ac4e5bb Compare February 6, 2018 10:04

eric-wieser mentioned this pull request Feb 7, 2018

BUG: np.dtype(ctype) does not respect endianness numpy/numpy#10533

Closed

eric-wieser mentioned this pull request Apr 25, 2018

WIP: Remove fragile use of __array_interface__ in ctypeslib.as_array numpy/numpy#10970

Merged

eric-wieser mentioned this pull request May 21, 2018

bpo-32782: PEP3118 itemsize of an empty ctypes array should not be 0 #5576

Merged

mattip approved these changes May 23, 2018

bedevere-bot added awaiting core review and removed awaiting review labels May 23, 2018

Mariatta removed the CLA not signed label Jun 15, 2018

the-knights-who-say-ni added the CLA signed label Jun 15, 2018

abalkin self-assigned this Jul 8, 2018

eric-wieser mentioned this pull request Aug 12, 2018

BUG: np.dtype(ctypes.Structure) does not respect _pack_ field numpy/numpy#10532

Closed

brettcannon added the type-feature A feature request or enhancement label Apr 2, 2019

brettcannon added type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Apr 17, 2019

abalkin reviewed Aug 14, 2022

github-actions bot removed the stale Stale PR or inactive for long period of time. label Aug 15, 2022

eric-wieser added 2 commits August 15, 2022 10:55

fix incorrect name and documentation

27b1601

remove the heap allocation as requested

e300e16

eric-wieser requested review from abalkin and removed request for skrah October 7, 2022 19:02

Merge branch 'main' into ctypes-padding

7e8b4f9

eric-wieser mentioned this pull request Dec 23, 2022

memoryview & ctypes: incorrect itemsize for empty array #76963

Closed

AlexWaygood changed the title ~~bpo-32780: Fix the PEP3118 format string for ctypes.Structure~~ gh-76961: Fix the PEP3118 format string for ctypes.Structure Jan 10, 2023

mdickinson self-requested a review January 14, 2023 18:01

mdickinson approved these changes Feb 5, 2023

bedevere-bot added awaiting merge and removed awaiting core review labels Feb 5, 2023

eric-wieser and others added 2 commits February 5, 2023 12:37

Update Misc/NEWS.d/next/Core and Builtins/2018-02-05-21-54-46.bpo-327…

4de7a8a

…80.Dtiz8z.rst Co-authored-by: Mark Dickinson <dickinsm@gmail.com>

Apply suggestions from code review

26311ae

mdickinson merged commit 90d85a9 into python:main Feb 5, 2023

bedevere-bot removed the awaiting merge label Feb 5, 2023

mdickinson mentioned this pull request Feb 5, 2023

gh-76961: Possible fix for buildbot failures in test_pep3118 #101587

Merged
