BUG: Various fixes to _dtype_from_pep3118 #9054

eric-wieser · 2017-05-05T10:15:07Z

Fixes #9053, and groundwork for #9049.

Again, this is probably easier to review commit-by-commit.

Adding padding fields means that dtypes with the same fields cannot necessarily be cast to one another. Padding shouldn't prevent casting. Partially fixes numpy#9053

Fixes numpygh-9053

mhvk · 2017-05-05T17:58:37Z

I like the solution with the stream class (though I wonder to what extent you're reinventing ply...). But I cannot say I went through this in enough detail.

mhvk · 2017-05-05T17:59:19Z

p.s. would be nice to have #8774 so you don't have to redefine gcd and lcm...

eric-wieser · 2017-05-05T18:03:22Z

#8774 actually falls back on _internals._gcd for object arrays anyway. I also think that that file is minimizing the number of things from np.core it uses, to avoid the possibility of recursion loops either at import-time or runtime.
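For context, dependency-free helpers of this kind are only a few lines long. The following is a minimal sketch, not the contents of the private module mentioned above; the names `gcd` and `lcm` are chosen here purely for illustration.

```python
def gcd(a, b):
    """Greatest common divisor via Euclid's algorithm (illustrative helper)."""
    while b:
        a, b = b, a % b
    return a


def lcm(a, b):
    """Least common multiple, built on gcd (illustrative helper)."""
    return a // gcd(a, b) * b


print(gcd(12, 18))  # 6
print(lcm(4, 6))    # 12
```

Keeping such helpers local avoids importing anything from the rest of the package, which is the recursion concern raised in the comment.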

eric-wieser · 2017-05-05T18:05:40Z

> though I wonder to what extent you're reinventing ply

I've seen in the past that dependencies for numpy are a pain from a distribution perspective (#that-pr-about-deprecating-np.float), and it's better to do simple things ourselves.

mhvk · 2017-05-05T18:10:15Z

Yes, makes sense. And as said, I quite like your stream class; and the path that gets taken is at least a bit clearer than with a lex file (or a regex).
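To illustrate the general idea of a stream-based parser discussed here, below is a rough sketch. It is not the class from this PR: `FormatStream`, its methods, and `parse_counts` are hypothetical names, and the grammar handled is a deliberately tiny subset (an optional repeat count followed by a one-character type code).

```python
class FormatStream:
    """Hypothetical cursor over a format string; not the class from this PR."""

    def __init__(self, s):
        self.s = s

    def advance(self, n):
        """Remove and return the next n characters."""
        head, self.s = self.s[:n], self.s[n:]
        return head

    def consume(self, prefix):
        """Drop `prefix` if the stream starts with it; report whether it did."""
        if self.s.startswith(prefix):
            self.advance(len(prefix))
            return True
        return False

    def consume_while(self, pred):
        """Remove and return the longest leading run of characters satisfying pred."""
        i = 0
        while i < len(self.s) and pred(self.s[i]):
            i += 1
        return self.advance(i)

    def __bool__(self):
        # The stream is truthy while unparsed input remains.
        return bool(self.s)


def parse_counts(fmt):
    """Split a simple format such as '3ix0i' into (count, code) pairs."""
    stream = FormatStream(fmt)
    items = []
    while stream:
        digits = stream.consume_while(str.isdigit)
        count = int(digits) if digits else 1
        items.append((count, stream.advance(1)))
    return items


print(parse_counts('3ix0i'))  # [(3, 'i'), (1, 'x'), (0, 'i')]
```

Each parsing step is an explicit method call on the stream, so the control flow can be read top to bottom, which is the readability advantage over a lex file or a regex noted in the comment.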

ahaldane · 2017-05-08T02:40:21Z

The 'stream' cleanup commit looks nice. Perhaps consider making the class global level? I'm fine as is though.
The three bugfix commits look good:
- The itemsize commit partly overlaps with #7798 (ENH: properly account for trailing padding in PEP3118), as you mentioned. I am fine merging it as you have it for now. Although, as noted there we are probably incorrectly interpreting the spec: According to spec ix should be 5 bytes, not 8. If the user wants 8 they need to do ix0i. We might leave that for #7798, since the problem is purely theoretical right now: I don't know of any public Python modules (including numpy) that can create buffers with the 'ix' format.
- The commit using the 'names' list to store the field order looks correct, and even seems to fix bugs we didn't come across (e.g., the extra test that the 1-item itemsize is unchanged).
- The autogenerated names commit looks good and is much easier to read than the old code.

I read over it well enough that it has my +1 for merging. If there are no other comments I would be happy to merge in a day or two.

ahaldane · 2017-05-09T16:48:35Z

All right, pulling the trigger here. Thanks @eric-wieser !

eric-wieser · 2017-05-09T16:51:50Z

> Although, as noted there we are probably incorrectly interpreting the spec: According to spec ix should be 5 bytes, not 8

Can you give a reference for that?

pv · 2017-05-09T16:59:01Z

Is it that clear? As I understand, the padding is supposed to follow typical C compiler behavior.

eric-wieser · 2017-05-09T16:59:42Z

@pv: Here's the problem. The C code you refer to gives 8, but struct.Struct('ix').size gives 5.

pv · 2017-05-09T17:03:00Z

Yes, but there is also a question of whether the struct.Struct implementation is correct.

pv · 2017-05-09T17:12:33Z

Since also `struct.Struct('ib').size == 5`, I guess the question then is whether trailing padding must be explicit or not. The PEP does not spell this out. Python's struct module syntax is older than the PEP, so maybe that is the behavior that has priority.

ahaldane · 2017-05-09T17:17:07Z

Here's the reference (copied from my comment in #7798):

The struct module says this about how format strings treat trailing padding:

> To align the end of a structure to the alignment requirement of a particular type, end the format with the code for that type with a repeat count of zero.

My interpretation is that for a format like 'ix', even though the type is implicitly @ (aligned) we should only add 1 trailing padding byte. If we wanted numpy-style padding bytes the format string should instead be something like 'ix0i'.
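The sizes under discussion can be checked directly with Python's `struct` module. This is a small sketch; the concrete numbers in the comments assume a typical platform where a C `int` is 4 bytes with 4-byte alignment.

```python
import struct

# Native mode ('@') is the default: members are aligned, but no padding
# is added after the last member unless the format asks for it.
int_size = struct.calcsize('i')        # 4 on typical platforms
size_ix = struct.calcsize('ix')        # int plus one explicit pad byte
size_ib = struct.calcsize('ib')        # int plus one signed char
size_ix0i = struct.calcsize('ix0i')    # zero-count 'i' aligns the end to int

print(int_size, size_ix, size_ib, size_ix0i)  # typically: 4 5 5 8
```

The trailing `0i` is what requests the C-compiler-style trailing padding; without it, `struct` stops at the last byte actually described by the format.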

MAINT: refactor _dtype_from_pep3118 in terms of a stream

7f6c95f

eric-wieser added 03 - Maintenance 55 - Needs work labels May 5, 2017

eric-wieser added 2 commits May 5, 2017 12:00

ENH: Pad with itemsize, not padding fields

fed2e1a

Adding padding fields means that dtypes with the same fields cannot necessarily be cast to one another. Padding shouldn't prevent casting. Partially fixes numpy#9053

BUG: Fix non-determinism in order of fields created from pep3118 formats

3c4545f

Fixes numpygh-9053

eric-wieser changed the title ~~[WIP] MAINT: refactor _dtype_from_pep3118 in terms of a stream~~ BUG: Various fixes to _dtype_from_pep3118 May 5, 2017

BUG: Prevent autogenerated names clashing with given names

a4f435c
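One way to avoid such clashes is to skip any generated name that is already taken. The sketch below only illustrates that idea and is not the code from this commit: `fill_names` is a hypothetical helper, unnamed fields are represented as `None`, and the `f0`, `f1`, ... naming scheme is an assumption made for the example.

```python
def fill_names(given):
    """Replace None entries with generated names that avoid all given names.

    Hypothetical helper: generated names follow an assumed 'f<N>' scheme.
    """
    taken = {name for name in given if name is not None}
    counter = 0
    result = []
    for name in given:
        if name is None:
            # Skip over candidates that collide with a user-supplied
            # name or with a name generated earlier in this loop.
            while f'f{counter}' in taken:
                counter += 1
            name = f'f{counter}'
            taken.add(name)
        result.append(name)
    return result


print(fill_names([None, 'f0', None]))  # ['f1', 'f0', 'f2']
```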

eric-wieser force-pushed the fix-pep3118 branch from 552c4a3 to a4f435c Compare May 5, 2017 11:39

eric-wieser added 00 - Bug component: numpy._core and removed 55 - Needs work labels May 5, 2017

eric-wieser mentioned this pull request May 5, 2017

DOC: update structured array docs to reflect #6053 #9056

Merged

eric-wieser mentioned this pull request May 8, 2017

ENH: properly account for trailing padding in PEP3118 #7798

Closed

aldanor mentioned this pull request May 8, 2017

Allow std::complex field with PYBIND11_NUMPY_DTYPE pybind/pybind11#831

Merged

ahaldane merged commit 8618066 into numpy:master May 9, 2017
