Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: concatenate of structured dtypes does not check field orders match #15494

Closed
relleums opened this issue Feb 3, 2020 · 4 comments · Fixed by #19226
Closed

BUG: concatenate of structured dtypes does not check field orders match #15494

relleums opened this issue Feb 3, 2020 · 4 comments · Fixed by #19226

Comments

@relleums
Copy link

relleums commented Feb 3, 2020

When concatenating, or stacking a list of recarrays, I encountered unexpected outcome when there are empty recarrays with a different column ordering. In such a case, np.concatenate mixes up the column-key to value assignments. The issue is not present with empty recarrays in general. Only when the ordering of the columns is different from the ordering in the non empty recarray. I did not expect the ordering of the columns to matter.

Reproducing code example:

import numpy as np


def assert_as_expected(final_recarray):
    assert len(final_recarray) == 1
    assert final_recarray["a"][0] == 1
    assert final_recarray["b"][0] == 2


A = ("a", "<i8")
B = ("b", "<i8")
a1b2 = np.rec.array(obj=[(1, 2)],     dtype=[A, B])
a_b_ = np.rec.array(obj=np.array([]), dtype=[A, B])
b_a_ = np.rec.array(obj=np.array([]), dtype=[B, A])

fine = [
    a1b2,
    a_b_,
]

evil = [
    a1b2,
    b_a_,
]

assert_as_expected(np.concatenate(fine))
assert_as_expected(np.concatenate(evil))  # <-- fails

Error message:

No warning. No exception. Just unexpected outcome.

Numpy/Python version information:

1.17.2 3.7.4 (default, Aug 13 2019, 20:35:49) 
[GCC 7.3.0]
@eric-wieser
Copy link
Member

eric-wieser commented Feb 3, 2020

rec.array is a distraction here, the bug is present in regular arrays:

A = ("a", "<i8")
B = ("b", "<i8")
a1b2 = np.array([(1, 2)], dtype=[A, B])
b_a_ = np.array([], dtype=[B, A])
np.concatenate([a1b2, b_a_])
# array([(1, 2)], dtype=[('b', '<i8'), ('a', '<i8')])
# fields values were swapped

@eric-wieser eric-wieser changed the title concatenate, and stack do mix up columns in recarrays BUG: concatenate of structured dtypes does not check field orders match Feb 3, 2020
@eric-wieser
Copy link
Member

eric-wieser commented Feb 3, 2020

Tracing this back further, I think the bug is actually:

>>> np.result_type([A, B], [B, A])
dtype([('b', '<i8'), ('a', '<i8')])

which should probably error.

@eric-wieser
Copy link
Member

@ahaldane, I think this is something we missed with the "assign by position" changes to fields. can_cast_fields still uses assign-by-name comparisons.

@ahaldane
Copy link
Member

ahaldane commented Feb 4, 2020

Yeah, seems like it. Thanks for finding it!

Those two arrays should be assignable to each other using the assign-by-position convention (eg, b_a_[:] = a1b2, so we should be careful not to break such assignment in the fix.

seberg added a commit to seberg/numpy that referenced this issue Jun 11, 2021
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
seberg added a commit to seberg/numpy that referenced this issue Jun 26, 2021
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
seberg added a commit to seberg/numpy that referenced this issue Jun 26, 2021
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
seberg added a commit to seberg/numpy that referenced this issue Feb 24, 2022
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
seberg added a commit to seberg/numpy that referenced this issue Feb 24, 2022
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
seberg added a commit to seberg/numpy that referenced this issue May 5, 2022
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
seberg added a commit to seberg/numpy that referenced this issue May 9, 2022
This PR replaces the old numpygh-15509 implementing proper type promotion
for structured voids.  It further fixes the casting safety to consider
casts with equivalent field number and matching order as "safe"
and if the names, titles, and offsets match as "equiv".

The change perculates into the void comparison, and since it fixes
the order, it removes the current FutureWarning there as well.

This addresses liberfa/pyerfa#77
and replaces numpygh-15509 (the implementation has changed too much).

Fixes numpygh-15494  (and probably a few more)

Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants