ARROW-4181: [Python] Fixes for Numpy struct array conversion #3614

Closed

@pitrou pitrou commented Feb 11, 2019

This fixes two issues:

  • object fields inside numpy structs should be allowed
  • buggy ndarray indexing if stride % itemsize != 0
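
Both conditions can be reproduced with NumPy alone. The sketch below is illustrative and not taken from the PR: the field names (`flag`, `value`, `name`, `x`) and the helper `read_value` are hypothetical. It builds a packed struct dtype whose field view has a stride that is not a multiple of its item size, shows that elements must be located by byte offset, and builds a struct array containing an object field.

```python
import numpy as np

# Packed struct (no padding): a 1-byte field followed by a 4-byte field,
# so each record is 5 bytes long.
dt = np.dtype([("flag", "u1"), ("value", "<i4")])
arr = np.zeros(4, dtype=dt)
arr["value"] = [10, 20, 30, 40]

field = arr["value"]              # strided view into the records, not a copy
stride = field.strides[0]         # bytes between consecutive elements: 5
itemsize = field.dtype.itemsize   # bytes per element: 4
misaligned = stride % itemsize != 0

# Correct addressing works in bytes: field offset + i * stride.
raw = arr.tobytes()
offset = dt.fields["value"][1]    # byte offset of "value" inside a record

def read_value(i):
    """Hypothetical helper: decode element i of the "value" field."""
    start = offset + i * stride
    return int.from_bytes(raw[start:start + itemsize], "little", signed=True)

values = [read_value(i) for i in range(len(arr))]

# A struct array with an object field, the other case named above.
obj_arr = np.array(
    [("a", 1.0), (None, 2.0)],
    dtype=[("name", object), ("x", "f8")],
)
has_object_field = obj_arr.dtype.hasobject
```

Any code that converts the stride to an element count (`stride // itemsize`) before indexing would land on the wrong bytes for an array like `arr["value"]`; whether that is the exact mechanism of the bug fixed here is an assumption, since the description does not say.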

@pitrou pitrou force-pushed the ARROW-4181-py-struct-conversion-fixes branch from 06b7fc4 to 9886e29 Compare February 11, 2019 20:08

kszucs commented Feb 12, 2019

@pitrou You can test edge cases with hypothesis, like the following:

import hypothesis as h
import hypothesis.extra.numpy as npst
import pyarrow as pa

# Scalar dtypes to use as leaves of the generated nested struct dtypes
supported_scalar_dtypes = (npst.boolean_dtypes() |
                           npst.integer_dtypes() |
                           npst.unsigned_integer_dtypes() |
                           npst.floating_dtypes() |
                           npst.datetime64_dtypes())

@h.given(
    npst.arrays(npst.nested_dtypes(supported_scalar_dtypes), shape=(5,))
)
def test_nested_struct_array_from_numpy(numpy_array):
    # The conversion should not raise for any generated array
    pa.array(numpy_array)

Or to test converting to pylist:

import hypothesis as h
import pyarrow.tests.strategies as past

@h.given(
    past.arrays(past.nested_struct_types())
)
def test_nested_pyarrow_array_to_pylist(pyarrow_array):
    pyarrow_array.to_pylist()

And run with:

pytest -sv --enable-hypothesis --hypothesis-show-statistics --hypothesis-profile=dev

or to print the generated examples:

pytest -sv --enable-hypothesis --hypothesis-show-statistics --hypothesis-profile=debug
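
The `dev` and `debug` names passed to `--hypothesis-profile` have to be registered before pytest can select them, typically in a `conftest.py`. A minimal sketch of such a registration follows; the `max_examples` values are assumptions chosen for illustration, not the settings used by the Arrow test suite.

```python
import hypothesis as h

# Assumed values: a small example budget keeps local runs fast.
h.settings.register_profile("dev", max_examples=10)

# Same budget, but verbose output prints every generated example.
h.settings.register_profile(
    "debug",
    max_examples=10,
    verbosity=h.Verbosity.verbose,
)
```

With profiles like these in place, the Hypothesis pytest plugin picks one by name through `--hypothesis-profile=<name>`.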

pitrou commented Feb 12, 2019

When fixing a specific bug, I think I'd rather rely on manual testing and know exactly what's being tested.

kszucs commented Feb 12, 2019

Agreed, I just wanted to let you know about a handy way to discover bugs :)

@xhochy xhochy left a comment

+1, LGTM

@xhochy xhochy closed this in a5d8ccc Feb 12, 2019
@pitrou pitrou deleted the ARROW-4181-py-struct-conversion-fixes branch February 12, 2019 15:19
tanyaschlusser pushed a commit to tanyaschlusser/arrow that referenced this pull request Feb 21, 2019
This fixes two issues:
- object fields inside numpy structs should be allowed
- buggy ndarray indexing if stride % itemsize != 0

Author: Antoine Pitrou <antoine@python.org>

Closes apache#3614 from pitrou/ARROW-4181-py-struct-conversion-fixes and squashes the following commits:

9886e29 <Antoine Pitrou> ARROW-4181:  Fixes for Numpy struct array conversion