This repository has been archived by the owner on May 4, 2021. It is now read-only.

PyArray_GetPtr does not work with named dtypes #30

Closed
aherlihy opened this issue Jun 12, 2017 · 4 comments


@aherlihy
Contributor

aherlihy commented Jun 12, 2017

PyArray_GetPtr does not descend into named dtypes.

Example:

array([(0, 10), (1,  9), (2,  8), (3,  7), (4,  6), (5,  5), (6,  4), (7,  3), (8,  2), (9,  1)],
       dtype=[('x', '<i4'), ('y', '<i4')])

In C:

PyArray_GetPtr(ndarray, [1, 0]) == PyArray_GetPtr(ndarray, [1, 10])
PyArray_GETPTR2(ndarray, 1, 0) == PyArray_GETPTR2(ndarray, 1, 10)

(For the record, it works up to the first named dtype, and with regular dtypes that don't have any named fields: PyArray_GetPtr(ndarray, [1, 0]) != PyArray_GetPtr(ndarray, [2, 0]).)

Potentially because named dtypes don't really have an order? Either way, that's why we keep "sub_coordinates" and "offset" counters to track where we are in the array. I would love to drop that extra code and computation and rely on a NumPy API call instead.
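
For readers unfamiliar with the manual bookkeeping mentioned above, here is a minimal sketch of the address arithmetic involved, written with only the Python standard library rather than the NumPy C API. It assumes the layout implied by the example dtype (two packed little-endian int32 fields, so an 8-byte record with `x` at byte 0 and `y` at byte 4). The names `RECORD`, `FIELD_OFFSETS`, `field_offset`, and `read_field` are hypothetical and are not taken from BSON-NumPy.

```python
import struct

# Assumed layout for dtype=[('x', '<i4'), ('y', '<i4')]: packed, no padding.
RECORD = struct.Struct("<ii")      # one record is 8 bytes
FIELD_OFFSETS = {"x": 0, "y": 4}   # byte offset of each field inside a record


def build_buffer(rows):
    """Pack (x, y) tuples back to back, like a contiguous 1-D record array."""
    return b"".join(RECORD.pack(x, y) for x, y in rows)


def field_offset(index, field, stride=RECORD.size):
    """Byte offset of `field` in element `index`.

    The element offset (index * stride) is what a per-dimension pointer
    lookup provides; the field offset has to be added separately.
    """
    return index * stride + FIELD_OFFSETS[field]


def read_field(buf, index, field):
    """Read one int32 field value from the packed buffer."""
    return struct.unpack_from("<i", buf, field_offset(index, field))[0]


# Same values as the example array in this issue.
rows = [(i, 10 - i) for i in range(10)]
buf = build_buffer(rows)
```

With this layout, `read_field(buf, 1, "y")` reads the second field of the second record. The split into "element offset" plus "field offset" is roughly the role the offset counter plays in the description above.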

@behackett
Member

@aherlihy if I understand this ticket right, this isn't a bug but a cleanup?

@aherlihy
Contributor Author

aherlihy commented Sep 16, 2019

Exactly. PyArray_GetPtr only gives you the pointer to the level above the named dtypes. You can pass extra indices to the function, but it ignores them, so we carry a bunch of extra code to handle that. It seems like there should be something that works out of the box, since this is not an unusual use case, but finding it requires some research and testing.
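
The "ignores extra indices" behavior can be modeled in a few lines of plain Python. This is a simplified model of the behavior reported in this thread, not NumPy's source: it assumes the lookup consumes exactly one index per array dimension and never reads anything past that. `get_ptr_offset` and `STRIDES_1D_RECORDS` are hypothetical names used only for illustration.

```python
def get_ptr_offset(strides, indices):
    """Model of a per-dimension pointer lookup.

    Assumption: one index is consumed per stride (that is, per array
    dimension). zip() stops at the shorter sequence, so indices beyond
    the number of dimensions are never read.
    """
    offset = 0
    for stride, index in zip(strides, indices):
        offset += stride * index
    return offset


# A 1-D array of 8-byte (x, y) records has a single stride.
STRIDES_1D_RECORDS = (8,)

first_field = get_ptr_offset(STRIDES_1D_RECORDS, (1, 0))
bogus_field = get_ptr_offset(STRIDES_1D_RECORDS, (1, 10))
next_record = get_ptr_offset(STRIDES_1D_RECORDS, (2, 0))
```

Under this model `first_field` and `bogus_field` are the same offset, while `next_record` differs, which matches the comparisons in the original report: the named fields are not a dimension of the array, so there is no stride for a second index to use.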

@aherlihy
Contributor Author

IIRC, this might even be a NumPy bug, since I could not find it documented anywhere and it seems like unexpected behavior.

@prashantmital
Contributor

Closing this as we have discontinued development of BSON-NumPy. PyMongoArrow is now the recommended way to materialize MongoDB query results as NumPy ndarrays as well as tabular formats like Pandas' DataFrames and PyArrow Tables.
