Skip to content

Conversation

@WarrenWeckesser
Copy link
Member

  • Include the return values of PyArray_RegisterCanCast
    in the description.
  • Correct the documentation of PyArray_Scalar.

If the data is not in native byte order (as indicated by
``dtype->byteorder``) then this function will byteswap the data,
because array scalars are always in correct machine-byte order.
Copy link
Member

@seberg seberg Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is better, although the docs are still nonsense :). itemsize is actually the array object owning the data (which makes sense itemsize would not be an object). Curiously, descr must also be the correct descriptor of that same array object, but I guess sometimes you only need one or the other...

Does anyone use this function, because if not... oh boy, I would like to deprecate it :).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we just need to tweak the signature a bit: base should be the owner of the data (but de-facto unused except for void scalars). In all other cases, it can be NULL anyway and all is fine probably.

I think it would make sense to rename itemsize to base here, although I do think there is a current pretty strict requirement that base->descr == dtype.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... although the docs are still nonsense :). itemsize is actually the array object owning the data

Argh. I had noticed that mistake, too, but I only made a note to myself to "fix PyArray_Scalar doc", and forgot about the base/itemsize argument when I went back to make the fix. I'll update the PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone use this function, because if not... oh boy, I would like to deprecate it :).

I don't know if this is a good use of it, but I use it as a convenience in the str method of some custom dtypes. The 32 bit version of the custom dtype holds a C float, and I want the value printed by str to match what np.float32 does, so I use PyArray_Scalar to create an instance of a np.float32 with my value (passing NULL for base), and then create the string using PyUnicode_FromFormat(...), with the %S format code. Suggestions for a better method would be great!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions for a better method would be great!

FYI: I replaced my use of PyArray_Scalar with, for example, PyArrayScalar_New(Float32) followed by PyArrayScalar_ASSIGN(...). That reduced the need for the intermediate references to the PyArray_Descr, so the code is now simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... is the code in PyArray_Scalar correct? I'm looking at

type_num = descr->type_num;
if (type_num == NPY_BOOL) {
PyArrayScalar_RETURN_BOOL_FROM_LONG(*(npy_bool*)data);
}
else if (PyDataType_FLAGCHK(descr, NPY_USE_GETITEM)) {
return descr->f->getitem(data, base);
}
itemsize = descr->elsize;
copyswap = descr->f->copyswap;
type = descr->typeobj;
swap = !PyArray_ISNBO(descr->byteorder);
if (PyTypeNum_ISSTRING(type_num)) {
/* Eliminate NULL bytes */
char *dptr = data;
dptr += itemsize - 1;
while(itemsize && *dptr-- == 0) {
itemsize--;
}
if (type_num == NPY_UNICODE && itemsize) {
/*
* make sure itemsize is a multiple of 4
* so round up to nearest multiple
*/
itemsize = (((itemsize - 1) >> 2) + 1) << 2;
}
}

According to the docs for elsize, for a flexible type, descr->elsize is 0. But the code shown above in PyArray_Scalar that runs when PyTypeNum_ISSTRING(type_num) is true uses the value of descr->elsize as the number of bytes in data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The elsize docs refer to the registration of a new dtype, i..e what I would call the "prototype instance". Not that actual instance attached to the array.

That doesn't really make sense, because flexible user-dtypes are pretty much not going to work anyway in that old API...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seberg, thanks. (The distinction between the use of elsize at the creation of the dtype vs in the object attached to the array isn't made in those docs.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I force-pushed an update with a new version of the description of PyArray_Scalar.

* Include the return values of `PyArray_RegisterCanCast`
  in the description.
* Correct the documentation of `PyArray_Scalar`.
@seberg
Copy link
Member

seberg commented Oct 27, 2021

Thanks for the followup and sieving through the logic!

I expect, I will want to narrow the constraints to base. E.g. that it can always be NULL, but for void we copy the content if it is, and for dtypes defined with the old API and using NPY_USE_GETITEM there is a tiny chance it will blow up.

@seberg seberg merged commit 5914281 into numpy:main Oct 27, 2021
@WarrenWeckesser WarrenWeckesser deleted the doc-misc branch October 27, 2021 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants