-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: mention string, bytes, and void dtypes in dtype intro #25507
Conversation
92be44a
to
c485a69
Compare
[skip actions] [skip azp] [skip cirrus]
c485a69
to
dd484ff
Compare
[skip actions] [skip azp] [skip cirrus]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, when I started reviewing this, I thought what a strange additions you made (see in-line comments), but then I realized that was all in the piece of text you moved... Your actual additions I think are good!
Of course, that moved piece also led me to the same problems as in #25508 (comment) -- recommending against character codes when most users will quite quickly encounter them anyway (and have to understand them) seems rather backward to me.
So, bottom line is that I'm not sure what is best - ideally, we have an actual complete (and ideally concise) system of defining dtypes in place before we change the documentation to something that is incomplete and less helpful, as was done to this file already before your PR...
@@ -33,4 +33,10 @@ Utility | |||
show_runtime | |||
broadcast_shapes | |||
|
|||
.. automodule:: numpy.dtypes | |||
DType classes and utility (:mod:`numpy.dtypes`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utility -> utilities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a utility
yet? If not, maybe just DType classes
?
doc/source/user/basics.types.rst
Outdated
@@ -12,6 +12,146 @@ Array types and conversions between types | |||
NumPy supports a much greater variety of numerical types than Python does. | |||
This section shows which are available, and how to modify an array's data-type. | |||
|
|||
NumPy numerical types are instances of `numpy.dtype` (data-type) objects, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instance
has specific meaning in python: I would expect type(np.int8)
now to be np.dtype
, but this is not true - they all derive from np.generic
I don't think it is helpful to suggest it is.
EDIT: darn, I see that this is simply all from the previous piece of text...
@mhvk would you mind if we leave the text that was already there as-is and just review the new text I added? And maybe file an issue to come back to the introductory text? |
ed3c893
to
40ddf57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, a review just of what you added: I like it, but would not mention the long names if you are not going to actually use them.
But I think when you move a piece so much up front, it should actually be good... I suggest improving it by shortening: just show a few basic array instantiation calls, ignoring the scalars part altogether - it is irrelevant here. You could even mostly just let the array constructor itself decide on float or int, to avoid passing in anything, and then perhaps give a single example with an overflow (for which I would use a character code, but that's me...)
40ddf57
to
c112edd
Compare
They are directly useful right now, FWIW:
But point taken, let's not move to using those names in this one place. I also edited down the introductory text a bit to remove the most objectionable stuff. |
3c1c498
to
d382885
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much, this now reads very well. Only a few nits left.
@@ -33,4 +33,10 @@ Utility | |||
show_runtime | |||
broadcast_shapes | |||
|
|||
.. automodule:: numpy.dtypes | |||
DType classes and utility (:mod:`numpy.dtypes`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a utility
yet? If not, maybe just DType classes
?
* - Strings | ||
- ``BytesDType``, ``BytesDType`` | ||
* - Strings and Bytestrings | ||
- ``StrDType``, ``BytesDType`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
[skip actions] [skip azp] [skip cirrus]
d382885
to
a57325a
Compare
Responded to latest comments, should be good to go now. Maybe @mdhaber or @rossbar want to look this over? One wrinkle I noticed looking at the docs build, this introduces some new reference warnings that I can't easily fix:
And similar warnings for all of the other DType classes. I think that's happening is that both the dtype metaclass and dtype class share a python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks all good as far as I'm concerned!
Pulling this in, if anyone has any edits they didn't catch before merging I'm happy to followup. |
@@ -24,6 +24,7 @@ indentation. | |||
routines.ctypeslib | |||
routines.datetime | |||
routines.dtype | |||
routines.dtypes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ngoldbaum this file doesn't actually exist. Did you forget to commit it, or is this some leftover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's odd, must have forgotten to git add
it back when I made this PR. Will push a fix tomorrow.
This is part of my work on docs for stringdtype. I'm issuing this separately from the stringdtype PR to ease review.
I moved the section comparing python and C data type names to later in the document, after material that made more sense to me at the beginning of the document.
I also added some discussion about working with string or bytes data. I mentioned some common footguns working with the bytes dtype and also explicitly mentioned the unstructured void dtype as an alternative option. As far as I can tell, unstructured void dtypes are not mentioned anywhere in the docs yet but they seem like a natural thing to mention here.
EDIT: also made sure the dtype classes all show up in the API docs so they can be cross-referenced