Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use "array" instead of "numpy array" except when emphasis is needed. #25125

Merged
merged 1 commit into from Feb 8, 2023

Conversation

anntzer
Copy link
Contributor

@anntzer anntzer commented Feb 1, 2023

Throughout the library, array usually means "numpy array" (we also have "array-like" for the more permissive case), so no need to repeat "numpy" every time (I left it in when I judged that the emphasis was reasonable). Also some other associated docstring cleanups.

PR Summary

PR Checklist

Documentation and Tests

  • Has pytest style unit tests (and pytest passes)
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • New plotting related features are documented with examples.

Release Notes

  • New features are marked with a .. versionadded:: directive in the docstring and documented in doc/users/next_whats_new/
  • API changes are marked with a .. versionchanged:: directive in the docstring and documented in doc/api/next_api_changes/
  • Release notes conform with instructions in next_whats_new/README.rst or next_api_changes/README.rst

@timhoffm
Copy link
Member

timhoffm commented Feb 2, 2023

I have been in favor of this simplification in the past. However, given that more and more array libraries appear, I'm unsure whether we should be more explicit and basically always say "numpy array".

@anntzer
Copy link
Contributor Author

anntzer commented Feb 2, 2023

Certainly going the other way is also an option, but I'm not sure I like that verbosity :/

@timhoffm
Copy link
Member

timhoffm commented Feb 2, 2023

Either way is a valid choice. Maybe more people want tho add their opinion. We‘re still not representative, but better than nothing.

@jklymak
Copy link
Member

jklymak commented Feb 2, 2023

I don't think it matters for the Returns (they can check the type if they are uncertain). For parameters, obviously we should only specify numpy array if we are being restrictive, whereas for almost everything we allow np.asarray(x) to work.

@tacaswell tacaswell added this to the v3.8.0 milestone Feb 2, 2023
@tacaswell
Copy link
Member

I see arguments for both ways here, but on balance I am in favor of going to just "array". The current croup of array-like libraries try to match numpy's API as best they can but it is always going to be the case that some work and some do not. It is better to say that we are permissive and then deal with the breakage as it comes up than claim to be very precise and have a bunch of extra stuff work (particularly because list-of-lists will work most places).

@timhoffm
Copy link
Member

timhoffm commented Feb 2, 2023

For parameters, obviously we should only specify numpy array if we are being restrictive, whereas for almost everything we allow np.asarray(x) to work.

We already use "array-like" for parameters when np.asarray(x) works on that parameter, I think we're covered there. The tricky cases are the ones where we expect actual arrays and the input is a non-numpy array.


It is better to say that we are permissive and then deal with the breakage as it comes up.

Only if we are willing to deal with that (i.e. fix). Having users complain that their array does not work and often respond that "oh we only said array, but this type of array does not work here" would be a bad communication strategy.

Effectively, this would likely become a commitment to support arrays that follow the array API standard. That can be reasonable, but should be a conscious decision on our side. I'm not clear how much effort this would be. Things to consider:

  • We have essentially two strategies for inputs in our functions
    i) Convert the input to a numpy array
    ii) Assume that the data is a (numpy(?)) array and directly use it (including operations, storing and returning).
    Both need to work with non-numpy array inputs
  • The above also means that we will have some type incoherence, internally and as return types. This may or may not be an issue.

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of these are fine because they go through np.asarray or np.asanyarray. One of them only goes through np.array, so perhaps problematic?

@@ -2050,8 +2050,7 @@ def inverse(self, value):

def rgb_to_hsv(arr):
"""
Convert float RGB values (in the range [0, 1]), in a numpy array to HSV
values.
Convert an array of float RGB values (in the range [0, 1]) to HSV values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one is fine because of the np.array will convert other array-like? Probably should check...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's what the docs say? (The parameters block immediately below explicitly states array-like, I don't think the first line needs to restate the same info. Perhaps the top line could even just be "Convert float RGB values to HSV values.")

@@ -1990,7 +1990,7 @@ def _do_layout(gs, mosaic, unique_ids, nested):
----------
gs : GridSpec
mosaic : 2D object array
The input converted to a 2D numpy array for this level.
The input converted to a 2D array for this level.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clearly array-like as it takes lists. Not sure if it accepts any array-like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is only ever called internally and at that point mosaic has indeed been converted to a numpy array (by _make_array).

@@ -166,8 +166,8 @@ def _fast_from_codes_and_verts(cls, verts, codes, internals_from=None):

Parameters
----------
verts : numpy array
codes : numpy array
verts : array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These go through np.asarray, so I assume array-like are fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codes does not, but verts does, adjusted accordingly.

xy : `~numpy.ndarray`
A numpy array of 2D points.

xy : (N, 2) array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are converted using np.asarray so I think this is OK

@@ -1484,13 +1477,13 @@ def transform(self, values):
Parameters
----------
values : array
The input values as NumPy array of length :attr:`input_dims` or
The input values as an array of length :attr:`input_dims` or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes through np.asanyarray

" If 2-d, the image is grayscale. If 3-d, the image must be of size\n"
" 4 in the last dimension and represents RGBA data.\n\n"

"output_array : 2-d or 3-d Numpy array of float, double or uint8\n"
" The dtype and number of dimensions must match `input_array`.\n\n"
"output_array : 2-d or 3-d array of float, double or uint8\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should probably stay to be explicit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure; input_array can be any array-like, though (it goes through PyArray_FromAny); edited accordingly.

@jklymak
Copy link
Member

jklymak commented Feb 3, 2023

Strongly in favour of standardizing general array input as much as possible and documenting what we expect when we say array-like.

@anntzer
Copy link
Contributor Author

anntzer commented Feb 4, 2023

Thanks @jklymak for the careful checking.

@jklymak jklymak requested a review from timhoffm February 7, 2023 18:55
Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take or leave the one comment.

Can self-merge after this.

A numpy array of 2D points.

xy : (N, 2) array-like
Array of (x, y) coordinates.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of inconsistent/imprecise: array vs. array-like

Suggested change
Array of (x, y) coordinates.
The (x, y) coordinates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, fixed.

Throughout the library, array usually means "numpy array" (we also have
"array-like" for the more permissive case), so no need to repeat "numpy"
every time (I left it in when I judged that the emphasis was
reasonable).  Also some other associated docstring cleanups.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants