Allow for more expressive Array signatures #12
Comments
Thank you! I think the proposed signature design sounds great, and incorporates clearly how to embed the dimension/rank of arrays - my inquiry in #10. Renaming to It would extend many of the efficiencies of type hinting into the science stack realm, which is an area that could greatly benefit from this! I'd love to help, where I can. |
This seems like a solid proposal to me. It would cover the majority of our potential use cases, and I don't envision many downsides. The only thing that I see that could be missing is possibly declaring rank without specifying the exact dimension sizes. Not sure how hard it would be to implement or even how useful it is from a typing perspective. But I know that we occasionally have arrays of fixed rank and length that might have their dimensions resized. E.g. a basic transpose operation would fit under this use case. Edit: Never mind paragraph two. A colleague pointed out to me that something like |
This sounds great. Would there be a way to name your own dimension variables? Could be very helpful as part of documenting the intent of each dimension. E.g., |
@alimanfoo, I think you would explicitly name the columns of a dimension, rather than the dimension itself. Correct me if I'm wrong. This is what the signature of an array of coordinates would look like: So in your case, you want to further elaborate on that We could take this one step further, though, by introducing something that allows you to be more precise about what a column value should be:

    from nptyping import NamedColumn
    # NamedColumn takes a name and an optional predicate to validate a value.
    lat = NamedColumn('latitude', lambda x: x >= 0)
    lon = NamedColumn('longitude', lambda x: x >= 0)
    NDArray[(..., (lat, lon)), float]

The optional predicate of a With this, you could also write:

    from nptyping import NamedColumn
    lat = NamedColumn('latitude', lambda x: isinstance(x, float) and x >= 0)
    lon = NamedColumn('longitude', lambda x: isinstance(x, float) and x >= 0)
    NDArray[(..., (lat, lon))]  # indefinite number of coordinates
    NDArray[(5, (lat, lon))]  # 5 coordinates

Or even something like this:

    somewhere_in_europe = NamedColumn('coordinate somewhere in Europe', lambda x: is_in_polygon(x, EU))
    somewhere_in_usa = NamedColumn('coordinate somewhere in USA', lambda x: is_in_polygon(x, USA))
    NDArray[((somewhere_in_europe, somewhere_in_usa), (lat, lon))]  # 2 coordinates

One needs to keep in mind that instance checks will get more expensive as the typings become more precise. I would recommend type checking only during development anyway, not in a production environment. Does this extension with |
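To make the cost remark above concrete, here is a minimal sketch of how a predicate-per-column check could work. It is not nptyping code: `NamedColumnSketch` and `rows_match` are hypothetical names invented for this illustration, and plain tuples stand in for array rows so the example has no dependencies.

```python
from typing import Callable, Optional, Sequence


class NamedColumnSketch:
    """Hypothetical stand-in for the proposed NamedColumn (not nptyping code)."""

    def __init__(self, name: str,
                 predicate: Optional[Callable[[float], bool]] = None):
        self.name = name
        # Without a predicate, every value is accepted.
        self.predicate = predicate if predicate is not None else (lambda value: True)


def rows_match(rows: Sequence[Sequence[float]],
               columns: Sequence[NamedColumnSketch]) -> bool:
    """Return True if every row has one value per column and each value
    satisfies the predicate of its column."""
    for row in rows:
        if len(row) != len(columns):
            return False
        for value, column in zip(row, columns):
            if not column.predicate(value):
                return False
    return True


# Illustrative predicates chosen for this sketch only.
lat = NamedColumnSketch('latitude', lambda x: -90 <= x <= 90)
lon = NamedColumnSketch('longitude', lambda x: -180 <= x <= 180)
print(rows_match([(52.1, 4.3), (40.7, -74.0)], (lat, lon)))
```

Because every value is passed through a predicate, a check like this is linear in the number of elements, which is why restricting it to development seems reasonable.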
The major part of this issue has been addressed and released in v1.0.0. Next in line are the dimension variables and the named columns. |
Great news! |
Awesome news, and work! |
This is great and very useful. Since we use arrays everywhere, it would be really nice to have a less "brackety" syntax that allows you to name your dimensions to signify that you expect consistency, like:
|
Second thing:
I think this conflicts with the way Ellipsis (
Whereas
Since I hope it's not too late to change, but I would propose that
|
Thanks for this really cool repo. I'm really looking forward to the |
See also issues #9, #10 and #11.
There have been several requests to extend the expressiveness of `Array`. I'm not keen on a sudden signature change to `Array`. Rather, I'd like to introduce a new type `NDArray` (a name I like better than `Array` anyway) that will "slowly" replace `Array`.

I have the following signature in mind:
Signature design

| Signature | Meaning |
| --- | --- |
| `NDArray` | any dimension of any size of any type |
| `NDArray[...]` | 1 dimension of any size of any type |
| `NDArray[3]` | 1 dimension of size 3 of any type |
| `NDArray[(3, 3, 5)]` | 3 dimensions (3 x 3 x 5) of any type |
| `NDArray[(3, ..., 5)]` | 3 dimensions (3 x ? x 5) of any type |
| `NDArray[(D1, 3, D1)]` | 3 dimensions (D1 x 3 x D1 where D1 is an nptyping constant that can be imported to express a dimension variable, see #9 and #11) of any type |
| `NDArray[int]` | any dimension of any size of type int |
| `NDArray[..., int]` | 1 dimension of any size of type int |
| `NDArray[(3, 3, 5), int]` | 3 dimensions (3 x 3 x 5) of type int |
| `NDArray[(3, 3, 5), np.dtype('int16')]` | 3 dimensions (3 x 3 x 5) of type int16 |
| `NDArray[(3, 3), np.dtype([('f1', np.int16), ('f2', np.int16)])]` | 2 dimensions (3 x 3) with structured types |

Process
The new `NDArray` is to replace the current `Array`. Once `NDArray` is introduced, the original `Array` will be deprecated and then removed in the next minor release.

Before I start investing time into this, I'd love to hear your opinion on this. Please leave any feedback, any comments, any suggestions.
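As a rough illustration of what the shape part of the proposed signatures could mean at check time, here is a minimal sketch. It is not the nptyping implementation: `shape_matches` is a hypothetical helper, and plain strings such as `"D1"` are an assumption standing in for the proposed dimension variables.

```python
from typing import Dict, Sequence


def shape_matches(shape: Sequence[int], pattern: Sequence[object]) -> bool:
    """Return True if `shape` fits `pattern`.

    Pattern entries (conventions assumed for this sketch):
      * int       -> the dimension must have exactly that size
      * Ellipsis  -> the dimension may have any size
      * str       -> a dimension variable; equal names must bind equal sizes
    """
    if len(shape) != len(pattern):
        return False  # the number of dimensions must match
    bindings: Dict[str, int] = {}
    for size, expected in zip(shape, pattern):
        if expected is Ellipsis:
            continue
        if isinstance(expected, str):
            # setdefault binds the variable on first use and returns the
            # previously bound size on later uses.
            if bindings.setdefault(expected, size) != size:
                return False
        elif expected != size:
            return False
    return True


print(shape_matches((3, 7, 5), (3, ..., 5)))
print(shape_matches((4, 3, 2), ("D1", 3, "D1")))
```

With NumPy, the first argument would simply be `arr.shape`, which is already a tuple of ints. The dtype half of the signature is left out of this sketch.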