ENH: Add shape to *_like() array creation #13046
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -106,12 +106,12 @@ class ComplexWarning(RuntimeWarning): | |
pass | ||
|
||
|
||
def _zeros_like_dispatcher(a, dtype=None, order=None, subok=None): | ||
def _zeros_like_dispatcher(a, dtype=None, order=None, subok=None, shape=None): | ||
return (a,) | ||
|
||
|
||
@array_function_dispatch(_zeros_like_dispatcher) | ||
def zeros_like(a, dtype=None, order='K', subok=True): | ||
def zeros_like(a, dtype=None, order='K', subok=True, shape=None): | ||
""" | ||
Return an array of zeros with the same shape and type as a given array. | ||
|
||
|
@@ -135,12 +135,22 @@ def zeros_like(a, dtype=None, order='K', subok=True): | |
If True, then the newly created array will use the sub-class | ||
type of 'a', otherwise it will be a base-class array. Defaults | ||
to True. | ||
shape : int or sequence of ints, optional. | ||
Overrides the shape of the result. | ||
|
||
|
||
.. versionadded:: 1.17.0 | ||
|
||
Returns | ||
------- | ||
out : ndarray | ||
Array of zeros with the same shape and type as `a`. | ||
|
||
Raises | ||
------ | ||
ValueError | ||
If len(shape) differs from a.ndim and order is 'K', or | ||
a is not a C/F-layout array | ||
What does

I assume you mean the

I think what you mean with your suggestion is that

By "no particular memory layout" I mean something like the result of advanced indexing. To borrow the wording of the indexing documentation: "The memory layout of an advanced indexing result is optimized for each indexing operation and no particular memory order can be assumed."

OK, I see what you mean. That is difficult to express exactly, maybe just brute force it:

Note that, strictly speaking, it should be possible to add 1's to the shape without affecting the contiguity. So possibly the correct check would be that the number of dimensions of size > 1 must match. @seberg Thoughts?

You're right, I can make that modification if necessary, but I'm not sure it is of utmost importance. In particular, I don't think it affects our use cases. |
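The remark about adding 1's to the shape can be illustrated with a small pure-Python model of a contiguity test. This is only a sketch under stated assumptions: `is_c_contiguous` and `is_f_contiguous` are hypothetical helper names, not NumPy functions, and skipping length-1 axes is an assumption about how such a check could treat them, since their stride is never used to step through memory.

```python
def is_c_contiguous(shape, strides, itemsize):
    """Return True if (shape, strides) describes a C-contiguous layout.

    Hypothetical helper for illustration. Axes of length 1 are skipped
    because their stride is never used to step through memory.
    """
    if 0 in shape:
        return True  # an empty array has no elements to misplace
    expected = itemsize
    # Walk from the last (fastest-varying) axis to the first.
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim == 1:
            continue
        if stride != expected:
            return False
        expected *= dim
    return True


def is_f_contiguous(shape, strides, itemsize):
    """Same check with the first axis varying fastest (Fortran order)."""
    return is_c_contiguous(shape[::-1], strides[::-1], itemsize)
```

With this model, a shape of `(3, 1, 4)` passes the C check for any stride on the middle axis, while the sliced, transposed array discussed elsewhere in this review (strides `(3200, 96000, 80)`) passes neither check.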
||
|
||
See Also | ||
-------- | ||
empty_like : Return an empty array with shape and type of input. | ||
|
@@ -166,7 +176,7 @@ def zeros_like(a, dtype=None, order='K', subok=True): | |
array([0., 0., 0.]) | ||
|
||
""" | ||
res = empty_like(a, dtype=dtype, order=order, subok=subok) | ||
res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape) | ||
# needed instead of a 0 to get same result as zeros for string dtypes | ||
z = zeros(1, dtype=res.dtype) | ||
multiarray.copyto(res, z, casting='unsafe') | ||
|
||
|
@@ -226,12 +236,12 @@ def ones(shape, dtype=None, order='C'): | |
return a | ||
|
||
|
||
def _ones_like_dispatcher(a, dtype=None, order=None, subok=None): | ||
def _ones_like_dispatcher(a, dtype=None, order=None, subok=None, shape=None): | ||
return (a,) | ||
|
||
|
||
@array_function_dispatch(_ones_like_dispatcher) | ||
def ones_like(a, dtype=None, order='K', subok=True): | ||
def ones_like(a, dtype=None, order='K', subok=True, shape=None): | ||
""" | ||
Return an array of ones with the same shape and type as a given array. | ||
|
||
|
@@ -255,12 +265,22 @@ def ones_like(a, dtype=None, order='K', subok=True): | |
If True, then the newly created array will use the sub-class | ||
type of 'a', otherwise it will be a base-class array. Defaults | ||
to True. | ||
shape : int or sequence of ints, optional. | ||
Overrides the shape of the result. | ||
|
||
.. versionadded:: 1.17.0 | ||
|
||
Returns | ||
------- | ||
out : ndarray | ||
Array of ones with the same shape and type as `a`. | ||
|
||
Raises | ||
------ | ||
ValueError | ||
If len(shape) differs from a.ndim and order is 'K', or | ||
a is not a C/F-layout array | ||
|
||
See Also | ||
-------- | ||
empty_like : Return an empty array with shape and type of input. | ||
|
@@ -286,7 +306,7 @@ def ones_like(a, dtype=None, order='K', subok=True): | |
array([1., 1., 1.]) | ||
|
||
""" | ||
res = empty_like(a, dtype=dtype, order=order, subok=subok) | ||
res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape) | ||
multiarray.copyto(res, 1, casting='unsafe') | ||
|
||
return res | ||
|
||
|
@@ -338,12 +358,12 @@ def full(shape, fill_value, dtype=None, order='C'): | |
return a | ||
|
||
|
||
def _full_like_dispatcher(a, fill_value, dtype=None, order=None, subok=None): | ||
def _full_like_dispatcher(a, fill_value, dtype=None, order=None, subok=None, shape=None): | ||
return (a,) | ||
|
||
|
||
@array_function_dispatch(_full_like_dispatcher) | ||
def full_like(a, fill_value, dtype=None, order='K', subok=True): | ||
def full_like(a, fill_value, dtype=None, order='K', subok=True, shape=None): | ||
""" | ||
Return a full array with the same shape and type as a given array. | ||
|
||
|
@@ -365,12 +385,22 @@ def full_like(a, fill_value, dtype=None, order='K', subok=True): | |
If True, then the newly created array will use the sub-class | ||
type of 'a', otherwise it will be a base-class array. Defaults | ||
to True. | ||
shape : int or sequence of ints, optional. | ||
Overrides the shape of the result. | ||
|
||
.. versionadded:: 1.17.0 | ||
|
||
Returns | ||
------- | ||
out : ndarray | ||
Array of `fill_value` with the same shape and type as `a`. | ||
|
||
Raises | ||
------ | ||
ValueError | ||
If len(shape) differs from a.ndim and order is 'K', or | ||
a is not a C/F-layout array | ||
|
||
See Also | ||
-------- | ||
empty_like : Return an empty array with shape and type of input. | ||
|
@@ -395,7 +425,7 @@ def full_like(a, fill_value, dtype=None, order='K', subok=True): | |
array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1]) | ||
|
||
""" | ||
res = empty_like(a, dtype=dtype, order=order, subok=subok) | ||
res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape) | ||
multiarray.copyto(res, fill_value, casting='unsafe') | ||
|
||
return res | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1183,28 +1183,43 @@ PyArray_NewFromDescrAndBase( | |
flags, obj, base, 0, 0); | ||
} | ||
|
||
/*NUMPY_API | ||
/* | ||
* Creates a new array with the same shape as the provided one, | ||
* with possible memory layout order and data type changes. | ||
* with possible memory layout order, data type and shape changes. | ||
* | ||
* prototype - The array the new one should be like. | ||
* order - NPY_CORDER - C-contiguous result. | ||
* NPY_FORTRANORDER - Fortran-contiguous result. | ||
* NPY_ANYORDER - Fortran if prototype is Fortran, C otherwise. | ||
* NPY_KEEPORDER - Keeps the axis ordering of prototype. | ||
* dtype - If not NULL, overrides the data type of the result. | ||
* ndim - If not 0 and dims not NULL, overrides the shape of the result. | ||
* dims - If not NULL and ndim not 0, overrides the shape of the result. | ||
* subok - If 1, use the prototype's array subtype, otherwise | ||
* always create a base-class array. | ||
* | ||
* NOTE: If dtype is not NULL, steals the dtype reference. On failure or when | ||
* dtype->subarray is true, dtype will be decrefed. | ||
*/ | ||
NPY_NO_EXPORT PyObject * | ||
PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order, | ||
PyArray_Descr *dtype, int subok) | ||
PyArray_NewLikeArrayWithShape(PyArrayObject *prototype, NPY_ORDER order, | ||
PyArray_Descr *dtype, int ndim, npy_intp *dims, int subok) | ||
{ | ||
PyObject *ret = NULL; | ||
int ndim = PyArray_NDIM(prototype); | ||
|
||
int new_shape = 1; | ||
|
||
if (dims == NULL) { | ||
ndim = PyArray_NDIM(prototype); | ||
dims = PyArray_DIMS(prototype); | ||
new_shape = 0; | ||
} | ||
else if (order == NPY_KEEPORDER && (ndim != PyArray_NDIM(prototype) || | ||
This breaks previous behavior, which did not require the prototype to be contiguous. I'm not sure why an error should be raised. Why not just override the shape and go with C order when a new shape is specified and order == 'K'?

@charris what you're saying is like my original implementation. I modified this based on @eric-wieser's request here #13046 (comment). For me it's fine either way.

@charris: This shouldn't break any existing code once my comment below is addressed. The second check here makes no sense to me either. The thinking is that requesting the order (of the strides) to be kept is nonsensical if the number of strides is not the same. |
||
!(PyArray_IS_C_CONTIGUOUS(prototype) || PyArray_IS_F_CONTIGUOUS(prototype)))) { | ||
PyErr_SetString(PyExc_ValueError, | ||
"mismatching ndim or non-C/F-layout arrays can not keep order"); | ||
Why are you checking

Because we can't keep the order when there's no particular memory layout, just like in the discussion here:

My current understanding of 'K' is that we try to keep the memory order if possible, not in all circumstances. Indeed, the former code fell back on recomputing the strides when the prototype was neither C- nor F-contiguous.

First of all, we should make sure that this does not switch to an error suddenly. I agree that in most cases I think the default may actually be
There are some strides where it would be unambiguous, but I am not sure it is worth bothering about those.

Unless I'm miserably misunderstanding some of the comments, I think that there are two clashing opinions:
Feel free to correct me if I'm really misunderstanding it. Otherwise, it would be nice to have @charris, @eric-wieser and @seberg comment here to reach an agreement before I make any further changes regarding this matter.

@eric-wieser could you let me know about your findings then? I'm sorry, but I don't understand what you mean by your previous statement. It may be getting a little too complex for my understanding of how NumPy handles strides and orders.

Sure, here's the case I'm thinking of:
>>> a = np.zeros((20, 30, 40)).transpose((1, 0, 2))[::10,::10,::10]
>>> a.strides
(3200, 96000, 80)
>>> np.copy(a, order='K').strides
(32, 96, 8)
Note how the order of the strides is kept. This is the behavior I would expect for

@eric-wieser I believe the new commit I pushed now covers the case you're looking for. Extending the case from your previous comment, this is what we get with this new commit:
>>> np.empty_like(a, shape=(3, 2, 4), order='K').strides
(32, 96, 8)
Can you confirm this is what you had in mind?

That behavior looks correct to me now. Will take one more look at this tonight.

@eric-wieser just checking, have you had the chance to have a look at this? Do you think we're good to merge this now? |
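The stride-order rule discussed in this thread can be modelled in a few lines of pure Python. This is a minimal sketch, assuming the new shape has the same number of dimensions as the prototype. `keeporder_strides` is a hypothetical name, and the tie-breaking for equal strides (Python's stable sort) is an assumption that may differ from what `PyArray_CreateSortedStridePerm` does.

```python
def keeporder_strides(proto_strides, dims, itemsize):
    """Contiguous strides for `dims` that keep the axis ordering implied
    by `proto_strides` (largest stride = slowest-varying axis).

    Hypothetical helper for illustration only.
    """
    if len(proto_strides) != len(dims):
        raise ValueError("this sketch only handles matching ndim")
    # Axes sorted from largest to smallest absolute stride.
    perm = sorted(range(len(dims)), key=lambda ax: -abs(proto_strides[ax]))
    strides = [0] * len(dims)
    stride = itemsize
    # Assign strides from the fastest-varying axis to the slowest one.
    for ax in reversed(perm):
        strides[ax] = stride
        stride *= dims[ax]
    return tuple(strides)


# The case from the thread: prototype strides (3200, 96000, 80),
# requested shape (3, 2, 4), 8-byte elements.
print(keeporder_strides((3200, 96000, 80), (3, 2, 4), 8))  # (32, 96, 8)
```

The loop mirrors the one in the diff: the permutation is walked from the smallest prototype stride upward, so the new array is densely packed while each axis keeps its relative rank.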
||
return ret; | ||
} | ||
|
||
/* If no override data type, use the one from the prototype */ | ||
if (dtype == NULL) { | ||
|
@@ -1232,12 +1247,12 @@ PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order, | |
break; | ||
} | ||
|
||
/* If it's not KEEPORDER, this is simple */ | ||
if (order != NPY_KEEPORDER) { | ||
/* If it's not KEEPORDER, or there is a shape change, this is simple */ | ||
if (order != NPY_KEEPORDER || new_shape) { | ||
I think if the number of dims matches, matching the stride order is still a good idea.

But passing shape alone doesn't provide any information about strides; it would be no more than an assumption that keeping strides unchanged is important. Plus, we could only keep the strides for the dimensions that don't change.

The stride information comes from the source. The purpose of
I think that the behavior should be:

But even if

keeporder means keep the relative order of the strides (

Oh, I'm sorry, I get what you mean now. Yes, I agree with that, let me change that.

I think the new commits should cover this case now. |
||
ret = PyArray_NewFromDescr(subok ? Py_TYPE(prototype) : &PyArray_Type, | ||
dtype, | ||
ndim, | ||
PyArray_DIMS(prototype), | ||
dims, | ||
NULL, | ||
NULL, | ||
order, | ||
|
@@ -1246,11 +1261,10 @@ PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order, | |
/* KEEPORDER needs some analysis of the strides */ | ||
else { | ||
npy_intp strides[NPY_MAXDIMS], stride; | ||
npy_intp *shape = PyArray_DIMS(prototype); | ||
npy_stride_sort_item strideperm[NPY_MAXDIMS]; | ||
int idim; | ||
|
||
PyArray_CreateSortedStridePerm(PyArray_NDIM(prototype), | ||
PyArray_CreateSortedStridePerm(ndim, | ||
PyArray_STRIDES(prototype), | ||
|
||
strideperm); | ||
|
||
|
@@ -1259,14 +1273,14 @@ PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order, | |
for (idim = ndim-1; idim >= 0; --idim) { | ||
npy_intp i_perm = strideperm[idim].perm; | ||
strides[i_perm] = stride; | ||
stride *= shape[i_perm]; | ||
stride *= dims[i_perm]; | ||
} | ||
My 1.ii proposal would look like changing everything between the new line 1261 and here to:
if (PyArray_NDIM(prototype) >= ndim) {
int leading_dims = PyArray_NDIM(prototype) - ndim;
/* Use only the trailing strides */
PyArray_CreateSortedStridePerm(ndim,
PyArray_STRIDES(prototype) + leading_dims,
strideperm);
/* Build the new strides */
stride = dtype->elsize;
for (idim = ndim-1; idim >= 0; --idim) {
npy_intp i_perm = strideperm[idim].perm;
strides[i_perm] = stride;
stride *= shape[i_perm];
}
}
else {
int leading_dims = ndim - PyArray_NDIM(prototype);
/* Use all the strides */
PyArray_CreateSortedStridePerm(PyArray_NDIM(prototype),
PyArray_STRIDES(prototype),
strideperm);
/* Build the new trailing strides */
stride = dtype->elsize;
for (idim = PyArray_NDIM(prototype)-1; idim >= 0; --idim) {
npy_intp i_perm = strideperm[idim].perm + leading_dims;
strides[i_perm] = stride;
stride *= shape[i_perm];
}
/* Create remaining leading strides as C order */
for (idim = leading_dims; idim >= 0; --idim) {
strides[idim] = stride;
stride *= shape[idim];
}
}

Or trading branching for state:
/* not sure if this is clearer via min/max */
int leading_src_dims = 0; // max(src.ndim - dst.ndim, 0)
int leading_dst_dims = 0; // max(dst.ndim - src.ndim, 0)
int shared_dims; // min(src.ndim, dst.ndim)
if (PyArray_NDIM(prototype) >= ndim) {
shared_dims = ndim;
leading_src_dims = PyArray_NDIM(prototype) - ndim;
}
else {
shared_dims = PyArray_NDIM(prototype);
leading_dst_dims = ndim - PyArray_NDIM(prototype);
}
/* Use only the trailing strides from the source */
PyArray_CreateSortedStridePerm(shared_dims,
PyArray_STRIDES(prototype) + leading_src_dims,
strideperm);
/* Build the destination trailing strides */
stride = dtype->elsize;
for (idim = ndim-1; idim >= 0; --idim) {
npy_intp i_perm = strideperm[idim].perm + leading_dst_dims;
strides[i_perm] = stride;
stride *= shape[i_perm];
}
/* Create remaining leading strides as C order */
for (idim = leading_dst_dims; idim >= 0; --idim) {
strides[idim] = stride;
stride *= shape[idim];
} |
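One way to read the proposal above is as the following pure-Python model. It is a sketch of an interpretation, not the merged behavior: `keeporder_strides_any_ndim` is a hypothetical name, the prototype and the new shape are assumed to be aligned on their trailing axes, and the loop bounds are chosen here so that every destination axis is visited exactly once, which may differ from the quoted C snippets.

```python
def keeporder_strides_any_ndim(proto_strides, dims, itemsize):
    """Strides for `dims` that reuse the stride ordering of the trailing
    prototype axes; extra leading axes are laid out in C order.

    Hypothetical helper for illustration only.
    """
    src_ndim, dst_ndim = len(proto_strides), len(dims)
    shared = min(src_ndim, dst_ndim)
    lead_src = src_ndim - shared  # leading prototype axes that are ignored
    lead_dst = dst_ndim - shared  # new leading axes with no counterpart

    # Order the shared trailing axes from largest to smallest stride.
    trailing = proto_strides[lead_src:]
    perm = sorted(range(shared), key=lambda ax: -abs(trailing[ax]))

    strides = [0] * dst_ndim
    stride = itemsize
    # Shared axes: fastest-varying first, following the prototype's order.
    for ax in reversed(perm):
        strides[ax + lead_dst] = stride
        stride *= dims[ax + lead_dst]
    # Remaining leading axes: C order, innermost of them first.
    for ax in range(lead_dst - 1, -1, -1):
        strides[ax] = stride
        stride *= dims[ax]
    return tuple(strides)
```

For a Fortran-ordered two-dimensional prototype with strides `(8, 24)` and a requested shape of `(5, 3, 4)`, this model keeps the Fortran ordering on the last two axes and places the new leading axis outermost.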
||
|
||
/* Finally, allocate the array */ | ||
ret = PyArray_NewFromDescr(subok ? Py_TYPE(prototype) : &PyArray_Type, | ||
dtype, | ||
ndim, | ||
shape, | ||
dims, | ||
strides, | ||
NULL, | ||
0, | ||
|
@@ -1276,6 +1290,29 @@ PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order, | |
return ret; | ||
} | ||
|
||
/*NUMPY_API | ||
* Creates a new array with the same shape as the provided one, | ||
* with possible memory layout order and data type changes. | ||
* | ||
* prototype - The array the new one should be like. | ||
* order - NPY_CORDER - C-contiguous result. | ||
* NPY_FORTRANORDER - Fortran-contiguous result. | ||
* NPY_ANYORDER - Fortran if prototype is Fortran, C otherwise. | ||
* NPY_KEEPORDER - Keeps the axis ordering of prototype. | ||
* dtype - If not NULL, overrides the data type of the result. | ||
* subok - If 1, use the prototype's array subtype, otherwise | ||
* always create a base-class array. | ||
* | ||
* NOTE: If dtype is not NULL, steals the dtype reference. On failure or when | ||
* dtype->subarray is true, dtype will be decrefed. | ||
*/ | ||
NPY_NO_EXPORT PyObject * | ||
PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order, | ||
PyArray_Descr *dtype, int subok) | ||
{ | ||
return PyArray_NewLikeArrayWithShape(prototype, order, dtype, 0, NULL, subok); | ||
} | ||
|
||
/*NUMPY_API | ||
* Generic new array creation routine. | ||
*/ | ||
|
I just noticed that order and subok both default to None rather than the values ('K' and True) that the documentation states and that the other _like() functions have. Which one is correct? It seems to me that the function definition should be order='K' and subok=True, am I correct?

It's a little confusing, but the arguments for this function (the dispatcher) need to default to None. We actually have a test that verifies this.

Should we then update the documentation to reflect the real defaults?