-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: ENH: Add nd-support to trim_zeros #15181
base: main
Are you sure you want to change the base?
Conversation
Add support for trimming nd-arrays with trim_zeros while preserving the old behavior for 1D input. The new parameter `axis` can specify a single dimension to be trimmed (reducing all other dimensions to the envelope of absolute values). If None or multiple values are specified, all or the selected dimensions are trimmed iteratively. This should make the function applicable to more use cases. Additionally provide the `atol`, `rtol` and `return_lengths` parameters. The first two control what is considered a "zero" to be trimmed, the latter provides the user with the on how much was trimmed.
numpy/lib/function_base.py
Outdated
def trim_zeros( | ||
filt, | ||
trim='fb', | ||
axis=-1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the proper default value here. None
(trim all dimensions), 0 or 1 (trim first/last dimension) are candidates that would preserve the old behavior.
in trim_zeros.
numpy/lib/function_base.py
Outdated
start = stop = absolutes.shape[current_axis] | ||
if "f" not in trim: | ||
# except when only the backside is to be sliced | ||
stop = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand what you mean here - are you saying to slice leaving only the front, or slice removing the front?
When asked to slice both ends of an all zero array, I think I'd expect either:
- Only the back to be sliced
- The front to be sliced if
'fb'
is passed, and the back to be sliced if'bf'
is passed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I prefer the latter behavior and have modified the code accordingly.
The old function was really liberal with what it accepted as input for the trim
parameter (e.g. only checking "f" in trim
). I think restricting what is allowed to be passed in would make that parameter easier to understand and document.
Edit: But I've left it unrestricted for now.
In case the user passes in an all-zero array and string to `trim` that starts with "b", the last dimension should be sliced first. In all other cases the first dimension takes precedence.
It's easy to emulate this behavior by assigning zeros appropriately beforehand.
An alternate spelling with vectorization: nonzero = np.argwhere(filt)
if len(nonzero) == 0:
if trim.startswith("b"):
start = stop = np.zeros(filt.ndim)
else:
start = stop = np.array(filt.shape)
else:
start = nonzero.min(axis=0)
stop = nonzero.max(axis=0)
# TODO: make a function that just returns `start` and `stop`?
sl = tuple(
slice(a, b) if ax in axis else slice(None)
for ax, (a, b) in enumerate(zip(start, stop))
)
return filt[sl] |
as it's own function. trim_zeros uses its output to newly support the nd-case.
@eric-wieser Concerning the discussion in #15181 (comment) and your other suggestions: You seem to be in favor of adding a new function (e.g. |
Ensure that the returned start and stop arrays of indices have exactly one dimension.
@eric-wieser what do you think of the currently proposed API? I'll add missing tests if you're happy with it and give a notice (required I think?) on the mailing list. An alternative to |
Add support for trimming nd-arrays with trim_zeros while preserving the old behavior for 1D input. The new parameter
axis
can specify a single dimension to be trimmed (reducing all other dimensions to the envelope of absolute values). If None or multiple values are specified, all or the selected dimensions are trimmed iteratively.Additionally provide the
atol
,rtol
andreturn_lengths
parameters. The first two control what is considered a "zero" to be trimmed, the latter provides the user with the on how much was trimmed.I recently needed this behavior myself and think this makes the function applicable to more use cases.
This is currently marked as work in progress, as I wasn't sure about the new API. I tried to preserve the old API but I think it's worth having a discussion about refactoring and additional API changes. See also the review-comments below.