WIP: ENH: Add nd-support to trim_zeros #15181

lagru · 2019-12-26T12:21:21Z

Add support for trimming nd-arrays with trim_zeros while preserving the old behavior for 1D input. The new parameter axis can specify a single dimension to be trimmed (reducing all other dimensions to the envelope of absolute values). If None or multiple values are specified, all or the selected dimensions are trimmed iteratively.

Additionally provide the atol, rtol and return_lengths parameters. The first two control what is considered a "zero" to be trimmed, the latter provides the user with information on how much was trimmed.
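
To make the single-axis case concrete, here is a minimal sketch of the idea rather than the code in this PR. `trim_along_axis` is a hypothetical helper, and treating a slice as "zero" when its envelope is `<= atol` is an assumption about the tolerance semantics (`rtol` is left out on purpose). It reduces all other dimensions to the envelope of absolute values, trims along the chosen axis, and also reports how much was removed at each end.

```python
import numpy as np


def trim_along_axis(arr, axis=-1, atol=0.0):
    """Hypothetical helper: trim leading/trailing zeros along one axis."""
    arr = np.asarray(arr)
    axis = axis % arr.ndim  # normalize negative axis values
    other_axes = tuple(ax for ax in range(arr.ndim) if ax != axis)
    # Envelope: largest absolute value over all other dimensions.
    envelope = np.abs(arr).max(axis=other_axes)
    # Assumption: a position counts as "zero" if its envelope is <= atol.
    keep = np.flatnonzero(envelope > atol)
    if keep.size == 0:
        start = stop = 0  # nothing to keep
    else:
        start, stop = int(keep[0]), int(keep[-1]) + 1
    sl = [slice(None)] * arr.ndim
    sl[axis] = slice(start, stop)
    # Second return value: lengths trimmed from the front and the back.
    return arr[tuple(sl)], (start, arr.shape[axis] - stop)


a = np.array([[0, 0, 0, 0],
              [0, 1, 2, 0],
              [0, 0, 0, 0]])
trimmed, lengths = trim_along_axis(a, axis=1)
```

Trimming along `axis=1` keeps all three rows here, because only the selected dimension is sliced.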

I recently needed this behavior myself and think this makes the function applicable to more use cases.

This is currently marked as work in progress, as I wasn't sure about the new API. I tried to preserve the old API but I think it's worth having a discussion about refactoring and additional API changes. See also the review-comments below.

Add support for trimming nd-arrays with trim_zeros while preserving the old behavior for 1D input. The new parameter `axis` can specify a single dimension to be trimmed (reducing all other dimensions to the envelope of absolute values). If None or multiple values are specified, all or the selected dimensions are trimmed iteratively. This should make the function applicable to more use cases. Additionally provide the `atol`, `rtol` and `return_lengths` parameters. The first two control what is considered a "zero" to be trimmed, the latter provides the user with information on how much was trimmed.

lagru · 2019-12-26T12:23:20Z

numpy/lib/function_base.py

+def trim_zeros(
+    filt,
+    trim='fb',
+    axis=-1,


Not sure about the proper default value here. None (trim all dimensions), 0 or 1 (trim first/last dimension) are candidates that would preserve the old behavior.

numpy/lib/function_base.py

in trim_zeros.

eric-wieser · 2019-12-26T14:52:16Z

numpy/lib/function_base.py

+            start = stop = absolutes.shape[current_axis]
+            if "f" not in trim:
+                # except when only the backside is to be sliced
+                stop = 0


I don't really understand what you mean here - are you saying to slice leaving only the front, or slice removing the front?

When asked to slice both ends of an all zero array, I think I'd expect either:

Only the back to be sliced

The front to be sliced if 'fb' is passed, and the back to be sliced if 'bf' is passed

Good call. I prefer the latter behavior and have modified the code accordingly.

The old function was really liberal with what it accepted as input for the trim parameter (e.g. only checking "f" in trim). I think restricting what is allowed to be passed in would make that parameter easier to understand and document.

Edit: But I've left it unrestricted for now.

numpy/lib/function_base.py

In case the user passes in an all-zero array and a string for `trim` that starts with "b", the last dimension should be sliced first. In all other cases the first dimension takes precedence.
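
The tie-breaking rule for all-zero input can be sketched in 1D as follows. This is an illustration under assumptions, not the PR's code: `trim_1d` is a hypothetical name, and the returned pair of trimmed lengths only serves to show which end the removal is attributed to.

```python
import numpy as np


def trim_1d(values, trim="fb"):
    """Hypothetical 1D sketch of the all-zero tie-breaking rule."""
    values = np.asarray(values)
    trim = trim.lower()
    n = values.shape[0]
    nonzero = np.flatnonzero(values)
    if nonzero.size == 0:
        # All zeros: everything is removed, and the first letter of
        # `trim` decides which end the removal is attributed to.
        if trim.startswith("b"):
            start = stop = 0  # sliced from the back
        else:
            start = stop = n  # sliced from the front
    else:
        start = int(nonzero[0]) if "f" in trim else 0
        stop = int(nonzero[-1]) + 1 if "b" in trim else n
    # Second return value: lengths trimmed from the front and the back.
    return values[start:stop], (start, n - stop)


_, lengths_fb = trim_1d([0, 0, 0], trim="fb")
_, lengths_bf = trim_1d([0, 0, 0], trim="bf")
```

Both calls return an empty array; only the reported lengths differ.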

numpy/lib/function_base.py

It's easy to emulate this behavior by assigning zeros appropriately beforehand.

numpy/lib/function_base.py

eric-wieser · 2019-12-27T11:01:33Z

An alternate spelling with vectorization:

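# assumes `import numpy as np`; `filt`, `trim` and `axis` (a collection
# of axes) come from the enclosing function
nonzero = np.argwhere(filt)  # one row of indices per nonzero element
if len(nonzero) == 0:
    # all-zero input: the first letter of `trim` decides which end is sliced
    if trim.startswith("b"):
        start = stop = np.zeros(filt.ndim, dtype=np.intp)
    else:
        start = stop = np.array(filt.shape)
else:
    start = nonzero.min(axis=0)
    stop = nonzero.max(axis=0) + 1  # slice stops are exclusive

# TODO: make a function that just returns `start` and `stop`?

sl = tuple(
    slice(a, b) if ax in axis else slice(None)
    for ax, (a, b) in enumerate(zip(start, stop))
)
return filt[sl]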

as its own function. trim_zeros uses its output to newly support the nd-case.
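
Packaged as a self-contained function, the vectorized spelling above might look like the sketch below. The name `trim_zeros_nd` and the handling of `axis` (None meaning all dimensions) are assumptions for illustration. Two details matter for the slicing to work: the stop index is the last nonzero index plus one, and the all-zero branch uses integer index arrays. For brevity the sketch always trims both ends and uses `trim` only to break the all-zero tie.

```python
import numpy as np


def trim_zeros_nd(filt, trim="fb", axis=None):
    """Hypothetical sketch: trim zeros from both ends of the given axes."""
    filt = np.asarray(filt)
    if axis is None:
        axes = set(range(filt.ndim))  # assumption: None selects every axis
    else:
        axes = {int(ax) % filt.ndim for ax in np.atleast_1d(axis)}

    nonzero = np.argwhere(filt)  # shape: (number of nonzeros, ndim)
    if len(nonzero) == 0:
        # All-zero input: the first letter of `trim` breaks the tie.
        if trim.startswith("b"):
            start = stop = np.zeros(filt.ndim, dtype=np.intp)
        else:
            start = stop = np.array(filt.shape, dtype=np.intp)
    else:
        start = nonzero.min(axis=0)
        stop = nonzero.max(axis=0) + 1  # slice stops are exclusive

    sl = tuple(
        slice(a, b) if ax in axes else slice(None)
        for ax, (a, b) in enumerate(zip(start, stop))
    )
    return filt[sl]


a = np.zeros((4, 5))
a[1:3, 2:4] = 1.0
trimmed_all = trim_zeros_nd(a)
trimmed_rows = trim_zeros_nd(a, axis=0)
```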

lagru · 2020-01-13T15:57:01Z

@eric-wieser Concerning the discussion in #15181 (comment) and your other suggestions: You seem to be in favor of adding a new function (e.g. arg_trim_zeros) that returns indices to the zero area. I've pushed a draft so that we can discuss that API...

Ensure that the returned start and stop arrays of indices have exactly one dimension.

lagru · 2020-02-08T12:17:21Z

@eric-wieser what do you think of the currently proposed API? I'll add missing tests if you're happy with it and give a notice (required I think?) on the mailing list.

An alternative to arg_trim_zeros would be to remove it but document its implementation as a lightweight example inside trim_zeros's Note section. It basically amounts to "use nonzero and call max, min on the appropriate dimension of the output".
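
As a rough idea of what such a documented recipe could look like (a sketch only, not text from the PR), the 2D case needs just a few lines. The recipe assumes at least one nonzero element; `min` and `max` raise on the empty index arrays that an all-zero input produces.

```python
import numpy as np

a = np.zeros((4, 5), dtype=int)
a[1, 2] = 7
a[2, 3] = 9

# np.nonzero returns one index array per dimension.
rows, cols = np.nonzero(a)
start = np.array([rows.min(), cols.min()])
stop = np.array([rows.max(), cols.max()]) + 1  # exclusive stop indices

trimmed = a[start[0]:stop[0], start[1]:stop[1]]
```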
