In [1]:
import numpy as np

In [2]:
from numpy.random import default_rng
rng = default_rng()

In [12]:
arr = rng.integers(-1, 20, size=(500000,))
(arr == -1).sum()

23674

We want to test the speed of different masking alternatives. The use case is to compress the array `arr` by removing the -1 values, **and** to keep the mask used (or equivalent functionality), as it is needed later.

Start with mask creation:

In [17]:
def naive_boolean_mask(arr):
    mask = arr != -1
    return arr[mask]

In [18]:
%timeit naive_boolean_mask(arr)

1.25 ms ± 28.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
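
Since the use case also needs the mask later, a variant that returns both pieces might look like the sketch below. The helper names `compress_with_mask` and `restore` are hypothetical (not part of the original notebook), and filling the gaps with -1 on the way back is an assumption about what "equivalent functionality" means here.

```python
import numpy as np

def compress_with_mask(arr, sentinel=-1):
    """Return the values that differ from `sentinel`, plus the boolean mask used."""
    mask = arr != sentinel
    return arr[mask], mask

def restore(compressed, mask, fill=-1):
    """Scatter `compressed` back to its original positions, filling the gaps."""
    out = np.full(mask.shape, fill, dtype=compressed.dtype)
    out[mask] = compressed
    return out

# Small seeded array so the round trip is easy to inspect.
rng = np.random.default_rng(0)
arr = rng.integers(-1, 20, size=1000)

compressed, mask = compress_with_mask(arr)
roundtrip = restore(compressed, mask)
```

Keeping the boolean mask costs one byte per element of the original array, which is usually cheaper than storing integer positions when most values are kept.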


This is already pretty quick. Next, try the [take trick](https://wesmckinney.com/blog/numpy-indexing-peculiarities/).

In [19]:
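def with_take(arr):
    mask = arr != -1
    # Caution: take() expects integer indices, so the boolean mask is read as
    # indices 0 and 1; the result has len(arr) elements and is not the filtered array.
    return arr.take(mask)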

In [20]:
%timeit with_take(arr)

1.48 ms ± 75.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
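
One caveat: `ndarray.take` expects integer indices, so a boolean mask is interpreted as the indices 0 and 1 rather than as a filter. A sketch of a take-based variant that first converts the mask to positions with `np.flatnonzero` follows. The function name `with_take_indices` is hypothetical, and this variant was not part of the timings recorded above.

```python
import numpy as np

def with_take_indices(arr):
    mask = arr != -1
    # Convert the boolean mask to integer positions before calling take().
    indices = np.flatnonzero(mask)
    return arr.take(indices)

rng = np.random.default_rng(0)
arr = rng.integers(-1, 20, size=1000)

filtered = with_take_indices(arr)

# For comparison: passing the boolean mask directly keeps the full length,
# because every True/False is treated as index 1/0.
bool_take = arr.take(arr != -1)
```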


Let's try some other options.

In [21]:
def with_extract(arr):
    mask = arr != -1
    return np.extract(mask, arr)

In [22]:
%timeit with_extract(arr)

2.05 ms ± 52.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [23]:
def with_nonzero_fancy(arr):
    indices = np.asarray(arr != -1).nonzero()
    return arr[indices]

In [24]:
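def with_nonzero_take(arr):
    # nonzero() returns a 1-tuple of index arrays; take() turns it into a
    # 2-D index array, so the result has shape (1, n) rather than (n,).
    indices = np.asarray(arr != -1).nonzero()
    return arr.take(indices)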

In [25]:
%timeit with_nonzero_fancy(arr)

2.77 ms ± 826 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [26]:
%timeit with_nonzero_take(arr)

2.9 ms ± 350 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
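
`nonzero()` returns a tuple with one index array per dimension. Fancy indexing accepts that tuple directly, but `take` converts it to a 2-D index array, so its result gains a leading axis of length 1. A quick shape and equivalence check, as a sketch on a small seeded array (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(-1, 20, size=1000)
mask = arr != -1

indices = mask.nonzero()            # tuple: (array_of_positions,)
fancy = arr[indices]                # shape (n_kept,)
taken = arr.take(indices)           # shape (1, n_kept)
taken_flat = arr.take(indices[0])   # shape (n_kept,)
extracted = np.extract(mask, arr)   # shape (n_kept,)
```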


Overall, a wash: none of the alternatives timed here beats the plain boolean mask. See [some good discussion](https://stackoverflow.com/questions/46041811/performance-of-various-numpy-fancy-indexing-methods-also-with-numba) of the factors that influence the speed of the different options. As we have a boolean array where most values are true, we will stick with the easiest solution.
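
Outside IPython, where `%timeit` is unavailable, the comparison can be reproduced with the standard-library `timeit` module. The sketch below is one way to do that. The `candidates` dict and the repeat counts are illustrative choices, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng()
arr = rng.integers(-1, 20, size=500_000)

# Each candidate builds the mask and applies it, mirroring the functions above.
candidates = {
    "boolean mask": lambda: arr[arr != -1],
    "extract": lambda: np.extract(arr != -1, arr),
    "nonzero + fancy": lambda: arr[(arr != -1).nonzero()],
    "flatnonzero + take": lambda: arr.take(np.flatnonzero(arr != -1)),
}

results = {}
for name, func in candidates.items():
    # Best of 5 repeats, 20 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=20, repeat=5))
    results[name] = best / 20 * 1e3
    print(f"{name:>20}: {results[name]:.2f} ms per call")
```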