Faster input validation #46

Merged
merged 2 commits into from Nov 28, 2021

Collaborator

@dcherian dcherian commented Nov 19, 2021

I found that ravel_multi_index was taking a lot of time in the reductions I tend to run: an nD array, a 1D group_idx, and axis=-1.

This PR uses an alternative algorithm from https://stackoverflow.com/questions/46256279/bin-elements-per-row-vectorized-2d-bincount-for-numpy
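For readers unfamiliar with the trick, here is a minimal sketch of the two ways to build a flat group index for this case (nD array, 1D group_idx along the last axis). The function names `flat_index_ravel` and `flat_index_offset` are made up for illustration and are not the numpy_groupies internals; the sketch only assumes that the offset approach adds a per-row offset of `row * ngroups` to the group index, as in the linked Stack Overflow answer.

```python
import numpy as np


def flat_index_ravel(group_idx, shape):
    """Flat group index built with np.ravel_multi_index (baseline)."""
    lead = tuple(shape[:-1])
    ngroups = int(group_idx.max()) + 1
    # One full-size index array per leading axis, plus the broadcast group index.
    lead_idx = np.indices(shape)[:-1]
    full_group = np.broadcast_to(group_idx, shape)
    flat = np.ravel_multi_index((*lead_idx, full_group), lead + (ngroups,))
    return flat.ravel()


def flat_index_offset(group_idx, shape):
    """Same flat index, built by offsetting group_idx once per leading row."""
    lead_size = int(np.prod(shape[:-1]))
    ngroups = int(group_idx.max()) + 1
    # Row r of the flattened leading axes owns the bins [r * ngroups, (r + 1) * ngroups).
    offsets = np.arange(lead_size)[:, None] * ngroups
    return (group_idx[None, :] + offsets).ravel()


group_idx = np.repeat([1, 2, 3, 4], repeats=3)
a = np.ones((2, 3, 12), dtype=np.int32)

flat_ravel = flat_index_ravel(group_idx, a.shape)
flat_offset = flat_index_offset(group_idx, a.shape)

# Either flat index can drive a grouped sum along the last axis via bincount.
ngroups = int(group_idx.max()) + 1
sums = np.bincount(
    flat_offset, weights=a.ravel(), minlength=a.size // a.shape[-1] * ngroups
).reshape(a.shape[:-1] + (ngroups,))
```

The offset version avoids materialising one index array per leading axis, which is a plausible reason for it being cheaper, though the actual speedup depends on array shape and should be measured.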

I timed it with this script:

import timeit

import numpy as np
import numpy_groupies as npg


def time_call(method):
    # Imported locally so that `npg` is in locals(), which timeit receives as its globals.
    import numpy_groupies as npg

    group_idx = np.repeat([1, 2, 3, 4], repeats=3)
    times = []
    for exp in np.arange(6):
        a = np.ones(
            (
                10 ** exp,
                100,
                12,
            ),
            dtype=np.int32,
        )
        time = timeit.timeit(
            f"npg.utils_numpy.input_validation(group_idx, a, axis=-1, func='sum', method={method!r})",
            number=10,
            globals=locals(),
        )

        times.append(time)

    np.testing.assert_array_equal(
        npg.utils_numpy.input_validation(group_idx, a, axis=-1, func="sum", method="ravel")[0],
        npg.utils_numpy.input_validation(group_idx, a, axis=-1, func="sum", method="offset")[0],
    )
    return times


ravel = time_call("ravel")
offset = time_call("offset")

import matplotlib.pyplot as plt

numel = 12 * 100 * 10**np.arange(len(ravel))
plt.plot(numel, ravel)
plt.plot(numel, offset)
plt.legend(["current npg", "proposed"])
plt.yscale("log")
plt.xscale("log")
plt.grid(True)

It's a ≈2x speedup for decent-sized arrays.
[image: log-log plot of elapsed time against number of elements, comparing "current npg" and "proposed"]
