Weighted quantiles shared prework (~7.5 times speedup) #22

sbrugman · 2020-06-02T11:36:39Z

Weighted quantiles speedup

This PR groups the prework that is shared across multiple quantiles, for instance sorting the data and weights, so that it runs once instead of once per quantile. This significantly improves performance for the quantiles calculated by the HistProfiler.

Isolated performance test

For 10^3 samples and the nine quantiles used in the script below, the speedup is a factor of around 7.5.

import timeit
import numpy as np


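def weighted_quantiles(a, q, weights):
    """Weighted quantiles of `a` at probabilities `q` (scalar or sequence)."""
    # Shared prework: flatten and sort once, reordering the weights to match.
    raveled_data = np.ravel(a)
    idx = np.argsort(raveled_data)
    sorted_data = raveled_data[idx]
    sorted_weights = np.ravel(weights)[idx]
    # Cumulative weights, centred on each sample and normalised by the total.
    Sn = np.cumsum(sorted_weights)
    Pn = (Sn - 0.5 * sorted_weights) / Sn[-1]
    # A single interpolation evaluates every requested quantile.
    y = np.interp(q, Pn, sorted_data)
    return y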


# Setup
a = np.random.random((1000,))
w = np.random.random((1000,))
q = [0.0, 0.01, 0.05, 0.16, 0.50, 0.84, 0.95, 0.99, 1.0]


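def f1():
    # Baseline: one call per quantile, so the sort is repeated len(q) times.
    for qp in q:
        _ = weighted_quantiles(a, qp, w)


def f2():
    # Shared prework: one call for all quantiles, so the sort runs once.
    _ = weighted_quantiles(a, q, w)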


ts = timeit.Timer(f1, globals=globals()).repeat(repeat=500, number=100)
m1 = np.mean(ts)
print(m1, np.std(ts))

ts = timeit.Timer(f2, globals=globals()).repeat(repeat=500, number=100)
m2 = np.mean(ts)
print(m2, np.std(ts))

print(f"Performance ratio {m1 / m2}")
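The benchmark above compares timings only. As a minimal sketch (not part of the PR), the check below confirms that the single batched call returns the same values as the per-quantile loop. The function is repeated so the block runs on its own; the fixed seed and the names `looped`, `batched` and `equivalent` are assumptions made for illustration.

```python
import numpy as np


def weighted_quantiles(a, q, weights):
    # Same function as in the benchmark, repeated so this block is self-contained.
    raveled_data = np.ravel(a)
    idx = np.argsort(raveled_data)
    sorted_data = raveled_data[idx]
    sorted_weights = np.ravel(weights)[idx]
    Sn = np.cumsum(sorted_weights)
    Pn = (Sn - 0.5 * sorted_weights) / Sn[-1]
    return np.interp(q, Pn, sorted_data)


# Fixed seed is an assumption, chosen only to make the check reproducible.
rng = np.random.default_rng(0)
a = rng.random(1000)
w = rng.random(1000)
q = [0.0, 0.01, 0.05, 0.16, 0.50, 0.84, 0.95, 0.99, 1.0]

# Per-quantile calls (sort repeated each time) versus one batched call.
looped = np.array([weighted_quantiles(a, qp, w) for qp in q])
batched = weighted_quantiles(a, q, w)

equivalent = np.allclose(looped, batched)
print(equivalent)
```

Because `np.interp` accepts either a scalar or a sequence for its first argument, the batched call needs no change to the function itself; only the call site differs.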

Weighted quantiles shared prework (~7.5 times speedup)

9815093

tomcis force-pushed the feature/quantiles branch from 9b64704 to 9815093 June 7, 2020 20:53

tomcis merged commit 2dc6b0d into develop Jun 7, 2020

sbrugman deleted the feature/quantiles branch June 9, 2020 12:28

tomcis pushed a commit that referenced this pull request Jun 10, 2020

Weighted quantiles shared prework (~7.5 times speedup) (#22)

51dea92
