PERF: interpolate_1d returns function to apply columnwise #34728

simonjayhawkins · 2020-06-12T08:32:14Z

This doesn't provide a significant improvement for the existing asv, because the bulk of the time is spent creating Python sets, which happens in the function applied columnwise. See #34727.

Even without #34727, this could provide perf improvements for other index types and for unsorted indexes, by creating a column-wise function for the numpy call.

will look at adding asvs for these cases.
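To make the set-related cost concrete, here is a minimal sketch that classifies NaN positions in one column, first with Python sets and then with boolean masks. It is only an illustration, not the code in pandas/core/missing.py: the names `nan_groups_with_sets` and `nan_groups_with_masks` are hypothetical, and both assume the column holds at least one non-NaN value.

```python
import numpy as np


def nan_groups_with_sets(yvalues):
    """Hypothetical helper: split NaN positions into start/mid/end sets."""
    invalid = np.isnan(yvalues)
    valid_idx = np.flatnonzero(~invalid)  # assumes at least one valid value
    first, last = int(valid_idx[0]), int(valid_idx[-1])
    # Each position becomes a Python int inside a set, rebuilt for every column.
    all_nans = set(np.flatnonzero(invalid).tolist())
    start_nans = set(range(first))
    end_nans = set(range(last + 1, len(yvalues)))
    mid_nans = all_nans - start_nans - end_nans
    return start_nans, mid_nans, end_nans


def nan_groups_with_masks(yvalues):
    """Hypothetical helper: same classification, kept as boolean arrays."""
    invalid = np.isnan(yvalues)
    valid_idx = np.flatnonzero(~invalid)  # assumes at least one valid value
    positions = np.arange(len(yvalues))
    leading = positions < valid_idx[0]
    trailing = positions > valid_idx[-1]
    interior = invalid & ~leading & ~trailing
    return leading, interior, trailing


column = np.array([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])
start, mid, end = nan_groups_with_sets(column)
leading, interior, trailing = nan_groups_with_masks(column)
```

Both variants agree on which positions are leading, interior, and trailing NaNs. The set version allocates one Python object per position on every call, which is the kind of per-column work that dominates when the function is applied 100 times.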


jreback · 2020-06-14T14:47:34Z

pandas/core/missing.py

-            order=order,
-            **kwargs,
-        )
+    def func(yvalues: np.ndarray) -> np.ndarray:


can you make this a module level function and give it a nice name

doc-string as well

it needs to be a closure

really? can you pass any needed arguments, this makes it really hard to grok

the code in this function is applied columnwise, the perf gains will come from doing some things once instead of 100x for the asv.

I prefer this functional approach to pre-validation as it keeps related code together. There is some more cleaning which could move more into the outer function once it's called less often
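As a rough sketch of the closure approach described here (not the code in this PR): the outer function does the index work once and returns a function to apply to each column. The name `make_column_interpolator` is hypothetical, only linear interpolation via `np.interp` is shown, and limit and `preserve_nans` handling is left out.

```python
import numpy as np


def make_column_interpolator(xvalues):
    """Hypothetical outer function: process the index once."""
    xvalues = np.asarray(xvalues, dtype=float)
    # Index processing happens a single time, not once per column.
    indexer = np.argsort(xvalues)
    sorted_x = xvalues[indexer]

    def func(yvalues: np.ndarray) -> np.ndarray:
        yvalues = np.asarray(yvalues, dtype=float)
        invalid = np.isnan(yvalues)
        result = yvalues.copy()
        if not invalid.any() or invalid.all():
            return result  # nothing to fill, or nothing to fill from
        # np.interp needs increasing sample points, so reuse the cached sort.
        y_sorted = yvalues[indexer]
        valid_sorted = ~np.isnan(y_sorted)
        result[invalid] = np.interp(
            xvalues[invalid], sorted_x[valid_sorted], y_sorted[valid_sorted]
        )
        return result

    return func


# Build the function once, then apply it to every column.
index = np.array([3.0, 0.0, 1.0, 2.0])  # deliberately unsorted
frame = np.array([[30.0, 6.0], [0.0, 0.0], [np.nan, np.nan], [20.0, 4.0]])
interpolate_column = make_column_interpolator(index)
filled = np.apply_along_axis(interpolate_column, 0, frame)
```

With an unsorted index the argsort is the part worth caching, since it depends only on the index and not on the column values.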

I understand, but having this as an outer function with the args passed in should not have any impact on performance. It's simply a more understandable approach. OK with merging and then refactoring to be simpler later, though.
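One way the module-level alternative could look, as a sketch under assumptions rather than an agreed design: the per-column function takes the precomputed index data as explicit arguments, and `functools.partial` binds them once. The name `interpolate_column` and its signature are hypothetical.

```python
from functools import partial

import numpy as np


def interpolate_column(yvalues, xvalues, indexer):
    """Hypothetical module-level function: all state arrives as arguments."""
    yvalues = np.asarray(yvalues, dtype=float)
    invalid = np.isnan(yvalues)
    result = yvalues.copy()
    if not invalid.any() or invalid.all():
        return result
    # Positions of valid values, ordered by increasing x.
    keep = indexer[~invalid[indexer]]
    result[invalid] = np.interp(xvalues[invalid], xvalues[keep], yvalues[keep])
    return result


xvalues = np.array([2.0, 0.0, 1.0])
indexer = np.argsort(xvalues)  # computed once by the caller
func = partial(interpolate_column, xvalues=xvalues, indexer=indexer)

columns = [np.array([4.0, 0.0, np.nan]), np.array([8.0, 0.0, np.nan])]
results = [func(col) for col in columns]
```

The one-off work still happens once in the caller, and the function can be documented and tested on its own.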

It's basically the processing of the index that only needs to be done once. However, with the bulk of the time spent on the preserve_nans set logic, there are no significant perf gains yet, hence draft for now.

I originally grouped the validation, see #34628. So I can do it that way if preferred.

i.e. method, xvalues = missing.clean_interp_method(method, index, **kwargs)

kk sure, I think you have 1 PR pending that we should merge before this, but lmk what you prefer.

yeah, trying not to affect the work by @cchwala, I have an alternative implementation of limit_direction that does not re-use the preserve_nans set logic, see #34628

so i'm now happy #34727 need not affect that, but still need to look at the max_gap algo

sorry #34749 not #34628


simonjayhawkins · 2020-07-07T14:41:52Z

I'll close this to clear the queue for now. Some deeper changes now planned so will reopen once complete.

interpolate_1d returns function

659895f

simonjayhawkins added the Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) and Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff) labels Jun 12, 2020

simonjayhawkins marked this pull request as draft June 12, 2020 08:32

Merge remote-tracking branch 'upstream/master' into functional-interpolate

31596cd

simonjayhawkins mentioned this pull request Jun 13, 2020

CLN: clean and deduplicate in core.missing.interpolate_1d #34744

Merged

simonjayhawkins added 2 commits June 14, 2020 10:08

Merge remote-tracking branch 'upstream/master' into functional-interpolate

3f293f6

fixup whitespace from merge

810767a

jreback requested changes Jun 14, 2020


simonjayhawkins added 10 commits July 4, 2020 19:48

Merge remote-tracking branch 'upstream/master' into functional-interpolate

37ffd1e

use class instead

e676a10

preserve_nans logic to separate method for profiling

f7c70a0

add validators and convertors

cb1228e

move dispatch logic outside interpolate

fb43c8e

remove unneeded class attributes

3482e23

remove xvalues from class attributes

871902c

create NumPyInterpolator class

8a46508

move argsort from interpolate to init

54b762a

Merge remote-tracking branch 'upstream/master' into functional-interpolate

3e19666

simonjayhawkins closed this Jul 7, 2020


