-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEAT: Improve pandas udf performance for many arguments #2071
Conversation
|
@icexelloss will this PR also fix the udf issues related the 2 tests that were failing? ref #2065 |
|
I don't believe so. I didn't address any window related issues. I will take a look at that separately . |
|
sounds good! thanks @icexelloss ! |
1db0cf2
to
ee8ca9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you show some before / after timings. there are asv benchmarks (in ~/ibis/benchmarks), can you add some for Elementedwise UDFS and show the changes.
also add this issue / PR number to the UDF one already in the release notes.
| def pre_execute_elementwise_udf( | ||
| op, *clients, scope=None, aggcontet=None, **kwargs | ||
| ): | ||
| @pre_execute.register(ops.ElementWiseVectorizedUDF, ibis.client.Client) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this catch more generically here? I guess we don't have ElementWiseVector UDFs on other clients atm, but why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't really necessary but follows the same convention as https://github.com/ibis-project/ibis/blob/master/ibis/pandas/dispatch.py#L49
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
thanks @icexelloss did we have an associated issue to close? |
Problem
The existing pandas udf implementation suffers from huge performance problem when the udf takes many arguments.
The problem is that, the existing implementation tries to register execution rules for every possible combination of argument types, resulting in an exponential growth execution rules (exponential to the number of udf arguments)
For an example, if a udf takes n arguments, it will ended register O(4^n) rules because each argument can take >4 different types, e.g., for floating numbers, (float, np.floating, pd.Series, pd.SeriesGroupBy) and the existing implementation tries to register the Cartesian product of those.
Proposed Solution
In this PR, the number of rules registered for a given number of arguments is reduced to 3. I.e., for a 10 argument UDF, only 3 rules will be registered.
The last rule (object, object, ..., object) is a "catch all" rule to handle scalar inputs.
Additional Changes
In this PR, I also take the chance to simplify pandas/core.py. This is because I found a bug in the existing implementation where pre_execute is not called for all nodes.
Microbenchmark
Before the PR:
After the PR: