FEAT: Implement two level dispatcher for execute_node #2246
conceptually looks fine. the perf of ops seems slightly slower when they are not the worst-case ops (cast float64, series). is this noticeable? can we specifically register UDF ops with the 2-level and leave everything else with the 1-level (or does this make it way more complicated)?
- i think it makes sense to push this upstream to multipledispatch as well (ok in here until / unless this patch is accepted & released)
I am leaning towards not optimizing this further.
ibis/pandas/tests/test_core.py (outdated)
    del execute_node.funcs[ops.Add, int, MyObject]
    del execute_node._meta_dispatcher.funcs[(ops.Add,)].funcs[
may want to consider a method for this on the dispatcher itself? (otherwise this leaks into the impl).
Added `__delitem__` for this
Thanks! @jreback
What is the change
This PR implements a `TwoLevelDispatcher` and replaces the `multipledispatch.Dispatcher` with `TwoLevelDispatcher` to define `ibis.pandas.execute_node`.
The original dispatcher was found to be slow when:
(1) dispatching the first `execute_node` call
(2) dispatching after a new signature for `execute_node` is registered (this happens when a user uses a UDF)
In this alternative dispatcher, neither of these two issues appears. For details, see the docstring of `TwoLevelDispatcher`.
How is the change tested
`ibis/pandas/tests/test_dispatcher.py` is added.
Benchmark
A benchmark was performed by running all the pandas backend tests and recording the "time to dispatch" for every signature.
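One way a per-signature "time to dispatch" measurement like this could be collected is sketched below. This is not the benchmark code used in this PR: `timed_dispatch`, `records`, and the toy `lookup` function are hypothetical names introduced only for illustration.

```python
import time
from collections import defaultdict

# Hypothetical store: signature (tuple of types) -> list of lookup times in seconds.
records = defaultdict(list)


def timed_dispatch(dispatch, *types):
    """Resolve an implementation via dispatch(*types) and record how long it took."""
    start = time.perf_counter()
    func = dispatch(*types)
    records[types].append(time.perf_counter() - start)
    return func


# Toy stand-in for a dispatcher's lookup step (an assumption, not ibis code).
def lookup(*types):
    return {(int, int): "add_ints", (str, str): "add_strs"}[types]


timed_dispatch(lookup, int, int)
timed_dispatch(lookup, str, str)
timed_dispatch(lookup, int, int)

# Rank signatures by their worst observed lookup time, slowest first.
slowest = sorted(records.items(), key=lambda kv: max(kv[1]), reverse=True)[:10]
```

With `multipledispatch`, the callable passed in would presumably be the dispatcher's `dispatch(*types)` method, so that only the lookup is timed and not the execution of the resolved function.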
Here are the 10 slowest dispatches in the original implementation (with the corresponding times in the new implementation):

Here are the 10 slowest dispatches in the new implementation (with the corresponding times in the original implementation):
