-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Rewrite pandas execution to use topological sort instead of recursion #1758
Conversation
a72c22c
to
a341862
Compare
5e39a72
to
6a050a4
Compare
| # compute the nodes that are not currently in scope | ||
| for node in nodes: | ||
| # allow clients to pre compute nodes as they like | ||
| pre_executed_scope = pre_execute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should talk about this a bit. based on the naming, i would expect pre_execute to happen before execute_first. and i think executing pre_execute in a loop over each node is a little different than how it happens now. we may not have good test cases that exercise the nuances of pre_execute here.
| type(query).__name__ | ||
| ) | ||
| ) | ||
| return execute_last( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why wouldn't this be part of the main execute? each client needs to call execute_last, and if i call the execution function rather than this method, i won't get the execute_last behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because execute_last will always reset the index because of the default implementation. Window function execution requires the index to be preserved, and it makes recursive calls to execute which means we cannot reset the index.
5ba02f0
to
c7df2d1
Compare
c7df2d1
to
43d1725
Compare
97b3de5
to
9f552a5
Compare
This caused a few failures with downstream code, so I've reverted the changes to ibis/pandas/core.py. I've left other fixes from #1758 unchanged. I'll revisit the reimplementation in the near future. Author: Phillip Cloud <cpcloud@gmail.com> Closes #1837 from cpcloud/revert-topo and squashes the following commits: c9662ac [Phillip Cloud] Just pass kwargs through 5d8ec82 [Phillip Cloud] Make zero argument UDFs work again 9fa4481 [Phillip Cloud] Revert toposort change d9c00ef [Phillip Cloud] Remove unnecessary code deed6aa [Phillip Cloud] Revert select parts
Example run: ``` $ dev/genrelease.py 1.2.0 * :release:`1.2.0 2019-06-19` * 🐛`1842 major` Add max_lookback to window replace and combine functions * 🐛`1837 major` Partially revert #1758 * :support:`1839` Tighter version spec for pytest * :support:`1838` allow pandas timedelta in rows_with_max_lookback * :support:`1825` Accept rows-with-max-lookback as preceding parameter * :support:`1786` PostGIS support * :support:`1826` Allow passing a branch to ci/feedstock.py ``` Author: Phillip Cloud <cpcloud@gmail.com> Closes #1845 from cpcloud/release-note-generator and squashes the following commits: 4e7eae1 [Phillip Cloud] Add rls/release to roles f260477 [Phillip Cloud] SUPP: Generate release notes from commits
cc @toryhaavik for review