ENH: Rewrite pandas execution to use topological sort instead of recursion #1758

Merged
merged 13 commits into from May 16, 2019

2 participants
@cpcloud (Member) commented Apr 13, 2019

cc @toryhaavik for review

@cpcloud added this to the Next Feature Release milestone Apr 13, 2019

Resolved review thread on ibis/expr/window.py (outdated)
Resolved review thread on ibis/pandas/client.py (outdated)
Resolved review thread on ibis/file/parquet.py (outdated)
Resolved review thread on ibis/file/hdf5.py (outdated)
Resolved review thread on ibis/file/csv.py (outdated)
Resolved review thread on ibis/pandas/client.py (outdated)
Resolved review thread on ibis/pandas/execution/generic.py

@cpcloud force-pushed the cpcloud:pandas-toposort branch 3 times, most recently from a72c22c to a341862 Apr 13, 2019

@cpcloud added this to To do in Pandas via automation Apr 14, 2019

@cpcloud force-pushed the cpcloud:pandas-toposort branch from 5e39a72 to 6a050a4 Apr 14, 2019

Resolved review thread on ibis/pandas/core.py (outdated)
Resolved review thread on ibis/pandas/core.py (outdated)
# compute the nodes that are not currently in scope
for node in nodes:
    # allow clients to pre compute nodes as they like
    pre_executed_scope = pre_execute(

@toryhaavik (Contributor) commented Apr 16, 2019

We should talk about this a bit. Based on the naming, I would expect pre_execute to happen before execute_first. I also think calling pre_execute in a loop over each node is a little different from how it happens now. We may not have good test cases that exercise the nuances of pre_execute here.
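
For readers following the thread, the concern above can be illustrated with a minimal sketch of a loop driven by a topological sort that calls a hook once per node. This is not the ibis implementation: `toposort`, `run`, the dict-based graph, and the hook signature are hypothetical stand-ins, and the real pre_execute dispatch may differ.

```python
from collections import deque


def toposort(graph):
    """Order nodes so every dependency comes before its dependents.

    `graph` maps each node to the list of nodes it depends on
    (a hypothetical representation, not ibis expressions).
    """
    indegree = {node: len(deps) for node, deps in graph.items()}
    dependents = {node: [] for node in graph}
    for node, deps in graph.items():
        for dep in deps:
            dependents[dep].append(node)

    ready = deque(node for node, count in indegree.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dependent in dependents[node]:
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                ready.append(dependent)
    if len(order) != len(graph):
        raise ValueError("cycle detected")
    return order


def run(graph, compute, scope=None, pre_execute=None):
    """Evaluate nodes in dependency order, calling the hook once per node."""
    scope = dict(scope or {})
    hook_calls = []
    for node in toposort(graph):
        if node in scope:
            continue  # already computed, nothing to do
        if pre_execute is not None:
            hook_calls.append(node)
            # the hook may put precomputed values into scope
            scope.update(pre_execute(node, scope))
        if node not in scope:
            args = [scope[dep] for dep in graph[node]]
            scope[node] = compute[node](*args)
    return scope, hook_calls


graph = {"a": [], "b": [], "add": ["a", "b"], "double": ["add"]}
compute = {
    "a": lambda: 1,
    "b": lambda: 2,
    "add": lambda x, y: x + y,
    "double": lambda x: 2 * x,
}


def hook(node, scope):
    # hypothetical client hook: supply a precomputed value for "b" only
    return {"b": 10} if node == "b" else {}


scope, hook_calls = run(graph, compute, pre_execute=hook)
# scope["double"] is 22 because the hook replaced b with 10
```

In this sketch the hook runs once for every node in the sorted order, which is the per-node behavior the comment contrasts with a single call made up front. A fuller version would also skip the dependencies of nodes that are already in scope.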

type(query).__name__
)
)
return execute_last(

@toryhaavik (Contributor) commented Apr 16, 2019

Why wouldn't this be part of the main execute? Each client needs to call execute_last, and if I call the execution function rather than this method, I won't get the execute_last behavior.

@cpcloud (Author, Member) commented Apr 16, 2019

Because the default implementation of execute_last always resets the index. Window function execution requires the index to be preserved, and it makes recursive calls to execute, which means we cannot reset the index there.
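
The reasoning above can be sketched in plain Python. Everything here is an assumption made for illustration: `Frame` is a toy stand-in for a pandas object, `execute_core` stands in for the main execution function, `client_execute` for the client method, and the `align` node is a simplified window-like step that looks rows up by index label.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Toy stand-in for a pandas object: row labels plus values."""
    index: list
    values: list


def execute_core(node):
    """Hypothetical main execution function: preserves index labels."""
    kind = node[0]
    if kind == "table":
        return node[1]
    if kind == "filter":
        _, child, keep = node
        frame = execute_core(child)
        rows = [(i, v) for i, v in zip(frame.index, frame.values) if keep(v)]
        return Frame([i for i, _ in rows], [v for _, v in rows])
    if kind == "align":
        # window-like step: match child rows back to the parent by label
        _, parent, child = node
        parent_frame = execute_core(parent)
        child_frame = execute_core(child)  # recursive call, index untouched
        by_label = dict(zip(child_frame.index, child_frame.values))
        return Frame(
            list(parent_frame.index),
            [by_label.get(label) for label in parent_frame.index],
        )
    raise ValueError(kind)


def execute_last(frame):
    """Hypothetical default finishing step: reset labels to 0..n-1."""
    return Frame(list(range(len(frame.values))), list(frame.values))


def client_execute(node):
    """Hypothetical client method: the reset happens once, at the very end."""
    return execute_last(execute_core(node))


table = ("table", Frame([0, 1, 2, 3], [5, 50, 7, 70]))
big = ("filter", table, lambda v: v > 10)

inner = execute_core(big)      # labels 1 and 3 survive the filter
outer = client_execute(big)    # labels are reset to 0 and 1
aligned = client_execute(("align", table, big))
```

Here `aligned.values` is `[None, 50, None, 70]`. Had the `align` branch called `client_execute` on its child instead, the child's labels would have become 0 and 1, and the lookup would have paired 50 and 70 with the wrong rows.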

@cpcloud force-pushed the cpcloud:pandas-toposort branch 4 times, most recently from 5ba02f0 to c7df2d1 Apr 16, 2019

@cpcloud force-pushed the cpcloud:pandas-toposort branch from c7df2d1 to 43d1725 Apr 24, 2019

@cpcloud force-pushed the cpcloud:pandas-toposort branch 2 times, most recently from 97b3de5 to 9f552a5 May 12, 2019

@cpcloud self-assigned this May 14, 2019

cpcloud added some commits Apr 14, 2019

@cpcloud force-pushed the cpcloud:pandas-toposort branch from 69b01dd to b336901 May 15, 2019

cpcloud added some commits May 15, 2019

@cpcloud merged commit 063035c into ibis-project:master May 16, 2019

13 checks passed

ci/circleci: python35_test Your tests passed on CircleCI!
ci/circleci: python36_benchmark Your tests passed on CircleCI!
ci/circleci: python36_conda_build Your tests passed on CircleCI!
ci/circleci: python36_docs Your tests passed on CircleCI!
ci/circleci: python36_test Your tests passed on CircleCI!
ci/circleci: python37_conda_build Your tests passed on CircleCI!
ci/circleci: python37_test Your tests passed on CircleCI!
ibis-project.ibis Build #20190515.10 succeeded
ibis-project.ibis (WindowsCondaBuild py36) WindowsCondaBuild py36 succeeded
ibis-project.ibis (WindowsCondaBuild py37) WindowsCondaBuild py37 succeeded
ibis-project.ibis (WindowsTest py35) WindowsTest py35 succeeded
ibis-project.ibis (WindowsTest py36) WindowsTest py36 succeeded
ibis-project.ibis (WindowsTest py37) WindowsTest py37 succeeded

Pandas automation moved this from To do to Done May 16, 2019

@cpcloud deleted the cpcloud:pandas-toposort branch May 16, 2019
