BUG/CLN: Fix predicates on Selections on Joins #1149

Closed
wants to merge 1 commit into
base: master
from

3 participants
@cpcloud
Member

cpcloud commented Aug 29, 2017

The motivation for this PR is twofold:

  1. Fix a bug where filters on joined data would either fail or give the wrong result, depending on which version of pandas you're on.
  2. Refactor the execute_selection_dataframe function, because it was getting rather large and unreadable.

The bug in #1136 occurred because the predicates in the Selection were not being evaluated against the data argument, which holds the result of the join. Non-Join selections don't fail because there is no possible ambiguity about which column to select. The solution is to map each root table in each predicate to data, so that the predicate is evaluated against the joined data.

Closes #1136.
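
The failure mode described above can be reproduced with plain pandas, without any ibis code. The sketch below is only an illustration under assumptions: the table contents, the column names, and the `_left`/`_right` suffixes are made up for the example and are not taken from the PR. It shows a predicate computed against the original left table silently selecting the wrong rows of the joined frame, and the same predicate giving the intended rows once it is evaluated against the joined data.

```python
import pandas as pd

# Hypothetical tables: both have a non-key column named "value".
left = pd.DataFrame({"key": [1, 2, 3], "value": [10, 20, 30]})
right = pd.DataFrame({"key": [1, 1, 3], "value": [100, 200, 300]})

# Inner join; pandas disambiguates the overlapping "value" columns
# with the given suffixes.
joined = left.merge(right, on="key", suffixes=("_left", "_right"))

# Predicate evaluated against the pre-join left table. Its index happens
# to line up with the joined frame, so pandas accepts it without error,
# but the rows it selects are unrelated to the intended condition.
stale_mask = left["value"] > 15
wrong = joined[stale_mask]

# The same predicate evaluated against the joined data.
mask = joined["value_left"] > 15
correct = joined[mask]

print(wrong)
print(correct)
```

Here `wrong` keeps a row whose left-hand value is 10, which does not satisfy the filter, while `correct` keeps only the row for key 3.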

@cpcloud cpcloud self-assigned this Aug 29, 2017

@cpcloud cpcloud added this to the 0.11.3 milestone Aug 29, 2017

@cpcloud cpcloud requested a review from wesm Aug 29, 2017

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch from 42a5690 to 43961a9 Aug 29, 2017

@wesm

wesm approved these changes Aug 29, 2017

+1, much cleaner this way

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch 2 times, most recently from 68d40f1 to 0c3924c Aug 31, 2017

@cpcloud cpcloud changed the title from BUG/CLN: Fix predicates on Selections on Joins to WIP: BUG/CLN: Fix predicates on Selections on Joins Sep 13, 2017

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch from 338ead8 to 36b2355 Sep 13, 2017

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch from 36b2355 to da6d646 Sep 27, 2017

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch 4 times, most recently from 3bba9a6 to 3daef3f Oct 10, 2017

@cpcloud cpcloud changed the title from WIP: BUG/CLN: Fix predicates on Selections on Joins to BUG/CLN: Fix predicates on Selections on Joins Oct 12, 2017

}
LEFT_JOIN_SUFFIX = '_ibis_left_{}'.format(ibis.util.guid())

@jreback

jreback Oct 14, 2017

Contributor

Should these be functions, so the guid is generated on demand? (IOW, what if you do a join of a join?)

@cpcloud

cpcloud Oct 15, 2017

Member

This isn't an issue because by the time the next join is executed we've removed this suffix. I have some tests for this: test_multi_join_with_post_expression_filter.
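
A rough sketch of why a suffix computed once at import time can be safe across chained joins, assuming the suffix is stripped before the next join runs. This is not the ibis implementation: `join_and_project`, the `right_name` parameter, and the renaming scheme are hypothetical, and `uuid.uuid4().hex` stands in for `ibis.util.guid()`.

```python
import uuid

import pandas as pd

# Module-level suffixes, created once (stand-ins for the PR's constants).
LEFT_SUFFIX = "_left_{}".format(uuid.uuid4().hex)
RIGHT_SUFFIX = "_right_{}".format(uuid.uuid4().hex)


def join_and_project(left, right, on, right_name):
    """Join two frames, then remove the temporary suffixes (hypothetical helper)."""
    merged = left.merge(right, on=on, suffixes=(LEFT_SUFFIX, RIGHT_SUFFIX))
    renames = {}
    for column in merged.columns:
        if column.endswith(LEFT_SUFFIX):
            # Left-hand columns get their original name back.
            renames[column] = column[: -len(LEFT_SUFFIX)]
        elif column.endswith(RIGHT_SUFFIX):
            # Right-hand columns get a readable, caller-chosen prefix.
            base = column[: -len(RIGHT_SUFFIX)]
            renames[column] = "{}_{}".format(right_name, base)
    return merged.rename(columns=renames)


a = pd.DataFrame({"key": [1, 2], "value": [10, 20]})
b = pd.DataFrame({"key": [1, 2], "value": [30, 40]})
c = pd.DataFrame({"key": [1, 2], "value": [50, 60]})

# A join of a join: the second call reuses the same suffix constants.
first = join_and_project(a, b, on="key", right_name="b")
second = join_and_project(first, c, on="key", right_name="c")
print(list(second.columns))
```

Because no column in `first` still carries either suffix, the second merge can reuse the same constants without colliding with names left over from the first.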

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch 2 times, most recently from b9bb874 to 354582a Oct 16, 2017

BUG/CLN: Fix predicates on Selections on Joins
Also refactors the code in the projection execution out into a couple of
multipledispatch cases for each kind of projection.

@cpcloud cpcloud force-pushed the cpcloud:fix-selection-join-predicates branch from 354582a to d76b2a8 Oct 16, 2017

@cpcloud cpcloud closed this in 34cc0c2 Oct 16, 2017

@cpcloud cpcloud deleted the cpcloud:fix-selection-join-predicates branch Oct 16, 2017
