Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: `outer_join` generates LEFT join instead of full outer #1772

Closed
scottcode opened this issue May 10, 2019 · 2 comments

Comments

Projects
None yet
1 participant
@scottcode
Copy link
Contributor

commented May 10, 2019

When trying to create a full outer join using the outer_join method on a TableExpr, the resulting SQL is actually a LEFT OUTER JOIN.

iconn = ibis.sql.sqlite.client.SQLiteClient('test_outer_join.sqlite', create=True)

iconn.con.execute("DROP TABLE IF EXISTS test_left;")
iconn.con.execute("CREATE TABLE test_left(id_left integer, value_left integer);")
iconn.con.execute("DROP TABLE IF EXISTS test_right;")
iconn.con.execute("CREATE TABLE test_right(id_right integer, value_right integer);")

left = iconn.table('test_left')
right = iconn.table('test_right')

joined = left.outer_join(right, left.id_left == right.id_right)

print(joined.compile())
# SELECT t0.id_left, t0.value_left, t1.id_right, t1.value_right 
# FROM test_left AS t0 LEFT OUTER JOIN test_right AS t1 ON t0.id_left = t1.id_right

'full outer' in str(joined.compile()).lower()
# False

The root cause of it is that the method in ibis.sql.alchemy._AlchemyTableSet.get_result() uses sqlalchemy.sql.selectable.FromClause.outerjoin() with just result.outerjoin(table, onclause). However, starting in sqlalchemy version 1.1.0 (circa 2016) they added a keyword argument full with default False. To fix this we need to change the following:

Current:

elif jtype is ops.OuterJoin:
    result = result.outerjoin(table, onclause)

Proposed fix:

elif jtype is ops.OuterJoin:
    result = result.outerjoin(table, onclause, full=True)

@scottcode

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

I submitted my fix. I'm new to contributing to projects like this, so if there are any formalities or technicalities that I need to tweak to make this contribution possible, please let me know. I am open to guidance. Thanks to the community for how awesome ibis already is, and I hope to become part of your community.

scottcode added a commit to scottcode/ibis that referenced this issue May 17, 2019

Test for Alchemy-based full outer join
* Test using pytest framework that would catch bug in issue ibis-project#1772
(outer_join actually did a left join).
* Fixed outer join portion of test_joins test method.
* Removed unittest version of full outer test.
* Refactored MockConnection and MockAlchemyConnection.

In order for the test to catch this problem, it required changes to the
MockAlchemyConnection class. Previously that class's dialect and
_build_ast methods actually used implementations specific to Impala
(objects from `ibis.impala.compiler`). This meant that the `outer_join`
method compiled correctly in the sqlalchemy test because it was using
the Impala compiler, whose implementation did not have the bug.

To make clearer the similarities between Impala- and Alchemy-flavored
mock connections, I refactored so they both share a BaseMockConnection
class. However, I kept the name of the Impala-flavored one as
MockConnection because it is used in several other tests. I wasn't sure
if those other tests really should use Impala or Alchemy. As a to-do,
those other tests should be checked for which flavor they ought to use,
and the MockConnection should be renamed to MockImpalaConnection.

scottcode added a commit to scottcode/ibis that referenced this issue May 17, 2019

Test for Alchemy-based full outer join
* Test using pytest framework that would catch bug in issue ibis-project#1772
(outer_join actually did a left join).
* Fixed outer join portion of test_joins test method.
* Removed unittest version of full outer test.
* Refactored MockConnection and MockAlchemyConnection.

In order for the test to catch this problem, it required changes to the
MockAlchemyConnection class. Previously that class's dialect and
_build_ast methods actually used implementations specific to Impala
(objects from `ibis.impala.compiler`). This meant that the `outer_join`
method compiled correctly in the sqlalchemy test because it was using
the Impala compiler, whose implementation did not have the bug.

To make clearer the similarities between Impala- and Alchemy-flavored
mock connections, I refactored so they both share a BaseMockConnection
class. However, I kept the name of the Impala-flavored one as
MockConnection because it is used in several other tests. I wasn't sure
if those other tests really should use Impala or Alchemy. As a to-do,
those other tests should be checked for which flavor they ought to use,
and the MockConnection should be renamed to MockImpalaConnection.

cpcloud added a commit that referenced this issue May 17, 2019

Fixes `outer_join` generating LEFT join instead of full outer
Fixes issue #1772    Making new pull request from new branch freshly
rebased on upstream master. Supercedes pull request #1773.
Author: Scott Hajek <shajek@pivotal.io>

Closes #1788 from scottcode/fix_full_outer_join2 and squashes the following commits:

4e2fdc4 [Scott Hajek] Shorten long source lines that were failing linter
3a6830d [Scott Hajek] Test for Alchemy-based full outer join
6c91ee9 [Scott Hajek] Fixes bug: `outer_join` generates LEFT join instead of full outer
@scottcode

This comment has been minimized.

Copy link
Contributor Author

commented May 19, 2019

Fixed by PR #1788

@scottcode scottcode closed this May 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.