Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add info support for late binding views #159

Merged
merged 5 commits into from
Feb 19, 2019

Conversation

e01n0
Copy link
Contributor

@e01n0 e01n0 commented Jan 22, 2019

Todos

  • MIT compatible
  • Tests
  • Documentation
  • Updated CHANGES.rst

Small change to the _get_all_column_info sql to enable column information for late binding views. Tested against our redshift with late binding views over normal and external tables. Resolves this issue apache/superset#4435

@jklukas
Copy link
Member

jklukas commented Jan 25, 2019

This looks good, but before merging this in, we'll need to have a test to demonstrate the behavior and to alert of us regressions as changes are made in the future.

Would you be able to add a test for this? If you look through existing tests, there are examples of tests that run against a real Redshift instance. They won't run automatically for a forked PR like this, but I can manually kick them off to verify behavior once the test is added.

@sburns
Copy link

sburns commented Feb 19, 2019

I don't have write access to this fork but the test should look something like...

def test_late_binding_view_reflection(redshift_engine):
    table_ddl = "CREATE TABLE my_table (col1 INTEGER, col2 INTEGER)"
    view_query = "SELECT my_table.col1, my_table.col2 FROM public.my_table"
    view_ddl = "CREATE VIEW my_view AS %s WITH NO SCHEMA BINDING" % view_query
    conn = redshift_engine.connect()
    conn.execute(table_ddl)
    conn.execute(view_ddl)
    insp = inspect(redshift_engine)
    view_definition = insp.get_view_definition('my_view')
    assert(clean(compile_query(view_definition)) == clean(view_query))
    view = Table('my_view', MetaData(),
                 autoload=True, autoload_with=redshift_engine)
    assert(len(view.columns) == 2)

Key differences between the normal view reflection test and this test is:

  1. Tables referenced in the query of the view must be referenced by schema
  2. And obviously, the view must be created with WITH NO SCHEMA BINDING to be late-binding.

Can an admin push to this branch?

@jklukas
Copy link
Member

jklukas commented Feb 19, 2019

See a test run in https://travis-ci.org/sqlalchemy-redshift/sqlalchemy-redshift/jobs/495640544

    def test_late_binding_view_reflection(redshift_engine):
        table_ddl = "CREATE TABLE my_table (col1 INTEGER, col2 INTEGER)"
        view_query = "SELECT my_table.col1, my_table.col2 FROM public.my_table"
        view_ddl = "CREATE VIEW my_late_view AS %s WITH NO SCHEMA BINDING" % view_query
        conn = redshift_engine.connect()
        conn.execute(table_ddl)
        conn.execute(view_ddl)
        insp = inspect(redshift_engine)
        view_definition = insp.get_view_definition('my_late_view')
>       assert(clean(compile_query(view_definition)) == clean(view_query))
E       AssertionError: assert 'CREATE VIEW ...CHEMA BINDING' == 'SELECT my_tab...blic.my_table'
E         - CREATE VIEW my_late_view AS SELECT my_table.col1, my_table.col2 FROM public.my_table WITH NO SCHEMA BINDING
E         + SELECT my_table.col1, my_table.col2 FROM public.my_table

Does that behavior of insp.get_view_definition make sense to you? I'm not sure what the correct result should be here, or why this result is being returned.

@sburns
Copy link

sburns commented Feb 19, 2019

I can confirm that the view definitions for late-binding views look really strange compared to those from normal views.

xxx:warehouse# CREATE TABLE test_table(col1 INTEGER, col2 INTEGER);
Time: 415.248 ms
xxx:warehouse# CREATE VIEW my_view AS SELECT test_table.col1, test_table.col2 FROM test_table;
Time: 438.940 ms
xxx:warehouse# select pg_catalog.pg_get_viewdef('my_view');
                      pg_get_viewdef
----------------------------------------------------------
 SELECT test_table.col1, test_table.col2 FROM test_table;
(1 row)

Time: 46.925 ms
xxx:warehouse# CREATE VIEW my_late_view AS SELECT test_table.col1, test_table.col2 FROM public.test_table WITH NO SCHEMA BINDING;
Time: 437.443 ms
xxx:warehouse# select pg_catalog.pg_get_viewdef('my_late_view');
                                                   pg_get_viewdef
--------------------------------------------------------------------------------------------------------------------
 CREATE VIEW my_late_view AS SELECT test_table.col1, test_table.col2 FROM public.test_table WITH NO SCHEMA BINDING;
(1 row)

Time: 47.987 ms

I have no idea why but it feels like an issue on redshift's side? Clearly, for late-binding views it appears like redshift is returning the entire DDL instead of the underlying select query.

¯\(ツ)/¯ all the way down

@jklukas
Copy link
Member

jklukas commented Feb 19, 2019

Yes, this looks to clearly be weirdness on the Redshift side. I'm going to get a working test put together, and I'll get this merged.

@jklukas jklukas merged commit 019df9d into sqlalchemy-redshift:master Feb 19, 2019
@sburns
Copy link

sburns commented Feb 19, 2019

I'm trying to post a question whether this is expected behavior to the Redshift AWS forum, but dear $deity that whole thing is terrible.

edit: I can't post for a few hours while my account is created. This is fine.

@jklukas
Copy link
Member

jklukas commented Feb 19, 2019

Passing test added in 7319ac2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants