Improve SQL row limiting and metadata operations #35
Per discussion with @clach04:
Tested the current sql_parse.py from the Superset headrev with adhoc SQL and a row limiting clause.
Tested this hack from June 8, 2020, applied to sql_parse.py of the current Superset headrev, with adhoc SQL and a row limiting clause.
This lends support to the theory that the Superset SQL parser supports using the row limiting clause
Main change looks good, minor refactor suggested.
Additional change should be reverted.
lib/sqlalchemy_ingres/base.py
Outdated
@@ -641,7 +643,7 @@ def denormalize_name(self, name):
        if name is None:
            return None
        else:
            return name.lower().encode('latin1')
I'm guessing, but I suspect this is accidentally working for the test case (compare with normalize_name, which does NOT change case). I recommend reverting this line, then handling the DB_NAME_CASE lookup as a separate change.
Additionally, the existing code with the encoding change is DEFINITELY incorrect. It will only work for some values and II_CHARSETxx settings/combinations. This needs a separate issue; I just opened #36 (worth a jira ticket too for cross reference?)
Reverted change and will deal with it in #36
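For reference, both concerns raised in this thread can be reproduced in plain Python. This is only a sketch: `denormalize_lowercasing` mirrors the single line under review, while the `catalog` dict and the table names are hypothetical stand-ins for a case-sensitive name lookup, not anything taken from the dialect.

```python
def denormalize_lowercasing(name):
    # Mirrors the reviewed line: lowercase the name, then encode as latin1.
    return None if name is None else name.lower().encode("latin1")


# Hypothetical case-sensitive catalog keyed by the table name as created.
catalog = {"MixedCase_Table": ["id", "name"]}

# Concern 1: lowercasing a mixed-case name makes the lookup miss.
lookup_key = denormalize_lowercasing("MixedCase_Table").decode("latin1")
columns = catalog.get(lookup_key, [])
print(lookup_key, columns)

# Concern 2: a hard-coded latin1 encoding cannot represent every name.
try:
    denormalize_lowercasing("таблица")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print("encodable as latin1:", encodable)
```

An empty column list is the condition that leads to a SELECT statement with no columns, and the encoding failure is one concrete case of the hard-coded charset not fitting the data.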
lib/sqlalchemy_ingres/base.py
Outdated
  if not self.is_subquery():
-     if select._offset:
+     if select._offset is not None and select._offset > 0:
Seeing a pattern of checks here.
What do you think about a small utility function similar to:
    def is_integer_greater_than_zero(check_value):
        return check_value is not None and check_value > 0
and then using that instead?
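A quick sketch of how the suggested helper behaves, for reference. The sample values are hypothetical limit/offset inputs, not taken from the dialect's tests.

```python
def is_integer_greater_than_zero(check_value):
    """Return True only when the value is set (not None) and positive."""
    return check_value is not None and check_value > 0


# Hypothetical values a limit or offset might carry.
samples = [None, 0, 5, -1]
results = {value: is_integer_greater_than_zero(value) for value in samples}

for value, outcome in results.items():
    print(repr(value), outcome)

# Plain truthiness (the old `if select._offset:` form) differs only for
# negative numbers, which are truthy in Python.
print(bool(-1), is_integer_greater_than_zero(-1))
```

Compared with the plain truthiness check, the helper gives the same answer for None, 0, and positive values, and additionally rejects negative numbers.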
Any thoughts on this suggestion?
Thanks for the suggestion! Utility function added to base.py for checking if value is greater than zero.
Superset issue 27427 opened for the problem involving row limiting clauses.
👍
@hab6 this is marked as Draft so can not be submitted yet. Any concerns with un-drafting and merging? lgtm :-)
@clach04 thanks for your suggestion per our brief conversation. I agree that it will be good to merge these changes and provide some relief for the problem addressed by these code changes.
Overview
These code changes are an attempt to improve some operations in Apache Superset SQL Lab when using the Ingres dialect for SQLAlchemy.
Related internal ticket II-13536
Details
When using the SQLAlchemy-Ingres dialect in Apache Superset SQL Lab, two unrelated problems occur. Both appear when choosing a table from the SEE TABLE SCHEMA drop list, which attempts to execute a SQL SELECT statement to retrieve rows from the target table.
Problems
IngresDialect::get_columns calls IngresDialect::denormalize_name, which converts the given string to all lowercase. If the table name is mixed (or upper) case, the SQL statement to retrieve column names returns no rows, which causes Superset to build a syntactically invalid SELECT statement that specifies no columns. It is important to note that this problem could also manifest outside of Apache Superset given the right conditions.
IngresDialect::limit_clause is called, which returns a row limiting clause using FETCH FIRST n ROWS ONLY, after which Superset also appends LIMIT n, causing a syntactically incorrect SQL statement having two row limiting clauses. Superset appears unable to recognize the FETCH FIRST ... clause, so it cannot handle it properly and avoid adding the LIMIT clause.
Fixes / Workarounds
IngresDialect::limit_clause gives precedence to the LIMIT n clause and only uses FETCH FIRST n ROWS ONLY if there is also an OFFSET m clause. When the row limiting clause is LIMIT n, Superset will not append another "LIMIT" clause and the SQL statement remains syntactically valid.
SQLAlchemy Test Suite Results
18342 tests
13909 tests
Concerns
SQL_MAX_ROW affects behavior in Superset SQL Lab. Without the proposed fix to the SQLAlchemy Ingres dialect limit_clause method, adhoc queries having a FETCH FIRST ... row limiting clause always fail with a syntax error since Superset adds the redundant LIMIT n clause. However, with the fix to method limit_clause, adhoc queries using a FETCH FIRST ... row limiting clause will work if SQL_MAX_ROW is set to 0 (zero). If SQL_MAX_ROW is set to a value > 0, Superset appends a LIMIT clause, causing a syntax error. Adhoc queries using LIMIT for row limiting work regardless of the value of SQL_MAX_ROW.
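The interaction described in this PR can be sketched with a toy model. Everything here is an assumption made for illustration: `limit_clause` below is a standalone function that only imitates the precedence described under Fixes / Workarounds (it is not the dialect's actual method), and `superset_like_apply_limit` is a hypothetical, simplified stand-in for the Superset behavior reported above, not Superset code.

```python
import re


def is_integer_greater_than_zero(check_value):
    return check_value is not None and check_value > 0


def limit_clause(limit, offset):
    """Toy model: prefer LIMIT n, use FETCH FIRST only together with OFFSET."""
    has_limit = is_integer_greater_than_zero(limit)
    has_offset = is_integer_greater_than_zero(offset)
    if has_offset:
        clause = f" OFFSET {offset}"
        if has_limit:
            clause += f" FETCH FIRST {limit} ROWS ONLY"
        return clause
    return f" LIMIT {limit}" if has_limit else ""


_TRAILING_LIMIT = re.compile(r"\bLIMIT\s+\d+\s*$", re.IGNORECASE)
_ANY_ROW_LIMIT = re.compile(
    r"\bLIMIT\s+\d+|\bFETCH\s+FIRST\s+\d+\s+ROWS?\s+ONLY", re.IGNORECASE
)


def superset_like_apply_limit(sql, sql_max_row):
    """Hypothetical stand-in: append LIMIT unless one is present or the max is 0."""
    if sql_max_row <= 0 or _TRAILING_LIMIT.search(sql):
        return sql
    return f"{sql} LIMIT {sql_max_row}"


def count_row_limiting_clauses(sql):
    return len(_ANY_ROW_LIMIT.findall(sql))


fetch_query = "SELECT * FROM t FETCH FIRST 10 ROWS ONLY"
limit_query = "SELECT * FROM t" + limit_clause(10, None)

for max_row in (0, 1000):
    for query in (fetch_query, limit_query):
        final_sql = superset_like_apply_limit(query, max_row)
        print(max_row, count_row_limiting_clauses(final_sql), final_sql)
```

Under these assumptions, the only combination that ends up with two row limiting clauses is the FETCH FIRST query with a maximum row setting above zero, which matches the concern described above.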