Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(mssql): support top syntax for limiting queries #18746

Merged
merged 16 commits into from
Feb 21, 2022
Merged

fix(mssql): support top syntax for limiting queries #18746

merged 16 commits into from
Feb 21, 2022

Conversation

sujiplr
Copy link
Contributor

@sujiplr sujiplr commented Feb 15, 2022

MSSQL is not supporting LIMIT syntax in SQLs. For limiting the rows, MSSQL having a different keyword TOP. Added fixes for handling the TOP and LIMIT clauses based on the database engines. Made this fix as generalised one for handling different database engines.

SUMMARY

This fix will help to start supporting TOP clause in SQLs. And also along with CTE queries.

BEFORE SCREENSHOTS

Pre-error-with-cte

AFTER SCREENSHOTS

Post-fix-working with top and cte

TESTING INSTRUCTIONS

You can test the SQLs as per the above screenshots.

ADDITIONAL INFORMATION

MSSQL is not supporting LIMIT syntax in SQLs. For limiting the rows, MSSQL having a different keyword TOP. Added fixes for handling the TOP and LIMIT clauses based on the database engines.
@codecov
Copy link

codecov bot commented Feb 15, 2022

Codecov Report

Merging #18746 (ee37d94) into master (ea9a904) will increase coverage by 0.11%.
The diff coverage is 95.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18746      +/-   ##
==========================================
+ Coverage   66.31%   66.43%   +0.11%     
==========================================
  Files        1620     1617       -3     
  Lines       63088    62847     -241     
  Branches     6372     6320      -52     
==========================================
- Hits        41839    41750      -89     
+ Misses      19592    19445     -147     
+ Partials     1657     1652       -5     
Flag Coverage Δ
hive 52.28% <20.48%> (+0.05%) ⬆️
mysql 81.55% <95.18%> (+0.12%) ⬆️
postgres 81.60% <95.18%> (+0.12%) ⬆️
presto 52.12% <20.48%> (+0.05%) ⬆️
python 82.03% <95.18%> (+0.12%) ⬆️
sqlite 81.29% <95.18%> (+0.12%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/sql_lab.py 81.64% <ø> (ø)
superset/models/core.py 88.91% <66.66%> (-0.19%) ⬇️
superset/sql_parse.py 98.61% <92.85%> (-0.87%) ⬇️
superset/db_engine_specs/base.py 89.48% <97.91%> (+0.73%) ⬆️
superset/db_engine_specs/mssql.py 96.00% <100.00%> (+0.08%) ⬆️
superset/db_engine_specs/teradata.py 92.30% <100.00%> (+29.54%) ⬆️
superset-frontend/src/views/CRUD/utils.tsx 58.58% <0.00%> (-5.14%) ⬇️
...d/src/explore/components/PropertiesModal/index.tsx 68.33% <0.00%> (-1.67%) ⬇️
...rset-frontend/src/explore/exploreUtils/formData.ts 81.81% <0.00%> (-1.52%) ⬇️
...dashboard/components/menu/ShareMenuItems/index.tsx 66.66% <0.00%> (-1.20%) ⬇️
... and 52 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ea9a904...ee37d94. Read the comment docs.

@villebro villebro self-requested a review February 16, 2022 07:59
Teradata code for top clause handling removed from teradata.py file, since we added generic section in base engine for the same.
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass comments. Let's also ping the original authors of the Teradata TOP functionality when this is ready 👍

superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/sql_parse.py Outdated Show resolved Hide resolved
Added changes to handle TOP command in CTEs, for DB Engines which are not supporting inline CTEs.
Added multiple unit test cases for MSSQL top command handling and also along with CTEs
Corrected the select_keywords name key in basengine
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second pass comments

superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/mssql.py Outdated Show resolved Hide resolved
made the required corrections based on code review to keep good code readability and code cleanliness.
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some last nit comments, but I think this is pretty much ready for a final test pass + a few added unit tests to the MSSQL test suite. Also, I suggest checking failing workflows on CI, as there appears to be simple linting issues that need to be addressed.

superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
superset/sql_parse.py Outdated Show resolved Hide resolved
superset/sql_lab.py Outdated Show resolved Hide resolved
superset/sql_parse.py Outdated Show resolved Hide resolved
@villebro villebro changed the title (WIP) Fix for SQL-TOP/LIMIT Clause Apply based on database engines fix(mssql): support top syntax for limiting queries Feb 18, 2022
@villebro
Copy link
Member

@dmcnulla this PR generalizes the work you did in #18240 to make it work for MSSQL, too. Would you be able to review this PR, and better yet test it on Teradata to ensure it still works as intended? Also notice the updated Teradata unit tests (I pushed a few new ones to this PR as I noticed the SAMPLE keyword wasn't covered by the existing tests).

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor nits

superset/sql_lab.py Outdated Show resolved Hide resolved
superset/sql_parse.py Outdated Show resolved Hide resolved
Moved the top/limit flag check from sql_lab to core.
superset/models/core.py Outdated Show resolved Hide resolved
superset/sql_lab.py Outdated Show resolved Hide resolved
superset/sql_parse.py Outdated Show resolved Hide resolved
superset/db_engine_specs/base.py Outdated Show resolved Hide resolved
Changes for keeping code cleanliness
Corrected lint issue.
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last nit, after this I believe this is good to go

superset/db_engine_specs/mssql.py Outdated Show resolved Hide resolved
Code cleanliness
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for all the hard work here; this is a major step forward in ensuring Superset works properly on MSSQL, Synapse and other Azure db products! ❤️

@villebro villebro merged commit 7e51b20 into apache:master Feb 21, 2022
@sujiplr sujiplr deleted the SQL-TOP-FIX branch February 23, 2022 09:27
@sfirke
Copy link
Member

sfirke commented Apr 20, 2023

I am noodling on #21027 (comment) and found this PR via blame view on the relevant code. @sujiplr do you have thoughts on that issue - modifying the codebase to support use of DISTINCT in SQL Lab for SQL Server? The current code injects TOP 1000 in the wrong place for MSSQL so it's SELECT TOP 1000 DISTINCT when it needs to be SELECT DISTINCT TOP 1000.

UPDATE @villebro has a PR to address this, #23751.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.5.0 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 1.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Common Table Expressions (CTE's) do not work for Microsoft SQL Server database
4 participants