Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISTINCT command error Incorrect syntax near the keyword on SQLLab when connect MSSQL database #21027

Closed
3 tasks done
unnyns-307 opened this issue Aug 9, 2022 · 5 comments · Fixed by #23751
Closed
3 tasks done
Labels
#bug Bug report

Comments

@unnyns-307
Copy link

unnyns-307 commented Aug 9, 2022

When connect to MSSQL database cannot use DISTINCT command. SQLLab shows mssql error: 156, b"Incorrect syntax near the keyword 'DISTINCT'

How to reproduce the bug

  1. Go to SQLLab editor
  2. Connect to MSSQL database connection
  3. Query data with 'DISTINCT' command
  4. See error

Expected results

Get data result of unique value normally

Actual results

Get this following error:

mssql error: (156, b"Incorrect syntax near the keyword 'DISTINCT'.DB-Lib error message 20018, severity 15:\nGeneral SQL Server error: Check messages from the SQL Server\n")

Screenshots

image (25)

Environment

(please complete the following information):

  • browser type and version: Chrome Version 104.0.5112.81 (Official Build) (64-bit)
  • superset version: 2.0.0
  • python version: 3.8.12
  • node.js version:16
  • any feature flags active:

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

@unnyns-307 unnyns-307 added the #bug Bug report label Aug 9, 2022
@sfirke
Copy link
Member

sfirke commented Aug 9, 2022

I get this behavior too, Superset 2.0.0 connecting to MSSQL.

@EcoleKeine
Copy link

Same, any news?

@TerryLines
Copy link

TerryLines commented Jan 31, 2023

same here. Problem is with both pymssql and with pyodbc.

It looks like this is due to the insertion of top 1000 by SQL Lab, as this syntax (top 1000 distinct) isn't valid for sql server?

I ended up using a cte for my actual query with a select * from cte wrapper as a workaround

@sfirke
Copy link
Member

sfirke commented Feb 2, 2023

Nice sleuthing. I wonder if this can be accommodated on the Superset side. It looks like SELECT DISTINCT TOP 1000 * FROM myTable is supported and does what SELECT TOP 1000 DISTINCT * FROM myTable would do in other dialects? If so, could Superset recognize that the dialect is SQL Server and thus insert the TOP 1000 after DISTINCT?

@sfirke
Copy link
Member

sfirke commented Apr 20, 2023

An idea: modify

i
for i, word in enumerate(tokens)
if word.upper() in cls.select_keywords
]
first_select = selects[0]
tokens.insert(first_select + 1, "TOP")
tokens.insert(first_select + 2, str(final_limit))
. It currently finds the first select keyword (dialect-specific) and injects the TOP there. It could instead find the first select keyword followed by a non-select keyword.

Then in superset/db_engine_specs/mssql.py, specify select_keywords = {"SELECT", "DISTINCT"}. Then the TOP 1000 would get injected after SELECT DISTINCT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
#bug Bug report
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants