Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEATURE-REQUEST: [PostgreSQL] support for DISTINCT ON #11

Closed
agy-why opened this issue Nov 3, 2021 · 2 comments
Closed

FEATURE-REQUEST: [PostgreSQL] support for DISTINCT ON #11

agy-why opened this issue Nov 3, 2021 · 2 comments

Comments

@agy-why
Copy link

agy-why commented Nov 3, 2021

Issue

The PostgreSQL specific DISTINCT ON clause is not supported by queryparser.

minimal example

example:

from queryparser.postgresql import PostgreSQLQueryProcessor

qp = PostgreSQLQueryProcessor()

sql = """
SELECT 
DISTINCT ON ("source"."tycho2_id") "tycho2_id", 
  "source"."tycho2_dist",
  "source"."source_id", 
  "source"."raj2000"
  
  FROM "applause_dr3"."source_calib" AS "source"

 WHERE "source"."raj2000" BETWEEN 10.0 AND 10.0005
   AND "source"."tycho2_dist" IS NOT NULL
 ORDER BY "source"."tycho2_dist"
"""
qp.set_query(sql)
qp.process_query()

fails with following error message

---------------------------------------------------------------------------
QuerySyntaxError                          Traceback (most recent call last)
/tmp/ipykernel_136447/2882367489.py in <module>
     17 """
     18 qp.set_query(sql)
---> 19 qp.process_query()

~/.pyenv/versions/3.8.7/envs/tap_env/lib/python3.8/site-packages/queryparser/common/common.py in process_query(self, replace_schema_name, indexed_objects)
    757         tree = parser.query()
    758         if len(self.syntax_error_listener.syntax_errors):
--> 759             raise QuerySyntaxError(self.syntax_error_listener.syntax_errors)
    760 
    761         if replace_schema_name is not None:

QuerySyntaxError: [(2, 9, 'ON')]

Expectations

I expect that DISTINCT ON clause passes without raising an exception.

Patch suggestion

This patch seems to solve the issue. But may not be appropriate for all cases.

diff --git a/src/queryparser/postgresql/PostgreSQLParser.g4 b/src/queryparser/postgresql/PostgreSQLParser.g4
index 1e8a6a9..180a1b8 100644
--- a/src/queryparser/postgresql/PostgreSQLParser.g4
+++ b/src/queryparser/postgresql/PostgreSQLParser.g4
@@ -189,7 +189,9 @@ query:                  select_statement SEMI;
 
 schema_name:            ID ;
 select_list:            ( displayed_column ( COMMA displayed_column )* ) |
-                        ( ASTERISK ( COMMA displayed_column ( COMMA displayed_column )* )? ) ;
+                        ( ASTERISK ( COMMA displayed_column ( COMMA displayed_column )* )? ) |
+                       ( ON (subselect_list) ( COMMA displayed_column )* );
+subselect_list:         ( displayed_column ( COMMA displayed_column )* );                      
 select_statement:       select_expression ( (UNION_SYM ( ALL )? ) select_expression )* ;
 
 simple_expr:
@kimakan
Copy link
Contributor

kimakan commented Oct 8, 2022

I've implemented your suggestion into the new branch
https://github.com/kimakan/queryparser/tree/feature-add-distinct
and added your example as a test to /src/queryparser/testing/tests.yaml.
All previous tests and the new tests run without any issues.
Probably, we should think about the cases we want to cover and add tests for it.

@kimakan
Copy link
Contributor

kimakan commented Nov 15, 2022

fixed in the release 0.6.0

@kimakan kimakan closed this as completed Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants