Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Feature request: support SELECT DISTINCT #294

Closed
seraphjiang opened this issue Nov 19, 2019 · 2 comments
Closed

Feature request: support SELECT DISTINCT #294

seraphjiang opened this issue Nov 19, 2019 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@seraphjiang
Copy link
Member

Use case

We want to be able to use SELECT DISTINCT to find out list of unique item of specific field without duplicate data

Example SQL

SELECT DISTINCT(DestAirportID) FROM kibana_sample_data_flights order by DestAirportID

OD-SQL

GET _opendistro/_sql?format=csv
{
  "query": "select distinct(DestAirportID) from  kibana_sample_data_flights order by DestAirportID"
}

actual result(Partial)
you could see, there is duplicate data in result

DestAirportID
ABQ
ABQ
ADL
AICM
AICM
AICM
AICM
AICM
AICM
AICM
...

Expected result
no duplicate data

DestAirportID
ABQ
ADL
AICM
AMS
ARN
@abbashus abbashus added the enhancement New feature or request label Nov 19, 2019
@chloe-zh
Copy link
Member

@seraphjiang Thanks for reporting. We've reproduced this issue:

POST _opendistro/_sql?format=csv
{
  "query": "select distinct(DestAirportID) from  kibana_sample_data_flights order by DestAirportID"
}

Output:

DestAirportID
ABQ
ABQ
ADL
AICM
AICM
AICM
AICM
...

We will investigate this problem and explore the solution to support the DISTINCT feature soon. Thanks!

@dai-chen
Copy link
Member

For now we only support COUNT(DISTINCT) because we rely on Elasticsearch aggregation to de-duplicate. Basically SELECT DISTINCT is equivalent to GROUP BY without aggregate function, so one option is to translate distinct to aggregation. In other database, DISTINCT and GROUP BY generates different physical plan, so it makes more sense to do this translation when generating DSL rather than by rewriter.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants