Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SELECT DISTINCT not removing duplicates #6728

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Closed

SELECT DISTINCT not removing duplicates #6728

monetdb-team opened this issue Nov 30, 2020 · 0 comments
Labels

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2019-07-11 15:07:53 +0200
From: @swingbit
To: SQL devs <>
Version: 11.31.13 (Aug2018-SP2)
CC: @njnes, @PedroTadim

Last updated: 2019-09-02 16:05:26 +0200

Comment 27127

Date: 2019-07-11 15:07:53 +0200
From: @swingbit

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36
Build Identifier:

start transaction;

create table r(a1 string, a2 string);
insert into r values
('a','b'),
('a','b'),
('b','a');

The following query removes duplicate correctly:

SELECT DISTINCT a1,a2
FROM r;

+------+------+
| a1 | a2 |
+======+======+
| a | b |
| b | a |
+------+------+
2 tuples

The following query doesn't:

SELECT DISTINCT
CASE WHEN a1 > a2 THEN a2 ELSE a1 END as c1,
CASE WHEN a1 < a2 THEN a2 ELSE a1 END as c2
FROM r;

+------+------+
| c1 | c2 |
+======+======+
| a | b |
| a | b |
+------+------+
2 tuples

It's not really important what the CASE statements do. The DISTINCT should remove duplicate rows from the final relation, but they are still there.

Reproducible: Always

Comment 27128

Date: 2019-07-11 15:11:31 +0200
From: @swingbit

Also notice that replacing DISTINCT with the corresponding GROUP BY clause gives the correct result:

SELECT
CASE WHEN a1 > a2 THEN a2 ELSE a1 END as c1,
CASE WHEN a1 < a2 THEN a2 ELSE a1 END as c2
FROM r
GROUP BY c1,c2;

+------+------+
| c1 | c2 |
+======+======+
| a | b |
+------+------+
1 tuple

Comment 27129

Date: 2019-07-11 15:59:03 +0200
From: @PedroTadim

This bug is reproducible on Apr2019 branch, but not on default.

Comment 27132

Date: 2019-07-11 18:22:57 +0200
From: MonetDB Mercurial Repository <>

Changeset 4448a5aee0a4 made by Pedro Ferreira pedro.ferreira@monetdbsolutions.com in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=4448a5aee0a4

Changeset description:

Added test bug #6728.

Comment 27178

Date: 2019-07-27 22:47:07 +0200
From: MonetDB Mercurial Repository <>

Changeset 397669aa4038 made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=397669aa4038

Changeset description:

make sure we do not loose the distinct when rewriting (splitting) the
project expressions. This fixes bug #6728

Comment 27179

Date: 2019-07-27 22:47:34 +0200
From: @njnes

fixed optimizer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant