Date: 2018-10-11 20:20:16 +0200
From: Joseph Dsilva <>
To: SQL devs <>
Version: 11.31.7 (Aug2018)
CC: joedsilva, martin.van.dinther, @PedroTadim
Last updated: 2019-01-14 17:29:12 +0100
Comment 26638
Date: 2018-10-11 20:20:16 +0200
From: Joseph Dsilva <>
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36
Build Identifier:
A SQL Query with multiple columns in the IN clause, such as
... WHERE (c1, c2) IN ( SELECT c1, c2 FROM ... )
produces incorrect results (see reproducibility for sample test case)
Reproducible: Always
Steps to Reproduce:
Create a test table
CREATE TABLE T1
(
C1 INTEGER,
C2 INTEGER,
C3 INTEGER
);
Insert some records
INSERT INTO T1
VALUES (1, 2, 3)
,(1, 2, 4)
,(2, 2, 5)
,(1, 3, 6)
;
Let us see which c1, c2 combination has more than one entry (results are correct)
SELECT C1, C2, COUNT()
FROM T1
GROUP BY C1, C2
HAVING COUNT() > 1
;
+------+------+------+
| c1 | c2 | L4 |
+======+======+======+
| 1 | 2 | 2 |
+------+------+------+
1 tuple
Let us find all records from T1 such that C1, C2 has multiple entries for a given value combination. (correct result)
SELECT T1.C1, T1.C2, T1.C3
FROM T1,
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT(*) > 1
) X
WHERE T1.C1 = X.C1 AND T1.C2 = X.C2
;
+------+------+------+
| c1 | c2 | c3 |
+======+======+======+
| 1 | 2 | 3 |
| 1 | 2 | 4 |
+------+------+------+
Let us write the same logic in (4) as a subquery. (results incorrect, the last row should not be there).
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT(*) > 1
)
;
+------+------+------+
| c1 | c2 | c3 |
+======+======+======+
| 1 | 2 | 3 |
| 1 | 2 | 4 |
| 1 | 3 | 6 |
+------+------+------+
3 tuples
Actual Results:
Please see Reproducibility.
Expected Results:
Please see Reproducibility.
Please see Reproducibility.
$ ./mserver5 --version
MonetDB 5 server v11.31.7 "Aug2018" (64-bit, 128-bit integers)
Copyright (c) 1993 - July 2008 CWI
Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 11.7GiB available memory, 8 available cpu cores
Libraries:
libpcre: 8.38 2015-11-23 (compiled with 8.38)
openssl: OpenSSL 1.0.2g 1 Mar 2016 (compiled with OpenSSL 1.0.2g 1 Mar 2016)
libxml2: 2.9.3 (compiled with 2.9.3)
Compiled by: jdsilv2@cerberus (x86_64-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -D_FORTIFY_SOURCE=2
Linking : /usr/bin/ld -m elf_x86_64 -Wl,-Bsymbolic-functions
Found another multi-column IN query variant which produces wrong output:
SELECT C1, C2, C3
FROM T1
WHERE (C2, C1) IN
(
SELECT C2, C1
FROM T1
GROUP BY C1, C2
HAVING COUNT(*) > 1
);
Added it to the test script for bug #6651.
Comment 26679
Date: 2018-11-14 16:22:27 +0100
From: Martin van Dinther <<martin.van.dinther>>
Update importance to High-critical
Comment 26681
Date: 2018-11-15 19:11:36 +0100
From: Martin van Dinther <<martin.van.dinther>>
Some more queries to analyse the problem and find some workaround (see query 8):
-- 7. Same query as 5 but using qualified column names in subquery.
-- (produces incorrect results, same as 5)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT T1.C1, T1.C2
FROM T1
GROUP BY T1.C1, T1.C2
HAVING COUNT(*) > 1
);
-- 8. Same query as 5 but using alias for table and qualified column names
-- in subquery. (produces correct result, so can be used as a workaround)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT T.C1, T.C2
FROM T1 AS T
GROUP BY T.C1, T.C2
HAVING COUNT(*) > 1
);
-- 9. Query using NOT IN instead of IN (and change COUNT() = 1)
-- (produces correct result in this use case data)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) NOT IN
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT() = 1
);
From queries 7 and 8 we can see it has to do with a name scoping problem.
Query 8 shows how you can workaround it in the meantime (till there is a software patch released).
Fix for bug #6651. Label expressions for in's right relation, so the rel_optimizer doesn't get confused.
Also added defensive line for uninitialized variable.
The text was updated successfully, but these errors were encountered:
Date: 2018-10-11 20:20:16 +0200
From: Joseph Dsilva <>
To: SQL devs <>
Version: 11.31.7 (Aug2018)
CC: joedsilva, martin.van.dinther, @PedroTadim
Last updated: 2019-01-14 17:29:12 +0100
Comment 26638
Date: 2018-10-11 20:20:16 +0200
From: Joseph Dsilva <>
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36
Build Identifier:
A SQL Query with multiple columns in the IN clause, such as
... WHERE (c1, c2) IN ( SELECT c1, c2 FROM ... )
produces incorrect results (see reproducibility for sample test case)
Reproducible: Always
Steps to Reproduce:
Create a test table
CREATE TABLE T1
(
C1 INTEGER,
C2 INTEGER,
C3 INTEGER
);
Insert some records
INSERT INTO T1
VALUES (1, 2, 3)
,(1, 2, 4)
,(2, 2, 5)
,(1, 3, 6)
;
Let us see which c1, c2 combination has more than one entry (results are correct)
SELECT C1, C2, COUNT()
FROM T1
GROUP BY C1, C2
HAVING COUNT() > 1
;
+------+------+------+
| c1 | c2 | L4 |
+======+======+======+
| 1 | 2 | 2 |
+------+------+------+
1 tuple
Let us find all records from T1 such that C1, C2 has multiple entries for a given value combination. (correct result)
SELECT T1.C1, T1.C2, T1.C3
FROM T1,
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT(*) > 1
) X
WHERE T1.C1 = X.C1 AND T1.C2 = X.C2
;
+------+------+------+
| c1 | c2 | c3 |
+======+======+======+
| 1 | 2 | 3 |
| 1 | 2 | 4 |
+------+------+------+
Let us write the same logic in (4) as a subquery. (results incorrect, the last row should not be there).
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT(*) > 1
)
;
+------+------+------+
| c1 | c2 | c3 |
+======+======+======+
| 1 | 2 | 3 |
| 1 | 2 | 4 |
| 1 | 3 | 6 |
+------+------+------+
3 tuples
Actual Results:
Please see Reproducibility.
Expected Results:
Please see Reproducibility.
Please see Reproducibility.
$ ./mserver5 --version
MonetDB 5 server v11.31.7 "Aug2018" (64-bit, 128-bit integers)
Copyright (c) 1993 - July 2008 CWI
Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 11.7GiB available memory, 8 available cpu cores
Libraries:
libpcre: 8.38 2015-11-23 (compiled with 8.38)
openssl: OpenSSL 1.0.2g 1 Mar 2016 (compiled with OpenSSL 1.0.2g 1 Mar 2016)
libxml2: 2.9.3 (compiled with 2.9.3)
Compiled by: jdsilv2@cerberus (x86_64-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -D_FORTIFY_SOURCE=2
Linking : /usr/bin/ld -m elf_x86_64 -Wl,-Bsymbolic-functions
Comment 26640
Date: 2018-10-12 11:24:55 +0200
From: MonetDB Mercurial Repository <>
Changeset 7352568e47e0 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.
For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=7352568e47e0
Changeset description:
Comment 26678
Date: 2018-11-14 16:19:42 +0100
From: MonetDB Mercurial Repository <>
Changeset e8917e90a56c made by Martin van Dinther martin.van.dinther@monetdbsolutions.com in the MonetDB repo, refers to this bug.
For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=e8917e90a56c
Changeset description:
Comment 26679
Date: 2018-11-14 16:22:27 +0100
From: Martin van Dinther <<martin.van.dinther>>
Update importance to High-critical
Comment 26681
Date: 2018-11-15 19:11:36 +0100
From: Martin van Dinther <<martin.van.dinther>>
Some more queries to analyse the problem and find some workaround (see query 8):
-- 7. Same query as 5 but using qualified column names in subquery.
-- (produces incorrect results, same as 5)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT T1.C1, T1.C2
FROM T1
GROUP BY T1.C1, T1.C2
HAVING COUNT(*) > 1
);
-- 8. Same query as 5 but using alias for table and qualified column names
-- in subquery. (produces correct result, so can be used as a workaround)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT T.C1, T.C2
FROM T1 AS T
GROUP BY T.C1, T.C2
HAVING COUNT(*) > 1
);
-- 9. Query using NOT IN instead of IN (and change COUNT() = 1)
-- (produces correct result in this use case data)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) NOT IN
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT() = 1
);
From queries 7 and 8 we can see it has to do with a name scoping problem.
Query 8 shows how you can workaround it in the meantime (till there is a software patch released).
Comment 26690
Date: 2018-11-20 10:44:46 +0100
From: MonetDB Mercurial Repository <>
Changeset e9be2776cf1d made by Pedro Ferreira pedro.ferreira@monetdbsolutions.com in the MonetDB repo, refers to this bug.
For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=e9be2776cf1d
Changeset description:
The text was updated successfully, but these errors were encountered: