Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong query result set WHERE "IS NULL" or "NOT IN" clauses uses in combination with ORDER, LIMIT and OFFSET #3684

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2015-03-12 15:27:06 +0100
From: Antonio <>
To: SQL devs <>
Version: 11.19.7 (Oct2014-SP1)
CC: @njnes

Last updated: 2015-05-07 12:37:52 +0200

Comment 20710

Date: 2015-03-12 15:27:06 +0100
From: Antonio <>

User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36
Build Identifier:

Executing a query that has WHERE condition with NOT IN clause and specifying LIMIT and OFFSET results in different (wrong) result sets compared to:

  1. removing the LIMIT condition
  2. changing the NOT IN clause into a <> operator

This is an example of the query that fails:
select
"store_id" as "col0" , *
from "lineitem_denormalized_first1k"
where
(
"yyyymmdd" <= 20101213
or "customer_date_of_birth" < date '1977-03-19'
or
(
("customer_state" <> 'mardin')
and
(
"customer_country" = 'usa'
or "customer_country" = 'sweden'
)
and not "year_begin_date" = '2010-01-01'
)
)
order by "col0" asc limit 10000;

that returns 1120 rows (instead of 824 expected records) with part of the result set duplicated.

BUT if I remove the LIMIT 10000 clause, than it correctly returns 824 records.
OR I can achieve the same fix by changing the "not "year_begin_date" = '2010-01-01'" into a "year_begin_date" <> '2010-01-01' and keeping the LIMIT clause, and the result set correctly has 824 records.

I tested the same table on MySql executing the same query and it works without problem.

Is this a know issue?

Reproducible: Always

Steps to Reproduce:

Use WHERE NOT Field IN ('value') in combination with ORDER BY and LIMIT clause

Actual Results:

Duplicated part of result set

Expected Results:

non duplicated rows in result set

Comment 20739

Date: 2015-03-23 16:25:18 +0100
From: Antonio <>

Another similar case with this query highlights the problem:

select
Line_Margin
from lineitem_denormalized_first1k
where
(
(
Line_Cost is not null
and
Line_Cost = 4.56
)
or Store_Id <> 35
or
(
Product_Name like ('%' || 'benches' || '%') escape '!'
and
(
Month_Begin_Date is null or
Month_Begin_Date <> date '2010-12-01'
)
)
)
order by Line_Margin limit 68 offset 319

It wrongly returns 0 records, when it should return 68 records.

To solve the issue I have 4 different options:

  1. remove the ORDER BY clause
    or
  2. remove the LIMIT clause
    or
  3. remove the OFFSET clause
    or
  4. remove the "is null" from the WHERE conditions.

This seems to highlight the bug where I cannot use ORDER BY + LIMIT + OFFSET clauses in combination when the WHERE conditions have "IS NULL" or "NOT" expressions.

Comment 20740

Date: 2015-03-25 19:35:24 +0100
From: @njnes

could you also supply the ddl statements for the tables used in your example query?

Comment 20743

Date: 2015-03-26 16:00:42 +0100
From: Antonio <>

Created attachment 327
DDL create table

Attached file: create first1k table on MonetDb.txt (text/plain, 1238 bytes)
Description: DDL create table

Comment 20744

Date: 2015-03-26 16:01:58 +0100
From: Antonio <>

Created attachment 328
ddl create table

Attached file: create first1k table on MonetDb.txt (text/plain, 1238 bytes)
Description: ddl create table

Comment 20745

Date: 2015-03-26 16:19:51 +0100
From: Antonio <>

Created attachment 329
csv file with data

Used COPY INTO command to import data into table.

Attached file: lineitem_denormalized_first1k_headerless.csv (application/vnd.ms-excel, 319404 bytes)
Description: csv file with data

Comment 20746

Date: 2015-03-26 16:21:46 +0100
From: Antonio <>

(In reply to comment 5)

Created attachment 329 [details]
csv file with data

Used COPY INTO command to import data into table.

For convenience, this is the exact command to import data into the table
mclient -d visokio -s "COPY INTO lineitem_denormalized_first1k FROM '/home/antonio/lineitem_denormalized_headerless.csv' USING DELIMITERS ',','\n','"'"

Comment 20774

Date: 2015-04-10 13:27:07 +0200
From: MonetDB Mercurial Repository <>

Changeset ca7a2dd7ec6c made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=ca7a2dd7ec6c

Changeset description:

fixed bug #3684, properly pushdown topn's including offsets

Comment 20775

Date: 2015-04-10 13:27:43 +0200
From: @njnes

fixed by properly pushing down topn's.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant