Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Where clause with cast ignores sub-select #6712

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Closed

Where clause with cast ignores sub-select #6712

monetdb-team opened this issue Nov 30, 2020 · 0 comments

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2019-06-11 12:56:38 +0200
From: @swingbit
To: SQL devs <>
Version: 11.33.3 (Apr2019)
CC: @njnes

Last updated: 2019-09-02 16:05:28 +0200

Comment 27041

Date: 2019-06-11 12:56:38 +0200
From: @swingbit

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36
Build Identifier:

CREATE TABLE x (
"id" INTEGER,
"attribute" CHARACTER LARGE OBJECT,
"value" CHARACTER LARGE OBJECT
);
INSERT INTO x VALUES (1, 'version', '3.15.0');
INSERT INTO x VALUES (1, 'executiontime', '100848');
INSERT INTO x VALUES (2, 'version', '3.15.0');
INSERT INTO x VALUES (2, 'executiontime', '54340');
INSERT INTO x VALUES (3, 'version', '3.15.0');
INSERT INTO x VALUES (3, 'executiontime', '96715');

CREATE VIEW executiontimes as select * from x where attribute = 'executiontime';

select id
from executiontimes
where cast(value as bigint) > 80000;

This returns:
conversion of string '3.15.0' to type lng failed.

Which means the where clause has been applied to table "x", not to view "executiontimes"

Note that:

  • Both the following clauses give the expected (empty) result, meaning the issue is only with cast()
    where value like '%.%';
    where substring(value,1,1) = '3';
  • This happens with any form of sub-select (in-line/view/with)

PS. If possible, I would really appreciate the fix to be back-ported to Aug2018 (also affected)

Reproducible: Always

Actual Results:

conversion of string '3.15.0' to type lng failed.

Expected Results:

+------+
| id |
+======+
| 1 |
| 3 |
+------+
2 tuples

MonetDB 5 server 11.33.3 (Apr2019) (64-bit, 128-bit integers)
Copyright (c) 1993 - July 2008 CWI
Copyright (c) August 2008 - 2019 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 3.8GiB available memory, 2 available cpu cores
Libraries:
libpcre: 8.32 2012-11-30
openssl: OpenSSL 1.0.2k-fips 26 Jan 2017
libxml2: 2.9.1
Compiled by: mockbuild@ (x86_64-redhat-linux-gnu)
Compilation: gcc -std=gnu99 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -Wno-format-truncation
Linking : /usr/bin/ld -m elf_x86_64 -Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-Bsymbolic-functions

Comment 27042

Date: 2019-06-11 14:10:59 +0200
From: MonetDB Mercurial Repository <>

Changeset 5ad88b364ee5 made by Pedro Ferreira pedro.ferreira@monetdbsolutions.com in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=5ad88b364ee5

Changeset description:

Added test for bug #6712.

Comment 27044

Date: 2019-06-11 14:37:21 +0200
From: @njnes

well the query gets merged (as we do not support physical views). Somehow we optimized (pushed the select with numbers) before the match on strings (that could be simply fixed, but is a performance issue not really correctness..)

Comment 27045

Date: 2019-06-11 14:38:13 +0200
From: @sjoerdmullender

I think your analysis is flawed. The problem you're facing is not that the subquery is being ignored, it is instead that the conversion and selection on the value column is done before the selection on the attribute column. Just having a view doesn't mean that the view is actually executed first.

Your query is equivalent to
select id
from x
where attribute = 'executiontime' and cast(value as bigint) > 80000;

The two conditions are reversed because the SQL engine probably thinks that might be more efficient.

Comment 27046

Date: 2019-06-11 14:46:43 +0200
From: @swingbit

Sjoerd, I don't think the query you wrote is equivalent to mine. You perform both selections on the same relations. I don't.

I think it is wrong to think about it in terms of execution order.

The query I wrote logically defines the content of relation "executiontimes". When I select anything from it, I expect to select from that content, no matter the execution order that the engine decides.

Comment 27047

Date: 2019-06-11 15:00:15 +0200
From: @swingbit

For reference, PostgreSQL gives for example the expected result. Not because it chooses a different execution order, but because this is the only correct one.

Comment 27048

Date: 2019-06-11 15:22:59 +0200
From: @swingbit

If we look at it without a view, the following is equivalent to my first query:

select executiontimes.id
from (select * from x where attribute = 'executiontime') as executiontimes
where cast(executiontimes.value as bigint) > 80000;

Expression "cast(executiontimes.value as bigint)" depends on relation executiontimes. It cannot be evaluated before executiontimes is evaluated.

Flattening everything to a bunch of selections on the same relation is wrong, in my opinion.

Comment 27053

Date: 2019-06-11 19:40:06 +0200
From: MonetDB Mercurial Repository <>

Changeset 47300227a174 made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=47300227a174

Changeset description:

fixes for bug #6712

Comment 27054

Date: 2019-06-11 19:56:27 +0200
From: @njnes

when possible we now push the candidate list into the batcalc.convert
routine.

Comment 27059

Date: 2019-06-12 16:02:26 +0200
From: @swingbit

I'm sorry to insist, but I think the fix solves my initial example, not the real issue.

Here another example that fails.

CREATE TABLE x (n INTEGER);

INSERT INTO x VALUES (-2);
INSERT INTO x VALUES (-1);
INSERT INTO x VALUES (0);
INSERT INTO x VALUES (1);
INSERT INTO x VALUES (2);

select nonzero.n
from (select n from x where n <> 0) as nonzero
where 1/nonzero.n = 1;

Gives:
division by zero.

This is not correct. The semantics of the query is not respected. I'm explicitly using relation nonzero which is by definition not containing zeros. This query should not attempt that division.

I completely understand that the implementation flattens the two selections on the same base table. I also understand that this approach produces the correct result in most cases. But not in all cases. This is one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant