Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected result when using trigonomeric functions #3981

Closed
suyZhong opened this issue Jan 25, 2024 · 13 comments · Fixed by #3985
Closed

Unexpected result when using trigonomeric functions #3981

suyZhong opened this issue Jan 25, 2024 · 13 comments · Fixed by #3985

Comments

@suyZhong
Copy link

Consider the following test cases. It is unexpected that the second query return TRUE and the third query return an empty result. If it is expected that the second query return TRUE, then the third query should return the row in the table.

DROP TABLE t1;

CREATE TABLE t1(c0 INT, PRIMARY KEY(c0));
INSERT INTO t1 (c0) VALUES (1);

SELECT * FROM t1; -- 1
SELECT ASIN(2) > t1.c0 FROM t1; -- TRUE (somewhat unexpected)
SELECT * FROM t1 WHERE (ASIN(2))>(t1.c0);
-- Actual:  Empty set (0 rows, 0 ms)

I originally find this by building the latest source version cd09d93. I could also reproduce this in 2.2.224.

Here's how I start H2 Shell:

java -cp target/h2-2.2.229-SNAPSHOT.jar org.h2.tools.Shell -url jdbc:h2:./test -user sa -password 1
@manticore-projects
Copy link
Contributor

Greetings.

Unfortunately, I do not understand your test case. Simplified you execute:

select asin(2), 1, asin(2) > 1, True = (asin(2) > 1);
select 1 WHERE (asin(2)) > (1);

NaN 1 TRUE TRUE

1

@manticore-projects
Copy link
Contributor

manticore-projects commented Jan 25, 2024

There is some weird type conversion somewhere:

-- returns 1
SELECT * FROM t1 WHERE (ASIN(2))>(cast(t1.c0 as int));

-- returns 1
SELECT * FROM t1 WHERE (ASIN(2))>(cast(1 as int));

-- returns 1
SELECT * FROM t1 WHERE (ASIN(2))>(1);

-- returns nothing
SELECT * FROM t1 WHERE (ASIN(2))>(t1.c0);

I think this issue is legit because it may represent more useful use cases where this goes wrong.

@manticore-projects
Copy link
Contributor

Interesting! ONLY the PRIMARY KEY constraint causes it. Without it, the test runs through as expected. You can use the following template for a quick simulation:

private void testRoundingNegative() throws SQLException {
        try (
            Connection conn = getConnection("functions"); 
            Statement stat = conn.createStatement();
        ) {
            stat.execute("CREATE TABLE t1(c0 INT, PRIMARY KEY(c0))");
            stat.execute("INSERT INTO t1 VALUES (1);");

            try (
                ResultSet rs = stat.executeQuery("SELECT * FROM t1 WHERE (ASIN(2))>(t1.c0);");
            ) {
                assertTrue(rs.next());
                assertEquals(1, rs.getInt(1));
            }
        }
    }

I assume, that Primary Key affects/changes the type of the column somehow. (I may be very wrong on this).

@manticore-projects
Copy link
Contributor

It gets more interesting since it depends how/when the PRIMARY KEY is created:

-- Works!
DROP TABLE IF EXISTS PUBLIC.T1;
CREATE TABLE PUBLIC.T1 ( 
    C0        INTEGER NULL
);
INSERT INTO PUBLIC.T1 VALUES (1);
SELECT * FROM t1 WHERE (ASIN(2))>(t1.c0);

-- Works!
DROP TABLE IF EXISTS PUBLIC.T1;
CREATE TABLE PUBLIC.T1 ( 
    C0        INTEGER NOT NULL
);
INSERT INTO PUBLIC.T1 VALUES (1);
ALTER TABLE PUBLIC.T1 ADD PRIMARY KEY (C0);
SELECT * FROM t1 WHERE (ASIN(2))>(t1.c0);

-- Fails!
DROP TABLE IF EXISTS PUBLIC.T1;
CREATE TABLE PUBLIC.T1 ( 
    C0        INTEGER PRIMARY KEY
);
INSERT INTO PUBLIC.T1 VALUES (1);
SELECT * FROM t1 WHERE (ASIN(2))>(t1.c0);

@katzyn
Copy link
Contributor

katzyn commented Jan 25, 2024

There are two different problems.

  1. These queries with ASIN(2) must throw an exception as required by the SQL Standard:

If V is less than –1 (negative one) or V is greater than 1 (one), then an exception condition
is raised: data exception — numeric value out of range.

  1. We need to check index conditions and comparisons between different data types when one of these types has non-finite values.

@manticore-projects
Copy link
Contributor

ASIN(2) returns NaN, so far so good I think.

I also understand, that comparison of value vs. comparison of index value may run through different code paths, fine.
But why does it matter, when/how the Index is created?

I have checked the column definition according to INFORMATION SCHEMA and it all looks similar (except NULLABLE).

@katzyn
Copy link
Contributor

katzyn commented Jan 25, 2024

It does matter, because different types of indexes will be used (if column has TINYINT, SMALLINT, INTEGER, or BIGINT data type).

@manticore-projects
Copy link
Contributor

manticore-projects commented Jan 25, 2024

Aha!!!!

org.h2.mvstore.db.MVDelegateIndex (failing) vs. org.h2.mvstore.db.MVSecondaryIndex (working)! I did not know about that but I will try to investigate, how those are used in Comparisons.

@manticore-projects
Copy link
Contributor

Maybe I understand what happens: in MVPrimaryIndex won't find anything for NaN

@Override
    public Cursor find(SessionLocal session, SearchRow first, SearchRow last) {
        long min = extractPKFromRow(first, Long.MIN_VALUE);
        long max = extractPKFromRow(last, Long.MAX_VALUE);
        return find(session, min, max);
    }

and so does not return a record. Further I assume the solution was to NOT use indexes when Error values are involved.
Alternatively, we could also throw an exception when comparing Error Values, e.g. NaN > 1 gives NaN (instead of true).

First of all someone needs to decide on he direction here please. I myself would rather fix the comparisons against Error Values.

@andreitokar
Copy link
Contributor

andreitokar commented Jan 26, 2024

@manticore-projects
The difference in you cases most likely comes from the fact that H2 table always has a primary index - MVPrimaryIndex. That is how table itself is stored. Keys are of type long and they are mapped to data rows. In most cases those keys are sequential row numbers. That would be the case if PRIMARY KEY is not specified at table creation at all or primary key is specified as multi-column, or even if it's single-column with non-integer type. For those cases MVSecondaryIndex would be created to represent SQL PRIMARY KEY construct. Values of that index (and of any other index, except MVPrimaryIndex) are keys from MVPrimaryIndex. The only special case is a single-column primary key of integer type, specified at table creation. Short cut is taken to avoid additional index creation, and instead of row sequence numbers, keys of that MVPrimaryIndex would be user-supplied values from the specified column. To make things look uniform between two cases, MVPrimaryIndex is wrapped into MVDelegateIndex.
I suspect that uniformity between two cases is broken somehow and when optimizer tries to use range retrieval from primary index to satisfy query condition it works in one case but not the other.

@manticore-projects
Copy link
Contributor

Thank you a lot for this explanation @andreitokar.
Usually I don't ask for documentation (because it is always outdated or misleading), but those concepts would be nice to written down somewhere?

Can we start with saving this snippet above and publish it to the website?

@andreitokar
Copy link
Contributor

What makes you to believe that my vague recollection can't be outdated or misleading? 😄
Here is your never outdated and rarely misleading source.

@manticore-projects
Copy link
Contributor

Thank you @andreitokar, I fully agree with you that only source is the reliable documentation.
However, for a big project with a long history there is also what I call "the Lore". It makes it much easier to understand what to look for in the source and I consider your lore very valuable and wonder if we can write it down somewhere in order to attract more contributors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants