Fix IEEE & bool special values SELECT , empty as non-text NULL , add message for datatype error #57

Closed
wants to merge 18 commits into from

Conversation

mkgrgis
Contributor

@mkgrgis mkgrgis commented Feb 2, 2023

If a `real` column of a non-STRICT SQLite table contains text-affinity strings that PostgreSQL accepts as float input, such as the case-insensitive NaN, +Infinity, +INF, infinity, -INF etc., they will be converted to PostgreSQL IEEE special values.
Note: there are no special IEEE values in SQLite at all; we just treat the SQLite text as normal input for PostgreSQL float-family values, matching the psql output for these special values.
Also added: selecting SQLite text-affinity values into PostgreSQL bool columns, as in the firebird_fdw implementation.
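
A minimal sketch in C of the case-insensitive matching described above. The names `special_real_from_text` and `equals_ignore_case` are hypothetical and are not taken from the sqlite_fdw source. The accepted spellings (`NaN`, and `Infinity` or `inf` with an optional sign) are an assumption modeled on what PostgreSQL accepts as float input.

```c
#include <ctype.h>
#include <math.h>
#include <stdbool.h>

/* Compare two NUL-terminated ASCII strings, ignoring character case. */
static bool
equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical helper (assumption, not sqlite_fdw code): map a text value
 * read from SQLite to an IEEE 754 special value. Returns true and stores
 * the value in *out for a recognised spelling, false for anything else.
 */
bool
special_real_from_text(const char *text, double *out)
{
    const char *p = text;
    bool        negative = false;

    if (equals_ignore_case(p, "nan"))
    {
        *out = NAN;
        return true;
    }
    /* An optional sign is only allowed in front of the infinities. */
    if (*p == '+' || *p == '-')
    {
        negative = (*p == '-');
        p++;
    }
    if (equals_ignore_case(p, "infinity") || equals_ignore_case(p, "inf"))
    {
        *out = negative ? -INFINITY : INFINITY;
        return true;
    }
    return false;
}
```

Any other text, for example `2.4` or an arbitrary string, is reported as not special, so the caller can fall back to its normal conversion or error path.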

@mkgrgis mkgrgis changed the title Add diagnostics for datatype error Fix IEEE special values, add diagnostics for datatype error Feb 2, 2023
@mkgrgis mkgrgis mentioned this pull request Feb 2, 2023
…t for PostgreSQL

All character case combinations are valid for special values, e.g. NaN, Nan, NAN, nan etc. There is a calculation optimisation for mass SELECT.
@mkgrgis mkgrgis changed the title Fix IEEE special values, add diagnostics for datatype error Fix IEEE & bool special values, add diagn. for datatype error Feb 3, 2023
@mkgrgis mkgrgis changed the title Fix IEEE & bool special values, add diagn. for datatype error Fix IEEE & bool special values SELECT, add message for datatype error Feb 3, 2023
@t-kataym
Contributor

t-kataym commented Feb 3, 2023

Thank you for creating the pull request.

  • Could you create new option for foreign server and foreign table?
    If the option is enabled, sqlite_fdw supports the treatment of special values.
    There is no concept of such special values in SQLite. So by default, sqlite_fdw should not implicitly change the meaning of data in SQLite.
    But I think there are people who need this, so I would like the behavior to be switchable.
  • Could you consider data conversion not only "from SQLite to PostgreSQL" but also "from PostgreSQL to SQLite"?
    For example, when inserting data, updating data and checking a pushdown of WHERE clause, the conversion is necessary.
    • INSERT INTO foreign_table VALUES ('Infinity');
    • UPDATE foreign_table SET col = 'Infinity';
    • SELECT * FROM foreign_table WHERE col > '-Infinity';
  • Could you create test cases also?

@mkgrgis
Contributor Author

mkgrgis commented Feb 3, 2023

  • Could you create new option for foreign server and foreign table?

It's still hard for me. I can locate to

but I don't understand all the changes needed beyond validation of this new option.
I also have a feature request for a new readonly option for foreign server, and pull request #59.
I think all of these options will be near
if (strcmp(def->defname, "truncatable") == 0 ||

I propose the names implicit_bool_type and special_real_values for the new options. Are these names good?

If the option is enabled, sqlite_fdw supports the treatment of special values.

Good proposal, I have no objection. However, the treatment of special values sits in the code branch that handles value conversion errors caused by a datatype difference.

There is no concept of such special values in SQLite.

Yes. These are only popular usage patterns influenced by PostgreSQL or Oracle concepts, especially the Y/N pseudo-boolean. All special values come only from PostgreSQL input behaviour, not from the SQLite documentation.
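
To illustrate the pseudo-boolean text values mentioned here, a small C sketch follows. The function name `bool_from_text` and the word lists are assumptions modeled on the literals PostgreSQL accepts for boolean input. Unique prefixes are not handled, and this is not the sqlite_fdw implementation.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Compare two NUL-terminated ASCII strings, ignoring character case. */
static bool
equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical helper (assumption): interpret a SQLite text value as a
 * boolean. Returns 1 for true, 0 for false, -1 when the text is not a
 * recognised boolean word.
 */
int
bool_from_text(const char *text)
{
    static const char *const true_words[] = {"t", "true", "y", "yes", "on", "1"};
    static const char *const false_words[] = {"f", "false", "n", "no", "off", "0"};
    size_t      i;

    for (i = 0; i < sizeof(true_words) / sizeof(true_words[0]); i++)
        if (equals_ignore_case(text, true_words[i]))
            return 1;
    for (i = 0; i < sizeof(false_words) / sizeof(false_words[0]); i++)
        if (equals_ignore_case(text, false_words[i]))
            return 0;
    return -1;
}
```

Returning a separate "unrecognised" code keeps arbitrary text from being silently mapped to false, so the caller can raise a datatype error instead.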

  • Could you consider data conversion not only "from SQLite to PostgreSQL" but also "from PostgreSQL to SQLite"?

Yes, I have made some tests. The results are described in c56c816.
For real data, PostgreSQL always converts special values to the exact, case-sensitive literals NaN, -Infinity or Infinity in WHERE and other conditions.
For bool data, sqlite_fdw always converts WHERE c_column and WHERE NOT c_column to c_column and the inverse condition, respectively.

Tests.
SQLite

CREATE TABLE bool (
	с1 INTEGER,
	с2 TEXT
);
CREATE TABLE "Float" ( n real);
CREATE TABLE "Float2" ( n real);

INSERT INTO bool (с1, с2) VALUES (1,'Y');
INSERT INTO bool (с1, с2) VALUES (0,'No');
INSERT INTO bool (с1, с2) VALUES (1,'Yes');
INSERT INTO bool (с1, с2) VALUES (-1,NULL);
INSERT INTO bool (с1, с2) VALUES (2,'true');
INSERT INTO bool (с1, с2) VALUES (1,'false');
INSERT INTO bool (с1, с2) VALUES (10,'ывы');
INSERT INTO bool (с1, с2) VALUES (0,'TRUE');
INSERT INTO bool (с1, с2) VALUES (1,'FALSE');
INSERT INTO "Float" (n) VALUES ('NaN');
INSERT INTO "Float" (n) VALUES (2.4);
INSERT INTO "Float" (n) VALUES (5.0);
INSERT INTO "Float" (n) VALUES (0.0);
INSERT INTO "Float" (n) VALUES ('-infinity');
INSERT INTO "Float" (n) VALUES ('+infinity');
INSERT INTO "Float" (n) VALUES ('-Infinity');
INSERT INTO "Float" (n) VALUES ('+Infinity');
INSERT INTO "Float" (n) VALUES ('nan');
INSERT INTO "Float" (n) VALUES ('naN');
INSERT INTO "Float" (n) VALUES ('-inF');
INSERT INTO "Float" (n) VALUES ('+Inf');
INSERT INTO "Float" (n) VALUES ('+INF');

PostgreSQL

CREATE FOREIGN TABLE public."Float" (
    n float8 NULL
)
SERVER sqlite_server;
CREATE FOREIGN TABLE public."Float2" (
    n float8 NULL
)
SERVER sqlite_server;

SELECT * FROM "Float" WHERE "n"= double precision '-INF' is converted to SELECT n FROM main."Float" WHERE ((n = '-Infinity')) (this example is not about my PR; this behaviour already exists). The result is only 1 row.
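
The reason only one row matches can be sketched in C: once PostgreSQL has parsed '-INF' into a float8 constant, the original spelling is gone and only a canonical literal can be written into the remote query. The function name `canonical_special_literal` is hypothetical; the sketch only mirrors the three spellings seen in the thread and is not the sqlite_fdw deparsing code.

```c
#include <math.h>
#include <stddef.h>

/*
 * Hypothetical helper (assumption): return the canonical text literal
 * for an IEEE 754 special value, or NULL for an ordinary finite number.
 */
const char *
canonical_special_literal(double value)
{
    if (isnan(value))
        return "NaN";
    if (isinf(value))
        return value > 0 ? "Infinity" : "-Infinity";
    return NULL;
}
```

A SQLite row that stores the text `-inF` or `-infinity` therefore does not equal the pushed-down literal `'-Infinity'`, because the remote comparison is a plain text comparison.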

SELECT * FROM "bool";
SELECT * FROM "bool" WHERE "с1";
SELECT * FROM "bool" WHERE NOT "с1";
SELECT * FROM "bool" WHERE "с1" IS NULL;
SELECT * FROM "bool" WHERE "с1" IS NOT NULL;
SELECT * FROM "bool" WHERE "с2" IS NULL;
SELECT * FROM "bool" WHERE "с2" IS NOT NULL;

are normal queries (c1 looks like an integer), but

SELECT * FROM "bool" WHERE "с2";
SELECT * FROM "bool" WHERE NOT "с2";

are not normal.

For example, when inserting data, updating data and checking a pushdown of WHERE clause, the conversion is necessary.

  • INSERT INTO foreign_table VALUES ('Infinity');
  • UPDATE foreign_table SET col = 'Infinity';
INSERT INTO "Float2" VALUES ('Infinity'::float8);
INSERT INTO "Float" VALUES ('Infinity'::float8);

Both work without problems on a non-STRICT table.

  • SELECT * FROM foreign_table WHERE col > '-Infinity';

SELECT * FROM "Float" WHERE n > '-Infinity' is converted to SELECT `n` FROM main."Float" WHERE ((`n` > '-Infinity'))

     n     
-----------
       NaN
 -Infinity
       NaN
       NaN
 -Infinity

SELECT * FROM "Float" WHERE n < 'Infinity' is converted to SELECT n FROM main."Float" WHERE ((n < 'Infinity'))

     n     
-----------
         2
         0
         5
 -Infinity
  Infinity
 -Infinity
  Infinity
 -Infinity
  Infinity
  Infinity

During INSERT, NaN is stored in SQLite as NULL, while Infinity is stored as Inf and -Infinity as -Inf.

select * from Float2;
0.0
5.0

1.3333
Inf
-Inf

DBeaver for the same data
(image)

Original data in PostgreSQL

INSERT INTO public."Float0" (n) VALUES (0.0);
INSERT INTO public."Float0" (n) VALUES (5.0);
INSERT INTO public."Float0" (n) VALUES ('NaN');
INSERT INTO public."Float0" (n) VALUES (1.3333);
INSERT INTO public."Float0" (n) VALUES ('Infinity');
INSERT INTO public."Float0" (n) VALUES ('-Infinity');
INSERT INTO public."Float0" (n) VALUES (NULL);
INSERT INTO "Float2" SELECT * FROM "Float0";

The same in DBeaver
(image)

  • Could you create test cases also?

I don't know exactly how it works. I have only seen some SQL files like

-- special inputs
.

@mkgrgis
Contributor Author

mkgrgis commented Feb 4, 2023

@t-kataym, what about my work on the server options (as a first step) in #59? Is it acceptable if it works as tested?

@mkgrgis
Contributor Author

mkgrgis commented Feb 9, 2023

@t-kataym, this branch was updated. New options of foreign server, foreign table and column for IEEE values and special boolean literals were added.

@mkgrgis mkgrgis changed the title Fix IEEE & bool special values SELECT, add message for datatype error Fix IEEE & bool special values SELECT , empty as non-text NULL add message for datatype error Mar 3, 2023
First without pushdown in `WHERE`, only for `SELECT`.
@mkgrgis mkgrgis changed the title Fix IEEE & bool special values SELECT , empty as non-text NULL add message for datatype error Fix IEEE & bool special values SELECT , empty as non-text NULL , add message for datatype error Mar 3, 2023
@t-kataym
Contributor

t-kataym commented Mar 3, 2023

@mkgrgis I have a concern.
For example, SQL for PostgreSQL is ‘SELECT col1 FROM tbl WHERE col1 < 1;’.
If SQLite FDW pushes down the WHERE condition, the remote query is ‘SELECT col1 FROM tbl WHERE col1 < 1;’. If a record in SQLite is ‘-Infinity’, the condition is false on SQLite unlike PostgreSQL. The result will be incorrect.
It means that SQLite FDW cannot push down a WHERE condition if it contains a column of numeric type. By not pushing it down, the result can be correct but the performance will not be good.

The above is just an example. I think you had better consider the feasibility and specification carefully before the implementation.
This comment also applies to #66 if trying to support it.

@mkgrgis
Contributor Author

mkgrgis commented Mar 4, 2023

Yes, @t-kataym, this concern already exists today, regardless of the new options discussed here, when pushing down something like SELECT col1 FROM tbl WHERE col1 < 1; to "invisible" SQLite tables (invisible until a SELECT fails with an error message). But only non-STRICT tables are affected.

... WHERE col1 < 1;’. If a record in SQLite is ‘-Infinity’, the condition is false on SQLite unlike PostgreSQL. The result will be incorrect.

Yes. This is noted in

- SQLite does not support special values for IEEE 754-2008 numbers such as `NaN`, `+Infinity` and `-Infinity` in SQL expressions with numeric context. SQLite also cannot store these values with `real` [affinity](https://www.sqlite.org/datatype3.html). Unlike SQLite, PostgreSQL can store special values in columns of the `real` datatype family, such as `float` or `double precision`, and applies arithmetic comparison to these values. SQLite, in contrast, stores `NaN`, `+Infinity` and `-Infinity` as text values. Conditions with special literals (such as `n < '+Infinity'` or `m > '-Infinity'`) are not numeric conditions in SQLite and give unexpected results after pushdown, compared with internal PostgreSQL calculations. During `INSERT INTO ... SELECT` or in `WHERE` conditions, `sqlite_fdw` uses the standard case-sensitive literals given by PostgreSQL **only** in the following forms: `NaN`, `-Infinity`, `Infinity`, not the original strings from the `WHERE` condition. *This can cause selection issues*.

It means that SQLite FDW cannot push down a WHERE condition if it contains a column of numeric type. By not pushing it down, the result can be correct but the performance will not be good.

Yes, we face a hard choice here. Must we always treat any data in PostgreSQL foreign tables as data with SQL2016 behaviour? This sounds tempting, but many querying models for foreign data sources do not have SQL2016 behaviour around the points we discussed. Where is the boundary between PostgreSQL SQL2016 and SQLite data behaviour in the C-language code of our FDW? For many years the authors of SQLite encouraged non-strict data insertion, and we now have gigabyte-sized SQLite tables with de facto non-IEEE 754 Infinity in real columns, and empty strings instead of NULL in int columns, written by JavaScript like https://sql.js.org/examples/GUI/ or by bash.
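
The "empty string instead of NULL in int columns" case from the paragraph above, which the PR title calls "empty as non-text NULL", can be sketched in C as follows. The names `TargetKind` and `treat_as_null` are hypothetical, and the rule itself is an assumption about the intended behaviour rather than the code in this PR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Whether the PostgreSQL column receiving the value is a text type. */
typedef enum { TARGET_TEXT, TARGET_NON_TEXT } TargetKind;

/*
 * Hypothetical rule (assumption): a missing value is always NULL, and an
 * empty string is NULL only when the target column is not a text type.
 */
bool
treat_as_null(const char *sqlite_text, TargetKind target)
{
    if (sqlite_text == NULL)
        return true;
    return target == TARGET_NON_TEXT && sqlite_text[0] == '\0';
}
```

An empty string stays a valid value for a text column, while for an int or float column it is read as NULL rather than raising a conversion error.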

What are the use cases behind the choice between performance and SQL2016 behaviour?

  1. @Soni-Harriz and I need safe and transparent data extraction with some slight, obvious deviations from SQL2016.
  2. @de-sgen needs to edit or insert data from PostgreSQL into SQLite. There are no problems here except for special IEEE 754 values, which are not supported in STRICT tables.
  3. Most sqlite_fdw users need to query SQLite with pushdown. I think we should not decrease performance for the sake of stricter SQL2016 behaviour. There is an interesting question: where will the boundary between PostgreSQL/SQL2016 and SQLite data behaviour lie? Currently this boundary is undocumented. Generally we have SQL2016 data behaviour before pushdown, and SQLite data behaviour only after pushdown. The sharpest contrast is the two kinds of JSON behaviour in SQLite and in PostgreSQL, especially for the -> and ->> operators.

The above is just an example. I think you had better consider the feasibility and specification carefully before the implementation.

Yes, but I think the main answer about the boundary between PostgreSQL/SQL2016 and SQLite data behaviour inside the FDW's C code should come from the sqlite_fdw architect. I think that is you, @t-kataym.

@mkgrgis
Contributor Author

mkgrgis commented Mar 6, 2023

@t-kataym, I have made a table about datatype behaviour in sqlite_fdw: mkgrgis@d100e45. Let's review all the problem areas discussed here using this table.

@mkgrgis
Contributor Author

mkgrgis commented May 1, 2023

No longer relevant because of #71

@mkgrgis mkgrgis closed this May 1, 2023
@mkgrgis mkgrgis deleted the pg-real-values branch May 1, 2023 17:18