Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong result when searching on json attributes #1724

Open
Nick-S-2018 opened this issue Jan 12, 2024 · 9 comments
Open

Wrong result when searching on json attributes #1724

Nick-S-2018 opened this issue Jan 12, 2024 · 9 comments
Assignees
Labels

Comments

@Nick-S-2018
Copy link
Collaborator

Nick-S-2018 commented Jan 12, 2024

Describe the bug
Searching on a numeric JSON property produces wrong results.

To Reproduce
Steps to reproduce the behavior:

  1. mysql> create table t(f json);
  2. mysql > insert into t(id,f) values(1,'{"a":2.5}');
  3. mysql > select * from t where f.a <= 2;

Expected behavior
Empty result.

Actual behavior
+------+----------------+
| id | f |
+------+----------------+
| 1 | {"a":2.500000} |
+------+----------------+

Describe the environment:

  • Manticore 6.2.13 85d592c@240108 dev

Additional context
Search with strict comparison works fine. E.g., select * from t where f.a < 2; indeed produces empty result in the example above.
Providing a float value in the search query, e.g., select * from t where f.a <= 2.0;, produces correct results as well.

@sanikolaev
Copy link
Collaborator

@Nick-S-2018 As discussed on today's dev call, please check if type casting works in this case and how applicable it is to be used in the clients.

@sanikolaev
Copy link
Collaborator

double(f.a) doesn't work:

mysql> drop table if exists t; create table t(f json); insert into t(id,f) values(1,'{"a":2.5}'); select * from t where double(f.a) <= 2;
--------------
drop table if exists t
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
create table t(f json)
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
insert into t(id,f) values(1,'{"a":2.5}')
--------------

Query OK, 1 row affected (0.00 sec)

--------------
select * from t where double(f.a) <= 2
--------------

+------+----------------+
| id   | f              |
+------+----------------+
|    1 | {"a":2.500000} |
+------+----------------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 0ms ---

@tomatolog
Copy link
Contributor

seems the right form that works now is the

select * from t where f.a <= 2.0;
Empty set (0.00 sec)
--- 0 out of 0 results in 0ms ---

@Nick-S-2018
Copy link
Collaborator Author

Nick-S-2018 commented Jan 18, 2024

It looks like the correct float comparison can be guaranteed for int numbers with less than 7 digits; the limit is 2097152 (the first 22-bit number in binary format). For the numbers less than that conversion to float should work fine, like in the example above; for the larger numbers precision loss appears.

@sanikolaev
Copy link
Collaborator

It looks like the correct float comparison can be guaranteed for int numbers with less than 7 digits

Is this what you mean by the general theory of float comparison or you mean our specific case?

@sanikolaev
Copy link
Collaborator

mysql > select * from t where f.a <= 2;

Nick has an idea of converting N to N.0 in this case when N is in a specific range. @Nick-S-2018 pls provide more details.

@Nick-S-2018
Copy link
Collaborator Author

Nick-S-2018 commented Jan 22, 2024

Unfortunately, this idea guarantees only the precision for values with a certain number of digits. E.g., '+-2097152' range would fix the issue only for values with one digit after decimal point, like in the original example. The more digits after the point a stored value has, the lesser such range would became. So, this approach does not look effective.

@Nick-S-2018
Copy link
Collaborator Author

Elastic always converts the compared value to the data type of the compared field. In case of a JSON object attribute, its type is defined by the detected type of the first inserted value, unless it has been set explicitly on the index creation . By default, Elastic detects numbers with floating points as 'floats', so the compared numeric value will be converted to 'float'. However, if 'double' has been set as the attribute data type, the compared value will be treated as 'double' too.

@sanikolaev sanikolaev assigned glookka and unassigned Nick-S-2018 Jan 25, 2024
@sanikolaev
Copy link
Collaborator

As discussed, we'll implement support for where double(f.a) <= 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants