New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not use FieldExistsQuery over REF that is arg of NON_NULLABLE function #15280
Conversation
ae82263
to
89b1cf2
Compare
Hi @mkleen could you also have a quick look at this draft? Thank you |
11cbef4
to
4651e2f
Compare
I think this cannot be easily back-ported since this depends on 4428eb1. |
4651e2f
to
45243e8
Compare
7dfba09
to
45243e8
Compare
return null; | ||
} | ||
} | ||
for (Symbol arg : function.arguments()) { | ||
arg.accept(this, context); | ||
arg.accept(this, NullabilityContext.of(context)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we pass one context through? Why do we need to clone it here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just pass the same context and it will be the same result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we pass one context through? Why do we need to clone it here ?
Thanks for reviewing. It prevents any context.isNullable = false
from left arg to be carried over to right arg ultimately achieving A field is nullable if the field is nullable and all its parent functions are nullable.
references.add(symbol); | ||
private HashSet<Reference> nullableReferences = new HashSet<>(); | ||
private boolean isNullable = true; | ||
private AtomicBoolean enforceThreeValuedLogic = new AtomicBoolean(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need an atomic boolean here, because this will never be called by multiple threads simultaneously. Also NullabilityContext
is newly instantiated on each pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike isNullable
, a single instance of enforceThreeValuedLogic
is purposely shared across all instances of NullabilityContext
. I will try to attach better comments.
45243e8
to
a267e9e
Compare
// saves and restores isNullable of the current context | ||
// such that any non-nullables observed from the left arg is not transferred to the right arg. | ||
boolean isNullable = context.isNullable; | ||
for (Symbol arg : function.arguments()) { | ||
arg.accept(this, context); | ||
context.isNullable = isNullable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for reviewing @mkleen . I think the static method of
was the cause of the confusion. Please check again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thank you!
…nctions cr> create table t (a int, b int); CREATE OK, 1 row affected (0.210 sec) cr> insert into t values (null, null), (null, 2), (2, null), (2, 2); INSERT OK, 4 rows affected (0.032 sec) cr> select * from t where a != (b||1); +---+---+ | a | b | +---+---+ | 2 | 2 | --> missing a row, see below +---+---+ SELECT 1 row in set (0.022 sec) cr> select a, b, a != (b||1) from t; +------+------+----------------------------+ | a | b | (NOT (a = concat(b, '1'))) | +------+------+----------------------------+ | NULL | 2 | NULL | | NULL | NULL | NULL | | 2 | NULL | TRUE | --> this row is missing from the result of the query above | 2 | 2 | TRUE | because of an unwanted FieldExistsQuery over field 'b' +------+------+----------------------------+ SELECT 4 rows in set (0.023 sec)
a267e9e
to
eadaded
Compare
Summary of the changes / Why this improves CrateDB
Fixes #15265 AND an undocumented bug:
the undocumented bug is not a regression from
master
, fails on5.5.0
as well.Current
NullabilityVisitor
will wrap all or none of the refs under anot
operator withFieldsExistsQuery
, and in this case, botha
andb
froma != (b||1)
are wrapped withFieldsExistsQuery
(considering both as nullable) causing extra rows to be filtered.What is missing from the current
NullabilityVisitor
is a way to identify a ref(leaf) is under aNON_NULLABLE
function. For instance,a != (b||1)
, b is implicitlyNON_NULLABLE
since it is under||
buta
isNULLABLE
.Similarly,
NOT(NOT(c IS NULL))
from #15265 fails becauseNullabilityVisitor
cannot identify the innerNOT(c IS NULL)
isNON_NULLABLE
.To describe the change in terms of the commit message of 4428eb1,
This improves the three-valued-logic (3vl) processing by grouping all
scalar functions into the three categories:
Nullable means the function returns a null value for a null argument.
Non-nullable means the function returns never a null values for null
arguments.
Conditional means it follows it's own logic. By default each scalar
function is conditional.
Based on these groups the generation of the Lucene queries for the
NotPredicate can be optimized in the following way:
If all involved scalar functions are nullable, no 3vl is requiredand the query will be negated and null values can be excluded by
using a field exist query.
If all involved scalar functions are non-nullable or nullable,no 3vl is required and the query can be simply negated.
If all involved scalar functions are non-nullable or nullable,
no 3vl is required. The query will be negated and null values
will be excluded from 'nullable' fields by field exist query. A field
is nullable if the field is nullable and all its parent functions
are nullable.
If any of the scalar function is conditional, 3vl is required,
the query will be negated and further processed by converting
the relevanting scalar function to generic function filters.
Checklist
docs/appendices/release-notes/<x.y.0>.rst
for user facing changessql_features
table for user facing changesdocs/appendices/release-notes/<x.y.0>.rst
(E.g. AdminUI)