[C++] Add an sql-compatible is_in variant #36420
Comments
take
Let me investigate, I will leave a message about my choice before I implement it
Thanks
I think adding a new Function named
Hi, @westonpace, I have implemented an
@westonpace essentially, the behaviour you propose is to have a variant of
Yes. That is correct. I encountered this when working on #34834 because Substrait translates SQL's
I have some doubt about this one case, in which the lookup set contains a null but the input value is non-null:
I think it is not so simple to say that SQL engines return a null result in this case. For example, DuckDB returns a different result in this case depending on whether you are using
I wonder whether we really need to add two new options, one to handle the semantics of nulls in the value set and one to handle the semantics of nulls in the lookup set.
P.S. to be clear: the behavior of DuckDB's
This is why I think maybe we need two new options.
Regarding my two comments above ☝️
### Rationale for this change

As #36420 says, we want to add an SQL-compatible `is_in` variant, which handles nulls differently. After a discussion with @ ianmcook and @ bkietz, we decided to support an enum option `null_matching_behavior` for SetLookup, which adds two new null-handling semantics for `is_in` and adds no new behavior for `index_in`. The enum option `null_matching_behavior` will replace `skip_nulls` in the future.

### What changes are included in this PR?

Add an enum parameter `null_matching_behavior` to SetLookupOptions.

### Are these changes tested?

Two kinds of tests are implemented:

- Replace the default parameter `skip_nulls` with `null_matching_behavior` in the `is_in` and `index_in` tests
- Add tests for `NullMatchingBehavior::EMIT_NULL` and `NullMatchingBehavior::INCONCLUSIVE` for `is_in`

Besides, since `skip_nulls` is not deprecated yet, I still preserve the old tests with `skip_nulls`. Once `skip_nulls` is fully deprecated, we can replace the test parameter `skip_nulls=false` with `null_matching_behavior=MATCH` and `skip_nulls=true` with `null_matching_behavior=SKIP` in these old tests.

### Are there any user-facing changes?

No. Backward compatibility is currently preserved. In the future, we plan to replace `skip_nulls` with `null_matching_behavior` completely.

* Closes: #36420

Lead-authored-by: Junming Chen <junming.chen.r@outlook.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
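
The relationship between the old `skip_nulls` flag and the new enum can be sketched in plain C++. This is a minimal model, not Arrow's implementation: the enum below is a stand-in that only reuses the names from the PR description, `FromSkipNulls` and `IsInOne` are hypothetical helpers, and the behavior shown for `EMIT_NULL` and `INCONCLUSIVE` is an assumption based on the discussion in this issue rather than something the PR text spells out.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Stand-in for the enum named in the PR description (not Arrow's declaration).
enum class NullMatchingBehavior { MATCH, SKIP, EMIT_NULL, INCONCLUSIVE };

// Mapping of the legacy flag onto the enum, as stated in the PR description:
// skip_nulls=false -> MATCH, skip_nulls=true -> SKIP.
NullMatchingBehavior FromSkipNulls(bool skip_nulls) {
  return skip_nulls ? NullMatchingBehavior::SKIP : NullMatchingBehavior::MATCH;
}

// Hypothetical scalar model of is_in. std::nullopt stands for null, both in
// the inputs and in the result.
std::optional<bool> IsInOne(const std::optional<int>& value,
                            const std::vector<std::optional<int>>& value_set,
                            NullMatchingBehavior behavior) {
  const bool set_has_null =
      std::any_of(value_set.begin(), value_set.end(),
                  [](const std::optional<int>& v) { return !v.has_value(); });

  if (!value.has_value()) {
    // Null input: the four modes differ here.
    if (behavior == NullMatchingBehavior::MATCH) return set_has_null;
    if (behavior == NullMatchingBehavior::SKIP) return false;
    return std::nullopt;  // EMIT_NULL and INCONCLUSIVE (assumed semantics)
  }

  const bool found =
      std::any_of(value_set.begin(), value_set.end(),
                  [&](const std::optional<int>& v) { return v.has_value() && *v == *value; });
  if (found) return true;

  // Assumed: only INCONCLUSIVE treats a null in the set as "might have matched".
  if (behavior == NullMatchingBehavior::INCONCLUSIVE && set_has_null) return std::nullopt;
  return false;
}
```

In this model, a non-null value that is absent from a set containing a null yields `false` under `EMIT_NULL` but null under `INCONCLUSIVE`, which is the case the thread debates.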
Describe the enhancement requested
This could be a new compute function or it could be new options for the existing `is_in` function. Currently, the way `is_in` treats `null` does not match SQL semantics, which makes it difficult to use when interoperability is expected.

The current behavior is summed up nicely in the `SetLookupOptions::skip_nulls` description:

Or, to put it another way, the behavior we currently get (regardless of `skip_nulls`) is:
This makes perfect sense for the "null as sentinel" understanding of null. However, SQL typically takes a "null is unknown" approach, in which case we would want the following behavior:
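
One way to make the "null is unknown" reading concrete is to expand `x IN (s0, s1, ...)` into `x = s0 OR x = s1 OR ...` and evaluate it with three-valued logic. The sketch below does that in plain C++. It is an illustration only: `Tri`, `SqlEquals`, `KleeneOr`, and `SqlIsIn` are hypothetical names, none of this is Arrow code, and individual SQL engines may deviate from it in some cases, as the comments on this issue note for DuckDB.

```cpp
#include <optional>
#include <vector>

// Three-valued boolean: std::nullopt stands for SQL's UNKNOWN (null).
using Tri = std::optional<bool>;

// SQL-style equality: comparing anything with null yields UNKNOWN.
Tri SqlEquals(const std::optional<int>& a, const std::optional<int>& b) {
  if (!a.has_value() || !b.has_value()) return std::nullopt;
  return *a == *b;
}

// Kleene OR: true wins over everything, then UNKNOWN wins over false.
Tri KleeneOr(Tri lhs, Tri rhs) {
  if ((lhs.has_value() && *lhs) || (rhs.has_value() && *rhs)) return true;
  if (!lhs.has_value() || !rhs.has_value()) return std::nullopt;
  return false;
}

// `x IN (set...)` evaluated as a chain of equality tests joined by Kleene OR.
Tri SqlIsIn(const std::optional<int>& x,
            const std::vector<std::optional<int>>& set) {
  Tri result = false;  // with no candidates, nothing can match
  for (const auto& s : set) {
    result = KleeneOr(result, SqlEquals(x, s));
  }
  return result;
}
```

Under this model a definite match is still `true` even when the set contains a null, a non-null value with no match is null only when the set contains a null, and a null input compared against a non-empty set is always null.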
Component(s)
C++