New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add list of collations that are required to determine equality. #9757
Conversation
CreateCollationInfo::CreateCollationInfo(string name_p, ScalarFunction function_p, bool combinable_p, | ||
bool not_required_for_equality_p) | ||
: CreateInfo(CatalogType::COLLATION_ENTRY), function(std::move(function_p)), combinable(combinable_p), | ||
not_required_for_equality(not_required_for_equality_p) { | ||
this->name = std::move(name_p); | ||
if (equality_requires_collation.count(this->name) > 0) { | ||
this->not_required_for_equality = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fixed where the CreateCollationInfo
is created (IcuExtension::Load
in icu-extension.cpp
). I would also say let's set this to false for all ICU collations until we have more info on which collations require this
I hope I can add a comment here. The wrong results with collation "da", are also observed with other ICU collations (tried "de"). It's not due to a special collation. |
@@ -2,6 +2,8 @@ | |||
|
|||
namespace duckdb { | |||
|
|||
static unordered_set<string> const equality_requires_collation = {"da"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM
Merge pull request duckdb/duckdb#9772 from hawkfish/skiplist-rand Merge pull request duckdb/duckdb#9767 from samansmink/fix-vcpkg-patching-on-windows Merge pull request duckdb/duckdb#9757 from Tmonster/712-unicode-bugs Merge pull request duckdb/duckdb#9758 from samansmink/fix-issue-9727 Merge pull request duckdb/duckdb#9759 from samansmink/update-vcpkg-2023.10.19 Merge pull request duckdb/duckdb#9761 from samansmink/add-nightly-deploy-script
For some collations, we cannot do a normal byte comparison for equality. The collation information is needed instead.
"da" Is the only one I know of so far. @carlopi has suggested to write a script to check all collations and character equalities to see if there are other collations that are required to evaluate equality.