feat: Support unicode strings in search #1698
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Looks good!
looks good 👨🍳
Run({"hset", "d:1", "title", "Веселая СТРЕКОЗА Иван", "visits", "400"});
Run({"hset", "d:2", "title", "Die fröhliche Libelle Günther", "visits", "300"});
Run({"hset", "d:3", "title", "השפירית המהירה יעקב", "visits", "200"});
Run({"hset", "d:4", "title", "πανίσχυρη ΛΙΒΕΛΛΟΎΛΗ Δίας", "visits", "100"});
Greek! It means "strong dragonfly Zeus" 🤣 Did you Google Translate this? (Loved ALL the easter eggs 👀)
Yes, I was just mixing an adjective + dragonfly + some stereotypical name 🙂 It sounds like children's stories.
EXPECT_EQ(Run({"ft.create", "i1", "schema", "title", "text", "visits", "numeric"}), "OK");

// Explicitly using screaming uppercase to check utf-8 to lowercase functionality
Run({"hset", "d:1", "title", "Веселая СТРЕКОЗА Иван", "visits", "400"});
"Merry DRAGONFLY Ivan" 🤣
Signed-off-by: Vladislav <vlad@dragonflydb.io>
This PR adds support for Unicode strings in search. Text is now split on word boundaries using the ICU library, so Unicode sentences should be tokenized correctly. It also adds tests with simple sentences in four languages (Russian, German, Hebrew, and Greek).
Adds a dependency on libicu-dev.