Description
Describe the bug
upper and lower should apply the Unicode case mappings instead of handling only ASCII letters. I believe this boils down to the implementation using string.to_ascii_lowercase() and string.to_ascii_uppercase() rather than string.to_lowercase() and string.to_uppercase().
If you use a Unicode LC_CTYPE in Postgres (not C), the corresponding calls respect the Unicode case mappings.
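For reference, the difference can be reproduced with the Rust standard library alone. This is a minimal sketch, not the DataFusion code: the function names ascii_upper, unicode_upper, ascii_lower and unicode_lower are hypothetical wrappers introduced only for illustration, and whether DataFusion calls the ASCII variants internally is my assumption from the output below.

```rust
// Hypothetical wrappers around the std `str` methods; these are NOT
// DataFusion functions, just a standalone illustration of the two behaviors.

/// ASCII-only uppercasing: bytes outside a-z are left untouched.
fn ascii_upper(s: &str) -> String {
    s.to_ascii_uppercase()
}

/// Unicode-aware uppercasing: uses the Unicode case mappings.
fn unicode_upper(s: &str) -> String {
    s.to_uppercase()
}

/// ASCII-only lowercasing: bytes outside A-Z are left untouched.
fn ascii_lower(s: &str) -> String {
    s.to_ascii_lowercase()
}

/// Unicode-aware lowercasing: uses the Unicode case mappings.
fn unicode_lower(s: &str) -> String {
    s.to_lowercase()
}
```

With the inputs from this report, ascii_upper("árvore ação αβγ") gives "áRVORE AçãO αβγ" (the output currently observed), while unicode_upper gives "ÁRVORE AÇÃO ΑΒΓ". Note that the Unicode variants can change the byte length of the string, so any code that assumes the output has the same length as the input would need to be checked.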
To Reproduce
❯ select upper('árvore ação αβγ');
+--------------------------------+
| upper(Utf8("árvore ação αβγ")) |
+--------------------------------+
| áRVORE AçãO αβγ |
+--------------------------------+
❯ select lower('ÁRVORE AÇÃO ΑΒΓ');
+--------------------------------+
| lower(Utf8("ÁRVORE AÇÃO ΑΒΓ")) |
+--------------------------------+
| Árvore aÇÃo ΑΒΓ |
+--------------------------------+
Expected behavior
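upper and lower should follow the Unicode case mappings: upper('árvore ação αβγ') should return ÁRVORE AÇÃO ΑΒΓ, and lower('ÁRVORE AÇÃO ΑΒΓ') should return árvore ação αβγ.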
Additional context
No response