WIN1257_LV (Latvian) collation is wrong for 4 letters: A E I U. [CORE3131] #3508

Closed
firebird-issue-importer opened this issue Sep 9, 2010 · 13 comments

@firebird-issue-importer firebird-issue-importer commented Sep 9, 2010

Submitted by: Aleksey Timohin (tdelphi)

Attachments:
test_lv_script_utf8.sql
test_lv2.gbk

The Latvian alphabet contains accented variants of the letters A, E, I and U (among others). According to the rules of the alphabet, each accented letter should sort immediately after its unaccented counterpart, but it does not. Currently Firebird does not distinguish them when sorting, and our clients are unhappy with that.

Current behavior:
A and Ā, a and ā - no difference in sorting
E and Ē, e and ē - no difference in sorting
I and Ī, i and ī - no difference in sorting
U and Ū, u and ū - no difference in sorting

Currently it works as described here: http://www.collation-charts.org/firebird20/fb203.WIN1257.WIN1257_LV.html

Should be:
AĀ, aā
EĒ, eē
IĪ, iī
UŪ, uū

Latvian alphabet on Wikipedia: http://lv.wikipedia.org/wiki/Latvie%C5%A1u_alfab%C4%93ts
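The requested ordering can be illustrated with a small Python sketch. This is not Firebird's WIN1257_LV code: the `LATVIAN_ALPHABET` constant and the `latvian_key` helper are hypothetical names introduced here, and the sketch assumes one reading of the request, namely that each accented letter is ranked as a separate letter right after its base letter. Case is folded, so the upper/lower tie-breaking of a real collation is not modelled.

```python
# Illustrative sketch only: models the expected Latvian letter order,
# not Firebird's WIN1257_LV implementation.

# Latvian alphabet in dictionary order (hypothetical helper constant).
LATVIAN_ALPHABET = "aābcčdeēfgģhiījkķlļmnņoprsštuūvzž"

# Rank of each letter within the alphabet.
_RANK = {ch: i for i, ch in enumerate(LATVIAN_ALPHABET)}


def latvian_key(text: str) -> list:
    """Return a sort key that ranks each accented letter right after its base letter."""
    # Characters outside the alphabet sort after all known letters.
    return [_RANK.get(ch, len(LATVIAN_ALPHABET) + ord(ch)) for ch in text.lower()]


words = ["ūdens", "uguns", "īss", "izba", "ēka", "egle", "āda", "abi"]
print(sorted(words, key=latvian_key))
```

With this key, words starting with Ā, Ē, Ī and Ū land directly after those starting with A, E, I and U, whereas a plain code-point sort pushes every accented word to the end of the list.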

I can provide additional information and/or a test DB if you need it.

Can this be fixed in Firebird 2.5 Final? Or is there a way to fix it for older FB versions as well?

P.S. I tried to create a custom collation with ACCENT, but it doesn't work as expected.

Thank you in advance.

A script to reproduce the problem is attached. It creates a table with two fields: "TEXT" (Latvian text) and "SORTIROVKA" (a text field with the correct sort indexes). The script is saved in UTF-8 encoding.

To reproduce the problem, use this query:
select *
from TEST_LV_SORT tls
order by tls.text COLLATE WIN1257;

or

select *
from TEST_LV_SORT tls
order by tls.text COLLATE test_lv;

I also attached a backup file of the DB with the test data (same as in the script). The backup (gbak) was made with Firebird 2.5.

Commits: b70b571 945b928 57ecbe4

====== Test Details ======

Test data were taken from the script provided in this ticket.

@firebird-issue-importer firebird-issue-importer commented Sep 9, 2010

Commented by: Aleksey Timohin (tdelphi)

updated

@firebird-issue-importer firebird-issue-importer commented Sep 9, 2010

Modified by: Aleksey Timohin (tdelphi)

description: updated

@firebird-issue-importer firebird-issue-importer commented Sep 10, 2010

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-issue-importer firebird-issue-importer commented Sep 10, 2010

Modified by: @dyemanov

priority: Critical [ 2 ] => Major [ 3 ]

@firebird-issue-importer firebird-issue-importer commented Sep 10, 2010

Commented by: Aleksey Timohin (tdelphi)

added script and DB backup

@firebird-issue-importer firebird-issue-importer commented Sep 10, 2010

Modified by: Aleksey Timohin (tdelphi)

Attachment: test_lv_script_utf8.sql [ 11773 ]

Attachment: test_lv2.gbk [ 11774 ]

description: updated to include the reproduction script, the test queries, and the note about the attached backup

@firebird-issue-importer firebird-issue-importer commented Sep 11, 2010

Modified by: @dyemanov

status: Open [ 1 ] => In Progress [ 3 ]

@firebird-issue-importer firebird-issue-importer commented Sep 13, 2010

Modified by: @dyemanov

status: In Progress [ 3 ] => Open [ 1 ]

@firebird-issue-importer firebird-issue-importer commented Sep 13, 2010

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.1.4 [ 10361 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

Fix Version: 2.5.1 [ 10333 ]

@firebird-issue-importer firebird-issue-importer commented Sep 13, 2010

Commented by: @dyemanov

Please test the next (tomorrow's) snapshot build. Note that you'll have to recreate all indices existing for WIN1257_LV columns, or backup/restore the database.

@firebird-issue-importer firebird-issue-importer commented Sep 22, 2010

Commented by: Aleksey Timohin (tdelphi)

Tested (on 2.5.1). Works as described. Thank you.

But there is another issue:
A custom collation created with ACCENT INSENSITIVE still works the same way as an ACCENT SENSITIVE collation.
For example:

CREATE COLLATION my_lv2
FOR WIN1257
from win1257_lv
no pad
CASE INSENSITIVE
ACCENT INSENSITIVE;

select *
from TEST_LV_SORT tls
order by tls.text COLLATE my_lv2;

The query results are still ordered using the accent-sensitive rules.

This bug is neither urgent nor vital for us, but it exists.
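For reference, the behavior the reporter appears to expect from ACCENT INSENSITIVE (accented and unaccented letters comparing as equal) can be sketched in Python. This is only a general illustration of accent- and case-insensitive comparison using Unicode decomposition; it says nothing about how Firebird implements the attribute, and `fold_accents_and_case` is a hypothetical helper name.

```python
# Illustrative sketch only: shows what accent- and case-insensitive
# comparison means in general, not how Firebird implements it.
import unicodedata


def fold_accents_and_case(text: str) -> str:
    """Strip combining marks and case so that 'Ā', 'ā', 'A' and 'a' compare equal."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


print(fold_accents_and_case("Ā") == fold_accents_and_case("a"))
print(fold_accents_and_case("Ūdens") == fold_accents_and_case("udens"))
```

Under such a comparison, rows that differ only in accents or case would be treated as ties, which is the opposite of the accent-sensitive ordering observed with `my_lv2`.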

@firebird-issue-importer firebird-issue-importer commented Feb 4, 2011

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-issue-importer firebird-issue-importer commented May 27, 2015

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: Done successfully

Test Details: Test data were taken from the script provided in this ticket.
