New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistent and incorrect sorting of users in tables #319
Comments
By default, MySQL and PostgreSQL will sort according to the locale the database was created with. However, if you're using PostgreSQL and you install the pg_collkey extension, it will sort them according to the Unicode Collation Sequence. The easiest way to make sure you have the proper indices for pg_collkey use is to make sure pg_collkey is installed prior to running I would also recommend using Ruby 1.9 (which is now required for the latest version of Canvas), since it has better Unicode support. |
OK, can you explain then why it's sorted differently in various locations in Canvas? |
Just FYI, I'm talking about the canvas.instructure.com installation. |
Well, that's good to know. Canvas Cloud is on Ruby 1.9, Postgres, and uses pg_collkey. So the answer is probably that some places do not use pg_collkey for sorting in the database, and/or use Ruby, and the generic Ruby sort may not use Unicode collations. Would you say that the sorting of /users is "correct" from your perspective (sorting is done ABCD...Z as if Č, Ć = C, Đ = D, Š = S, Ž = Z). If so, you should point out specific cases where that is not done, and file a ticket at http://help.instructure.com/home (the github issues are primarily for people running open source canvas themselves; you are more likely to have a timely resolution on this issue by filing it there). |
No, both sortings are wrong. The Č, Ć = C, Đ = D, Š = S, Ž = Z is a bit less wrong than the other one IMHO, but still not what we want. Ideally, we should be able to configure locale somewhere, and get locale-dependant sorting. Thank you for your response, I will report a bug there. |
@ccutrer, have we looked at passing the current locale into collkey, rather than 'root'? the default collation, while pretty good, can't work for all languages, because some do it differently. for example, in spanish "ll" is treated as a single letter that follows "l" (las < los < llama). |
There are reasons to not do locale specific sorting. mainly because there is an index in the database based on the 'root' locale, and we can't create one of those for every possible locale. but after further investigation, that index is actually rarely used, because users are mostly already filtered by other conditions, so it's okay to use the params. Also, some (most?) locales actually don't have any rules that differ from the root locale (Croatian is one that does). So I've come up with a change that passes the user's actual locale through, but only if it differs from the root locale (meaning we get to use the index for the majority of users). |
I should note that I just checked groups, and as of the the 7/13 release, it will switch to sorting semi-correctly (using the DUCET sorting order, like /users). Hopefully by the 8/3 release it will be sorting completely correctly for Croatian. I'm still tracking down where the sorting happens for gradebook2 to make sure it gets fixed as well. If you see any others that sort non-ASCII characters at the end, let me know. |
@ccutrer Sounds good. If we can help with testing we would be glad to do it. |
not perfect, because the version that allows us to pass options like specific locale or case insensivity only exists in Chrome, but still much better than ASCII comparison even in older browsers refs instructuregh-319 test plan: * in a single course * name a student 'ñ' * name a student 'na' * name a student 'n' * in the latest version of Chrome * set your language to spanish on your profile page * go to gradebook2 * it should be sorted as 'n', 'na', 'ñ' Change-Id: I2b1f8e5f0071bfbaf8461697b4c663b06ca2f345 Reviewed-on: https://gerrit.instructure.com/21664 Tested-by: Jenkins <jenkins@instructure.com> Reviewed-by: Jon Jensen <jon@instructure.com> QA-Review: Jeremy Putnam <jeremyp@instructure.com> Product-Review: Cody Cutrer <cody@instructure.com>
refs instructuregh-319 if we have pg_collkey, use the current locale instead of 'root', but only if the current locale differs (so pg can still use the index as much as possible) test plan: * add three users to a course, named "n", "na", and "ñ" * set your language to English * the people page of the course should sort as "n", "ñ", "na" * set your language to Spanish * the people page of the course should sort as "n", "na", "ñ" Change-Id: Id01be92cb2dc103cc5f9e651da43cba6597bab22 Reviewed-on: https://gerrit.instructure.com/21648 Reviewed-by: Jacob Fugal <jacob@instructure.com> Tested-by: Jenkins <jenkins@instructure.com> QA-Review: Jeremy Putnam <jeremyp@instructure.com> Product-Review: Cody Cutrer <cody@instructure.com>
In Croatian alphabet we have special characters čćšđž. Canvas unfortunately doesn't do very well at sorting people alphabetically, namely:
There are probably other places; if it would help I could list all of them. This is quite a big issue if you have to frequently work with large lists of students because sorting is not predictable.
Ideally, Canvas should support sorting in locale-dependant way. At very least, it should sort in a consistent way everywhere.
I don't know RoR well enough to take a look at it myself, but I can help testing.
The text was updated successfully, but these errors were encountered: