New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: Alphabetical ordering is not consistent between DBs #6104
Comments
May not help but at least for postgres it sounds like the default collation is OS-dependent? http://dba.stackexchange.com/questions/106964/why-is-my-postgresql-order-by-case-insensitive |
One workaround I've seen before is to add a virtual column to the select (e.g. I've seen the One downside is that postgres from what I can tell will still use the OS' collation so case-sensitive ordering will still be |
Postgres has an extension called Benefits:
Limitations:
http://www.postgresql.org/docs/9.4/static/citext.html I have less experience with MySQL but it appears that nonbinary strings are case insensitive and binary strings are case sensitive. So ensuring the nonbinary string types are used would give the desired behavior. http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html |
This is an interesting idea, Bookshelf has some support for calculated columns. The trick would be only adding the column when an @hwhelchel Do you have any idea how to turn on that postgres extension within our bookshelf/knex/pg layers? Converting collations in MySQL is something we need to do in order to fix #5945, however this is a pretty hefty task. Your suggestions sound like the 'right' way to do things but are possibly going to be hard to manage because Ghost is distributed amongst so many different environments, databases and database versions. |
@ErisDS this would be done at the knex layer using the Hmm yes postgres citext would works in 9.1+. I'll have to take a look at the Ghost db layer to understand more what the challenges are. You might also find this issue on knex with multiple databases relevant: knex/knex#778 |
I'm closing all OAuth and most API issues temporarily with the RE: OAuth, for the next 2-3 months we'll be implementing an official Ghost OAuth login system, providing global access to all Ghost blogs with a single login. We'll be opening issues around this system soon, and I don't want to cause confusion with OAuth for the API. JSON API Overhaul & OAuth access are currently scheduled next on the roadmap |
As discussed in #10371 (comment), we will try the approach with overriding standard Bookshelf/knex |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This problem still exists and should not have been closed by stale-bot. |
Closing as this hasn't really impacted anyone. |
Take the following strings: 'photo', 'Audio', 'Video'.
Note that the first one is not capitalised, but the other two are.
When ordering by names in SQL this can be done in either a case sensitive or case insensitive way:
Case sensitive order is:
This is used by MySQL and Postgres by default
Case insensitive order is:
This is used by Sqlite3 by default
I believe that the latter is the desired output in most cases, and that case sensitive ordering is not the expected outcome. I don't have much to base this on, other than common sense - the case of a letter doesn't really change what letter that is, so if I'm searching for something in an alphabetical list, I look for A and a in the same place.
This is slightly problematic for two reasons:
order by LOWER(name) ASC
. We could do this by default for all orders, or figure out which orders are strings and need to be made case insensitive. However, if we do this by default, there's no obvious/well understood way to turn it back off.Solutions
I don't have a clear idea what the best way to solve this is, but I have a couple of ideas:
Idea 1: make sqlite3 match mysql & postgres
By default force 'COLLATE NOCASE' for SQLite so that all 3 dbs behave the same and use case sensitive ordering. All resources have a
slug
as well as a name, which could be used for ordering by something case insensitive. Not ideal, but simple-ish.Long term, we can support adding
lower
andupper
functions to our order syntax.Idea 2: sensible defaults are best
By default, force case insensitive matches because this is what makes sense to anyone who isn't a computer scientist 😁. If almost noone wants case sensitive alphabetical order, then it makes more sense to make it harder to do that, rather than making the sensible thing hard.
So, if all of our queries go off with
LOWER
automatically wrapped around attributes which are known to be test/strings, we need to create a way for a user to say 'don't do case insensitive matches'.A couple of different ideas for that are:
Assuming we add support for all the mysql functions that could be used in an order statement, and that specifying one would override our default application of
LOWER()
then we can provide for using a mysql function which doesn't change the case.Unfortunately there is no MySQL function which means 'original case', but using a no-op function like
TRIM()
would work because it would get used instead ofLOWER()
and would not change the string. That's a bit unintuitive though, so we could also just make up our own MySQL functionNOCASE()
or whatever we want to call it, and use that to mean "don't apply lower but don't apply this either cos it's nonsense :)".Something like
order="^name ASC"
ororder="!name ASC"
could be used to indicate that case sensitivity should be used.order="name ASC"
- does a case insensitive order using the LOWER() functionorder="NAME ASC"
- does a case insensitive order using the UPPER() functionorder="Name ASC"
- does a case sensitive order without applying a functionI'd love to hear more ideas
Conclusion
I am very much leaning towards 'sensible defaults are best' I think doing a lower case ordering is the only thing that actually makes any sense, which means figuring out how to turn it off is all we're left with.
I'm genuinely not sure what the best mechanism for specifying turning it off would be. Using the case of the property seems elegant but a bit side-effect-y.
The text was updated successfully, but these errors were encountered: