Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COLLATE NOCASE #1172

Closed
mtissington opened this issue Oct 15, 2017 · 13 comments
Closed

COLLATE NOCASE #1172

mtissington opened this issue Oct 15, 2017 · 13 comments

Comments

@mtissington
Copy link

mtissington commented Oct 15, 2017

This is a UTF16 database

I have a very strange problem when using COLLATE NOCASE on a column.
It seems that when I do a SELECT the WHERE clause is only looking at the first letter.

CREATE TABLE words (
word TEXT COLLATE NOCASE NOT NULL CHECK(word <> ''),
);

INSERT INTO words (word) VALUE ('aw');

SELECT * FROM WORDS WHERE word = 'a'

returns ALL rows that being with 'a'

I might add that using sqlite3.exe produces the correct result.

@justinclift
Copy link
Member

Interesting. With the sqlite3.exe version that's working, which version is that? Not sure if it makes a difference, but it might be helpful. 😄

@mtissington
Copy link
Author

Its 3.20.1

@justinclift
Copy link
Member

justinclift commented Oct 15, 2017

Just tried it here with 3.10.1 (on Linux though), with the following SQL in a brand new (unsaved) database. The query correctly returned no rows:

PRAGMA encoding = "UTF-16";

CREATE TABLE words (
    word TEXT COLLATE NOCASE NOT NULL CHECK(word <> '')
);

INSERT INTO words (word) VALUES ('aw');

SELECT * FROM WORDS WHERE word = 'a';

Note that your SQL has two slight typos in it, which stopped it from running as-is when cut-n-pasted. 😉

  • word TEXT COLLATE NOCASE NOT NULL CHECK(word <> ''),
    • The comma on the end needed removing
  • INSERT INTO words (word) VALUE ('aw');
    • Should be VALUES not VALUE

They're both minor, so I guess you've adapted your original SQL manually. Not really sure what to look at, as it's working for me. Maybe someone else can try it on their Windows system and see if it mucks up?

@justinclift
Copy link
Member

Ahhh, it seems like my test database above is still UTF-8. Running:

PRAGMA encoding;

Returns UTF-8. I'll probably need to muck around with creating the database from the command line as UTF-16 first instead. 😄

@justinclift
Copy link
Member

Yep, when creating the database as UTF-16 using the sqlite3 CLI, the problem happens for me too when running the select in DB4S.

DB4S does display a warning for the database though, indicating there's likely to be trouble:

collation_needed_warning

@MKleusberg Shouldn't "NOCASE" work?

    https://sqlite.org/datatype3.html#collating_sequences

@justinclift
Copy link
Member

As a data point #411 is related. 😄

@mtissington
Copy link
Author

Good you see the problem ... need to get rid of that scary warning :)

MKleusberg added a commit that referenced this issue Oct 19, 2017
Fix our default collation which we install when an unknown collation is
requested to (more or less) support UTF16 strings as well. Before this
UTF16 strings wouldn't compare correctly because they might contain null
bytes which would lead to only the first (few) byte(s) getting compared.

See issue #1172.
MKleusberg added a commit that referenced this issue Oct 19, 2017
Don't ask the user whether to load a collation function if the requested
collation is already built into SQLite. Also, never override the
built-in functions.

See issue #1172.
@MKleusberg
Copy link
Member

Ok, this seems to be two issues at once 😄

As described in @justinclift's link, the NOCASE function is built into SQLite and we don't need to override that at all. I've fixed that which has two effects: 1) the warning is gone for NOCASE, BINARY, and RTRIM collations, and 2) the SELECT query posted by @mtissington is working now.

The other issue was in our own collation function which was invoked before. It didn't handle UTF16 correctly which led to the initial problem. Even though it doesn't matter for this issue any more, I've still fixed it too 😄

@mtissington Are you ok to download tomorrow's nightly build and see if it's working for you? 😄

@mtissington
Copy link
Author

great - thanks - happy to test.

@mtissington
Copy link
Author

I can confirm the issue is fixed - many thanks :)

@justinclift
Copy link
Member

Excellent, thanks for confirming it @mtissington. Should we close #411 too?

@MKleusberg
Copy link
Member

Same for #367. Can we close that, too?

@justinclift
Copy link
Member

Good point. Just closed that one too. 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants