Connect to database with collation/character other than UTF8 #46

Closed
kazzkiq opened this issue Aug 11, 2017 · 10 comments · Fixed by #100

Comments

@kazzkiq

kazzkiq commented Aug 11, 2017

In some cases, other collations are needed in order to accept emojis and/or non-Latin characters.

It seems that, as of today, this project does not provide a way to connect to a MySQL database with any option other than UTF8, which leads to database errors when trying to insert those "special" new characters.

With the wide use of smartphones, it's basically impossible to build an application that does not support emojis. They are used everywhere and are becoming more popular each year, so I believe the project should support some way to handle this.

@kazzkiq kazzkiq changed the title from "Connect to database with collation/character different other than UTF8" to "Connect to database with collation/character other than UTF8" Aug 11, 2017
@crisward crisward mentioned this issue Sep 23, 2017
@crisward
Contributor

There are some short codes sometimes used for emojis (:smile: = 😄), but I also think this would be good.

@metacortex

https://github.com/crystal-lang/crystal-mysql/blob/master/src/mysql/packets.cr#L75
Just replacing the value 0x21u8 with 0x2du8 (utf8mb4_general_ci) works.
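
For reference, the two byte values are MySQL collation IDs: 0x21 is 33 (utf8_general_ci) and 0x2d is 45 (utf8mb4_general_ci). A minimal Python sketch of that mapping follows. It is not the driver's Crystal code; the dictionary and the `charset_byte` helper are hypothetical names used only for illustration, and the table is a small subset of what the server reports.

```python
# A small subset of MySQL collation IDs, as listed by SHOW COLLATION.
COLLATION_IDS = {
    "latin1_swedish_ci": 8,
    "utf8_general_ci": 33,      # 0x21, the value discussed as hard-coded
    "utf8mb4_general_ci": 45,   # 0x2d, the suggested replacement
    "utf8mb4_bin": 46,
    "utf8mb4_unicode_ci": 224,
}


def charset_byte(collation: str) -> bytes:
    """Return the single byte for a collation name (hypothetical helper)."""
    return bytes([COLLATION_IDS[collation]])


if __name__ == "__main__":
    for name in ("utf8_general_ci", "utf8mb4_general_ci"):
        print(name, hex(charset_byte(name)[0]))
```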

@crisward
Contributor

crisward commented Oct 8, 2017

When a database is created, the 'Encoding' and 'Collation' are set. The driver should probably detect this and set it to be the same, instead of being hard coded. I'll have a quick look to see how this can be read.

@crisward
Contributor

crisward commented Oct 8, 2017

Something like this gives us the correct settings for the database.

SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME = 'db_name'

We then just need to convert the names to the correct bytes.

@crisward
Contributor

crisward commented Oct 8, 2017

The full list for the connected database can be found with SHOW COLLATION; the Id column just needs converting to a UInt8, based on the result from the above query.
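
A rough sketch of that lookup, written in Python rather than Crystal for brevity. It assumes the rows from SHOW COLLATION have already been fetched and reduced to (Collation, Charset, Id) triples; `collation_id_byte` and `sample_rows` are hypothetical names, and the range check is there because an Id above 255 would not fit in a UInt8.

```python
from typing import Iterable, Tuple


def collation_id_byte(rows: Iterable[Tuple[str, str, int]], collation_name: str) -> int:
    """Find the Id for collation_name and check that it fits in one byte.

    rows: (Collation, Charset, Id) triples taken from SHOW COLLATION.
    """
    for name, _charset, coll_id in rows:
        if name == collation_name:
            if not 0 <= coll_id <= 0xFF:
                raise ValueError(f"collation id {coll_id} does not fit in one byte")
            return coll_id
    raise KeyError(collation_name)


# Hand-written sample rows standing in for a real SHOW COLLATION result.
sample_rows = [
    ("utf8_general_ci", "utf8", 33),
    ("utf8mb4_general_ci", "utf8mb4", 45),
]

if __name__ == "__main__":
    print(collation_id_byte(sample_rows, "utf8mb4_general_ci"))
```

The collation name passed in would be the DEFAULT_COLLATION_NAME value returned by the INFORMATION_SCHEMA query.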

@crisward
Contributor

crisward commented Oct 8, 2017

It looks like this info may already be retrieved during the handshake - https://github.com/crystal-lang/crystal-mysql/blob/master/src/mysql/packets.cr#L20 - however, I've done some manual testing and the handshake seems to return 33 / utf8_general_ci even if the database is utf8mb4_general_ci.

@r3bo0t

r3bo0t commented Jul 8, 2018

@crisward @waj Is there any plan to support setting the database collation from the application (utf8_general_ci, utf8mb4_general_ci, or whatever one has set in the corresponding database)? As far as I can see, there is no closure on this issue.

@ysbaddaden

BTW: utf8 in MySQL is not full UTF-8 (it stores at most three bytes per character); the valid UTF-8 character set is utf8mb4, and maybe it should be the default.

https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434

@kazzkiq
Author

kazzkiq commented Mar 25, 2019

Do we have any updates on this?

@girng

girng commented Oct 3, 2019

@kazzkiq I don't think this is an issue with this repo; setting the collation is DB-specific: https://stackoverflow.com/questions/38949115/how-to-change-the-connection-collation-of-mysql
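
One way to do that per connection is MySQL's `SET NAMES ... COLLATE ...` statement, run right after connecting. A minimal Python sketch that only builds the statement text follows. The `set_names_statement` helper is a hypothetical name, and it assumes the client lets you execute an arbitrary statement on the new connection.

```python
import re

# Charset and collation names are plain identifiers; reject anything else
# so the value can be placed into the statement safely.
_IDENT = re.compile(r"[A-Za-z0-9_]+")


def set_names_statement(charset: str, collation: str) -> str:
    """Build a SET NAMES statement for the connection (hypothetical helper)."""
    for value in (charset, collation):
        if not _IDENT.fullmatch(value):
            raise ValueError(f"unexpected characters in {value!r}")
    return f"SET NAMES {charset} COLLATE {collation}"


if __name__ == "__main__":
    print(set_names_statement("utf8mb4", "utf8mb4_general_ci"))
```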

6 participants