CyMySQL cannot detect encoding of database #29

Closed
roniemartinez opened this issue Jul 18, 2019 · 7 comments

@roniemartinez

roniemartinez commented Jul 18, 2019

I recently hit a failure when inserting Unicode characters into a table, even though the database and its tables already use a Unicode character set.

Here are the current conditions:

  1. The same tests work with PyMySQL but not with CyMySQL.
  2. The tests work with CyMySQL only when charset='utf8mb4' is specified explicitly.

EDIT: This is on Python 3, which should use Unicode automatically.

@nakagami
Owner

nakagami commented Jul 19, 2019

Please show me sample code and the traceback or any other error messages.

@roniemartinez
Author

roniemartinez commented Jul 19, 2019

Here is the error. Unfortunately, I cannot show everything:
[image]

That is the usual error raised when inserting an emoji into a table. However, the database and the tables are already set up with utf8mb4 (character set) and utf8mb4_unicode_ci (collation).
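
The screenshot is not reproduced in this thread, so the exact error is unknown. As background, and as an assumption about what the usual emoji insert failure means: MySQL's legacy utf8 charset (utf8mb3) stores at most three bytes per character, while an emoji needs four bytes in UTF-8. A minimal Python sketch to check whether a string contains such characters (the helper name `needs_utf8mb4` is hypothetical, not part of CyMySQL or PyMySQL):

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character in text takes four bytes in UTF-8."""
    # Code points above U+FFFF encode to four UTF-8 bytes, which exceeds
    # the three-byte limit of MySQL's legacy utf8 (utf8mb3) charset.
    return any(ord(ch) > 0xFFFF for ch in text)


snake = "\U0001f40d"  # the snake emoji used later in this thread
print(len(snake.encode("utf-8")))  # 4
print(needs_utf8mb4(snake))        # True
print(needs_utf8mb4("plain text")) # False
```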

This is the CyMySQL code that does not work:

connection = cymysql.connect(host=host, port=port, user=user, passwd=password, db=db)

However, the equivalent PyMySQL code works just by changing the module:

connection = pymysql.connect(host=host, port=port, user=user, passwd=password, db=db)

With the charset specified explicitly, CyMySQL works:

connection = cymysql.connect(host=host, port=port, user=user, passwd=password, db=db, charset='utf8mb4')

Based on the source, this is not the intended behavior of CyMySQL on Python 3, where it should default to Unicode.
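
The three connection calls above differ only in the module and the charset argument, so a small round-trip check can compare them. The sketch below is illustrative only: `round_trip_ok` and the table name `charset_probe` are hypothetical, and it assumes the connection follows the Python DB-API and that a temporary TEXT column inherits the database's default charset. The demo at the bottom uses an in-memory sqlite3 connection purely so the snippet runs without a MySQL server.

```python
import sqlite3


def round_trip_ok(connection, text, placeholder="%s"):
    """Insert text into a temporary table, read it back, and compare.

    Returns True when the selected value equals the inserted one.
    An insert that the server rejects propagates as an exception.
    """
    cur = connection.cursor()
    try:
        cur.execute("CREATE TEMPORARY TABLE charset_probe (v TEXT)")
        cur.execute(
            "INSERT INTO charset_probe (v) VALUES ({})".format(placeholder),
            (text,),
        )
        cur.execute("SELECT v FROM charset_probe")
        row = cur.fetchone()
        cur.execute("DROP TABLE charset_probe")
        return row is not None and row[0] == text
    finally:
        cur.close()


# Stand-in connection so the sketch is runnable; sqlite3 uses "?" placeholders.
demo = sqlite3.connect(":memory:")
print(round_trip_ok(demo, "\U0001f40d", placeholder="?"))  # True
demo.close()
```

With a MySQL driver, the default `%s` placeholder applies, so the same helper could be called with the connection object returned by each of the `connect` calls shown in this comment.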

@nakagami
Owner

Thanks, I understand.

Can you test with the HEAD of the master branch on GitHub?

@roniemartinez
Author

The INSERT query now works; however, the test fails when SELECT is used to cross-check the data:

E       AssertionError: assert '\U0001f40d' == '\xf0\\x9f\\x90\\x8d'
E         - \U0001f40d
E         ? ^
E         + \xf0\x9f\x90\x8d
E         ? ^^^^

Here are the original and the SELECTed data:

🐍
�

Explicitly specifying charset='utf8mb4' outputs the correct emoji. I believe the decoding side was not set.
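
One plausible reading of the assertion output above, offered as an assumption rather than a confirmed diagnosis: the characters \xf0\x9f\x90\x8d are exactly the UTF-8 bytes of the snake emoji interpreted one byte per character, which is what a single-byte codec such as latin-1 would produce. A short Python sketch of that mismatch:

```python
original = "\U0001f40d"                # snake emoji, as inserted
wire_bytes = original.encode("utf-8")  # bytes a utf8mb4 column would hold

# Decoding with a single-byte codec yields one character per byte.
misdecoded = wire_bytes.decode("latin-1")
print(wire_bytes)              # b'\xf0\x9f\x90\x8d'
print(len(misdecoded))         # 4
print(misdecoded == original)  # False

# Decoding the same bytes as UTF-8 restores the original string.
print(wire_bytes.decode("utf-8") == original)  # True
```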

@nakagami
Owner

nakagami commented Jul 19, 2019

Thanks. I will add support for the utf8mb4 charset.

@nakagami
Owner

Version 0.9.14 has been released. Please check it.

@roniemartinez
Author

@nakagami Verified! Thank you!
