(1366, "Incorrect string value: '\\xF0\\x9F\\x98\\x80\\xF0\\x9F...' for column 'name' at row 1") #28

Open
EssaAlshammri opened this Issue Feb 21, 2017 · 29 comments

EssaAlshammri commented Feb 21, 2017

Hi,
I can't post emojis to the database when the app runs at my-app.appspot.com,
but when I run it locally with python manage.py runserver, using the same libraries as on GAE, everything works perfectly and I can post and retrieve emojis.

Here is my settings.py:

import os
if os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine'):
    # Running on production App Engine, so use a Google Cloud SQL database.
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'HOST': '/cloudsql/my-app:us-central1:my-app-mysql',
            'NAME': '********',
            'USER': 'root',
            'PASSWORD': '*********',
        }
    }
else:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': '*******',
            'USER': 'root',
            'PASSWORD': '*********',
            'HOST': '**********',
            'PORT': '3306',
            'OPTIONS': {
                'charset': 'utf8mb4',
            }
        }
    }

here is the charset when using cloud shell

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8               |
| character_set_connection | utf8               |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8               |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8_general_ci    |
| collation_database       | utf8mb4_general_ci |
| collation_server         | utf8mb4_general_ci |
+--------------------------+--------------------+
10 rows in set (0.15 sec)

and here is the charset when I connect using the IP of the database from another client


Variable_name                        Value
character_set_client                utf8
character_set_connection            utf8mb4
character_set_database              utf8mb4
character_set_filesystem            binary
character_set_results               utf8
character_set_server                utf8mb4
character_set_system                utf8
collation_connection                utf8mb4_unicode_ci
collation_database                  utf8mb4_general_ci
collation_server                    utf8mb4_general_ci
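
For reading listings like the two above, here is a minimal sketch (Python 3 assumed; the helper name and the dictionaries are illustrative, with values copied from the listings). It reports which of the three per-session variables are not utf8mb4. These are the variables that SET NAMES changes, so if any of them is still utf8, 4-byte characters are unlikely to survive the round trip on that connection:

```python
# The three session variables that "SET NAMES <charset>" sets on the server.
SESSION_VARS = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def non_utf8mb4_session_vars(variables):
    """Return the session charset variables whose value is not utf8mb4."""
    return {
        name: variables.get(name)
        for name in SESSION_VARS
        if variables.get(name) != "utf8mb4"
    }


# Values copied from the Cloud Shell listing above.
cloud_shell = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_results": "utf8",
    "character_set_server": "utf8mb4",
}

# Values copied from the listing taken over the database IP.
other_client = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8mb4",
    "character_set_results": "utf8",
    "character_set_server": "utf8mb4",
}

print(sorted(non_utf8mb4_session_vars(cloud_shell)))
print(sorted(non_utf8mb4_session_vars(other_client)))
```

In both listings the server and database defaults are already utf8mb4; only the session-level values lag behind.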

Am I missing something?

How do I make this work?

thanks

EssaAlshammri Feb 21, 2017

here is my app.yaml and I'm using MySQLdb 1.2.5 locally

# [START django_app]
runtime: python27
api_version: 1
threadsafe: yes

handlers:
- url: /static
  static_dir: static/
- url: .*
  script: myapp.wsgi.application

# Only pure Python libraries can be vendored
# Python libraries that use C extensions can
# only be included if they are part of the App Engine SDK 
libraries:
- name: MySQLdb
  version: 1.2.5
- name: PIL
  version: "1.1.7"
- name: ssl
  version: latest

# [END django_app]

waprin Feb 21, 2017

Thanks for the report.

It sounds like you are having problems connecting to CloudSQL through the App Engine unix socket.

Nothing is jumping out as immediately obvious, but I can give it another spin. If I haven't updated you in ~24 hours, feel free to give me reminder @waprin mention.

If you see any messages in the logs in the console, it might be helpful to leave them here. Make sure you select the right App Engine version in the dropdown menu in the Logging console.

EssaAlshammri Feb 21, 2017

@waprin there you go :) Here is the raw stack trace (last two calls):

Traceback (most recent call last):
  ...............
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/MySQLdb-1.2.5/MySQLdb/cursors.py", line 205, in execute
    self.errorhandler(self, exc, value)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/MySQLdb-1.2.5/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (1366, "Incorrect string value: '\\xF0\\x9F\\x98\\x8E\\xF0\\x9F...' for column 'name' at row 1")

EssaAlshammri Feb 22, 2017

If I haven't updated you in ~24 hours, feel free to give me reminder @waprin mention.

it's 4 hours earlier :) 😆
@waprin
have you found a solution?

waprin Feb 23, 2017

@EssaAlshammri thanks for the reminder.

I've since remembered that this repo is out of date and is going to be replaced. We now have CloudSQL v2. I would suggest following this tutorial:

https://cloud.google.com/python/django/appengine

which references the following sample:

https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/standard/django

which is the one we're officially supporting.

I'm going to do a quick run-through now and see if I can repro your issue, but I highly recommend trying those other samples anyway.

waprin Feb 23, 2017

@EssaAlshammri everything seems to be working fine for me.

One sanity check: does your lib folder contain MySQL dependencies? It shouldn't, but if it does, that can cause some weird issues.

EssaAlshammri Feb 23, 2017

@waprin

I actually have the exact same setup as the one you mentioned, and I've been using second-generation Cloud SQL from the get-go.

I will give it another try with a new project from scratch. But before that: it's now giving a different error message when I post emojis, and I assure you I haven't changed anything in the setup:

OperationalError: (1267, "Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")

I don't think I have any MySQL dependencies.
Here is an ls -la of the lib folder:

drwxrwxr-x  5 user user  4096 Nov 25 10:23 babel
drwxrwxr-x  2 user user  4096 Nov 25 10:23 Babel-2.3.4.dist-info
drwxrwxr-x 57 user user  4096 Nov 26 12:12 boto
drwxrwxr-x  2 user user  4096 Nov 26 12:12 boto-2.43.0.dist-info
drwxrwxr-x 18 user user  4096 Dec 18 18:32 django
drwxrwxr-x  2 user user  4096 Dec 18 18:32 Django-1.10.4.dist-info
drwxrwxr-x  2 user user  4096 Jan 22 22:47 django_google_storage
drwxrwxr-x  2 user user  4096 Nov 26 17:55 django_google_storage_updated-0.4.0.dist-info
drwxrwxr-x  2 user user  4096 Nov 25 10:23 django_phonenumber_field-1.1.0.dist-info
drwxrwxr-x  2 user user  4096 Nov 25 10:23 djangorestframework-3.5.3.dist-info
drwxrwxr-x  3 user user  4096 Feb 22 18:49 django_rest_multitokenauth
drwxrwxr-x  2 user user  4096 Feb 22 18:49 django_rest_multitokenauth-0.2.4-py2.7.egg-info
drwxrwxr-x  3 user user  4096 Nov 25 10:23 phonenumber_field
drwxrwxr-x  7 user user  4096 Nov 25 10:23 phonenumbers
drwxrwxr-x  2 user user  4096 Nov 25 10:23 phonenumberslite-7.7.5.dist-info
drwxrwxr-x  2 user user  4096 Nov 25 10:23 plivo-0.11.3.dist-info
-rw-rw-r--  1 user user 42373 Nov 25 10:23 plivo.py
-rw-rw-r--  1 user user 44791 Nov 25 10:23 plivo.pyc
-rw-rw-r--  1 user user  7704 Nov 25 10:23 plivoxml.py
-rw-rw-r--  1 user user 13362 Nov 25 10:23 plivoxml.pyc
drwxrwxr-x  2 user user  4096 Feb 22 18:35 pyfcm
drwxrwxr-x  2 user user  4096 Feb 22 18:35 pyfcm-1.2.4.dist-info
drwxrwxr-x  3 user user  4096 Nov 25 10:23 pytz
drwxrwxr-x  2 user user  4096 Nov 25 10:23 pytz-2016.7.dist-info
drwxrwxr-x  3 user user  4096 Feb 22 18:35 requests
drwxrwxr-x  2 user user  4096 Nov 25 10:23 requests-2.12.1.dist-info
drwxrwxr-x  2 user user  4096 Feb 22 18:35 requests-2.13.0.dist-info
drwxrwxr-x  9 user user  4096 Nov 25 10:23 rest_framework

EssaAlshammri Feb 23, 2017

I think the problem is that the connection charset is not set to utf8mb4 when the app is running on App Engine.

If there is a way I can set it, the problem will be solved.

Things I have tried so far:

if os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine'):
    # Running on production App Engine, so use a Google Cloud SQL database.
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'HOST': '/cloudsql/myapp-app:us-central1:my-app-mysql',
            'NAME': '*******',
            'USER': 'root',
            'PASSWORD': '***********',
            'OPTIONS': {
                'charset': 'utf8mb4',
            }
        }
    } 

but this will give this error (2019, "Can't initialize character set utf8mb4 (path: /usr/local/mysql/share/charsets/)")

import os
if os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine'):
    # Running on production App Engine, so use a Google Cloud SQL database.
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'HOST': '/cloudsql/my-app:us-central1:my-app-mysql',
            'NAME': '**********',
            'USER': 'root',
            'PASSWORD': '************',
            'OPTIONS': {
                'read_default_file': os.path.join(BASE_DIR, 'my.cnf'),
            }
        }
    }

my.cnf

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

This also gives the same error (2019, "Can't initialize character set utf8mb4 (path: /usr/local/mysql/share/charsets/)"), and it doesn't even work if I put it in the local development configuration.

I also tried to set it from the console:
[screenshot from 2017-02-23 09-44-28]

and no luck 👎

waprin Feb 24, 2017

:-\

This is a little more involved for me to repro, hence the delay, but I'll still try to follow up. Give me a bit.

EssaAlshammri Feb 26, 2017

@waprin hey :)

I just want to keep you updated. I set up everything from scratch with the polls example, following https://cloud.google.com/python/django/appengine.

Nothing worked as expected; the exact same problem occurred.

waprin Feb 28, 2017

@EssaAlshammri sorry, can you give me slightly clearer steps to reproduce?

  1. Follow that polls example
  2. Use cloudsql proxy locally to show charset
  3. ??

Thanks.

waprin Feb 28, 2017

@ryanmats just FYI about this issue.

EssaAlshammri Mar 1, 2017

@waprin here is the charset using cloudsql proxy locally

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8               |
| character_set_connection | utf8               |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8               |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8_general_ci    |
| collation_database       | utf8mb4_general_ci |
| collation_server         | utf8mb4_general_ci |
+--------------------------+--------------------+
10 rows in set (0.21 sec)

at the creation of the instance I set this:

thebatlab Mar 1, 2017

I just ran into this same issue today. I have the database and tables set up with utf8mb4, but as soon as I add:

    'OPTIONS': {
        'charset': 'utf8mb4',
    }

I get the exact same error: (2019, "Can't initialize character set utf8mb4 (path: /usr/local/mysql/share/charsets/)").

If I run locally and connect to the CloudSQL db, it runs fine. It has to be that the MySQLdb version App Engine uses doesn't support that charset. From what I see here: https://code.djangoproject.com/ticket/18392#comment:12

Support for this appears to have been added to the 1.2.5 release, which it would seem app engine uses. So I'm a bit stumped.

I gave PyMySQL a try, but end up with an error of:
Can't connect to MySQL server on 'localhost' ([Errno 97] Address family not supported by protocol)

So PyMySQL may not support how the connection is made, it would seem. Leaving me out of luck for supporting emojis using CloudSQL and App Engine, it would appear.

EssaAlshammri commented Mar 6, 2017

@waprin anything yet ?

waprin Mar 7, 2017

@EssaAlshammri reproduced, filed internal bug with engineering team, will keep you updated.

waprin Mar 7, 2017

You can try base64 encoding going in and out of the database as a workaround.

Supposedly google.appengine.ext.django.backends.rdbms also works but I haven't gotten it working yet myself.
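
A minimal sketch of the base64 workaround, assuming Python 3 and hypothetical helper names (`encode_for_db`, `decode_from_db`); nothing in it is specific to App Engine or Django. The idea is that base64 output is plain ASCII, so it passes through a connection limited to 3-byte utf8 no matter what the original text contained:

```python
import base64


def encode_for_db(text):
    """Turn arbitrary Unicode text into an ASCII-only string for storage."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_from_db(stored):
    """Reverse encode_for_db and recover the original Unicode text."""
    return base64.b64decode(stored.encode("ascii")).decode("utf-8")


name = "party \U0001F600"
stored = encode_for_db(name)

print(stored)                          # ASCII only, safe for a utf8 connection
print(decode_from_db(stored) == name)  # the round trip is lossless
```

The trade-offs are that the stored value is roughly a third longer than the UTF-8 bytes, and the column can no longer be searched or sorted as text in SQL.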

thebatlab Mar 7, 2017

Yes, I have successfully done this by encoding/decoding in/out of the database, and it works fine. Just unfortunate to have to add the extra code :)

waprin Mar 8, 2017

For sure, it's not good. The bug I filed was a duplicate, and the original was already assigned to someone who's working on it, so at least it's on the radar to get fixed. Unfortunately I can't promise any sort of date for when that work will be done and shipped, but I will keep an eye on it and let you all know.

EssaAlshammri Mar 8, 2017

Yeah, encoding and decoding worked for me too, but 😠 it isn't that good, especially if you already have records in the database. There is a one-way workaround for that, though: iterate over the old records and encode them 🤔.

thank you so much @waprin @thebatlab

I prefer to wait for the official fix for this issue.

ghost Apr 11, 2017

@waprin is there any update on this?

waprin May 2, 2017

@rimeissner comment from internal engineer:

The standard way to get utf8mb4 working in Django is to specify it as DATABASES['default']['OPTIONS'] in settings.py, like this:

    'OPTIONS': {'charset': 'utf8mb4'},

The workaround is to manually call SET NAMES; edit lib/django/db/backends/mysql/base.py and add a conn.query("SET NAMES utf8mb4") line into DatabaseWrapper.get_new_connection, so it looks like this:

    def get_new_connection(self, conn_params):
        conn = Database.connect(**conn_params)
        conn.encoders[SafeText] = conn.encoders[six.text_type]
        conn.encoders[SafeBytes] = conn.encoders[bytes]
        conn.query("SET NAMES utf8mb4")
        return conn

Make sure that you also have utf8mb4 enabled on the backend.  The migration commands in the App Engine Django tutorial result in a Cloud SQL instance configured for utf8.  I needed to run these commands to enable utf8mb4 on the two tables:

    ALTER TABLE polls_question CONVERT TO CHARACTER SET utf8mb4;
    ALTER TABLE polls_choice CONVERT TO CHARACTER SET utf8mb4;

Let me know if that fixes your problem. There is a real fix in the pipeline but no timetable for it to be out.

thebatlab May 2, 2017

The issue isn't getting the database or Django set up for utf8mb4, though. The issue is that the MySQL driver that app engine uses doesn't support that character set.

I could run everything just fine while on my localhost, and connected to the CloudSQL database. But as soon as it was deployed to App Engine, that is when the error comes up.

waprin May 2, 2017

Yes, understood. The driver on GAE is outdated and needs to be fixed, but in the meantime the steps above are a way to work around the deficiencies of the driver by configuring utf8mb4 after the connection is made rather than during connection setup, which is supposed to work (I'm going to verify it myself soon).

myelin May 2, 2017

Member

Aforementioned internal engineer here :)

TL;DR: Cloud SQL and App Engine support emojis and 4-byte UTF-8, but 'OPTIONS': {'charset': 'utf8mb4'} in your Django settings file will result in "Can't initialize character set utf8mb4".

The issue is in the C code that we use to talk to MySQL. It doesn't support utf8mb4 itself, so when it makes a connection to MySQL, it tells the server to use "UTF8", which in MySQL means UTF-8 minus the 4-byte characters... and all your emojis get mangled.

However, Python and Cloud SQL both support 4-byte UTF-8 just fine, so if you follow up with a "SET NAMES utf8mb4" command, as @waprin explained above, it'll tell the Cloud SQL server that it's safe to send 4-byte UTF-8, and everything will work.
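
To see the 3-byte limit concretely, here is a small sketch (Python 3 assumed; `needs_utf8mb4` is a hypothetical helper, not part of MySQLdb or Django). It reports whether a string contains any character that takes 4 bytes in UTF-8, which is the kind of character MySQL's "utf8" cannot hold:

```python
def needs_utf8mb4(text):
    """Return True if any character in text encodes to 4 bytes in UTF-8."""
    return any(len(ch.encode("utf-8")) == 4 for ch in text)


grinning_face = "\U0001F600"  # the emoji behind the bytes in the issue title

print(grinning_face.encode("utf-8"))   # b'\xf0\x9f\x98\x80'
print(needs_utf8mb4(grinning_face))    # True
print(needs_utf8mb4("caf\u00e9"))      # False: the widest character is 2 bytes
```

The four bytes printed for the emoji are the same \xF0\x9F\x98\x80 sequence that error 1366 complains about.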

As such, 'OPTIONS': {'charset': 'utf8mb4'} doesn't work on App Engine. To get utf8mb4 in Django:

  • Either use 'OPTIONS': {'charset': 'utf8'}, or just leave the 'charset' option off entirely.

  • Edit your base.py and change get_new_connection so that it sends the "SET NAMES utf8mb4" command.

After making these two changes, it should work on both localhost (using dev_appserver.py) and App Engine.

Likewise if you're using MySQLdb without Django, you need to initialize it like this:

import MySQLdb

conn = MySQLdb.connect(unix_socket='/cloudsql/', user='', passwd='', db='', charset='utf8')
# MySQLdb connections expose query(), not execute(); execute() lives on cursors.
conn.query("SET NAMES utf8mb4")
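
The same two steps can be wrapped in a small helper. This is a sketch under assumptions: `connect_with_utf8mb4` is a hypothetical name, and the connect function is passed in as a parameter (in an app it would be MySQLdb.connect) so that the helper can be exercised without a database. It uses conn.query(), the call shown in the Django workaround earlier in this thread:

```python
def connect_with_utf8mb4(connect, **params):
    """Open a connection as utf8, then switch the session to utf8mb4."""
    # Connect with a charset the client library is able to initialize ...
    params.setdefault("charset", "utf8")
    conn = connect(**params)
    # ... then tell the server to use 4-byte UTF-8 for this session.
    conn.query("SET NAMES utf8mb4")
    return conn


class FakeConnection(object):
    """Stand-in for a real connection; records the queries it receives."""

    def __init__(self, **params):
        self.params = params
        self.queries = []

    def query(self, sql):
        self.queries.append(sql)


conn = connect_with_utf8mb4(FakeConnection, user="demo", db="demo")
print(conn.params["charset"], conn.queries)
```

With the real driver, the first argument would be MySQLdb.connect and the keyword arguments would be the same ones used in the snippet above.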

This should all stop being a problem sometime in the next few months (we have a fix, but it's waiting on a ton of stuff to get deployed and tested before it can go out).

Here's a Stack Overflow thread with some more details: http://stackoverflow.com/questions/36144026/unable-to-use-utf8mb4-character-set-with-cloudsql-on-appengine-python

thebatlab May 2, 2017

OK, excellent, thanks for clarifying. I was worried I had described the problem inadequately and you were perhaps fixing the wrong thing :)

I misunderstood the get_new_connection fix, too. I thought it was just doing the same thing as the options flag, but at a lower level than in the settings.

thebatlab Oct 17, 2017

Just wanted to poke in here and see if the final fix was deployed:
"This should all stop being a problem sometime in the next few months (we have a fix, but it's waiting on a ton of stuff to get deployed and tested before it can go out)."

I did a quick test and it seems we still need to edit the Django backend code, but I wanted to confirm, in case I tested incorrectly.

thebatlab Mar 20, 2018

As a final followup, this appears to be working now with no changes to Django code needed!

My database settings now have:
'OPTIONS': {'charset': 'utf8mb4'}
And I changed the table I needed emoji support in to the proper character set via this command:
ALTER TABLE my_table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

And emojis are working just fine. I put the SQL in a migration file so it's a part of the project with no need for extra database configuration manually, so it's all seamless.
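For anyone wanting to do the same for several tables, here is a small sketch that builds the statement above. The function name `convert_table_to_utf8mb4_sql` and the identifier check are assumptions for illustration, not something from my actual migration.

```python
import re


def convert_table_to_utf8mb4_sql(table, collation="utf8mb4_bin"):
    """Build the ALTER TABLE statement shown above for one table."""
    # Only allow plain identifiers, since the names are pasted into raw SQL.
    for name in (table, collation):
        if not re.match(r"^[A-Za-z0-9_]+$", name):
            raise ValueError("unexpected identifier: %r" % (name,))
    return "ALTER TABLE %s CONVERT TO CHARACTER SET utf8mb4 COLLATE %s;" % (
        table,
        collation,
    )


print(convert_table_to_utf8mb4_sql("my_table_name"))
```

In a Django migration, the resulting string could be passed to `migrations.RunSQL` in the `operations` list.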

Thanks, @myelin , presuming it was you who deployed the fix :)

myelin Mar 20, 2018

Member

Huh, I must have been ignoring GitHub notifications or something; I didn't spot any of your comments until just now.

Anyway, yay, glad to see my fix finally made it out! Thanks for following up :)
