Fix decoding of binary collations for non-binary character sets #43

Closed


vtermanis

Summary

Currently this library uses FieldFlag.BINARY / BINARY_FLAG to decide whether to decode a field or to leave it as raw bytes (when said flag is set). However, this appears to be incorrect behaviour.

The C API documentation states:

To distinguish between binary and nonbinary data for string data types, check whether the charsetnr value is 63. If so, the character set is binary, which indicates binary rather than nonbinary data.

And the BINARY/VARBINARY section also confirms this:

[...] the BINARY attribute does not cause the column to be treated as a binary string column. Instead, it causes the binary (_bin) collation for the column character set to be used, and the column itself contains non-binary character strings rather than binary byte strings.
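
The rule quoted from the C API documentation can be sketched in a few lines of Python. This is only an illustration, not the code from this pull request: `decode_field` and `BINARY_CHARSET_ID` are hypothetical names, and the charset id 46 used in the example call is assumed to correspond to utf8mb4_bin.

```python
# Hypothetical helper illustrating the charsetnr rule; not the library's code.
BINARY_CHARSET_ID = 63  # charsetnr value the C API docs associate with binary data


def decode_field(raw, charset_id, encoding="utf-8"):
    """Return str for nonbinary string data and the raw bytes for binary data."""
    if charset_id == BINARY_CHARSET_ID:
        # Binary character set: leave the value as bytes.
        return raw
    # Any other character set holds text, even under a _bin collation.
    return raw.decode(encoding)


text_value = decode_field(b"abc", 46)    # 46: assumed id of utf8mb4_bin
binary_value = decode_field(b"abc", 63)  # 63: binary character set
print(text_value)    # abc
print(binary_value)  # b'abc'
```

The BINARY flag does not appear in the decision at all; only the character set id does.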

Example

(Used Python 3.5.2)

# Assuming utf8mb4 connection (same behaviour for utf8 with equivalent collations)
cursor.execute(
    '''SELECT
    CAST(%s AS CHAR) COLLATE utf8mb4_bin,
    CAST(%s AS CHAR) COLLATE utf8mb4_unicode_ci,
    BINARY %s
    ''',
    ('abc',) * 3
)
print(cursor.fetchall())

Actual results:

[(bytearray(b'abc'), 'abc', bytearray(b'abc'))]  # pure
[(b'abc', 'abc', b'abc')]  # ext

Expected results:

[('abc', 'abc', bytearray(b'abc'))]  # pure
[('abc', 'abc', b'abc')]  # ext
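
To make the difference between the actual and expected rows concrete, here is a small self-contained simulation that needs no server. The `Column` tuple, both helper functions, and the per-column flags and charset ids are assumptions chosen to mirror the three expressions in the query above; they are not metadata captured from a real connection.

```python
from collections import namedtuple

BINARY_FLAG = 128       # value of BINARY_FLAG in the MySQL C API headers
BINARY_CHARSET_ID = 63  # charsetnr of the binary character set

# Hypothetical column metadata; the ids are assumed, not read from a server.
Column = namedtuple("Column", "label flags charset_id")
columns = [
    Column("COLLATE utf8mb4_bin", BINARY_FLAG, 46),
    Column("COLLATE utf8mb4_unicode_ci", 0, 224),
    Column("BINARY expr", BINARY_FLAG, 63),
]


def decode_by_flag(raw, col):
    """Flag-based rule: any column with the BINARY flag stays as bytes."""
    return raw if col.flags & BINARY_FLAG else raw.decode("utf-8")


def decode_by_charset(raw, col):
    """Charset-based rule: only charset id 63 stays as bytes."""
    return raw if col.charset_id == BINARY_CHARSET_ID else raw.decode("utf-8")


flag_row = tuple(decode_by_flag(b"abc", c) for c in columns)
charset_row = tuple(decode_by_charset(b"abc", c) for c in columns)
print(flag_row)     # (b'abc', 'abc', b'abc')
print(charset_row)  # ('abc', 'abc', b'abc')
```

Under these assumptions the flag-based rule reproduces the "ext" row shown under actual results, while the charset-based rule produces the expected row.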

Notes

  • This might be the same bug as already reported in #70543
  • Other MySQL client libraries such as MySQLdb and PyMySQL have the correct decoding behaviour
  • I have not updated the test cases. Because the description field has been expanded to include an additional entry (the field character set id) and the BINARY flag is no longer used for decoding logic, some test cases now most likely fail.

- Previously relied on BINARY flag, which only indicates that the
  collation is binary, not whether the character set is binary
- Use character set id to determine whether to treat field as
  binary. Expanded description tuple to include said id.
- WARNING: Unit tests not updated
@mysql-oca-bot

Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment:
"I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it."
Thanks

@vtermanis
Author

I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

@mysql-oca-bot

Hi, thank you for your contribution. Your code has been assigned to an internal queue. Please follow
bug http://bugs.mysql.com/bug.php?id=91261 for updates.
Thanks

@wakaryry

I need help!

My Django settings for MySQL already use utf8mb4:

DATABASES = {
    'default': {
        'ENGINE': 'mysql.connector.django',
        'NAME': 'xxxx',
        'USER': 'xx',
        'PASSWORD': 'xxxxxxxx',
        'HOST': '127.0.0.1',
        'PORT': '3306',
        'OPTIONS': {
            'autocommit': True,
            'use_pure': True,
            'charset': 'utf8mb4',
            'collation': 'utf8mb4_unicode_ci'
        }
    }
}

My MySQL database is also set to utf8mb4 for both character set and collation.

I have run migrations on my tables before, and that worked fine.

But today I added some tables and tried to migrate them, and it throws this error:

django.db.utils.DatabaseError: 
(3719, "3719: 'utf8' is currently an alias for the character set UTF8MB3, 
which will be replaced by UTF8MB4 in a future release. 
Please consider using UTF8MB4 in order to be unambiguous.", None)

I really do not know why it throws the error.

I am using Python 3.6.5, Django 2.0.6, mysql-connector 8.0.11, and MySQL 8.0.11.

The detailed error output:

Operations to perform:
  Apply all migrations: admin, auth, blog, contenttypes, goal, project, sessions, team, user
Running migrations:
  Applying blog.0003_auto_20180724_1543...Traceback (most recent call last):
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/utils.py", line 96, in inner
    return func(*args, **kwargs)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/mysql/connector/cursor.py", line 904, in fetchall
    self._handle_eof(eof)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/mysql/connector/cursor.py", line 838, in _handle_eof
    self._warnings[0][1], self._warnings[0][2])
mysql.connector.errors.DatabaseError: 3719: 
'utf8' is currently an alias for the character set UTF8MB3, 
which will be replaced by UTF8MB4 in a future release. 
Please consider using UTF8MB4 in order to be unambiguous.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "manage.py", line 15, in <module>
    execute_from_command_line(sys.argv)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 371, in execute_from_command_line
    utility.execute()
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 365, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/core/management/base.py", line 288, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/core/management/base.py", line 335, in execute
    output = self.handle(*args, **options)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/core/management/commands/migrate.py", line 200, in handle
    fake_initial=fake_initial,
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 244, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/migrations/migration.py", line 122, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/migrations/operations/fields.py", line 216, in database_forwards
    schema_editor.alter_field(from_model, from_field, to_field)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 525, in alter_field
    old_db_params, new_db_params, strict)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 533, in _alter_field
    fk_names = self._constraint_names(model, [old_field.column], foreign_key=True)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/backends/base/schema.py", line 1026, in _constraint_names
    constraints = self.connection.introspection.get_constraints(cursor, model._meta.db_table)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/mysql/connector/django/introspection.py", line 320, in get_constraints
    for constraint, column, ref_table, ref_column in cursor.fetchall():
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/utils.py", line 96, in inner
    return func(*args, **kwargs)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/django/db/utils.py", line 96, in inner
    return func(*args, **kwargs)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/mysql/connector/cursor.py", line 904, in fetchall
    self._handle_eof(eof)
  File "/Users/wakaryredou/venv/lib/python3.6/site-packages/mysql/connector/cursor.py", line 838, in _handle_eof
    self._warnings[0][1], self._warnings[0][2])
django.db.utils.DatabaseError: (3719, "3719: 'utf8' is currently an alias for the character set UTF8MB3, 
which will be replaced by UTF8MB4 in a future release. 
Please consider using UTF8MB4 in order to be unambiguous.", None)

Thanks for any help.
