New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Database and Doctrine] utf8mb4 additional note for MySQL #6398

Closed
Fruchuxs opened this Issue Mar 24, 2016 · 9 comments

Comments

Projects
None yet
6 participants
@Fruchuxs

Fruchuxs commented Mar 24, 2016

Hello,
I suggest to add a note to the utf8mb4 configuration for indezies and unique entity properties. If a property is marked as unique in the doctrine configuration like in an annotation and doctrine is configured to use utf8mb4, the maximum length of this field needs to be adjusted to a maximum of 190 characters (for MySQL and maybe mandriva db a MySQL Distri - isn't it?).

E.g.:
@ORM\Column(type="string", length=190, nullable=false, unique=true)

I will try to explain why.
I have set up my configuration to use utf8mb4 as default. As I wanted to generate my tables, I run into the following error:
[PDOException] SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes

So I was wondering whats happen. I have pointed out, that MySQL can only indexing 767 Bytes. MySQLs standard UTF8 uses 3 bytes for char encoding and a varchar has a maximum of 255 characters => 255 * 3 Byte = 765 Byte. But utf8mb4 uses 4 Bytes:767 Bytes / 4 Bytes/Character = 191.75 Characters.

I use MySQL v. 5.6.25.

@xabbuh

This comment has been minimized.

Show comment
Hide comment
@xabbuh

xabbuh Mar 25, 2016

Member

I agree that we should add such information. Did you check how many places we have that potentially deserve an update?

Member

xabbuh commented Mar 25, 2016

I agree that we should add such information. Did you check how many places we have that potentially deserve an update?

@Fruchuxs

This comment has been minimized.

Show comment
Hide comment
@Fruchuxs

Fruchuxs Mar 25, 2016

It's just a minor suggestion. I think if the docs reccomends to use utf8mb4 it should also note that this can ends up in some trouble with MySQL. But that is only my opinion, so I decided to suggest it here for the team which do the nice documentation work.

Fruchuxs commented Mar 25, 2016

It's just a minor suggestion. I think if the docs reccomends to use utf8mb4 it should also note that this can ends up in some trouble with MySQL. But that is only my opinion, so I decided to suggest it here for the team which do the nice documentation work.

@xabbuh

This comment has been minimized.

Show comment
Hide comment
@xabbuh

xabbuh Mar 25, 2016

Member

@Fruchuxs Of course, I fully agree with you about that. I was simply thinking that we could provide a list of the places where such a change would be useful to make the life easier for anybody who wants to contribute a pull request.

Member

xabbuh commented Mar 25, 2016

@Fruchuxs Of course, I fully agree with you about that. I was simply thinking that we could provide a list of the places where such a change would be useful to make the life easier for anybody who wants to contribute a pull request.

@Fruchuxs

This comment has been minimized.

Show comment
Hide comment
@Fruchuxs

Fruchuxs Mar 25, 2016

@xabbuh Sorry, I have misunderstood your comment.
The only place I think who needs an update ist the doctrine.rst, specifically the Configuring the Database chapter at the Setting up the Database to be UTF8 section. Because in the last paragraph (line 148) there is an explanation against MySQLs default UTF-8 charset and the recommandation for the utf8mb4 Charset.

I have also found informations to the problem in the MySQL Docs:

By default, an index key for a single-column index can be up to 767 bytes. The same length limit applies to any index key prefix. See Section 13.1.13, “CREATE INDEX Syntax”. For example, you might hit this limit with a column prefix index of more than 255 characters on a TEXT or VARCHAR column, assuming a UTF-8 character set and the maximum of 3 bytes for each character. When the innodb_large_prefix configuration option is enabled, this length limit is raised to 3072 bytes, for InnoDB tables that use the DYNAMIC and COMPRESSED row formats.

14.6.7 Limits on InnoDB Tables

Maybe it helps.

Fruchuxs commented Mar 25, 2016

@xabbuh Sorry, I have misunderstood your comment.
The only place I think who needs an update ist the doctrine.rst, specifically the Configuring the Database chapter at the Setting up the Database to be UTF8 section. Because in the last paragraph (line 148) there is an explanation against MySQLs default UTF-8 charset and the recommandation for the utf8mb4 Charset.

I have also found informations to the problem in the MySQL Docs:

By default, an index key for a single-column index can be up to 767 bytes. The same length limit applies to any index key prefix. See Section 13.1.13, “CREATE INDEX Syntax”. For example, you might hit this limit with a column prefix index of more than 255 characters on a TEXT or VARCHAR column, assuming a UTF-8 character set and the maximum of 3 bytes for each character. When the innodb_large_prefix configuration option is enabled, this length limit is raised to 3072 bytes, for InnoDB tables that use the DYNAMIC and COMPRESSED row formats.

14.6.7 Limits on InnoDB Tables

Maybe it helps.

@Tobion

This comment has been minimized.

Show comment
Hide comment
@Tobion

Tobion Nov 13, 2017

Member

I don't think this deserves a documentation note as it is not related to symfony or doctrine at all. It's plain mysql behavior you need to know. The same index size problem applies if you switch from latin1 to any multibyte charset for example. So it's not even specific to utf8mb4 .

Member

Tobion commented Nov 13, 2017

I don't think this deserves a documentation note as it is not related to symfony or doctrine at all. It's plain mysql behavior you need to know. The same index size problem applies if you switch from latin1 to any multibyte charset for example. So it's not even specific to utf8mb4 .

@Fruchuxs

This comment has been minimized.

Show comment
Hide comment
@Fruchuxs

Fruchuxs Nov 14, 2017

If you want to reach also new developers with your symfony documentation (and I think you want), it should be added. Especally, if I remember correctly, you recommend the use of utf8mb4, so you should also explain the pitfalls here.
And as a Developer I don't want to know about the technical speifications of a DBMS, if I can use an ORM. But .. It's just a suggestion.

Fruchuxs commented Nov 14, 2017

If you want to reach also new developers with your symfony documentation (and I think you want), it should be added. Especally, if I remember correctly, you recommend the use of utf8mb4, so you should also explain the pitfalls here.
And as a Developer I don't want to know about the technical speifications of a DBMS, if I can use an ORM. But .. It's just a suggestion.

@nrobinaubertin

This comment has been minimized.

Show comment
Hide comment
@nrobinaubertin

nrobinaubertin Nov 17, 2017

Another option is to enable innodb-large-prefix for the database and set the row_format to dynamic.
Here are the options I used to start mariadb: (documentation)

mysqld --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --innodb-file-format=Barracuda --innodb-file-per-table=ON --innodb-large-prefix=1

And here is the config I'm using for doctrine:

doctrine:
    dbal:
        driver:   pdo_mysql
        host:     "%database_host%"
        port:     "%database_port%"
        dbname:   "%database_name%"
        user:     "%database_user%"
        password: "%database_password%"
        charset: utf8mb4
        default_table_options:
            charset: utf8mb4
            collate: utf8mb4_unicode_ci
            row_format: DYNAMIC

nrobinaubertin commented Nov 17, 2017

Another option is to enable innodb-large-prefix for the database and set the row_format to dynamic.
Here are the options I used to start mariadb: (documentation)

mysqld --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --innodb-file-format=Barracuda --innodb-file-per-table=ON --innodb-large-prefix=1

And here is the config I'm using for doctrine:

doctrine:
    dbal:
        driver:   pdo_mysql
        host:     "%database_host%"
        port:     "%database_port%"
        dbname:   "%database_name%"
        user:     "%database_user%"
        password: "%database_password%"
        charset: utf8mb4
        default_table_options:
            charset: utf8mb4
            collate: utf8mb4_unicode_ci
            row_format: DYNAMIC
@bogdaniel

This comment has been minimized.

Show comment
Hide comment
@bogdaniel

bogdaniel Apr 4, 2018

Contributor

Seems like table ext_translations suffers from this change to doctrine :-)

root@05e082b08efd:/application/public# php bin/console doctrine:schema:create

 !
 ! [CAUTION] This operation should not be executed in a production environment!
 !

 Creating database schema...


In ToolsException.php line 34:

  Schema-Tool failed with Error 'An exception occurred while executing 'CREATE TABLE ext_translations (id INT AUTO_INCREMENT NOT NULL, locale VARCHAR(8) NOT NULL, object_class VARCHAR(255) NOT NULL, field VAR
  CHAR(32) NOT NULL, foreign_key VARCHAR(64) NOT NULL, content LONGTEXT DEFAULT NULL, INDEX translations_lookup_idx (locale, object_class, foreign_key), UNIQUE INDEX lookup_unique_idx (locale, object_class, f
  ield, foreign_key), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ENGINE = InnoDB':

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes' while executing DDL: CREATE TABLE ext_translations (id INT AUTO_INCREMENT NOT NULL, locale VA
  RCHAR(8) NOT NULL, object_class VARCHAR(255) NOT NULL, field VARCHAR(32) NOT NULL, foreign_key VARCHAR(64) NOT NULL, content LONGTEXT DEFAULT NULL, INDEX translations_lookup_idx (locale, object_class, forei
  gn_key), UNIQUE INDEX lookup_unique_idx (locale, object_class, field, foreign_key), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ENGINE = InnoDB


In AbstractMySQLDriver.php line 121:

  An exception occurred while executing 'CREATE TABLE ext_translations (id INT AUTO_INCREMENT NOT NULL, locale VARCHAR(8) NOT NULL, object_class VARCHAR(255) NOT NULL, field VARCHAR(32) NOT NULL, foreign_key
  VARCHAR(64) NOT NULL, content LONGTEXT DEFAULT NULL, INDEX translations_lookup_idx (locale, object_class, foreign_key), UNIQUE INDEX lookup_unique_idx (locale, object_class, field, foreign_key), PRIMARY KEY
  (id)) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ENGINE = InnoDB':

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes


In PDOConnection.php line 106:

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes


In PDOConnection.php line 104:

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes


doctrine:schema:create [--dump-sql] [--em [EM]] [-h|--help] [-q|--quiet] [-v|vv|vvv|--verbose] [-V|--version] [--ansi] [--no-ansi] [-n|--no-interaction] [-e|--env ENV] [--no-debug] [--] <command>

root@05e082b08efd:/application/public#
Contributor

bogdaniel commented Apr 4, 2018

Seems like table ext_translations suffers from this change to doctrine :-)

root@05e082b08efd:/application/public# php bin/console doctrine:schema:create

 !
 ! [CAUTION] This operation should not be executed in a production environment!
 !

 Creating database schema...


In ToolsException.php line 34:

  Schema-Tool failed with Error 'An exception occurred while executing 'CREATE TABLE ext_translations (id INT AUTO_INCREMENT NOT NULL, locale VARCHAR(8) NOT NULL, object_class VARCHAR(255) NOT NULL, field VAR
  CHAR(32) NOT NULL, foreign_key VARCHAR(64) NOT NULL, content LONGTEXT DEFAULT NULL, INDEX translations_lookup_idx (locale, object_class, foreign_key), UNIQUE INDEX lookup_unique_idx (locale, object_class, f
  ield, foreign_key), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ENGINE = InnoDB':

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes' while executing DDL: CREATE TABLE ext_translations (id INT AUTO_INCREMENT NOT NULL, locale VA
  RCHAR(8) NOT NULL, object_class VARCHAR(255) NOT NULL, field VARCHAR(32) NOT NULL, foreign_key VARCHAR(64) NOT NULL, content LONGTEXT DEFAULT NULL, INDEX translations_lookup_idx (locale, object_class, forei
  gn_key), UNIQUE INDEX lookup_unique_idx (locale, object_class, field, foreign_key), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ENGINE = InnoDB


In AbstractMySQLDriver.php line 121:

  An exception occurred while executing 'CREATE TABLE ext_translations (id INT AUTO_INCREMENT NOT NULL, locale VARCHAR(8) NOT NULL, object_class VARCHAR(255) NOT NULL, field VARCHAR(32) NOT NULL, foreign_key
  VARCHAR(64) NOT NULL, content LONGTEXT DEFAULT NULL, INDEX translations_lookup_idx (locale, object_class, foreign_key), UNIQUE INDEX lookup_unique_idx (locale, object_class, field, foreign_key), PRIMARY KEY
  (id)) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ENGINE = InnoDB':

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes


In PDOConnection.php line 106:

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes


In PDOConnection.php line 104:

  SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes


doctrine:schema:create [--dump-sql] [--em [EM]] [-h|--help] [-q|--quiet] [-v|vv|vvv|--verbose] [-V|--version] [--ansi] [--no-ansi] [-n|--no-interaction] [-e|--env ENV] [--no-debug] [--] <command>

root@05e082b08efd:/application/public#

@javiereguiluz javiereguiluz added the hasPR label Jul 12, 2018

javiereguiluz added a commit that referenced this issue Jul 16, 2018

minor #10055 Mentioned the 767 bytes index limit and its solution (ja…
…viereguiluz)

This PR was squashed before being merged into the 2.8 branch (closes #10055).

Discussion
----------

Mentioned the 767 bytes index limit and its solution

This fixes #6398. Sadly this error is very common (I also faced myself several times) and the error message is cryptic, so let's document this problem and its solution.

Commits
-------

092b7c0 Mentioned the 767 bytes index limit and its solution
@javiereguiluz

This comment has been minimized.

Show comment
Hide comment
@javiereguiluz

javiereguiluz Jul 16, 2018

Member

Fixed by #10055.

Member

javiereguiluz commented Jul 16, 2018

Fixed by #10055.

Deuchnord added a commit to Deuchnord/mercredifiction that referenced this issue Sep 9, 2018

🐛 Fixed the length of the strings in the entities
As standed here: symfony/symfony-docs#6398 (comment), the utf8mb4-encoded databases in MySQL cannot have keys larger than 191 lengths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment