New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Native JSON type instead of LONGTEXT for MariaDB #5916
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this feature.
246bf28
to
355c62a
Compare
0c46972
to
fc88321
Compare
src/Platforms/MariaDBPlatform.php
Outdated
* | ||
* @return array{string, string} | ||
*/ | ||
public function getColumnTypeSQLSnippets(string $tableAlias = 'c'): array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this override happen in MariaDb1043Platform
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it in the base MariaDbPlatform to allow flexibility to choose between JSON and LONGTEXT columns for json types. It calls the parent where LONGTEXT is chosen and the executable comments prevent the additional SQL running on versions of MariaDb that don't support it. In other words, I saw adding the method to the base platform as a flexible way of supporting json columns on all versions of MariaDb. If it's confusing, I can move it to the specific platform (i.e. 1043).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's move it. This also allows us to get rid of those conditional comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you have time to make this adjustment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. I may have time tomorrow but more likely at the weekend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the pull request as follows:
-
I've moved the
getColumnTypeSQLSnippets()
method toMariaDb1043Platform
. Since it made little sense to override certain methods inMariaDBPlatform
and others inMariaDb1043Platform
, I also movedgetListTableColumnsSQL()
andgetColumnTypeSQLSnippets()
meaning there are no changes to the baseMariaDBPlatform
class. -
I added a couple of tests to ensure MariaDb's treatment of collation and character set did not cause problems (MariaDb forces collation to binary and reports it to doctrine so without these patches, the comparator sees a change from the general collation it was expecting and was generating incorrect migrations).
-
I spent time tracking through where this enforced collation triggered the unnecessary migrations and, as a result, moved the fix from the originally private
columnToArray()
method togetColumnDeclarationSQL()
, the column parameter for which was generated bycolumnToArray()
. This is the method called bycolumnsEqual()
to detect table changes. The amended patch is functionally equivalent to the earlier version. I still so no way of moving this into the schema manager as columnsEqual() is called on the metadata based table and the introspected table and the schema manager doesn't see the metadata based table. -
Once the collation issues are dealt with, contrary to what I'd said earlier, no migration is generated when moving from the old to new platforms, i.e. existing json tables are not upgraded to native JSON. This is because the columns are marked with DC2Type comments which override detection of underlying column types i.e. old and new columns appear equal to the comparator. I have added a final patch (it currently has effect only for MariaDb1043 platforms) which also compares the reported database data type where DC2Types are specified. This enables the comparator to detect the LONGTEXT to JSON change. If similar issues occur in future, the functionality could be activated for other platforms by overriding the new
AbstractPlatform->columnDeclarationsMatch()
method but the relevant platform schema manager would also need patching to detect the schema change and I have only done this forMySQLSchemaManager
in the proposed patch.
I have kept the above separate to the original patches to make your review easier (at least, that's what I intended). If you're okay with the proposals, I'd be happy to squash items 1, 2 and 3 into the original patches but I think 4 makes sense kept separate or, perhaps, dropped and users would then either have to manually migrate MariaDb JSON columns or live with the old LONGTEXT declarations for old tables.
Sorry this has taken me a while to amend - it turned out to be rather more involved than I at first thought and I wanted to make sure I had tested it locally before burdening you.
Sorry that this PR is taking so long, but my open source time is limited at the moment and I need to get my head around how MariaDB treats So basically, if I tell MariaDB to create a column of type |
Exactly right. And in addition, MariaDb also forces column collation to utf8mb4_bin irrespective of table or column specification so collation and charset need to be ignored when introspecting a MariaDb JSON column as they will likely not match what the dbal thinks they should be. |
Okay, got it. The MariaDB folks really chose an odd way of implementing JSON columns. 😓 Assuming an application has created a table with JSON columns using a previous version of DBAL and generates a migration after your patch has been applied, what would happen? The table would have |
After the patch, the dbal would detect that schema no longer matched the definition and the generated migration would alter the table from LONGTEXT to JSON (and MariaDb would then automatically add the CHECK constraint and change the column collation). |
f2ff1ad
to
ae4d249
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The part where we invalidate the definition of a column that has been inferred from a type comment looks really odd to me.
@greg0ire, I think you've worked on phasing out the type comments lately. Can you have a look at this PR please?
src/Driver/AbstractMySQLDriver.php
Outdated
if ($mariadb && version_compare($this->getMariaDbMysqlVersionNumber($version), '10.4.3', '>=')) { | ||
return new MariaDb1043Platform(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just merged #5946 which introduces a platform class for MariaDB 10.5.2. We need to adjust the inheritance chain accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. About to rebase and push.
/** | ||
* Whether the database data type matches that expected for the doctrine type | ||
*/ | ||
public function columnDeclarationsMatch(Column $column1, ?Column $column2 = null): bool | ||
{ | ||
$column2 ??= $column1; | ||
|
||
return ! ( | ||
$column1->hasPlatformOption('declarationMismatch') || | ||
$column2->hasPlatformOption('declarationMismatch') | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does not appear to be specific to MariaDB at all. We're merely checking for a flag that the schema manager might have set during comparison, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right that the code is generic. Moved to AbstractPlatform and the matching test case to AbstractPlatfformTestCase.
src/Platforms/AbstractPlatform.php
Outdated
* | ||
* To be implemented in specific platforms as necessary. | ||
*/ | ||
public function columnDeclarationsMatch(Column $column1, ?Column $column2 = null): bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from tests: When is this method ever called with one parameter or with the second parameter set to null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method signature amended to require two parameters.
ae4d249
to
60e96e4
Compare
Yes, it was in #5107 My understanding is that this PR has code to deal with such comment, but does not rely on them, and will work on 4.0.x as well. After merging it up, there will be useless code that should be removed on 4.0.x, correct? |
That sounds right. This bit of the code is only necessary because the DC2Type overrides the underlying LONGTEXT type with JSON which, in this case, is the same as the desired type so the comparator sees JSON to JSON i.e. no change. Without the DC2Type, the comparator would see LONGTEXT to JSON and the migration would be generated, i.e. removing the DC2Type solves the problem rendering part of the patch irrelevant for 4.0.x. |
60e96e4
to
efe0225
Compare
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3, MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid JSON and sets the column collation to utf8mb4_bin. Simply setting JSON as the column type in the platform results in introspection failures as the database reports LONGTEXT (not JSON) with changed collation. Therefore, inverse the aliasing where the relevant check constraint exists and ignore collation changes for JSON columns in the relevant database versions. The main functional changes are in: - MariaDB1043Platform->getListTableColumnsSQL() and - MySQLSchemaManager->selectTableColumns() MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform) continue to work as expected. The alternative approach of adding the CHECK constraint to the SQL for the column is not robust as MariaDb syntax seems to require the CHECK constraint to be the final part of the column declaration and it was not obvious how to guarantee this (specifically, appending the CHECK constraint to the output of getJsonTypeDeclarationSQL() did not work). New public methods: MariaDBPlatform->getColumnTypeSQLSnippets() generates SQL snippets to reverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is also used in MySQLSchemaManager. It still checks that json columns are JSON data type so that switching the getJsonTypeDeclarationSQL to LONGTEXT is all that is necessary to revert to old behaviour which is helpful for testing. Overridden methods in MariaDb1043Platform: MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON. MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT aliasing carried out by MariaDb. MariaDBPlatform->getColumnDeclarationSQL(). To unset collation and charset (used by the comparator). New test cases: 1. Types/JsonTest. Test storage and retrieval of json data. 2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test. 3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures introspected table matches original table. Tests all doctrine types not just json. Based on failure highlighted in pull request doctrine#5100. For previous discussion on this topic, see: doctrine#5100 doctrine#3202 Further background at: https://mariadb.com/kb/en/json-data-type/ https://mariadb.com/kb/en/information-schema-check_constraints-table/ https://jira.mariadb.org/browse/MDEV-13916 Notes regarding JsonTest.php: 1. Some platforms return json as a stream so convert streams to strings. 2. 'json_table' is a reserved word in MySQL (and a function name in Oracle SQL). Use json_test_table instead. 3. Some platforms reorder json arrays so the insert order will not necessarily match the select order. Resort the arrays before testing for equality.
Select MariaDb1043Platform where MariaDb database is at version 10.4.3 or later.
MariaDb sets character set to utf8mb4 and collation to utf8mb4_bin for columns declared as JSON but still reports them as LONGTEXT with CHARACTER SET and COLLATE so comparator needs to ignore character set and collation for columns declared as JSON. Attempting to set CHARACTER SET or COLLATE for values other than utf8mb4 and utf8mb4_bin causes a database error on MariaDb. Test that character set and collation specifications are not passed to the database and are ignored by the comparator.
MariaDb JSON columns are marked as json with a DC2Type comment hint. The schema manager maps such columns to a json doctrine type which matches the metadata whether the column was declared as LONGTEXT (by the previous MariaDb1027 platform) or JSON. This means the comparator will not detect that a column should be upgraded from LONGTEXT to native JSON and no migration would be generated. Therefore, during column introspection check that the data type of the introspected columns match that which should be set for any doctrine types inferred from DC2Type comments. Set a flag (a platformOption for the relevant column) where it does not. Check whether the flag is set when the comparator diffs the expected and actual tables and mark a column type difference where the flag is set so a migration is generated.
efe0225
to
04c3330
Compare
| Q | A |------------- | ----------- | Type | bug | Fixed issues | contao/contao#6409 #### Summary Since #5916 some column configurations always show up as changed in the Comparator. For example a `Types::SIMPLE_ARRAY, ['length' => 255]` column. This pull request moves the `->expectedDbType()` check to the correct position to fix this issue and adds a test that verifies the fix.
This is a breaking change because MariaDB will throw failed I realize it sound a bit strange to complain that I cannot store invalid JSON anymore, however the default value of empty string was supported before and was a nice convenience. |
This is why I would not consider this as a breaking change. If you declare a column as JSON, storing JSON is covered by the contract of this library. Storing something else is not.
Use |
Summary
Adds a new MariaDb1043Platform with JSON type instead of LONGTEXT to be used for MariaDb versions at or above 10.4.3 when additional functionality specific to JSON columns was added.
Deals with introspection issue flagged in pull request 5100 (the previous attempt to add this functionality) by inversing the JSON to LONGTEXT aliasing on table introspection (if applicable based on platform and MariaDb version).
Tested against MariaDb 10.6.11. See commit message for further details.