Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utf8mb4 refactoring - remove duplicate code and fix utf8mb4 to utf8 statement downgrade #9847

Merged
merged 6 commits into from May 7, 2016

Conversation

richard67
Copy link
Member

@richard67 richard67 commented Apr 11, 2016

Pull Request for new issue.

Only relevant for mysql databases.

Summary of Changes

When implementation of utf8mb4 conversion started a few months ago, we (all who worked together on that) tried to make it work also when using Extension Installer to update Joomla! Core. So we impemented functions to check utf8mb4 support and downgrade "create table" or "alter table" not only in the db driver but also at several places in case of the update is running while still being connected to the old db driver.

Meanwhile this update method is not supported anymore, and with using the Joomla! Update component we can rely on the new, updated db driver being used when running schema updates, and so the duplicate functions are not necessary anymore.

This PR removes these duplicate function impementations and replaces calls to them by calling the corresponding db driver functions.

After this change, the 2 functions to perform the utf8mb4 conversion in script.php and the database schema manager (administrator/components/com_installer/models/database.php) are identical beside a message to check the db fixer in case if some error occurred, which makes sense to be shown on update but not when using the db fixer and so already being in this view. Because the schema manager already includes script.php and uses functions of it just a few lines above the conversion function is called, it is a logical step to make the schema manager use the conversion function from script.php, too.

So this PR also replaces the call to the local conversion function in the schema manager by a call to the one of scrip.php, and remove the duplicate local code. The function in script.php is extended by a parameter telling if the special message to check the db fixer is shown or not.

Finally, this PR corrects following 2 issues with the statement downgrade function for which I once created PR #9504 , which I will close soon in favour of this PR here:

  1. If in a "CREATE TABLE" or "ALTER TABLE" statement there is more than 1 space or a tab between the words "CREATE" or "ALTER" and "TABLE", or the keywords are not in uppercase, which is all still valid SQL, the statement will not be detected to be a "CREATE TABLE" or "ALTER TABLE" statement and so not be processed. This will lead to an "unknown character set" SQL error on a db not supporting utf8mb4 e.g. when installing extensions already using utf8mb4 in their sql for creating tables.
  2. If a "CREATE TABLE" or "ALTER TABLE" has been detected, the procedure is to simply replace "utf8mb4" by "utf8" in the statement, regardless if it occurs withint quoted text or name quotes. This makes it impossible to use the pattern "utf8mb4" within table or column names, table or column comments or column default values.

This PR fixes both issues.

Testing Instructions

Pre-requisites

This PR is only relevant for mysql databases.

It needs tests with database sever/client combinations which do not support utf8mb4, i.e. either the mysql server version is lower than 5.5.3, or a msqlnd client API with version lower than 5.0.9 is used, or another mysql-related client API with version lower than 5.5.3 is used.

Testers who have only database sever/client combinations which do support utf8mb4 are welcome to test that updating Joomla! core with any valid method or installing extensions already using utf8mb4 (e.g. Patchtester) still works the same as before. Please report in this case that you tested with utf8mb4 support.

Test 1: Joomla! Update component

Step 1: Update a Joomla! version 3.5.1 or lower to a 3.5.2-dev patched by this PR using following custom URL: http://test5.richard-fath.de/list_test10.xml.

If you want to continue later with Test 2, you should export your database before doing the update, so you can later use the database dump for Test 2.

Result: Update ends with success.

Step 2: Check "Extensions -> Manage -> Database".

Result:

Database table structure is up to date.
Other Information

  • Database schema version (in #__schemas): 3.5.1-2016-03-29.
  • Update version (in #__extensions): 3.5.2-dev.
  • Database driver: mysqli.
  • 94 database changes were checked successfully.
  • 145 database changes did not alter table structure and were skipped.

Step 3: Check database table character sets and collations with phpMyAdmin.

Result: All core tables have utf8mb4 or utf8 charset, depending on utf8mb4 support. All core table have unicode_ci collations, except of those for com_finder, which have general_ci.

Step 4: Check with phpMyAdmin the index idx_name of table #__users.

Result: The index idx_name of table #__users includes the name column with 100 chars.

Test 2: Database Schema Manager aka "DB fixer"

Step 1: Prepare to update a Joomla! version 3.5.1 or lower to a 3.5.2-dev patched by this PR with the "copy files and run db fixer" method.

If you have done Test 1 before and have exported the database before updating, just import back the exported database.

Otherwise use a clean 3.5.1 or lower and copy the files of this PR'S branch to the joomla folder. A zip of this branch can be found here: https://github.com/richard67/joomla-cms/archive/utf8mb4-refactoring-1.zip.

Step 2: Go to "Extensions -> Manage -> Database".

Result: Database problems shown for schema update files according to the pre-update schema version, last problem is that utf8mb4 (or utf8) conversion has not been done yet.

Step 3: Use the "Fix" button.

Result: See result of Test 1 Step 2.

Check database as described with Steps 3 and 4 of Test 1.

Test 3: Install extensions

Step 1: On current staging patched by this PR, or on the updated 3.5.2-dev patched by this PR which you have after Test 1 or Test 2, install some extension which already uses utf8mb4, e.g. the latest com_patchtester 2.0.0-rc.

Step 2: Check with phpMyAdmin that the database tables for this component have the correct character sets and collations depending on utf8mb4 support, e.g. for patchtester: utf8mb4_unicode_ci or utf8_unicode_ci.

Result: No SQL errors related to character sets or collations, database tables look as expected.

Test 4: Statement downgrade bug fix

This test only makes sense on a database server/client combination which does not support utf8mb4.

Step 1: Download following dummy extension and then install it, using "Extensions -> Manage -> Install", "Upload & Install Package": http://test5.richard-fath.de/test_package_utf8mb4.zip.

Result: The installation succeeds without any SQL errors.

Step 2: Check with phpMyAdmin that the package has created a new table #__test_package_utf8mb4 with default character set=utf8 and default collation = utf8_unicode_ci.

Step 3: Check with phpMyAdmin the structure of this new table.

Result: The table has following 2 columns:

  • utf8mb4_test_1 varchar(40) NOT NULL DEFAULT 'Do not replace utf8mb4 in quoted text.' COMMENT 'Do not replace utf8mb4 in quoted text.',
  • utf8mb4_test_2 varchar(7) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '#fff' COMMENT 'Do not replace utf8mb4 in quoted text.'

Step 4: Uninstall the package to clean up if you need your test system for other tests.

Step 5: Repeat Steps 1 to 4 on an unpatched staging or 3.5.1 or 3.5.0.

Result: Table names, column names, column default values and column comments are wrong because "utf8mb4" has been replaced everywhere by "utf8".

@brianteeman
Copy link
Contributor

I have completed tests 1,2,3 on mysql 5.5.42 with utf8mb4 support

Thanks for the detailed instructions. It looks like a lot but it took me about 15 minutes
This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9847.

@richard67
Copy link
Member Author

Thanks for testing, Brian.

@richard67
Copy link
Member Author

@andrepereiradasilva Could you help with testing this PR?

As Brian already found out, it does not take long time. The description is only this long because I try to explain what this PR does.

I need especially some test on a mysql database not supporting utf8mb4.

You have such? Or only such supporting utf8mb4?


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9847.

@andrepereiradasilva
Copy link
Contributor

i don't have a non utf8mb4 server now. so can't test. Maybe others can help with testing.

@richard67
Copy link
Member Author

@andrepereiradasilva OK, thanks for telling. Hope we find someone else.

@richard67
Copy link
Member Author

@brianteeman I just rebased the branch of this PR to current staging and also updated the zip container behind my custom URL for testing, so it can be tested now with the new Joomla! Update features recently merged into staging.

If you want you can test again, but I leave it up to you, since you have a mysql supporting utf8mb4. I guess that was the reason why you not have not marked any test result, because it needs non-utf8mb4 for reliable testing.

But thanks a lot for your test, it confirms at least that my PR does not break anything for environments with utf8mb4 supporting databases.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9847.

Correct the sql statement utf8mb4 to utf8 downgrade function of the db
driver to be tolerant regarding spaces or case in alter table statement
and not replace utf8mb4 by utf8 in qouted text or table or column names.
Remove the local implementations of db driver functions for utf8mb4
stuff which once were added to support updating Joomla! Core with
extension installer, and use db driver functions instead, since we can
rely on the new db driver being available when using Joomla! Update
component.
The schema manager uses script.php anyway just a few lines above calling
the local conversion function, so it can use also the conversion
function of script.php instead of a local one. Both functions are equal
anyway except of the special message text used in script.php, and this
also can be used for the schema manager, the message text might have to
be changed maybe for that.
This call to prepareUtf8mb4StatusTable() is not necessary anymore since
recently the same call to this function had been added to populateStatem
which runs before the changeset is created and so does all what is
needed.
Remove the local implementations of db driver functions  for utf8mb4
stuff which once were added to support  updating Joomla! Core with
extension installer, and use db driver functions instead, since we can
rely on the new db driver being available when using Joomla! Update
component.
Show message that db fixer should be checked only if requested by a new
parameter flag, e.g. not show it if already in the database fixer.
@ghost
Copy link

ghost commented Apr 15, 2016

@richard67 MySQL connection collation is set on utf8mb4_general_ci. So i can't test with this System cause you need a non-utf8mb4_general_ci System.

@richard67
Copy link
Member Author

@franz-wohlkoenig Yes, seems not many people have such anymore. But thanks for checking.

@waader
Copy link
Contributor

waader commented Apr 15, 2016

@richard67 Is the combination of MySQL-Client-Version: 5.1.47 with Server Version: 5.1.47 something you are looking for?

@richard67
Copy link
Member Author

@waader Oh yes, that's perfect. As brian pointed out a few items above, it does not need much time to test if you go through the instructions step by step. The descriptions are this long only because I wanted to explain all I did. So if you find time for testing would be nice.

@waader
Copy link
Contributor

waader commented Apr 15, 2016

Ok. Give me some minutes.

@waader
Copy link
Contributor

waader commented Apr 15, 2016

@richard67 Sorry, the php version does not meet the requirements. I looked through some of my virtual machines, but couldn´t find the right combination. I will search for a solution tomorrow.

@richard67
Copy link
Member Author

@waader Thanks for trying. Maybe you could use a new PHP with a new client API where Joomla! files are, and have the db on another machine, e.g. a VM, using an old mysql sever version. The PHP version on this db server would not matter then. For Joomla! it will be a client API which supports utf8mb4 (because newer PHP) looking on a database not supporting it (because too old mysql version), and so the result will be no utf8mb4 support, which is what is needed here. So maybe just use 2 of your VMs: The one with new PHP version for Joomla! files (= db client) and the other one with old stuff for having the old database server.

@waader
Copy link
Contributor

waader commented Apr 16, 2016

@richard67 Finally I managed to test your patch with mysql v. 5.5.2 and I could reproduce the single steps with one exception. In step 4 your line

utf8mb4_test_1 varchar(40) NOT NULL DEFAULT 'Do not replace utf8mb4 in quoted text.' COMMENT 'Do not replace utf8mb4 in quoted text.',

looks a bit different than mine:

utf8mb4_test_1 varchar(40) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'Do not replace utf8mb4 in quoted text.' COMMENT 'Do not replace utf8mb4 in quoted text.'

Does that matter?

@richard67
Copy link
Member Author

@waader No, that doesn't matter. IF you do an export (database dump) to SQL with phpMyAdmin, it adds this information (collation) to the columns. What you get shows even better that all works as it should (change charsets and collations if no utf8mb4 support but not change column or table names or quoted text).

So from my point of view a good test.

@waader
Copy link
Contributor

waader commented Apr 16, 2016

I have tested this item ✅ successfully on 85c0f77

Thanks Richard for your efforts!


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9847.

@richard67
Copy link
Member Author

Thanks a lot for testing. Be prepared for a new PR coming after this one is merged, for extending the utf8mb4 conversion to handle extensions, too, and to have versioned files, i.e. not run all completely again when some change e.g. between 3.5.3 and 3.5.4 in future.

@richard67
Copy link
Member Author

@wilsonge Still no time for testing this?

@brianteeman
Copy link
Contributor

brianteeman commented Apr 23, 2016 via email

@richard67
Copy link
Member Author

@brianteeman Ah thanks for the info. Hope he has a nice holiday. Hard to find someone to test this with a non-utf8mb4 capable db.

@richard67
Copy link
Member Author

@brianteeman Do you have any idea when @wilsonge will be back? You know, he once promised me to test this PR, and now it seems it will not be tested before 3.6. Or do you know if this will be obsolete because there are things being changed for 3.6 regarding utf8mb4 comversion of which I don't know?

@brianteeman
Copy link
Contributor

Sorry I dont know

On 2 May 2016 at 09:11, Richard Fath notifications@github.com wrote:

@brianteeman https://github.com/brianteeman Do you have any idea when
@wilsonge https://github.com/wilsonge will be back? You know, he once
promised me to test this PR, and now it seems it will not be tested before
3.6. Or do you know if this will be obsolete because there are things being
changed for 3.6 regarding utf8mb4 comversion of which I don't know?


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#9847 (comment)

Brian Teeman
Co-founder Joomla! and OpenSourceMatters Inc.
http://brian.teeman.net/

@wilsonge
Copy link
Contributor

wilsonge commented May 7, 2016

I have tested this item ✅ successfully on 85c0f77

Tested methods 1 to 3 successfully. Also tested a 3.4.8 to this pr as a test also sucessfully. RTC


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9847.

@richard67
Copy link
Member Author

@waader @wilsonge Thanks for testing.

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label May 7, 2016
@Kubik-Rubik Kubik-Rubik added this to the Joomla 3.6.0 milestone May 7, 2016
@Kubik-Rubik
Copy link
Member

Thank you @richard67 and testers!

@Kubik-Rubik Kubik-Rubik merged commit 34e238b into joomla:staging May 7, 2016
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label May 7, 2016
@richard67 richard67 deleted the utf8mb4-refactoring-1 branch May 7, 2016 16:38
wilsonge added a commit to joomla-framework/database that referenced this pull request Aug 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants