Checksum mismatches when restoring between versions #860
Comments
@stephanie-dba THANK YOU!!! This bug report is so complete! I will be glad to fix this for the next release.
@davidducos Thank you! This is a great tool and I am so glad I found it. One thing to note that may be obvious, but I thought I should mention it: I'm not sure how the "standard" widths get assigned to INT column types (I'm new to MySQL; I have an MSSQLSERVER background), but according to the documentation https://dev.mysql.com/doc/refman/8.0/en/numeric-type-attributes.html, the width can be set to anything. So my REPLACE worked for our instance, but ideally you'd just remove anything in () after the initial INT in the table definitions. Again, probably obvious, but... :) Thanks so much!
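The suggestion above, dropping whatever display width follows an integer type name instead of replacing specific widths, could be sketched as follows. This is only an illustration in Python, not mydumper's code: `normalize_column_type` is a hypothetical helper, and the list of integer type names is an assumption based on MySQL's integer types.

```python
import re

# Matches an integer type name at the start of COLUMN_TYPE followed by a
# display width, e.g. "int(11)" or "bigint(20) unsigned".
# Assumption: only these integer type names need handling.
_INT_WIDTH = re.compile(r"^((?:tiny|small|medium|big)?int)\(\d+\)", re.IGNORECASE)


def normalize_column_type(column_type: str) -> str:
    """Strip the display width from an integer COLUMN_TYPE; leave other types alone."""
    return _INT_WIDTH.sub(r"\1", column_type)


if __name__ == "__main__":
    for value in ("int(11)", "bigint(20) unsigned", "tinyint(4)", "varchar(255)"):
        print(value, "->", normalize_column_type(value))
```

For the comparison to be meaningful, the same normalization would have to be applied on both the source and the target side before checksumming.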
@davidducos I found a couple of other checksum issues while going over the rest of the loader log: the triggers checksum and the indexes checksum. This could be related to character set and collation, as 8.0 defaults to utf8mb4 and my 5.7 instance was latin1. Or it could be because 8.0 no longer sorts GROUP BY results implicitly: https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-sql-changes. Regardless, I was able to get them to match by adding an ORDER BY clause. Anyway, I think that gets me through all of the issues I had during this process, but if anything else comes up, I'll report back here. Thank you!
Describe the Issue
I am using mydumper/myloader to upgrade from MySQL 5.7 to 8.0 (our cloud hosting provider does not offer in-place upgrade). When loading into MySQL 8.0 using checksums, I receive warnings like this:
[WARNING] - Structure checksum mismatch found for database.table. Got 'a5a324c6', expecting '2C313EF' in file: database.table-schema-checksum.
Further investigation uncovered that this is because of the difference in the way our integer columns are defined. In the dump from 5.7, they are defined as tinyint(4), int(11), and bigint(20), with precision/display width included. In 8.0, the display width is deprecated and they are loaded as simply tinyint, int, and bigint. This throws off the checksum value because the COLUMN_TYPE used in the calculation is different between versions.
To Reproduce
Commands executed:
mydumper --host hostname --user root --password pw --database dbname --outputdir dirpath/filename --ssl --compress --routines --triggers --data-checksums --schema-checksums --routine-checksums --trx-consistency-only --verbose 3 --threads 12
myloader --host hostname --user root --password pw --database dbname --directory dirpath/filename --ssl --disable-redo-log --innodb-optimize-keys --verbose 3 --threads 12 --max-threads-per-table 8 --max-threads-for-index-creation 8 --overwrite-tables
Expected behavior
This may be an edge case of how you expect this tool to be used. However, it would be nice if the checksum validation could account for the difference in definition. (See additional context for my rudimentary workaround).
How to repeat
CREATE TABLE test_table (
  id int(11) NOT NULL AUTO_INCREMENT,
  name varchar(255) DEFAULT NULL,
  PRIMARY KEY (id) -- an AUTO_INCREMENT column must be defined as a key
) ENGINE = INNODB, AUTO_INCREMENT = 1, CHARACTER SET latin1, COLLATE latin1_swedish_ci;
Dump and load that table using --schema-checksums. On load, myloader logs a structure checksum mismatch warning, which may lead the user to believe that something went wrong during the load.
Environment:
Additional context
I was able to get around this by modifying the checksum query you posted here: (#361) to account for the difference in column definition syntax:
SELECT table_name, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS(COLUMN_NAME, ORDINAL_POSITION, DATA_TYPE, REPLACE(REPLACE(REPLACE(COLUMN_TYPE, 'int(11)', 'int'), 'bigint(20)', 'bigint'), 'tinyint(4)', 'tinyint'))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = 'schema_name' GROUP BY table_name ORDER BY table_name;
I ran this in source and target databases and manually validated that the checksums matched for each table when the COLUMN_TYPE value was standardized. They did.
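The comparison described above can be imitated outside the database. The sketch below is a Python approximation of the workaround query, not mydumper's implementation: `structure_crc` and `strip_int_width` are hypothetical names, the column rows are written by hand to mirror test_table as 5.7 and 8.0 would report it in information_schema.COLUMNS, and it assumes that Python's `zlib.crc32` and MySQL's `CRC32()` agree for the same bytes (the sketch only depends on comparing the two sides, not on the absolute values).

```python
import re
import zlib

_INT_WIDTH = re.compile(r"^((?:tiny|small|medium|big)?int)\(\d+\)", re.IGNORECASE)


def strip_int_width(column_type):
    """Remove an integer display width such as the "(11)" in "int(11)"."""
    return _INT_WIDTH.sub(r"\1", column_type)


def structure_crc(columns, normalize=False):
    """XOR the CRC32 of every column description and return it as lowercase hex.

    columns: iterable of (column_name, ordinal_position, data_type, column_type).
    """
    crc = 0
    for name, position, data_type, column_type in columns:
        if normalize:
            column_type = strip_int_width(column_type)
        # CONCAT_WS takes the separator first; the query passes COLUMN_NAME there,
        # so the remaining values are joined with the column name between them.
        text = name.join([str(position), data_type, column_type])
        crc ^= zlib.crc32(text.encode("utf-8"))
    return format(crc, "x")


# Hand-written rows for test_table (assumed, not queried from a server).
source_57 = [("id", 1, "int", "int(11)"), ("name", 2, "varchar", "varchar(255)")]
target_80 = [("id", 1, "int", "int"), ("name", 2, "varchar", "varchar(255)")]

if __name__ == "__main__":
    print("raw:       ", structure_crc(source_57), structure_crc(target_80))
    print("normalized:", structure_crc(source_57, True), structure_crc(target_80, True))
```

Because BIT_XOR is order-independent, the row order does not matter here; only the COLUMN_TYPE text changes the result, which is why normalizing it on both sides makes the values line up.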