Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error 1264: Out of range value #681

Closed
mohsen3 opened this issue Nov 26, 2018 · 9 comments
Closed

Error 1264: Out of range value #681

mohsen3 opened this issue Nov 26, 2018 · 9 comments

Comments

@mohsen3
Copy link

mohsen3 commented Nov 26, 2018

We are trying to use gh-ost to add a new column to an existing table with a few hundreds of millions of rows and we are getting an Error 1264: Out of range value for column '...' at row 1 error message. Unfortunately, I cannot reproduce this error on my dev environment. In the prod environment, the migration goes fine until a new row is inserted into the table (so it seems to me that it has something to do with live updates, although I am just guessing since I am not too familiar with the MySql replication protocol that gh-ost uses).

The column that is causing this issue is of type Decimal(65, 30) and the values in there are pretty large (10**35-1 which is slightly smaller than the max possible value). Also in production, we have a master/slave setup that I don't have on my dev machine. Does any of these ring a bell? Is this a known issue in gh-ost?

@mohsen3
Copy link
Author

mohsen3 commented Nov 26, 2018

This is more of a question right now, but it seems that I don't have the permission to label it so.

@ggunson
Copy link
Contributor

ggunson commented Nov 26, 2018

@mohsen3 That would be from the live updates, specifically from gh-ost parsing the binary logs to apply the base table's writes to the _gho table.

This sounds like something we should test on our end.

@mohsen3
Copy link
Author

mohsen3 commented Nov 26, 2018

@ggunson Thank you for your quick response. This is a blocker for us right now. So, if there is something we can help with, please let me know. We are trying to reproduce it locally right now that may help you pin pointing the issue as well.

Also, does it sound like a reasonable theory to you? If yes, we can stop insertions into that table for a while and does the migration. It means readonly access to the table for a few hours (which is not ideal, but acceptable to us if it works). Do you have any documentation on that issue that helps us understand or debug the issue better?

@ggunson
Copy link
Contributor

ggunson commented Nov 26, 2018

Also, does it sound like a reasonable theory to you?

It does, because we've seen the same sort of error message in regards to the binary log parsing. For example, #157

Also, we don't use DECIMAL columns much at GitHub so if there's a bug we would likely not have seen it in our internal testing infrastructure.

If you're not altering that specific decimal column, I would guess that pausing writes to that table would allow gh-ost to run (or, if it failed it would be on something else).

Could you please provide (so we can test this):

  1. mysql version of master and replicas (if both are being used to run gh-ost)
  2. gh-ost version
  3. gh-ost command
  4. show create table for the table being migrated
  5. the column value it's failing on (if you know it)

@mohsen3
Copy link
Author

mohsen3 commented Nov 26, 2018

We are actually making those columns nullable (before migration, they are not nullable).
Here is the answers to your questions:

mysql version of master and replicas (if both are being used to run gh-ost)

5.6.39

gh-ost version

1.0.44

gh-ost command

./gh-ost-1.0.44 --database <db> --debug --table <table-name> --alter 'ADD col99 varbinary(90), ADD CONSTRAINT index_col99_and_col5 UNIQUE (`col5`, `col99`), MODIFY col12 DECIMAL(65,30) UNSIGNED NULL DEFAULT "0.000000000000000000000000000000", MODIFY col11 DECIMAL(65,30) UNSIGNED NULL DEFAULT "1.000000000000000000000000000000";' --host 127.0.0.1 --port 3306 --replica-server-id 1001831 --verbose --user="myuser" --password="mypass" --allow-on-master --allow-master-master --initially-drop-ghost-table --switch-to-rbr --initially-drop-socket-file --default-retries=2 --execute

show create table for the table being migrated

CREATE TABLE `table_name` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col0` int(11) NOT NULL,
  `col5` int(11) DEFAULT NULL,
  `col1` int(11) NOT NULL DEFAULT '0',
  `col12` decimal(65,30) unsigned NOT NULL DEFAULT '0.000000000000000000000000000000',
  `col11` decimal(65,30) unsigned NOT NULL DEFAULT '1.000000000000000000000000000000',
  `flag` tinyint(1) DEFAULT '1',
  `col13` tinyint(4) unsigned NOT NULL DEFAULT '64',
  `col14` tinyint(4) unsigned NOT NULL DEFAULT '64',
  `col20` bigint(11) unsigned DEFAULT NULL,
  `col2_id` int(11) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_post_col1` (`col5`,`col11`,`col1`),
  KEY `index2` (`col5`,`col12`,`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=81726321 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

the column value it's failing on (if you know it)

1e+35 (it's what's printed to the console, I am not sure if it is somehow rounded or not).

@alexdeschamps
Copy link

We ended up performing the decimal migration by avoiding live updates for the 1h migration. This removes any adverse effect for us at the moment. Is there anything that we can do to help drive closure to this issue? Here is an example binlog entry:

BINLOG '
Umr8WxMBAAAAXwAAAHwZEgAAAMwFAAAAAAEAFHRoZW1pc19kZXZlbG9wbWVudF8yABVpcmlzX2l0
ZW1faGllcmFyY2hpZXMACwMDAwP29gEBAQgDBEEeQR5EAok6XKE=
Umr8Wx4BAAAAeQAAAPUZEgAAAMwFAAAAAAEAAgAL//8A+iIAAED///8/AAAAQBAAAACAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAACF9eD/O5rJ/zuayf87msn/AAAAAAAAAAAAAAAAAAABQED/
//8/mNVkCQ==
'/*!*/;
### INSERT INTO `database_1`.`table_1`
### SET
###   @1=1073741858
###   @2=1073741823
###   @3=1073741824
###   @4=16
###   @5=0.000000000000000000000000000000
###   @6=99999999999999999999999999999999999.000000000000000000000000000000
###   @7=1
###   @8=64
###   @9=64
###   @10=NULL
###   @11=1073741823
# at 1186293
#181126 21:49:06 server id 1  end_log_pos 1186324 CRC32 0xb3d87d18      Xid = 75786
COMMIT/*!*/;
DELIMITER ;

@shlomi-noach
Copy link
Contributor

@alexdeschamps thank you for the input, the bug is confirmed in #684

@shlomi-noach
Copy link
Contributor

#684 seems to fix the DECIMAL problem. The problem was due to the underlying go-mysql library, which did not support DECIMAL correctly. This has recently been fixed in go-mysql-org/go-mysql#332, which I imported.

@shlomi-noach
Copy link
Contributor

fixed by #684

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants