Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parsing mysqlbinlog output #3

Closed
shlomi-noach opened this issue Mar 23, 2016 · 16 comments
Closed

parsing mysqlbinlog output #3

shlomi-noach opened this issue Mar 23, 2016 · 16 comments

Comments

@shlomi-noach
Copy link
Contributor

As per #1, I can see or think of the following:

  • Generally, a binlog entry starts with # at 123456 and the following entry has an end_log_pos
  • But sometimes an entry looks like this: ### Row event for unknown table ... at 123456
  • For now both are handled, but are there further such cases?
    • For now this matters, because I'm using the end_log_pos from previous entry to validate that the current entry has same number. It's just a self test
    • But if we cannot use this self test -- that's also OK, as long as we don't miss out on anything
  • At this time we don't even bother checking for rolledback transactions or halfway-through transactions
    • This may be an issue, making parsing of mysqlbinlog less ideal than the already less-than-ideal state. We'll see.
  • How do we tail a binlog file? The size of the binlog file does not necessarily indicate an end-pos, because it might be only halfway-written throughout the current transaction.
    • So we would need to connect to the replica and issue SHOW MASTER STATUS or SHOW BINARY LOGS, as the output of these commands does indicate a true position as which a statement/entry is complete.
@shlomi-noach
Copy link
Contributor Author

Related: #4

@shlomi-noach
Copy link
Contributor Author

I'm displeased at this time. In testing all seems to go well (See #4), but on production, parsing a true binary log file yields with different (smaller) number of statements as compared to raw binlog parsing.
I will pursue checking which entries are differing, however I'm getting the nasty feeling there will be more hidden gotchas to parsing the binary log file properly.

@shlomi-noach
Copy link
Contributor Author

Displease turns to satisfaction:

shlomi-noach@my-test-machine~$ /tmp/gh-osc --debug --mysql-basedir=/usr --mysql-datadir=/home/shlomi-noach/tmp/ --binlog-file=mysql-bin.012323 --internal-experiment=true 2> /dev/null > tmp/statements-gh4.sql
shlomi-noach@my-test-machine~$ /usr/bin/mysqlbinlog --verbose --base64-output=DECODE-ROWS --start-position=4  /home/shlomi-noach/tmp/mysql-bin.012323 | egrep "### (INSERT|UPDATE|DELETE)" | sed -e "s/### //" -e "s/INTO //" -e "s/FROM //" > tmp/statements-mbl4.sql
shlomi-noach@my-test-machine~$ wc -l tmp/statements-gh4.sql
2214601 tmp/statements-gh4.sql
shlomi-noach@my-test-machine~$ wc -l tmp/statements-mbl4.sql
2214601 tmp/statements-mbl4.sql
shlomi-noach@my-test-machine~$ diff tmp/statements-mbl4.sql tmp/statements-gh4.sql
shlomi-noach@my-test-machine~$ md5sum tmp/*4.sql
3ca149d5f53bbb83eb9467ee89adb0fe  tmp/statements-gh4.sql
3ca149d5f53bbb83eb9467ee89adb0fe  tmp/statements-mbl4.sql

So for now this is something we can work with, on production data

@shlomi-noach
Copy link
Contributor Author

LOL string escaping:

### INSERT INTO `test`.`samplet`
### SET
###   @1=11
###   @2=11
###   @3='I'm here'

@shlomi-noach
Copy link
Contributor Author

Uggggggh!

A REPLACE translates into DELETE+UPDATE such that the UPDATE modifies the PK value!

### DELETE FROM `test`.`samplet`
### WHERE
###   @1=2
### UPDATE `test`.`samplet`
### WHERE
###   @1=4
### SET
###   @1=2
###   @2=4
###   @3='replaced 2,4'

This is no fun. This means when we wish to apply the change on the ghost table, we need to verify whether there's been a change to PK (as in the above, so easy to detect). If not, good life, we just REPLACE INTO ghost_table SELECT * FROM original_table WHERE ...

But if there is a change to PK value, we need to DELETE the old row and INSERT a new one.
I would think that in RBR we wouldn't need to do this, but seems like we do.

cc @ggunson

@jonahberquist
Copy link
Contributor

A REPLACE translates into DELETE+UPDATE such that the UPDATE modifies the PK value!

Huh, that's weird. What statement did you run that ended up with this in the binary log? I'd love to see the SQL somewhere to reproduce the test cases.

@ggunson
Copy link
Contributor

ggunson commented Mar 24, 2016

A REPLACE translates into DELETE+UPDATE such that the UPDATE modifies the PK value!

That sounds really, really wrong. MySQL RBR has gone rogue and is thinning the row herd

@shlomi-noach
Copy link
Contributor Author

@jonahberquist see https://github.com/github/gh-osc/pull/4/files#diff-0d909eaf5269b40f9d9afe5ef9fae52cR2

The replace into samplet values (2,4,'replaced 2,4'); is the one that translates into a DELETE+UPDATE, as can be seen further down the file on https://github.com/github/gh-osc/pull/4/files#diff-0d909eaf5269b40f9d9afe5ef9fae52cR146

@ggunson
Copy link
Contributor

ggunson commented Mar 24, 2016

Well, that is what it's supposed to be doing, yes? I created a test samplet table and tried it out: it comes from there being a unique index on license as well as a PK, and the REPLACE hits duplicates on both, and I guess this is what it's expected to do?

mysql> select * from samplet;
+----+---------+-------------+
| id | license | name        |
+----+---------+-------------+
|  1 |       1 | a           |
|  2 |       2 | extended    |
|  3 |       3 | extended    |
|  4 |       4 | transaction |
|  5 |       5 | transaction |
|  6 |       6 | transaction |
+----+---------+-------------+
6 rows in set (0.00 sec)

mysql>     replace into samplet values (2,4,'replaced 2,4');
Query OK, 3 rows affected (0.00 sec)

mysql> select * from samplet;
+----+---------+--------------+
| id | license | name         |
+----+---------+--------------+
|  1 |       1 | a            |
|  2 |       4 | replaced 2,4 |
|  3 |       3 | extended     |
|  5 |       5 | transaction  |
|  6 |       6 | transaction  |
+----+---------+--------------+
5 rows in set (0.00 sec)

@jonahberquist
Copy link
Contributor

Yeah, I agree with @ggunson. This makes a lot more sense seeing the unique index on column 2. The row with the unique key getting updated and the row with the PK getting deleted is a bit weird to me -- we could have deleted row 4 and updated row 2 and ended up with the same table. I don't really see this as a problem. It's just a bit weird.

Given that it's a correct set of action, if we encountered this, it's basically the same as if we encountered two distinct things:

### DELETE FROM `test`.`samplet`
### WHERE
###   @1=2

and

### UPDATE `test`.`samplet`
### WHERE
###   @1=4
### SET
###   @1=2
###   @2=4
###   @3='replaced 2,4'

And updates and deletes are both things we'd need to be able to handle anyway. The delete is just a delete. The update is just an update. So, not a problem.

Oh. 😯 Wait. 😰

The problem we're talking about now isn't actually in our parsing of the logs. It's in the fact that we'll be issuing REPLACE statements, and they might modify rows differently than just a DELETE and INSERT for the PK. When we were spec'ing this out in person a few weeks ago, I don't remember talking about that aspect of REPLACE behavior.

Is there a time where our ghost table could have data that would lead to a REPLACE replacing more than the single row we want to if that table has a unique secondary key?

Since we'll be applying all of the changes from the binary log in order, I don't think we could, unless the table had already had data that would violate that secondary unique constraint, but this is definitely a corner case that we should make sure we understand.

@shlomi-noach
Copy link
Contributor Author

@ggunson the end result -- yes, that's what is supposed to happen. I was more ranting on the way it got there. It could have DELETEd the other row, and then UPDATEd the row with the same PK, so no change to PK.
Or it could have done what happens in reality: issue two DELETEs and one INSERT instead.
The fact it chose to modify a PK value is what upsets me.

@shlomi-noach
Copy link
Contributor Author

One implication of parsing the RBR is that we are unable to handle an ALTER to a PK, even if there's another UNIQUE KEY hat is good to go.

@jonahberquist I don't think there's going to be an integrity problem. It's more that given an event such as:

### UPDATE `test`.`samplet`
### WHERE
###   @1=4
### SET
###   @1=2
###   @2=4
###   @3='replaced 2,4'

We would need to be more specific about how to apply this change. It's no longer REPLACE INTO ghost SELECT * FROM original WHERE @1=4, because @1=4 does not exist in the original table anymore. We would instead need to parse the values (in this case), issue a DELETE on @1=4 on the ghost table and then REPLACE INTO ghost SELECT * FROM original WHERE @1=2.

@ggunson envisioned that we would only need to parse the WHERE values and then go apply on the tables based on those, but according to this, we also need to parse the applied values themselves and use them in our logic.

@ggunson
Copy link
Contributor

ggunson commented Mar 25, 2016

I was mostly just pointing out the reason for the behaviour, which wasn't obvious until the secondary unique key was known.

I'm actually a bit weirded out in terms of the MySQL documentation, since it specifically states that a REPLACE is either an INSERT or a DELETE+INSERT, and doesn't say it's any different depending on key violations:

The REPLACE statement returns a count to indicate the number of rows affected. This is the sum of the rows deleted and inserted. If the count is 1 for a single-row REPLACE, a row was inserted and no rows were deleted. If the count is greater than 1, one or more old rows were deleted before the new row was inserted. It is possible for a single row to replace more than one old row if the table contains multiple unique indexes and the new row duplicates values for different old rows in different unique indexes.
(http://dev.mysql.com/doc/refman/5.6/en/replace.html)

However, in my very lame testing with REPLACE, here's what I'm seeing in the row binary logs:

  1. A REPLACE with no key collisions is an INSERT (as expected)
  2. A REPLACE that hits a duplicate primary key becomes DELETE + INSERT (okay too)
  3. A REPLACE that hits a duplicate secondary unique key becomes an UPDATE (wat)
  4. A REPLACE that hits both the primary key and a secondary unique key is DELETE + UPDATE (as @shlomi-noach discovered).

Later on the docs say that, well, the storage engine could be wack, but like, who cares, it's all the same:

It is possible that in the case of a duplicate-key error, a storage engine may perform the REPLACE as an update rather than a delete plus insert, but the semantics are the same.

Which still doesn't point out all the differences we're seeing.

Oddly, my quick test of MyISAM shows it gives more information on the PK plus unique collision situation than the InnoDB version of RBR does (just for testing, I know we don't care about MyISAM):

### DELETE FROM `test`.`samplet2`
### WHERE
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2=6 /* INT meta=0 nullable=1 is_null=0 */
###   @3='replace 2, 6' /* VARSTRING(64) meta=64 nullable=1 is_null=0 */
### UPDATE `test`.`samplet2`
### WHERE
###   @1=4 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2 /* INT meta=0 nullable=1 is_null=0 */
###   @3='regular insert of 4,2' /* VARSTRING(64) meta=64 nullable=1 is_null=0 */
### SET
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2 /* INT meta=0 nullable=1 is_null=0 */
###   @3='replace 2,2' /* VARSTRING(64) meta=64 nullable=1 is_null=0 */
# at 706

@jonahberquist
Copy link
Contributor

It's no longer REPLACE INTO ghost SELECT * FROM original WHERE @1=4, because @1=4 does not exist in the original table anymore. We would instead need to parse the values (in this case), issue a DELETE on @1=4 on the ghost table and then REPLACE INTO ghost SELECT * FROM original WHERE @1=2.

Yeah, this makes sense to me now. I still don't like it, but at least it makes sense.

Depending on how we want our binlogs, specifically with regard to binlog_row_image, I think we have a few options for handling this.

If we have full row images, when we see an UPDATE we could always parse the WHERE for the PK to delete and then always parse the SET to get the PK to do the REPLACE. (Or do 2 deletes and 1 insert as you mentioned -- in this case we might issue two DELETEs for the same row, which is fine).

With minimal row images, we would only get the columns that change. I noticed this in the sample binlogs you added to #4 (see below). In this case we'd have to check if the PK is listed in the SET columns at all and, if not, just REPLACE using the PK from the WHERE.

### UPDATE `test`.`samplet`
### WHERE
###   @1=5
### SET
###   @3='update'

@shlomi-noach
Copy link
Contributor Author

Well, we control the row image. We can force configuring it to full if we choose to.
I'm in fact more inclined at this time to go back to the original proposition of just parsing the query from the RBR log and applying it; in this case I don't care if a REPLACE did or did not cause an UPDATE to a PK.
In the particular case of blobs or anything that seems risky to parse from the RBR, we may resort to applying via INSERT INTO ghost SELECT FROM original.
Hmmm, but the latter means we'll have to comply with this anyway... Further thinking...

@shlomi-noach
Copy link
Contributor Author

I'm closing this for now, having nice success with #5

RainbowDashy referenced this issue in RainbowDashy/gh-ost Jun 29, 2022
* fix: close streamer when tearing down

* fix: set CutOverCompleteFlag to 1 to stop binlog syncer

* add debug info

* Revert "add debug info"

This reverts commit 149f174.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants