
Potential too-early-rename via killed connections #26

Closed
ggunson opened this issue May 3, 2016 · 17 comments

Comments

@ggunson
Contributor

ggunson commented May 3, 2016

Paraphrasing a lot, since I can't read Go: at the point where the tool is ready to swap tables, these connections exist:

  1. GET_LOCK('something', 0);
  2. SELECT RELEASE_LOCK('something') FROM original_table WHERE GET_LOCK('something',999)>=0 LIMIT 1; # or similar
  3. RENAME TABLE original_table TO original_old, original_new TO original_table;

Then there's a check that all the parsed binlog writes have gone to original_new, after which the locks are released and the rename happens.

However, if either the GET_LOCK() connection or the SELECT RELEASE_LOCK() connection gets killed, the RENAME happens. So there are two places where a killed connection could result in a too-early-swap and some binlog events being lost.

I like the idea of confidence in being able to say that gh-ost wouldn't result in unintentional lost rows/writes. Maybe in this case it might involve locking the now-original-table after the rename and writing the missed events then (e.g. we'd know because the REPLACE/DELETE to original_new would have failed). Just brainstorming.
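The race described above can be sketched with a minimal Python threading model (names are hypothetical; threads and a shared lock stand in for MySQL connections and the named user lock, and are not gh-ost's actual code):

```python
import threading

user_lock = threading.Lock()        # stands in for the named user lock
select_unblocked = threading.Event()
rename_done = threading.Event()
kill_conn1 = threading.Event()
events_applied = False              # binlog events are NOT all applied yet

user_lock.acquire()                 # conn #1: GET_LOCK('something', 0) succeeds

def conn1():
    kill_conn1.wait()               # the session is KILLed ...
    user_lock.release()             # ... and MySQL frees a dead session's user locks

def conn2():                        # conn #2: SELECT ... WHERE GET_LOCK(...) blocks
    user_lock.acquire()
    user_lock.release()             # SELECT RELEASE_LOCK(...) lets go immediately
    select_unblocked.set()

def conn3():                        # conn #3: RENAME TABLE, queued behind the SELECT
    select_unblocked.wait()
    rename_done.set()

threads = [threading.Thread(target=f) for f in (conn1, conn2, conn3)]
for t in threads:
    t.start()
kill_conn1.set()                    # KILL conn #1 before all events are applied
for t in threads:
    t.join()

print(rename_done.is_set(), events_applied)   # True False -- premature swap
```

The point of the sketch: nothing downstream of the lock checks *why* it was released, so a killed session unblocks the rename exactly as a deliberate release would.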

@shlomi-noach
Contributor

To solve half the problem, connection death at the time of the RELEASE_LOCK() is fine, since at that time we're already satisfied we've applied the latest changes to the table.

I concur that the case where the connection dies right after obtaining the initial GET_LOCK can result in a premature RENAME, meaning some writes on the original table are lost.

I'm still unsure how to solve it mathematically. There is an easy way of substantially reducing the risk: as you recall, connection #2 issues a SELECT ... FROM original_table WHERE GET_LOCK('ding',9999) >= 0. This query is supposed to be blocking on the lock.

We are able to inject a delay, such that we force a 1s pause before the RENAME: SELECT ... FROM original_table WHERE GET_LOCK('ding',9999) + SLEEP(1) >= 0

This will delay the swap by 1s, leading to a longer lock time, but will almost certainly cover the case where the lock holder is killed. I think the combination of the two reduces the risk to an insubstantial level.

The above is not a clean solution, and not a complete solution. Let's further think of how this can be solved mathematically.

@shlomi-noach
Contributor

The solution presented above is moot. The SELECT ... WHERE GET_LOCK('ding',9999) query itself can be killed. The death of either of the two connections will cause the premature RENAME.

@tomkrouper

I know you both understand this way better than me, so this might be a horribly stupid idea, but could we just add a lock right after the rename to do a quick check to make sure a previous lock didn't get killed to cause a premature rename? I'm guessing it won't help the situation other than alerting us to the mess up.

@shlomi-noach
Contributor

The current implementation already makes it known should something go wrong, since the RELEASE_LOCK would return 0/NULL. We can let the user know that this may mean an inconsistency.

A lock immediately after the rename is non-atomic, and writes will slip through in between the two, which is the very problem we wanted to solve: an atomic lock-and-rename combination.

I'm trying to think whether we can employ a different storage engine which would cause the lock, and make a third table or something.


@ggunson
Contributor Author

ggunson commented May 4, 2016

The current implementation already makes it known should something go
wrong, since the release_lock would return with 0/null.

In my manual testing of just 4 connections:

  1. GET_LOCK() (succeeds)
  2. SELECT RELEASE_LOCK() FROM orig_table WHERE GET_LOCK()... (blocks)
  3. RENAME TABLE orig_table TO ... (blocks)
  4. KILL the GET_LOCK() connection

What the SELECT RELEASE_LOCK() FROM orig_table WHERE GET_LOCK()... returns seems to differ depending on whether we're using MySQL 5.6 or 5.7. Though in both cases, the RENAME happens.

With 5.6, if the GET_LOCK() conn is killed, the SELECT RELEASE_LOCK() returns a 1. In 5.7, if the GET_LOCK() is killed, the SELECT returns the empty set. And it looks like @shlomi-noach is using mysql1v, which is 5.7.

I know with 5.7 they deal with locks differently (now as metadata locks), so maybe that's why. But I'm not sure how to get, with either version, SELECT RELEASE_LOCK() to return a 0 or NULL. But I am likely missing a test in here.

@shlomi-noach
Contributor

I'm not testing the lock on mysql1v. It's a replica so this algorithm does not apply. I haven't tested on a production master yet.


@ggunson
Contributor Author

ggunson commented May 4, 2016

I'm not testing the lock on mysql1v.

Okay, I guess my points from my last comment were:

  1. In my testing I couldn't get select release_lock()... to return 0 or null (other than the 0 from giving it a short duration in the WHERE get_lock() so it timed out), so I was wondering how you were testing it to get a 0/null.
  2. General comment about how locks behave differently in 5.7 compared to 5.6 and that may be something we have to deal with if using them for the swapping part.

@shlomi-noach
Contributor

I was intentionally injecting the wrong lock name into the RELEASE_LOCK, and so, based on that name not being held by the connection, it returned NULL, which the query converts to 0.

Elaborating for the uninitiated: we are using voluntary (user-level) locks, which are owned by connections. When a connection closes, expectedly or unexpectedly, whatever locks it held are released. Any attempt to release a lock that is either not yours or does not exist results in NULL. I haven't tested this on 5.7 and wasn't aware the behaviour might change. Will further test with 5.7.
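The ownership semantics being described (documented for MySQL's GET_LOCK/RELEASE_LOCK: 1 on success, 0 when the lock is owned by another session, NULL when no such lock exists, and a dead session's locks being freed) can be made concrete with a toy Python model. This is purely illustrative, not MySQL's implementation:

```python
# Toy model of MySQL named user locks.
locks = {}                          # lock name -> owning connection id

def get_lock(conn, name):
    if name not in locks:
        locks[name] = conn
        return 1
    return 0                        # held elsewhere (real GET_LOCK would block/time out)

def release_lock(conn, name):
    if name not in locks:
        return None                 # lock does not exist at all -> NULL
    if locks[name] != conn:
        return 0                    # exists, but owned by another connection
    del locks[name]
    return 1

def kill(conn):                     # a dying session's user locks are freed
    for name in [n for n, c in locks.items() if c == conn]:
        del locks[name]

get_lock(1, "ding")                 # connection #1 takes the lock
kill(1)                             # connection #1 is KILLed
print(release_lock(1, "ding"))      # None -- the NULL described above
```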


@ggunson
Contributor Author

ggunson commented May 4, 2016

@shlomi-noach I'm going to drop out of this conversation, because

I was intentionally injecting the wrong lock name into the release_lock
above

and

SELECT RELEASE_LOCK('ding') FROM tbl WHERE GET_LOCK('ding', 999999)>=0 LIMIT 1;
(from your blog post)

are contradictory, and there's no point my trying to help if I'm testing the wrong behaviour.

@shlomi-noach
Contributor

@ggunson sorry, please give me a couple of hours till I can explain more thoroughly, after I've put the girls to sleep.


@shlomi-noach
Contributor

What does strike me as enlightening, though, is that we are not talking about the same RELEASE_LOCK(). But again, please allow for a couple of hours.


@shlomi-noach
Contributor

What seems to return from the SELECT RELEASE_LOCK() FROM orig_table WHERE GET_LOCK()... seems to be different whether we're using MySQL 5.6 or 5.7. Though in both cases, the RENAME happens.

The result of this query is actually ignored and not interesting to the algorithm. For now, please just ignore completely the fact that there's a RELEASE_LOCK in this query. The only reason it's in there is in case we need to make retries of the entire operation (and this makes for a cleanup, or disowning of the lock). But for now, please let's concentrate on a single iteration.

Recall that connection #1 issues a GET_LOCK('ding', 0). Also recall that when we're satisfied that we have applied the last of the events, we use the same connection #1 to RELEASE_LOCK('ding'), which leads to the completion of connection #2's query, which in turn unblocks the RENAME.
Throughout this thread, when I mentioned RELEASE_LOCK(), it was this one, not the RELEASE_LOCK found in SELECT RELEASE_LOCK(...) FROM orig_table....

I hope this clarifies some confusion throughout this thread.

It is the result of that RELEASE_LOCK() that we're checking to verify that connection #1 actually maintained the lock throughout the duration of the operation. If connection #1 loses the lock (or loses itself and dies), the RELEASE_LOCK "fails" (returns with NULL). To test that effect, I simulated in code the situation where the RELEASE_LOCK returns with NULL by hard-coding something like: RELEASE_LOCK('does-not-exist').
This is not the best test, and I haven't simulated a KILL on connection #1 in between the GET_LOCK and the RELEASE_LOCK. I was only made aware of the potential problem by @ggunson's comment a couple of days ago, and this needs to be tested as well.

Let's now complete the flow:

  1. GET_LOCK succeeds
  2. SELECT RELEASE_LOCK(...) FROM orig_table WHERE GET_LOCK(...) blocks
  3. RENAME blocks
  4. we are satisfied all rows are accounted for
  5. We RELEASE_LOCK on connection #1
  6. The SELECT ... WHERE GET_LOCK() unblocks, starts executing. It actually acquires the lock. But we don't actually want it to hold the lock; we only wanted the query to block. This is why it issues a SELECT RELEASE_LOCK() FROM ..., so once this query unblocks, it gets AND immediately releases the lock. We never read the result from this query. We ignore it. The query is there to block the RENAME and we don't care about it any further.
  7. The RENAME unblocks
  8. 🍕

If there's a reasonable error along the way, such as timeout on RENAME, we start everything again. The lock is known to be released because we release it as part of the flow.

The initial problem @ggunson refers to is the case where Connection #1 dies in between step #1 and step #4, in which case the lock gets released prematurely, leading to the SELECT executing, leading to the RENAME executing.

Actually, this is only half the danger. The same problem applies should connection #2 die. That's the connection issuing the SELECT. Because the RENAME blocks on said SELECT.

In both those cases we get a RENAME executing, potentially before all events in the binary log are accounted for.

I hope this clarifies better 😄
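The happy path above can be sketched with a small Python threading model (hypothetical names; threads and a shared lock stand in for MySQL connections and the user lock, not gh-ost's actual Go code), including the deliberate RELEASE_LOCK on connection #1 whose result is the one that gets checked:

```python
import threading

user_lock = threading.Lock()
all_events_applied = threading.Event()
select_unblocked = threading.Event()
rename_done = threading.Event()
release_result = {}

user_lock.acquire()                  # 1. GET_LOCK succeeds on connection #1

def conn1():
    all_events_applied.wait()        # 4. all rows are accounted for
    user_lock.release()              # 5. deliberate RELEASE_LOCK on connection #1
    release_result["value"] = 1      #    a healthy release returns 1 (NULL would
                                     #    mean the lock had already been lost)

def conn2():
    user_lock.acquire()              # 2. SELECT ... WHERE GET_LOCK(...) blocks
    user_lock.release()              # 6. ...unblocks, releases at once; result ignored
    select_unblocked.set()

def conn3():
    select_unblocked.wait()          # 3. RENAME is queued behind the SELECT
    rename_done.set()                # 7. RENAME unblocks; 8. pizza

threads = [threading.Thread(target=f) for f in (conn1, conn2, conn3)]
for t in threads:
    t.start()
all_events_applied.set()             # the migrator declares itself satisfied
for t in threads:
    t.join()

print(release_result["value"], rename_done.is_set())   # 1 True
```

Contrast with the failure mode in the issue: if connection #1 dies before step 4, the release happens implicitly and steps 6-7 run anyway, with no healthy "1" ever recorded.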

@ggunson
Contributor Author

ggunson commented May 4, 2016

Actually, this is only half the danger. The same problem applies should connection #2 die.

I stated both in the original comment:

However, if either the GET_LOCK() connection or the SELECT RELEASE_LOCK() connection gets killed, the RENAME happens.

@shlomi-noach
Contributor

I stated both in the original comment:

Indeed you have 😮
Doh

@shlomi-noach
Contributor

At this time I've fried my brains trying to solve either this loophole or find yet another atomic method of swapping tables. I'm going to have a brain massage and do other stuff in the hope something comes up.

@shlomi-noach
Contributor

I have several directions (all of them dead ends at this time) that I'll share.
However here's a mitigation to the problem depicted in this issue.

In this problem, the death of either of the two connections would lead to the premature RENAME.

We can statistically reduce the risk: no one says we have to have only a single blocking query and a single GET_LOCK. We can duplicate that couple of connections such that we have n couples of connections, all blocking the RENAME.
That way, every one of the n couples would need to lose a connection for the RENAME to complete prematurely. With only n-1 couples broken, the RENAME would still be blocked.
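A back-of-envelope calculation shows how quickly this shrinks the risk (the per-connection failure probability p is an assumed, illustrative number, not a measured one):

```python
# Assume each connection independently dies with probability p during the
# cut-over window (p is illustrative, not measured).
p = 0.01

# A single couple is broken if either of its two connections dies.
couple_broken = 1 - (1 - p) ** 2

def premature_rename_risk(n):
    # All n couples must be broken for the RENAME to go through prematurely.
    return couple_broken ** n

for n in (1, 2, 3):
    print(n, premature_rename_risk(n))
```

With p = 0.01, one couple breaks with probability about 0.02, but three couples all breaking is on the order of 1e-5, which matches the "statistically reduce" framing: the loophole isn't closed, just made exponentially less likely.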

@shlomi-noach
Contributor

Thank you for spending the time reviewing this. I'm closing this issue as #65 came up, which I believe solves the cut-over even in the face of connection failure.

jbielick pushed a commit to jbielick/gh-ost that referenced this issue Jul 12, 2021
GENERATED column as part of PK: handling UPDATE scenario