Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async commit does not ensure linearizability #8589

Closed
sticnarf opened this issue Sep 3, 2020 · 31 comments · Fixed by pingcap/tidb#20276
Closed

Async commit does not ensure linearizability #8589

sticnarf opened this issue Sep 3, 2020 · 31 comments · Fixed by pingcap/tidb#20276
Assignees
Labels
severity/major sig/transaction SIG: Transaction type/bug Type: Issue - Confirmed a bug

Comments

@sticnarf
Copy link
Contributor

sticnarf commented Sep 3, 2020

Bug Report

Read in the same session

Read with max u64 immediately after an async commit transaction. Because the lock in the previous transaction may be not committed yet, and locks can be ignored if we read with max u64. So we may not read the latest changes from the last async commit transaction.

Causal consistency across nodes

Example:

At first the values of k1 and k2 are all 0, they can locate in different regions.

We commit k1=1@100 at first. The max_read_ts of the k2 region leader is 50.

Then, we prewrite and commit k2=1@51 using async commit.

Now, a transaction with start_ts=80, reads k1 and k2 in the same transaction. We get k1=0, k2=1.

We break an external consistency that k2 commits later than k1.

Affected version

master

@sticnarf sticnarf added the sig/transaction SIG: Transaction label Sep 3, 2020
@sticnarf sticnarf added this to To do in Async Commit via automation Sep 3, 2020
@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 3, 2020

For the first problem, we can change the check_ts_conflict function. If the lock is from an async commit transaction, we cannot ignore it even if the read ts is max u64.

@youjiali1995
Copy link
Contributor

youjiali1995 commented Sep 3, 2020

We commit k1=1@100 at first. The max_read_ts of the k2 region leader is 50.
Then, we prewrite and commit k2=1@51 using async commit.

It's impossible if transactions modify k1 and k2 are not concurrent. In your case, all transactions are concurrent so the order is OK.

I can't think of a case that async commit in TiDB can break external consistency. And because we get start ts from PD which can avoid causal reverse.

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 3, 2020

@youjiali1995
A more concrete example:

  1. An external client starts two sessions A and B.
  2. Session A starts T1, start_ts = 1.
  3. Session B starts T2, start_ts = 2.
  4. Client starts session C, it starts a transaction T3, start_ts = 80.
  5. Session A prewrites k1=1.
  6. Session A commits T1 with commit_ts = 100.
  7. Session B prewrites k2=1 using async commit, min_commit_ts = 51.
  8. Session B commits T2 with commit_ts = 51.
  9. Session C reads k1 and k2.

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 3, 2020

@youjiali1995
A more concrete example:

1. An external client starts two sessions A and B.

2. Session A starts T1, start_ts = 1.

3. Session B starts T2, start_ts = 2.

4. Client starts session C, it starts a transaction T3, start_ts = 80.

5. Session A prewrites k1=1.

6. Session A commits T1 with commit_ts = 100.

7. Session B prewrites k2=1 using async commit, min_commit_ts = 51.

8. Session B commits T2 with commit_ts = 51.

9. Session C reads k1 and k2.

Session B triggers commit after it receives the success from session A (between 6 and 7)

@MyonKeminta
Copy link
Contributor

I'm actually not very sure if we need to guarantee this. @coocood PTAL

@jackysp
Copy link
Contributor

jackysp commented Sep 3, 2020

Can Jepsen test find this out?

@youjiali1995
Copy link
Contributor

youjiali1995 commented Sep 3, 2020

I see. It's another kind of causal reverse. We can't avoid it but I think it's not very severe because it rarely happens. We can implement causality tokens to solve it which is passed between transactions if needed.

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 3, 2020

It's quite similar to the causal reverse crdb mentioned in their blog. So I think it can be detected by Jepsen too. @jackysp

Async Commit automation moved this from To do to Done Sep 3, 2020
@youjiali1995 youjiali1995 reopened this Sep 3, 2020
Async Commit automation moved this from Done to In progress Sep 3, 2020
@youjiali1995
Copy link
Contributor

youjiali1995 commented Sep 3, 2020

If a user depends on external consistency, we can disable async commit. Actually, we can't offer the true external consistency even with the timestamp oracle which requires serializability isolation.

@coocood
Copy link
Contributor

coocood commented Sep 3, 2020

This anomaly requires one client maintains two transactions at the same time, are there any existing application do it this way?

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 3, 2020

This anomaly requires one client maintains two transactions at the same time, are there any existing application do it this way?

Usually operations with causality are done in a single transaction. But maybe one client receives success and then inform another client to commit? I don't know...

But this is indeed a change in consistency guarantee which should be mentioned and make users aware of.

@coocood
Copy link
Contributor

coocood commented Sep 3, 2020

I can think of an example, an application A call service B inside a transaction, and service B do another transaction to get a value, then application A use that value to commit its transaction.

But this case application A and service B are separated components, the inconsistency should not matter much.

@cfzjywxk
Copy link
Collaborator

cfzjywxk commented Sep 3, 2020

Maybe there are some applications that will use this form to simulate some savepoint or nested transaction usages, which may start multiple transactions and decide the next step based on these results.

@youjiali1995
Copy link
Contributor

youjiali1995 commented Sep 3, 2020

If write-1, write-2 are write transactions and read-1 is the read transaction, this anomaly can exhibit only when:

  1. Write-1, write-2, and read-1 are concurrent which means if write-2 or read-1 begins after write-1 committing, it can't exhibit.
  2. The application depends on the order of write-1 and write-2. And the order should be guaranteed even in concurrent transactions.
  3. Write-1 and write-2 can't modify data on the same store(region exactly).
  4. Max ts on stores must meet the condition: write-2's commit ts < read-1's start ts < write-1's commit ts.

@gengliqi
Copy link
Member

gengliqi commented Sep 3, 2020

We can tell the users that the async commit does not fully ensure external consistency.

  1. Async commit ensures that any transaction can observe the effects of all transactions that committed before the start of it.
  2. Async commit doesn't ensure that the concurrent transactions' serial order is consistent with the order in which transactions can be observed to commit.

I think the first one is the most needed while the second one is not.

@gengliqi
Copy link
Member

gengliqi commented Sep 3, 2020

@youjiali1995
A more concrete example:

1. An external client starts two sessions A and B.

2. Session A starts T1, start_ts = 1.

3. Session B starts T2, start_ts = 2.

4. Client starts session C, it starts a transaction T3, start_ts = 80.

5. Session A prewrites k1=1.

6. Session A commits T1 with commit_ts = 100.

7. Session B prewrites k2=1 using async commit, min_commit_ts = 51.

8. Session B commits T2 with commit_ts = 51.

9. Session C reads k1 and k2.

Session B triggers commit after it receives the success from session A (between 6 and 7)

BTW, maybe we can get a ts from PD before starting prewrite phase, then we can carry it with the prewrite request to primary key and calculate the max(ts, max_read_ts from all keys)+1 as the commit ts. The order of A and B can be ensured in this case because the new ts from PD must be greater than the A's commit ts.(maybe equal?)
I am not sure if it is enough for any situation. I think it's enough. Maybe this change can be an option for users if they want truly external consistency.

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 3, 2020

@gengliqi Previously I talked with @5kbpers about this idea too.
I think maybe it's better to ensure external consistency by default because that's what we guarantee before.

@gengliqi
Copy link
Member

gengliqi commented Sep 4, 2020

@gengliqi Previously I talked with @5kbpers about this idea too.
I think maybe it's better to ensure external consistency by default because that's what we guarantee before.

I agree.

@jackysp
Copy link
Contributor

jackysp commented Sep 4, 2020

@gengliqi Previously I talked with @5kbpers about this idea too.
I think maybe it's better to ensure external consistency by default because that's what we guarantee before.

+1

@coocood
Copy link
Contributor

coocood commented Sep 4, 2020

@gengliqi Previously I talked with @5kbpers about this idea too.
I think maybe it's better to ensure external consistency by default because that's what we guarantee before.

So the latency increased by half of the async commit saved.

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 4, 2020

So the latency increased by half of the async commit saved.

Not that much. A commit operation requires raft replication. So if we also count the RPC from TiDB to TiKV, that's at least two RPCs in total. TSO has only one, and can be easily in batch.

@cfzjywxk
Copy link
Collaborator

cfzjywxk commented Sep 4, 2020

So the latency increased by half of the async commit saved.

Not that much. A commit operation requires raft replication. So if we also count the RPC from TiDB to TiKV, that's at least two RPCs in total. TSO has only one, and can be easily in batch.

How should we control this behavior, do we need to add another system variable for this?

@sticnarf
Copy link
Contributor Author

sticnarf commented Sep 4, 2020

How should we control this behavior, do we need to add another system variable for this?

Yes

@sticnarf
Copy link
Contributor Author

sticnarf commented Nov 2, 2020

It's a bug of async commit, which is a feature still under development. The feature is not enabled by default and it shouldn't affect any production user.

Lower its severity to major. But still mark it as a blocker of GA of async commit.

@cfzjywxk
Copy link
Collaborator

cfzjywxk commented Nov 2, 2020

It's a bug of async commit, which is a feature still under development. The feature is not enabled by default and it shouldn't affect any production user.

Lower its severity to major. But still mark it as a blocker of GA of async commit.

There is a still a problem if our goal is to enable this new feature by default for GA.

@coocood
Copy link
Contributor

coocood commented Nov 2, 2020

I think it's not a GA blocker, the causality between two on-going transactions is not something a normal application would depend on.

@nrc
Copy link
Contributor

nrc commented Nov 5, 2020

@sticnarf is this addressed by pingcap/tidb#20276 or is this a separate issue?

Never mind, looks like it is

Async Commit automation moved this from In progress to Done Nov 12, 2020
Question and Bug Reports automation moved this from Need Triage to Closed(This Week) Nov 12, 2020
@jiangyuzhao
Copy link
Contributor

方便问下这个提交跟之前的方案有什么区别咩?看起来主要想法依然是在commit之前从TSO获得一个时间戳来作为commit_ts,跟支持异步之前主要的区别是什么呢?

@sticnarf
Copy link
Contributor Author

sticnarf commented Dec 9, 2020

方便问下这个提交跟之前的方案有什么区别咩?看起来主要想法依然是在commit之前从TSO获得一个时间戳来作为commit_ts,跟支持异步之前主要的区别是什么呢?

之前获取 commit_ts 是在第一阶段提交(prewrite)之后获取,现在这是在 prewrite 之前获取。
从总延迟的角度看,以前是 prewrite + TSO + commit primary lock,现在缩减为 TSO + prewrite,少了一次提交 primary lock 的延迟。

@jiangyuzhao
Copy link
Contributor

方便问下这个提交跟之前的方案有什么区别咩?看起来主要想法依然是在commit之前从TSO获得一个时间戳来作为commit_ts,跟支持异步之前主要的区别是什么呢?

之前获取 commit_ts 是在第一阶段提交(prewrite)之后获取,现在这是在 prewrite 之前获取。
从总延迟的角度看,以前是 prewrite + TSO + commit primary lock,现在缩减为 TSO + prewrite,少了一次提交 primary lock 的延迟。

commit primary lock是指把提交信息写入一个中心么?我记得之前看Percolator的时候,这个commit其实只是本地检查lock列,然后写时间戳而已,感觉从论文上没有额外的RPC了,而且现在的方案也应该有这个流程,这个减少commit primary lock主要是指?

@sticnarf
Copy link
Contributor Author

sticnarf commented Dec 9, 2020

方便问下这个提交跟之前的方案有什么区别咩?看起来主要想法依然是在commit之前从TSO获得一个时间戳来作为commit_ts,跟支持异步之前主要的区别是什么呢?

之前获取 commit_ts 是在第一阶段提交(prewrite)之后获取,现在这是在 prewrite 之前获取。
从总延迟的角度看,以前是 prewrite + TSO + commit primary lock,现在缩减为 TSO + prewrite,少了一次提交 primary lock 的延迟。

commit primary lock是指把提交信息写入一个中心么?我记得之前看Percolator的时候,这个commit其实只是本地检查lock列,然后写时间戳而已,感觉从论文上没有额外的RPC了,而且现在的方案也应该有这个流程,这个减少commit primary lock主要是指?

RPC 数量没有减少,但以前需要等到提交 primary lock 后才能向用户返回成功。现在在 prewrite 结束时就能返回成功了。所以从用户的角度,少了一次 RPC 的延时。

在 issue 中提问题可能会对其他关注 issue 的人产生信息干扰,建议来我们 Slack channel 来讨论:https://slack.tidb.io/invite?team=tikv-wg&channel=sig-transaction&ref=community-sig

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/major sig/transaction SIG: Transaction type/bug Type: Issue - Confirmed a bug
Projects
Async Commit
  
Done
Question and Bug Reports
  
Closed(This Week)
Development

Successfully merging a pull request may close this issue.

10 participants