Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

txn: protect primary locks of pessimistic transactions from being collapsed #5575

Merged
merged 33 commits into from Oct 17, 2019

Conversation

@sticnarf
Copy link
Contributor

sticnarf commented Oct 4, 2019

What have you changed?

We can't prevent stale acquire_pessimistic_lock by WriteConflict. It may break atomicity if the rollback record of the primary lock is collapsed and TiKV receives stale acquire_pessimistic_lock and prewrite, the transaction will commit successfully even if secondary locks are rolled back.

To solve the problem, we can prevent the rollback records of the primary key of pessimistic transactions from being collapsed. To implement this, we set the short value of these rollback records to "protected".

What is the type of the changes?

Bugfix (a change which fixes an issue)

How is the PR tested?

  • Unit test
  • Integration test

Does this PR affect documentation (docs) or should it be mentioned in the release notes?

No

Does this PR affect tidb-ansible?

No

Refer to a related PR or issue link (optional)

Benchmark result if necessary (optional)

sysbench 32 tables, size = 100, 64 threads (to create many conflicts)

master:

SQL statistics:
    queries performed:
        read:                            0
        write:                           50244
        other:                           25418
        total:                           75662
    transactions:                        11722  (19.41 per sec.)
    queries:                             75662  (125.29 per sec.)
    ignored errors:                      1974   (3.27 per sec.)
    reconnects:                          0      (0.00 per sec.)

Throughput:
    events/s (eps):                      19.4099
    time elapsed:                        603.9173s
    total number of events:              11722

Latency (ms):
         min:                                   48.94
         avg:                                 3286.51
         max:                                31678.28
         95th percentile:                     9799.46
         sum:                             38524528.26

This PR:

SQL statistics:
    queries performed:
        read:                            0
        write:                           49887
        other:                           25257
        total:                           75144
    transactions:                        11626  (19.22 per sec.)
    queries:                             75144  (124.21 per sec.)
    ignored errors:                      2005   (3.31 per sec.)
    reconnects:                          0      (0.00 per sec.)

Throughput:
    events/s (eps):                      19.2173
    time elapsed:                        604.9753s
    total number of events:              11626

Latency (ms):
         min:                                   41.19
         avg:                                 3315.87
         max:                                27865.45
         95th percentile:                    10158.80
         sum:                             38550318.92

No significant difference.

Any examples? (optional)

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Oct 4, 2019

/run-integration-tests

sticnarf added 2 commits Oct 7, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@zhouqiang-cl

This comment has been minimized.

Copy link
Member

zhouqiang-cl commented Oct 7, 2019

/run-all-tests

@tikv tikv deleted a comment from sre-bot Oct 7, 2019
@sticnarf sticnarf marked this pull request as ready for review Oct 7, 2019
@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Oct 7, 2019

@sticnarf sticnarf removed the S: WIP label Oct 7, 2019
src/storage/mvcc/write.rs Outdated Show resolved Hide resolved
src/storage/mvcc/write.rs Outdated Show resolved Hide resolved
src/storage/mvcc/write.rs Outdated Show resolved Hide resolved
src/storage/txn/process.rs Outdated Show resolved Hide resolved
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf sticnarf force-pushed the sticnarf:protect-rollback branch from 650f370 to c89b367 Oct 8, 2019
Copy link
Contributor

MyonKeminta left a comment

LGTM. @coocood I think maybe we need your review too.

Copy link
Contributor

coocood left a comment

If the key does not exist on CleanUp, then the key must not be the primary key of a pessimistic transaction.

Copy link
Contributor

coocood left a comment

If the key does not exist on CleanUp, then the key must not be the primary key of a pessimistic transaction.

@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Oct 8, 2019

If the key does not exist on CleanUp, then the key must not be the primary key of a pessimistic transaction.

Good point. Then we needn't make it protected when there is no existing lock.

@coocood

This comment has been minimized.

Copy link
Contributor

coocood commented Oct 8, 2019

We also need to evaluate the performance impact on the pessimistic transactions.

@coocood

This comment has been minimized.

Copy link
Contributor

coocood commented Oct 8, 2019

Theoretically, the impact is very small because CleanUp is rarely called for the pessimistic transactions even under high contention.

@@ -20,6 +20,8 @@ const FLAG_DELETE: u8 = b'D';
const FLAG_LOCK: u8 = b'L';
const FLAG_ROLLBACK: u8 = b'R';

const FLAG_PROTECTED: u8 = b'p';

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Oct 8, 2019

Member

I don't think it is a good idea to add such a field for write.

This comment has been minimized.

Copy link
@coocood

coocood Oct 8, 2019

Contributor

We can put a short value to rollback key without the flag.

This comment has been minimized.

Copy link
@coocood

coocood Oct 8, 2019

Contributor

It's a compatible change.

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Oct 8, 2019

Member

Version control is needed.

This comment has been minimized.

Copy link
@coocood

coocood Oct 8, 2019

Contributor

We never read the value of a Rollback key, so it's safe to put a value in it.

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Oct 8, 2019

Member

For example, in collapse_prev_rollback, it will call seek_write to check if the previous commit is rollback and extract the ts.

This comment has been minimized.

Copy link
@sticnarf

sticnarf Oct 8, 2019

Author Contributor

It's workable but a bit tricky. I can change the PR to @coocood 's plan. But in the end we need something like version control.

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Oct 8, 2019

Member

I mean remove the FLAG_PROTECTED and put a value under SHORT_VALUE_PREFIX flag.

You mean reuse short value flag for Rollback, emm, it works but weird. Maybe the value set as "protected" looks a little comfortable.

This comment has been minimized.

Copy link
@breeswish

breeswish Oct 12, 2019

Member

Is the short value only used for compatibility consideration?

This comment has been minimized.

Copy link
@sticnarf

sticnarf Oct 12, 2019

Author Contributor

Yes.

@zhangjinpeng1987

This comment has been minimized.

Copy link
Member

zhangjinpeng1987 commented Oct 8, 2019

I think this pr need version control, for example leader write a rollback with protected field, and follower is still old version of TiKV, if the leader transfer to these old version of TiKV, panic will happen.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Copy link
Contributor

youjiali1995 left a comment

LGTM

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Oct 16, 2019

Sorry I just push a new commit to fix some comments in tests. Approves are dismissed.

PTAL again @youjiali1995 @coocood
And ping @zhangjinpeng1987 and @MyonKeminta for review

// is received after a cleanup command.
// During cleanup, when there is no lock, it cannot be the primary key
// of a pessimistic transaction, so the rollback needn't be protected.
let write = Write::new_rollback(ts, false);

This comment has been minimized.

Copy link
@MyonKeminta

MyonKeminta Oct 16, 2019

Contributor

I'm in doubt about the correctness to set this non-protected. Rest LGTM.

This comment has been minimized.

Copy link
@AndreMouche

AndreMouche Oct 16, 2019

Member

It is ok because if is no Lock exist here, then the corresponding transaction must not a pessimistic transaction's primary.

This comment has been minimized.

Copy link
@sticnarf

sticnarf Oct 16, 2019

Author Contributor

I've update the comments here to:

Pessimistic transactions prewrite successfully only if all its pessimistic locks exist. So collapsing the rollback of a pessimistic lock is safe. After a pessimistic transaction acquires all its locks, it is impossible that neither a lock nor a write record is found. Therefore, we don't need to protect the rollback here.

Do you think it's OK? @MyonKeminta

Copy link
Member

AndreMouche left a comment

Rest LGTM

// is received after a cleanup command.
// During cleanup, when there is no lock, it cannot be the primary key
// of a pessimistic transaction, so the rollback needn't be protected.
let write = Write::new_rollback(ts, false);

This comment has been minimized.

Copy link
@AndreMouche

AndreMouche Oct 16, 2019

Member

It is ok because if is no Lock exist here, then the corresponding transaction must not a pessimistic transaction's primary.

@@ -20,6 +20,9 @@ const FLAG_DELETE: u8 = b'D';
const FLAG_LOCK: u8 = b'L';
const FLAG_ROLLBACK: u8 = b'R';

/// The short value for rollback records which are protected from being collapsed.
const PROTECTED_ROLLBACK_SHORT_VALUE: &[u8] = b"protected";

This comment has been minimized.

Copy link
@AndreMouche

AndreMouche Oct 16, 2019

Member

Maybe 1 byte is enough?

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Copy link
Member

AndreMouche left a comment

LGTM

Copy link
Contributor

MyonKeminta left a comment

LGTM

@MyonKeminta

This comment has been minimized.

Copy link
Contributor

MyonKeminta commented Oct 17, 2019

/run-integration-common-test

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Oct 17, 2019

/run-all-tests

@sre-bot sre-bot merged commit 0e74e1e into tikv:master Oct 17, 2019
6 checks passed
6 checks passed
DCO All commits are signed off!
Details
idc-jenkins-ci-tikv/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci-tikv/integration-compatibility-test Jenkins job succeeded.
Details
idc-jenkins-ci-tikv/integration-cop-push-down-test Jenkins job succeeded.
Details
idc-jenkins-ci-tikv/integration-ddl-test Jenkins job succeeded.
Details
idc-jenkins-ci/test Jenkins job succeeded.
Details
@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Oct 17, 2019

cherry pick to release-3.0 failed

@sre-bot

This comment has been minimized.

Copy link
Collaborator

sre-bot commented Oct 17, 2019

cherry pick to release-3.1 failed

sticnarf added a commit to sticnarf/tikv that referenced this pull request Oct 17, 2019
…lapsed (tikv#5575)

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
sticnarf added a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
…lapsed (tikv#5575)

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.