Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store: upgrade the CheckTxnStatus API #13123

Merged
merged 18 commits into from Nov 7, 2019

Conversation

tiancaiamao
Copy link
Contributor

@tiancaiamao tiancaiamao commented Nov 4, 2019

What problem does this PR solve?

CheckTxnStatus introduces a non-block read mode.
In this mode, TiDB can ignore the secondary lock TTL check and send the CheckTxnStatus request.

What is changed and how it works?

The 2PC prewrite operation writes the primary and the secondary regions concurrently.
When the CheckTxnStatus request arrives, the primary key lock may be still ongoing (secondary locks exist while the primary lock doesn't).
We can back off and retry CheckTxnStatus in that case.

  • update kvproto
  • add a noWait parameter to the ResolveLocks() function
  • handle TxnNotFound error and retry in getTxnStatusFromLock()
  • handle the RollbackIfNotExist field of the CheckTxnStatus request in the mocktikv

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change

Side effects

  • Increased code complexity

CheckTxnStatus introduces a non-block read mode. In this mode, TiDB can ignore
the secondary lock TTL check and send the CheckTxnStatus request.
@tiancaiamao
Copy link
Contributor Author

PTAL @coocood @MyonKeminta @cfzjywxk

@codecov
Copy link

codecov bot commented Nov 4, 2019

Codecov Report

Merging #13123 into master will decrease coverage by 0.6806%.
The diff coverage is 75.8064%.

@@               Coverage Diff                @@
##             master     #13123        +/-   ##
================================================
- Coverage   80.8733%   80.1926%   -0.6807%     
================================================
  Files           469        469                
  Lines        115012     111519      -3493     
================================================
- Hits          93014      89430      -3584     
- Misses        15054      15197       +143     
+ Partials       6944       6892        -52

@coocood
Copy link
Member

coocood commented Nov 5, 2019

We don't need noWait parameter, it should always be true.

store/tikv/lock_resolver.go Outdated Show resolved Hide resolved
store/tikv/lock_resolver.go Outdated Show resolved Hide resolved
@tiancaiamao
Copy link
Contributor Author

We don't need noWait parameter, it should always be true.

Removed. @coocood

Note, we have no switch, so after this PR the resolve lock logic is changed.
It affects not only the new large transaction feature but also the old code!

@tiancaiamao
Copy link
Contributor Author

/run-all-tests

store/tikv/lock_resolver.go Outdated Show resolved Hide resolved
store/tikv/lock_resolver.go Outdated Show resolved Hide resolved
@@ -919,7 +919,7 @@ func rollbackKey(db *leveldb.DB, batch *leveldb.Batch, key []byte, startTS uint6
return nil
}

func rollbackLock(batch *leveldb.Batch, lock mvccLock, key []byte, startTS uint64) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter is not used in the function, so I remove it

@tiancaiamao
Copy link
Contributor Author

PTAL @coocood @MyonKeminta @cfzjywxk

return lr.getTxnStatus(bo, l.TxnID, l.Primary, callerStartTS, currentTS)

rollbackIfNotExist := false
for {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a for loop in getTxnStatus why not handle retry in getTxnStatus directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make the unit test easier.

}
// Call getTxnStatusFromLock to cover the retry logic.
status, err := resolver.getTxnStatusFromLock(bo, lock, currentTS)
c.Assert(err, IsNil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this time, the lock may still not written, we can only be sure it is written after <-errCh.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock may still not written, and getTxnStatusFromLock should retry in that case, so we cover the retry logic.
The lock TTL is set to 10s, long enough for the goroutine to finish, the result should be stable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we add some sleep before errCh <- xxx to ensure covering the retry path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we add some sleep before errCh <- xxx to ensure covering the retry path

So as to make the CI slower?

@tiancaiamao
Copy link
Contributor Author

/run-all-tests

@tiancaiamao
Copy link
Contributor Author

/rebuild

@coocood
Copy link
Member

coocood commented Nov 6, 2019

LGTM

@coocood
Copy link
Member

coocood commented Nov 6, 2019

@MyonKeminta @youjiali1995 PTAL

Copy link
Contributor

@youjiali1995 youjiali1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@MyonKeminta MyonKeminta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM

@@ -663,22 +663,22 @@ func (s *testMVCCLevelDB) TestErrors(c *C) {
func (s *testMVCCLevelDB) TestCheckTxnStatus(c *C) {
s.mustPrewriteWithTTLOK(c, putMutations("pk", "val"), "pk", 5, 666)

ttl, commitTS, err := s.store.CheckTxnStatus([]byte("pk"), 5, 0, 666)
ttl, commitTS, err := s.store.CheckTxnStatus([]byte("pk"), 5, 0, 666, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to cover the case where the last parameter is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

store/tikv/lock_resolver.go Show resolved Hide resolved
Comment on lines +387 to +389
if _, ok := errors.Cause(err).(txnNotFoundErr); !ok {
return TxnStatus{}, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here it needs comments to explain it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@MyonKeminta MyonKeminta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tiancaiamao
Copy link
Contributor Author

/run-all-tests

1 similar comment
@tiancaiamao
Copy link
Contributor Author

/run-all-tests

@tiancaiamao
Copy link
Contributor Author

/run-unit-test

coocood
coocood previously approved these changes Nov 7, 2019
Copy link
Member

@coocood coocood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tiancaiamao
Copy link
Contributor Author

/run-unit-test

@tiancaiamao
Copy link
Contributor Author

/run-all-tests

@tiancaiamao tiancaiamao added the require-LGT3 Indicates that the PR requires three LGTM. label Nov 7, 2019
@coocood coocood merged commit 60d4291 into pingcap:master Nov 7, 2019
@tiancaiamao tiancaiamao added status/LGT3 The PR has already had 3 LGTM. and removed require-LGT3 Indicates that the PR requires three LGTM. labels Nov 7, 2019
@tiancaiamao tiancaiamao deleted the update-check-txn-status branch November 7, 2019 11:09
XiaTianliang pushed a commit to XiaTianliang/tidb that referenced this pull request Dec 21, 2019
CheckTxnStatus introduces a non-block read mode. In this mode, TiDB can ignore
the secondary lock TTL check and send the CheckTxnStatus request.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/tikv sig/transaction SIG:Transaction status/LGT3 The PR has already had 3 LGTM.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants