store/tikv: abort an unsafe transaction subject to gc command #4469

Merged
merged 37 commits into from Sep 21, 2017

Contributor

atmzhou commented Sep 7, 2017

Whenever a transaction fetches some data, we check whether the cached safepoint is fresh. If it is out-of-date, we abort the transaction. If the start timestamp of the transaction falls behind the cached safepoint, we also abort the transaction.

@atmzhou atmzhou requested review from AndreMouche and disksing Sep 7, 2017

CLAassistant commented Sep 7, 2017

CLA assistant check
All committers have signed the CLA.

@disksing disksing requested a review from tiancaiamao Sep 7, 2017

@coocood coocood added the contribution label Sep 7, 2017

@coocood coocood removed the contribution label Sep 7, 2017

log.Infof("[gc worker] %s start.", w.uuid)
w.session = createSession(w.store)

@AndreMouche

AndreMouche Sep 11, 2017

Member

Why do we need to call createSession again after L101?

@atmzhou

atmzhou Sep 14, 2017

Contributor

I have removed this line. Thanks.

Member

disksing commented Sep 12, 2017

LGTM.

@@ -43,10 +44,37 @@ type GCWorker struct {
lastFinish time.Time
cancel goctx.CancelFunc
done chan error
session tidb.Session

@tiancaiamao

tiancaiamao Sep 12, 2017

Contributor

A data race may exist here. Both the doGC and StartSafePointChecker goroutines call loadSafePoint-related functions; do they both use this session?

@atmzhou

atmzhou Sep 14, 2017

Contributor

Thanks, it has been changed.

cachedSafePoint := s.safePoint
cachedTime := s.spTime
s.spMutex.RUnlock()
diff := time.Since(cachedTime)

@tiancaiamao

tiancaiamao Sep 12, 2017

Contributor

It's possible that SafePointChecker can't load the data within 10s.
Then this check makes reading data from TiKV totally unavailable. What's worse,
if CheckVisibility fails, we can't read and update the safepoint,
and if we can't update the safepoint, CheckVisibility fails again...

@atmzhou

atmzhou Sep 14, 2017

Contributor

Good Insight! I have resorted to ETCD for storing data.
Thanks!

s.waitUntilErrorPlugIn(txn2.startTS)
_, geterr2 := txn2.Get(encodeKey(s.prefix, s08d("key", 0)))
c.Assert(geterr2, NotNil)

@tiancaiamao

tiancaiamao Sep 12, 2017

Contributor

Check the error detail, not just that the error is not nil.

@atmzhou

atmzhou Sep 14, 2017

Contributor

Thanks, I have revised.

atmzhou added some commits Sep 14, 2017

Contributor

tiancaiamao commented Sep 18, 2017

LGTM

Member

disksing commented Sep 18, 2017

LGTM. PTAL @AndreMouche

}
// Get implements the Get method for SafePointKV
func (w *MockSafePointKV) Get(k string) (string, error) {

@AndreMouche

AndreMouche Sep 18, 2017

Member

The error return seems redundant, since it always returns nil?

@atmzhou

atmzhou Sep 18, 2017

Contributor

MockSafePointKV implements SafePointKV.
SafePointKV is also implemented by EtcdSafePointKV, which may return a non-nil error.
So, the interface should be compatible with both implementations.

ctx, cancel := goctx.WithTimeout(goctx.Background(), time.Second*5)
_, err := w.cli.Put(ctx, k, v)
cancel()
if err != nil {

@AndreMouche

AndreMouche Sep 18, 2017

Member

You can return errors.Trace(err) directly here

@atmzhou

atmzhou Sep 18, 2017

Contributor

Thanks~

@AndreMouche

LGTM

@AndreMouche AndreMouche merged commit f9e48cb into master Sep 21, 2017

5 checks passed

ci/circleci: Your tests passed on CircleCI!
continuous-integration/travis-ci/pr: The Travis CI build passed
coverage/coveralls: First build on master at 72.352%
jenkins-ci-tidb/build: Jenkins job succeeded.
license/cla: Contributor License Agreement is signed.

@AndreMouche AndreMouche deleted the ningnanzhou/checksafepoint branch Sep 21, 2017
