Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pump client: increase retry time, and refine some code #158

Merged
merged 31 commits into from Jan 10, 2019

Conversation

@WangXiangUSTC
Copy link
Member

commented Jan 2, 2019

What problem does this PR solve?

the retry time interval is too small, may cause no available pump error when update the tidb cluster.
issue: https://internal.pingcap.net/jira/browse/TOOL-772

What is changed and how it works?

  1. use write binlog timeout config set in tidb
  2. when all pumps write binlog failed, use all the unavailable pump to try again
  3. handle error for watch etcd
  4. refine log and some code

Check List

Tests

  • Unit test
@WangXiangUSTC

This comment has been minimized.

Copy link
Member Author

commented Jan 2, 2019

/run-all-tests

@zhouqiang-cl

This comment has been minimized.

Copy link
Member

commented Jan 2, 2019

/rebuild

@GregoryIan

This comment has been minimized.

Copy link
Collaborator

commented Jan 3, 2019

@july2993 PTAL

@WangXiangUSTC WangXiangUSTC added status/WIP and removed status/PTAL labels Jan 3, 2019
}

// setPumpAvaliable set pump's isAvaliable, and modify UnAvaliablePumps or AvaliablePumps.
func (c *PumpsClient) setPumpAvaliable(pump *PumpStatus, avaliable bool) {
pump.IsAvaliable = avaliable
if pump.IsAvaliable {
err := pump.createGrpcClient(c.Security)

This comment has been minimized.

Copy link
@GregoryIan

GregoryIan Jan 3, 2019

Collaborator

nice modification

@WangXiangUSTC

This comment has been minimized.

Copy link
Member Author

commented Jan 4, 2019

/run-all-tests

@WangXiangUSTC

This comment has been minimized.

Copy link
Member Author

commented Jan 7, 2019

/run-all-tests

}

// NewPumpStatus returns a new PumpStatus according to node's status.
func NewPumpStatus(status *node.Status, security *tls.Config) *PumpStatus {
pumpStatus := &PumpStatus{}
pumpStatus.Status = *status
pumpStatus.IsAvaliable = (status.State == node.Online)
pumpStatus.security = security

if status.State != node.Online {
return pumpStatus

This comment has been minimized.

Copy link
@GregoryIan

GregoryIan Jan 9, 2019

Collaborator

need to add ErrNum?

This comment has been minimized.

Copy link
@GregoryIan

GregoryIan Jan 9, 2019

Collaborator

I don't understand definition of pump available, is pump unavailable after pump fail to write binlog?

This comment has been minimized.

Copy link
@WangXiangUSTC

WangXiangUSTC Jan 9, 2019

Author Member

all use pump.IsUsable() now

This comment has been minimized.

Copy link
@WangXiangUSTC

WangXiangUSTC Jan 9, 2019

Author Member

errNum default is 0

@WangXiangUSTC

This comment has been minimized.

Copy link
Member Author

commented Jan 9, 2019

@GregoryIan @july2993 PTAL again

p.Lock()
defer p.Unlock()

atomic.StoreInt64(&p.ErrNum, 0)

This comment has been minimized.

Copy link
@GregoryIan

GregoryIan Jan 9, 2019

Collaborator

also put it into create grpc client?

@july2993

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2019

note, this case may be easy to happen
1, stop all pumps, all pump is update to paused
2, receive WriteBinlog, no pumps available( fall to the so called backoffWriteBinlog case)
3, pump start again, or it have started but you update pumps in Selector before receive WriteBinlog

} else {
err := p.createGrpcClient()
if err != nil {
atomic.AddInt64(&p.ErrNum, 1)

This comment has been minimized.

Copy link
@GregoryIan

GregoryIan Jan 9, 2019

Collaborator

put it into createGrpcClient

@july2993

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2019

add some log at unexpect case like in WriteBinlog fall back to backoffWriteBinlog case

@@ -117,6 +114,8 @@ type PumpsClient struct {

// binlog socket file path, for compatible with kafka version pump.
binlogSocket string

This comment has been minimized.

Copy link
@GregoryIan

GregoryIan Jan 9, 2019

Collaborator

RetryTime of PumpsClient is useless

This comment has been minimized.

Copy link
@WangXiangUSTC

WangXiangUSTC Jan 9, 2019

Author Member

can't remove now, because tools relay on tidb, and tidb relay on tools, this variable is used in tidb's test, if remove this, will build failed. I will add a comment for this variable.

@GregoryIan

This comment has been minimized.

Copy link
Collaborator

commented Jan 10, 2019

LGTM

@GregoryIan GregoryIan added status/LGT1 and removed status/PTAL labels Jan 10, 2019
Copy link
Contributor

left a comment

LGTM

@july2993 july2993 added status/LGT2 and removed status/LGT1 labels Jan 10, 2019
@WangXiangUSTC WangXiangUSTC merged commit 15b677a into pingcap:master Jan 10, 2019
2 checks passed
2 checks passed
continuous-integration/jenkins/pr-head This commit looks good
Details
continuous-integration/jenkins/pr-merge This commit looks good
Details
@WangXiangUSTC WangXiangUSTC deleted the WangXiangUSTC:xiang/refine_pump_client branch Jan 10, 2019
WangXiangUSTC added a commit to WangXiangUSTC/tidb-tools that referenced this pull request Mar 25, 2019
GregoryIan added a commit that referenced this pull request Mar 29, 2019
… strategy config (#223)

* pump client: compatible with kafka version tidb-binlog && add unit test (#139)

* pump client: write commit binlog will never return error (#148)

* pkg watcher: move watcher from tidb-enterprise-tools (#146)

* pump client: increase retry time, and refine some code (#158)

* pump client: add initial log function (#165)

* pump client: support change select pump's strategy (#221)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.