server, version: check and wait if cluster is incompatible #2695
Signed-off-by: Neil Shen <overvenus@gmail.com>
pkg/version/check.go
Outdated
// TODO bump 5.2.0-alpha once PD releases.
minPDVersion *semver.Version = semver.New("5.1.0-alpha")
MinPDVersion *semver.Version = semver.New("5.1.0-alpha")
Should this version be updated to 5.2.0-alpha? Ditto for MinTiKVVersion.
cdc/server.go
Outdated
err = version.CheckClusterVersion(
	ctx, pdClient, pdEndpoint, security, errorTiKVIncompatible)
It seems the parameter errorTiKVIncompatible can be removed from the function CheckClusterVersion.
Signed-off-by: Neil Shen <overvenus@gmail.com>
…into warn-incompatible
@@ -161,7 +161,7 @@ func (f factoryImpl) PdClient() (pd.Client, error) {

// TODO: we need to check all pd endpoint and make sure they belong to the same cluster.
// See also: https://github.com/pingcap/ticdc/pull/2341#discussion_r673021305.
err = version.CheckClusterVersion(ctx, pdClient, pdEndpoints[0], credential, true)
err = version.CheckClusterVersion(ctx, pdClient, pdEndpoints[0], credential)
I think we need to check all PD endpoints to avoid failure when the first PD endpoint is unreachable.
#2713 implements your suggestion.
// TODO bump 5.2.0-alpha once PD releases.
minPDVersion *semver.Version = semver.New("5.1.0-alpha")
// MinPDVersion is the version of the minimal compatible PD.
MinPDVersion *semver.Version = semver.New("5.2.0-alpha")
Can we remove all the version-related global variables? I think we can check whether the major/minor versions of TiKV, PD, and TiCDC match instead.
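A rough sketch of what such a major/minor comparison could look like follows. It uses only the standard library rather than the semver package from the diff, and `majorMinor` and `sameMajorMinor` are hypothetical helpers, not functions from this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor extracts the major and minor numbers from a version string
// such as "5.2.0-alpha" or "v5.2.1".
func majorMinor(v string) (int, int, error) {
	parts := strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("malformed version %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("malformed major in %q: %w", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("malformed minor in %q: %w", v, err)
	}
	return major, minor, nil
}

// sameMajorMinor reports whether two versions share major and minor numbers,
// ignoring the patch number and any pre-release suffix.
func sameMajorMinor(a, b string) (bool, error) {
	aMajor, aMinor, err := majorMinor(a)
	if err != nil {
		return false, err
	}
	bMajor, bMinor, err := majorMinor(b)
	if err != nil {
		return false, err
	}
	return aMajor == bMajor && aMinor == bMinor, nil
}

func main() {
	ok, err := sameMajorMinor("5.2.0-alpha", "5.2.1")
	fmt.Println(ok, err) // prints "true <nil>"
}
```

Comparing components against each other this way avoids hard-coded minimum versions, at the cost of rejecting any mixed-minor cluster, which may be stricter than intended during a rolling upgrade.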
// Check cluster version and wait if it's incompatible.
// We start status server first to not block tiup cluster upgrading.
checkAndWaitClusterVersion(ctx, s.pdClient, s.pdEndpoints, conf.Security)

return s.run(ctx)
Currently the /status API is available as soon as the status server starts. However, #2691 is considering changing this behavior so that the /status API becomes available only after capture info is persisted to etcd.
if err == nil {
	break
}
Two scenarios conflict with each other here:
- High availability: with more than 3 PDs upstream and one of them down, TiCDC should still be able to start. In this scenario TiCDC only needs to check one PD.
- Upstream upgrade: if TiCDC should wait until all PDs are upgraded to the required version, then TiCDC has to check all PDs.
@overvenus: PR needs rebase.
Does this PR need to be pushed?
What problem does this PR solve?
Check the cluster version and wait if the cluster is incompatible before starting CDC.
Check List
Tests
Related changes
Release note