-
Notifications
You must be signed in to change notification settings - Fork 190
Reduce gateway failover detection interval #348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🤖 Created branch: z_pr348/sridhargaddam/leader-interval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make those parameters configurable.
I'm afraid this could eventually generate spurious fail-overs in busy clusters, so may be it's better to make this configurable (more env vars...)
Eventually I think we should support a configmap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's safer to make this configurable, probably with the low defaults you're suggesting, but being safer because I'm not sure of the impact in busy k8s.
In an HA setup where multiple nodes are labelled as Gateway nodes, when the active SubmarinerEngine/Node goes down, it takes about 15+ seconds for one of the backup Submariner Gateway PODs to become active. This patch makes the leader election values configurable with some defaults that reduces this interval. Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
Done |
ff4bc8c
to
9334c3f
Compare
🤖 Updated branch: z_pr348/sridhargaddam/leader-interval |
@@ -43,6 +43,19 @@ func init() { | |||
flag.StringVar(&localMasterURL, "master", "", "The address of the Kubernetes API server. Overrides any value in kubeconfig. Only required if out-of-cluster.") | |||
} | |||
|
|||
type leaderConfig struct { | |||
LeaseDuration int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also annotate with default:"5"
and the module will handle it, but I'm fin with it now, we need to release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I tried it and it did not seem to work when I tested it. So, I modified it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, ok, makes sense then. I guess the lib is buggy
err := envconfig.Process(leadershipConfigEnvPrefix, &gwLeadershipConfig) | ||
if err != nil { | ||
klog.Fatalf("error processing environment config for %s: %v", leadershipConfigEnvPrefix, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should start using a ConfigMap for this type of config ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many params that can be moved to a ConfigMap. Is it ok if we take it up in a separate patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s something we should do, and also the secrets. There’s an open issue for that. May be something we can do after this release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
🤖 Closed branches: [z_pr348/sridhargaddam/leader-interval] |
In an HA setup where multiple nodes are labelled as Gateway
nodes, when the active SubmarinerEngine/Node goes down, it takes
about 15+ seconds for one of the backup Submariner Gateway PODs
to become active. This patch makes the leader election values
configurable along with some defaults that reduce this interval.
Signed-off-by: Sridhar Gaddam sgaddam@redhat.com