New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
etcd: Add initial support for an IPv6 control plane #1211
Conversation
This has been rebased. @runcom any thoughts on this? I think it should be safe. There shouldn't be any changes in behavior in the normal case. |
I think we need to be careful about IPv6 until we get kube 1.16.3 there is a known bug[1] that we have resolved in etcd via 3.3.17[2] but won't get into the client (apiserver) until 1.16.3. I believe this is a hole in CI as AWS and GCP use ipv4 to my knowledge. [1] kubernetes/kubernetes#83550 |
Thanks for the pointer to the bug. I haven't hit that one (yet?). re: CI, indeed, there's no IPv6 CI yet. That's on the todo list as soon as I can get enough of the cluster up and running. I've an IPv6 cluster coming up partially on AWS with a bunch of changes (including the ones in this PR). The etcd cluster forms using IPv6. Right now I'm working through cluster SDN issues. We'll have CI soon to help exercise this further. |
FTR 1.16.3 should land next week[1] so this won't be a prolonged issue for supporting IPv6. |
This...seems safer to do in 4.4 to me. But I know that's annoying as it makes development now harder. |
I'm agreeing with this actually, we cannot take this PR anyway at this point tho. It looks pretty safeto me tho so when 4.4 opens it can go in |
/skip |
/retest |
@hexfusion @alaypatel07 PTAL |
/skip |
I picked 3.3.17 client into apiserver so I believe we are good here as 1.16.3 won't actually land in 4.3. |
/hold etcd-team would like a little time to manually test this and review. We are moving on this now. |
@@ -22,6 +22,10 @@ contents: | | |||
--container-runtime=remote \ | |||
--container-runtime-endpoint=/var/run/crio/crio.sock \ | |||
--node-labels=node-role.kubernetes.io/master,node.openshift.io/os_id=${ID} \ | |||
{{- if .KubeletIPv6}} | |||
--node-ip :: \ | |||
--address :: \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
--address 0.0.0.0
The IP address for the Kubelet to serve on (set to 0.0.0.0 for all IPv4 interfaces and `::` for all IPv6 interfaces) (default 0.0.0.0) (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out that this was deprecated. I missed that.
@@ -22,6 +22,10 @@ contents: | | |||
--container-runtime=remote \ | |||
--container-runtime-endpoint=/var/run/crio/crio.sock \ | |||
--node-labels=node-role.kubernetes.io/master,node.openshift.io/os_id=${ID} \ | |||
{{- if .KubeletIPv6}} | |||
--node-ip :: \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
--node-ip string
IP address of the node. If set, kubelet will use this IP address for the node
What does it mean to set the node-ip to ::
?? will this break the apiserver to node communication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It requires a not-yet-merged upstream patch to kubelet. https://github.com/kubernetes/kubernetes/pull/85850/files
@russellb general question, how can setup-etcd-env know |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@russellb Added comments inline. For context, we added changes to cluster-etcd-operator related to ipv6 recently, if possible can you change the naming convention of the env variables to something like this? openshift/cluster-etcd-operator@57825f6 this will help the etcd team in reading the code. Thanks
cmd/setup-etcd-environment/run.go
Outdated
"WILDCARD_DNS_NAME": fmt.Sprintf("*.%s", setupEnv.opts.discoverySRV), | ||
// TODO This can actually be IPv6, so we should rename this ... | ||
"IPV4_ADDRESS": setupEnv.etcdIP, | ||
"ESCAPED_IP_ADDR": escapedIP, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ESCAPED_IP_ADDR/IP_ADDRESS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made it ESCAPED_IP_ADDRESS
// TODO This can actually be IPv6, so we should rename this ... | ||
"IPV4_ADDRESS": setupEnv.etcdIP, | ||
"ESCAPED_IP_ADDR": escapedIP, | ||
"ESCAPED_ALL_IPS": escapedAllIPs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would really help readability if instead of ESCAPED_ALL_IPS
we have LISTEN_CLIENT_URLS
and LISTEN_PEER_URLS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'd need LISTEN_CLIENT_URLS, LISETN_PEER_URLS, LISTEN_METRIC_URLS, and METRICS_ADDR. sure you want 4 variables instead of this 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@russellb thanks for pointing out, I thought it would make the naming more explicit but having 4 variables instead of 1 would be an overdo. Feel free to mark it resolved.
"ESCAPED_IP_ADDR": escapedIP, | ||
"ESCAPED_ALL_IPS": escapedAllIPs, | ||
"LOCALHOST_IP": localhostIP, | ||
"ESCAPED_LOCALHOST_IP": escapedLocalhostIP, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
confused why we need both, is it possible to conditionally set LOCALHOST_IP and conditionally escape it if ipv6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's kind of annoying, because we only want to use it in its escaped form in some cases (in a URL), but not when it's provided as a raw IP, that's why it's there twice
took a look at the PR and left a comment on one issue I spotted. If you don't like the ESCAPED versions of variables in the PR, I could move that logic into shell code instead, I suppose. |
as requested in during review of PR openshift#1211.
I've updated this PR to drop the kubelet config changes, which depended on a kubelet change to land first. We can address that with a follow-up PR. @danwinship I think this should be fine to merge now ... |
@alaypatel07 @hexfusion PTAL |
@russellb: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@russellb sorry for the delay in getting back and thanks for the PR, the changes look fine to me!
I reviewed the changes and looks good to me. /lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks ok to me.
@alaypatel07 @retroflexer please feel free to remove @hexfusion 's hold. this is ready to go. |
To expedite, I'll just remove the hold myself. :) /hold cancel |
not sure why tide says it still needs and approved label when it clearly has one? it's kind of wonky lately. /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kikisdeliveryservice, retroflexer, russellb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
/cherry-pick release-4.3 |
@russellb: #1211 failed to apply on top of branch "release-4.3":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
well, not a clean cherry-pick, but I already have a 4.3 version of these changes in this branch: https://github.com/openshift-kni/machine-config-operator/commits/4.3-ipv6 |
as requested in during review of PR openshift#1211.
as requested in during review of PR openshift#1211.
I'm working on bringing OpenShift up with an IPv6 control plane. With these changes, the etcd cluster forms and the bootstrap process continues. There are still some suspicious messages in the etcd-member log, but this seems like a good start.
The changes should have no impact to other installs unless the etcd DNS records are created with IPv6 addresses instead of IPv4. Otherwise, there should be no changes in the resulting behavior.
This is not intended to be complete IPv6 support, just a start. There's an
IPV4_ADDRESS
variable I left unchanged to cut down on the size of the patch. I also noticed similar issues still need to be fixed in the recovery tools.