config/v1: add platform status for OpenStack #374
Conversation
This is a manual workaround for testing until this PR merges: openshift/api#374
This is a temporary workaround to test PlatformStatus. It should be vendored properly once this PR merges: openshift/api#374
/cc @coreydaley @ironcladlou @deads2k I would really appreciate a quick review. We need this for OpenStack, and since it means updating the installer & MCO deps, I'd rather have this merged sooner if at all possible.
I am not that familiar with OpenStack, so I can't speak to whether those are all of the needed fields, but the syntax of what you have there looks correct to me.
config/v1/types_infrastructure.go (Outdated)

    APIVIP string `json:"apiVip,omitempty"`

    // DNSVIP is the VRRP address for the internal cluster DNS.
    DNSVIP string `json:"dnsVip,omitempty"`
@tomassedovic can you point to what needs to consume these fields?
I didn't have time to put up proper PRs for this yet (plan to do that tomorrow), but this is how it's intended to be used (from my in-development branches I tested today, the IPs are hardcoded, etc. for now):
tomassedovic/openshift-installer@74bff07
openshift/machine-config-operator@0490406#diff-401a0f8b1d8edd607fbaf7c85beba53a
This is more or less analogous to what @rgolangh is doing with oVirt (#369), and I believe the baremetal IPI folks will want to do the same in their case too. We need these fields to pass VIPs from the installer to static pods running on the nodes.
Since we've got full control over the field names, we can change those if they don't fit the project's style.
config/v1/types_infrastructure.go (Outdated)

    APIVIP string `json:"apiVip,omitempty"`

    // DNSVIP is the VRRP address for the internal cluster DNS.
    DNSVIP string `json:"dnsVip,omitempty"`
See #348 (comment)
I have the same question here about what DNS service this IP refers to, who provides and consumes the value, etc.
@ironcladlou the answers are pretty much the same (we're facing similar issues and solving them in a similar manner as baremetal and ovirt).
Unlike AWS, none of these platforms can expect a load balancer or DNS-as-a-service, so they have to provide the underlying DNS (e.g. the `api` & `api-int` endpoints and the etcd SRV records) some other way. The solution we've all converged on is that each node runs (and relies on) its own coredns static pod, with mDNS providing dynamic updates and keepalived for IP failover.
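For readers unfamiliar with the failover half of this setup, a minimal keepalived VRRP fragment of the kind such a static pod would run might look like the following. Every value here (interface name, router ID, priority, address) is illustrative and not taken from the actual MCO templates:

```
vrrp_instance cluster_API {
    state BACKUP
    interface ens3          # illustrative NIC name
    virtual_router_id 51    # illustrative; must match across nodes
    priority 40
    advert_int 1
    virtual_ipaddress {
        10.0.0.5            # the API VIP passed in via PlatformStatus
    }
}
```

Whichever node wins the VRRP election holds the VIP; if it goes down, another node takes the address over, which is what makes a plain IP usable in place of a cloud load balancer.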
You make good points about differentiating between this and the DNS service running inside the cluster as well as the VIP/VRRP implementation details leaking in.
Would something like this work?
    // ApiIntIp is an IP address managed by the OpenStack provider backing
    // the internal `api-int` record.
    ApiIntIp string `json:"apiIntIp,omitempty"`

    // ProviderDnsIp is the IP address for the internal DNS used by the nodes.
    // Unlike the one managed by the DNS operator, `ProviderDnsIp` provides
    // name resolution for the nodes themselves.
    ProviderDnsIp string `json:"providerDnsIp,omitempty"`
(I agree with the importance of choosing the right names and descriptions, but I'd rather not spend too long on this as the 4.2 feature freeze is looming over us)
Oh, and the values are consumed by the MCO, which provides the configuration for the static pods we need to run (and it's those extra static pods that need these IPs).
+1, understood, and I think your suggestion does a good job disambiguating. I just want to make sure I understood what these things mean and whether they implied some missed requirements in other components which might need to consume them.
(I agree with the importance of choosing the right names and descriptions, but I'd rather not spend too long on this as the 4.2 feature freeze is looming over us)
Sure, I don't want to get in the weeds rehashing the same points here. I personally don't want to block either PR on the subjective naming stuff.
@ironcladlou thanks, that makes perfect sense.
I've updated the PR, please let me know if there's anything else you need.
config/v1/types_infrastructure.go (Outdated)

    type OpenStackPlatformStatus struct {
        // apiIntIP is an IP address managed by the OpenStack provider
        // backing the internal `api-int` record.
        APIIntIP string `json:"apiIntIP,omitempty"`
we don't use abbreviations for internal. See https://github.com/openshift/api/blob/master/config/v1/types_infrastructure.go#L65 . Speaking of which, does this field correspond to this URL in some way?
Ah yes. This is an IP address that OpenStack generates, and it's what that URL will resolve to. Platforms without a DNSaaS (openstack, baremetal, ovirt) need to do this. I'll update the name.
I've updated the field name & comment. Hope this is clearer.
config/v1/types_infrastructure.go (Outdated)

    // OpenStackPlatformStatus holds the current status of the OpenStack infrastructure provider.
    type OpenStackPlatformStatus struct {
        // apiIntIP is an IP address managed by the OpenStack provider
        // backing the internal `api-int` record.
can you describe what this `api-int` record is used for? kube-apiserver? Some openstack informational URL?
Yeah, it's the kube API for the internal services.
The PR is rebased and the field names are identical to the recently-merged baremetal provider.
/retest The CI failures look unrelated to the changes here.
/retest
@ironcladlou @deads2k what are the next steps? Anything else you'd like to see here? This PR now mirrors the already-merged baremetal patch and adds the CloudName field.
/lgtm Looks good for the registry.
@deads2k The situation with OpenStack is identical to baremetal, so I've used the same wording. Is that okay?
This is part of the work to remove the service VM from the openstack architecture. It relies on the coredns/mdns static pods set up in openshift/machine-config-operator/pull/740. Squashed commits:

- openstack: remove service vm
- Run haproxy on 7443 with a NAT rule forwarding to 6443
- openstack: add clustervars that can be sourced by static pods
- hacks: integrate changes from pull/1808
- hacks: start haproxy with only the bootstrap node in the backends
- don't use /tmp for clustervars
- hacks: don't redirect traffic from the cluster CIDR
- hacks: open master SGs to the internet
- Set domain search to the cluster domain, so that nodes can resolve other nodes in the cluster using short names.
- Remove unused master_port_names tfvar
- WIP: some progress on getting Ignition via IP. The LbFloatingIP is now used as the predictable address for the API. It points to the bootstrap node first, then is moved to the first master upon bootstrap removal. As a consequence, the LbFloatingIP becomes a mandatory parameter for the installer on the OpenStack platform. This patch also cleans up the network architecture a little by removing the subnet complexity for the deployed nodes: there is now only one subnet for all the provisioned nodes.
- Some cleanup
- Stop setting selinux permissive on master nodes
- Use the bootstrap node's default hosts file
- Unbreak other platforms for ignition retrieval: make ignition retrieval via IP address specific to the OpenStack platform.
- bootstrap: add switch-api-endpoint service. There is a potential cycle where the temporary bootstrap control plane gets torn down and the API endpoint on the bootstrap node points to itself (via the floating IP) rather than to the masters. The installer waits for bootstrapping to complete, but the `progress` service is unable to send the bootstrap-complete event, because the API server is no longer running on the bootstrap node (which still has the FIP attached). Therefore, the installer never runs the bootstrap destroy terraform actions and the FIP is stuck on the bootstrap node. This adds a new service that waits until the `bootkube` and `openshift` services finish (just like `progress.service` does), but then creates an `/etc/hosts` entry for the API endpoints so that `progress` can communicate with the master control plane.
- Run haproxy from MCO
- WIP: current state of keepalived
- Add allow VIP on master and bootstrap nodes
- Add VIP for DNS
- WIP: vendor openshift api changes. This is a temporary workaround to test PlatformStatus; it should be vendored properly once this PR merges: openshift/api#374
- HARDCODE my own VIPs
- WIP: pass the openstack VIP addresses to MCO via PlatformStatus. The IP addresses are hardcoded for now (just like in the previous commit) for testing purposes; we will need to gather them from terraform/assets.
- WIP BOOTKUBE FIXUP. I've no idea why this is needed, but without it, bootkube fails when running etcd with: Jun 19 11:16:33 bootstrap bootkube.sh[1605]: error retrieving local image after pulling registry.svc.ci.openshift.org/origin/4.2-2019-06-19-092222@sha256:17b4446c44e90a7b50cf2d941127a1c78dd36f191cd728b1527ea9fd676c6b17: unable to find 'registry.svc.ci.openshift.org/origin/4.2-2019-06-19-092222@sha256:17b4446c44e90a7b50cf2d941127a1c78dd36f191cd728b1527ea9fd676c6b17' in local storage: no such image
- vendor: update openshift/api to the latest PR
- Use the new PlatformStatus fields
- cleanup. This should make the PR a little smaller.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adambkaplan, deads2k, tomassedovic The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
openstack: remove SRV records from service VM. This is part of the work to remove the service VM from the openstack architecture; it relies on the coredns/mdns static pods set up in openshift/machine-config-operator/pull/740 and carries the same squashed commits as above.
WIP: do not hardcode the OpenStack VIPs. Note that there's still one hardcoded IP in the keepalived.sh script that needs resolving, and none of this is tested; chances are the values are not being set / passed around cleanly.
The experimental OpenStack backend used to create an extra server running the DNS and load balancer services the cluster needed. OpenStack does not always come with DNSaaS or LBaaS, so we had to provide the functionality the OpenShift cluster depends on (e.g. the etcd SRV records, the api-int records & load balancing) ourselves. This approach is undesirable for two reasons: first, it adds an extra node that the other IPI platforms do not need; second, this node is a single point of failure.

The Baremetal platform has faced the same issues and solved them with a few virtual IP addresses managed by keepalived, in combination with a coredns static pod running on every node (using the mDNS protocol to update records as nodes are added or removed) and a similar haproxy static pod to load balance the control plane internally. The VIPs are defined here in the installer and passed to the necessary machine-config-operator fields via PlatformStatus: openshift/api#374

The Bare Metal IPI Networking Infrastructure document is broadly applicable here as well: https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

Notable differences in OpenStack:
* We only use the API and DNS VIPs right now.
* Instead of Baremetal's Ingress VIP (which is attached to the OpenShift routers), our haproxy static pods balance ports 80 & 443 to the worker nodes.
* We do not run coredns on the bootstrap node. Instead, the bootstrap node itself uses one of the masters for DNS.

These differences are not fundamental to OpenStack, and we will be looking at aligning more closely with the Baremetal provider in the future. There is also a great opportunity to share some of the configuration files and scripts here.

This change needs several other pull requests:
* Keepalived plus the coredns & haproxy static pods in the MCO: openshift/machine-config-operator/pull/740
* Passing the API and DNS VIPs through the installer: openshift#1998
* Vendoring the OpenStack PlatformStatus changes in the MCO: openshift/machine-config-operator#978
* Allowing use of PlatformStatus in the MCO templates: openshift/machine-config-operator#943

Co-authored-by: Emilio Garcia <egarcia@redhat.com>
Co-authored-by: John Trowbridge <trown@redhat.com>
Co-authored-by: Martin Andre <m.andre@redhat.com>
Co-authored-by: Tomas Sedovic <tsedovic@redhat.com>

Massive thanks to the Bare Metal and oVirt people!
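To illustrate the ingress load-balancing arrangement described above (haproxy balancing ports 80 & 443 to the workers instead of a dedicated Ingress VIP), a minimal haproxy fragment might look like this. The backend name and worker addresses are invented for illustration; the real config is generated dynamically on the nodes:

```
frontend ingress_http
    mode tcp
    bind :80
    default_backend workers_http

backend workers_http
    mode tcp
    balance roundrobin
    # illustrative worker addresses; the real list is regenerated as
    # workers join or leave the cluster
    server worker-0 10.0.128.10:80 check
    server worker-1 10.0.128.11:80 check
```

An analogous `frontend`/`backend` pair on :443 (in `mode tcp`, so TLS passes through to the routers) would cover HTTPS ingress.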
- openstack: point master nodes to themselves for DNS
- openstack: render coredns Corefile directly
- openstack: add test_data for openstack mdns files
- openstack: add A records for api(-int)
- hacks: forward all non-cluster traffic to 8.8.8.8
- Add *.apps wildcard entry
- don't use /tmp for clustervars
- hacks: send *.apps to all masters
- hacks: don't add bootstrap to round robin api-int
- run mdns on workers?
- openshift-metalkube -> openshift-metal3
- Run HAProxy in static pods on master nodes
- haproxy static pod almost working
- Move haproxy-watcher to a standalone script and escape go templating
- More good stuff for haproxy: use clustervars instead of a function in `pkg/controller/template/render.go` that only works when the MCO is running on the bootstrap node; only add workers to the haproxy config when there are workers in the cluster; reinitialize the variables for each iteration of the haproxy-watcher.sh script.
- Fix worker hostnames in mdns config. Hostnames for workers should look like: host-10-0-128-22
- WIP: current status for keepalived
- Add VIP for DNS
- Fix keepalived config: the vrrp_instance keyword for ${CLUSTER_NAME}_API was duplicated.
- Make the infra object available for template rendering. oVirt (and possibly other platforms) needs access to the InfrastructureStatus to properly render config files with things like the API VIP, DNS VIP, etc. By embedding the infra object it's possible to get at it directly in templates and render, for example, a coredns config file like this:
  ```
  {{ .Infra.Status.PlatformStatus.Ovirt.DnsVIP }} api-int.{{.EtcdDiscoveryDomain}} api.{{.EtcdDiscoveryDomain}} ns1.{{.EtcdDiscoveryDomain}}
  ```
  Signed-off-by: Roy Golan <rgolan@redhat.com>
- HACK: vendor: vendor openshift API PlatformStatus changes. This is a manual workaround for testing until this PR merges: openshift/api#374
- WIP: add worker DNS static pod and read VIPs from PlatformStatus
- vendor: update openshift/api to the latest changes
- Use openshift namespaced images where possible
- Update the latest fields from PlatformStatus
- WIP: replace 127.0.0.1 in the DHCP config with the node's fixed IP. This uses the runtimecfg tool: github.com/openshift/baremetal-runtimecfg. Eventually we'll want to replace all our other scripts that gather IPs for keepalived etc. with it too.
- escape
- Render /etc/dhcp/dhclient.conf and /etc/resolv.conf with a static pod. This doesn't seem to work, because the discovery pods running on the masters keep failing: they use the DNS from before the resolv.conf change. Resolv.conf is now updated at runtime once kubelet launches static pods. Also, the render-config pods keep restarting -- they succeed, but kubernetes thinks they should be kept running.
- Use the DNS VIP for DNS on all nodes. This should provide DNS for every node, relying on the master DNS VIP. It might cause delays, because there will be a time when the VIP is not active, coredns not having come up on any master yet. We should really run coredns on the bootstrap node too.
- Fix the copy-pasted filename for keepalived. The keepalived manifest was being created as `haproxy.yaml`, which kept overwriting the actual haproxy static pod, resulting in the LB never being run.
- Disable the workers' coredns & mdns-publisher temporarily. The scripts are currently not working and I think they may be blocking worker kubelet readiness. We're already setting the DNS VIP, so the workers should be able to resolve the cluster; the only thing that won't happen is that the workers' own hostnames won't be available. It's still not clear to me whether that's an issue or not. Going to disable this for now to get a (hopefully) stable deployment and then see.
- Add initial haproxy configuration. haproxy should always start with this config, as opposed to relying on haproxy-watcher to create it. Since this uses the API VIP, it should always work (so long as the VIP is active).
- Make the haproxy-watcher script more stable. First, it uses the API VIP unconditionally, so API access should always work (as long as the VIP is assigned to a node that responds to it -- which is handled by keepalived). Second, it runs with `set -e` and `set -o pipefail` so as not to mess up known-good configuration; if anything fails, the script won't break API access. I suspect this has been causing a lot of the weird racey issues and just general unpleasantness. Third, the master control plane no longer depends on this script running successfully even once: there is a separate file with the initial haproxy config, so even if this keeps failing, the control plane should be unaffected. (Of course, this script failing would mean the worker endpoints never get added, so we don't want that either.) Finally, it prints some messages to show what's going on, and sets `-x` for now to ease debugging; I expect that to go away, but it's helpful now.
- Fix haproxy-watcher bash indent syntax error
- Fix the worker server haproxy section as well
- More indent fixes because of yaml and bash heredoc interop
- Enable mdns-publisher on the workers again, this time bringing in the necessary mdns config too.
The experimental OpenStack backend used to create an extra server running the DNS and load balancer services the cluster needed. OpenStack does not always come with DNSaaS or LBaaS, so we had to provide the functionality the OpenShift cluster depends on ourselves (e.g. the etcd SRV records, the api-int records & load balancing, etc.). This approach is undesirable for two reasons: first, it adds an extra node that the other IPI platforms do not need; second, that node is a single point of failure.

The Baremetal platform has faced the same issues and solved them with a few virtual IP addresses managed by keepalived, combined with a coredns static pod running on every node (using the mDNS protocol to update records as nodes are added or removed) and a similar haproxy static pod to load balance the control plane internally.

The VIPs are defined here in the installer and are passed to the necessary machine-config-operator fields via PlatformStatus: openshift/api#374

The Bare Metal IPI Networking Infrastructure document is broadly applicable here as well: https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

Notable differences in OpenStack:

* We only use the API and DNS VIPs right now
* Instead of Baremetal's Ingress VIP (which is attached to the OpenShift routers), our haproxy static pods balance ports 80 & 443 to the worker nodes
* We do not run coredns on the bootstrap node. Instead, the bootstrap node uses one of the masters for DNS.

These differences are not fundamental to OpenStack and we will be looking at aligning more closely with the Baremetal provider in the future. There is also a great opportunity to share some of the configuration files and scripts here.

This change needs several other pull requests:

* Keepalived plus the coredns & haproxy static pods in the MCO: openshift/machine-config-operator#740
* Passing the API and DNS VIPs through the installer: openshift#1998
* Vendoring the OpenStack PlatformStatus changes in the MCO: openshift/machine-config-operator#978
* Allowing use of PlatformStatus in the MCO templates: openshift/machine-config-operator#943

Co-authored-by: Emilio Garcia <egarcia@redhat.com>
Co-authored-by: John Trowbridge <trown@redhat.com>
Co-authored-by: Martin Andre <m.andre@redhat.com>
Co-authored-by: Tomas Sedovic <tsedovic@redhat.com>

Massive thanks to the Bare Metal and oVirt people!
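The MCO renders its config templates with Go's text/template, so once the infra object is exposed to the renderer (openshift/machine-config-operator#943), a Corefile-style snippet can pull the VIPs straight from PlatformStatus. The sketch below uses simplified stand-in types and hypothetical values; only the dotted template path mirrors the real object shape:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Simplified stand-ins for the real infrastructure/render-config types.
type openStackStatus struct{ APIVIP, DNSVIP string }
type platformStatus struct{ OpenStack openStackStatus }
type status struct{ PlatformStatus platformStatus }
type infra struct{ Status status }

type renderData struct {
	Infra               infra
	EtcdDiscoveryDomain string
}

// A Corefile-style line analogous to the oVirt example quoted above,
// reading the DNS VIP from the embedded infra object.
const corefileTmpl = `{{ .Infra.Status.PlatformStatus.OpenStack.DNSVIP }} api-int.{{.EtcdDiscoveryDomain}} api.{{.EtcdDiscoveryDomain}} ns1.{{.EtcdDiscoveryDomain}}`

func renderCorefile(data renderData) string {
	t := template.Must(template.New("corefile").Parse(corefileTmpl))
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	data := renderData{
		Infra:               infra{status{platformStatus{openStackStatus{APIVIP: "10.0.0.5", DNSVIP: "10.0.0.6"}}}},
		EtcdDiscoveryDomain: "example.cluster.local",
	}
	fmt.Println(renderCorefile(data))
	// 10.0.0.6 api-int.example.cluster.local api.example.cluster.local ns1.example.cluster.local
}
```

The same mechanism serves the keepalived and haproxy static pod manifests: any field reachable from the embedded infra object is available to the templates without extra plumbing.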
The two fields OpenStack needs are the API and DNS VIPs, which are consumed by the coredns and load balancer static pods running on the nodes.
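For illustration, the keepalived side of this looks roughly like the fragment below: one vrrp_instance per VIP, with the virtual address taken from the PlatformStatus field. The interface name, router id, priority, and addresses here are placeholders, not values defined by this PR:

```
# Sketch of a keepalived vrrp_instance for the API VIP (all values illustrative).
vrrp_instance mycluster_API {
    state BACKUP          # all nodes start as BACKup; keepalived elects a master
    interface ens3        # placeholder NIC name
    virtual_router_id 51  # must be unique per VIP on the shared L2 segment
    priority 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.5/24       # would come from PlatformStatus.OpenStack apiVip
    }
}
```

A second, analogous vrrp_instance (with a distinct virtual_router_id) would carry the DNS VIP.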