openstack: Add coredns, mdns-publisher and haproxy static pods #740

Merged
merged 5 commits into from
Aug 2, 2019

@trown trown commented May 13, 2019

This is part of the work to remove the "service" VM from the OpenStack architecture. Instead of running coredns on this extra VM, we will run it in a static pod on all masters. We are using the mdns work from metal3.io in order to coordinate the DNS records on the masters.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 13, 2019
trown pushed a commit to trown/installer that referenced this pull request May 13, 2019
This is part of the work to remove the service VM from the
openstack architecture. This relies on the coredns/mdns static
pods setup in: openshift/machine-config-operator/pull/740
@cgwalters cgwalters added the 4.2 label May 14, 2019
Member

@cgwalters cgwalters left a comment

/approve

path: "/etc/kubernetes/static-pod-resources/mdns"
initContainers:
- name: render-config
image: quay.io/openshift/origin-node:latest
Member

This will need to be added to image-references and come from the payload.

imagePullPolicy: IfNotPresent
containers:
- name: mdns-publisher
image: quay.io/openshift-metalkube/mdns-publisher:latest
Member

Ditto.

export NON_VIRTUAL_IP
export DOMAIN
export CLUSTER_NAME
/usr/libexec/platform-python -c "from __future__ import print_function
Member

I think the more correct approach is for the image to install the Python it wants.

CLUSTER_NAME={{(split "." .EtcdDiscoveryDomain)._0}}
DOMAIN={{.EtcdDiscoveryDomain}}
API_VIP="$(dig +noall +answer "api.${DOMAIN}" | awk '{print $NF}')"
SUBNET_CIDR="$(ip addr show | grep -v "scope host" | grep -Po 'inet \K[\d.]+/[\d.]+' | head -n1)"
Member

This feels a bit hacky...

Author

I agree. I added a TODO to come up with a better method of getting these.
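For illustration, the two values the snippet above derives (the cluster name as the first label of the discovery domain, and the first IPv4 address with prefix that is not host-scoped) can be sketched without the grep/awk pipeline. This is a minimal sketch, not the method the PR adopted: `cluster_name` and `first_ipv4_cidr` are hypothetical helper names, and the sample `ip addr show` text is made up for the example.

```python
import ipaddress
import re


def cluster_name(discovery_domain: str) -> str:
    """Return the first DNS label, like splitting the domain on '.' and taking item 0."""
    return discovery_domain.split(".")[0]


def first_ipv4_cidr(ip_addr_output: str) -> str:
    """Return the first IPv4 'address/prefix' that is not on a 'scope host' line."""
    for line in ip_addr_output.splitlines():
        if "scope host" in line:
            continue  # skip loopback entries, as `grep -v "scope host"` does
        match = re.search(r"inet ([\d.]+/\d+)", line)  # 'inet6' lines do not match
        if match:
            # Validate the match instead of trusting the regex alone.
            ipaddress.ip_interface(match.group(1))
            return match.group(1)
    raise ValueError("no non-loopback IPv4 address found")


# Hypothetical sample input, shaped like `ip addr show` output.
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 10.0.0.5/24 brd 10.0.0.255 scope global dynamic eth0
    inet6 fe80::1/64 scope link
"""

cidr = first_ipv4_cidr(SAMPLE)
print(cidr)                                       # 10.0.0.5/24
print(ipaddress.ip_interface(cidr).network)       # 10.0.0.0/24
print(cluster_name("mycluster.example.com"))      # mycluster
```

Note that the extracted value is a host address with its prefix length; `ipaddress.ip_interface(...).network` gives the enclosing subnet if that is what a consumer of `SUBNET_CIDR` actually needs.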

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 14, 2019
Author

trown commented May 15, 2019

/hold

need to properly publish the coredns and mdns-publisher images this patch uses

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 15, 2019
creationTimestamp:
deletionGracePeriodSeconds: 65
labels:
app: kni-infra-mdns
Contributor

nit: remove kni

args:
- "--conf"
- "/etc/coredns/Corefile"
resources:
Contributor

These requests are pretty high.

trown pushed a commit to trown/installer that referenced this pull request May 30, 2019
trown pushed a commit to trown/installer that referenced this pull request May 30, 2019
trown pushed a commit to trown/installer that referenced this pull request Jun 7, 2019
trown pushed a commit to trown/installer that referenced this pull request Jun 21, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Jun 24, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Jun 25, 2019
trown pushed a commit to trown/installer that referenced this pull request Jun 26, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Jun 27, 2019
trown pushed a commit to trown/installer that referenced this pull request Jun 28, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Jul 2, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Jul 3, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Jul 5, 2019
Each of these commits carries the same message:
This is part of the work to remove the service VM from the
openstack architecture. This relies on the coredns/mdns static
pods setup in: openshift/machine-config-operator/pull/740
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2019
mandre added a commit to mandre/machine-config-operator that referenced this pull request Jul 30, 2019
openshift#740 and openshift/installer#1959 depend on each other.
We need to break this dependency to allow them to merge while keeping
the e2e-openstack job green.

This patch disables the static pods added for the openstack platform in
openshift#740 unless the installer provided the needed info set via
openshift/installer#1959.

It can be safely reverted once openshift/installer#1959 merges.
John Trowbridge and others added 5 commits July 30, 2019 23:30
… pods

In order to have a more fault tolerant networking architecture, we are
replacing the functionality of the service vm with a number of static
pod resources that are run on master and worker nodes.

We use Baremetal-RuntimeCfg to clean up our code, as well as align our
architecture much more closely with that of the Baremetal Team.

We need the wildcard record on the master nodes to resolve the Ingress
IP. Since the `hosts` plugin doesn't support wildcards, it is replaced
with the RFC 1035-style zone DB file which serves the API and wildcard
records via the `file` plugin.

This also adds the API record to both the bootstrap and master nodes.
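To make the zone-file approach concrete, here is a minimal sketch of rendering an RFC 1035-style zone with one API record and one wildcard record. It is not the template from this PR: `render_zone` is a hypothetical helper, the `api` and `*.apps` record names, the SOA values, the TTL, and the example addresses are all assumptions chosen for illustration.

```python
def render_zone(domain: str, api_vip: str, ingress_vip: str, ttl: int = 30) -> str:
    """Render a tiny zone file: SOA, an API A record, and a wildcard A record.

    A wildcard owner name such as '*.apps' is valid in a zone file, which is
    what a hosts-style name/address list cannot express.
    """
    records = [
        f"$ORIGIN {domain}.",
        f"$TTL {ttl}",
        # SOA fields: primary NS, contact, serial, refresh, retry, expire, minimum.
        f"@ IN SOA ns1.{domain}. hostmaster.{domain}. 1 7200 3600 1209600 {ttl}",
        f"api IN A {api_vip}",         # assumed name for the API VIP record
        f"*.apps IN A {ingress_vip}",  # assumed wildcard name for the Ingress VIP
    ]
    return "\n".join(records) + "\n"


# Hypothetical values; 192.0.2.0/24 is a documentation-only address range.
zone = render_zone("mycluster.example.com", "192.0.2.5", "192.0.2.7")
print(zone)
```

With a zone like this, any name under the wildcard (for example `console.apps.mycluster.example.com`) would resolve to the Ingress address, while `api.mycluster.example.com` resolves to the API address.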
…tches

@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2019
Member

mandre commented Jul 31, 2019

/test e2e-openstack

Member

mandre commented Jul 31, 2019

/test e2e-openstack

@tomassedovic
Contributor

/retest

@tomassedovic
Contributor

/retest

This worked for me manually on the master. Looks more like a timeout on the openstack job.

tomassedovic pushed a commit to tomassedovic/openshift-installer that referenced this pull request Jul 31, 2019
The experimental OpenStack backend used to create an extra server
running DNS and load balancer services that the cluster needed.
OpenStack does not always come with DNSaaS or LBaaS so we had to provide
the functionality the OpenShift cluster depends on (e.g. the etcd SRV
records, the api-int records & load balancing, etc.).

This approach is undesirable for two reasons: first, it adds an extra
node that the other IPI platforms do not need. Second, this node is a
single point of failure.

The Baremetal platform has faced the same issues and they have solved
them with a few virtual IP addresses managed by keepalived in
combination with coredns static pod running on every node using the mDNS
protocol to update records as new nodes are added or removed and a
similar static pod haproxy to load balance the control plane internally.

The VIPs are defined here in the installer and they use the
PlatformStatus field to be passed to the necessary
machine-config-operator fields:

openshift/api#374

The Bare Metal IPI Networking Infrastructure document is applicable here as
well:

https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

There is also a great opportunity to share some of the configuration
files and scripts here.

This change needs several other pull requests:

Keepalived plus the coredns & haproxy static pods in the MCO:
openshift/machine-config-operator#740

Passing the API and DNS VIPs through the installer:
openshift#1998

Co-authored-by: Emilio Garcia <egarcia@redhat.com>
Co-authored-by: John Trowbridge <trown@redhat.com>
Co-authored-by: Martin Andre <m.andre@redhat.com>
Co-authored-by: Tomas Sedovic <tsedovic@redhat.com>

Massive thanks to the Bare Metal and oVirt people!
@tomassedovic
Contributor

/retest

@tomassedovic
Contributor

@cgwalters @runcom would you mind having a look at this PR?

It's the OpenStack equivalent to: #795

We've tested it locally (with and without the corresponding installer PR openshift/installer#1959), it passes the CI and OpenStack has a FFE for this work. Is there anything else you'd like us to do?

@@ -180,6 +180,11 @@ func generateMachineConfigForName(config *RenderConfig, role, name, templateDir,
platformDirs := []string{}
// Loop over templates/common which applies everywhere
for _, dir := range []string{platformBase, platform} {
// Bypass OpenStack template rendering until
// https://github.com/openshift/installer/pull/1959 merges
Member

Hum...so if I understand this correctly, this PR is adding all of the code, and then we need to get the installer PR in first, then we'll do a PR to drop these conditionals?

Contributor

Yes, that's the intention. The installer PR depends on the changes here, but this PR depends on the changes in the installer PR, so this lets us break that cycle.

The conditionals are isolated in a single commit we can revert afterwards.
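The cycle-breaking idea can be sketched as a guard around the template-directory loop: the OpenStack directory is skipped unless the installer supplied the needed information. This is an illustrative sketch in Python rather than the Go code in the MCO; `template_dirs`, the `installer_info_present` flag, and the `_base` directory name are assumptions, and the real condition in the PR is not shown in this conversation.

```python
def template_dirs(platform: str, installer_info_present: bool, base: str = "_base") -> list:
    """Return the template directories to render for a platform.

    The OpenStack directory is bypassed until the installer provides the
    values the new static pods need, so either PR can merge first.
    """
    dirs = []
    for name in (base, platform):
        if name == "openstack" and not installer_info_present:
            continue  # temporary bypass, meant to be reverted later
        dirs.append(name)
    return dirs


print(template_dirs("openstack", installer_info_present=False))  # ['_base']
print(template_dirs("openstack", installer_info_present=True))   # ['_base', 'openstack']
print(template_dirs("aws", installer_info_present=False))        # ['_base', 'aws']
```

Keeping the guard in one place mirrors the point made above: a single isolated conditional is easy to revert once the dependent change has merged.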

@cgwalters
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 1, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, trown

@cgwalters
Member

I find this part encouraging:

These differences are not fundamental to OpenStack and we will be
looking at aligning more closely with the Baremetal provider in the
future.

Bigger picture, I wonder whether we should consider having the cluster own DNS by default across the board in a uniform way. That's another topic though.

The commit message in openshift/installer#1959 is great, very helpful!

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 7f52bd8 into openshift:master Aug 2, 2019
mandre pushed a commit to mandre/installer that referenced this pull request Aug 5, 2019
tomassedovic pushed a commit to tomassedovic/openshift-installer that referenced this pull request Aug 8, 2019
The experimental OpenStack backend used to create an extra server
running DNS and load balancer services that the cluster needed.
OpenStack does not always come with DNSaaS or LBaaS so we had to provide
the functionality the OpenShift cluster depends on (e.g. the etcd SRV
records, the api-int records & load balancing, etc.).

This approach is undesirable for two reasons: first, it adds an extra
node that the other IPI platforms do not need. Second, this node is a
single point of failure.

The Baremetal platform has faced the same issues and they have solved
them with a few virtual IP addresses managed by keepalived in
combination with coredns static pod running on every node using the mDNS
protocol to update records as new nodes are added or removed and a
similar static pod haproxy to load balance the control plane internally.

The VIPs are defined here in the installer and they use the
PlatformStatus field to be passed to the necessary
machine-config-operator fields:

openshift/api#374

The Bare Metal IPI Networking Infrastructure document is applicable here as
well:

https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md

There is also a great opportunity to share some of the configuration
files and scripts here.

This change needs several other pull requests:

Keepalived plus the coredns & haproxy static pods in the MCO:
openshift/machine-config-operator#740

Co-authored-by: Emilio Garcia <egarcia@redhat.com>
Co-authored-by: John Trowbridge <trown@redhat.com>
Co-authored-by: Martin Andre <m.andre@redhat.com>
Co-authored-by: Tomas Sedovic <tsedovic@redhat.com>

Massive thanks to the Bare Metal and oVirt people!
jhixson74 pushed a commit to jhixson74/installer that referenced this pull request Dec 6, 2019
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
9 participants