Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[baremetal] Add dispatcher script to statically configure DHCP address #2188

Merged
merged 1 commit into from Dec 3, 2020

Conversation

cybertron
Copy link
Member

We have a feature request for nodes to be able to initially get
assigned their IP addresses via infinite DHCP leases, but then to
have those addresses statically configured on the node so that if
the DHCP server is unavailable for any reason they don't lose their
network configuration.

This adds a dispatcher script for NetworkManager that does so. It
looks at the DHCP assigned network settings and adds static
configuration with the same values.

- What I did

- How to verify it

- Description for the changelog

@cybertron
Copy link
Member Author

/hold

This is a PoC. If we decide to pursue this it will almost certainly need some refinement.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 28, 2020
#!/bin/bash
set -ex

if [ ${DHCP4_DHCP_LEASE_TIME:-0} -lt 4294967295 ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not exist in IPv6?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only implemented ipv4 in this version. Based on my discussion with the NM team, I would probably argue that for this use case in ipv6 we should just use SLAAC addressing. If there's a reason we can't do that then we can always add support for DHCPv6 here too.

@cybertron
Copy link
Member Author

/hold cancel

I've tested this locally with infinite DHCP leases and it worked fine.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 25, 2020
@celebdor
Copy link
Contributor

/retest

celebdor added a commit to celebdor/machine-config-operator that referenced this pull request Dec 1, 2020
This is essentially an ipv6 version of openshift#2188.

Co-authored-by: Antoni Segura Puimedon <antoni@redhat.com>
@cybertron
Copy link
Member Author

/hold

We discovered that this would cause problems on reboot. Fix coming.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 1, 2020
@cybertron cybertron force-pushed the static-from-dhcp branch 2 times, most recently from 8c8441f to c8ed64a Compare December 2, 2020 03:41
cybertron added a commit to cybertron/machine-config-operator that referenced this pull request Dec 2, 2020
This is essentially an ipv6 version of openshift#2188.

Co-authored-by: Antoni Segura Puimedon <antoni@redhat.com>
@cybertron cybertron force-pushed the static-from-dhcp branch 2 times, most recently from a78db50 to 0551cd9 Compare December 2, 2020 18:53
@cybertron cybertron force-pushed the static-from-dhcp branch 2 times, most recently from dceaf9f to e905d8c Compare December 2, 2020 20:40
@celebdor
Copy link
Contributor

celebdor commented Dec 3, 2020

Just tested it with a small modification on one of my masters with openshift-sdn. It worked well. Here some snippets:

mode: 0755
path: "/etc/NetworkManager/dispatcher.d/30-static-dhcp"
contents:
  inline: |
    #!/bin/bash
    set -ex -o pipefail
    if [[ "{{ .NetworkType }}" == "OVNKubernetes" && "$CONNECTION_ID" == "Wired Connection" ]]
    then
        >&2 echo "Refusing to modify default connection."
        exit 0
    fi

    if [ -z ${DHCP4_IP_ADDRESS:-} ]; then
        >&2 echo "Not a DHCP4 address. Ignoring."
        exit 0
    fi
    if [ ${DHCP4_DHCP_LEASE_TIME:-0} -lt 4294967295 ]; then
        >&2 echo "Not an infinite DHCP4 lease. Ignoring."
        exit 0
    fi
    IPS=($IP4_ADDRESS_0)
    CIDR=${IPS[0]}
    GATEWAY=${IPS[1]}
    TYPE=$(nmcli --get-values connection.type connection show "$CONNECTION_ID")
    if ! nmcli con show inf-lease-to-static
    then
        nmcli con add type "$TYPE" con-name inf-lease-to-static
    fi
    nmcli con mod inf-lease-to-static \
      conn.interface "$1" \
      connection.autoconnect yes \
      ipv4.addresses "$CIDR" \
      ipv4.method manual \
      ipv4.gateway "$GATEWAY" \
      ipv4.dns "$IP4_NAMESERVERS"
    if [ -n "$IP4_DOMAINS" ]; then
        nmcli con mod inf-lease-to-static ipv4.dns-search "$IP4_DOMAINS"
    fi
    plus=''
    for i in $(seq 0 $(($IP4_NUM_ROUTES-1)) )
    do
        varname="IP4_ROUTE_$i"
        nmcli con mod inf-lease-to-static ${plus}ipv4.routes "${!varname}"
        plus='+'
    done
    nmcli con up inf-lease-to-static
    # Copy it from the OverlayFS mount to the persistent lowerdir
    cp "/etc/NetworkManager/system-connections-merged/inf-lease-to-static.nmconnection" /etc/NetworkManager/system-connections

    if [ -n "${DHCP4_HOST_NAME:-}" ]
    then
        hostnamectl set-hostname --static --transient "$DHCP4_HOST_NAME"
    fi

After making these changes, that make sure that it works for openshift SDN and for OVN kubernetes, you can see in the following image that there is a reboot of master-0 (I also rebooted master-1) after I had killed the dnsmasq that provided DHCP functionality.
You can also see that the nodes boot up with the hostname they are supposed and become ready again.

image

We have a feature request for nodes to be able to initially get
assigned their IP addresses via infinite DHCP leases, but then to
have those addresses statically configured on the node so that if
the DHCP server is unavailable for any reason they don't lose their
network configuration.

This adds a dispatcher script for NetworkManager that does so. It
looks at the DHCP assigned network settings and adds static
configuration with the same values.
@cybertron
Copy link
Member Author

/hold cancel

I also had success rebooting nodes with the latest version of this. I think it's good to go.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 3, 2020
@celebdor
Copy link
Contributor

celebdor commented Dec 3, 2020

/lgtm

The version in this PR now matches what I tested successfully in #2188 (comment)

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2020
@sinnykumari
Copy link
Contributor

overall lgtm.
/approve

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: celebdor, cybertron, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 3, 2020
@sinnykumari
Copy link
Contributor

/retest

@celebdor
Copy link
Contributor

celebdor commented Dec 3, 2020

/retest

@openshift-merge-robot openshift-merge-robot merged commit 4c331f2 into openshift:master Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants