Bond interfaces stuck at 1500 MTU on Bionic #3190

Closed
ubuntu-server-builder opened this issue May 11, 2023 · 33 comments
Labels
launchpad Migrated from Launchpad

Comments

@ubuntu-server-builder
Collaborator

This bug was originally filed in Launchpad as LP: #1774666

Launchpad details
affected_projects = ['maas', 'cloud-init (Ubuntu)', 'cloud-init (Ubuntu Xenial)', 'cloud-init (Ubuntu Artful)', 'cloud-init (Ubuntu Bionic)', 'cloud-init (Ubuntu Cosmic)']
assignee = chad.smith
assignee_name = Chad Smith
date_closed = 2018-06-20T18:05:41.886220+00:00
date_created = 2018-06-01T15:32:09.418451+00:00
date_fix_committed = 2018-06-12T15:26:19.883495+00:00
date_fix_released = 2018-06-20T18:05:41.886220+00:00
id = 1774666
importance = medium
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1774666
milestone = None
owner = andreserl
owner_name = Andres Rodriguez
private = False
status = fix_released
submitter = kj-kingj
submitter_name = KingJ
tags = ['cdo-qa', 'foundations-engine', 'mtu', 'netplan']
duplicates = [1774648]

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:32:09.418451+00:00

When deploying a machine through MAAS with bonded network interfaces, the bond does not have a 9000 byte MTU applied despite the attached VLANs having had a 9000 MTU explicitly set. The MTU size is set on the bond members, but not on the bond itself in Netplan. Consequently, when the bond is brought up, the interface MTU is decreased from 9000 to 1500. Manually changing the interface MTU after boot is successful.

This is not observed when deploying Xenial on the same machine. The bond comes up at the expected 9000 byte MTU.
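The symptom described above (MTU present on the bond members but absent on the bond itself) can be checked mechanically. What follows is only a minimal sketch, assuming the netplan YAML has already been loaded into a Python dict; the function name `bonds_missing_mtu` and the sample config are hypothetical and are not part of cloud-init or netplan.

```python
from typing import List


def bonds_missing_mtu(netplan: dict) -> List[str]:
    """Return names of bonds whose members set an MTU that the bond lacks."""
    network = netplan.get("network", {})
    ethernets = network.get("ethernets", {})
    flagged = []
    for name, bond in network.get("bonds", {}).items():
        member_mtus = [
            ethernets.get(member, {}).get("mtu")
            for member in bond.get("interfaces", [])
        ]
        # Flag the bond when a member carries an explicit MTU but the bond does not.
        if "mtu" not in bond and any(m is not None for m in member_mtus):
            flagged.append(name)
    return flagged


# Hypothetical config shaped like the one in this report (MTU on members only).
config = {
    "network": {
        "version": 2,
        "ethernets": {
            "enp5s0f0": {"mtu": 9000},
            "enp5s0f1": {"mtu": 9000},
        },
        "bonds": {
            "eth1": {"interfaces": ["enp5s0f0", "enp5s0f1"]},
        },
    }
}
print(bonds_missing_mtu(config))  # ['eth1']
```

Once `mtu: 9000` is added under the bond, the same call returns an empty list.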

@ubuntu-server-builder ubuntu-server-builder added the launchpad Migrated from Launchpad label May 11, 2023
@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:32:09.418451+00:00

Launchpad attachments: Bionic Netplan Configuration

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:32:54.244537+00:00

Output of dmesg on Bionic - note interface MTU being changed from 9000 to 1500.
Launchpad attachments: Bionic dmesg output.txt

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:33:31.358579+00:00

Launchpad attachments: bionic ip link sh

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:34:28.116414+00:00

Launchpad attachments: MAAS Curtin Config

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:35:27.383305+00:00

Launchpad attachments: Xenial /etc/network/interfaces.d/50-cloud-init.cfg

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:36:43.922598+00:00

Launchpad attachments: MASS dpkg

@ubuntu-server-builder
Collaborator Author

Launchpad user Andres Rodriguez(andreserl) wrote on 2018-06-01T15:37:15.370003+00:00

Looking at the MAAS curtin configuration, I see the following:

- bond_interfaces:
  - enp5s0f0
  - enp5s0f1
  - enp6s0f0
  - enp6s0f1
  id: eth1
  mac_address: 00:1b:21:4a:99:50
  mtu: 9000
  name: eth1
  params:
    bond-downdelay: 0
    bond-lacp-rate: fast
    bond-miimon: 100
    bond-mode: 802.3ad
    bond-updelay: 0
    bond-xmit-hash-policy: encap3+4
  subnets:
  - type: manual
  type: bond

The resulting netplan configuration, however, doesn't include an MTU. On the other hand, the Xenial configuration does correctly have the MTU for the bond.

As such, this seems like an issue in cloud-init to me.
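To illustrate what the rendered output would be expected to contain, here is a rough sketch of translating a curtin-style bond entry into a netplan bond stanza while carrying the `mtu` key across. This is not cloud-init's actual renderer: `render_bond` and `PARAM_MAP` are hypothetical names, and the parameter mapping covers only the keys that appear in this thread.

```python
# Hypothetical mapping from the "bond-*" keys in the curtin config to the
# netplan parameter names seen in the rendered YAML in this thread.
PARAM_MAP = {
    "bond-downdelay": "down-delay",
    "bond-lacp-rate": "lacp-rate",
    "bond-miimon": "mii-monitor-interval",
    "bond-mode": "mode",
    "bond-updelay": "up-delay",
    "bond-xmit-hash-policy": "transmit-hash-policy",
}


def render_bond(entry: dict) -> dict:
    """Render one bond entry as a netplan 'bonds' mapping (sketch only)."""
    bond = {
        "interfaces": list(entry.get("bond_interfaces", [])),
        "parameters": {
            PARAM_MAP[key]: value
            for key, value in entry.get("params", {}).items()
            if key in PARAM_MAP
        },
    }
    # The step this report is about: keep the MTU when the input provides one.
    if entry.get("mtu") is not None:
        bond["mtu"] = entry["mtu"]
    return {entry["name"]: bond}


entry = {
    "type": "bond",
    "name": "eth1",
    "mtu": 9000,
    "bond_interfaces": ["enp5s0f0", "enp5s0f1", "enp6s0f0", "enp6s0f1"],
    "params": {"bond-mode": "802.3ad", "bond-miimon": 100},
}
rendered = render_bond(entry)
print(rendered["eth1"]["mtu"])  # 9000
```

The point of the sketch is only that the `mtu` key has to be copied for bonds just as it is for plain ethernet devices; the real renderer handles many more fields.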

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:37:33.543896+00:00

Launchpad attachments: Xenial dmesg

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T15:38:32.820560+00:00

Launchpad attachments: Xenial dpkg | grep cloud-init

@ubuntu-server-builder
Collaborator Author

Launchpad user Mathieu Trudel-Lapierre(cyphermox) wrote on 2018-06-01T15:42:38.057355+00:00

Can you manually change the mtu in the netplan yaml under eth1? If you do so, is the MTU then set correctly?

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T16:01:47.279462+00:00

@cyphermox I'm deploying the machine back to Bionic to try this now. Can you confirm where exactly in the netplan config I need to set this? (e.g. in the bonds section, or add a new eth1 definition to the ethernets section?)

@ubuntu-server-builder
Collaborator Author

Launchpad user Mathieu Trudel-Lapierre(cyphermox) wrote on 2018-06-01T18:16:38.914208+00:00

In the existing bonds section, for the interface that is your bond (eth1)

@ubuntu-server-builder
Collaborator Author

Launchpad user KingJ(kj-kingj) wrote on 2018-06-01T18:53:04.493313+00:00

I've added "mtu: 9000" to the bonds section, which now reads as follows:

bonds:
    eth1:
        mtu: 9000
        interfaces:
        - enp5s0f0
        - enp5s0f1
        - enp6s0f0
        - enp6s0f1
        parameters:
            down-delay: 0
            lacp-rate: fast
            mii-monitor-interval: 100
            mode: 802.3ad
            transmit-hash-policy: encap3+4
            up-delay: 0

I re-ran netplan apply, but the bond had the same MTU as before. I also tried restarting systemd-networkd as I could see that the relevant .netdev files in /run/systemd/network/ had MTUBytes set to 9000. However, the interface remained at 1500 bytes.

After a system restart, however, the bond interface is now running at an MTU of 9000. dmesg now only has messages for the member interfaces being increased from 1500 to 9000 MTU; the messages about the MTU being lowered from 9000 to 1500 while the bond was being configured are no longer present.
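For anyone wanting to verify the result after a reboot without reading through dmesg, a small sketch like the one below compares the live MTU of a bond with its members by reading `/sys/class/net/<iface>/mtu`. The helper names `read_mtu` and `mtu_mismatches` are made up for this example, and the `sysfs_root` parameter exists only so the functions can be pointed at a test directory.

```python
from pathlib import Path
from typing import Dict, Iterable


def read_mtu(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the current MTU of an interface from sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def mtu_mismatches(
    bond: str, members: Iterable[str], sysfs_root: str = "/sys/class/net"
) -> Dict[str, int]:
    """Return {member: mtu} for every member whose MTU differs from the bond's."""
    bond_mtu = read_mtu(bond, sysfs_root)
    mismatches = {}
    for member in members:
        member_mtu = read_mtu(member, sysfs_root)
        if member_mtu != bond_mtu:
            mismatches[member] = member_mtu
    return mismatches
```

On the machine from this report, calling `mtu_mismatches("eth1", ["enp5s0f0", "enp5s0f1", "enp6s0f0", "enp6s0f1"])` would be expected to list all four members while the bond is stuck at 1500, and to return an empty dict once the bond comes up at 9000.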

@ubuntu-server-builder
Collaborator Author

Launchpad user Launchpad Janitor(janitor) wrote on 2018-06-05T22:09:23.469403+00:00

Status changed to 'Confirmed' because the bug affects multiple users.

@ubuntu-server-builder
Collaborator Author

Launchpad user Jason Hobbs(jason-hobbs) wrote on 2018-06-05T22:10:00.009488+00:00

We are seeing this in our test runs as well.

@ubuntu-server-builder
Collaborator Author

Launchpad user Jason Hobbs(jason-hobbs) wrote on 2018-06-05T22:18:29.404224+00:00

This is causing test failures for us: containers deployed by juju that are bound to a space on top of the bond have the correct MTU (9000), but the bond's MTU is stuck at 1500, so packets are being dropped.
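The failure mode here is that an upper layer is allowed to send frames larger than a lower layer will carry. A toy sketch of that check, with hypothetical layer names and a made-up `undersized_layers` helper, might look like this:

```python
from typing import List, Tuple


def undersized_layers(stack: List[Tuple[str, int]]) -> List[str]:
    """Given (name, mtu) pairs ordered from top to bottom, return the lower
    layers whose MTU is smaller than the top layer's MTU."""
    if not stack:
        return []
    top_mtu = stack[0][1]
    return [name for name, mtu in stack[1:] if mtu < top_mtu]


# Values from this comment: container at 9000, bond stuck at 1500.
print(undersized_layers([("container", 9000), ("bond", 1500)]))  # ['bond']
```

With the bond at 9000 as well, the list is empty and full-size frames from the container fit.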

curtin config for the machine:
http://paste.ubuntu.com/p/8tMR2YBGYm/

cloud-init netplan yaml (50-cloud-init.yaml.bak.1528231254):
http://paste.ubuntu.com/p/Vkq77KqwBp/

juju netplan yaml:
http://paste.ubuntu.com/p/wf9F2xzCy6/

@ubuntu-server-builder
Collaborator Author

Launchpad user Jason Hobbs(jason-hobbs) wrote on 2018-06-05T22:22:53.163704+00:00

Subscribed to Canonical Field High SLA.

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2018-06-12T15:26:18.084379+00:00

An upstream commit landed for this bug.

To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=c3f1ad9a

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2018-06-12T15:53:09.057586+00:00

Hrm, I didn't intend to nominate netplan.io for xenial, artful, bionic and cosmic; only cloud-init. Not sure how to revoke the netplan.io nomination.

@ubuntu-server-builder
Collaborator Author

Launchpad user Andreas Hasenack(ahasenack) wrote on 2018-06-12T16:58:30.294833+00:00

Somehow the netplan.io and cloud-init tasks are linked in terms of those nominations. If I approve the cloud-init ones, netplan's also get approved.

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2018-06-12T17:10:48.460688+00:00

This is a cloud-init issue only. Once cloud-init is SRU'd, netplan will set the MTU properly.

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2018-06-12T17:12:47.475867+00:00

Setting the netplan series tasks to Invalid, as this is a cloud-init bug. Netplan on artful and later will do as cloud-init tells it, but we need an SRU of cloud-init into artful/bionic to fix things (and a new cloud-init devel release lands in cosmic this week to fix the behavior).

@ubuntu-server-builder
Collaborator Author

Launchpad user Launchpad Janitor(janitor) wrote on 2018-06-16T17:34:40.340347+00:00

This bug was fixed in the package cloud-init - 18.2-77-g4ce67201-0ubuntu1


cloud-init (18.2-77-g4ce67201-0ubuntu1) cosmic; urgency=medium

  • New upstream snapshot.
    • lxd: Delete default network and detach device if lxd-init created them.
      (LP: #1776958)
    • openstack: avoid unneeded metadata probe on non-openstack platforms
      (LP: #1776701)
    • stages: fix tracebacks if a module stage is undefined or empty
      [Robert Schweikert] (LP: #1770462)
    • Be more safe on string/bytes when writing multipart user-data to disk.
      (LP: #1768600)
    • Fix get_proc_env for pids that have non-utf8 content in environment.
      (LP: #1775371)
    • tests: fix salt_minion integration test on bionic and later
    • tests: provide human-readable integration test summary when --verbose
    • tests: skip chrony integration tests on lxd running artful or older
    • test: add optional --preserve-instance arg to integration tests
    • netplan: fix mtu if provided by network config for all rendered types
      (LP: #1774666)
    • tests: remove pip install workarounds for pylxd, take upstream fix.
    • subp: support combine_capture argument.
    • tests: ordered tox dependencies for pylxd install

 -- Chad Smith <chad.smith@canonical.com>  Fri, 15 Jun 2018 20:05:07 -0600

@ubuntu-server-builder
Collaborator Author

Launchpad user Scott Moser(smoser) wrote on 2018-06-20T18:05:44.315424+00:00

This bug is believed to be fixed in cloud-init in version 18.3. If this is still a problem for you, please add a comment and set the state back to New.

Thank you.

@ubuntu-server-builder
Collaborator Author

Launchpad user Launchpad Janitor(janitor) wrote on 2018-07-02T15:10:30.526352+00:00

Status changed to 'Confirmed' because the bug affects multiple users.

@ubuntu-server-builder
Collaborator Author

Launchpad user Ryan Beisner(1chb1n) wrote on 2018-07-02T15:11:15.610365+00:00

FWIW, we squarely hit this while redeploying our dev cloud (serverstack) on bionic, which uses bonds and jumbo frames.

@ubuntu-server-builder
Collaborator Author

Launchpad user Scott Moser(smoser) wrote on 2018-07-13T13:44:33.045933+00:00

Hi,
This bug is believed to be fixed in the version of cloud-init in -proposed for 16.04, 17.10 and 18.04 under SRU bug 1777912.

It would be good if someone could report back on that bug as to whether or not this is now working for them.

@ubuntu-server-builder
Collaborator Author

Launchpad user Jason Hobbs(jason-hobbs) wrote on 2018-09-12T19:53:03.145051+00:00

Marked as Fix Released on Bionic/Xenial because the SRU for bug 1777912 is done. I can't make Artful "Won't Fix", but it should be.
