Kernel 4.12 breaks non-zero updelay in network bonding driver #2065

Closed
bgilbert opened this Issue Jul 20, 2017 · 4 comments

Member

bgilbert commented Jul 20, 2017

Issue Report

Bug

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1478.0.0
VERSION_ID=1478.0.0
BUILD_ID=2017-07-19-0038
PRETTY_NAME="Container Linux by CoreOS 1478.0.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Packet

Expected Behavior

Working network.

Actual Behavior

Networking is unreliable. The kernel log gets a message every 100 ms:

link status up for interface eth1, enabling it in 200 ms

Reproduction Steps

  1. Boot a Container Linux release with a 4.12 kernel (1465 or above) on a Packet type 1 instance.
  2. If dmesg is not already filling with log messages, remove the slave from the bond and re-add it:
     echo -enp1s0f1 | sudo tee /sys/devices/virtual/net/bond0/bonding/slaves
     echo +enp1s0f1 | sudo tee /sys/devices/virtual/net/bond0/bonding/slaves

Other Information

Problem bisected to torvalds/linux@de77ecd.

Workaround:

echo 0 | sudo tee /sys/devices/virtual/net/bond0/bonding/updelay

This causes:

bond0: Setting up delay to 0
bond0: link status definitely up for interface eth1, 1000 Mbps full duplex
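
On a host with several bonds, the workaround above could be wrapped in a small helper. This is only a sketch: the function name `zero_updelay` and the overridable `SYSFS_NET` variable are hypothetical, introduced here so the logic can be tried against a scratch directory. The only thing taken from this report is the `bonding/updelay` path under the bond's sysfs directory.

```shell
#!/bin/sh
# Sketch: set updelay to 0 on a bond only if it is currently non-zero.
# SYSFS_NET is a hypothetical override for testing; by default it points
# at the directory used in the workaround above.
SYSFS_NET="${SYSFS_NET:-/sys/devices/virtual/net}"

zero_updelay() {
    bond="$1"
    file="$SYSFS_NET/$bond/bonding/updelay"
    if [ ! -f "$file" ]; then
        echo "$bond: no updelay attribute at $file" >&2
        return 1
    fi
    current=$(cat "$file")
    if [ "$current" = "0" ]; then
        echo "$bond: updelay already 0"
        return 0
    fi
    # Writing here needs root on a real host (the report uses sudo tee).
    echo 0 > "$file" && echo "$bond: updelay changed from $current to 0"
}
```

Run as root, `zero_updelay bond0` would then have the same effect as the one-line workaround, and is a no-op when updelay is already 0.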

Member

bgilbert commented Jul 21, 2017

Posted to netdev; no response yet.

Member

bgilbert commented Jul 29, 2017

Fixed by coreos/linux#74, which will likely be included in alpha 1492.0.0 and beta 1465.3.0.

f0 commented Oct 13, 2017

@bgilbert
We have exactly this problem with 1520.6 (coming from 1465.8.0).

"internet" and "bonding" are the names of the bonding interfaces.
The system is not stable: it starts and then breaks.

[  127.468853] 8021q: adding VLAN 0 to HW filter on device mv-internet
[  129.976625] 8021q: adding VLAN 0 to HW filter on device mv-internet
[  130.274563] 8021q: adding VLAN 0 to HW filter on device mv-internet
[  131.479939] internet: link status down for interface eno4, disabling it in 200 ms
[  131.488827] internet: link status down for interface eno3, disabling it in 200 ms
[  131.499935] internet: link status down for interface eno4, disabling it in 200 ms
[  131.508829] internet: link status down for interface eno3, disabling it in 200 ms
[  131.518936] internet: link status down for interface eno4, disabling it in 200 ms
[  131.527811] internet: link status down for interface eno3, disabling it in 200 ms
[  131.551937] bonding: link status down for interface eno2, disabling it in 200 ms
[  131.561933] bonding: link status down for interface eno2, disabling it in 200 ms
[  131.571936] bonding: link status down for interface eno2, disabling it in 200 ms
[  131.687960] bonding: link status down for interface eno1, disabling it in 200 ms
[  131.697958] bonding: link status down for interface eno1, disabling it in 200 ms
[  131.707958] bonding: link status down for interface eno1, disabling it in 200 ms
[  131.717957] bonding: link status down for interface eno1, disabling it in 200 ms
[  131.727959] bonding: link status down for interface eno1, disabling it in 200 ms
[  131.737955] bonding: link status down for interface eno1, disabling it in 200 ms
[  131.740719] igb 0000:01:00.3 eno4: speed changed to 0 for port eno4
[  131.740721] 8021q: adding VLAN 0 to HW filter on device eno4
[  131.746854] internet: link status definitely down for interface eno4, disabling it
[  131.746858] internet: now running without any active interface!
[  131.746873] internet: link status definitely down for interface eno3, disabling it
[  131.785720] bonding: link status definitely down for interface eno2, disabling it
[  131.794622] bonding: now running without any active interface!
[  131.905489] 8021q: adding VLAN 0 to HW filter on device eno3
[  132.022874] 8021q: adding VLAN 0 to HW filter on device eno2
[  132.029961] bonding: link status definitely down for interface eno1, disabling it
[  132.143833] 8021q: adding VLAN 0 to HW filter on device eno1
[  136.179303] igb 0000:01:00.3 eno4: igb: eno4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  136.206285] igb 0000:01:00.1 eno2: igb: eno2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  136.216911] internet: link status up for interface eno4, enabling it in 0 ms
[  136.216917] internet: link status definitely up for interface eno4, 1000 Mbps full duplex
[  136.216937] internet: first active interface up!
[  136.303962] bonding: link status up for interface eno2, enabling it in 0 ms
[  136.312070] bonding: link status definitely up for interface eno2, 1000 Mbps full duplex
[  136.321627] bonding: first active interface up!
[  136.654290] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  136.743938] bonding: link status up for interface eno1, enabling it in 200 ms
[  136.878283] igb 0000:01:00.2 eno3: igb: eno3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  136.943979] internet: link status up for interface eno3, enabling it in 200 ms
[  136.959993] bonding: link status definitely up for interface eno1, 1000 Mbps full duplex
[  137.159969] internet: link status definitely up for interface eno3, 1000 Mbps full duplex
[  137.731647] 8021q: adding VLAN 0 to HW filter on device mv-internet
[  139.569075] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Down
[  139.575956] igb 0000:01:00.0 eno1: speed changed to 0 for port eno1
[  139.671972] bonding: link status down for interface eno1, disabling it in 200 ms
[  139.888010] bonding: link status definitely down for interface eno1, disabling it
[  140.420073] igb 0000:01:00.2 eno3: igb: eno3 NIC Link is Down
[  140.503969] internet: link status down for interface eno3, disabling it in 200 ms
[  140.576099] igb 0000:01:00.2 eno3: speed changed to 0 for port eno3
[  140.720052] internet: link status definitely down for interface eno3, disabling it
[  142.737319] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  142.807955] bonding: link status up for interface eno1, enabling it in 200 ms
[  143.023980] bonding: link status definitely up for interface eno1, 1000 Mbps full duplex
[  143.561310] igb 0000:01:00.2 eno3: igb: eno3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  143.639966] internet: link status up for interface eno3, enabling it in 200 ms
[  143.855952] internet: link status definitely up for interface eno3, 1000 Mbps full duplex
[  154.438086] 8021q: adding VLAN 0 to HW filter on device mv-internet
[  154.669631] 8021q: adding VLAN 0 to HW filter on device mv-internet

@euank euank reopened this Oct 13, 2017

@euank euank closed this Oct 13, 2017

@euank euank reopened this Oct 13, 2017

Member

bgilbert commented Oct 13, 2017

@f0 This looks like a different problem. In the original bug, the link status up message repeated indefinitely, without any actual link status change. In the log you posted, the underlying interfaces are going down and coming back up. Could you open a new issue for this? Please include the output of lspci.
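
The distinction above can be checked roughly against a saved kernel log. The following is a sketch, not a definitive diagnostic: the function name `bond_link_summary` and its output format are hypothetical, and the matched strings are copied from the log lines quoted in this thread (the `NIC Link is` line is what the igb driver prints in that log; other drivers may word it differently).

```shell
#!/bin/sh
# Sketch: summarise bonding messages for one slave interface in a saved
# kernel log (for example, the output of dmesg redirected to a file).
bond_link_summary() {
    iface="$1"
    logfile="$2"
    awk -v iface="$iface" '
        # "enabling it in N ms" announcements from the bonding driver
        index($0, "link status up for interface " iface ",")            { pending++ }
        # confirmations that the slave was actually enabled
        index($0, "link status definitely up for interface " iface ",") { confirmed++ }
        # link changes reported by the NIC driver itself (igb wording)
        index($0, iface " NIC Link is ")                                 { driver++ }
        END {
            printf "pending_up=%d definitely_up=%d driver_link_events=%d\n",
                   pending, confirmed, driver
        }
    ' "$logfile"
}
```

Many `pending_up` messages with no `definitely_up` and no driver link events would resemble the original bug. Driver link events interleaved with matching `definitely_up` lines suggest the interface really is going down and coming back up.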
