regression in 779.0.0: bnx2x fails to load firmware #450

Closed
jumanjiman opened this Issue Aug 25, 2015 · 3 comments

jumanjiman commented Aug 25, 2015

primary symptom: host fails to get on network. 😢

host info:

  • bare metal host
  • bnx2x NICs
  • coreos alpha 779.0.0

other symptoms

  • bnx2x firmware fails to load

    [   11.639898] bnx2x: [bnx2x_init_firmware:12987(enp2s0f0)]Can't load firmware file bnx2x/bnx2x-e1h-7.10.51.0.fw
    [   11.642299] bnx2x: [bnx2x_func_hw_init:5523(enp2s0f0)]Error loading firmware
    [   11.644321] bnx2x: [bnx2x_nic_load:2706(enp2s0f0)]HW init failed, aborting
    
  • network monitoring shows the link flapping as the host repeatedly tries, and fails, to bring the interface up.

immediate steps taken to work around the issue:

  • reboot into the A partition, which boots into coreos alpha 774.0.0
  • host is back on the network 🙌

modinfo bnx2x on 774.0.0 shows:

core@td-lss01 ~ $ modinfo bnx2x
filename:       /lib/modules/4.1.5-coreos/kernel/drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko
firmware:       bnx2x/bnx2x-e2-7.10.51.0.fw
firmware:       bnx2x/bnx2x-e1h-7.10.51.0.fw
firmware:       bnx2x/bnx2x-e1-7.10.51.0.fw
version:        1.710.51-0
license:        GPL
description:    Broadcom NetXtreme II BCM57710/57711/57711E/57712/57712_MF/57800/57800_MF/57810/57810_MF/57840/57840_MF Driver
author:         Eliezer Tamir
srcversion:     565EAFC6073A6007ADEBFD1
alias:          pci:v000014E4d0000163Fsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000163Esv*sd*bc*sc*i*
alias:          pci:v000014E4d0000163Dsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016ADsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016A4sv*sd*bc*sc*i*
alias:          pci:v000014E4d000016ABsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016AFsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016A2sv*sd*bc*sc*i*
alias:          pci:v000014E4d000016A1sv*sd*bc*sc*i*
alias:          pci:v000014E4d0000168Dsv*sd*bc*sc*i*
alias:          pci:v000014E4d000016AEsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000168Esv*sd*bc*sc*i*
alias:          pci:v000014E4d000016A9sv*sd*bc*sc*i*
alias:          pci:v000014E4d000016A5sv*sd*bc*sc*i*
alias:          pci:v000014E4d0000168Asv*sd*bc*sc*i*
alias:          pci:v000014E4d0000166Fsv*sd*bc*sc*i*
alias:          pci:v000014E4d00001663sv*sd*bc*sc*i*
alias:          pci:v000014E4d00001662sv*sd*bc*sc*i*
alias:          pci:v000014E4d00001650sv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Fsv*sd*bc*sc*i*
alias:          pci:v000014E4d0000164Esv*sd*bc*sc*i*
depends:        mdio,libcrc32c,ptp,firmware_class
intree:         Y
vermagic:       4.1.5-coreos SMP mod_unload 
signer:         Build time autogenerated kernel key
sig_key:        27:A4:29:D4:51:88:99:B2:3E:E7:DF:EC:51:75:52:91:A2:58:8F:80
sig_hashalgo:   sha256
parm:           num_queues: Set number of queues (default is as a number of CPUs) (int)
parm:           disable_tpa: Disable the TPA (LRO) feature (int)
parm:           int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (int)
parm:           dropless_fc: Pause on exhausted host ring (int)
parm:           mrrs: Force Max Read Req Size (0..3) (for debug) (int)
parm:           debug: Default debug msglevel (int)
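
The `firmware:` lines above name the files the driver may request at load time. As a diagnostic sketch (not part of the original report), the following Python pulls those names out of `modinfo` output and reports which ones are absent from a firmware directory. The function names are made up for illustration, and the default root `/lib/firmware` is an assumption; the kernel also searches other locations.

```python
import os


def declared_firmware(modinfo_text):
    """Return the paths listed on 'firmware:' lines of modinfo output."""
    paths = []
    for line in modinfo_text.splitlines():
        # Split on the first colon only, so values containing colons
        # (sig_key, parm descriptions) stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "firmware":
            paths.append(value.strip())
    return paths


def missing_firmware(modinfo_text, firmware_root="/lib/firmware"):
    """Return declared firmware files that do not exist under firmware_root.

    firmware_root is an assumed default, not taken from the report.
    os.path.exists follows symlinks, so a dangling symlink counts as missing.
    """
    return [
        path
        for path in declared_firmware(modinfo_text)
        if not os.path.exists(os.path.join(firmware_root, path))
    ]
```

Run against the `modinfo bnx2x` output of an affected image, a non-empty result from `missing_firmware` would be consistent with the "Can't load firmware file" messages.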

next steps

we may follow the steps at https://coreos.com/os/docs/latest/manual-rollbacks.html to roll back manually.

more info

here is a larger snippet of spew around the time we see "Can't load firmware..."

[    7.845914] bnx2x 0000:02:00.1 enp2s0f1: renamed from eth3
[    7.860400] sfc 0000:06:00.0 ens1f0: renamed from eth0
[    7.870253] sfc 0000:06:00.1 ens1f1: renamed from eth1
[    8.326074] EXT4-fs (sda6): recovery complete
[    8.327057] EXT4-fs (sda6): mounted filesystem with ordered data mode. Opts: commit=600
[    8.604237] FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[   10.070065] bnx2x 0000:02:00.1: Direct firmware load for bnx2x/bnx2x-e1h-7.10.51.0.fw failed with error -2
[   10.072359] bnx2x: [bnx2x_init_firmware:12987(enp2s0f1)]Can't load firmware file bnx2x/bnx2x-e1h-7.10.51.0.fw
[   10.074849] bnx2x: [bnx2x_func_hw_init:5523(enp2s0f1)]Error loading firmware
[   10.076681] bnx2x: [bnx2x_nic_load:2706(enp2s0f1)]HW init failed, aborting
[   11.637478] bnx2x 0000:02:00.0: Direct firmware load for bnx2x/bnx2x-e1h-7.10.51.0.fw failed with error -2
[   11.639898] bnx2x: [bnx2x_init_firmware:12987(enp2s0f0)]Can't load firmware file bnx2x/bnx2x-e1h-7.10.51.0.fw
[   11.642299] bnx2x: [bnx2x_func_hw_init:5523(enp2s0f0)]Error loading firmware
[   11.644321] bnx2x: [bnx2x_nic_load:2706(enp2s0f0)]HW init failed, aborting
[   13.727277] sfc 0000:06:00.1 ens1f1: link down
[   13.733201] IPv6: ADDRCONF(NETDEV_UP): ens1f1: link is not ready
[   13.735174] sfc 0000:06:00.0 ens1f0: link down
[   13.741503] IPv6: ADDRCONF(NETDEV_UP): ens1f0: link is not ready
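
The "failed with error -2" lines in this log carry a negated errno value, and errno 2 is `ENOENT` (no such file or directory), so the kernel could not find the firmware file. A minimal Python sketch for pulling these failures out of a saved log and naming the error is below. The regular expression is an assumption based only on the message format shown here, and `firmware_failures` is a made-up name.

```python
import errno
import os
import re

# Pattern modeled on the "Direct firmware load ... failed" lines above.
FW_FAIL = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+(?P<pci>[0-9a-f:.]+):\s+"
    r"Direct firmware load for (?P<file>\S+) failed with error -(?P<err>\d+)"
)


def firmware_failures(log_text):
    """Return one dict per failed direct firmware load found in log_text."""
    failures = []
    for match in FW_FAIL.finditer(log_text):
        code = int(match.group("err"))
        failures.append({
            "driver": match.group("driver"),
            "pci": match.group("pci"),
            "file": match.group("file"),
            # Symbolic errno name, e.g. ENOENT for 2.
            "errno": errno.errorcode.get(code, str(code)),
            # Human-readable text from the C library.
            "meaning": os.strerror(code),
        })
    return failures
```

Applied to the snippet above, this would list one failure per PCI function (`0000:02:00.0` and `0000:02:00.1`), both for the same e1h firmware file.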

marineam commented Aug 25, 2015

Ugh, looks like coreos/coreos-overlay#1445 broke installing symlinks. Will get this fixed in the next alpha!

jumanjiman commented Aug 25, 2015

thanks for the quick response, @marineam!

jumanjiman commented Aug 30, 2015

fyi

i can confirm that https://coreos.com/releases/#789.0.0 resolves this issue for us.

core@td-lss01 ~ $ cat /etc/os-release 
NAME=CoreOS
ID=coreos
VERSION=789.0.0
VERSION_ID=789.0.0
BUILD_ID=
PRETTY_NAME="CoreOS 789.0.0"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"


core@td-lss01 ~ $ ethtool -i enp2s0f0
driver: bnx2x
version: 1.710.51-0
firmware-version: bc 5.2.7 phy baa0.105
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes



core@td-lss01 ~ $ ip -4 addr show !$
ip -4 add show enp2s0f0
4: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid f4ce467f2650 state UP group default qlen 1000
    inet 51.4.1.150/24 brd 51.4.1.255 scope global dynamic enp2s0f0
       valid_lft 21112sec preferred_lft 21112sec
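
For anyone wanting to script the same post-upgrade check, here is a rough Python sketch that reads `ip -4 addr show <interface>` output and reports whether the link is up and which IPv4 addresses are assigned. It assumes the output layout shown above, and `link_summary` is a made-up name.

```python
import re


def link_summary(ip_output):
    """Summarize `ip -4 addr show <iface>` output for a single interface."""
    # Interface flags appear between angle brackets on the first line.
    flags_match = re.search(r"<([^>]*)>", ip_output)
    flags = set(flags_match.group(1).split(",")) if flags_match else set()
    # Each assigned IPv4 address appears on an indented "inet" line.
    addresses = re.findall(r"^\s*inet\s+(\S+)", ip_output, flags=re.MULTILINE)
    return {
        # UP: administratively enabled; LOWER_UP: carrier detected.
        "up": {"UP", "LOWER_UP"} <= flags,
        "ipv4": addresses,
    }
```

On the output above this reports the interface as up with the address `51.4.1.150/24`. On an affected image the interface would be expected to lack `LOWER_UP` or an address, though the report does not show that output.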