All eth0 interrupts going to CPU0 and saturating it on Debian Stretch, kernel 4.9.0 #122

sevagh · 2019-09-24T23:27:10Z

On Debian Stretch hosts with kernel 4.9.0-9-amd64, irqbalance is putting all eth0 interrupts on CPU0, causing some CPU saturation issues. It's almost as if it's doing the inverse of the expected behavior - on older Debian hosts, it does "the right thing" (unless my expectations are wrong).

stretch_host $ grep eth /proc/interrupts
  26:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048576-edge      eth0
  27:  292512958          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048577-edge      eth0-TxRx-0
  28:  330616806          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048578-edge      eth0-TxRx-1
  29:  247469290          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048579-edge      eth0-TxRx-2
  30:  333730223          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048580-edge      eth0-TxRx-3
  31:  311422652          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048581-edge      eth0-TxRx-4
  32:  251719915          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048582-edge      eth0-TxRx-5
  33:  231362205          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048583-edge      eth0-TxRx-6
  34:  283640680          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048584-edge      eth0-TxRx-7

Should I expect irqbalance to be balancing these across more than one core? That's what I see on my older hosts. E.g. Debian Jessie, kernel 3.16.0:

jessie_host $ grep eth /proc/interrupts
 100:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0
 101:     111885      32715  121458886          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-0
 102:      23928  107978704          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      88300          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-1
 103:   50979259          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      63439      38143          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-2
 104:       1508          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      57963      18987   46733236          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-3
 105:        951          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      27665       8185   46292938          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-4
 106:       2536          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      62179      26868   46021450          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-5
 107:        604          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      26712       8409   48573746          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-6
 108:        262          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0      70242       7012   45856363          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-7
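The two listings above are hard to compare by eye across 32 columns. A minimal Python sketch for summarizing them follows; it assumes the `/proc/interrupts` line layout shown above (IRQ label, one count per CPU, then the controller and device name). The function names `parse_interrupts` and `busy_cpus` are hypothetical helpers, not part of irqbalance.

```python
from typing import Dict, List


def parse_interrupts(text: str, name_filter: str = "eth") -> Dict[str, List[int]]:
    """Map each matching device name to its list of per-CPU interrupt counts."""
    result: Dict[str, List[int]] = {}
    for line in text.splitlines():
        fields = line.split()
        # Data lines start with an IRQ label such as "27:".
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        name = fields[-1]
        if name_filter not in name:
            continue
        counts: List[int] = []
        # Per-CPU counts are the run of purely numeric fields after the label.
        for field in fields[1:]:
            if not field.isdigit():
                break
            counts.append(int(field))
        result[name] = counts
    return result


def busy_cpus(counts: List[int], threshold: float = 0.01) -> List[int]:
    """Return the CPUs that handled at least `threshold` of this IRQ's total."""
    total = sum(counts)
    if total == 0:
        return []
    return [cpu for cpu, n in enumerate(counts) if n / total >= threshold]
```

Feeding it the contents of `/proc/interrupts` and calling `busy_cpus` on each entry gives, per queue, the CPUs that actually serviced it, which makes the "everything on CPU0" pattern on the Stretch host easy to spot.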


sevagh · 2019-09-24T23:27:56Z

I have also seen this: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926967

nhorman · 2019-09-25T10:26:51Z

Based on that bug, it sounds like you are using irqbalance in a virtual guest. You shouldn't do that unless you also pin your virtual CPUs to physical CPUs, otherwise you have no idea what the actual mapping will be. If you are indeed running on a physical host, please let me know what version of irqbalance you are using, and post the output of irqbalance -f -d here.

sevagh · 2019-09-25T11:01:52Z

I shouldn't have linked that bug then, I'm running on bare metal. I mistakenly believed the discussion in that bug was relevant to my issue.

irqbalance version: 1.1.0-2.3. There's no config file, irqbalance is running as a systemd daemon (regular, not oneshot).

irqbalance -f -d output:

$ sudo irqbalance -f -d
Isolated CPUs: 00000001
Adaptive-ticks CPUs: 00000000
Package 0:  numa_node is 1 cpu mask is 0000fffe (load 0)
        Cache domain 0:  numa_node is 1 cpu mask is 00003000  (load 0)
                CPU number 13  numa_node is 1 (load 0)
                CPU number 12  numa_node is 1 (load 0)
        Cache domain 2:  numa_node is 1 cpu mask is 00000300  (load 0)
                CPU number 9  numa_node is 1 (load 0)
                CPU number 8  numa_node is 1 (load 0)
        Cache domain 4:  numa_node is 1 cpu mask is 00000c00  (load 0)
                CPU number 11  numa_node is 1 (load 0)
                CPU number 10  numa_node is 1 (load 0)
        Cache domain 5:  numa_node is 1 cpu mask is 000000c0  (load 0)
                CPU number 7  numa_node is 1 (load 0)
                CPU number 6  numa_node is 1 (load 0)
        Cache domain 7:  numa_node is 1 cpu mask is 00000030  (load 0)
                CPU number 5  numa_node is 1 (load 0)
                CPU number 4  numa_node is 1 (load 0)
        Cache domain 10:  numa_node is 1 cpu mask is 0000000c  (load 0)
                CPU number 3  numa_node is 1 (load 0)
                CPU number 2  numa_node is 1 (load 0)
        Cache domain 13:  numa_node is 1 cpu mask is 00000002  (load 0)
                CPU number 1  numa_node is 1 (load 0)
        Cache domain 14:  numa_node is 1 cpu mask is 0000c000  (load 0)
                CPU number 14  numa_node is 1 (load 0)
                CPU number 15  numa_node is 1 (load 0)
Package 1:  numa_node is 3 cpu mask is ffff0000 (load 0)
        Cache domain 1:  numa_node is 3 cpu mask is c0000000  (load 0)
                CPU number 31  numa_node is 3 (load 0)
                CPU number 30  numa_node is 3 (load 0)
        Cache domain 3:  numa_node is 3 cpu mask is 00300000  (load 0)
                CPU number 21  numa_node is 3 (load 0)
                CPU number 20  numa_node is 3 (load 0)
        Cache domain 6:  numa_node is 3 cpu mask is 30000000  (load 0)
                CPU number 28  numa_node is 3 (load 0)
                CPU number 29  numa_node is 3 (load 0)
        Cache domain 8:  numa_node is 3 cpu mask is 000c0000  (load 0)
                CPU number 18  numa_node is 3 (load 0)
                CPU number 19  numa_node is 3 (load 0)
        Cache domain 9:  numa_node is 3 cpu mask is 0c000000  (load 0)
                CPU number 26  numa_node is 3 (load 0)
                CPU number 27  numa_node is 3 (load 0)
        Cache domain 11:  numa_node is 3 cpu mask is 00030000  (load 0)
                CPU number 16  numa_node is 3 (load 0)
                CPU number 17  numa_node is 3 (load 0)
        Cache domain 12:  numa_node is 3 cpu mask is 03000000  (load 0)
                CPU number 24  numa_node is 3 (load 0)
                CPU number 25  numa_node is 3 (load 0)
        Cache domain 15:  numa_node is 3 cpu mask is 00c00000  (load 0)
                CPU number 22  numa_node is 3 (load 0)
                CPU number 23  numa_node is 3 (load 0)
Adding IRQ 17 to database
Adding IRQ 16 to database
Adding IRQ 43 to database
Adding IRQ 41 to database
Adding IRQ 38 to database
Adding IRQ 36 to database
Adding IRQ 44 to database
Adding IRQ 42 to database
Adding IRQ 40 to database
Adding IRQ 39 to database
Adding IRQ 37 to database
Adding IRQ 18 to database
Adding IRQ 22 to database
Adding IRQ 20 to database
Adding IRQ 19 to database
Adding IRQ 33 to database
Adding IRQ 31 to database
Adding IRQ 28 to database
Adding IRQ 26 to database
Adding IRQ 34 to database
Adding IRQ 32 to database
Adding IRQ 30 to database
Adding IRQ 29 to database
Adding IRQ 27 to database
Adding IRQ 24 to database
Adding IRQ 0 to database
Adding IRQ 1 to database
Adding IRQ 4 to database
Adding IRQ 8 to database
Adding IRQ 9 to database
Adding IRQ 12 to database
Adding IRQ 14 to database
Adding IRQ 15 to database
NUMA NODE NUMBER: -1
LOCAL CPU MASK: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff

NUMA NODE NUMBER: 2
LOCAL CPU MASK: 00ff0000

NUMA NODE NUMBER: 0
LOCAL CPU MASK: 000000ff

NUMA NODE NUMBER: 3
LOCAL CPU MASK: ff000000

NUMA NODE NUMBER: 1
LOCAL CPU MASK: 0000ff00

nhorman · 2019-09-25T13:44:12Z

Four things:

1. There should be far more output than that. What I see here is just the initial CPU tree and IRQ parsing; there should be subsequent output showing how all the IRQs get affined. If you want, you can run the same command as above with the --oneshot option, so that irqbalance exits once you have captured all the output.
2. When you run the test, make sure you shut down any other copies of irqbalance (sorry, I should have mentioned that previously).
3. It appears that you are running with isolcpus on the kernel command line. Can I ask what your isolation mask is?
4. This version of irqbalance is years old (at least 5 releases behind upstream). Whatever problem we find here, we will first have to validate it against the latest upstream release, so you may want to clone, build and test that version here; if it's fixed there, the solution will be to update Debian to that version.
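For readers following along: the "Isolated CPUs: 00000001" line and the "cpu mask is 0000fffe" values in the debug output above are hexadecimal CPU bitmasks, where bit N set means CPU N is included. A minimal Python sketch for decoding them is below; `cpus_in_mask` is a hypothetical helper name, and the comma handling assumes the kernel's usual layout of 32-bit groups with the most significant group first.

```python
from typing import List


def cpus_in_mask(mask: str) -> List[int]:
    """Decode a hex CPU bitmask such as '0000fffe' into a list of CPU numbers."""
    # Long masks are printed as comma-separated 8-digit groups, most
    # significant first, so dropping the commas leaves one hex number.
    value = int(mask.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]
```

Decoded this way, 00000001 is CPU 0 alone, and 0000fffe is CPUs 1 through 15, which is consistent with CPU 0 being absent from the Package 0 tree in that output.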

sevagh · 2019-09-25T16:19:54Z

Here's an output with 1 and 2 taken into account. As for isolcpus, I don't know much about it - we don't run our kernel with the isolcpus boot line, and $ cat /sys/devices/system/cpu/isolated is empty.

$ sudo irqbalance -f -d --oneshot
Isolated CPUs: 00000001
Adaptive-ticks CPUs: 00000000
Package 0:  numa_node is 1 cpu mask is 0000fffe (load 0)
        Cache domain 0:  numa_node is 1 cpu mask is 00003000  (load 0)
                CPU number 13  numa_node is 1 (load 0)
                CPU number 12  numa_node is 1 (load 0)
        Cache domain 2:  numa_node is 1 cpu mask is 00000300  (load 0)
                CPU number 9  numa_node is 1 (load 0)
                CPU number 8  numa_node is 1 (load 0)
        Cache domain 4:  numa_node is 1 cpu mask is 00000c00  (load 0)
                CPU number 11  numa_node is 1 (load 0)
                CPU number 10  numa_node is 1 (load 0)
        Cache domain 5:  numa_node is 1 cpu mask is 000000c0  (load 0)
                CPU number 7  numa_node is 1 (load 0)
                CPU number 6  numa_node is 1 (load 0)
        Cache domain 7:  numa_node is 1 cpu mask is 00000030  (load 0)
                CPU number 5  numa_node is 1 (load 0)
                CPU number 4  numa_node is 1 (load 0)
        Cache domain 10:  numa_node is 1 cpu mask is 0000000c  (load 0)
                CPU number 3  numa_node is 1 (load 0)
                CPU number 2  numa_node is 1 (load 0)
        Cache domain 13:  numa_node is 1 cpu mask is 00000002  (load 0)
                CPU number 1  numa_node is 1 (load 0)
        Cache domain 14:  numa_node is 1 cpu mask is 0000c000  (load 0)
                CPU number 14  numa_node is 1 (load 0)
                CPU number 15  numa_node is 1 (load 0)
Package 1:  numa_node is 3 cpu mask is ffff0000 (load 0)
        Cache domain 1:  numa_node is 3 cpu mask is c0000000  (load 0)
                CPU number 31  numa_node is 3 (load 0)
                CPU number 30  numa_node is 3 (load 0)
        Cache domain 3:  numa_node is 3 cpu mask is 00300000  (load 0)
                CPU number 21  numa_node is 3 (load 0)
                CPU number 20  numa_node is 3 (load 0)
        Cache domain 6:  numa_node is 3 cpu mask is 30000000  (load 0)
                CPU number 28  numa_node is 3 (load 0)
                CPU number 29  numa_node is 3 (load 0)
        Cache domain 8:  numa_node is 3 cpu mask is 000c0000  (load 0)
                CPU number 18  numa_node is 3 (load 0)
                CPU number 19  numa_node is 3 (load 0)
        Cache domain 9:  numa_node is 3 cpu mask is 0c000000  (load 0)
                CPU number 26  numa_node is 3 (load 0)
                CPU number 27  numa_node is 3 (load 0)
        Cache domain 11:  numa_node is 3 cpu mask is 00030000  (load 0)
                CPU number 16  numa_node is 3 (load 0)
                CPU number 17  numa_node is 3 (load 0)
        Cache domain 12:  numa_node is 3 cpu mask is 03000000  (load 0)
                CPU number 24  numa_node is 3 (load 0)
                CPU number 25  numa_node is 3 (load 0)
        Cache domain 15:  numa_node is 3 cpu mask is 00c00000  (load 0)
                CPU number 22  numa_node is 3 (load 0)
                CPU number 23  numa_node is 3 (load 0)
Adding IRQ 17 to database
Adding IRQ 16 to database
Adding IRQ 43 to database
Adding IRQ 41 to database
Adding IRQ 38 to database
Adding IRQ 36 to database
Adding IRQ 44 to database
Adding IRQ 42 to database
Adding IRQ 40 to database
Adding IRQ 39 to database
Adding IRQ 37 to database
Adding IRQ 18 to database
Adding IRQ 22 to database
Adding IRQ 20 to database
Adding IRQ 19 to database
Adding IRQ 33 to database
Adding IRQ 31 to database
Adding IRQ 28 to database
Adding IRQ 26 to database
Adding IRQ 34 to database
Adding IRQ 32 to database
Adding IRQ 30 to database
Adding IRQ 29 to database
Adding IRQ 27 to database
Adding IRQ 24 to database
Adding IRQ 0 to database
Adding IRQ 1 to database
Adding IRQ 4 to database
Adding IRQ 8 to database
Adding IRQ 9 to database
Adding IRQ 12 to database
Adding IRQ 14 to database
Adding IRQ 15 to database
NUMA NODE NUMBER: -1
LOCAL CPU MASK: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff

NUMA NODE NUMBER: 2
LOCAL CPU MASK: 00ff0000

NUMA NODE NUMBER: 0
LOCAL CPU MASK: 000000ff

NUMA NODE NUMBER: 3
LOCAL CPU MASK: ff000000

NUMA NODE NUMBER: 1
LOCAL CPU MASK: 0000ff00




-----------------------------------------------------------------------------
Package 0:  numa_node is 1 cpu mask is 0000fffe (load 0)
        Cache domain 0:  numa_node is 1 cpu mask is 00003000  (load 0)
                CPU number 13  numa_node is 1 (load 0)
                CPU number 12  numa_node is 1 (load 0)
        Cache domain 2:  numa_node is 1 cpu mask is 00000300  (load 0)
                CPU number 9  numa_node is 1 (load 0)
                CPU number 8  numa_node is 1 (load 0)
        Cache domain 4:  numa_node is 1 cpu mask is 00000c00  (load 0)
                CPU number 11  numa_node is 1 (load 0)
                CPU number 10  numa_node is 1 (load 0)
        Cache domain 5:  numa_node is 1 cpu mask is 000000c0  (load 0)
                CPU number 7  numa_node is 1 (load 0)
                CPU number 6  numa_node is 1 (load 0)
        Cache domain 7:  numa_node is 1 cpu mask is 00000030  (load 0)
                CPU number 5  numa_node is 1 (load 0)
                CPU number 4  numa_node is 1 (load 0)
        Cache domain 10:  numa_node is 1 cpu mask is 0000000c  (load 0)
                CPU number 3  numa_node is 1 (load 0)
                CPU number 2  numa_node is 1 (load 0)
        Cache domain 13:  numa_node is 1 cpu mask is 00000002  (load 0)
                CPU number 1  numa_node is 1 (load 0)
        Cache domain 14:  numa_node is 1 cpu mask is 0000c000  (load 0)
                CPU number 14  numa_node is 1 (load 0)
                CPU number 15  numa_node is 1 (load 0)
  Interrupt 12 node_num is -1 (other/0:0)
  Interrupt 4 node_num is -1 (other/0:0)
Package 1:  numa_node is 3 cpu mask is ffff0000 (load 0)
        Cache domain 1:  numa_node is 3 cpu mask is c0000000  (load 0)
                CPU number 31  numa_node is 3 (load 0)
                CPU number 30  numa_node is 3 (load 0)
        Cache domain 3:  numa_node is 3 cpu mask is 00300000  (load 0)
                CPU number 21  numa_node is 3 (load 0)
                CPU number 20  numa_node is 3 (load 0)
        Cache domain 6:  numa_node is 3 cpu mask is 30000000  (load 0)
                CPU number 28  numa_node is 3 (load 0)
                CPU number 29  numa_node is 3 (load 0)
        Cache domain 8:  numa_node is 3 cpu mask is 000c0000  (load 0)
                CPU number 18  numa_node is 3 (load 0)
                CPU number 19  numa_node is 3 (load 0)
        Cache domain 9:  numa_node is 3 cpu mask is 0c000000  (load 0)
                CPU number 26  numa_node is 3 (load 0)
                CPU number 27  numa_node is 3 (load 0)
        Cache domain 11:  numa_node is 3 cpu mask is 00030000  (load 0)
                CPU number 16  numa_node is 3 (load 0)
                CPU number 17  numa_node is 3 (load 0)
        Cache domain 12:  numa_node is 3 cpu mask is 03000000  (load 0)
                CPU number 24  numa_node is 3 (load 0)
                CPU number 25  numa_node is 3 (load 0)
        Cache domain 15:  numa_node is 3 cpu mask is 00c00000  (load 0)
                CPU number 22  numa_node is 3 (load 0)
                CPU number 23  numa_node is 3 (load 0)
  Interrupt 14 node_num is -1 (other/0:0)
  Interrupt 8 node_num is -1 (other/0:0)
  Interrupt 0 node_num is -1 (other/0:0)
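In the output above, only five IRQs (12, 4, 14, 8 and 0) show up in the placement tree, all of class "other", and none of the eth0 IRQs 26 to 34 appear. A small Python sketch for pulling that summary out of a captured log is below. It assumes the "Interrupt N node_num is M (class/...)" line format seen in this output, which is debug text rather than a stable interface; `placed_irqs_by_class` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches lines like "  Interrupt 34 node_num is -1 (ethernet/0:43)".
_IRQ_LINE = re.compile(r"Interrupt (\d+) node_num is (-?\d+) \(([^/]+)/")


def placed_irqs_by_class(debug_output: str) -> Dict[str, List[int]]:
    """Group the IRQ numbers found in irqbalance debug output by their class."""
    by_class: Dict[str, List[int]] = defaultdict(list)
    for line in debug_output.splitlines():
        match = _IRQ_LINE.search(line)
        if match:
            by_class[match.group(3)].append(int(match.group(1)))
    return {cls: sorted(irqs) for cls, irqs in by_class.items()}
```

Running it over the debug output of two irqbalance versions makes it straightforward to check whether the ethernet IRQs were placed at all.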

I will try an upstream release now.

sevagh · 2019-09-25T16:33:31Z

1.6.0, just built from master:

$ sudo ./irqbalance -d -f --oneshot
This machine seems not NUMA capable.
Isolated CPUs: 00000000
Adaptive-ticks CPUs: 00000000
Banned CPUs: 00000000
Package 0:  numa_node -1 cpu mask is 0000ffff (load 0)
        Cache domain 0:  numa_node is -1 cpu mask is 00003000  (load 0)
                CPU number 13  numa_node is -1 (load 0)
                CPU number 12  numa_node is -1 (load 0)
        Cache domain 2:  numa_node is -1 cpu mask is 00000300  (load 0)
                CPU number 9  numa_node is -1 (load 0)
                CPU number 8  numa_node is -1 (load 0)
        Cache domain 4:  numa_node is -1 cpu mask is 00000c00  (load 0)
                CPU number 11  numa_node is -1 (load 0)
                CPU number 10  numa_node is -1 (load 0)
        Cache domain 5:  numa_node is -1 cpu mask is 000000c0  (load 0)
                CPU number 7  numa_node is -1 (load 0)
                CPU number 6  numa_node is -1 (load 0)
        Cache domain 7:  numa_node is -1 cpu mask is 00000030  (load 0)
                CPU number 5  numa_node is -1 (load 0)
                CPU number 4  numa_node is -1 (load 0)
        Cache domain 10:  numa_node is -1 cpu mask is 0000000c  (load 0)
                CPU number 3  numa_node is -1 (load 0)
                CPU number 2  numa_node is -1 (load 0)
        Cache domain 13:  numa_node is -1 cpu mask is 00000003  (load 0)
                CPU number 1  numa_node is -1 (load 0)
                CPU number 0  numa_node is -1 (load 0)
        Cache domain 14:  numa_node is -1 cpu mask is 0000c000  (load 0)
                CPU number 14  numa_node is -1 (load 0)
                CPU number 15  numa_node is -1 (load 0)
Package 1:  numa_node -1 cpu mask is ffff0000 (load 0)
        Cache domain 1:  numa_node is -1 cpu mask is c0000000  (load 0)
                CPU number 31  numa_node is -1 (load 0)
                CPU number 30  numa_node is -1 (load 0)
        Cache domain 3:  numa_node is -1 cpu mask is 00300000  (load 0)
                CPU number 21  numa_node is -1 (load 0)
                CPU number 20  numa_node is -1 (load 0)
        Cache domain 6:  numa_node is -1 cpu mask is 30000000  (load 0)
                CPU number 28  numa_node is -1 (load 0)
                CPU number 29  numa_node is -1 (load 0)
        Cache domain 8:  numa_node is -1 cpu mask is 000c0000  (load 0)
                CPU number 18  numa_node is -1 (load 0)
                CPU number 19  numa_node is -1 (load 0)
        Cache domain 9:  numa_node is -1 cpu mask is 0c000000  (load 0)
                CPU number 26  numa_node is -1 (load 0)
                CPU number 27  numa_node is -1 (load 0)
        Cache domain 11:  numa_node is -1 cpu mask is 00030000  (load 0)
                CPU number 16  numa_node is -1 (load 0)
                CPU number 17  numa_node is -1 (load 0)
        Cache domain 12:  numa_node is -1 cpu mask is 03000000  (load 0)
                CPU number 24  numa_node is -1 (load 0)
                CPU number 25  numa_node is -1 (load 0)
        Cache domain 15:  numa_node is -1 cpu mask is 00c00000  (load 0)
                CPU number 22  numa_node is -1 (load 0)
                CPU number 23  numa_node is -1 (load 0)
Adding IRQ 17 to database
Adding IRQ 16 to database
Adding IRQ 43 to database
Adding IRQ 41 to database
Adding IRQ 38 to database
Adding IRQ 36 to database
Adding IRQ 44 to database
Adding IRQ 42 to database
Adding IRQ 40 to database
Adding IRQ 39 to database
Adding IRQ 37 to database
Adding IRQ 18 to database
Adding IRQ 22 to database
Adding IRQ 20 to database
Adding IRQ 19 to database
Adding IRQ 33 to database
Adding IRQ 31 to database
Adding IRQ 28 to database
Adding IRQ 26 to database
Adding IRQ 34 to database
Adding IRQ 32 to database
Adding IRQ 30 to database
Adding IRQ 29 to database
Adding IRQ 27 to database
Adding IRQ 24 to database
Adding IRQ 0 to database
Adding IRQ 1 to database
Adding IRQ 4 to database
Adding IRQ 8 to database
Adding IRQ 9 to database
Adding IRQ 12 to database
Adding IRQ 14 to database
Adding IRQ 15 to database
NUMA NODE NUMBER: -1
LOCAL CPU MASK: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff

Daemon couldn't be bound to the file-based socket.




-----------------------------------------------------------------------------
Package 0:  numa_node -1 cpu mask is 0000ffff (load 0)
        Cache domain 0:  numa_node is -1 cpu mask is 00003000  (load 0)
                CPU number 13  numa_node is -1 (load 0)
                  Interrupt 34 node_num is -1 (ethernet/0:43)
                CPU number 12  numa_node is -1 (load 0)
                  Interrupt 41 node_num is -1 (ethernet/0:0)
        Cache domain 2:  numa_node is -1 cpu mask is 00000300  (load 0)
                CPU number 9  numa_node is -1 (load 0)
                  Interrupt 28 node_num is -1 (ethernet/0:45)
                CPU number 8  numa_node is -1 (load 0)
                  Interrupt 20 node_num is -1 (video/0:0)
        Cache domain 4:  numa_node is -1 cpu mask is 00000c00  (load 0)
                CPU number 11  numa_node is -1 (load 0)
                  Interrupt 33 node_num is -1 (ethernet/0:42)
                CPU number 10  numa_node is -1 (load 0)
          Interrupt 19 node_num is -1 (legacy/0:0)
        Cache domain 5:  numa_node is -1 cpu mask is 000000c0  (load 0)
                CPU number 7  numa_node is -1 (load 0)
                  Interrupt 29 node_num is -1 (ethernet/0:25)
                CPU number 6  numa_node is -1 (load 0)
          Interrupt 24 node_num is -1 (legacy/0:0)
        Cache domain 7:  numa_node is -1 cpu mask is 00000030  (load 0)
                CPU number 5  numa_node is -1 (load 0)
                  Interrupt 32 node_num is -1 (ethernet/0:52)
                CPU number 4  numa_node is -1 (load 0)
          Interrupt 17 node_num is -1 (legacy/0:0)
        Cache domain 10:  numa_node is -1 cpu mask is 0000000c  (load 0)
                CPU number 3  numa_node is -1 (load 0)
                  Interrupt 39 node_num is -1 (ethernet/0:0)
                CPU number 2  numa_node is -1 (load 0)
        Cache domain 13:  numa_node is -1 cpu mask is 00000003  (load 0)
                CPU number 1  numa_node is -1 (load 0)
                  Interrupt 42 node_num is -1 (ethernet/0:0)
                CPU number 0  numa_node is -1 (load 0)
        Cache domain 14:  numa_node is -1 cpu mask is 0000c000  (load 0)
                CPU number 14  numa_node is -1 (load 0)
                  Interrupt 36 node_num is -1 (ethernet/0:0)
                CPU number 15  numa_node is -1 (load 0)
  Interrupt 14 node_num is -1 (other/0:0)
  Interrupt 9 node_num is -1 (other/0:0)
  Interrupt 4 node_num is -1 (other/0:0)
  Interrupt 0 node_num is -1 (other/0:0)
Package 1:  numa_node -1 cpu mask is ffff0000 (load 0)
        Cache domain 1:  numa_node is -1 cpu mask is c0000000  (load 0)
                CPU number 31  numa_node is -1 (load 0)
                  Interrupt 26 node_num is -1 (ethernet/0:0)
                CPU number 30  numa_node is -1 (load 0)
                  Interrupt 43 node_num is -1 (ethernet/0:0)
        Cache domain 3:  numa_node is -1 cpu mask is 00300000  (load 0)
                CPU number 21  numa_node is -1 (load 0)
                  Interrupt 31 node_num is -1 (ethernet/0:41)
                CPU number 20  numa_node is -1 (load 0)
                  Interrupt 22 node_num is -1 (storage/0:254)
        Cache domain 6:  numa_node is -1 cpu mask is 30000000  (load 0)
                CPU number 28  numa_node is -1 (load 0)
                  Interrupt 27 node_num is -1 (ethernet/0:102)
                CPU number 29  numa_node is -1 (load 0)
          Interrupt 18 node_num is -1 (legacy/0:0)
        Cache domain 8:  numa_node is -1 cpu mask is 000c0000  (load 0)
                CPU number 18  numa_node is -1 (load 0)
                  Interrupt 30 node_num is -1 (ethernet/0:32)
                CPU number 19  numa_node is -1 (load 0)
          Interrupt 16 node_num is -1 (legacy/0:0)
        Cache domain 9:  numa_node is -1 cpu mask is 0c000000  (load 0)
                CPU number 26  numa_node is -1 (load 0)
                  Interrupt 37 node_num is -1 (ethernet/0:0)
                CPU number 27  numa_node is -1 (load 0)
        Cache domain 11:  numa_node is -1 cpu mask is 00030000  (load 0)
                CPU number 16  numa_node is -1 (load 0)
                  Interrupt 40 node_num is -1 (ethernet/0:0)
                CPU number 17  numa_node is -1 (load 0)
        Cache domain 12:  numa_node is -1 cpu mask is 03000000  (load 0)
                CPU number 24  numa_node is -1 (load 0)
                  Interrupt 44 node_num is -1 (ethernet/0:0)
                CPU number 25  numa_node is -1 (load 0)
        Cache domain 15:  numa_node is -1 cpu mask is 00c00000  (load 0)
                CPU number 22  numa_node is -1 (load 0)
                  Interrupt 38 node_num is -1 (ethernet/0:0)
                CPU number 23  numa_node is -1 (load 0)
  Interrupt 15 node_num is -1 (other/0:0)
  Interrupt 12 node_num is -1 (other/0:0)
  Interrupt 8 node_num is -1 (other/0:0)
  Interrupt 1 node_num is -1 (other/0:0)
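
A small sketch for reading the debug tree above programmatically, in case it helps compare hosts. It assumes the indentation seen in this output (an interrupt indented deeper than the preceding "CPU number" line is attached to that CPU, anything shallower sits at cache-domain or package level). The function name `irq_placement` is made up for illustration and is not part of irqbalance.

```python
import re

# Patterns match the debug lines as they appear in the output above.
CPU_RE = re.compile(r"^(\s*)CPU number (\d+)")
IRQ_RE = re.compile(r"^(\s*)Interrupt (\d+) node_num is (-?\d+) \(([^/]+)/")


def irq_placement(debug_text):
    """Map IRQ number -> (class, cpu), with cpu=None when the interrupt
    is attached above CPU level (cache domain or package)."""
    placement = {}
    cpu = None
    cpu_indent = -1
    for line in debug_text.splitlines():
        m = CPU_RE.match(line)
        if m:
            cpu_indent = len(m.group(1))
            cpu = int(m.group(2))
            continue
        m = IRQ_RE.match(line)
        if m:
            indent = len(m.group(1))
            irq = int(m.group(2))
            irq_class = m.group(4)
            # Assumption: deeper indentation than the last CPU line means
            # the interrupt belongs to that CPU.
            on_cpu = cpu if (cpu is not None and indent > cpu_indent) else None
            placement[irq] = (irq_class, on_cpu)
    return placement
```

Filtering the result for the `ethernet` class gives a quick list of which CPU each eth0 queue was placed on.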

After running it this way (and stopping the continually running irqbalance daemon in the background), the interrupts are becoming better spread:

$ grep eth /proc/interrupts
  26:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048576-edge      eth0
  27:  356669796          0          0          0          0          0         17          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0       1852          0          0          0   PCI-MSI 1048577-edge      eth0-TxRx-0
  28:  308170840          0          0          0          0          0          0          0          0       1348          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048578-edge      eth0-TxRx-1
  29:  345044273          0          0          0          0          0          0        860          0          0          0          0          5          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048579-edge      eth0-TxRx-2
  30:  309825176          0          0          0          0          0          0          0          0          0          0          0          0          0          0         17          0          0        909          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048580-edge      eth0-TxRx-3
  31:  353394171          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0         33          0          0       1153          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048581-edge      eth0-TxRx-4
  32:  293922605          0          0          0          0       1075          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0         46          0          0          0          0          0          0          0          0          0          0   PCI-MSI 1048582-edge      eth0-TxRx-5
  33:  307812015          0          0          0          0          0          0          0          0          0          0        959          0          0          0          0          0          0          0          0          0          0          0          0         31          0          0          0          0          0          0          0   PCI-MSI 1048583-edge      eth0-TxRx-6
  34:  275611540          0          0          0          0          0          0          0          0          0          0          0          0        969          0          0          0          0          0          0          0          0          0          0          0          0          0         11          0          0          0          0   PCI-MSI 1048584-edge      eth0-TxRx-7
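
The counters in /proc/interrupts are cumulative since boot, so the CPU0 column stays large even once new interrupts land elsewhere. A rough sketch for summing the eth0 rows per CPU and diffing two snapshots follows. The helper names `per_cpu_totals` and `per_cpu_delta` are hypothetical, and the parsing assumes the layout shown above (IRQ number, one counter per CPU, then the controller and device name).

```python
def per_cpu_totals(interrupts_text, name_filter="eth0"):
    """Sum the per-CPU counters of every row whose device name contains name_filter."""
    totals = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "CPU0 CPU1 ..." header row.
        if not fields or not fields[0].endswith(":"):
            continue
        if name_filter not in fields[-1]:
            continue
        # The counters are the leading run of integer fields after "IRQ:".
        counts = []
        for field in fields[1:]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(totals) < len(counts):
            totals.extend([0] * (len(counts) - len(totals)))
        for cpu, count in enumerate(counts):
            totals[cpu] += count
    return totals


def per_cpu_delta(before_text, after_text, name_filter="eth0"):
    """Per-CPU interrupts that arrived between two snapshots."""
    before = per_cpu_totals(before_text, name_filter)
    after = per_cpu_totals(after_text, name_filter)
    return [a - b for a, b in zip(after, before)]
```

Taking two snapshots a few seconds apart and passing them to `per_cpu_delta` shows where interrupts are landing now, rather than where they landed historically.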

sevagh · 2019-09-25T16:40:09Z

I also don't know what the desired behavior of irqbalance is: should it put all ethernet interrupts on CPU0 for cache coherency, or spread them across other CPUs to reduce single hotspots/saturation?

nhorman · 2019-09-25T20:53:16Z

Thanks for the information.

In answer to your question, the desired behavior of irqbalance for most general-purpose workloads is to spread irqs as much as possible across all of your cpus. This helps give even latency to your user space processes, as well as maintaining a high data cache hit rate (nominally, by default, irqs will trigger on a different cpu each time they are raised, dirtying cache on multiple cpus without that data then being reused).

That said, you seem to have a unique situation here. For some reason your distro version of irqbalance is doing something very wrong. It's somehow improperly parsing the isolated bitmap and giving you an isolated cpu mask of 0x1, which it shouldn't be, and that is likely what is leading to the misbalancing.

In either case, however, this was fixed back in 2016 with commit 3c9a009, and from your testing you can see the upstream version is behaving. The Debian maintainers need to backport that fix, or better still, just update to a modern version of irqbalance.
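
For anyone reading the masks in this thread: both the "cpu mask" values in the debug output and the isolated cpu mask of 0x1 are hex bitmaps in which bit N stands for CPU N. The snippet below is only an illustrative decoder (the name `cpus_in_mask` is made up here, and this is not how irqbalance itself parses the bitmap).

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hex cpu mask.

    Accepts forms such as "000000c0", "0x1", or comma-grouped masks
    like "00000001,00000000".
    """
    value = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


print(cpus_in_mask("000000c0"))  # cache domain 5 in the debug output
print(cpus_in_mask("0x1"))       # the isolated cpu mask discussed above
```

Decoding `000000c0` gives CPUs 6 and 7, matching the two CPUs listed under cache domain 5, while `0x1` decodes to CPU 0 only.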

sevagh · 2019-09-25T20:59:18Z

Thanks for the help. Should I be filing a Debian bug/upgrade request somewhere?

nhorman · 2019-09-25T21:00:39Z

Yes, by following the instructions here:
https://www.debian.org/Bugs/Reporting

nhorman closed this as completed Sep 25, 2019

saifat29 mentioned this issue Oct 11, 2019

All interrupts running on single cpu #125

Closed
