Flatcar Container Linux fails and reboots: "kernel BUG at net/core/skbuff.c" #378

Closed
seh opened this issue Apr 7, 2021 · 45 comments
Labels
channel/beta Issue concerns the Beta channel. kind/bug Something isn't working

Comments

@seh

seh commented Apr 7, 2021

Description

On AWS EC2 instances using Flatcar Container Linux versions 2765.1.0 and 2801.1.0 from the Beta channel, with a kOps-provisioned Kubernetes installation on top, we encounter a kernel bug that causes the machines to stop and reboot immediately.

The log entries in journalctl appear as follows:

Apr 06 17:59:41 ip-10-2-1-63.eu-west-1.compute.internal kernel: ------------[ cut here ]------------
Apr 06 17:59:41 ip-10-2-1-63.eu-west-1.compute.internal kernel: kernel BUG at net/core/skbuff.c:4008!
-- Boot 8314fb086d5b4ed0a9e80895ab0c4f0b --
Apr 06 17:59:59 localhost kernel: Linux version 5.10.25-flatcar (build@pony-truck.infra.kinvolk.io) (x86_64-cros-linux-gnu-gcc (Gentoo Hardened 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.35 p1) 2.35.0) #1 SMP Wed Mar 24 14:51:21 ->

Sometimes the line number cited in net/core/skbuff.c is 3996 instead of 4008. Usually we see 3996 first and then, after the machine reboots, 4008 from then on, suggesting that the reboot swapped some updated files into place.

Note that we have locksmithd disabled, but update-engine is enabled, so we're downloading updates but not putting them into use eagerly.

Impact

Our fleet of Kubernetes cluster machines reboot periodically, causing the containers running on them to exit without warning and be replaced (in most cases) by the kubelet after a short delay.

Environment and steps to reproduce

  1. Set-up:
  • AWS EC2 in the "eu-west-1" region, though we've seen a few of these failures in the "us-east-2" region as well.
  • Instance types we've seen fail:
    • m5.xlarge
    • m5.2xlarge
    • m5.4xlarge
    • m5a.2xlarge
    • c5.xlarge
  • Cluster provisioned by kOps version 1.19.1
  • Kubernetes versions 1.19.8 and 1.19.9
  • Cluster CNI: Calico version 3.17.3 and 3.18.1
  2. Task:
  • Kubernetes is running either control plane or worker node responsibilities.
  • We have not seen this failure occur on bastion machines (instance type t3.micro) that don't run any Kubernetes components.
  3. Action(s):
    a. Launch an EC2 instance using Flatcar Container Linux, perhaps via a supervising ASG.
    b. Allow various Kubernetes components to start (e.g. kubelet, CNI daemons).
    c. Periodically check the machine's last boot time.
    d. Inspect system logs with a command like journalctl --grep=skbuff.
  4. Error:
    The machine will hum along normally, downloading updates occasionally, and running containers for Kubernetes workload. With no warning, the machine will reboot. Subsequent inspection of the log via journalctl shows a message like this:
kernel: kernel BUG at net/core/skbuff.c:3996!

One variation:

kernel: kernel BUG at net/core/skbuff.c:4008!
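
As a rough sketch of how the log inspection in step (d) could be summarized across boots, the following helper tallies the cited skbuff.c line numbers from journal text supplied on standard input. The function name summarize_skbuff_bugs is made up for illustration, and the helper assumes the message format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flatcar): tally "kernel BUG at
# net/core/skbuff.c:<line>!" messages found in journal text on stdin.
# Prints one "<count> <line-number>" pair per distinct source line.
summarize_skbuff_bugs() {
    grep -o 'kernel BUG at net/core/skbuff\.c:[0-9]*!' \
        | sed 's/.*:\([0-9]*\)!/\1/' \
        | sort -n \
        | uniq -c \
        | awk '{ print $1, $2 }'
}
```

On a node it could be fed with something like `journalctl --no-pager --grep='kernel BUG at' | summarize_skbuff_bugs`. Note that adding journalctl's -k/--dmesg option implies the current boot only, which would hide a crash logged just before the reboot.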

After the machine boots, the /sys/fs/pstore directory mentioned here exists, but is empty. The "pstore" mount entry is as follows:

pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)

Perhaps our hardware does not support pstore, per the following uname -a output:

Linux ip-10-2-1-63.eu-west-1.compute.internal 5.10.25-flatcar #1 SMP Wed Mar 24 14:51:21 -00 2021 x86_64 Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz GenuineIntel GNU/Linux

Expected behavior
The machine should continue running normally without encountering errors that cause it to reboot without warning.

Additional information
We run similar Kubernetes clusters in several other AWS regions:

  • ap-northeast-1
  • ap-southeast-1
  • us-west-2

We have not seen this failure occur in those regions. We see it predominantly in "eu-west-1" and occasionally in "us-east-2." That could be due to more intense workload in the clusters in the former region.

@t-lo
Member

t-lo commented Apr 12, 2021

Thank you for reporting, @seh; we'll have a look. Do you think you could look into a reliable repro case for this issue?

@seh
Author

seh commented Apr 12, 2021

That is going to be very difficult, as so far it amounts to, "Run this Kubernetes cluster with this workload."

We have confirmed that downgrading to the Flatcar Container Linux beta version 2705.1.2 alleviates the problem. Again, versions 2765.1.0 and 2801.1.0 both suffer this same kernel bug.

Looking at the workload that runs on all the machines on which we've seen this occur, we identified only three in common:

  • Calico's "calico-node" daemon pod
  • Prometheus node exporter daemon pod
  • Vector's "vector-agent" daemon pod

We disabled Vector and proved that that was not the culprit. It wasn't feasible to disable "calico-node" and still have a functional Kubernetes cluster. (Swapping a CNI implementation in a production-grade cluster is a delicate operation.) We did not get as far as disabling Prometheus node exporter, though we're running it on every machine in several other Kubernetes clusters—that just happen to be less busy—so it's not likely it's at fault.

@seh
Author

seh commented Apr 12, 2021

I neglected to mention earlier that in our clusters where Flatcar Container Linux's locksmithd service is enabled, we don't see this bug arise. In our clusters where update_engine is enabled but locksmithd is disabled, the bug occurs on 10-15 out of 200 machines every day.

@t-lo
Member

t-lo commented Apr 13, 2021

Interesting, thank you for sharing. While we're still looking for a solid repro, the information you've provided will help with narrowing down the issue.

@seh
Author

seh commented Apr 13, 2021

I mentioned that we don't see this kernel bug arising on our machines where both update_engine and locksmithd are enabled. However, I did notice something odd in the system logs on those machines.

I've been polling our machines regularly via SSH, running a command like journalctl --grep=skbuff and collecting the output, in order to see how often and on which machines the kernel bug has been occurring. On some of the machines with locksmithd enabled, I see output from that command like this:

journalctl --grep output
-- Journal begins at Sat 2021-02-13 23:16:07 UTC, ends at Mon 2021-04-12 21:28:47 UTC. --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 --
-- Boot 1725d5160db347908f79b244ed63da5e --
-- No entries --

Notice how it keeps flapping between two different IDs. What does that indicate?
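
A minimal sketch for quantifying that flapping, assuming the "-- Boot <id> --" marker format shown above: the helper below counts how many boot markers appear in the output and how many distinct boot IDs they name. The function name boot_marker_stats is invented for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read journalctl output on stdin and print
# "<distinct-boot-ids> <boot-markers>" for the "-- Boot <id> --" lines.
boot_marker_stats() {
    grep '^-- Boot [0-9a-f]* --$' \
        | awk '{ n++; ids[$3] = 1 }
               END { d = 0; for (k in ids) d++; print d, n + 0 }'
}
```

For the output above this would report two distinct IDs across many markers. Comparing that with `journalctl --list-boots` may help show whether the journal really holds only two boots.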

@margamanterola
Contributor

Are these machines in-place-upgraded from CoreOS or were they freshly installed with Flatcar?

@seh
Author

seh commented Apr 14, 2021

These are fresh "installations" on EC2 instances by way of the published AMIs.

@igcherkaev

Hello. We've noticed the same behavior in our environment on VMware-provisioned VMs.

A few times a day, the box that happens to be the busiest in terms of network load crashes with the following:

May 19 18:30:45 rnqkbm401 kernel: kernel BUG at net/core/skbuff.c:4008!
-- Boot 757516b861de4db8a139aa895db71803 --
May 19 18:31:02 localhost kernel: Linux version 5.10.37-flatcar (build@pony-truck.infra.kinvolk.io) (x86_64-cros-linux-gnu-gcc (Gentoo Hardened 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.35 p1) 2.35.0) #1 SMP Mon May 17 22:08:55 -00 2021

Happens on 5.10.32-flatcar as well. pstore is empty too.

@igcherkaev

Less loaded VMs (in terms of network I/O) don't crash, even with the same kernel. We also run Kubernetes on them with Calico as our CNI (with eBPF mode enabled).

@igcherkaev

We have another Kubernetes cluster with the same setup, but it's still running an older kernel (e.g., Flatcar Container Linux by Kinvolk 2605.10.0 (Oklo) 5.4.83-flatcar), and there are no crashes at all.

@sayanchowdhury sayanchowdhury added the kind/bug Something isn't working label May 24, 2021
@igcherkaev

Just a quick update: after disabling GSO and GRO on the box, it hasn't crashed in 4 days. We're still monitoring the box, but it's already a great sign. It used to crash every day or every other day.

@igcherkaev

Almost 7 days now without crashing since GSO/GRO got disabled.

@seh
Author

seh commented May 26, 2021

How did you disable those, Igor?

@igcherkaev

igcherkaev commented May 26, 2021

ethtool -K <iface name> gso off
ethtool -K <iface name> gro off

where <iface name> is your NIC, e.g. eth0.

We have a systemd unit now to disable it on boot.
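
To confirm that the settings took effect after boot, one option is to parse the feature list printed by ethtool -k. This is only a sketch: the function name offloads_disabled is invented here, and it assumes the usual "generic-segmentation-offload: on|off" and "generic-receive-offload: on|off" lines in that output.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k <iface>` output on stdin and
# succeed (exit 0) only if both generic-segmentation-offload and
# generic-receive-offload are reported as "off".
offloads_disabled() {
    awk '
        /^[ \t]*generic-(segmentation|receive)-offload:/ {
            seen++
            if ($2 != "off") still_on++
        }
        END { exit ((seen == 2 && still_on == 0) ? 0 : 1) }
    '
}
```

For example: `ethtool -k eth0 | offloads_disabled && echo "GSO and GRO are off"`.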

@t-lo t-lo added the channel/beta Issue concerns the Beta channel. label Jun 1, 2021
@jepio
Member

jepio commented Jan 28, 2022

Is this still occurring with the most recent releases? Could you also test Alpha, which has kernel 5.15 and might no longer trigger this?

@seh
Author

seh commented Jan 28, 2022

We've seen it most recently two weeks ago with kernel version 5.10.84, which we received by way of an upgrade when rebooting one of our machines that started life with Flatcar version 2705.1.2.

It may be another couple of weeks before I can offer any testing outcome. What changed recently that you think may alleviate this problem?

@jepio
Member

jepio commented Jan 31, 2022

Nothing, @seh; I was just hoping it might have resolved itself.

Would you be able to capture the full splat on the serial console, including the stacktrace?

@seh
Author

seh commented Jan 31, 2022

Next time I see it, I will grab all I can from journalctl. Is there another source you’re recommending that I collect as well?

@pothos
Member

pothos commented Jan 31, 2022

In case your system has a pstore backend, you may find dmesg traces in /var/lib/systemd/pstore/ on the next boot. The files get moved there for persistent storage instead of staying in /sys/fs/pstore. I'll update the docs (Edit: done here flatcar-archive/flatcar-docs#206).

@seh
Author

seh commented Jan 31, 2022

Note that I mentioned in my initial description that our pstore directory winds up empty after these reboots, perhaps for lack of hardware support.

@pothos
Member

pothos commented Jan 31, 2022

Maybe, but it could be that systemd-pstore.service ran and moved them to /var/lib/systemd/pstore; that's what I wanted to hint at.
Edit: check whether you have pstore support by checking whether /sys/module/pstore/parameters/backend contains something other than (null)
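
A small sketch of that check, assuming the parameter file path given above: the helper prints the registered backend name, "none" when the file holds (null) or is empty, and "unknown" when the file cannot be read. The name pstore_backend and the optional path argument (useful for trying it against a test file) are additions made for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: report the registered pstore backend.
# $1 optionally overrides the parameter file path (handy for testing).
pstore_backend() {
    param_file=${1:-/sys/module/pstore/parameters/backend}
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 2
    fi
    backend=$(cat "$param_file")
    case "$backend" in
        ''|'(null)') echo "none"; return 1 ;;
        *) echo "$backend" ;;
    esac
}
```

If a backend is reported, listing /var/lib/systemd/pstore/ after the next crash and reboot would show whether any dmesg traces were preserved.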

@jmcgrath207

@igcherkaev

ethtool -K <iface name> gso off
ethtool -K <iface name> gro off

where iface name is your NIC card, e.g. eth0.

We have a systemd unit now to disable it on boot.

Is this still working for you?

@seh
Author

seh commented Jun 27, 2022

This is still happening to us with Flatcar Container Linux version 3227.1.1.

@Mitsuwa

Mitsuwa commented Oct 4, 2022

This still seems to be happening in 3033.3.5.

Worse, I do not seem to be able to apply the workaround:

$ sudo ethtool -K eth0 gso off
Cannot get device udp-fragmentation-offload settings: Operation not supported
Cannot get device udp-fragmentation-offload settings: Operation not supported

@seh
Author

seh commented Oct 5, 2022

We suffered through this bug through the night and this morning, and have found that the workaround suggested by @igcherkaev in #378 (comment) works acceptably, so long as we're using Flatcar Container Linux with a kernel at version 5.15 or so. A kernel that new wasn't enough without disabling generic receive and segmentation offload, and disabling those offloads wasn't enough without a new enough kernel. In particular, kernel version 5.10.137 as offered by the LTS 3033.3.5 release wasn't new enough.

Here are the two systemd units I wrote to ensure that we turn the offloads off.

disable-generic-receive-offload.service
[Unit]
Description=Disable generic receive offload on primary Ethernet interface
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=ethtool --offload eth0 generic-receive-offload off
ExecStop=ethtool --offload eth0 generic-receive-offload on

disable-generic-segmentation-offload.service
[Unit]
Description=Disable generic segmentation offload on primary Ethernet interface
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=ethtool --offload eth0 generic-segmentation-offload off
ExecStop=ethtool --offload eth0 generic-segmentation-offload on

@seh
Author

seh commented Oct 9, 2022

Using the stable Flatcar Container Linux version 3227.2.2 atop kernel version 5.15.63, we see this kernel bug occur in file net/core/skbuff.c on line 4219, where the check of list_skb->head_frag fails because that bitfield is false.

If we disable GRO and GSO (we're not yet sure if it's crucial to disable both of these), we skirt this kernel bug, but the network performance suffers so drastically that we can't afford to run our workload like that.

@pothos
Member

pothos commented Oct 10, 2022

Can we engage the upstream maintainers despite having no clear trace, and has anyone started that discussion? The source code link from the "BUG at net/core/skbuff.c:123!" messages and the workarounds may already give some hints.

@seh
Author

seh commented Oct 10, 2022

We noticed a different problem when running with GRO and GSO enabled again but with the MTU on the eth0 interface ratcheted down from the default 9001 to 1500, this time using Flatcar Container Linux beta version 3346.1.0 and kernel version 5.15.70 atop the "m5.4xlarge" EC2 instance type. Instead of reporting through the BUG_ON macro and rebooting, the kernel reports a hardware checksum failure and keeps going, albeit with degraded network performance afterward.

Please see this log fragment for an example.

dmesg output
[ 6654.575206] calia524c310aed: Caught tx_queue_len zero misconfig
[ 6802.827628] <unknown>: hw csum failure
[ 6802.828397] skb len=322 headroom=148 headlen=322 tailroom=3306
               mac=(114,14) net=(128,20) trans=148
               shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
               csum(0x61202bc9 ip_summed=2 complete_sw=0 valid=0 level=0)
               hash(0x9c389c38 sw=0 l4=1) proto=0x0800 pkttype=0 iif=3
[ 6802.834186] skb headroom: 00000000: bb cc 52 15 6a f0 46 8d ef db ac 01 22 40 4a bb
[ 6802.835623] skb headroom: 00000010: ed 41 14 b3 3d ef 2a 9c 41 4e 75 bb fa 30 b6 59
[ 6802.837068] skb headroom: 00000020: 7f 25 0b 98 22 71 76 ce d1 02 8d 94 ab 4b 9f 02
[ 6802.838501] skb headroom: 00000030: 4f 1f 51 6a b6 82 39 23 f7 09 a1 3d d0 07 00 00
[ 6802.839921] skb headroom: 00000040: 06 c0 b0 ea 75 8e 06 c3 48 fa de 48 08 00 45 00
[ 6802.841344] skb headroom: 00000050: 01 88 6e 89 00 00 40 11 21 62 0a 03 6d 62 0a 03
[ 6802.842748] skb headroom: 00000060: 68 12 aa 69 21 18 01 74 b4 2d 08 00 00 00 00 00
[ 6802.844160] skb headroom: 00000070: 01 00 c2 22 d9 0c d9 b7 ee ee ee ee ee ee 08 00
[ 6802.845828] skb headroom: 00000080: 45 00 01 56 35 84 40 00 3e 06 1d da 64 7a c1 cc
[ 6802.847521] skb headroom: 00000090: 64 69 5d 94
[ 6802.848539] skb linear:   00000000: a2 0c 1f 90 77 b0 94 1e 52 91 97 33 80 18 00 46
[ 6802.850223] skb linear:   00000010: e3 b1 00 00 01 01 08 0a 44 fb 3d b6 df e9 e5 7d
[ 6802.851920] skb linear:   00000020: 47 45 54 20 2f 6d 65 74 72 69 63 73 20 48 54 54
[ 6802.853618] skb linear:   00000030: 50 2f 31 2e 31 0d 0a 48 6f 73 74 3a 20 31 30 30
[ 6802.855307] skb linear:   00000040: 2e 31 30 35 2e 39 33 2e 31 34 38 3a 38 30 38 30
[ 6802.857001] skb linear:   00000050: 0d 0a 55 73 65 72 2d 41 67 65 6e 74 3a 20 50 72
[ 6802.858694] skb linear:   00000060: 6f 6d 65 74 68 65 75 73 2f 32 2e 33 39 2e 30 0d
[ 6802.860381] skb linear:   00000070: 0a 41 63 63 65 70 74 3a 20 61 70 70 6c 69 63 61
[ 6802.862067] skb linear:   00000080: 74 69 6f 6e 2f 6f 70 65 6e 6d 65 74 72 69 63 73
[ 6802.863766] skb linear:   00000090: 2d 74 65 78 74 3b 76 65 72 73 69 6f 6e 3d 31 2e
[ 6802.865456] skb linear:   000000a0: 30 2e 30 2c 61 70 70 6c 69 63 61 74 69 6f 6e 2f
[ 6802.867154] skb linear:   000000b0: 6f 70 65 6e 6d 65 74 72 69 63 73 2d 74 65 78 74
[ 6802.868845] skb linear:   000000c0: 3b 76 65 72 73 69 6f 6e 3d 30 2e 30 2e 31 3b 71
[ 6802.870536] skb linear:   000000d0: 3d 30 2e 37 35 2c 74 65 78 74 2f 70 6c 61 69 6e
[ 6802.872220] skb linear:   000000e0: 3b 76 65 72 73 69 6f 6e 3d 30 2e 30 2e 34 3b 71
[ 6802.873913] skb linear:   000000f0: 3d 30 2e 35 2c 2a 2f 2a 3b 71 3d 30 2e 31 0d 0a
[ 6802.875601] skb linear:   00000100: 41 63 63 65 70 74 2d 45 6e 63 6f 64 69 6e 67 3a
[ 6802.877291] skb linear:   00000110: 20 67 7a 69 70 0d 0a 58 2d 50 72 6f 6d 65 74 68
[ 6802.878977] skb linear:   00000120: 65 75 73 2d 53 63 72 61 70 65 2d 54 69 6d 65 6f
[ 6802.880660] skb linear:   00000130: 75 74 2d 53 65 63 6f 6e 64 73 3a 20 31 30 0d 0a
[ 6802.882338] skb linear:   00000140: 0d 0a
[ 6802.883243] skb tailroom: 00000000: 77 c3 25 55 29 3f 50 f3 c8 99 73 dd 04 a8 41 a1
[ 6802.884917] skb tailroom: 00000010: 15 7e 77 fd 79 58 fe 58 29 20 97 35 ce 13 a1 22
[ 6802.886598] skb tailroom: 00000020: f1 c8 62 f7 3d 7b 8a b0 60 e0 f5 89 dc dc 6f 8e
[ 6802.888289] skb tailroom: 00000030: cc 19 fd e3 66 1f 5f 6a fe db b8 46 1d 20 bf d3
[ 6802.889974] skb tailroom: 00000040: 5f ce c4 8c 48 b1 84 f5 38 39 8e 5e 95 61 55 f6
[ 6802.891699] skb tailroom: 00000050: f8 c5 75 a9 66 45 1e b8 d7 82 f7 f2 16 66 28 ea
[ 6802.893380] skb tailroom: 00000060: 9f b8 5a dc 75 b6 27 f3 43 9c 5a 59 e3 f2 23 b7
[ 6802.895076] skb tailroom: 00000070: b2 21 cb 6e be d2 a6 8d 6d bf 9a 5e 8b e3 8d 35
[ 6802.896758] skb tailroom: 00000080: 48 36 12 72 76 84 10 e3 8e 5a a7 1c b9 53 53 7d
[ 6802.898436] skb tailroom: 00000090: 81 db eb d8 8d c7 5a 94 d7 54 18 a4 2a 3c 91 ae
[ 6802.900130] skb tailroom: 000000a0: 8d ac c5 5d c9 9c 97 f1 a0 2f 03 18 ac 3f dc 64
[ 6802.901816] skb tailroom: 000000b0: 01 7d 01 97 c4 3b 74 23 1b 10 9e cc d4 16 ba f0
[ 6802.903501] skb tailroom: 000000c0: 54 99 43 77 fe e9 38 25 d8 6b b8 dc 00 02 05 9c
[ 6802.905184] skb tailroom: 000000d0: e6 49 45 6e ee b4 91 ee af 35 84 60 c5 44 82 bf
[ 6802.906881] skb tailroom: 000000e0: 43 8b aa 44 b8 16 52 2f 57 2d 6b de b2 7f ad 98
[ 6802.908565] skb tailroom: 000000f0: 9f b8 aa 57 66 5b ac b5 f1 e0 07 3d 62 cc a5 8a
[ 6802.910256] skb tailroom: 00000100: c5 9a 00 03 e0 2a 4e c8 6a df 91 ea 71 2b bf 04
[ 6802.911950] skb tailroom: 00000110: 32 03 24 47 14 b9 d9 e4 f2 7a 25 2e e8 9a 07 cb
[ 6802.913638] skb tailroom: 00000120: e6 6d 10 fc bf d3 e6 93 0a 7d c5 cb bd 68 dd a0
[ 6802.915342] skb tailroom: 00000130: 24 32 b0 26 32 51 44 b1 5a fd 26 51 d3 51 83 29
[ 6802.917019] skb tailroom: 00000140: 37 96 8c 17 62 f5 9b 5c d6 bb 20 0e c9 e5 0e 3a
[ 6802.918715] skb tailroom: 00000150: 65 92 99 a8 dd 94 4e a2 9f 37 f5 09 ee b6 46 66
[ 6802.920400] skb tailroom: 00000160: 05 b4 ad 4b 5d b4 d0 e6 b0 29 e2 86 d5 be 8a 28
[ 6802.922085] skb tailroom: 00000170: aa e4 57 bb 8b ef 89 eb 80 84 8b e2 45 f9 32 59
[ 6802.923796] skb tailroom: 00000180: 0f bf 31 fb 32 1f 52 34 e5 17 c9 50 93 67 52 a4
[ 6802.925469] skb tailroom: 00000190: 9b 8a 07 df 45 46 35 9a 9f a1 43 a7 37 4c e9 f2
[ 6802.927150] skb tailroom: 000001a0: 2a cb 43 8f aa 61 fa d0 03 7e 7b 27 10 c3 d4 ef
[ 6802.928681] skb tailroom: 000001b0: d7 3d 3e 58 a5 c0 85 c1 50 65 e7 74 8c 76 45 e7
[ 6802.930044] skb tailroom: 000001c0: 25 1e c4 76 a2 06 e2 b3 9f 4a 20 4f c4 f7 98 db
[ 6802.931419] skb tailroom: 000001d0: 35 9c 51 bb 21 a6 06 cc a7 2f 5f 00 20 93 c0 3c
[ 6802.932803] skb tailroom: 000001e0: a9 72 2f af a2 7e 87 53 97 e3 ba 24 ff b5 dd ed
[ 6802.934158] skb tailroom: 000001f0: 43 dd 18 0e 89 e4 e5 24 3d 84 19 a8 67 0b bd d7
[ 6802.935540] skb tailroom: 00000200: 88 ab e1 37 67 f6 de 20 25 c4 a7 23 24 fd b3 af
[ 6802.936904] skb tailroom: 00000210: 59 7c ce b2 25 47 6e 05 e2 db ff 7d 6e 27 e2 10
[ 6802.938275] skb tailroom: 00000220: 80 37 1d 95 98 22 3e 87 ba b0 c0 aa 9a ce 4c b8
[ 6802.939640] skb tailroom: 00000230: e7 fb 55 d3 69 7b 1c a5 bc d0 c4 a8 14 ab fd cd
[ 6802.940998] skb tailroom: 00000240: bc d7 9d cd a8 ee 62 0e 44 81 f1 39 c6 4c e5 34
[ 6802.942371] skb tailroom: 00000250: 81 42 a3 22 04 06 aa 06 97 64 37 78 34 bc 29 c1
[ 6802.943733] skb tailroom: 00000260: 36 2a c3 5c ea 26 8a 6f 5c ff f9 f0 f4 37 dc 9b
[ 6802.945106] skb tailroom: 00000270: 54 15 08 b7 86 8d 3e dc 1d 38 73 b9 4f 16 12 52
[ 6802.946463] skb tailroom: 00000280: 40 d7 91 a8 e4 3f ab 4a 20 09 6a ff cf 54 16 6b
[ 6802.947828] skb tailroom: 00000290: fe d5 8f 8e 7e 8c 13 47 09 a5 f2 5a 59 c7 3f ee
[ 6802.949183] skb tailroom: 000002a0: 0d f3 69 eb 52 c3 05 e1 6b d4 20 37 27 4b 65 82
[ 6802.950544] skb tailroom: 000002b0: a9 8d bd 54 a3 08 7f 2c 39 d0 c8 58 7f 8b 52 7d
[ 6802.951899] skb tailroom: 000002c0: d6 8e ef ec 4f 98 2a c9 40 61 e5 ce 6a b8 80 d4
[ 6802.953277] skb tailroom: 000002d0: a5 71 bb 6d 44 b1 09 a8 1d c5 83 01 92 43 9a fe
[ 6802.954631] skb tailroom: 000002e0: 79 e2 23 b8 02 ae ff 6d 57 04 7e 72 b9 7b 40 93
[ 6802.956005] skb tailroom: 000002f0: 27 52 db 6f fe 74 5d 92 53 bb f2 31 2f 4c 44 e9
[ 6802.957431] skb tailroom: 00000300: 69 46 e6 a7 a1 c1 af 47 77 9f 3a 19 0e f5 03 82
[ 6802.959112] skb tailroom: 00000310: b0 28 85 88 f6 56 fa 36 59 88 3d 66 89 d6 cc b8
[ 6802.960796] skb tailroom: 00000320: 4b de d6 12 66 2d 7b 4c f8 f4 7b 29 ba ef 91 db
[ 6802.962482] skb tailroom: 00000330: 48 52 e3 99 fe 70 c9 24 d2 75 dc 2b 2a 40 d4 96
[ 6802.964161] skb tailroom: 00000340: cd e8 ce 62 8d 38 04 1f ce b9 4e b1 bc 85 82 57
[ 6802.965838] skb tailroom: 00000350: 8b aa e1 72 ed a8 cc 10 ce 24 df 15 21 36 73 57
[ 6802.967522] skb tailroom: 00000360: 98 ec 22 49 f2 e9 02 4b 0a e4 b2 bc a8 bf 9c 63
[ 6802.969216] skb tailroom: 00000370: b9 f6 81 b8 18 c8 8a 6d 02 b0 14 ef d2 28 c6 0a
[ 6802.970892] skb tailroom: 00000380: 8a 14 69 54 24 e7 32 f8 78 4c 8e 44 e8 21 3a 78
[ 6802.972577] skb tailroom: 00000390: 85 97 d2 cd ae 65 19 d0 80 24 35 c8 e4 58 f7 83
[ 6802.974277] skb tailroom: 000003a0: e9 25 a1 1b c7 8d dd 36 f8 6f 7c 87 6a 8d 4a d8
[ 6802.975977] skb tailroom: 000003b0: 44 91 ca 5f ef 99 97 bf ee 56 7d 23 22 9a 9d 39
[ 6802.977656] skb tailroom: 000003c0: 62 ed cf 03 06 45 e2 de 42 59 1f d5 4e bc 46 3c
[ 6802.979343] skb tailroom: 000003d0: c3 19 e2 1c fd 8f f6 a0 9f b2 b3 e2 fc 95 93 e7
[ 6802.981033] skb tailroom: 000003e0: f7 6f c3 9b ff 6b d6 93 74 0c 3c f7 64 5b 70 1b
[ 6802.982727] skb tailroom: 000003f0: 86 43 46 72 5c 2f 19 69 39 0e d6 9c ba f2 21 c4
[ 6802.984427] skb tailroom: 00000400: 34 a5 8e 5c cf 82 03 5b 52 dc 50 9c b0 86 b2 92
[ 6802.986112] skb tailroom: 00000410: d7 fd f1 27 85 a6 e8 b9 26 70 e2 e5 19 8f b6 5f
[ 6802.987820] skb tailroom: 00000420: 20 1f bb 26 0f c9 a6 15 46 06 3d 26 a9 60 6f b6
[ 6802.989524] skb tailroom: 00000430: 64 ff 25 34 22 fc a5 66 84 f6 6d 03 c6 8a 92 10
[ 6802.991230] skb tailroom: 00000440: a2 2e 2d a6 62 6e 19 57 35 f7 25 3b 0e 85 5d e0
[ 6802.992931] skb tailroom: 00000450: f8 77 04 32 84 eb 42 da 6c d4 bb 3b 89 65 74 2a
[ 6802.994617] skb tailroom: 00000460: 5d 6d 49 f5 64 7a 29 fd 30 16 a3 ed 94 1e 4a f6
[ 6802.996313] skb tailroom: 00000470: fe ce 21 6c 5f 1b f7 58 5a bf 11 2f 56 85 f3 db
[ 6802.998002] skb tailroom: 00000480: 82 77 c5 19 e2 8e 28 09 aa c7 8b 8f fb 92 aa 28
[ 6802.999708] skb tailroom: 00000490: 05 1f 2e b5 eb 42 e6 1e 4c 56 ca 4f 42 32 36 2c
[ 6803.001436] skb tailroom: 000004a0: 2b b3 1b d6 df 0f a8 cc 55 29 29 ae d4 b3 1b 62
[ 6803.003127] skb tailroom: 000004b0: a6 aa b6 ff 77 f4 4b 6b cd c1 3a 88 49 0e fd 39
[ 6803.004817] skb tailroom: 000004c0: 4a d1 30 ab 22 be 7a 65 4c f1 b7 bc 49 86 ed d9
[ 6803.006511] skb tailroom: 000004d0: 52 ed a5 51 7f d0 00 51 78 e9 4a 1f a3 c1 4e 5c
[ 6803.008214] skb tailroom: 000004e0: ff 6d 25 cf d0 15 44 c2 f7 4b bb 4f c8 d3 fd 89
[ 6803.009896] skb tailroom: 000004f0: c7 1d 76 c5 6f dc 6a 40 d1 a5 ad d7 f2 95 5c 1b
[ 6803.011594] skb tailroom: 00000500: 0f cf 42 21 38 9d a3 e2 26 a2 54 d1 12 f8 92 1f
[ 6803.013288] skb tailroom: 00000510: b7 04 15 26 a9 ec 7d d4 65 72 f6 18 63 7b 4d a6
[ 6803.014981] skb tailroom: 00000520: 06 7b 7f 2e 43 b2 da 4c 55 55 3f c6 c7 b4 37 6b
[ 6803.016668] skb tailroom: 00000530: d5 09 ff b7 bc 7f a2 9e 1a 49 35 98 cb 19 41 e8
[ 6803.018364] skb tailroom: 00000540: 74 21 2d 94 38 fc 3b 78 15 ec 05 91 c5 aa 8f e0
[ 6803.020056] skb tailroom: 00000550: 0e 41 26 1d bb 6c 59 88 9d 15 30 78 17 32 73 1c
[ 6803.021745] skb tailroom: 00000560: e3 be d2 3f a7 a4 de 06 2f 88 86 0e 70 f1 ae 67
[ 6803.023449] skb tailroom: 00000570: a0 d0 cd b6 be be af 2a a7 df db 8e ae 71 b1 fb
[ 6803.025139] skb tailroom: 00000580: 5c 87 aa cd d8 8f 26 f2 63 52 ff 2b 48 6f 70 bd
[ 6803.026831] skb tailroom: 00000590: ec 99 f8 49 45 2f 94 1f 68 54 08 3d f7 c4 e2 0f
[ 6803.028522] skb tailroom: 000005a0: 1e 02 2c 21 3b d0 93 f1 3a e1 5d df df ef 84 86
[ 6803.030205] skb tailroom: 000005b0: 00 fc a9 72 4f 56 f9 f8 bf 5e 14 d8 8c 1a af 3d
[ 6803.031894] skb tailroom: 000005c0: 1b 1c 4a 4d dc 48 ef 65 2e 74 c0 63 35 25 61 87
[ 6803.033573] skb tailroom: 000005d0: 48 00 78 bc 31 b4 fa dc e6 c8 c1 aa 37 e2 8a 38
[ 6803.035215] skb tailroom: 000005e0: 19 7a 42 23 24 f1 d7 f5 72 1d 6f 08 d8 28 7c 43
[ 6803.036593] skb tailroom: 000005f0: b3 e6 aa 4b d2 af 66 8f 45 46 cd a2 fa 20 4b 08
[ 6803.037971] skb tailroom: 00000600: 7a dd 90 7f 94 11 b6 b9 60 96 58 4d bf 17 05 31
[ 6803.039347] skb tailroom: 00000610: be f2 57 1d da f0 21 a9 27 70 88 50 cc 2e cd 64
[ 6803.040713] skb tailroom: 00000620: ed a5 75 40 21 80 f2 64 e5 d4 ae ed 90 e7 1e bb
[ 6803.042070] skb tailroom: 00000630: ca 8b 8c 37 32 07 5b 2e b9 eb 79 23 aa a8 eb 2c
[ 6803.043441] skb tailroom: 00000640: f6 ef e0 8d e2 dd 6b 13 af f6 51 69 f2 fe 92 19
[ 6803.044806] skb tailroom: 00000650: 42 81 8c 21 08 75 e0 de f5 93 9a 74 30 68 b4 86
[ 6803.046167] skb tailroom: 00000660: db df b5 3d ce f0 6a 67 52 a3 34 6f e9 b4 bc cd
[ 6803.047546] skb tailroom: 00000670: 81 8c ac a5 f5 9b 79 44 b5 3e 7f 3e 86 47 94 17
[ 6803.048897] skb tailroom: 00000680: ca 18 de 36 52 49 1e d2 78 a1 4d 86 e4 bf 5b 1f
[ 6803.050256] skb tailroom: 00000690: 5b 91 e6 2b d9 fd 4a 53 55 7a 51 d6 1b 54 d2 ea
[ 6803.051640] skb tailroom: 000006a0: a5 17 e4 63 5d 54 03 d0 be 1c fb 2e 9e c8 6c 91
[ 6803.052974] skb tailroom: 000006b0: d0 81 97 b4 85 9c 0d 54 88 e6 50 8d 33 4d e1 23
[ 6803.054353] skb tailroom: 000006c0: d6 7d 24 57 20 07 5b be bc 7e f1 20 0c ce b5 7f
[ 6803.055737] skb tailroom: 000006d0: 4c 02 ce 4d 1c 74 72 dd ca bc 6b 79 03 d4 56 31
[ 6803.057122] skb tailroom: 000006e0: 1b 1f 8f 07 50 e1 d1 e3 e5 8b 3d a7 39 ad 58 c5
[ 6803.058785] skb tailroom: 000006f0: 6f 5d 98 0c ba 7d d3 ee 88 ca 06 50 ea 42 4f ef
[ 6803.060506] skb tailroom: 00000700: a7 66 4f 58 9f 13 43 a8 64 55 bb be 49 ac 0b b3
[ 6803.062188] skb tailroom: 00000710: ab d8 13 ef 09 b3 5e e8 58 ed 63 85 cc a9 a8 e3
[ 6803.063887] skb tailroom: 00000720: d6 af bd 1d 21 20 41 75 63 e4 e2 58 43 a8 a6 23
[ 6803.065594] skb tailroom: 00000730: b0 10 6b fb c5 61 de 19 91 98 c8 3c 5e 4a 9b eb
[ 6803.067286] skb tailroom: 00000740: 07 a4 3d 35 45 b1 b6 d2 93 b0 f7 ab 61 fe f1 34
[ 6803.068996] skb tailroom: 00000750: ec b6 f4 72 38 eb 4a 98 8a 8e 8f 17 b8 ca 03 b9
[ 6803.070698] skb tailroom: 00000760: e3 a4 4e 19 93 e8 35 16 88 c6 69 c3 9a 32 ff ce
[ 6803.072415] skb tailroom: 00000770: 60 41 14 d4 86 94 28 ce 7a a5 51 c1 6b 8a a6 b0
[ 6803.074125] skb tailroom: 00000780: 41 ea d2 32 ce cc 06 14 12 31 7d 5c 44 94 7e 9a
[ 6803.075821] skb tailroom: 00000790: 6d ce 50 25 51 c8 77 f3 9d a4 79 8c 9d 28 db 4c
[ 6803.077506] skb tailroom: 000007a0: 76 9a d5 f4 e5 27 c1 b5 3c 9a df 60 44 93 22 3e
[ 6803.079187] skb tailroom: 000007b0: 67 3a bd f1 91 bb 19 55 28 09 25 ab f9 26 43 62
[ 6803.080885] skb tailroom: 000007c0: ef 3c 09 6b 6c 89 25 14 b3 ac c8 af 72 11 96 6c
[ 6803.082568] skb tailroom: 000007d0: 38 45 69 ac fe 57 63 8d c9 ee ad dd cb 7c be 88
[ 6803.084258] skb tailroom: 000007e0: 68 8f c3 23 af 81 08 b7 16 0a 16 a7 40 65 d1 86
[ 6803.085939] skb tailroom: 000007f0: 9b ca c3 44 59 a7 76 90 62 f6 3b 51 a7 54 f8 0f
[ 6803.087631] skb tailroom: 00000800: 74 71 30 d6 95 b6 b2 be fe dd 0a 4a 23 07 b8 e2
[ 6803.089314] skb tailroom: 00000810: 46 ef 1a 65 60 14 6d 59 d0 61 52 6a b1 d2 7c 40
[ 6803.091004] skb tailroom: 00000820: b3 89 7d ec 9b 20 83 5c ba f1 d7 96 91 4f 90 67
[ 6803.092689] skb tailroom: 00000830: 7c a3 7e a0 2e c7 a1 a3 97 50 57 76 21 07 94 e2
[ 6803.094371] skb tailroom: 00000840: 20 b2 bf 59 14 7d f5 18 8b 80 be 24 74 6e 47 81
[ 6803.096060] skb tailroom: 00000850: 6f 6a 15 78 dc f9 bb f9 3a d0 a3 6c d1 be 25 b3
[ 6803.097739] skb tailroom: 00000860: ee 7f 98 67 8e e6 43 da 50 83 bd 4a c7 f5 42 c9
[ 6803.099428] skb tailroom: 00000870: 8f d0 30 be 8b f4 c9 89 67 46 de 1d 5b 90 dc 84
[ 6803.101118] skb tailroom: 00000880: c3 33 86 21 b7 06 c6 0d 87 b9 9a b6 d0 bb 92 93
[ 6803.102804] skb tailroom: 00000890: 69 92 b0 b6 de 9c e2 e0 f1 09 27 f0 f3 5d d5 6a
[ 6803.104506] skb tailroom: 000008a0: bb 8b 45 57 9e 1d 0e e6 75 00 11 31 c2 35 6c d9
[ 6803.106195] skb tailroom: 000008b0: e8 96 66 2d d6 1b 5a 07 f5 d7 c0 6d fd 97 76 31
[ 6803.107885] skb tailroom: 000008c0: 63 e5 f9 a3 f5 89 86 bc 40 4b a5 da 8f 4a 50 a1
[ 6803.109563] skb tailroom: 000008d0: 2b 76 e3 78 08 90 73 58 45 fa 5e a2 c2 a2 4b fb
[ 6803.111282] skb tailroom: 000008e0: f2 cb 6e d3 76 ea b8 c8 6c 36 5d 7d c6 c5 f2 7f
[ 6803.112976] skb tailroom: 000008f0: ab 58 4f 2a 8d f7 43 16 94 15 46 bf dc 8f 23 1f
[ 6803.114662] skb tailroom: 00000900: 2a 58 83 7a ba e4 25 5f 4d dd 88 4f b6 b5 88 f3
[ 6803.116344] skb tailroom: 00000910: 8d 51 2c e5 61 d8 aa e8 a9 22 32 95 68 dc 17 fb
[ 6803.117990] skb tailroom: 00000920: bf 24 04 c4 63 3b 30 1d cf c6 6e b6 05 a8 36 e1
[ 6803.119664] skb tailroom: 00000930: 23 e1 56 2c 55 72 cc 0c ac 46 4e 9d 67 18 9c 9d
[ 6803.121349] skb tailroom: 00000940: 93 e0 2a fc 17 fd 0d 79 63 fa 3f e6 ee 27 d4 4c
[ 6803.123045] skb tailroom: 00000950: 44 31 63 e3 92 f2 a3 52 43 a6 a9 10 c0 cb d7 40
[ 6803.124730] skb tailroom: 00000960: e8 34 64 2d ff a9 f8 02 b3 9a fd 73 32 a5 d0 9c
[ 6803.126420] skb tailroom: 00000970: e0 da 58 7b d6 9c 1e f7 95 d9 ba 2d 52 60 40 f9
[ 6803.128109] skb tailroom: 00000980: 1d 33 f4 e2 0d 88 17 0f 2f c3 d7 a3 31 33 5e e3
[ 6803.129796] skb tailroom: 00000990: 3f 33 15 e0 06 22 45 7c 4c 4e 2f 0e 49 e1 a4 79
[ 6803.131480] skb tailroom: 000009a0: b2 a2 13 a8 41 e5 e8 a1 f3 bc 76 b3 f7 15 6a f0
[ 6803.133167] skb tailroom: 000009b0: 46 9b b8 b8 cc 01 22 40 c5 ea 86 2a 33 bd 09 51
[ 6803.134847] skb tailroom: 000009c0: c4 2a a1 8f 90 51 86 01 a1 f0 af 2a 47 22 59 59
[ 6803.136519] skb tailroom: 000009d0: 72 d1 cb 85 9f a5 1b 37 15 de 7b f2 90 67 f8 ce
[ 6803.138199] skb tailroom: 000009e0: dd d2 0a 98 11 18 4c 16 53 70 36 11 c2 f4 42 e7
[ 6803.139875] skb tailroom: 000009f0: d7 ef bd e4 02 82 66 0d 09 d4 4c 0c 56 2e af 82
[ 6803.141550] skb tailroom: 00000a00: 47 39 ac 8f 99 9c 93 b5 a0 1b e7 d5 8a 66 b1 15
[ 6803.143231] skb tailroom: 00000a10: 6a f0 46 ed fa f8 e0 01 22 40 d8 2e 2a d5 c8 cd
[ 6803.144908] skb tailroom: 00000a20: a6 98 40 19 bb 38 fb c8 ec a6 e1 7f 24 24 b0 f3
[ 6803.146596] skb tailroom: 00000a30: fd 17 53 2b 20 52 2f aa e7 88 e1 96 7c 64 ea 6e
[ 6803.148274] skb tailroom: 00000a40: 3c 67 96 b2 0b 64 77 2f c2 14 aa ef 0b 77 8a 7c
[ 6803.149952] skb tailroom: 00000a50: ec 4d a5 c2 86 fd 06 2e 33 00 09 6a 4c 15 fe c1
[ 6803.151634] skb tailroom: 00000a60: 04 16 e3 59 cc 1d db 42 4c 69 16 6b 26 71 f2 51
[ 6803.153309] skb tailroom: 00000a70: 48 15 6a 08 a0 c8 a9 a1 61 f0 3e 3a d7 0b 21 29
[ 6803.154992] skb tailroom: 00000a80: db ec 7b 0b 26 a5 4b fc de f9 a3 97 92 ef 2c be
[ 6803.156667] skb tailroom: 00000a90: 2e 57 17 4b b4 5a 26 5d bd a8 b0 09 5f f3 ba 85
[ 6803.158350] skb tailroom: 00000aa0: bf 66 38 b5 5b 18 a3 59 6b 78 88 de 45 46 36 75
[ 6803.160029] skb tailroom: 00000ab0: 68 b6 cf 95 b5 1d 3d c5 87 42 8d 24 4c 80 0f f4
[ 6803.161703] skb tailroom: 00000ac0: 78 97 d1 9d cd f7 e2 0c f3 08 c9 93 e3 04 e6 ea
[ 6803.163385] skb tailroom: 00000ad0: 93 15 6a f0 46 a7 ba f1 be 01 22 40 53 42 80 38
[ 6803.165085] skb tailroom: 00000ae0: 72 ba bf 90 9f 3b 79 6c b1 7c a9 72 ff ba 36 b6
[ 6803.166772] skb tailroom: 00000af0: 3d 09 a4 74 02 23 a1 ff 2f b0 86 01 b6 3a b0 78
[ 6803.168450] skb tailroom: 00000b00: 27 2f b9 6f 94 2c fc 4f 53 d8 5e a7 f7 49 32 25
[ 6803.170128] skb tailroom: 00000b10: f2 26 8a 0a ab 81 14 72 fb c1 3d 02 09 d4 4c 0c
[ 6803.171805] skb tailroom: 00000b20: 55 c1 8d 9c 66 89 b8 cd 6f 77 5d d7 ec 46 33 1f
[ 6803.173480] skb tailroom: 00000b30: f6 62 e2 15 6a f0 46 84 ef c8 eb 01 22 40 59 8a
[ 6803.175159] skb tailroom: 00000b40: 55 e6 2b c5 dc ce 51 21 62 bc 7b 7f 17 20 89 b6
[ 6803.176830] skb tailroom: 00000b50: fd 28 4f 3f 36 b9 eb 17 ce 3b 8d 75 05 bc 62 40
[ 6803.178510] skb tailroom: 00000b60: 93 15 ac 4e ec 53 d2 13 8f 19 81 72 e0 24 4f 51
[ 6803.180185] skb tailroom: 00000b70: e0 3f b9 a5 2f 8c c1 9b dc 0d 94 dc 7e 09 09 6a
[ 6803.181861] skb tailroom: 00000b80: 4c f6 78 3d 8f b3 0e 28 30 81 c1 63 98 29 3f 48
[ 6803.183543] skb tailroom: 00000b90: 2d ca 0e 91 2d 15 6a 08 8f de f0 41 e6 f0 3e 7c
[ 6803.185217] skb tailroom: 00000ba0: 7f 42 3d b2 ff fa b2 99 0f 41 38 5e bd 7f 78 5f
[ 6803.186897] skb tailroom: 00000bb0: fe bc 4c ac 04 56 5e 62 8a 83 a8 a0 ff 2a 29 49
[ 6803.188579] skb tailroom: 00000bc0: dc e6 61 8c 80 a7 63 de ea aa 77 95 88 17 5f 5a
[ 6803.190255] skb tailroom: 00000bd0: 45 c2 3a c3 66 e9 b2 59 a3 3c ba d1 4f 7d ad cc
[ 6803.191934] skb tailroom: 00000be0: 48 7c 5a a8 7e 52 03 c6 6e a3 5c 64 26 2f 57 6e
[ 6803.193620] skb tailroom: 00000bf0: dd 29 ba d9 26 02 18 f0 46 e2 e1 ef bc 01 22 40
[ 6803.195307] skb tailroom: 00000c00: 22 c0 fb ba 61 e5 7d 52 e4 1a ee 05 47 c0 de 56
[ 6803.196980] skb tailroom: 00000c10: 02 f6 4c f3 c1 d2 50 6c 94 64 f5 73 64 ed 44 b8
[ 6803.198658] skb tailroom: 00000c20: ae ed 48 e2 50 ee 5b d9 51 00 1e 5a 09 17 58 86
[ 6803.200331] skb tailroom: 00000c30: 9f 03 24 63 73 6f ef e9 fa e9 13 29 29 b6 f3 01
[ 6803.202009] skb tailroom: 00000c40: 09 d4 4c ce fe 7d 65 4b 52 3d ea 2a 9e d7 18 a5
[ 6803.203688] skb tailroom: 00000c50: 91 12 6c 74 17 16 89 15 d4 f0 46 ff de f9 9f 01
[ 6803.205361] skb tailroom: 00000c60: 22 40 8a 0f b9 68 83 2d c4 7c 9a 23 59 b2 d6 ed
[ 6803.207039] skb tailroom: 00000c70: 5b 8f 38 ee e0 fb 66 98 86 b4 5a 2d a3 05 8c 37
[ 6803.208712] skb tailroom: 00000c80: c5 57 80 d6 d4 87 17 61 19 1d ed 7f 44 d1 4b 77
[ 6803.210387] skb tailroom: 00000c90: af 42 07 dd b3 ce 55 b2 f7 a0 90 82 49 ba 05 95
[ 6803.212094] skb tailroom: 00000ca0: ac 04 09 6a 48 9f 8e c2 ef 58 1c e2 56 30 c8 19
[ 6803.213792] skb tailroom: 00000cb0: f1 9b 54 84 03 9e 74 8d f9 0a f0 46 bc c3 c6 dc
[ 6803.215498] skb tailroom: 00000cc0: 02 22 40 e7 de d1 f2 73 af 60 5d d7 c5 52 28 94
[ 6803.217204] skb tailroom: 00000cd0: be 47 a1 36 9f a2 94 73 dc 68 c2 90 e5 21 4c ef
[ 6803.218909] skb tailroom: 00000ce0: 9e a1 0b 66 a4 6d 62 97 95 2c
[ 6803.220290] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.63-flatcar #1
[ 6803.221783] Hardware name: Amazon EC2 m5.4xlarge/, BIOS 1.0 10/16/2017
[ 6803.223242] Call Trace:
[ 6803.223810]  <IRQ>
[ 6803.224286]  dump_stack_lvl+0x46/0x5e
[ 6803.225128]  __skb_checksum_complete+0xdd/0xf0
[ 6803.226135]  ? csum_block_add_ext+0x20/0x20
[ 6803.227085]  ? reqsk_fastopen_remove+0x190/0x190
[ 6803.228121]  tcp_rcv_established+0x496/0x6c0
[ 6803.229079]  tcp_v4_do_rcv+0x148/0x240
[ 6803.229919]  tcp_v4_rcv+0xdb8/0xf00
[ 6803.230707]  ? ip_rcv_finish_core.constprop.0+0x141/0x420
[ 6803.231899]  ip_protocol_deliver_rcu+0x33/0x200
[ 6803.232906]  ip_local_deliver_finish+0x44/0x60
[ 6803.233897]  __netif_receive_skb_one_core+0x8b/0xa0
[ 6803.234996]  process_backlog+0x96/0x160
[ 6803.235865]  __napi_poll+0x2a/0x150
[ 6803.236653]  net_rx_action+0x250/0x2a0
[ 6803.237501]  __do_softirq+0xcf/0x286
[ 6803.238310]  irq_exit_rcu+0x99/0xc0
[ 6803.239107]  common_interrupt+0x80/0xa0
[ 6803.239970]  </IRQ>
[ 6803.240457]  <TASK>
[ 6803.240944]  asm_common_interrupt+0x21/0x40
[ 6803.241885] RIP: 0010:native_safe_halt+0xb/0x10
[ 6803.242918] Code: 00 f0 80 48 02 20 48 8b 00 a8 08 75 c0 e9 7a ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 89 24 58 00 fb f4 <c3> cc cc cc cc 66 90 0f 00 2d 79 24 58 00 f4 c3 cc cc cc cc cc 0f
[ 6803.247014] RSP: 0018:ffffffff87c03e38 EFLAGS: 00000246
[ 6803.248177] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 00000000ffffffff
[ 6803.249749] RDX: ffff8b5d8da00000 RSI: ffff8b4ec1e9d000 RDI: ffff8b4ec1362000
[ 6803.251312] RBP: ffff8b4ec1e9d064 R08: ffffffff87dc0840 R09: 0000062fe041fc98
[ 6803.252878] R10: 00000000000000d6 R11: 0000000000000ed4 R12: 0000000000000001
[ 6803.254436] R13: ffffffff87dc08c0 R14: 0000000000000001 R15: 0000000000000000
[ 6803.256003]  acpi_safe_halt+0x1f/0x30
[ 6803.256825]  acpi_idle_enter+0xde/0x120
[ 6803.257682]  cpuidle_enter_state+0x89/0x350
[ 6803.258621]  cpuidle_enter+0x29/0x40
[ 6803.259418]  do_idle+0x1e9/0x280
[ 6803.260144]  cpu_startup_entry+0x19/0x20
[ 6803.261019]  start_kernel+0x691/0x6ba
[ 6803.261839]  secondary_startup_64_no_verify+0xc2/0xcb
[ 6803.262970]  </TASK>
[ 7074.485025] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 7074.486521] IPv6: ADDRCONF(NETDEV_CHANGE): cali320fca7df79: link becomes ready
[ 7074.604752] cali320fca7df79: Caught tx_queue_len zero misconfig
[ 7134.429044] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 7134.430313] IPv6: ADDRCONF(NETDEV_CHANGE): calib15a0690693: link becomes ready
[ 7134.546039] calib15a0690693: Caught tx_queue_len zero misconfig
[ 7194.464536] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 7194.465876] IPv6: ADDRCONF(NETDEV_CHANGE): cali9a841390603: link becomes ready
[ 7194.581372] cali9a841390603: Caught tx_queue_len zero misconfig
[ 7254.443045] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 7254.444238] IPv6: ADDRCONF(NETDEV_CHANGE): califfbbb29779b: link becomes ready
[ 7254.571444] califfbbb29779b: Caught tx_queue_len zero misconfig
[ 7264.736871] pci 0000:00:1e.0: [1d0f:8061] type 00 class 0x010802
[ 7264.738095] pci 0000:00:1e.0: reg 0x10: [mem 0x00000000-0x00003fff]
[ 7264.740240] pci 0000:00:1e.0: BAR 0: assigned [mem 0xc0004000-0xc0007fff]
[ 7264.741625] nvme nvme2: pci function 0000:00:1e.0
[ 7264.742590] nvme 0000:00:1e.0: enabling device (0000 -> 0002)
[ 7264.749773] nvme nvme2: 2/0/0 default/read/poll queues
[ 7266.492232] EXT4-fs (nvme2n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[ 7266.750104] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 7266.751300] IPv6: ADDRCONF(NETDEV_CHANGE): cali463da3abb6d: link becomes ready
[ 7266.868805] cali463da3abb6d: Caught tx_queue_len zero misconfig
[ 7744.088838] pci 0000:00:1e.0: [1d0f:8061] type 00 class 0x010802
[ 7744.090192] pci 0000:00:1e.0: reg 0x10: [mem 0x00000000-0x00003fff]
[ 7744.092261] pci 0000:00:1e.0: BAR 0: assigned [mem 0xc0004000-0xc0007fff]
[ 7744.093560] nvme nvme2: pci function 0000:00:1e.0
[ 7744.094472] nvme 0000:00:1e.0: enabling device (0000 -> 0002)
[ 7744.101861] nvme nvme2: 2/0/0 default/read/poll queues
[ 7744.303017] pci 0000:00:1d.0: [1d0f:8061] type 00 class 0x010802
[ 7744.304229] pci 0000:00:1d.0: reg 0x10: [mem 0x00000000-0x00003fff]
[ 7744.306344] pci 0000:00:1d.0: BAR 0: assigned [mem 0xc0008000-0xc000bfff]
[ 7744.307817] nvme nvme3: pci function 0000:00:1d.0
[ 7744.308713] nvme 0000:00:1d.0: enabling device (0000 -> 0002)
[ 7744.315784] nvme nvme3: 2/0/0 default/read/poll queues
[ 7752.950471] EXT4-fs (nvme3n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[ 7752.975055] EXT4-fs (nvme2n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[ 7753.408247] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 7753.409443] IPv6: ADDRCONF(NETDEV_CHANGE): cali567ced57124: link becomes ready
[ 7753.529160] cali567ced57124: Caught tx_queue_len zero misconfig
[ 8073.447239] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 8073.448418] IPv6: ADDRCONF(NETDEV_CHANGE): cali9ca0ec69921: link becomes ready
[ 8073.574736] cali9ca0ec69921: Caught tx_queue_len zero misconfig
[10173.593686] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[10173.594940] IPv6: ADDRCONF(NETDEV_CHANGE): cali9632dfdcf6d: link becomes ready
[10173.721447] cali9632dfdcf6d: Caught tx_queue_len zero misconfig
[15161.728619] pci 0000:00:1c.0: [1d0f:8061] type 00 class 0x010802
[15161.729819] pci 0000:00:1c.0: reg 0x10: [mem 0x00000000-0x00003fff]
[15161.731911] pci 0000:00:1c.0: BAR 0: assigned [mem 0xc000c000-0xc000ffff]
[15161.733260] nvme nvme4: pci function 0000:00:1c.0
[15161.734198] nvme 0000:00:1c.0: enabling device (0000 -> 0002)
[15161.740658] nvme nvme4: 2/0/0 default/read/poll queues
[15163.264333] EXT4-fs (nvme4n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[15163.485730] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[15163.486986] IPv6: ADDRCONF(NETDEV_CHANGE): cali463da3abb6d: link becomes ready
[15163.602942] cali463da3abb6d: Caught tx_queue_len zero misconfig
[15361.757816] pci 0000:00:1b.0: [1d0f:8061] type 00 class 0x010802
[15361.759343] pci 0000:00:1b.0: reg 0x10: [mem 0x00000000-0x00003fff]
[15361.761998] pci 0000:00:1b.0: BAR 0: assigned [mem 0xc0010000-0xc0013fff]
[15361.763661] nvme nvme5: pci function 0000:00:1b.0
[15361.764815] nvme 0000:00:1b.0: enabling device (0000 -> 0002)
[15361.772616] nvme nvme5: 2/0/0 default/read/poll queues
[15363.388972] EXT4-fs (nvme5n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.

Here are CPU details via lscpu:

"lscpu" output
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
    CPU family:          6
    Model:               85
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            7
    BogoMIPS:            4999.99
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc c
                         puid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch in
                         vpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves id
                         a arat pku ospke
Virtualization features: 
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    8 MiB (8 instances)
  L3:                    35.8 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:         
  Itlb multihit:         KVM: Mitigation: VMX unsupported
  L1tf:                  Mitigation; PTE Inversion
  Mds:                   Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
  Retbleed:              Vulnerable
  Spec store bypass:     Vulnerable
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

On this machine, the pstore facility remains unavailable to us.

@seh
Author

seh commented Oct 10, 2022

Running with GRO and GSO enabled and the MTU for the eth0 interface back up at 9,000, this time using Flatcar Container Linux version 3227.2.2 and kernel version 5.15.63 atop the "z1d.12xlarge" EC2 instance type, the kernel bug still arises, but now we're getting more diagnostic output, per the following log fragment.

dmesg output
[Mon Oct 10 17:05:08 2022] calib2e12cb0dcc: Caught tx_queue_len zero misconfig
[Mon Oct 10 18:22:24 2022] ------------[ cut here ]------------
[Mon Oct 10 18:22:24 2022] kernel BUG at net/core/skbuff.c:4219!
[Mon Oct 10 18:22:24 2022] invalid opcode: 0000 [#1] SMP PTI
[Mon Oct 10 18:22:24 2022] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.15.63-flatcar #1
[Mon Oct 10 18:22:24 2022] Hardware name: Amazon EC2 z1d.12xlarge/, BIOS 1.0 10/16/2017
[Mon Oct 10 18:22:24 2022] RIP: 0010:skb_segment+0xc70/0xe80
[Mon Oct 10 18:22:24 2022] Code: 44 24 50 48 89 44 24 30 48 8b 44 24 10 48 89 44 24 50 e9 16 f7 ff ff 0f 0b 89 44 24 2c c7 44 24 4c 00 00 00 00 e9 44 fe ff ff <0f> 0b 0f 0b 0f 0b 41 8b 7d 74 85 ff 0f 85 91 01 00 00 49 8b 95 c0
[Mon Oct 10 18:22:24 2022] RSP: 0018:ffffa2d38c780838 EFLAGS: 00010246
[Mon Oct 10 18:22:24 2022] RAX: ffff8954dd8312c0 RBX: ffff89293fbde300 RCX: ffff8957bd3d2fa0
[Mon Oct 10 18:22:24 2022] RDX: 0000000000000000 RSI: ffff89293fbde2c0 RDI: ffffffffffffffff
[Mon Oct 10 18:22:24 2022] RBP: ffffa2d38c780908 R08: 0000000000009db6 R09: 0000000000000000
[Mon Oct 10 18:22:24 2022] R10: 000000000000a356 R11: 000000000000a31a R12: 000000000000000b
[Mon Oct 10 18:22:24 2022] R13: ffff892940566100 R14: 000000000000a31a R15: ffff891ad0e5c600
[Mon Oct 10 18:22:24 2022] FS:  0000000000000000(0000) GS:ffff8948b9b80000(0000) knlGS:0000000000000000
[Mon Oct 10 18:22:24 2022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Oct 10 18:22:24 2022] CR2: 000000c011faf000 CR3: 0000000d66a0a001 CR4: 00000000007706e0
[Mon Oct 10 18:22:24 2022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Mon Oct 10 18:22:24 2022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Mon Oct 10 18:22:24 2022] PKRU: 55555554
[Mon Oct 10 18:22:24 2022] Call Trace:
[Mon Oct 10 18:22:24 2022]  <IRQ>
[Mon Oct 10 18:22:24 2022]  ? csum_block_add_ext+0x20/0x20
[Mon Oct 10 18:22:24 2022]  ? reqsk_fastopen_remove+0x190/0x190
[Mon Oct 10 18:22:24 2022]  tcp_gso_segment+0xec/0x4e0
[Mon Oct 10 18:22:24 2022]  inet_gso_segment+0x15e/0x3e0
[Mon Oct 10 18:22:24 2022]  skb_mac_gso_segment+0x9c/0x110
[Mon Oct 10 18:22:24 2022]  __skb_gso_segment+0xb2/0x160
[Mon Oct 10 18:22:24 2022]  ? netif_skb_features+0x9c/0x2d0
[Mon Oct 10 18:22:24 2022]  validate_xmit_skb.constprop.0+0x139/0x2b0
[Mon Oct 10 18:22:24 2022]  validate_xmit_skb_list+0x41/0x70
[Mon Oct 10 18:22:24 2022]  sch_direct_xmit+0x11c/0x250
[Mon Oct 10 18:22:24 2022]  __dev_queue_xmit+0x8bd/0xb10
[Mon Oct 10 18:22:24 2022]  ip_finish_output2+0x277/0x550
[Mon Oct 10 18:22:24 2022]  ? ip_route_input_rcu+0x164/0x2d0
[Mon Oct 10 18:22:24 2022]  ? skb_gso_validate_network_len+0x11/0x80
[Mon Oct 10 18:22:24 2022]  ? __ip_finish_output+0xe9/0x1a0
[Mon Oct 10 18:22:24 2022]  ip_sublist_rcv_finish+0x6b/0x70
[Mon Oct 10 18:22:24 2022]  ip_sublist_rcv+0x16e/0x1f0
[Mon Oct 10 18:22:24 2022]  ? ip_sublist_rcv+0x1f0/0x1f0
[Mon Oct 10 18:22:24 2022]  ip_list_rcv+0xf8/0x120
[Mon Oct 10 18:22:24 2022]  __netif_receive_skb_list_core+0x24a/0x270
[Mon Oct 10 18:22:24 2022]  netif_receive_skb_list_internal+0x19f/0x2c0
[Mon Oct 10 18:22:24 2022]  ? inet_gro_complete+0xaf/0x100
[Mon Oct 10 18:22:24 2022]  napi_gro_complete.constprop.0.isra.0+0x112/0x170
[Mon Oct 10 18:22:24 2022]  dev_gro_receive+0x2d5/0x6a0
[Mon Oct 10 18:22:24 2022]  napi_gro_receive+0x62/0x1d0
[Mon Oct 10 18:22:24 2022]  0xffffffffc069d699
[Mon Oct 10 18:22:24 2022]  ? scheduler_tick+0xb8/0x230
[Mon Oct 10 18:22:24 2022]  __napi_poll+0x2a/0x150
[Mon Oct 10 18:22:24 2022]  net_rx_action+0x250/0x2a0
[Mon Oct 10 18:22:24 2022]  __do_softirq+0xcf/0x286
[Mon Oct 10 18:22:24 2022]  irq_exit_rcu+0x99/0xc0
[Mon Oct 10 18:22:24 2022]  common_interrupt+0x80/0xa0
[Mon Oct 10 18:22:24 2022]  </IRQ>
[Mon Oct 10 18:22:24 2022]  <TASK>
[Mon Oct 10 18:22:24 2022]  asm_common_interrupt+0x21/0x40
[Mon Oct 10 18:22:24 2022] RIP: 0010:cpuidle_enter_state+0xc7/0x350
[Mon Oct 10 18:22:24 2022] Code: 8b 3d f5 e1 9b 4d e8 08 bb a7 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 09 c9 a7 ff 45 84 ff 0f 85 fe 00 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 0a 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
[Mon Oct 10 18:22:24 2022] RSP: 0018:ffffa2d38c527ea8 EFLAGS: 00000246
[Mon Oct 10 18:22:24 2022] RAX: ffff8948b9bac100 RBX: 0000000000000003 RCX: 00000000ffffffff
[Mon Oct 10 18:22:24 2022] RDX: 0000000000000006 RSI: 0000000000000006 RDI: 0000000000000000
[Mon Oct 10 18:22:24 2022] RBP: ffff8948b9bb6000 R08: 0000043f38b90644 R09: 0000043f6c0b1df3
[Mon Oct 10 18:22:24 2022] R10: 0000000000000014 R11: 0000000000000008 R12: ffffffffb3bbd7e0
[Mon Oct 10 18:22:24 2022] R13: 0000043f38b90644 R14: 0000000000000003 R15: 0000000000000000
[Mon Oct 10 18:22:24 2022]  ? cpuidle_enter_state+0xb7/0x350
[Mon Oct 10 18:22:24 2022]  cpuidle_enter+0x29/0x40
[Mon Oct 10 18:22:24 2022]  do_idle+0x1e9/0x280
[Mon Oct 10 18:22:24 2022]  cpu_startup_entry+0x19/0x20
[Mon Oct 10 18:22:24 2022]  secondary_startup_64_no_verify+0xc2/0xcb
[Mon Oct 10 18:22:24 2022]  </TASK>
[Mon Oct 10 18:22:24 2022] Modules linked in: xt_CT ip_set_hash_net ip_set vxlan cls_bpf sch_ingress veth xt_comment xt_mark xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink nls_ascii nls_cp437 vfat fat mousedev intel_rapl_msr intel_rapl_common psmouse evdev i2c_piix4 i2c_core button sch_fq_codel fuse configfs ext4 crc16 mbcache jbd2 dm_verity dm_bufio aesni_intel nvme nvme_core libaes crypto_simd ena cryptd t10_pi crc_t10dif crct10dif_generic crct10dif_common btrfs blake2b_generic zstd_compress lzo_compress raid6_pq libcrc32c crc32c_generic crc32c_intel dm_mirror dm_region_hash dm_log dm_mod qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi br_netfilter bridge scsi_transport_iscsi stp llc overlay scsi_mod scsi_common
[Mon Oct 10 18:22:24 2022] ---[ end trace 86a2732b8f4d0b13 ]---
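
For comparing reports like the one above across kernel builds, where the skbuff.c line number shifts slightly, it can help to extract just the BUG location and the faulting symbol. The following is a minimal sketch, assuming dmesg text shaped like the fragment above; the helper name `summarize_bug` and the two regular expressions are illustrative, not part of any existing tool.

```python
import re

# "kernel BUG at <file>:<line>!" as printed by the kernel's BUG() handler.
BUG_AT = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
# "RIP: 0010:<symbol>+0x<offset>/0x<size>" names the faulting instruction.
RIP = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def summarize_bug(log_text):
    """Return the BUG file/line and the first RIP symbol that follows it."""
    bug = BUG_AT.search(log_text)
    if bug is None:
        return None
    rip = RIP.search(log_text, bug.end())
    return {
        "file": bug.group("file"),
        "line": int(bug.group("line")),
        "symbol": rip.group("symbol") if rip else None,
        "offset": int(rip.group("offset"), 16) if rip else None,
    }


sample = """\
[Mon Oct 10 18:22:24 2022] kernel BUG at net/core/skbuff.c:4219!
[Mon Oct 10 18:22:24 2022] invalid opcode: 0000 [#1] SMP PTI
[Mon Oct 10 18:22:24 2022] RIP: 0010:skb_segment+0xc70/0xe80
"""
print(summarize_bug(sample))
```

The symbol (here `skb_segment`) is usually a steadier key for grouping reports than the line number, which depends on the exact source revision.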

@jepio
Member

jepio commented Oct 11, 2022

Thanks @seh, these kinds of logs are enough to start a discussion on lkml. I'll start a thread. Just to be sure I have all the facts straight: this is using ENA?

@seh
Author

seh commented Oct 11, 2022

If by ENA you mean Elastic Network Adapter, then I think the answer is yes. We didn't do anything deliberate to choose that, but running modinfo ena shows that the module is installed.

@seh
Author

seh commented Oct 11, 2022

My colleague @nbourikas disabled panic upon softlockup and was able to capture a more detailed failure trace atop kernel version 5.15.70 and Calico version 3.21.5.

dmesg output
[Tue Oct 11 22:44:47 2022] ------------[ cut here ]------------
[Tue Oct 11 22:44:47 2022] kernel BUG at net/core/skbuff.c:4218!
[Tue Oct 11 22:44:47 2022] invalid opcode: 0000 [#1] SMP PTI
[Tue Oct 11 22:44:47 2022] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.15.70-flatcar #1
[Tue Oct 11 22:44:47 2022] Hardware name: Amazon EC2 z1d.12xlarge/, BIOS 1.0 10/16/2017
[Tue Oct 11 22:44:47 2022] RIP: 0010:skb_segment+0xc71/0xe50
[Tue Oct 11 22:44:47 2022] Code: ab 01 00 00 49 8b 97 c0 00 00 00 49 8b 8f c8 00 00 00 45 89 6f 70 48 29 d1 89 c8 44 01 e9 41 89 8f b8 00 00 00 e9 60 fe ff ff <0f> 0b 48 8b 5c 24 60 8b 7c 24 28 4c 89 7b 08 85 ff 0f 84 ac 00 00
[Tue Oct 11 22:44:47 2022] RSP: 0018:ffffb1f34c7ac818 EFLAGS: 00010246
[Tue Oct 11 22:44:47 2022] RAX: ffff9d5f890ff6c0 RBX: ffff9d61bafa6b00 RCX: 0000000000000000
[Tue Oct 11 22:44:47 2022] RDX: ffff9d5901bc6900 RSI: ffff9d61bafa6ac0 RDI: ffffffffffffffff
[Tue Oct 11 22:44:47 2022] RBP: ffffb1f34c7ac8e8 R08: 0000000000008196 R09: 0000000000000000
[Tue Oct 11 22:44:47 2022] R10: 0000000000008286 R11: 0000000000008290 R12: 0000000000000005
[Tue Oct 11 22:44:47 2022] R13: 0000000000008286 R14: ffff9d5901bc6c00 R15: ffff9d581730ac00
[Tue Oct 11 22:44:47 2022] FS:  0000000000000000(0000) GS:ffff9d85f9bc0000(0000) knlGS:0000000000000000
[Tue Oct 11 22:44:47 2022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Oct 11 22:44:47 2022] CR2: 000000c0149cd000 CR3: 0000004ef7c0a001 CR4: 00000000007706e0
[Tue Oct 11 22:44:47 2022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Tue Oct 11 22:44:47 2022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Tue Oct 11 22:44:47 2022] PKRU: 55555554
[Tue Oct 11 22:44:47 2022] Call Trace:
[Tue Oct 11 22:44:47 2022]  <IRQ>
[Tue Oct 11 22:44:47 2022]  ? csum_block_add_ext+0x20/0x20
[Tue Oct 11 22:44:47 2022]  ? reqsk_fastopen_remove+0x190/0x190
[Tue Oct 11 22:44:47 2022]  tcp_gso_segment+0xec/0x500
[Tue Oct 11 22:44:47 2022]  ? bpf_prog_2e6f5613f50238c5_calico_to_host_ep+0xa40/0x2cc8
[Tue Oct 11 22:44:47 2022]  inet_gso_segment+0x15e/0x3e0
[Tue Oct 11 22:44:47 2022]  skb_mac_gso_segment+0x9a/0x110
[Tue Oct 11 22:44:47 2022]  __skb_gso_segment+0xb2/0x160
[Tue Oct 11 22:44:47 2022]  ? netif_skb_features+0x9c/0x2d0
[Tue Oct 11 22:44:47 2022]  validate_xmit_skb.constprop.0+0x137/0x2b0
[Tue Oct 11 22:44:47 2022]  validate_xmit_skb_list+0x41/0x70
[Tue Oct 11 22:44:47 2022]  sch_direct_xmit+0x11c/0x250
[Tue Oct 11 22:44:47 2022]  __dev_queue_xmit+0x8f0/0xb70
[Tue Oct 11 22:44:47 2022]  ? nf_ct_deliver_cached_events+0x6c/0x90 [nf_conntrack]
[Tue Oct 11 22:44:47 2022]  ip_finish_output2+0x274/0x540
[Tue Oct 11 22:44:47 2022]  ? xt_compat_flush_offsets+0x14/0x70
[Tue Oct 11 22:44:47 2022]  ? skb_gso_validate_network_len+0x11/0x80
[Tue Oct 11 22:44:47 2022]  ? __ip_finish_output+0xe9/0x1a0
[Tue Oct 11 22:44:47 2022]  ip_sublist_rcv_finish+0x6b/0x70
[Tue Oct 11 22:44:47 2022]  ip_sublist_rcv+0x16e/0x1f0
[Tue Oct 11 22:44:47 2022]  ? ip_sublist_rcv+0x1f0/0x1f0
[Tue Oct 11 22:44:47 2022]  ip_list_rcv+0xf8/0x120
[Tue Oct 11 22:44:47 2022]  __netif_receive_skb_list_core+0x224/0x250
[Tue Oct 11 22:44:47 2022]  netif_receive_skb_list_internal+0x194/0x2b0
[Tue Oct 11 22:44:47 2022]  ? inet_gro_complete+0xae/0xf0
[Tue Oct 11 22:44:47 2022]  napi_gro_complete.constprop.0.isra.0+0x112/0x170
[Tue Oct 11 22:44:47 2022]  dev_gro_receive+0x2d2/0x690
[Tue Oct 11 22:44:47 2022]  napi_gro_receive+0x62/0x1d0
[Tue Oct 11 22:44:47 2022]  0xffffffffc0497687
[Tue Oct 11 22:44:47 2022]  ? ip_local_deliver_finish+0x49/0x60
[Tue Oct 11 22:44:47 2022]  ? __netif_receive_skb_one_core+0x8b/0xa0
[Tue Oct 11 22:44:47 2022]  __napi_poll+0x2a/0x150
[Tue Oct 11 22:44:47 2022]  net_rx_action+0x250/0x2a0
[Tue Oct 11 22:44:47 2022]  __do_softirq+0xd0/0x286
[Tue Oct 11 22:44:47 2022]  irq_exit_rcu+0x99/0xc0
[Tue Oct 11 22:44:47 2022]  common_interrupt+0x80/0xa0
[Tue Oct 11 22:44:47 2022]  </IRQ>
[Tue Oct 11 22:44:47 2022]  <TASK>
[Tue Oct 11 22:44:47 2022]  asm_common_interrupt+0x22/0x40
[Tue Oct 11 22:44:47 2022] RIP: 0010:cpuidle_enter_state+0xc7/0x350
[Tue Oct 11 22:44:47 2022] Code: 8b 3d 05 5a 9c 5e e8 38 20 a8 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 39 2e a8 ff 45 84 ff 0f 85 fe 00 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 0a 01 00 00 49 63 d6 4c 2b 2c 24 48 8d 04 52 48 8d
[Tue Oct 11 22:44:47 2022] RSP: 0018:ffffb1f34c52fea8 EFLAGS: 00000246
[Tue Oct 11 22:44:47 2022] RAX: ffff9d85f9bec100 RBX: 0000000000000003 RCX: 00000000ffffffff
[Tue Oct 11 22:44:47 2022] RDX: 0000000000000006 RSI: ffffffffa62c41c0 RDI: 0000000000000000
[Tue Oct 11 22:44:47 2022] RBP: ffff9d85f9bf6000 R08: 0000021df8b89a47 R09: 0000021e84f874f3
[Tue Oct 11 22:44:47 2022] R10: 0000000000000017 R11: 000000000000000b R12: ffffffffa2dbd720
[Tue Oct 11 22:44:47 2022] R13: 0000021df8b89a47 R14: 0000000000000003 R15: 0000000000000000
[Tue Oct 11 22:44:47 2022]  ? cpuidle_enter_state+0xb7/0x350
[Tue Oct 11 22:44:47 2022]  cpuidle_enter+0x29/0x40
[Tue Oct 11 22:44:47 2022]  do_idle+0x1e0/0x270
[Tue Oct 11 22:44:47 2022]  cpu_startup_entry+0x19/0x20
[Tue Oct 11 22:44:47 2022]  secondary_startup_64_no_verify+0xc2/0xcb
[Tue Oct 11 22:44:47 2022]  </TASK>
[Tue Oct 11 22:44:47 2022] Modules linked in: xt_CT ip_set_hash_net ip_set vxlan cls_bpf sch_ingress veth xt_comment xt_mark xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink nls_ascii nls_cp437 vfat fat mousedev intel_rapl_msr intel_rapl_common psmouse evdev i2c_piix4 i2c_core button sch_fq_codel fuse configfs ext4 crc16 mbcache jbd2 dm_verity dm_bufio nvme aesni_intel nvme_core libaes ena crypto_simd cryptd t10_pi crc_t10dif crct10dif_generic crct10dif_common btrfs blake2b_generic xor zstd_compress lzo_compress raid6_pq libcrc32c crc32c_generic crc32c_intel dm_mirror dm_region_hash dm_log dm_mod qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi br_netfilter bridge scsi_transport_iscsi stp llc overlay scsi_mod scsi_common
[Tue Oct 11 22:44:47 2022] ---[ end trace 4c1f3c2045158b27 ]---
[Tue Oct 11 22:44:47 2022] RIP: 0010:skb_segment+0xc71/0xe50
[Tue Oct 11 22:44:47 2022] Code: ab 01 00 00 49 8b 97 c0 00 00 00 49 8b 8f c8 00 00 00 45 89 6f 70 48 29 d1 89 c8 44 01 e9 41 89 8f b8 00 00 00 e9 60 fe ff ff <0f> 0b 48 8b 5c 24 60 8b 7c 24 28 4c 89 7b 08 85 ff 0f 84 ac 00 00

@tomastigera and @fasaxc, note this frame in that call stack, in between tcp_gso_segment and inet_gso_segment:

bpf_prog_2e6f5613f50238c5_calico_to_host_ep+0xa40/0x2cc8
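
To spot frames like this one in other traces, a small parser can pull out every JIT-compiled BPF program symbol. This is a sketch under the assumption that the symbol has the `bpf_prog_<16 hex digit tag>_<name>+0x<offset>/0x<size>` shape seen above; unnamed programs, which lack the `_<name>` suffix, are not matched. The function name `find_bpf_frames` is hypothetical.

```python
import re

# Matches BPF program symbols as they appear in kernel call traces.
BPF_FRAME = re.compile(
    r"(?P<unreliable>\? )?bpf_prog_(?P<tag>[0-9a-f]{16})_(?P<name>\w+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def find_bpf_frames(trace_text):
    """Return one dict per BPF program frame found in a dmesg call trace."""
    frames = []
    for line in trace_text.splitlines():
        match = BPF_FRAME.search(line)
        if match:
            frames.append({
                "name": match.group("name"),
                "tag": match.group("tag"),
                "offset": int(match.group("offset"), 16),
                "size": int(match.group("size"), 16),
                # A leading "? " marks an address found on the stack that the
                # unwinder could not confirm as part of the call chain.
                "reliable": match.group("unreliable") is None,
            })
    return frames


sample = """\
[Tue Oct 11 22:44:47 2022]  tcp_gso_segment+0xec/0x500
[Tue Oct 11 22:44:47 2022]  ? bpf_prog_2e6f5613f50238c5_calico_to_host_ep+0xa40/0x2cc8
[Tue Oct 11 22:44:47 2022]  inet_gso_segment+0x15e/0x3e0
"""
print(find_bpf_frames(sample))
```

Note that the frame in this trace carries the `?` prefix, so it shows that the Calico program ran recently on this stack rather than proving it is a direct caller of `tcp_gso_segment`.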

@jepio
Member

jepio commented Oct 13, 2022

@seh, would you be able to test with Calico 3.23? This PR https://github.com/projectcalico/calico/pull/5753/files makes Calico stop changing gso_size on VXLAN decapsulation, which lkml suggests might be the cause (https://lore.kernel.org/netdev/194f6b02-8ee7-b5d7-58f3-6a83b5ff275d@gmail.com/).

@seh
Author

seh commented Oct 13, 2022

Thank you for the suggestion. Yes, we've been testing Calico version 3.23.3 over the last couple of days together with Flatcar's beta version 3346.1.0. So far, we haven't been hitting this kernel bug. I'll have more confidence after another day or two of testing.

@seh
Author

seh commented Oct 18, 2022

Apparently our testing did not tell the full story. It was a late one last night.

We've now seen this same kernel failure occur using Calico 3.23.2 with Flatcar Container Linux 3346.1.0 (kernel version 5.15.70) and Ubuntu 22.04.1 ("Jammy Jellyfish") (kernel version 5.15.0). The line number in file skbuff.c moves by one from 4218 to 4217 in the Ubuntu image. Disabling GRO and GSO again alleviates the rebooting problem for the moment, still at great cost for network performance.

That confirms for us that the problem is not specific to Flatcar Container Linux, but it does seem to be related to Calico's eBPF data plane.
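
For anyone who wants to try the same workaround, the sketch below builds the `ethtool -K` invocation that turns GRO and GSO off (or back on) for one interface. The thread does not say which tool was used here, so treating `ethtool` as the mechanism is an assumption, and the helper names are hypothetical. It defaults to a dry run that only prints the command.

```python
import subprocess


def offload_command(interface, enabled):
    """Build the ethtool invocation that toggles GRO and GSO together."""
    state = "on" if enabled else "off"
    return ["ethtool", "-K", interface, "gro", state, "gso", state]


def set_offloads(interface, enabled, dry_run=True):
    """Print the command, and run it only when dry_run is False (needs root)."""
    command = offload_command(interface, enabled)
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)
    return command


# Dry run: shows what would disable both offloads on eth0.
set_offloads("eth0", enabled=False)
```

A setting applied this way does not persist across reboots, so it would have to be reapplied at boot by whatever mechanism the machine image uses.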

@vojtechDB

Same issue: kernel BUG at net/core/skbuff.c:4082 on Red Hat Enterprise Linux release 8.6 (Ootpa) with kernel 4.18.0-372.26.1.el8_6.x86_64, with Calico's eBPF data plane enabled. I agree with you, @seh.

@seh
Author

seh commented Oct 21, 2022

Just to make sure we're following along on this side, did you all see Jiri's candidate patch that he mentioned in projectcalico/calico#6865 (comment)?

@vojtechDB

With Jiri's candidate patch, I'm no longer able to reproduce the issue (kernel 4.18.0-372.26.1.el8_6.BZ_2136229_test_V1.x86_64).

@pothos
Member

pothos commented Oct 31, 2022

@jepio has built patched images: https://bincache.flatcar-linux.net/images/amd64/3346.1.99+issue-378-fix/ and @seh is testing them. This may also be of interest to others following this issue.

@seh
Author

seh commented Oct 31, 2022

So far, after five hours running with both GRO and GSO enabled, the machine (EC2 instance of type "z1d.12xlarge") has not crashed yet. Another machine running Flatcar Container Linux beta version 3346.1.0 and the same configuration otherwise (same EC2 instance type, same AZ, same workload) fails at least twice every hour.
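
One rough way to read that comparison is to ask how likely five crash-free hours would be if the patch had changed nothing. The sketch below assumes crashes arrive as a Poisson process at the unpatched machine's observed rate of about two per hour; that model is an assumption made for illustration, not something measured in this thread.

```python
import math


def prob_no_crash(rate_per_hour, hours):
    """Poisson probability of zero events in the given time window."""
    return math.exp(-rate_per_hour * hours)


# Two crashes per hour on the unpatched machine, five hours on the patched one.
p = prob_no_crash(2.0, 5.0)
print(f"{p:.2e}")  # prints 4.54e-05
```

Since the unpatched machine fails at least twice an hour, this figure is an upper bound under that model, which is why five quiet hours already count as fairly strong evidence.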

@jepio
Member

jepio commented Nov 8, 2022

The patch is queued up in netdev/next. As soon as it lands in Linus' tree, it can be submitted to stable.
https://lore.kernel.org/netdev/166753501670.4086.1819802414418539212.git-patchwork-notify@kernel.org/#t

@seh
Author

seh commented Nov 16, 2022

I see that the patch is present along Linux's "master" branch and is tagged with "v6.1-rc5" as of three days ago.

@jepio
Member

jepio commented Nov 25, 2022

This patch is in 5.15.79, which is in beta as of yesterday (3417.1.0).

@seh, want to verify and then we'll close this issue at last?
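
For checking a fleet against that version, a sketch like the following compares a kernel release string with 5.15.79. It only covers the 5.15 stable series named in this comment; distribution kernels that backport fixes (such as the RHEL and Ubuntu kernels mentioned earlier in this thread) cannot be judged by version number alone, so the function refuses other series. The name `has_fix_5_15` is hypothetical.

```python
import re

FIXED_5_15_PATCHLEVEL = 79  # per the comment above: the patch is in 5.15.79


def has_fix_5_15(release):
    """Return True if a 5.15.y release string is at or past 5.15.79."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) != (5, 15):
        raise ValueError("this sketch only covers the 5.15 stable series")
    return patch >= FIXED_5_15_PATCHLEVEL


for release in ("5.15.63-flatcar", "5.15.70-flatcar", "5.15.79-flatcar"):
    print(release, has_fix_5_15(release))
```

On a running machine, `platform.release()` from the Python standard library returns a string of the same shape that can be passed in directly.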

@seh
Author

seh commented Jan 25, 2023

We've been using this fix for about six weeks now without noticing any of these failures occurring. I consider this problem to be fixed. Thank you for all of your help with this one. It was quite a journey.

@seh seh closed this as completed Jan 25, 2023