Arch Linux soft lockup tracker, vol. 2 #720

Closed
mrc0mmand opened this issue Mar 28, 2024 · 6 comments

Comments

@mrc0mmand
Member

mrc0mmand commented Mar 28, 2024

Because the first one apparently needed a sequel.

After moving the bare metal hypervisors from C8S to C9S, the Arch Linux jobs started exhibiting frequent soft lockups. The stack traces look pretty similar to the ones from #660:

[  294.819243] kernel: watchdog: BUG: soft lockup - CPU#18 stuck for 28s! [swapper/18:0]
[  294.879469] kernel: Modules linked in: veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common intel_uncore_frequency_common 8021q garp mrp stp llc isst_if_common nfit
[  294.949456] kernel:  libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 rfkill joydev iTCO_wdt mousedev intel_pmc_bxt iTCO_vendor_support i2c_i801 psmouse pcspkr lpc_ich i2c_smbus mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_rng failover virtio_balloon vivaldi_fmap virtio_blk crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 virtio_pci_legacy_dev intel_agp virtio_pci_modern_dev xhci_pci_renesas serio intel_gtt cirrus [last unloaded: sch_teql]
[  294.949639] kernel: CPU: 18 PID: 0 Comm: swapper/18 Not tainted 6.8.1-arch1-1 #1 52f97d9bb37be6168651745a1a9f8f7240d21ce5
[  294.949646] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  294.949647] kernel: RIP: 0010:pv_native_safe_halt+0xf/0x20
[  294.949657] kernel: Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d e3 13 27 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[  294.949660] kernel: RSP: 0018:ffffb21fc014fed8 EFLAGS: 00000242
[  294.949664] kernel: RAX: 0000000000000012 RBX: 0000000000000012 RCX: ffff9942a6865c50
[  294.949667] kernel: RDX: 4000000000000000 RSI: 0000000000000012 RDI: 0000000000473174
[  294.949669] kernel: RBP: ffff994140a78000 R08: 0000000000000001 R09: 0000000000000000
[  294.949671] kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  294.949672] kernel: R13: 0000000000000000 R14: ffff994140a78000 R15: 0000000000000000
[  294.949674] kernel: FS:  0000000000000000(0000) GS:ffff99603f080000(0000) knlGS:0000000000000000
[  294.949676] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  294.949677] kernel: CR2: 00005affaf66e0d0 CR3: 0000000117a04003 CR4: 0000000000772ef0
[  294.949681] kernel: PKRU: 55555554
[  294.949681] kernel: Call Trace:
[  294.949683] kernel:  <IRQ>
[  294.949686] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  294.949694] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  294.949696] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  294.949702] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  294.949705] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  294.949713] kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  294.949715] kernel:  </IRQ>
[  294.949716] kernel:  <TASK>
[  294.949718] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  294.949736] kernel:  ? pv_native_safe_halt+0xf/0x20
[  294.949739] kernel:  default_idle+0x9/0x20
[  294.949743] kernel:  default_idle_call+0x2c/0xe0
[  294.949746] kernel:  do_idle+0x1f1/0x230
[  294.949750] kernel:  cpu_startup_entry+0x2a/0x30
[  294.949756] kernel:  start_secondary+0x11e/0x140
[  294.949760] kernel:  secondary_startup_64_no_verify+0x184/0x18b
[  294.949767] kernel:  </TASK>
[  323.142578] kernel: watchdog: BUG: soft lockup - CPU#26 stuck for 24s! [swapper/26:0]
[  323.166262] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore 8021q garp intel_rapl_msr mrp stp intel_rapl_common llc intel_uncore_frequency_common isst_if_common
[  323.166509] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt mousedev iTCO_vendor_support rfkill psmouse i2c_i801 pcspkr lpc_ich i2c_smbus mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel sha256_ssse3 virtio_pci xhci_pci virtio_pci_legacy_dev i8042 intel_agp xhci_pci_renesas cirrus virtio_pci_modern_dev intel_gtt serio [last unloaded: sch_teql]
[  323.166581] kernel: CPU: 26 PID: 0 Comm: swapper/26 Not tainted 6.8.1-arch1-1 #1 52f97d9bb37be6168651745a1a9f8f7240d21ce5
[  323.166590] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  323.166592] kernel: RIP: 0010:_nohz_idle_balance.isra.0+0x1e0/0x3a0
[  323.166600] kernel: Code: ea 01 00 0f 84 9e 01 00 00 4c 89 ef e8 39 3b fe ff 4c 89 ef e8 81 29 fe ff 48 f7 44 24 10 00 02 00 00 74 06 fb 0f 1f 44 00 00 <f6> 44 24 0c 01 0f 85 aa 00 00 00 49 8b 85 38 0a 00 00 e9 d6 fe ff
[  323.166612] kernel: RSP: 0018:ffffb4fd0018fe80 EFLAGS: 00000206
[  323.166622] kernel: RAX: 0000000000000001 RBX: 000000000000001b RCX: 00000000003dc17a
[  323.166629] kernel: RDX: 00000000003dc17a RSI: 0000000000000000 RDI: ffff93303f6347c0
[  323.166630] kernel: RBP: 00000000fffffd1a R08: 0000000000000001 R09: 0000000000000266
[  323.166632] kernel: R10: 0000000002e392c6 R11: 000000457ff99800 R12: 0000000000000001
[  323.166633] kernel: R13: ffff93303f6347c0 R14: 0000000000000028 R15: 00000000000347c0
[  323.166635] kernel: FS:  0000000000000000(0000) GS:ffff93303f280000(0000) knlGS:0000000000000000
[  323.166637] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  323.166638] kernel: CR2: 00005e80d2b0c0c0 CR3: 0000000140f58003 CR4: 0000000000772ef0
[  323.166641] kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  323.166642] kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  323.166644] kernel: PKRU: 55555554
[  323.166645] kernel: Call Trace:
[  323.166647] kernel:  <IRQ>
[  323.166652] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  323.166661] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  323.166664] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  323.166669] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  323.166676] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  323.166682] kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  323.166687] kernel:  </IRQ>
[  323.166687] kernel:  <TASK>
[  323.166688] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  323.166694] kernel:  ? _nohz_idle_balance.isra.0+0x1e0/0x3a0
[  323.166697] kernel:  do_idle+0x38/0x230
[  323.166701] kernel:  cpu_startup_entry+0x2a/0x30
[  323.166703] kernel:  start_secondary+0x11e/0x140
[  323.166707] kernel:  secondary_startup_64_no_verify+0x184/0x18b
[  323.166714] kernel:  </TASK>

It doesn't seem to be caused by excessive I/O, since the utilization on both the host and the guest never goes above 50% (not counting spikes), and moving most of the stuff to tmpfs didn't help either.

It's also not caused by an oversaturated serial console, since systemd/systemd@fa6f37c is in place, and the soft lockups happen outside of the test VMs anyway (where the serial console is pretty much quiet after boot).
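
For comparing reports like the ones above across jobs, a small parser can help. The following is only a rough sketch, not something used in the CI: it assumes the journal output keeps the `watchdog: BUG: soft lockup` header and the kernel-mode `RIP: 0010:` line in the format shown here, and the names `parse_lockups`, `LOCKUP_RE`, `RIP_RE`, and `SAMPLE` are made up for illustration. Only the first `RIP` line after each header is recorded, since nested frames (interrupted idle or syscall context) print further `RIP` lines.

```python
import re
from collections import Counter

# Header line of a report, e.g.
# "... watchdog: BUG: soft lockup - CPU#18 stuck for 28s! [swapper/18:0]"
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)
# Kernel-mode instruction pointer, e.g. "RIP: 0010:pv_native_safe_halt+0xf/0x20"
RIP_RE = re.compile(r"RIP: 0010:(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_lockups(lines):
    """Return one dict per soft lockup report found in the given log lines."""
    reports = []
    current = None
    for line in lines:
        header = LOCKUP_RE.search(line)
        if header:
            current = {
                "cpu": int(header["cpu"]),
                "secs": int(header["secs"]),
                "task": header["task"],
                "pid": int(header["pid"]),
                "rip": None,
            }
            reports.append(current)
            continue
        rip = RIP_RE.search(line)
        # Keep only the first RIP after the header (the interrupted function).
        if rip and current is not None and current["rip"] is None:
            current["rip"] = rip["symbol"]
    return reports


# A few lines copied from the logs in this issue (everything else trimmed).
SAMPLE = """\
[  294.819243] kernel: watchdog: BUG: soft lockup - CPU#18 stuck for 28s! [swapper/18:0]
[  294.949647] kernel: RIP: 0010:pv_native_safe_halt+0xf/0x20
[  323.142578] kernel: watchdog: BUG: soft lockup - CPU#26 stuck for 24s! [swapper/26:0]
[  323.166592] kernel: RIP: 0010:_nohz_idle_balance.isra.0+0x1e0/0x3a0
[  372.640721] kernel: watchdog: BUG: soft lockup - CPU#59 stuck for 22s! [kworker/u204:4:75603]
[  372.646102] kernel: RIP: 0010:_raw_spin_lock+0x17/0x30
"""

reports = parse_lockups(SAMPLE.splitlines())
rip_counts = Counter(r["rip"] for r in reports)
for report in reports:
    print(report)
```

Feeding it a full journal dump instead of `SAMPLE` and looking at `rip_counts` should show at a glance whether the lockups cluster around a few functions.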

@mrc0mmand
Member Author

The kernel on the host doesn't seem to matter either, since the soft lockups happen with both the stock kernel-5.14.0-432.el9.x86_64 and kernel-core-6.7.1-0.hs1.hsx.el9.x86_64 from the Hyperscale SIG.

@mrc0mmand
Member Author

Same stuff with linux-mainline on the guest:

[  419.813750] kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [swapper/6:0]
[  419.831678] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common intel_uncore_frequency_common 8021q garp mrp isst_if_common stp llc
[  419.831924] kernel:  nfit libnvdimm cbc encrypted_keys trusted asn1_encoder tee vfat fat kvm_intel kvm crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev rfkill intel_pmc_bxt mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 failover virtio_balloon virtio_blk virtio_rng vivaldi_fmap virtio_pci crc32c_intel sha256_ssse3 xhci_pci virtio_pci_legacy_dev intel_agp i8042 xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  419.832067] kernel: CPU: 6 PID: 0 Comm: swapper/6 Not tainted 6.9.0-rc1-1-mainline #1 f8e1e48790b6ac6744f37694fc9f9d0653827e9d
[  419.832073] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  419.832077] kernel: RIP: 0010:update_blocked_averages+0x684/0x800
[  419.832084] kernel: Code: 41 c7 46 20 00 00 00 00 40 84 ed 0f 85 92 00 00 00 4c 89 f7 e8 0d fb fe ff 48 f7 44 24 30 00 02 00 00 74 06 fb 0f 1f 44 00 00 <48> 83 c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 80 3d 61
[  419.832086] kernel: RSP: 0018:ffffb317000f7e10 EFLAGS: 00000206
[  419.832092] kernel: RAX: 00000000000023bc RBX: 0000000000000000 RCX: ffff8f553e4b6b80
[  419.832097] kernel: RDX: 0000000000000015 RSI: 000000000000b848 RDI: ffff8f553e4b6180
[  419.832102] kernel: RBP: 0000005e5300b601 R08: 0000000000000094 R09: ffff8f553e4b6b80
[  419.832103] kernel: R10: 00000000000253fc R11: 00000061bed72800 R12: ffff8f553e4b6a38
[  419.832105] kernel: R13: ffff8f553e4b6b80 R14: ffff8f553e4b6180 R15: 0000000000000000
[  419.832106] kernel: FS:  0000000000000000(0000) GS:ffff8f553dd00000(0000) knlGS:0000000000000000
[  419.832108] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  419.832110] kernel: CR2: 00007f6250f95a38 CR3: 0000001164020004 CR4: 0000000000772ef0
[  419.832113] kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  419.832114] kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  419.832121] kernel: PKRU: 55555554
[  419.832122] kernel: Call Trace:
[  419.832125] kernel:  <IRQ>
[  419.832129] kernel:  ? watchdog_timer_fn+0x1dd/0x260
[  419.832134] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  419.832136] kernel:  ? __hrtimer_run_queues+0x10f/0x2a0
[  419.832140] kernel:  ? hrtimer_interrupt+0xfa/0x230
[  419.832143] kernel:  ? __sysvec_apic_timer_interrupt+0x55/0x150
[  419.832147] kernel:  ? sysvec_apic_timer_interrupt+0x6c/0x90
[  419.832151] kernel:  </IRQ>
[  419.832152] kernel:  <TASK>
[  419.832153] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  419.832161] kernel:  ? update_blocked_averages+0x684/0x800
[  419.832164] kernel:  ? _raw_spin_unlock+0xe/0x30
[  419.832167] kernel:  ? finish_task_switch.isra.0+0x99/0x2e0
[  419.832170] kernel:  ? kvm_sched_clock_read+0x11/0x20
[  419.832173] kernel:  _nohz_idle_balance.isra.0+0x2f4/0x380
[  419.832179] kernel:  do_idle+0x2f/0x210
[  419.832183] kernel:  cpu_startup_entry+0x29/0x30
[  419.832185] kernel:  start_secondary+0x11c/0x140
[  419.832190] kernel:  common_startup_64+0x13e/0x141
[  419.832196] kernel:  </TASK>
[  419.832198] kernel: watchdog: BUG: soft lockup - CPU#55 stuck for 24s! [swapper/55:0]
[  419.836735] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common intel_uncore_frequency_common 8021q garp mrp isst_if_common stp llc
[  419.836836] kernel:  nfit libnvdimm cbc encrypted_keys trusted asn1_encoder tee vfat fat kvm_intel kvm crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev rfkill intel_pmc_bxt mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 failover virtio_balloon virtio_blk virtio_rng vivaldi_fmap virtio_pci crc32c_intel sha256_ssse3 xhci_pci virtio_pci_legacy_dev intel_agp i8042 xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  419.836915] kernel: CPU: 55 PID: 0 Comm: swapper/55 Tainted: G             L     6.9.0-rc1-1-mainline #1 f8e1e48790b6ac6744f37694fc9f9d0653827e9d
[  419.836920] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  419.836922] kernel: RIP: 0010:pv_native_safe_halt+0xf/0x20
[  419.836932] kernel: Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 43 d9 24 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[  419.836935] kernel: RSP: 0018:ffffb3170027fed8 EFLAGS: 00000252
[  419.836938] kernel: RAX: 0000000000000037 RBX: ffff8f3640c6b000 RCX: ffff8f369a994d60
[  419.836940] kernel: RDX: 0000000000000037 RSI: 0000000000000037 RDI: 000000000017f94c
[  419.836942] kernel: RBP: 0000000000000037 R08: 0000000000000001 R09: 0000000000000000
[  419.836943] kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  419.836946] kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  419.836948] kernel: FS:  0000000000000000(0000) GS:ffff8f553f580000(0000) knlGS:0000000000000000
[  419.836950] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  419.836952] kernel: CR2: 00007fbba1084680 CR3: 000000019151e003 CR4: 0000000000772ef0
[  419.836957] kernel: PKRU: 55555554
[  419.836967] kernel: Call Trace:
[  419.836970] kernel:  <IRQ>
[  419.836975] kernel:  ? watchdog_timer_fn+0x1dd/0x260
[  419.836986] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  419.836989] kernel:  ? __hrtimer_run_queues+0x10f/0x2a0
[  419.836996] kernel:  ? hrtimer_interrupt+0xfa/0x230
[  419.837001] kernel:  ? __sysvec_apic_timer_interrupt+0x55/0x150
[  419.837006] kernel:  ? sysvec_apic_timer_interrupt+0x6c/0x90
[  419.837009] kernel:  </IRQ>
[  419.837010] kernel:  <TASK>
[  419.837011] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  419.837018] kernel:  ? pv_native_safe_halt+0xf/0x20
[  419.837021] kernel:  default_idle+0x9/0x20
[  419.837026] kernel:  default_idle_call+0x30/0x100
[  419.837029] kernel:  do_idle+0x1cb/0x210
[  419.837036] kernel:  cpu_startup_entry+0x29/0x30
[  419.837039] kernel:  start_secondary+0x11c/0x140
[  419.837046] kernel:  common_startup_64+0x13e/0x141
[  419.837052] kernel:  </TASK>
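
A side note on the "stuck for 21s" to "stuck for 28s" values: the kernel reports a soft lockup once a CPU has been stuck for twice the `kernel.watchdog_thresh` sysctl, and the default of 10 gives a 20s threshold, which would fit the durations above. That the guests run with the default is an assumption, as it hasn't been checked in this issue. A minimal sketch for reading the value on a guest (the function names are made up for illustration):

```python
from pathlib import Path


def soft_lockup_threshold(watchdog_thresh):
    """Soft lockup threshold in seconds: twice the watchdog_thresh sysctl."""
    return 2 * watchdog_thresh


def read_watchdog_thresh(path="/proc/sys/kernel/watchdog_thresh"):
    """Read the kernel.watchdog_thresh sysctl (in seconds) from procfs."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    thresh = read_watchdog_thresh()
    print(f"watchdog_thresh={thresh}s, "
          f"soft lockup threshold={soft_lockup_threshold(thresh)}s")
```

Raising the threshold would only hide the reports, so this is meant for confirming what the numbers in the logs are measured against, not as a fix.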

@mrc0mmand
Member Author

Some more soft lockups to spice things up:

[  349.123493] systemd-resolved[72819]: Sending query packet with id 37640 on interface 65/AF_INET of size 27.
[  349.123558] systemd-resolved[72819]: Received llmnr UDP packet of size 27, ifindex=65, ttl=255, fragsize=0, sender=192.168.23.5, destination=224.0.0.252
[  372.540188] kernel: watchdog: BUG: soft lockup - CPU#41 stuck for 21s! [swapper/41:0]
[  372.543499] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common 8021q intel_uncore_frequency_common garp mrp stp llc isst_if_common
[  372.625218] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt rfkill mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 intel_agp virtio_pci_legacy_dev xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  372.625363] kernel: CPU: 41 PID: 0 Comm: swapper/41 Not tainted 6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[  372.625369] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  372.625371] kernel: RIP: 0010:update_blocked_averages+0x684/0x800
[  372.625381] kernel: Code: 41 c7 46 20 00 00 00 00 40 84 ed 0f 85 92 00 00 00 4c 89 f7 e8 2d fb fe ff 48 f7 44 24 30 00 02 00 00 74 06 fb 0f 1f 44 00 00 <48> 83 c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 80 3d 6a
[  372.625383] kernel: RSP: 0018:ffffb926c0b64f00 EFLAGS: 00000206
[  372.625387] kernel: RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
[  372.625391] kernel: RDX: 0000000000000029 RSI: 0000000000000000 RDI: ffff9ba83f6747c0
[  372.625392] kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[  372.625394] kernel: R10: 0000000000000000 R11: 00000000b8fbaf46 R12: ffff9ba83f6751c0
[  372.625395] kernel: R13: ffff9ba83f6751c0 R14: ffff9ba83f6747c0 R15: 0000000000000000
[  372.625399] kernel: FS:  0000000000000000(0000) GS:ffff9ba83f640000(0000) knlGS:0000000000000000
[  372.625401] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.625403] kernel: CR2: 000064b832c270d8 CR3: 0000000115be4002 CR4: 0000000000772ef0
[  372.625406] kernel: PKRU: 55555554
[  372.625407] kernel: Call Trace:
[  372.625410] kernel:  <IRQ>
[  372.625415] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  372.625422] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  372.625425] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  372.625430] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  372.625434] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  372.625440] kernel:  ? sysvec_apic_timer_interrupt+0x39/0x90
[  372.625445] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.625459] kernel:  ? update_blocked_averages+0x684/0x800
[  372.625463] kernel:  ? enqueue_hrtimer+0x35/0x90
[  372.625466] kernel:  run_rebalance_domains+0x49/0x70
[  372.625470] kernel:  __do_softirq+0xc9/0x2c8
[  372.625476] kernel:  __irq_exit_rcu+0xa3/0xc0
[  372.625483] kernel:  sysvec_apic_timer_interrupt+0x72/0x90
[  372.625485] kernel:  </IRQ>
[  372.625486] kernel:  <TASK>
[  372.625488] kernel:  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.625490] kernel: RIP: 0010:pv_native_safe_halt+0xf/0x20
[  372.625493] kernel: Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d e3 f3 26 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[  372.625495] kernel: RSP: 0018:ffffb926c0207ed8 EFLAGS: 00000242
[  372.625497] kernel: RAX: 0000000000000029 RBX: 0000000000000029 RCX: ffff9b8947965ad0
[  372.625498] kernel: RDX: 4000000000000000 RSI: 0000000000000029 RDI: 00000000002b82e4
[  372.625499] kernel: RBP: ffff9b8940ab6000 R08: 0000000000000001 R09: 0000000000000000
[  372.625500] kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  372.625502] kernel: R13: 0000000000000000 R14: ffff9b8940ab6000 R15: 0000000000000000
[  372.625505] kernel:  default_idle+0x9/0x20
[  372.625510] kernel:  default_idle_call+0x2c/0xe0
[  372.625513] kernel:  do_idle+0x1f1/0x230
[  372.625516] kernel:  cpu_startup_entry+0x2a/0x30
[  372.625518] kernel:  start_secondary+0x11e/0x140
[  372.625521] kernel:  secondary_startup_64_no_verify+0x184/0x18b
[  372.625528] kernel:  </TASK>
[  372.625530] kernel: watchdog: BUG: soft lockup - CPU#27 stuck for 22s! [test-sd-device:85398]
[  372.639754] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common 8021q intel_uncore_frequency_common garp mrp stp llc isst_if_common
[  372.639833] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt rfkill mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 intel_agp virtio_pci_legacy_dev xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  372.639881] kernel: CPU: 27 PID: 85398 Comm: test-sd-device Tainted: G             L     6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[  372.639895] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  372.639896] kernel: RIP: 0010:update_blocked_averages+0x684/0x800
[  372.639901] kernel: Code: 41 c7 46 20 00 00 00 00 40 84 ed 0f 85 92 00 00 00 4c 89 f7 e8 2d fb fe ff 48 f7 44 24 30 00 02 00 00 74 06 fb 0f 1f 44 00 00 <48> 83 c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 80 3d 6a
[  372.639903] kernel: RSP: 0018:ffffb926c0884f00 EFLAGS: 00000206
[  372.639905] kernel: RAX: 0000000000000001 RBX: ffff9ba83f2f5078 RCX: 0000000000000000
[  372.639906] kernel: RDX: 0000004a3fc3b400 RSI: 0000000000000000 RDI: ffff9ba83f2f47c0
[  372.639908] kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9b8948be2600
[  372.639909] kernel: R10: ffffffffaf8060c0 R11: 000000517faaa448 R12: ffff9ba83f2f51c0
[  372.639911] kernel: R13: ffff9ba83f2f51c0 R14: ffff9ba83f2f47c0 R15: 0000000000000000
[  372.639913] kernel: FS:  00007ac174207cc0(0000) GS:ffff9ba83f2c0000(0000) knlGS:0000000000000000
[  372.639914] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.639916] kernel: CR2: 00005c28adc73118 CR3: 000000019326c006 CR4: 0000000000772ef0
[  372.639919] kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  372.639920] kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  372.639921] kernel: PKRU: 55555554
[  372.639922] kernel: Call Trace:
[  372.639923] kernel:  <IRQ>
[  372.639925] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  372.639929] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  372.639932] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  372.639935] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  372.639939] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  372.639942] kernel:  ? sysvec_apic_timer_interrupt+0x39/0x90
[  372.639945] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.639949] kernel:  ? update_blocked_averages+0x684/0x800
[  372.639952] kernel:  ? enqueue_hrtimer+0x35/0x90
[  372.639955] kernel:  run_rebalance_domains+0x49/0x70
[  372.639957] kernel:  __do_softirq+0xc9/0x2c8
[  372.639961] kernel:  __irq_exit_rcu+0xa3/0xc0
[  372.639964] kernel:  sysvec_apic_timer_interrupt+0x72/0x90
[  372.640462] kernel:  </IRQ>
[  372.640463] kernel:  <TASK>
[  372.640464] kernel:  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.640466] kernel: RIP: 0010:__seccomp_filter+0x97/0x4f0
[  372.640469] kernel: Code: 00 c0 0f 85 1c 03 00 00 48 8d 4b 10 85 c0 78 29 48 63 d0 48 81 fa cd 01 00 00 77 1d 48 81 fa ce 01 00 00 48 19 d2 21 d0 48 98 <48> 0f a3 01 0f 92 c0 84 c0 0f 85 db 01 00 00 48 c7 04 24 00 00 00
[  372.640471] kernel: RSP: 0018:ffffb926ea23fd70 EFLAGS: 00010202
[  372.640473] kernel: RAX: 000000000000014c RBX: ffff9b8975037d00 RCX: ffff9b8975037d10
[  372.640474] kernel: RDX: ffffffffffffffff RSI: 0000000000001101 RDI: ffffb926ea23fd80
[  372.640475] kernel: RBP: ffffb926ea23ff48 R08: 0000000000001000 R09: 00007ac174141d78
[  372.640477] kernel: R10: 0000000000000005 R11: 0000000000000000 R12: ffffb926ea23fd80
[  372.640478] kernel: R13: ffffb926ea23ff58 R14: 0000000000000000 R15: 0000000000000000
[  372.640483] kernel:  syscall_trace_enter+0x9e/0x1c0
[  372.640487] kernel:  do_syscall_64+0x145/0x170
[  372.640491] kernel:  ? syscall_exit_to_user_mode+0x80/0x230
[  372.640494] kernel:  ? do_syscall_64+0x96/0x170
[  372.640496] kernel:  ? do_sys_openat2+0x97/0xe0
[  372.640500] kernel:  ? syscall_exit_to_user_mode_prepare+0x178/0x1a0
[  372.640502] kernel:  ? syscall_exit_to_user_mode+0x80/0x230
[  372.640505] kernel:  ? do_syscall_64+0x96/0x170
[  372.640507] kernel:  ? do_user_addr_fault+0x304/0x670
[  372.640513] kernel:  ? exc_page_fault+0x7f/0x180
[  372.640515] kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  372.640518] kernel: RIP: 0033:0x7ac173d19b4e
[  372.640705] kernel: Code: c1 0d 00 ba ff ff ff ff 64 c7 00 16 00 00 00 e9 a5 fd ff ff 67 e8 02 9f 01 00 66 90 f3 0f 1e fa 41 89 ca b8 4c 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2a 89 c1 85 c0 74 0f 48 8b 05 ad c1 0d 00 64
[  372.640707] kernel: RSP: 002b:00007ffc2f6f4af8 EFLAGS: 00000246 ORIG_RAX: 000000000000014c
[  372.640709] kernel: RAX: ffffffffffffffda RBX: 00007ffc2f6f4cf0 RCX: 00007ac173d19b4e
[  372.640710] kernel: RDX: 0000000000001000 RSI: 00007ac174141d78 RDI: 0000000000000005
[  372.640711] kernel: RBP: 00007ffc2f6f4bd0 R08: 00007ffc2f6f4cf0 R09: 0000000000000001
[  372.640713] kernel: R10: 0000000000001101 R11: 0000000000000246 R12: 0000000000001000
[  372.640714] kernel: R13: 0000000000000005 R14: 00007ac174141d78 R15: 00007ffc2f6f50cc
[  372.640719] kernel:  </TASK>
[  372.640721] kernel: watchdog: BUG: soft lockup - CPU#59 stuck for 22s! [kworker/u204:4:75603]
[  372.645770] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common 8021q intel_uncore_frequency_common garp mrp stp llc isst_if_common
[  372.645834] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt rfkill mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 intel_agp virtio_pci_legacy_dev xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  372.645880] kernel: CPU: 59 PID: 75603 Comm: kworker/u204:4 Tainted: G             L     6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[  372.645883] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  372.645885] kernel: Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
[  372.646102] kernel: RIP: 0010:_raw_spin_lock+0x17/0x30
[  372.646105] kernel: Code: 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 ff 05 88 a1 86 51 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 c3 cc cc cc cc 89 c6 e8 87 04 00 00 90 c3 cc cc
[  372.646107] kernel: RSP: 0018:ffffb926cf3e3910 EFLAGS: 00010246
[  372.646109] kernel: RAX: 0000000000000000 RBX: ffff9b89a9ffd038 RCX: ffffffffc0521e5a
[  372.646110] kernel: RDX: 0000000000000001 RSI: ffffffffc05ba5ab RDI: ffff9b89a9ffd04c
[  372.646112] kernel: RBP: ffff9b89488f2a50 R08: 0000000000000000 R09: 0000000000000000
[  372.646113] kernel: R10: ffff9b8abad7b5e8 R11: 6af5be5e3eef2eaa R12: 00000000da2a4000
[  372.646115] kernel: R13: 00000000da2a4000 R14: ffff9b8abad7b5e8 R15: ffff9b8946b17000
[  372.646116] kernel: FS:  0000000000000000(0000) GS:ffff9ba83fac0000(0000) knlGS:0000000000000000
[  372.646118] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.646119] kernel: CR2: 00007084b4eaf680 CR3: 000000013741a006 CR4: 0000000000772ef0
[  372.646122] kernel: PKRU: 55555554
[  372.646123] kernel: Call Trace:
[  372.646125] kernel:  <IRQ>
[  372.646127] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  372.646131] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  372.646134] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  372.646138] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  372.646142] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  372.646146] kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  372.646149] kernel:  </IRQ>
[  372.646150] kernel:  <TASK>
[  372.646152] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.646156] kernel:  ? __set_extent_bit+0x40a/0x760 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646242] kernel:  ? _raw_spin_lock+0x17/0x30
[  372.646245] kernel:  __set_extent_bit+0x91/0x760 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646344] kernel:  ? _raw_spin_unlock+0xe/0x30
[  372.646348] kernel:  set_extent_bit+0x18/0x20 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646416] kernel:  btrfs_alloc_tree_block+0x310/0x530 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646496] kernel:  btrfs_force_cow_block+0x137/0x7d0 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646548] kernel:  btrfs_cow_block+0xcd/0x280 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646596] kernel:  btrfs_search_slot+0x569/0xd00 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646649] kernel:  btrfs_lookup_csum+0x73/0x160 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646706] kernel:  btrfs_csum_file_blocks+0x1a3/0x6f0 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646760] kernel:  ? unpin_extent_cache+0xaf/0x160 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646822] kernel:  btrfs_finish_one_ordered+0x6fe/0x9e0 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646882] kernel:  btrfs_work_helper+0xde/0x390 [btrfs 1310bd26707b269e4ca28dfd9adc59c6d54c90d4]
[  372.646961] kernel:  process_one_work+0x183/0x370
[  372.646968] kernel:  worker_thread+0x3ab/0x4f0
[  372.646971] kernel:  ? __pfx_worker_thread+0x10/0x10
[  372.646973] kernel:  kthread+0xe5/0x120
[  372.646977] kernel:  ? __pfx_kthread+0x10/0x10
[  372.646980] kernel:  ret_from_fork+0x31/0x50
[  372.646985] kernel:  ? __pfx_kthread+0x10/0x10
[  372.646988] kernel:  ret_from_fork_asm+0x1b/0x30
[  372.646992] kernel:  </TASK>
[  372.646995] kernel: watchdog: BUG: soft lockup - CPU#63 stuck for 22s! [test-random-uti:84844]
[  372.657932] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common 8021q intel_uncore_frequency_common garp mrp stp llc isst_if_common
[  372.658060] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt rfkill mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 intel_agp virtio_pci_legacy_dev xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  372.658153] kernel: CPU: 63 PID: 84844 Comm: test-random-uti Tainted: G             L     6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[  372.658158] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  372.658161] kernel: RIP: 0033:0x72d5e48037fd
[  372.658237] kernel: Code: 48 8d 35 8b 59 14 00 48 8d 3d 88 28 13 00 e8 fa db e7 ff b8 00 00 00 00 eb d8 b8 00 00 00 00 eb d1 55 48 89 e5 53 48 83 ec 18 <64> 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 85 ff 79 1a 48 8b 45
[  372.658242] kernel: RSP: 002b:00007fff77807f00 EFLAGS: 00010206
[  372.658245] kernel: RAX: 0000000000000008 RBX: 0000000000000008 RCX: 000072d5e445d7e4
[  372.658248] kernel: RDX: 0000000000000004 RSI: 0000000000000008 RDI: 00000000fffffff7
[  372.658251] kernel: RBP: 00007fff77807f20 R08: 00005e8f22ec42c0 R09: 00007fff77807690
[  372.658253] kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff77807f60
[  372.658255] kernel: R13: 00000000fffffff7 R14: 0000000000009f51 R15: 0000000000000011
[  372.658262] kernel: FS:  000072d5e414ddc0 GS:  0000000000000000
[  372.658265] kernel: watchdog: BUG: soft lockup - CPU#67 stuck for 22s! [swapper/67:0]
[  372.660356] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common 8021q intel_uncore_frequency_common garp mrp stp llc isst_if_common
[  372.660427] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt rfkill mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 intel_agp virtio_pci_legacy_dev xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  372.660481] kernel: CPU: 67 PID: 0 Comm: swapper/67 Tainted: G             L     6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[  372.660485] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  372.660487] kernel: RIP: 0010:pv_native_safe_halt+0xf/0x20
[  372.660496] kernel: Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d e3 f3 26 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[  372.660499] kernel: RSP: 0018:ffffb926c02d7ed8 EFLAGS: 00000246
[  372.660502] kernel: RAX: 0000000000000043 RBX: 0000000000000043 RCX: ffff9b89a9c49620
[  372.660504] kernel: RDX: 4000000000000000 RSI: 0000000000000043 RDI: 00000000002c1904
[  372.660506] kernel: RBP: ffff9b8940bf8000 R08: 0000000000000001 R09: 0000000000000000
[  372.660508] kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  372.660510] kernel: R13: 0000000000000000 R14: ffff9b8940bf8000 R15: 0000000000000000
[  372.660512] kernel: FS:  0000000000000000(0000) GS:ffff9ba83fcc0000(0000) knlGS:0000000000000000
[  372.660514] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.660515] kernel: CR2: 00005db583508708 CR3: 0000000191368001 CR4: 0000000000772ef0
[  372.660519] kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  372.660520] kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  372.660522] kernel: PKRU: 55555554
[  372.660523] kernel: Call Trace:
[  372.660525] kernel:  <IRQ>
[  372.660529] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  372.660537] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  372.660540] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  372.660546] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  372.660551] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  372.660559] kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  372.660562] kernel:  </IRQ>
[  372.660562] kernel:  <TASK>
[  372.660564] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.660571] kernel:  ? pv_native_safe_halt+0xf/0x20
[  372.660574] kernel:  default_idle+0x9/0x20
[  372.660578] kernel:  default_idle_call+0x2c/0xe0
[  372.660582] kernel:  do_idle+0x1f1/0x230
[  372.660586] kernel:  cpu_startup_entry+0x2a/0x30
[  372.660589] kernel:  start_secondary+0x11e/0x140
[  372.660592] kernel:  secondary_startup_64_no_verify+0x184/0x18b
[  372.660604] kernel:  </TASK>
[  372.660606] kernel: watchdog: BUG: soft lockup - CPU#61 stuck for 22s! [test-hashmap:84987]
[  372.662711] kernel: Modules linked in: bonding tls veth ptp_mock psample ptp pps_core sch_tbf sch_sfq sch_sfb sch_qfq sch_netem sch_ingress sch_htb sch_hhf sch_gred sch_fq_pie sch_pie sch_fq sch_ets sch_drr sch_codel sch_cake tcp_dctcp l2tp_ip l2tp_eth ifb fou xfrm_interface xfrm6_tunnel tunnel4 tunnel6 ip_gre ip_tunnel gre wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha vxcan vcan can_dev vrf ipvtap tap ipvlan batman_adv bridge bareudp isofs cdrom squashfs xt_nat xt_addrtype xt_tcpudp xt_MASQUERADE iptable_nat nft_fib_ipv6 nft_masq nft_nat nft_fib_ipv4 nft_fib nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables macsec l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore intel_rapl_msr intel_rapl_common 8021q intel_uncore_frequency_common garp mrp stp llc isst_if_common
[  372.662789] kernel:  nfit libnvdimm vfat fat cbc encrypted_keys trusted asn1_encoder tee kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl cfg80211 iTCO_wdt joydev intel_pmc_bxt rfkill mousedev iTCO_vendor_support i2c_i801 psmouse pcspkr i2c_smbus lpc_ich mac_hid loop fuse dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw virtio_net atkbd net_failover libps2 virtio_blk virtio_balloon failover virtio_rng vivaldi_fmap crc32c_intel virtio_pci sha256_ssse3 xhci_pci i8042 intel_agp virtio_pci_legacy_dev xhci_pci_renesas virtio_pci_modern_dev intel_gtt cirrus serio [last unloaded: sch_teql]
[  372.662839] kernel: CPU: 61 PID: 84987 Comm: test-hashmap Tainted: G             L     6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[  372.662843] kernel: Hardware name: Red Hat KVM/RHEL, BIOS edk2-20240214-1.el9 02/14/2024
[  372.662844] kernel: RIP: 0010:blk_cgroup_congested+0x27/0x70
[  372.662852] kernel: Code: 90 90 90 66 0f 1f 00 0f 1f 44 00 00 53 e8 91 60 b7 ff e8 7c f5 ae ff 48 85 c0 75 0e eb 2b 48 8b 80 c0 00 00 00 48 85 c0 74 38 <48> 8b 10 8b 92 a0 07 00 00 85 d2 74 e7 e8 87 98 b7 ff bb 01 00 00
[  372.662854] kernel: RSP: 0000:ffffb926e9897d78 EFLAGS: 00010286
[  372.662857] kernel: RAX: ffff9b89a9ffcc00 RBX: 02ffff0000000000 RCX: 00000000000020bb
[  372.662858] kernel: RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffffef8289a3c6c0
[  372.662860] kernel: RBP: ffffb926e9897e08 R08: ffffef8289a3c6c0 R09: 0000000000000000
[  372.662862] kernel: R10: 0000000055555554 R11: 0000000000000001 R12: ffff9b89d2ca8240
[  372.662863] kernel: R13: fffffffffffff000 R14: 0000000000000001 R15: 00005e5453ebe000
[  372.662865] kernel: FS:  000074260473ccc0(0000) GS:ffff9ba83fb40000(0000) knlGS:0000000000000000
[  372.662867] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.662868] kernel: CR2: 00005e5453ebe008 CR3: 0000000173bac004 CR4: 0000000000772ef0
[  372.662871] kernel: PKRU: 55555554
[  372.662872] kernel: Call Trace:
[  372.662874] kernel:  <IRQ>
[  372.662877] kernel:  ? watchdog_timer_fn+0x1e6/0x270
[  372.662880] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[  372.662883] kernel:  ? __hrtimer_run_queues+0x10f/0x2b0
[  372.662887] kernel:  ? hrtimer_interrupt+0xf8/0x230
[  372.662890] kernel:  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  372.662894] kernel:  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  372.662896] kernel:  </IRQ>
[  372.662897] kernel:  <TASK>
[  372.662898] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  372.662902] kernel:  ? blk_cgroup_congested+0x27/0x70
[  372.662905] kernel:  ? blk_cgroup_congested+0x14/0x70
[  372.662908] kernel:  __folio_throttle_swaprate+0x1f/0xe0
[  372.662912] kernel:  do_anonymous_page+0x28f/0x6e0
[  372.662916] kernel:  ? pmdp_collapse_flush+0x60/0x60
[  372.662921] kernel:  __handle_mm_fault+0xb30/0xe40
[  372.662925] kernel:  handle_mm_fault+0x17f/0x360
[  372.662928] kernel:  do_user_addr_fault+0x15b/0x670
[  372.662934] kernel:  exc_page_fault+0x7f/0x180
[  372.662937] kernel:  asm_exc_page_fault+0x26/0x30
[  372.662944] kernel: RIP: 0033:0x742604eaaf42
[  372.662996] kernel: Code: 8d 34 19 48 39 d5 48 89 75 60 0f 95 c2 48 29 d8 48 83 c1 10 0f b6 d2 48 83 c8 01 48 c1 e2 02 48 09 da 48 83 ca 01 48 89 51 f8 <48> 89 46 08 e9 3d ff ff ff 48 89 df e8 5d e8 ff ff 48 89 c1 48 85
[  372.662999] kernel: RSP: 002b:00007ffd54b3c840 EFLAGS: 00010206
[  372.663001] kernel: RAX: 0000000000018001 RBX: 0000000000000020 RCX: 00005e5453ebdff0
[  372.663002] kernel: RDX: 0000000000000021 RSI: 00005e5453ebe000 RDI: 0000000000000000
[  372.663004] kernel: RBP: 0000742604fe8ac0 R08: 0000000000000020 R09: 0000000000000001
[  372.663005] kernel: R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000008
[  372.663007] kernel: R13: 0000000000000020 R14: 0000000000000000 R15: 0000742604fe8b20
[  372.663010] kernel:  </TASK>
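
For anyone skimming logs like the one above, here is a rough sketch (not part of the original report) that pulls out just the `soft lockup` header lines and reports which CPU and task were stuck. The function name `parse_soft_lockups` and the regular expression are illustrative assumptions, based only on the line format visible in this log.

```python
import re

# Matches watchdog header lines such as:
# [  372.646995] kernel: watchdog: BUG: soft lockup - CPU#63 stuck for 22s! [test-random-uti:84844]
# The pattern is an assumption derived from the log format shown in this issue.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+kernel: watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! \[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(log_text):
    """Return one dict per soft lockup header line found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match is None:
            continue  # register dumps, module lists, stack frames, etc.
        events.append({
            "timestamp": float(match.group("ts")),
            "cpu": int(match.group("cpu")),
            "stuck_seconds": int(match.group("secs")),
            # The task name may itself contain ':' (e.g. kworker/u204:4),
            # so the greedy match leaves only the trailing number as the PID.
            "comm": match.group("comm"),
            "pid": int(match.group("pid")),
        })
    return events


if __name__ == "__main__":
    import sys

    for event in parse_soft_lockups(sys.stdin.read()):
        print(f"CPU#{event['cpu']:>3} stuck {event['stuck_seconds']}s "
              f"in {event['comm']} (pid {event['pid']})")
```

Saved under a hypothetical name such as `lockups.py`, it could be fed the journal output on stdin, e.g. `journalctl -k | python lockups.py`.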

@mrc0mmand
Member Author

mrc0mmand commented Apr 8, 2024

Updates from https://bugzilla.kernel.org/show_bug.cgi?id=218684:

  • the latest-ish mainline kernel on the host makes the issue go away
    • if the need arises, we can rebuild the latest mainline RPM from [0] to get a working C9S hypervisor
  • the issue is still present on 6.7.1-0.hs1.hsx.el9.x86_64, so the fix is somewhere between that and 6.9.0-0.rc1.316.vanilla.fc40.x86_64

[0] https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories

@mrc0mmand
Member Author

The culprit is a patch missing from C9S's kernel; see https://bugzilla.kernel.org/show_bug.cgi?id=218684#c3. Filed https://issues.redhat.com/browse/RHEL-32384 to get the patches into C9S.

@mrc0mmand
Member Author

This should be resolved temporarily by 239a64c and permanently once https://issues.redhat.com/browse/RHEL-32384 lands in C9S.
