kernel NULL pointer dereference in xenbus_thread, Linux 6.1.57 on openQA #8638

marmarek · 2023-10-22T14:07:16Z

Observation

openQA test in scenario qubesos-4.1-release-upgrade-x86_64-install_default@64bit fails in release_upgrade

kernel NULL pointer dereference, stack trace

[  876.712812] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  876.715099] #PF: supervisor read access in kernel mode
[  876.717222] #PF: error_code(0x0000) - not-present page
[  876.718919] PGD 101f9f067 P4D 101f9f067 PUD 103eae067 PMD 0 
[  876.721633] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  876.723184] CPU: 1 PID: 28 Comm: xenbus Not tainted 6.1.57-1.qubes.fc37.x86_64 #1
[  876.725629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
[  876.729399] RIP: e030:__wake_up_common+0x4c/0x180
[  876.731221] Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 49 8b 40
[  876.737539] RSP: e02b:ffffc900400f7e10 EFLAGS: 00010082
[  876.740443] RAX: 0000000000000000 RBX: ffff888066582f98 RCX: 0000000000000000
[  876.742913] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff888066582f90
[  876.745239] RBP: ffffc900400f0280 R08: ffffffffffffffe8 R09: ffffc900400f7e68
[  876.748484] R10: 0000000000007ff0 R11: ffff888100ad9000 R12: ffffc900400f7e68
[  876.750837] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  876.753734] FS:  0000000000000000(0000) GS:ffff88813ff00000(0000) knlGS:0000000000000000
[  876.756569] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[  876.758422] CR2: 0000000000000000 CR3: 0000000104ac4000 CR4: 0000000000040660
[  876.760599] Call Trace:
[  876.761359]  <TASK>
[  876.762025]  ? show_trace_log_lvl+0x1d3/0x2ef
[  876.763390]  ? show_trace_log_lvl+0x1d3/0x2ef
[  876.764731]  ? show_trace_log_lvl+0x1d3/0x2ef
[  876.766061]  ? __wake_up_common_lock+0x82/0xd0
[  876.767465]  ? __die_body.cold+0x8/0xd
[  876.769374]  ? page_fault_oops+0x163/0x1a0
[  876.770706]  ? exc_page_fault+0x70/0x170
[  876.771922]  ? asm_exc_page_fault+0x22/0x30
[  876.773235]  ? __wake_up_common+0x4c/0x180
[  876.774502]  __wake_up_common_lock+0x82/0xd0
[  876.775835]  ? process_writes+0x240/0x240
[  876.777251]  process_msg+0x18e/0x2f0
[  876.778364]  xenbus_thread+0x165/0x1c0
[  876.779520]  ? cpuusage_read+0x10/0x10
[  876.780694]  kthread+0xe9/0x110
[  876.781680]  ? kthread_complete_and_exit+0x20/0x20
[  876.783168]  ret_from_fork+0x22/0x30
[  876.784287]  </TASK>
[  876.784974] Modules linked in: joydev snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm ppdev intel_rapl_msr intel_rapl_common snd_timer e1000e snd pcspkr parport_pc soundcore parport i2c_piix4 fuse loop xenfs dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt crct10dif_pclmul xhci_pci crc32_pclmul crc32c_intel xhci_pci_renesas polyval_clmulni polyval_generic xhci_hcd ghash_clmulni_intel sha512_ssse3 virtio_console virtio_scsi serio_raw bochs drm_vram_helper drm_ttm_helper ttm ata_generic pata_acpi floppy qemu_fw_cfg xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput dm_multipath
[  876.806036] CR2: 0000000000000000
[  876.807126] ---[ end trace 0000000000000000 ]---
[  876.808589] RIP: e030:__wake_up_common+0x4c/0x180
[  876.810069] Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 49 8b 40
[  876.815813] RSP: e02b:ffffc900400f7e10 EFLAGS: 00010082
[  876.817375] RAX: 0000000000000000 RBX: ffff888066582f98 RCX: 0000000000000000
[  876.819549] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff888066582f90
[  876.821725] RBP: ffffc900400f0280 R08: ffffffffffffffe8 R09: ffffc900400f7e68
[  876.823885] R10: 0000000000007ff0 R11: ffff888100ad9000 R12: ffffc900400f7e68
[  876.826063] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  876.828252] FS:  0000000000000000(0000) GS:ffff88813ff00000(0000) knlGS:0000000000000000
[  876.830647] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[  876.832502] CR2: 0000000000000000 CR3: 0000000104ac4000 CR4: 0000000000040660
[  876.834667] Kernel panic - not syncing: Fatal exception
[  876.836257] Kernel Offset: disabled
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
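
As a cross-check of the register dump above (not part of the original report): assuming the standard x86-64 decoding of the `Code:` bytes, `4c 8d 40 e8` is `lea r8, [rax-0x18]` and the marked sequence `<49> 8b 40 18` is `mov rax, [r8+0x18]`. With RAX = 0 that would give R08 = 0xffffffffffffffe8 and a load from address 0, which agrees with the dumped R08 and CR2. A minimal Python sketch of that arithmetic, with the register values copied from the trace:

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide and wrap around


def to_u64(value: int) -> int:
    """Reduce a Python integer to an unsigned 64-bit register value."""
    return value & MASK64


# Register values copied from the first oops above.
RAX = 0x0000000000000000
R08 = 0xFFFFFFFFFFFFFFE8
CR2 = 0x0000000000000000

# Assumed decoding: lea r8, [rax-0x18]   (bytes 4c 8d 40 e8)
computed_r8 = to_u64(RAX - 0x18)

# Assumed decoding: mov rax, [r8+0x18]   (bytes <49> 8b 40 18, the faulting load)
fault_address = to_u64(R08 + 0x18)

print(f"computed R08  = {computed_r8:#018x}")
print(f"fault address = {fault_address:#018x}")
```

If the decoding assumption holds, the fault would be consistent with a NULL list pointer having been read just before the faulting instruction. That is an interpretation of the dump, not something the trace itself states.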

Test suite description

The test covers the release upgrade from R4.1 to R4.2, but the crash looks unrelated to that specific workload.

Reproducible

Fails since (at least) Build 2023102123-4.1 (current job)

A similar crash was also observed on real hardware, in a domU running an older kernel: 6.1.43.

Expected result

Last good: 2023101101-4.1 (or more recent)

Further details

Always latest result in this scenario: latest


marmarek · 2023-10-22T14:10:49Z

Reported upstream at https://lore.kernel.org/xen-devel/ZO0WrR5J0xuwDIxW@mail-itl/

marmarek · 2023-11-20T13:16:19Z

A similar crash was also observed on real hardware, in a domU running an older kernel: 6.1.43.

Specific crash message:

[173643.279852] BUG: kernel NULL pointer dereference, address: 0000000000000000
[173643.279867] #PF: supervisor read access in kernel mode
[173643.279874] #PF: error_code(0x0000) - not-present page
[173643.279881] PGD 0 P4D 0
[173643.279886] Oops: 0000 [#1] PREEMPT SMP NOPTI
[173643.279893] CPU: 1 PID: 144 Comm: xenbus Tainted: G        W          6.1.43-1.qubes.12.fc37.x86_64 #1
[173643.279905] RIP: 0010:__wake_up_common+0x5b/0x1b0
[173643.279915] Code: 85 0a 01 00 00 4d 85 e4 74 0b 41 f6 04 24 04 0f 85 a3 00 00 00 48 8b 43 40 4c 8d 40 e8 48 83 c3 40 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 4>
[173643.279934] RSP: 0018:ffffc90000dc3e10 EFLAGS: 00010082
[173643.279941] RAX: 0000000000000000 RBX: ffff8883562fd6d0 RCX: 0000000000000000
[173643.279951] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8883562fd690
[173643.279961] RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000dc3e68
[173643.279969] R10: ffffffff81175127 R11: ffffc9000003d000 R12: ffffc90000dc3e68
[173643.279979] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[173643.279990] FS:  0000000000000000(0000) GS:ffff8883dbe40000(0000) knlGS:0000000000000000
[173643.280000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[173643.280007] CR2: 0000000000000000 CR3: 000000002fed6006 CR4: 0000000000770ee0
[173643.280018] PKRU: 55555554
[173643.280022] Call Trace:
[173643.280027]  <TASK>
[173643.280032]  ? show_trace_log_lvl+0x1d3/0x2ef
[173643.280041]  ? show_trace_log_lvl+0x1d3/0x2ef
[173643.280049]  ? show_trace_log_lvl+0x1d3/0x2ef
[173643.280057]  ? __wake_up_common_lock+0x82/0xd0
[173643.280064]  ? __die_body.cold+0x8/0xd
[173643.280070]  ? page_fault_oops+0x163/0x1a0
[173643.280078]  ? exc_page_fault+0x7e/0x200
[173643.280085]  ? asm_exc_page_fault+0x22/0x30
[173643.280094]  ? __wake_up_common_lock+0x67/0xd0
[173643.280101]  ? __wake_up_common+0x5b/0x1b0
[173643.280107]  __wake_up_common_lock+0x82/0xd0
[173643.280114]  ? process_writes+0x260/0x260
[173643.280121]  process_msg+0x199/0x300
[173643.280153]  xenbus_thread+0x165/0x1c0
[173643.280162]  ? cpuusage_read+0x10/0x10
[173643.280170]  kthread+0xe9/0x110
[173643.280177]  ? kthread_complete_and_exit+0x20/0x20
[173643.280185]  ret_from_fork+0x22/0x30
[173643.280194]  </TASK>
[173643.280198] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nvme_fabrics nvme_core nvme_common nft_reject_ipv6 nf_reject_ipv6 nft_reject_ipv4 nf_reject_ipv4 nft_reject nft_ct nft_masq >
[173643.280284] CR2: 0000000000000000
[173643.280290] ---[ end trace 0000000000000000 ]--- 
[173643.280297] RIP: 0010:__wake_up_common+0x5b/0x1b0
[173643.280304] Code: 85 0a 01 00 00 4d 85 e4 74 0b 41 f6 04 24 04 0f 85 a3 00 00 00 48 8b 43 40 4c 8d 40 e8 48 83 c3 40 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 4>
[173643.280324] RSP: 0018:ffffc90000dc3e10 EFLAGS: 00010082
[173643.280331] RAX: 0000000000000000 RBX: ffff8883562fd6d0 RCX: 0000000000000000 
[173643.280340] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8883562fd690 
[173643.280349] RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000dc3e68 
[173643.280359] R10: ffffffff81175127 R11: ffffc9000003d000 R12: ffffc90000dc3e68 
[173643.280368] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 
[173643.280378] FS:  0000000000000000(0000) GS:ffff8883dbe40000(0000) knlGS:0000000000000000
[173643.280386] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[173643.280394] CR2: 0000000000000000 CR3: 000000002fed6006 CR4: 0000000000770ee0 
[173643.280403] PKRU: 55555554
[173643.280407] Kernel panic - not syncing: Fatal exception
[173643.280475] Kernel Offset: disabled

marmarek · 2024-03-25T16:13:32Z

This happens on 6.1.75 too.

Crash message:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 131 Comm: xenbus Not tainted 6.1.75-1.qubes.fc37.x86_64 #1
RIP: 0010:__wake_up_common+0x4c/0x180
Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 

RSP: 0018:ffffc90000d4fe10 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff88811b77a018 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff88811b77a010
RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000d4fe68
R10: 0000000000000003 R11: ffffc9000003d000 R12: ffffc90000d4fe68
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880f5ac0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000002c10006 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
 <TASK>
 ? show_trace_log_lvl+0x1d3/0x2ef
 ? show_trace_log_lvl+0x1d3/0x2ef
 ? show_trace_log_lvl+0x1d3/0x2ef
 ? __wake_up_common_lock+0x82/0xd0
 ? __die_body.cold+0x8/0xd
 ? page_fault_oops+0x163/0x1a0
 ? exc_page_fault+0x70/0x170
 ? asm_exc_page_fault+0x22/0x30
 ? __wake_up_common+0x4c/0x180
 __wake_up_common_lock+0x82/0xd0
 ? process_writes+0x240/0x240
 process_msg+0x18e/0x2f0
 xenbus_thread+0x165/0x1c0
 ? cpuusage_read+0x10/0x10
 kthread+0xe9/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>
Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nvme_fabrics nvme_core nvme_common nft_reject_ipv6 nf_reject_ipv6 nft_reject_ipv4 nf_reject_ipv4 nft_reject nft_ct nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xenfs nfnetlink ipmi_devintf ipmi_msghandler binfmt_misc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic xen_netfront snd_pcm snd_timer ghash_clmulni_intel sha512_ssse3 snd soundcore sha256_ssse3 sha1_ssse3 pcspkr xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn parport_pc ppdev lp parport loop fuse ip_tables overlay xen_blkfront
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:__wake_up_common+0x4c/0x180
Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 49 8b 40
RSP: 0018:ffffc90000d4fe10 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff88811b77a018 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff88811b77a010
RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000d4fe68
R10: 0000000000000003 R11: ffffc9000003d000 R12: ffffc90000d4fe68
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880f5ac0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000002c10006 CR4: 0000000000770ee0
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled

scallyob · 2024-05-28T14:46:53Z

I've begun to see instability in my qubes, with random crashes. It probably started after I applied this fix to my Xen configuration. As far as I've noticed, it only affects Whonix-based qubes.

Dom0 kernel is 6.6.29-1

Here is the log:

[2024-05-27 13:41:13] [20937.502042] #PF: supervisor read access in kernel mode
[2024-05-27 13:41:13] [20937.503416] #PF: error_code(0x0000) - not-present page
[2024-05-27 13:41:13] [20937.504752] PGD 0 P4D 0
[2024-05-27 13:41:13] [20937.505434] Oops: 0000 [#1] PREEMPT SMP NOPTI
[2024-05-27 13:41:13] [20937.506609] CPU: 0 PID: 56 Comm: xenbus Not tainted 6.6.29-1.qubes.fc37.x86_64 #1
[2024-05-27 13:41:13] [20937.508455] RIP: 0010:__wake_up_common+0x4c/0x180
[2024-05-27 13:41:13] [20937.509960] Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 49 8b 40
[2024-05-27 13:41:13] [20937.514811] RSP: 0018:ffffc90000dabdf0 EFLAGS: 00010082
[2024-05-27 13:41:13] [20937.515678] RAX: 0000000000000000 RBX: ffff88802e9f9b98 RCX: 0000000000000000
[2024-05-27 13:41:13] [20937.517510] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff88802e9f9b90
[2024-05-27 13:41:13] [20937.519184] RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000dabe48
[2024-05-27 13:41:13] [20937.520675] R10: ffff88800d3d6ea8 R11: ffffc9000002d000 R12: ffffc90000dabe48
[2024-05-27 13:41:13] [20937.521637] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[2024-05-27 13:41:13] [20937.522753] FS:  0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
[2024-05-27 13:41:13] [20937.523543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2024-05-27 13:41:13] [20937.524155] CR2: 0000000000000000 CR3: 0000000006b7a000 CR4: 00000000000406f0
[2024-05-27 13:41:13] [20937.524767] Call Trace:
[2024-05-27 13:41:13] [20937.524946]  <TASK>
[2024-05-27 13:41:13] [20937.525127]  ? __die+0x23/0x70
[2024-05-27 13:41:13] [20937.525397]  ? page_fault_oops+0x98/0x190
[2024-05-27 13:41:13] [20937.525669]  ? exc_page_fault+0x77/0x170
[2024-05-27 13:41:13] [20937.525940]  ? asm_exc_page_fault+0x26/0x30
[2024-05-27 13:41:13] [20937.526215]  ? __wake_up_common+0x4c/0x180
[2024-05-27 13:41:13] [20937.526484]  __wake_up_common_lock+0x82/0xd0
[2024-05-27 13:41:13] [20937.526839]  ? __pfx_xenbus_thread+0x10/0x10
[2024-05-27 13:41:13] [20937.527196]  process_msg+0x18e/0x2f0
[2024-05-27 13:41:13] [20937.527464]  xenbus_thread+0x4a/0x1e0
[2024-05-27 13:41:13] [20937.527732]  ? __pfx_autoremove_wake_function+0x10/0x10
[2024-05-27 13:41:13] [20937.528090]  kthread+0xe8/0x120
[2024-05-27 13:41:13] [20937.528360]  ? __pfx_kthread+0x10/0x10
[2024-05-27 13:41:13] [20937.528629]  ret_from_fork+0x34/0x50
[2024-05-27 13:41:13] [20937.528904]  ? __pfx_kthread+0x10/0x10
[2024-05-27 13:41:13] [20937.529173]  ret_from_fork_asm+0x1b/0x30
[2024-05-27 13:41:13] [20937.529443]  </TASK>
[2024-05-27 13:41:13] [20937.529622] Modules linked in: nf_conntrack_netlink nft_flow_offload nf_flow_table_inet nf_flow_table xen_netback dummy ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xenfs xt_multiport xt_nat xt_owner xt_REDIRECT nft_chain_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic binfmt_misc ghash_clmulni_intel nf_tables sha512_ssse3 sha256_ssse3 nfnetlink xen_netfront sha1_ssse3 xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn fuse loop ip_tables overlay xen_blkfront
[2024-05-27 13:41:13] [20937.533019] CR2: 0000000000000000
[2024-05-27 13:41:13] [20937.533290] ---[ end trace 0000000000000000 ]---
[2024-05-27 13:41:13] [20937.533642] RIP: 0010:__wake_up_common+0x4c/0x180
[2024-05-27 13:41:13] [20937.533998] Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 49 8b 40
[2024-05-27 13:41:13] [20937.704565] RSP: 0018:ffffc90000dabdf0 EFLAGS: 00010082
[2024-05-27 13:41:13] [20937.705264] RAX: 0000000000000000 RBX: ffff88802e9f9b98 RCX: 0000000000000000
[2024-05-27 13:41:13] [20937.706294] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff88802e9f9b90
[2024-05-27 13:41:13] [20937.707311] RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000dabe48
[2024-05-27 13:41:13] [20937.708988] R10: ffff88800d3d6ea8 R11: ffffc9000002d000 R12: ffffc90000dabe48
[2024-05-27 13:41:13] [20937.709805] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[2024-05-27 13:41:13] [20937.710685] FS:  0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
[2024-05-27 13:41:13] [20937.712434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2024-05-27 13:41:13] [20937.713674] CR2: 0000000000000000 CR3: 0000000006b7a000 CR4: 00000000000406f0
[2024-05-27 13:41:13] [20937.714301] Kernel panic - not syncing: Fatal exception
[2024-05-27 13:41:13] [20937.718046] Kernel Offset: disabled

marmarek · 2024-06-04T22:06:19Z

Another backtrace, this time from 6.6.25 (internal ref: a3):

Details

[2024-06-04 18:49:51] [3694722.123261] BUG: kernel NULL pointer dereference, address: 0000000000000000
[2024-06-04 18:49:51] [3694722.123278] #PF: supervisor read access in kernel mode
[2024-06-04 18:49:51] [3694722.123286] #PF: error_code(0x0000) - not-present page
[2024-06-04 18:49:51] [3694722.123293] PGD 0 P4D 0 
[2024-06-04 18:49:51] [3694722.123299] Oops: 0000 [#1] PREEMPT SMP NOPTI
[2024-06-04 18:49:51] [3694722.123308] CPU: 1 PID: 151 Comm: xenbus Not tainted 6.6.25-1.qubes.fc37.x86_64 #1
[2024-06-04 18:49:51] [3694722.123319] RIP: 0010:__wake_up_common+0x4c/0x180
[2024-06-04 18:49:51] [3694722.123331] Code: 24 0c 89 4c 24 08 4d 85 c9 74 0a 41 f6 01 04 0f 85 a3 00 00 00 48 8b 43 08 4c 8d 40 e8 48 83 c3 08 49 8d 40 18 48 39 c3 74 5b <49> 8b 40 18 31 ed 4c 8d 70 e8 45 8b 28 41 f6 c5 04 75 5f 49 8b 40
[2024-06-04 18:49:51] [3694722.123353] RSP: 0018:ffffc90000df7df0 EFLAGS: 00010086
[2024-06-04 18:49:51] [3694722.123361] RAX: 0000000000000000 RBX: ffff88828ca09918 RCX: 0000000000000000
[2024-06-04 18:49:51] [3694722.123370] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff88828ca09910
[2024-06-04 18:49:51] [3694722.123380] RBP: 0000000000000246 R08: ffffffffffffffe8 R09: ffffc90000df7e48
[2024-06-04 18:49:51] [3694722.123389] R10: ffff8880053c9cd0 R11: ffffc9000003d000 R12: ffffc90000df7e48
[2024-06-04 18:49:51] [3694722.123398] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[2024-06-04 18:49:51] [3694722.123409] FS:  0000000000000000(0000) GS:ffff8880f5a40000(0000) knlGS:0000000000000000
[2024-06-04 18:49:51] [3694722.123419] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2024-06-04 18:49:51] [3694722.123428] CR2: 0000000000000000 CR3: 000000010fc48002 CR4: 0000000000770ee0
[2024-06-04 18:49:51] [3694722.123439] PKRU: 55555554
[2024-06-04 18:49:51] [3694722.123445] Call Trace:
[2024-06-04 18:49:51] [3694722.123451]  <TASK>
[2024-06-04 18:49:51] [3694722.123457]  ? __die+0x23/0x70
[2024-06-04 18:49:51] [3694722.123466]  ? page_fault_oops+0x98/0x190
[2024-06-04 18:49:51] [3694722.123475]  ? exc_page_fault+0x77/0x170
[2024-06-04 18:49:51] [3694722.123484]  ? asm_exc_page_fault+0x26/0x30
[2024-06-04 18:49:51] [3694722.123495]  ? __wake_up_common+0x4c/0x180
[2024-06-04 18:49:51] [3694722.123505]  __wake_up_common_lock+0x82/0xd0
[2024-06-04 18:49:51] [3694722.123515]  ? __pfx_xenbus_thread+0x10/0x10
[2024-06-04 18:49:51] [3694722.123524]  process_msg+0x18e/0x2f0
[2024-06-04 18:49:51] [3694722.123531]  xenbus_thread+0x181/0x1e0
[2024-06-04 18:49:51] [3694722.123537]  ? __pfx_autoremove_wake_function+0x10/0x10
[2024-06-04 18:49:51] [3694722.123546]  kthread+0xe8/0x120
[2024-06-04 18:49:51] [3694722.123554]  ? __pfx_kthread+0x10/0x10
[2024-06-04 18:49:51] [3694722.123562]  ret_from_fork+0x34/0x50
[2024-06-04 18:49:51] [3694722.123570]  ? __pfx_kthread+0x10/0x10
[2024-06-04 18:49:51] [3694722.123577]  ret_from_fork_asm+0x1b/0x30
[2024-06-04 18:49:51] [3694722.123586]  </TASK>
[2024-06-04 18:49:51] [3694722.123590] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nvme_fabrics nvme_core nvme_common nft_reject_ipv6 nf_reject_ipv6 nft_reject_ipv4 nf_reject_ipv4 nft_reject nft_ct nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xenfs ipmi_devintf ipmi_msghandler binfmt_misc intel_rapl_msr intel_rapl_common snd_pcm crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic snd_timer ghash_clmulni_intel snd sha512_ssse3 sha256_ssse3 soundcore sha1_ssse3 xen_netfront pcspkr xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn parport_pc ppdev lp parport fuse loop ip_tables overlay xen_blkfront
[2024-06-04 18:49:51] [3694722.123701] CR2: 0000000000000000
[2024-06-04 18:49:51] [3694722.123707] ---[ end trace 0000000000000000 ]---
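
The traces posted so far differ mainly in kernel version and in the offset within `__wake_up_common`. To compare such reports quickly, the `CPU:` and `RIP:` lines can be pulled out of a log with a small script. The following is only a sketch: the function name `summarize_oops` and both regular expressions are illustrative assumptions, written against the line formats seen in this thread, not an existing tool.

```python
import re

# Matches e.g. "CPU: 1 PID: 28 Comm: xenbus Not tainted 6.1.57-1.qubes.fc37.x86_64 #1"
CPU_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+) .*?(\d+\.\d+\.\d+\S*) #\d+")
# Matches e.g. "RIP: e030:__wake_up_common+0x4c/0x180"
RIP_RE = re.compile(r"RIP: [0-9a-f]+:(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)")


def summarize_oops(log_text):
    """Return task name, kernel version and faulting symbol, or None if not found."""
    cpu = CPU_RE.search(log_text)
    rip = RIP_RE.search(log_text)
    if cpu is None or rip is None:
        return None
    return {
        "comm": cpu.group(3),
        "kernel": cpu.group(4),
        "symbol": rip.group(1),
        "offset": int(rip.group(2), 16),
    }


# Two lines copied from the first report in this thread.
SAMPLE = (
    "[  876.723184] CPU: 1 PID: 28 Comm: xenbus Not tainted 6.1.57-1.qubes.fc37.x86_64 #1\n"
    "[  876.729399] RIP: e030:__wake_up_common+0x4c/0x180\n"
)

print(summarize_oops(SAMPLE))
```

Because the patterns use `search`, console timestamps or other prefixes in front of each line do not need to be stripped first.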

marmarek · 2024-06-04T22:09:47Z

@scallyob How often do you get it? Can you narrow down when (which operations, applications, etc.) it is most likely to happen? I get it about once a month, which is too infrequent for any kind of debugging...

scallyob · 2024-06-05T02:20:00Z

@marmarek The last two were 4 days apart, in different qubes. Those were the only 2 I captured logs for, and both were Whonix gateways. The other crashes were all Whonix gateways too, except for one time when it was a Whonix workstation. The gateways are just running in the background; I'm not actively doing anything with them when they crash, but they usually have services actively running through them.

scallyob · 2024-06-05T02:48:11Z

Just got another one. This time it was the main sys-whonix. I was using Tor Browser in anon-whonix and changing settings with qvm-prefs on another Whonix workstation when I saw it go down. Same errors in the logs. It seems to happen every 4 days now, which is a significant problem for me.

One thing I'm trying: I've had "Include in memory balancing" checked on qubes in the past. I turned it off for the others that had crashed, but it was still checked for sys-whonix. I will report back on whether there are any patterns there.

scallyob · 2024-06-18T23:04:26Z

First crash in 2 weeks:

Whonix workstation
was not included in memory balancing
memory was set to only 400 MB
vCPUs were set to only 1
I was not actively using my computer, but it was running a server when it crashed
same error message, except on kernel 6.6.31

I have since increased it to 2 GB RAM and 2 vCPUs.

andrewdavidwong added the "waiting for upstream" and "P: default" labels on Oct 22, 2023
