Kernel Panic #43

Closed
Smithx10 opened this issue Jul 15, 2022 · 2 comments

@Smithx10

While working on linstor-gateway, I was able to trigger a kernel panic in DRBD on both kernel-lt (5.4.205) and 5.14.21.

The change I made to linstor-gateway was removing the "must be offline" check located here: https://github.com/LINBIT/linstor-gateway/blob/master/pkg/nvmeof/nvmeof.go#L271-L274

		status := linstorcontrol.StatusFromResources(path, resourceDefinition, resourceGroup, resources)
		if status.Service == common.ServiceStateStarted {
			return nil, errors.New("cannot add volume while service is running")
		}

I'm not sure whether this change is related, but I wouldn't expect it to result in a panic.

zfs:

[root@ac-1f-6b-9e-e5-46 zfs]# zfs --version
zfs-2.1.4-1
zfs-kmod-2.1.4-1

drbd:

[root@ac-1f-6b-9e-e5-46 zfs]# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ 9aeb1059d37b92fec8db2b47e356c4e7fa030b64\ build\ by\ root@drbd-lsc-0\,\ 2022-06-23\ 05:01:03
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090107
DRBD_KERNEL_VERSION=9.1.7
DRBDADM_VERSION_CODE=0x091500
DRBDADM_VERSION=9.21.0

drbd-reactor:

[root@ac-1f-6b-9e-e5-46 zfs]# drbd-reactor --version
drbd-reactor 0.7.0

Kernel: Linux ac-1f-6b-9e-e5-46 5.4.205-1.el8.elrepo.x86_64 #1 SMP Tue Jul 12 10:48:44 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

[ 1758.843242] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: helper command: /sbin/drbdadm before-resync-target exit code 0
[ 1758.865053] drbd milliman: Aborting cluster-wide state change 2530163159 (31ms) rv = -19
[ 1758.873892] drbd milliman: Preparing cluster-wide state change 247777449 (1->-1 3/1)
[ 1758.899606] drbd milliman ac-1f-6b-9e-e5-46: Aborting local state change 247777449 to yield to remote state change 1249144741.
[ 1758.912455] drbd milliman: Aborting cluster-wide state change 247777449 (38ms) rv = -19
[ 1758.921189] drbd milliman: Preparing cluster-wide state change 1328687619 (1->-1 3/1)
[ 1758.929707] drbd milliman: Aborting cluster-wide state change 1328687619 (9ms) rv = -19
[ 1758.938412] drbd milliman: Preparing cluster-wide state change 2976414762 (1->-1 3/1)
[ 1758.946915] drbd milliman: Aborting cluster-wide state change 2976414762 (9ms) rv = -19
[ 1758.967206] drbd milliman ac-1f-6b-9e-e5-46: Preparing remote state change 1249144741
[ 1759.000135] drbd milliman ac-1f-6b-9e-e5-46: Committing remote state change 1249144741 (primary_nodes=1)
[ 1759.010216] drbd milliman ac-1f-6b-9e-e5-46: peer( Secondary -> Primary )
[ 1759.069750] drbd milliman/1 drbd1001: disk( Outdated -> Inconsistent )
[ 1759.076966] drbd milliman/1 drbd1001 ac-1f-6b-a5-ab-ea: resync-susp( no -> connection dependency )
[ 1759.086600] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: repl( WFBitMapT -> SyncTarget )
[ 1759.095740] drbd milliman/0 drbd1000: disk( Outdated -> Inconsistent )
[ 1759.102917] drbd milliman/0 drbd1000 ac-1f-6b-a5-ab-ea: resync-susp( no -> connection dependency )
[ 1759.112497] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: repl( WFBitMapT -> SyncTarget )
[ 1759.121267] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: Began resync as SyncTarget (will sync 5066752 KB [1266688 bits set]).
[ 1759.133258] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: Began resync as SyncTarget (will sync 32768 KB [8192 bits set]).
[ 1759.133451] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: received new current UUID: DBD3CCFBFA3D8BAF weak_nodes=FFFFFFFFFFFFFFFC
[ 1759.263009] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: received new current UUID: 7AD2E749AAAFFC69 weak_nodes=FFFFFFFFFFFFFFFC
[ 1760.028877] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: Resync done (total 1 sec; paused 0 sec; 32768 K/sec)
[ 1760.039382] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: updated UUIDs DBD3CCFBFA3D8BAE:0000000000000000:C6FAEE622D6CFFFA:0000000000000000
[ 1760.053003] drbd milliman/0 drbd1000: disk( Inconsistent -> UpToDate )
[ 1760.060112] drbd milliman/0 drbd1000 ac-1f-6b-a5-ab-ea: resync-susp( connection dependency -> no )
[ 1760.069638] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: repl( SyncTarget -> Established )
[ 1760.079754] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: helper command: /sbin/drbdadm after-resync-target
[ 1760.091674] drbd milliman/0 drbd1000 ac-1f-6b-9e-e5-46: helper command: /sbin/drbdadm after-resync-target exit code 0
[ 1814.063918] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: Resync done (total 54 sec; paused 0 sec; 93828 K/sec)
[ 1814.074464] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: updated UUIDs 7AD2E749AAAFFC68:0000000000000000:9656137FABC73162:0000000000000000
[ 1814.087919] drbd milliman/1 drbd1001: disk( Inconsistent -> UpToDate )
[ 1814.094958] drbd milliman/1 drbd1001 ac-1f-6b-a5-ab-ea: resync-susp( connection dependency -> no )
[ 1814.104426] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: repl( SyncTarget -> Established )
[ 1814.117246] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: helper command: /sbin/drbdadm after-resync-target
[ 1814.132686] drbd milliman/1 drbd1001 ac-1f-6b-9e-e5-46: helper command: /sbin/drbdadm after-resync-target exit code 0
[ 1832.336993] drbd demo0/3 drbd1005: meta-data IO uses: blk-bio
[ 1832.341362] drbd demo0/3 drbd1005: disabling discards due to peer capabilities
[ 1832.344636] drbd demo0: State change failed: In transient state, retry after next state change
[ 1832.360685] drbd demo0/3 drbd1005: Failed: disk( Diskless -> Attaching )
[ 1832.368103] drbd demo0/3 drbd1005 ac-1f-6b-9e-e5-46: self 0000000000000000:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:0
[ 1832.368109] drbd demo0: State change failed: In transient state, retry after next state change
[ 1832.382029] drbd demo0/3 drbd1005 ac-1f-6b-9e-e5-46: peer's exposed UUID: 0000000000000000
[ 1832.391198] drbd demo0/3 drbd1005: Failed: disk( Diskless -> Attaching )
[ 1832.407293] drbd demo0/3 drbd1005: disabling discards due to peer capabilities
[ 1832.415066] drbd demo0/3 drbd1005 ac-1f-6b-a5-ab-ea: self 0000000000000000:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:0
[ 1832.428722] drbd demo0/3 drbd1005 ac-1f-6b-a5-ab-ea: peer's exposed UUID: 0000000000000000
[ 1832.437526] drbd demo0/3 drbd1005 ac-1f-6b-a5-ab-ea: pdsk( DUnknown -> Diskless ) repl( Off -> Established )
[ 1832.447954] drbd demo0: State change failed: In transient state, retry after next state change
[ 1832.457104] drbd demo0/3 drbd1005: Failed: disk( Diskless -> Attaching )
[ 1832.464352] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 1832.464439] drbd demo0: State change failed: In transient state, retry after next state change
[ 1832.471880] #PF: supervisor read access in kernel mode
[ 1832.481074] drbd demo0/3 drbd1005: Failed: disk( Diskless -> Attaching )
[ 1832.486774] #PF: error_code(0x0000) - not-present page
[ 1832.499765] PGD 0 P4D 0
[ 1832.502896] Oops: 0000 [#1] SMP NOPTI
[ 1832.507123] CPU: 0 PID: 83920 Comm: drbd_r_demo0 Tainted: P           OE     5.4.205-1.el8.elrepo.x86_64 #1
[ 1832.517442] Hardware name: Supermicro SYS-1029U-TN10RT/X11DPU, BIOS 3.1 04/29/2019
[ 1832.525626] RIP: 0010:drbd_determine_dev_size+0x5a/0x520 [drbd]
[ 1832.532118] Code: 00 48 89 44 24 78 31 c0 e8 73 e1 ff ff 48 c7 c6 b0 d4 8c c0 48 89 df e8 a4 7d fe ff 48 89 44 24 08 48 85 c0 0f 84 4a 04 00 00 <49> 8b 47 10 4d 8b 77 18 48 89 04 24 41 8b 47 48 89 44 24 18 49 8b
[ 1832.551934] RSP: 0018:ffffaaaff1587d00 EFLAGS: 00010286
[ 1832.557700] RAX: ffff9c7326224000 RBX: ffff9c72d0a1e000 RCX: 0000000000000000
[ 1832.565368] RDX: 0000000000000001 RSI: ffffffffc08cd4b0 RDI: ffff9c72d0a1e000
[ 1832.573019] RBP: 0000000000000000 R08: 0000000000000332 R09: 000000000002ea40
[ 1832.580667] R10: 0000000000008905 R11: 0000000000004482 R12: 0000000000000000
[ 1832.588284] R13: 0000000000000000 R14: ffff9c734142d000 R15: 0000000000000000
[ 1832.595889] FS:  0000000000000000(0000) GS:ffff9c1380600000(0000) knlGS:0000000000000000
[ 1832.604466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1832.610682] CR2: 0000000000000010 CR3: 000000a9a340a003 CR4: 00000000007606f0
[ 1832.618287] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1832.625903] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1832.633525] PKRU: 55555554
[ 1832.636734] Call Trace:
[ 1832.639648]  ? printk+0x58/0x6f
[ 1832.643241]  receive_state+0x5f7/0x1040 [drbd]
[ 1832.648125]  ? drbd_recv+0x49/0x200 [drbd]
[ 1832.652692]  ? decode_header+0x17/0x130 [drbd]
[ 1832.657606]  ? _get_ldev_if_state.part.51+0xd0/0xd0 [drbd]
[ 1832.663555]  drbd_receiver+0x5a6/0x7f0 [drbd]
[ 1832.668351]  ? __drbd_next_peer_device_ref+0x140/0x140 [drbd]
[ 1832.674534]  drbd_thread_setup+0x5e/0x160 [drbd]
[ 1832.679594]  ? __drbd_next_peer_device_ref+0x140/0x140 [drbd]
[ 1832.685790]  kthread+0x10c/0x130
[ 1832.689467]  ? kthread_park+0x80/0x80
[ 1832.693578]  ret_from_fork+0x1f/0x40
[ 1832.697591] Modules linked in: zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) drbd_transport_tcp(OE) drbd(OE) bcache(E) crc64(E) dm_cache(E) dm_persistent_data(E) dm_bio_prison(E) dm_bufio(E) dm_writecache(E) nvme_rdma(E) nvmet_rdma(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) 8021q(E) garp(E) mrp(E) stp(E) llc(E) intel_rapl_msr(E) intel_rapl_common(E) iTCO_wdt(E) iTCO_vendor_support(E) skx_edac(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) rfkill(E) ghash_clmulni_intel(E) rapl(E) intel_cstate(E) mei_me(E) ipmi_ssif(E) sr_mod(E) cdrom(E) intel_uncore(E) pcspkr(E) sunrpc(E) sg(E) joydev(E) i2c_i801(E) lpc_ich(E) mei(E) ioatdma(E) ipmi_si(E) acpi_power_meter(E) acpi_pad(E) vfat(E) fat(E) dm_mod(E) uas(E) usb_storage(E) xfs(E) ast(E) i2c_algo_bit(E) libcrc32c(E) drm_vram_helper(E) ttm(E) nvmet_tcp(E) drm_kms_helper(E) ixgbe(E) nvmet(E)
[ 1832.697619]  syscopyarea(E) ahci(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) nvme_tcp(E) libahci(E) nvme_fabrics(E) crc32c_intel(E) drm(E) mdio(E) libata(E) dca(E) wmi(E) nvme(E) nvme_core(E) ipmi_devintf(E) ipmi_msghandler(E)
[ 1832.808915] CR2: 0000000000000010
[ 1832.812751] ---[ end trace 1dbb53d7f2280dec ]---
[ 1832.876741] RIP: 0010:drbd_determine_dev_size+0x5a/0x520 [drbd]
[ 1832.883115] Code: 00 48 89 44 24 78 31 c0 e8 73 e1 ff ff 48 c7 c6 b0 d4 8c c0 48 89 df e8 a4 7d fe ff 48 89 44 24 08 48 85 c0 0f 84 4a 04 00 00 <49> 8b 47 10 4d 8b 77 18 48 89 04 24 41 8b 47 48 89 44 24 18 49 8b
[ 1832.902799] RSP: 0018:ffffaaaff1587d00 EFLAGS: 00010286
[ 1832.908514] RAX: ffff9c7326224000 RBX: ffff9c72d0a1e000 RCX: 0000000000000000
[ 1832.916117] RDX: 0000000000000001 RSI: ffffffffc08cd4b0 RDI: ffff9c72d0a1e000
[ 1832.923724] RBP: 0000000000000000 R08: 0000000000000332 R09: 000000000002ea40
[ 1832.931319] R10: 0000000000008905 R11: 0000000000004482 R12: 0000000000000000
[ 1832.938913] R13: 0000000000000000 R14: ffff9c734142d000 R15: 0000000000000000
[ 1832.946497] FS:  0000000000000000(0000) GS:ffff9c1380600000(0000) knlGS:0000000000000000
[ 1832.955026] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1832.961234] CR2: 0000000000000010 CR3: 000000a9a340a003 CR4: 00000000007606f0
[ 1832.968843] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1832.976426] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1832.983991] PKRU: 55555554
[ 1832.987134] Kernel panic - not syncing: Fatal exception
[ 1832.992928] Kernel Offset: 0x22800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1833.063038] ---[ end Kernel panic - not syncing: Fatal exception ]---

Kernel: 5.14.21:

[  350.867400] drbd milliman/0 drbd1000 ac-1f-6b-a4-df-ee: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
[  350.877981] drbd milliman/1 drbd1001: quorum( no -> yes )
[  350.883886] drbd milliman/1 drbd1001 ac-1f-6b-a4-df-ee: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
[ 9754.631374] drbd demo0: Starting worker thread (from drbdsetup [5294])
[ 9754.641352] drbd demo0 ac-1f-6b-a4-df-ee: Starting sender thread (from drbdsetup [5302])
[ 9754.663695] drbd demo0/0 drbd1002: meta-data IO uses: blk-bio
[ 9754.670334] drbd demo0/0 drbd1002: disk( Diskless -> Attaching )
[ 9754.676960] drbd demo0/0 drbd1002: Maximum number of peer devices = 7
[ 9754.684075] drbd demo0: Method to ensure write ordering: flush
[ 9754.690514] drbd demo0/0 drbd1002: drbd_bm_resize called with capacity == 131080
[ 9754.698516] drbd demo0/0 drbd1002: resync bitmap: bits=16385 words=1799 pages=4
[ 9754.706406] drbd1002: detected capacity change from 0 to 131080
[ 9754.712915] drbd demo0/0 drbd1002: size = 64 MB (65540 KB)
[ 9754.719120] drbd demo0/0 drbd1002: recounting of set bits took additional 0ms
[ 9754.726831] drbd demo0/0 drbd1002: disk( Attaching -> UpToDate )
[ 9754.733397] drbd demo0/0 drbd1002: attached to current UUID: ECDCAF858EE6D814
[ 9754.741100] drbd demo0/0 drbd1002: size = 64 MB (65540 KB)
[ 9754.774618] drbd demo0/1 drbd1003: meta-data IO uses: blk-bio
[ 9754.781254] drbd demo0/1 drbd1003: disk( Diskless -> Attaching )
[ 9754.787826] drbd demo0/1 drbd1003: Maximum number of peer devices = 7
[ 9754.794870] drbd demo0/1 drbd1003: drbd_bm_resize called with capacity == 209715208
[ 9754.820604] drbd demo0/1 drbd1003: resync bitmap: bits=26214401 words=2867207 pages=5601
[ 9754.829279] drbd1003: detected capacity change from 0 to 209715208
[ 9754.836024] drbd demo0/1 drbd1003: size = 100 GB (104857604 KB)
[ 9754.879087] drbd demo0/1 drbd1003: recounting of set bits took additional 13ms
[ 9754.895230] drbd demo0/1 drbd1003: disk( Attaching -> UpToDate )
[ 9754.901761] drbd demo0/1 drbd1003: attached to current UUID: 6048848BCEFDCF0A
[ 9754.909484] drbd demo0/1 drbd1003: size = 100 GB (104857604 KB)
[ 9754.911097] drbd demo0 ac-1f-6b-a4-df-ee: conn( StandAlone -> Unconnected )
[ 9754.923894] drbd demo0 ac-1f-6b-a4-df-ee: Starting receiver thread (from drbd_w_demo0 [5295])
[ 9754.933155] drbd demo0 ac-1f-6b-a4-df-ee: conn( Unconnected -> Connecting )
[ 9755.446373] drbd demo0 ac-1f-6b-a4-df-ee: Handshake to peer 0 successful: Agreed network protocol version 121
[ 9755.456896] drbd demo0 ac-1f-6b-a4-df-ee: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[ 9755.470894] drbd demo0 ac-1f-6b-a4-df-ee: Peer authenticated using 20 bytes HMAC
[ 9755.478807] drbd demo0 ac-1f-6b-a4-df-ee: Starting ack_recv thread (from drbd_r_demo0 [5444])
[ 9755.521518] drbd demo0 ac-1f-6b-a4-df-ee: Preparing remote state change 1818502727
[ 9755.543252] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: drbd_sync_handshake:
[ 9755.550472] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: self ECDCAF858EE6D814:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:20
[ 9755.564013] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: peer E62158D3D1AFA478:ECDCAF858EE6D814:0000000000000000:0000000000000000 bits:16385 flags:20
[ 9755.577934] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: uuid_compare()=target-use-bitmap by rule=bitmap-peer
[ 9755.602244] drbd demo0/1 drbd1003 ac-1f-6b-a4-df-ee: drbd_sync_handshake:
[ 9755.609477] drbd demo0/1 drbd1003 ac-1f-6b-a4-df-ee: self 6048848BCEFDCF0A:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:20
[ 9755.623038] drbd demo0/1 drbd1003 ac-1f-6b-a4-df-ee: peer 6048848BCEFDCF0A:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:20
[ 9755.636634] drbd demo0/1 drbd1003 ac-1f-6b-a4-df-ee: uuid_compare()=no-sync by rule=both-off
[ 9755.683092] drbd demo0 ac-1f-6b-a4-df-ee: Committing remote state change 1818502727 (primary_nodes=0)
[ 9755.692790] drbd demo0 ac-1f-6b-a4-df-ee: conn( Connecting -> Connected ) peer( Unknown -> Secondary )
[ 9755.702545] drbd demo0/0 drbd1002: disk( UpToDate -> Outdated )
[ 9755.708922] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: pdsk( DUnknown -> UpToDate ) repl( Off -> WFBitMapT )
[ 9755.719032] drbd demo0/1 drbd1003 ac-1f-6b-a4-df-ee: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
[ 9755.738450] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 99.0%
[ 9755.760668] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 99.0%
[ 9755.782646] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: helper command: /sbin/drbdadm before-resync-target
[ 9755.800952] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: helper command: /sbin/drbdadm before-resync-target exit code 0
[ 9755.820258] drbd demo0/0 drbd1002: disk( Outdated -> Inconsistent )
[ 9755.827064] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: repl( WFBitMapT -> SyncTarget )
[ 9755.835374] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: Began resync as SyncTarget (will sync 65540 KB [16385 bits set]).
[ 9756.385396] drbd demo0 ac-1f-6b-a4-df-ee: Preparing remote state change 3350157584
[ 9756.425889] drbd demo0 ac-1f-6b-a4-df-ee: Committing remote state change 3350157584 (primary_nodes=1)
[ 9756.435640] drbd demo0 ac-1f-6b-a4-df-ee: peer( Secondary -> Primary )
[ 9758.280497] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: Resync done (total 2 sec; paused 0 sec; 32768 K/sec)
[ 9758.290632] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: updated UUIDs E62158D3D1AFA478:0000000000000000:0000000000000000:0000000000000000
[ 9758.303782] drbd demo0/0 drbd1002: disk( Inconsistent -> UpToDate )
[ 9758.310611] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: repl( SyncTarget -> Established )
[ 9758.319908] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: helper command: /sbin/drbdadm after-resync-target
[ 9758.330851] drbd demo0/0 drbd1002 ac-1f-6b-a4-df-ee: helper command: /sbin/drbdadm after-resync-target exit code 0
[ 9778.139037] drbd demo0/2 drbd1004: meta-data IO uses: blk-bio
[ 9778.145730] drbd demo0: State change failed: In transient state, retry after next state change
[ 9778.154976] drbd demo0/2 drbd1004: Failed: disk( Diskless -> Attaching )
[ 9778.162307] drbd demo0: State change failed: In transient state, retry after next state change
[ 9778.171518] drbd demo0/2 drbd1004: Failed: disk( Diskless -> Attaching )
[ 9778.472135] drbd demo0/2 drbd1004: disabling discards due to peer capabilities
[ 9778.480103] drbd demo0/2 drbd1004 ac-1f-6b-a4-df-ee: self 0000000000000000:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:0
[ 9778.493887] drbd demo0/2 drbd1004 ac-1f-6b-a4-df-ee: peer's exposed UUID: 0000000000000000
[ 9778.511106] BUG: kernel NULL pointer dereference, address: 0000000000000018
[ 9778.518618] #PF: supervisor read access in kernel mode
[ 9778.524305] #PF: error_code(0x0000) - not-present page
[ 9778.529987] PGD 0 P4D 0
[ 9778.533056] Oops: 0000 [#1] SMP NOPTI
[ 9778.537250] CPU: 1 PID: 5444 Comm: drbd_r_demo0 Tainted: P S         OE     5.14.21 #1
[ 9778.545682] Hardware name: Supermicro SYS-1029U-TN10RT/X11DPU, BIOS 3.1 04/29/2019
[ 9778.553758] RIP: 0010:drbd_determine_dev_size+0x5a/0x550 [drbd]
[ 9778.560208] Code: 00 48 89 44 24 78 31 c0 e8 13 e1 ff ff 48 c7 c6 f0 04 2c c1 48 89 df e8 14 72 fe ff 48 89 44 24 10 48 85 c0 0f 84 73 04 00 00 <49> 8b 47 18 48 89 04 24 49 8b 47 10 48 89 44 24 08 41 8b 47 48 89
[ 9778.580031] RSP: 0018:ffffb2d58151bd00 EFLAGS: 00010286
[ 9778.585772] RAX: ffff89be458a5000 RBX: ffff89be9365c000 RCX: 0000000000000000
[ 9778.593410] RDX: 0000000000000001 RSI: ffffffffc12c04f0 RDI: ffff89be9365c000
[ 9778.601035] RBP: 0000000000000000 R08: 0000000000000140 R09: 0000000000000180
[ 9778.608656] R10: 0000000000000140 R11: 0000000000004afc R12: 0000000000000000
[ 9778.616268] R13: ffff89bfb7160800 R14: 0000000000000000 R15: 0000000000000000
[ 9778.623881] FS:  0000000000000000(0000) GS:ffff89bcc0840000(0000) knlGS:0000000000000000
[ 9778.632444] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9778.638660] CR2: 0000000000000018 CR3: 0000007dc6e0a006 CR4: 00000000007706e0
[ 9778.646293] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9778.653898] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9778.661486] PKRU: 55555554
[ 9778.664647] Call Trace:
[ 9778.667539]  <TASK>
[ 9778.670062]  ? vprintk_emit+0x128/0x270
[ 9778.674333]  ? printk+0x58/0x6f
[ 9778.677900]  receive_state+0x5f5/0x1080 [drbd]
[ 9778.682779]  ? receive_uuids110+0x570/0x570 [drbd]
[ 9778.687996]  ? drbd_recv+0x46/0x220 [drbd]
[ 9778.692510]  ? decode_header+0x17/0x140 [drbd]
[ 9778.697368]  ? receive_uuids110+0x570/0x570 [drbd]
[ 9778.702565]  drbd_receiver+0x598/0x830 [drbd]
[ 9778.707327]  drbd_thread_setup+0x76/0x1b0 [drbd]
[ 9778.712347]  ? __drbd_next_peer_device_ref+0x1a0/0x1a0 [drbd]
[ 9778.718485]  kthread+0x118/0x140
[ 9778.722092]  ? set_kthread_struct+0x40/0x40
[ 9778.726649]  ret_from_fork+0x1f/0x30
[ 9778.730601]  </TASK>
[ 9778.733164] Modules linked in: drbd_transport_tcp(OE) drbd(OE) bcache crc64 dm_cache dm_persistent_data dm_bio_prison dm_bufio dm_writecache nvme_rdma nvmet_rdma rdma_cm iw_cm ib_cm ib_core dm_mod 8021q garp mrp stp llc rfkill sunrpc intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate ipmi_ssif vfat mei_me fat i2c_i801 intel_uncore joydev pcspkr mei acpi_ipmi ioatdma i2c_smbus lpc_ich ipmi_si acpi_power_meter acpi_pad binfmt_misc zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) nvmet_tcp nvmet nvme_tcp nvme_fabrics xfs libcrc32c sd_mod sg ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfilblt fb_sys_fops drm_ttm_helper ttm drm ixgbe ahci libahci nvme uas nvme_core libata crc32c_intel usb_storage mdio t10_pi dca wmi ipmi_devintf ipmi_msghandler
[ 9778.823619] CR2: 0000000000000018
[ 9778.827377] ---[ end trace 09a2a2ea66dcaf4b ]---
[ 9778.895569] RIP: 0010:drbd_determine_dev_size+0x5a/0x550 [drbd]
[ 9778.901918] Code: 00 48 89 44 24 78 31 c0 e8 13 e1 ff ff 48 c7 c6 f0 04 2c c1 48 89 df e8 14 72 fe ff 48 89 44 24 10 48 85 c0 0f 84 73 04 00 00 <49> 8b 47 18 48 89 04 24 49 8b 47 10 48 89 44 24 08 41 8b 47 48 89
[ 9778.921526] RSP: 0018:ffffb2d58151bd00 EFLAGS: 00010286
[ 9778.927182] RAX: ffff89be458a5000 RBX: ffff89be9365c000 RCX: 0000000000000000
[ 9778.934752] RDX: 0000000000000001 RSI: ffffffffc12c04f0 RDI: ffff89be9365c000
[ 9778.942314] RBP: 0000000000000000 R08: 0000000000000140 R09: 0000000000000180
[ 9778.949877] R10: 0000000000000140 R11: 0000000000004afc R12: 0000000000000000
[ 9778.957422] R13: ffff89bfb7160800 R14: 0000000000000000 R15: 0000000000000000
[ 9778.964962] FS:  0000000000000000(0000) GS:ffff89bcc0840000(0000) knlGS:0000000000000000
[ 9778.973455] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9778.979608] CR2: 0000000000000018 CR3: 0000007dc6e0a006 CR4: 00000000007706e0
[ 9778.987151] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9778.994689] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9779.002226] PKRU: 55555554
[ 9779.005349] Kernel panic - not syncing: Fatal exception
[ 9779.011059] Kernel Offset: 0x36600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 9779.064919] ---[ end Kernel panic - not syncing: Fatal exception ]---
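
Both traces fault at the same place, drbd_determine_dev_size+0x5a, with R15 = 0 and a fault address (CR2) of 0x10 on 5.4.205 and 0x18 on 5.14.21. That pattern suggests a read of a struct field through a NULL pointer held in R15. The sketch below is a minimal Python helper, not part of DRBD or any kernel tooling, that decodes only the marked instruction from the "Code:" line to check this reading. It assumes the instruction has the form REX.W + 8B /r with an 8-bit displacement and no SIB byte, which is what both dumps show; the name decode_faulting_load is made up for illustration.

```python
import re

# 64-bit general-purpose registers in x86-64 encoding order.
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_faulting_load(code_line):
    """Decode the instruction marked <..> in an Oops "Code:" line.

    Only handles `REX.W 8B /r` (mov r64, [base + disp8]) without a SIB
    byte. Anything else raises ValueError. This is an illustrative
    sketch, not a general disassembler.
    """
    m = re.search(r"<([0-9a-f]{2})>((?: [0-9a-f]{2})+)", code_line)
    if not m:
        raise ValueError("no <..> marker found")
    raw = [int(m.group(1), 16)] + [int(b, 16) for b in m.group(2).split()]
    if len(raw) < 4:
        raise ValueError("not enough bytes after the marker")
    rex, opcode, modrm, disp = raw[:4]

    # REX prefix is 0100WRXB; require W=1 (64-bit operand) and opcode 8B.
    if (rex & 0xF0) != 0x40 or not (rex & 0x08) or opcode != 0x8B:
        raise ValueError("unsupported instruction form")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:  # need disp8 addressing, no SIB byte
        raise ValueError("unsupported addressing mode")

    dest = REGS64[reg | ((rex & 0x04) << 1)]  # REX.R extends the reg field
    base = REGS64[rm | ((rex & 0x01) << 3)]   # REX.B extends the rm field
    if disp >= 0x80:                          # disp8 is sign-extended
        disp -= 0x100
    return dest, base, disp


# Tail of the "Code:" lines copied from the two Oops dumps above.
CODE_5_4 = "48 85 c0 0f 84 4a 04 00 00 <49> 8b 47 10 4d 8b 77 18 48 89 04 24"
CODE_5_14 = "48 85 c0 0f 84 73 04 00 00 <49> 8b 47 18 48 89 04 24 49 8b 47 10"

for label, code in (("5.4.205", CODE_5_4), ("5.14.21", CODE_5_14)):
    dest, base, disp = decode_faulting_load(code)
    print(f"{label}: mov {dest}, [{base}+{disp:#x}]")
```

For the two dumps this gives a load from r15+0x10 and r15+0x18 respectively, which matches the reported fault addresses when R15 is zero. For anything beyond this one instruction form, a real disassembler such as objdump or the kernel's scripts/decodecode is the better tool.
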
@JoelColledge
Contributor

Thanks for the report. The call from receive_state to drbd_determine_dev_size is new in drbd-9.1.7, so it looks like that introduced a bug. I'll look into it.
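
The call traces in the report are consistent with this. In an Oops, frames prefixed with "?" are addresses the unwinder found on the stack but could not confirm as part of the call chain, so the first unmarked frame below the faulting RIP is most likely the direct caller. The following is a small Python sketch for pulling the confirmed frames out of a pasted trace; the name reliable_frames and the regular expression are illustrative assumptions, and the sample lines are copied from the 5.4.205 trace in the report.

```python
import re

# One stack-trace line: "[ time]  [? ]symbol+0xoff/0xsize [module]"
FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[\w+\])?\s*$"
)


def reliable_frames(lines):
    """Return symbol names of frames that are not marked with '?'."""
    frames = []
    for line in lines:
        m = FRAME.match(line)
        if m and m.group(1) is None:
            frames.append(m.group(2))
    return frames


# Call trace lines copied from the 5.4.205 Oops in the report.
TRACE_5_4 = """\
[ 1832.639648]  ? printk+0x58/0x6f
[ 1832.643241]  receive_state+0x5f7/0x1040 [drbd]
[ 1832.648125]  ? drbd_recv+0x49/0x200 [drbd]
[ 1832.652692]  ? decode_header+0x17/0x130 [drbd]
[ 1832.657606]  ? _get_ldev_if_state.part.51+0xd0/0xd0 [drbd]
[ 1832.663555]  drbd_receiver+0x5a6/0x7f0 [drbd]
[ 1832.668351]  ? __drbd_next_peer_device_ref+0x140/0x140 [drbd]
[ 1832.674534]  drbd_thread_setup+0x5e/0x160 [drbd]
[ 1832.679594]  ? __drbd_next_peer_device_ref+0x140/0x140 [drbd]
[ 1832.685790]  kthread+0x10c/0x130
[ 1832.689467]  ? kthread_park+0x80/0x80
[ 1832.693578]  ret_from_fork+0x1f/0x40
""".splitlines()

print(" <- ".join(reliable_frames(TRACE_5_4)))
```

With the faulting function from the RIP line placed in front, the confirmed chain reads drbd_determine_dev_size, called from receive_state, called from drbd_receiver.
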

@JoelColledge
Contributor

Fixed by 83cd5b8.
