
Null pointer dereference in dva_get_dsize_sync #1707

Closed
lutorm opened this issue Sep 9, 2013 · 6 comments
@lutorm

lutorm commented Sep 9, 2013

For some reason, my system has begun to see a zfs lockup. It happens within a minute under a particular workload. I've tried kernel 3.4.0 with zfs 0.6.1 and 0.6.2, and kernel 3.8.0 with zfs 0.6.2, and the result seems to be the same. The dmesg output is below.

Any ideas? I don't understand why it just started; I've been using it pretty stably for months.

cheers,

/Patrik

[ 1280.984093] BUG: unable to handle kernel NULL pointer dereference at 0000000000000560
[ 1280.984100] IP: [] dva_get_dsize_sync+0x4d/0x60 [zfs]
[ 1280.984136] PGD 5c01c8067 PUD 5c0237067 PMD 0
[ 1280.984139] Oops: 0000 [#1] SMP
[ 1280.984141] Modules linked in: zfs(POF) zcommon(POF) znvpair(POF) zavl(POF) zunicode(POF) spl(OF) zlib_deflate(F) openvswitch(F) autofs4(F) snd_hda_codec_hdmi(F) nvidia(POF) coretemp(F) kvm_intel(F) kvm(F) ghash_clmulni_intel(F) aesni_intel(F) ablk_helper(F) cryptd(F) lrw(F) bnep(F) aes_x86_64(F) xts(F) gf128mul(F) rfcomm(F) gpio_ich(F) snd_hda_codec_realtek(F) bluetooth(F) hp_wmi(F) sparse_keymap(F) snd_hda_intel(F) snd_hda_codec(F) snd_hwdep(F) snd_pcm(F) psmouse(F) microcode(F) serio_raw(F) snd_seq_midi(F) parport_pc(F) snd_rawmidi(F) ppdev(F) snd_seq_midi_event(F) snd_seq(F) snd_timer(F) snd_seq_device(F) lpc_ich(F) i7core_edac(F) wmi(F) snd(F) edac_core(F) soundcore(F) snd_page_alloc(F) mac_hid(F) nfsd(F) nfs_acl(F) auth_rpcgss(F) nfs(F) fscache(F) lockd(F) lp(F) parport(F) sunrpc(F) hid_generic(F) hid_logitech_dj(F) usbhid(F) hid(F) raid10(F) raid456(F) async_pq(F) async_xor(F) xor(F) async_memcpy(F) async_raid6_recov(F) firewire_ohci(F) firewire_core(F) crc_itu_t(F) mptsas(F) mptscsih(F) mptbase(F) scsi_transport_sas(F) ahci(F) libahci(F) floppy(F) tg3(F) ptp(F) pps_core(F) raid6_pq(F) async_tx(F) raid1(F) raid0(F) multipath(F) linear(F)
[ 1280.984180] CPU 6
[ 1280.984183] Pid: 14252, comm: txg_sync Tainted: PF IO 3.8.0-30-generic #44~precise1-Ubuntu Hewlett-Packard HP Z800 Workstation/0AECh
[ 1280.984185] RIP: 0010:[] [] dva_get_dsize_sync+0x4d/0x60 [zfs]
[ 1280.984202] RSP: 0000:ffff8805fd2a5788 EFLAGS: 00010293
[ 1280.984203] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000002
[ 1280.984204] RDX: ffff880306f1e000 RSI: 0000000000000004 RDI: ffff8805e8e9b000
[ 1280.984205] RBP: ffff8805fd2a5798 R08: 02d354c6c0916998 R09: ffff880571df5a00
[ 1280.984206] R10: ffffffffa04b58cb R11: 0000000000000000 R12: ffffc9001adbee00
[ 1280.984207] R13: 0000000000000bf8 R14: 0000000000000003 R15: ffff8805c8cc8000
[ 1280.984208] FS: 0000000000000000(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000
[ 1280.984209] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1280.984210] CR2: 0000000000000560 CR3: 00000005bff07000 CR4: 00000000000007e0
[ 1280.984211] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1280.984212] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1280.984214] Process txg_sync (pid: 14252, threadinfo ffff8805fd2a4000, task ffff8805de885d00)
[ 1280.984214] Stack:
[ 1280.984215] 0000000000000000 ffff8805e8e9b000 ffff8805fd2a57c8 ffffffffa0634255
[ 1280.984218] ffff880579bf4cc0 ffffc9001adbee00 ffff8802b38cde80 ffff880571df5a00
[ 1280.984220] ffff8805fd2a5848 ffffffffa05e5d45 00000000b434f1c0 ffff880200000000
[ 1280.984222] Call Trace:
[ 1280.984240] [] bp_get_dsize+0x55/0x90 [zfs]
[ 1280.984250] [] dbuf_dirty+0x705/0x810 [zfs]
[ 1280.984259] [] dmu_buf_will_dirty+0xaf/0x100 [zfs]
[ 1280.984276] [] zap_get_leaf_byblk+0x111/0x2d0 [zfs]
[ 1280.984293] [] ? zap_idx_to_blk+0xdf/0x140 [zfs]
[ 1280.984301] [] ? kmem_alloc_debug+0x138/0x3b0 [spl]
[ 1280.984304] [] ? kmem_alloc_debug+0x138/0x3b0 [spl]
[ 1280.984321] [] zap_deref_leaf+0x6d/0x80 [zfs]
[ 1280.984338] [] fzap_update+0xad/0x170 [zfs]
[ 1280.984354] [] zap_update_uint64+0x9b/0xe0 [zfs]
[ 1280.984364] [] ddt_zap_update+0x81/0xb0 [zfs]
[ 1280.984369] [] ? kfree+0x3b/0x150
[ 1280.984385] [] ? dva_get_dsize_sync+0x4d/0x60 [zfs]
[ 1280.984395] [] ? ddt_stat_update+0x129/0x1b0 [zfs]
[ 1280.984405] [] ddt_sync+0x237/0x6b0 [zfs]
[ 1280.984411] [] ? __wake_up+0x53/0x70
[ 1280.984427] [] spa_sync+0x46e/0xae0 [zfs]
[ 1280.984431] [] ? ktime_get_ts+0x4c/0xe0
[ 1280.984447] [] txg_sync_thread+0x2df/0x540 [zfs]
[ 1280.984463] [] ? txg_init+0x250/0x250 [zfs]
[ 1280.984467] [] thread_generic_wrapper+0x78/0x90 [spl]
[ 1280.984471] [] ? __thread_create+0x310/0x310 [spl]
[ 1280.984474] [] kthread+0xc0/0xd0
[ 1280.984476] [] ? flush_kthread_worker+0xb0/0xb0
[ 1280.984480] [] ret_from_fork+0x7c/0xb0
[ 1280.984482] [] ? flush_kthread_worker+0xb0/0xb0
[ 1280.984483] Code: 74 0d 48 83 bf 28 09 00 00 00 48 89 c3 75 0f 48 83 c4 08 48 89 d8 5b 5d c3 0f 1f 44 00 00 48 c1 ee 20 48 c1 eb 09 e8 b3 5d 00 00 <48> 0f af 98 60 05 00 00 48 83 c4 08 48 89 d8 5b 5d c3 90 66 66
[ 1280.984501] RIP [] dva_get_dsize_sync+0x4d/0x60 [zfs]
[ 1280.984517] RSP
[ 1280.984518] CR2: 0000000000000560
[ 1280.984520] ---[ end trace d34fc7027f3e38a3 ]---

@lutorm
Author

lutorm commented Sep 10, 2013

This appears to be related to dedup. I tried changing the PREEMPT settings as suggested in another hang report, but it had no effect. Turning off dedup, however, seems to have made the problem go away.

@behlendorf
Contributor

That would be consistent with the stack above. It appears something is wrong with the ZAP used for deduplication.

@lutorm
Author

lutorm commented Sep 27, 2013

Well, it certainly seems to have gotten a lot better without the dedup, but now it has happened again, with a slightly different stack trace:

[1457404.290472] CPU 4
[1457404.290475] Pid: 13235, comm: as Tainted: PF O 3.9.0-0-generic #1~precise2 Hewlett-Packard HP Z800 Workstation/0AECh
[1457404.290477] RIP: 0010:[] [] vdev_readable+0x7/0x40 [zfs]
[1457404.290494] RSP: 0018:ffff88055997d600 EFLAGS: 00010246
[1457404.290495] RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff8801d90a8de0
[1457404.290496] RDX: 0000000000000040 RSI: ffff8802fc6e6f80 RDI: 0000000000000000
[1457404.290497] RBP: ffff88055997d638 R08: 000000000ed5b291 R09: 000000006ad94887
[1457404.290498] R10: 00000000ec0780d5 R11: 00000000adfd4793 R12: ffff8802fc6e6fd0
[1457404.290499] R13: 0000000000000000 R14: ffff8802fc6e6f80 R15: 000000000070939f
[1457404.290500] FS: 00002aaaaaaf3280(0000) GS:ffff880313c80000(0000) knlGS:0000000000000000
[1457404.290501] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1457404.290502] CR2: 0000000000000040 CR3: 00000001f4b74000 CR4: 00000000000027e0
[1457404.290503] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1457404.290504] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1457404.290505] Process as (pid: 13235, threadinfo ffff88055997c000, task ffff8804749a9750)
[1457404.290506] Stack:
[1457404.290507] ffffffffa02d0142 ffff88055997d638 ffff8801d90a8d60 ffff8802fc6e6f80
[1457404.290509] ffff8801d90a8dc0 ffff880607c82000 0000000000000003 ffff88055997d6c8
[1457404.290511] ffffffffa02d07dc ffff88055997d668 ffffffff8170382d ffff88055997d678
[1457404.290514] Call Trace:
[1457404.290532] [] ? vdev_mirror_child_select.isra.1+0x62/0x110 [zfs]
[1457404.290549] [] vdev_mirror_io_start+0x38c/0x4e0 [zfs]
[1457404.290552] [] ? mutex_lock+0x1d/0x50
[1457404.290554] [] ? mutex_lock+0x1d/0x50
[1457404.290573] [] zio_vdev_io_start+0x237/0x2f0 [zfs]
[1457404.290590] [] zio_nowait+0xba/0x190 [zfs]
[1457404.290599] [] arc_read+0x573/0x860 [zfs]
[1457404.290611] [] ? ddt_zap_update+0xb0/0xb0 [zfs]
[1457404.290621] [] dbuf_prefetch+0x1b0/0x260 [zfs]
[1457404.290632] [] ? ddt_zap_update+0xb0/0xb0 [zfs]
[1457404.290643] [] dmu_prefetch+0x289/0x2d0 [zfs]
[1457404.290654] [] ? ddt_zap_update+0xb0/0xb0 [zfs]
[1457404.290671] [] fzap_prefetch+0x5d/0x70 [zfs]
[1457404.290688] [] zap_prefetch_uint64+0x76/0xa0 [zfs]
[1457404.290699] [] ddt_zap_prefetch+0x13/0x20 [zfs]
[1457404.290710] [] ddt_prefetch+0xbc/0xc0 [zfs]
[1457404.290713] [] ? trace_save_cmdline+0x7a/0xf0
[1457404.290717] [] ? read_tsc+0x9/0x20
[1457404.290719] [] ? default_spin_lock_flags+0x9/0x10
[1457404.290722] [] ? __kmalloc+0x14c/0x190
[1457404.290724] [] ? mutex_lock+0x1d/0x50
[1457404.290733] [] ? add_reference.isra.7+0xe6/0x120 [zfs]
[1457404.290736] [] ? mutex_lock+0x1d/0x50
[1457404.290746] [] ? dbuf_read+0x5cb/0x860 [zfs]
[1457404.290748] [] ? mutex_lock+0x1d/0x50
[1457404.290750] [] ? mutex_lock+0x1d/0x50
[1457404.290766] [] ? spa_config_exit+0x7d/0xc0 [zfs]
[1457404.290780] [] dsl_dataset_block_freeable+0x40/0x60 [zfs]
[1457404.290793] [] dmu_tx_hold_free+0x44c/0x8e0 [zfs]
[1457404.290804] [] dmu_free_long_range_impl+0x175/0x260 [zfs]
[1457404.290815] [] dmu_free_long_range+0x4b/0x70 [zfs]
[1457404.290832] [] zfs_rmnode+0x60/0x330 [zfs]
[1457404.290849] [] zfs_zinactive+0x84/0xd0 [zfs]
[1457404.290866] [] zfs_inactive+0x7c/0x230 [zfs]
[1457404.290869] [] ? truncate_pagecache+0x5e/0x70
[1457404.290886] [] zpl_evict_inode+0x28/0x30 [zfs]
[1457404.290889] [] evict+0xc0/0x1d0
[1457404.290892] [] iput_final+0xe2/0x170
[1457404.290894] [] iput+0x3e/0x50
[1457404.290896] [] do_unlinkat+0x201/0x270
[1457404.290899] [] ? sys_newlstat+0x2a/0x40
[1457404.290901] [] ? do_page_fault+0x39/0x50
[1457404.290903] [] sys_unlink+0x16/0x20
[1457404.290906] [] system_call_fastpath+0x1a/0x1f
[1457404.290907] Code: 01 00 00 00 48 89 e5 76 17 48 83 bf 70 05 00 00 00 75 0d 31 c0 48 81 7f 50 c0 dd 31 a0 0f 94 c0 5d c3 66 90 66 66 66 66 90 31 c0 <48> 83 7f 40 05 55 48 89 e5 76 20 48 83 bf 70 05 00 00 00 75 16
[1457404.290924] RIP [] vdev_readable+0x7/0x40 [zfs]
[1457404.290940] RSP
[1457404.290941] CR2: 0000000000000040
[1457404.290949] ---[ end trace f236b950474b59de ]---

@lutorm
Author

lutorm commented Sep 27, 2013

Actually, after that error I'm now getting this same error whenever it tries to mount that volume. Does this mean my data are gone or is there something I can try?

@lutorm
Author

lutorm commented Sep 27, 2013

I tried to scrub the pool and it got to about 16% before it hung again with the same error; zpool status now also reported errors detected in the metadata. I ended up blowing the volume away.

@behlendorf
Contributor

@lutorm The latest stacks suggest a problem with the existing dedup tables on disk. Unfortunately, destroying the pool and recreating it without deduplication is the quickest way to entirely resolve the issue.

behlendorf added a commit to behlendorf/zfs that referenced this issue May 5, 2014
The dva_get_dsize_sync() function incorrectly assumes that the call
to vdev_lookup_top() cannot fail.  However, the NULL dereference below
clearly shows that under certain circumstances it is possible.  Note
that offset 0x570 (1392) maps as expected to vd->vdev_deflate_ratio.

  BUG: unable to handle kernel NULL pointer dereference at 00000570

  crash> struct -o vdev
  struct vdev {
       [0] uint64_t vdev_id;
       ... ...
    [1392] uint64_t vdev_deflate_ratio;

Given that this can happen, this patch adds the required error handling.
In the case where vdev_lookup_top() fails, assume that no deflation
will occur for the DVA and use the asize.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#1707
Issue openzfs#1987
Issue openzfs#1891
@behlendorf behlendorf modified the milestones: 0.6.3, 0.6.4 May 6, 2014