dbuf_create() VERIFY(dn->dn_type != DMU_OT_NONE) failed #5396

Closed
behlendorf opened this issue Nov 14, 2016 · 0 comments
@behlendorf (Contributor)

Observed while testing with roughly 100 concurrent dbench threads.

VERIFY(dn->dn_type != DMU_OT_NONE) failed
PANIC at dbuf.c:2347:dbuf_create()
Showing stack for process 104709
CPU: 10 PID: 104709 Comm: dbench Tainted: P           OE  ------------   3.10.0-510.0.0.2chaos+2.g956a12a.2016111653.ch6.x86_64 #1
Hardware name: Intel Corporation S2600WTTR/S2600WTTR, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
 ffffffffa0ce8958 00000000339ea79b ffff8815641afa48 ffffffff8169c1d5
 ffff8815641afa58 ffffffffa0560de4 ffff8815641afbe0 ffffffffa0560eaf
 ffff881cfc164080 ffff880600000028 ffff8815641afbf0 ffff8815641afb90
Call Trace:
 [<ffffffff8169c1d5>] dump_stack+0x19/0x1b
 [<ffffffffa0560de4>] spl_dumpstack+0x44/0x50 [spl]
 [<ffffffffa0560eaf>] spl_panic+0xbf/0xf0 [spl]
 [<ffffffff816aedb1>] ? ftrace_call+0x5/0x2f
 [<ffffffffa0bf4e46>] ? refcount_remove+0x16/0x20 [zfs]
 [<ffffffffa0b973d8>] ? dbuf_rele_and_unlock+0x1f8/0x5c0 [zfs]
 [<ffffffff816aedb1>] ? ftrace_call+0x5/0x2f
 [<ffffffffa0b9683e>] dbuf_create+0x7ae/0x7e0 [zfs]
 [<ffffffff816aedb1>] ? ftrace_call+0x5/0x2f
 [<ffffffff816aedb1>] ? ftrace_call+0x5/0x2f
 [<ffffffffa0bc038f>] ? dnode_hold_impl+0x47f/0x9d0 [zfs]
 [<ffffffffa0b968ae>] dbuf_create_bonus+0x3e/0xa0 [zfs]
 [<ffffffffa0ba48f0>] dmu_bonus_hold+0x1b0/0x280 [zfs]
 [<ffffffffa0bf6e4e>] sa_buf_hold+0xe/0x10 [zfs]
 [<ffffffffa0c6f349>] zfs_zget+0x119/0x2b0 [zfs]
 [<ffffffffa0c690d3>] zfs_get_data+0x73/0x500 [zfs]
 [<ffffffffa0c758fc>] zil_commit_writer+0x62c/0xae0 [zfs]
 [<ffffffffa0c75e4d>] zil_commit.part.10+0x9d/0x130 [zfs]
 [<ffffffffa0c75ef7>] zil_commit+0x17/0x20 [zfs]
 [<ffffffffa0c65d6a>] zfs_fsync+0x7a/0xf0 [zfs]
 [<ffffffffa0c892a7>] zpl_fsync+0x87/0x120 [zfs]
 [<ffffffff81239b05>] do_fsync+0x65/0xa0
 [<ffffffff81239dd0>] SyS_fsync+0x10/0x20
 [<ffffffff816ad089>] system_call_fastpath+0x16/0x1b
INFO: task dbench:71980 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dbench          D ffff882010b5e028     0 71980  71908 0x00000080
 ffff8803fee93e00 0000000000000086 ffff8803fee93fd8 ffff8810306c9f60
 ffff8803fee93fd8 ffff8803fee93fd8 ffff8810306c9f60 ffff882010b5e1c0
 ffff882010b5e000 ffff882010b5e1c8 0000000000000000 ffff882010b5e028
Call Trace:
 [<ffffffff816a1b39>] schedule+0x29/0x70
 [<ffffffffa056362d>] cv_wait_common+0x14d/0x2a0 [spl]
 [<ffffffff810b4640>] ? wake_up_atomic_t+0x30/0x30
 [<ffffffffa0563795>] __cv_wait+0x15/0x20 [spl]
 [<ffffffffa0c75e1b>] zil_commit.part.10+0x6b/0x130 [zfs]
 [<ffffffffa0c75ef7>] zil_commit+0x17/0x20 [zfs]
 [<ffffffffa0c65d6a>] zfs_fsync+0x7a/0xf0 [zfs]
 [<ffffffffa0c892a7>] zpl_fsync+0x87/0x120 [zfs]
 [<ffffffff81239b05>] do_fsync+0x65/0xa0
 [<ffffffff81239dd0>] SyS_fsync+0x10/0x20
 [<ffffffff816ad089>] system_call_fastpath+0x16/0x1b
behlendorf added this to the 0.8.0 milestone Jul 25, 2017
behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 25, 2017
As part of commit 50c957f this check was pulled up before the
call to dnode_create().  This is racy since the dnode_phys_t
in the dbuf could be updated after the check passed but before
the dnode is created by dnode_create().  Close the race by adding the
original check back to detect this unlikely case.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 25, 2017
As part of commit 50c957f this check was pulled up before the
call to dnode_create().  This is racy since the dnode_phys_t
in the dbuf could be updated after the check passed but before
the dnode is created by dnode_create().  Close the race by adding the
original check back to detect this unlikely case.

TEST_ZFSSTRESS_RUNTIME=7200

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 25, 2017
As part of commit 50c957f this check was pulled up before the
call to dnode_create().  This is racy since the dnode_phys_t
in the dbuf could be updated after the check passed but before
the dnode is created by dnode_create().  Close the race by adding the
original check back to detect this unlikely case.

TEST_XFSTESTS_SKIP="yes"
TEST_ZFSSTRESS_RUNTIME=3000

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
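
For reference, below is a minimal sketch of the double-check pattern those three commits describe. The wrapper function, its signature, and the ENOENT handling are illustrative assumptions rather than the literal patch; dnode_create(), dnode_phys_t, and DMU_OT_NONE are the real identifiers involved, and the actual change lives in the dnode hold path.

```c
/*
 * Illustrative sketch only: the wrapper below is hypothetical, but it
 * shows the double-check pattern described in the commit messages above.
 * The dnode_phys_t is tested before dnode_create(), and the type is then
 * verified again on the dnode_t that was actually created, because the
 * phys data may have been updated in between (the unlikely race that
 * otherwise trips the VERIFY in dbuf_create()).
 */
#include <sys/dnode.h>
#include <sys/dbuf.h>

static int
hold_allocated_slot_sketch(objset_t *os, dnode_phys_t *dnp,
    dmu_buf_impl_t *db, uint64_t object, dnode_handle_t *dnh,
    dnode_t **dnpp)
{
	dnode_t *dn;

	/* Early check: the slot must look allocated. */
	if (dnp->dn_type == DMU_OT_NONE)
		return (SET_ERROR(ENOENT));

	dn = dnode_create(os, dnp, db, object, dnh);

	/*
	 * Re-check on the created dnode_t: another thread may have freed
	 * the slot after the check above, so detect it here and return
	 * ENOENT instead of letting dbuf_create() panic later.
	 */
	if (dn->dn_type == DMU_OT_NONE)
		return (SET_ERROR(ENOENT));

	*dnpp = dn;
	return (0);
}
```

The key point is that the authoritative check is made on the dnode_t produced by dnode_create(), not only on the raw dnode_phys_t read earlier.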
FransUrbo pushed a commit to FransUrbo/zfs that referenced this issue Apr 28, 2019
Refactor dmu_object_alloc_dnsize() and dnode_hold_impl() to simplify the
code, fix errors introduced by commit dbeb879 (PR openzfs#6117) interacting
badly with large dnodes, and improve performance.

* When allocating a new dnode in dmu_object_alloc_dnsize(), update the
percpu object ID for the core's metadnode chunk immediately.  This
eliminates most lock contention when taking the hold and creating the
dnode.

* Correct detection of the chunk boundary to work properly with large
dnodes.

* Separate the dnode_hold_impl() code for the FREE case from the code for
the ALLOCATED case to make it easier to read.

* Fully populate the dnode handle array immediately after reading a
block of the metadnode from disk.  Subsequently the dnode handle array
provides enough information to determine which dnode slots are in use
and which are free.

* Add several kstats to allow the behavior of the code to be examined.

* Verify dnode packing in large_dnode_008_pos.ksh.  Since the test performs
only creates, it should leave very few holes in the metadnode.

* Add test large_dnode_009_pos.ksh, which performs concurrent creates
and deletes, to complement the existing test, which performs only creates.

With the above fixes, there is very little contention in a test of about
200,000 racing dnode allocations produced by tests 'large_dnode_008_pos'
and 'large_dnode_009_pos'.

name                            type data
dnode_hold_dbuf_hold            4    0
dnode_hold_dbuf_read            4    0
dnode_hold_alloc_hits           4    3804690
dnode_hold_alloc_misses         4    216
dnode_hold_alloc_interior       4    3
dnode_hold_alloc_lock_retry     4    0
dnode_hold_alloc_lock_misses    4    0
dnode_hold_alloc_type_none      4    0
dnode_hold_free_hits            4    203105
dnode_hold_free_misses          4    4
dnode_hold_free_lock_misses     4    0
dnode_hold_free_lock_retry      4    0
dnode_hold_free_overflow        4    0
dnode_hold_free_refcount        4    57
dnode_hold_free_txg             4    0
dnode_allocate                  4    203154
dnode_reallocate                4    0
dnode_buf_evict                 4    23918
dnode_alloc_next_chunk          4    4887
dnode_alloc_race                4    0
dnode_alloc_next_block          4    18

The performance is slightly improved for concurrent creates with
16+ threads, and unchanged for low thread counts.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes openzfs#5396 
Closes openzfs#6522 
Closes openzfs#6414 
Closes openzfs#6564
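
As an aside, the "percpu object ID" scheme mentioned in the first bullet of the commit above can be sketched roughly as follows. All type, field, and function names here are hypothetical; only CPU_SEQID, atomic_add_64_nv(), and the kmutex_t primitives are existing SPL/ZFS interfaces, and the real logic lives in dmu_object_alloc_dnsize().

```c
/*
 * Rough sketch (not the OpenZFS implementation) of per-CPU object-ID
 * cursors: each CPU normally allocates object IDs from its own chunk of
 * the metadnode, so concurrent allocators rarely touch shared state.
 */
#include <sys/zfs_context.h>

typedef struct obj_cursor {
	kmutex_t	oc_lock;	/* protects this cursor; normally uncontended */
	uint64_t	oc_next;	/* next object ID in this cursor's chunk */
	uint64_t	oc_end;		/* first object ID past the chunk */
} obj_cursor_t;

static uint64_t
alloc_object_sketch(obj_cursor_t *cursors, int ncursors,
    uint64_t chunk_size, uint64_t *global_next)
{
	/* Hash the current CPU to a cursor. */
	obj_cursor_t *oc = &cursors[CPU_SEQID % ncursors];
	uint64_t obj;

	mutex_enter(&oc->oc_lock);
	if (oc->oc_next >= oc->oc_end) {
		/*
		 * This cursor's chunk is exhausted: atomically claim a new
		 * chunk from the shared counter.  This is the only point
		 * where different CPUs contend.
		 */
		oc->oc_next = atomic_add_64_nv(global_next, chunk_size) -
		    chunk_size;
		oc->oc_end = oc->oc_next + chunk_size;
	}
	obj = oc->oc_next++;
	mutex_exit(&oc->oc_lock);

	return (obj);
}
```

Because each CPU usually stays within its own chunk, the shared counter is only touched when a chunk runs out, which is the effect the commit describes as eliminating most lock contention when taking the hold and creating the dnode.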