
VERIFY(dn->dn_type != DMU_OT_NONE) failed #6522

Closed
AndCycle opened this issue Aug 16, 2017 · 3 comments
@AndCycle

System information

Type Version/Name
Distribution Name Gentoo
Linux Kernel 4.12.5-gentoo
Architecture x86_64
ZFS Version 0.7.1-r0-gentoo
SPL Version 0.7.1-r0-gentoo

Describe the problem you're observing

The system stopped working after this panic.

Describe how to reproduce the problem

It happened after running the new ZFS 0.7.1 on my little home server for about a day.

Include any warning/errors/backtraces from the system logs

[114432.726444] VERIFY(dn->dn_type != DMU_OT_NONE) failed
[114432.727750] PANIC at dbuf.c:2308:dbuf_create()
[114432.729099] Showing stack for process 28557
[114432.729102] CPU: 3 PID: 28557 Comm: mongod Tainted: P        W  O    4.12.5-gentoo #1
[114432.729103] Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.03.03.0003.121820151104 12/18/2015
[114432.729103] Call Trace:
[114432.729110]  dump_stack+0x4d/0x67
[114432.729115]  spl_dumpstack+0x3d/0x40 [spl]
[114432.729117]  spl_panic+0xc3/0x110 [spl]
[114432.729120]  ? getrawmonotonic64+0x82/0xc0
[114432.729124]  ? mutex_lock+0xd/0x30
[114432.729155]  ? refcount_remove_many+0x1e5/0x2d0 [zfs]
[114432.729171]  ? refcount_remove+0x11/0x20 [zfs]
[114432.729185]  ? dbuf_rele_and_unlock+0x1bf/0x4d0 [zfs]
[114432.729198]  dbuf_create+0x68c/0x800 [zfs]
[114432.729211]  ? dbuf_rele+0x46/0x80 [zfs]
[114432.729226]  ? dnode_hold_impl+0x60a/0xb50 [zfs]
[114432.729239]  dbuf_create_bonus+0x39/0xa0 [zfs]
[114432.729253]  dmu_bonus_hold+0x16c/0x210 [zfs]
[114432.729271]  sa_buf_hold+0x9/0x10 [zfs]
[114432.729286]  zfs_zget+0x10e/0x2d0 [zfs]
[114432.729300]  ? zio_rewrite+0x2e/0x30 [zfs]
[114432.729314]  ? zil_lwb_write_init+0x220/0x220 [zfs]
[114432.729316]  ? spl_kmem_cache_free+0x153/0x270 [spl]
[114432.729332]  zfs_get_data+0x57/0x430 [zfs]
[114432.729346]  zil_commit.part.8+0x7d0/0xd50 [zfs]
[114432.729365]  ? rrw_enter_read_impl+0x125/0x220 [zfs]
[114432.729379]  zil_commit+0x12/0x20 [zfs]
[114432.729394]  zpl_writepages+0xd1/0x160 [zfs]
[114432.729397]  do_writepages+0x17/0x60
[114432.729399]  __filemap_fdatawrite_range+0xa5/0xe0
[114432.729402]  filemap_write_and_wait_range+0x3c/0x90
[114432.729426]  zpl_fsync+0x37/0x100 [zfs]
[114432.729428]  vfs_fsync_range+0x44/0xa0
[114432.729430]  ? find_vma+0x63/0x70
[114432.729432]  SyS_msync+0x178/0x1f0
[114432.729434]  entry_SYSCALL_64_fastpath+0x17/0x98
[114432.729435] RIP: 0033:0x7f958f6ee5ed
[114432.729436] RSP: 002b:00007f958d8ce200 EFLAGS: 00000293 ORIG_RAX: 000000000000001a
[114432.729438] RAX: ffffffffffffffda RBX: 000000082b317808 RCX: 00007f958f6ee5ed
[114432.729438] RDX: 0000000000000004 RSI: 0000000001000000 RDI: 00007f95768cc000
[114432.729439] RBP: 00007f958d8ce6d0 R08: 0000000000000000 R09: 0000000000000000
[114432.729440] R10: 0000000000000001 R11: 0000000000000293 R12: 000000082b317808
[114432.729441] R13: 00007f958d8ce4f0 R14: 00007f958d8ce4d0 R15: 000000082b317640
@AndCycle
Author

I have reverted back to 0.6.5.11 for a stable system,
so there is probably not much more I can help with.

@behlendorf
Contributor

Sorry you had to rollback. This is a duplicate of #5396 which has been hard to reproduce, but it looks like you found a way. Was there anything special about your workload?

@AndCycle
Author

AndCycle commented Aug 16, 2017

I am not sure whether I have any special workload;
it's a mixed bag, except that I don't use ZVOLs.
In general:

(Yes) Single Volume
(Yes) RAIDZ
(No) ZVOL
(Yes) L2ARC device
(No) ZIL device

(Yes) heavy metadata workload
(Maybe) weird database access pattern
(Yes) a little memory pressure

So I'll lay out my storage first:


zroot 444G 339G
Intel 730 SSD, attached to an LSI SAS HBA
single boot volume, housing the root filesystem and /var
used for DNS/routing/email/web/database,
pretty low utilization


zassist 222G 143G
single volume
Intel 240 SSD, attached to an LSI SAS HBA
Minecraft server, low usage, but heavy metadata due to web-based maps (tons of little chunked JPEGs)
CrashPlan backup client-side software, which is inefficient with its local database


ztank 32.5T 12.5T
6TBx6 raidz2, attached through the S1200RP onboard controller
L2ARC on Intel 240 SSD partition 2, attached to an LSI SAS HBA
used for /home and syncoid backups of zroot/zaside
large volume of personal data,
plus millions of small files under 10k from my little web-crawler cron job that scrapes Twitter for analysis
2 virtual machines under QEMU, accessed through QEMU raw images, no zvol
one running a pretty old FreeBSD, the other a test-bench Windows 10


zaside 10.9T 4.69T
4TBx3 raidz1
temporary data storage, mostly accessed through Samba


zmess 27.2T 16.4T
6TBx6 raidz2, attached through external USB3
L2ARC on Intel 240 SSD partition 3, attached to an LSI SAS HBA
shared storage through Samba


zarchive 29T 20.6T
4TBx8 raidz2, attached through external USB3
mostly inactive, used as an archive of filmed video data


I didn't attach the L2ARC previously because, as I recall, I had filed a bug report about leaking stats.
I thought I would attach it again on 0.7.1 for another go;
unfortunately I couldn't get any useful results.


Although the system has 32 GB of ECC memory, I do have memory pressure;
there is constantly some swap activity due to:

  1. nearly 16 GB used by the CrashPlan backup service,
  2. 6 GB by the virtual machines,
  3. BOINC (network computing), which dispatches random jobs that use all idle resources.

Swap resides on Intel 240 SSD partition 4, attached to an LSI SAS HBA.

The constant workload is mostly caused by the CrashPlan backup service,
which does a full filesystem walk for backup, scanning for differences, and it prefers the newest small files.
Other than that, it's hard to say there is anything special.

There is a daily syncoid backup,
and a daily cron job that updates the locate database, which walks through millions of files.

This crash happened on Aug 17 at 04:19:13.

My daily cron jobs are scheduled at 12:00 AM,
and the CrashPlan scan is scheduled at 3:00 AM.

So yes, I do suspect the workload caused by CrashPlan;
it does both the file scan and the local database reads and updates.

The database used by CrashPlan is a closed, proprietary format.
Some simple strace runs showed that its writes are not aligned at all, causing heavy write amplification.

Here is a simple IOPS history / disk I/O graph;
you can easily tell CrashPlan is doing something really inefficient with its database,
which is why I moved it to the SSD.

But it's hard to tell for sure, because CrashPlan also scans the full filesystem in the meantime.

Hope this info helps.

behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 25, 2017
As part of commit 50c957f this check was pulled up before the
call to dnode_create().  This is racy since the dnode_phys_t
in the dbuf could be updated after the check passed but before
it's created by dnode_create().  Close the race by adding the
original check back to detect this unlikely case.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 25, 2017
As part of commit 50c957f this check was pulled up before the
call to dnode_create().  This is racy since the dnode_phys_t
in the dbuf could be updated after the check passed but before
it's created by dnode_create().  Close the race by adding the
original check back to detect this unlikely case.

TEST_ZFSSTRESS_RUNTIME=7200

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
behlendorf added a commit to behlendorf/zfs that referenced this issue Aug 25, 2017
As part of commit 50c957f this check was pulled up before the
call to dnode_create().  This is racy since the dnode_phys_t
in the dbuf could be updated after the check passed but before
it's created by dnode_create().  Close the race by adding the
original check back to detect this unlikely case.

TEST_XFSTESTS_SKIP="yes"
TEST_ZFSSTRESS_RUNTIME=3000

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
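The commit message above describes a classic check-then-create race. As a rough illustration only (a minimal sketch with hypothetical names, not the actual OpenZFS code in dnode.c or dbuf.c), the racy pattern and the re-check that closes it look roughly like this:

```c
/*
 * Hedged sketch of the race described above.  All identifiers here
 * (slot_t, create_dnode, hold_racy, hold_fixed) are made-up
 * placeholders, not the real OpenZFS symbols.
 */
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

#define TYPE_NONE 0   /* stands in for DMU_OT_NONE */

typedef struct slot {
	pthread_mutex_t lock;
	int             phys_type;  /* stands in for the on-disk dnode type */
	void           *dnode;      /* in-memory object, created lazily */
} slot_t;

/* Stand-in for dnode_create(); the real function builds an in-memory dnode. */
static void *
create_dnode(slot_t *s)
{
	(void) s;
	return (malloc(1));
}

/* Racy pattern (pre-fix): the type is checked only *before* creation, so a
 * concurrent free that resets phys_type to TYPE_NONE in the window between
 * the check and the create goes unnoticed until a later VERIFY trips. */
int
hold_racy(slot_t *s)
{
	if (s->phys_type == TYPE_NONE)
		return (ENOENT);            /* check ... */

	pthread_mutex_lock(&s->lock);
	s->dnode = create_dnode(s);         /* ... then create */
	pthread_mutex_unlock(&s->lock);
	return (0);
}

/* Fixed pattern: repeat the check after the object has been created, so the
 * unlikely concurrent free is reported as ENOENT instead of panicking later. */
int
hold_fixed(slot_t *s)
{
	if (s->phys_type == TYPE_NONE)
		return (ENOENT);

	pthread_mutex_lock(&s->lock);
	s->dnode = create_dnode(s);
	if (s->phys_type == TYPE_NONE) {    /* the check the fix adds back */
		free(s->dnode);
		s->dnode = NULL;
		pthread_mutex_unlock(&s->lock);
		return (ENOENT);
	}
	pthread_mutex_unlock(&s->lock);
	return (0);
}
```

Re-checking after creation turns the rare lost race into a clean ENOENT for the caller instead of a VERIFY panic later in dbuf_create().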
FransUrbo pushed a commit to FransUrbo/zfs that referenced this issue Apr 28, 2019
Refactor dmu_object_alloc_dnsize() and dnode_hold_impl() to simplify the
code, fix errors introduced by commit dbeb879 (PR openzfs#6117) interacting
badly with large dnodes, and improve performance.

* When allocating a new dnode in dmu_object_alloc_dnsize(), update the
percpu object ID for the core's metadnode chunk immediately.  This
eliminates most lock contention when taking the hold and creating the
dnode.

* Correct detection of the chunk boundary to work properly with large
dnodes.

* Separate the dmu_hold_impl() code for the FREE case from the code for
the ALLOCATED case to make it easier to read.

* Fully populate the dnode handle array immediately after reading a
block of the metadnode from disk.  Subsequently the dnode handle array
provides enough information to determine which dnode slots are in use
and which are free.

* Add several kstats to allow the behavior of the code to be examined.

* Verify dnode packing in large_dnode_008_pos.ksh.  Since the test is
purely creates, it should leave very few holes in the metadnode.

* Add test large_dnode_009_pos.ksh, which performs concurrent creates
and deletes, to complement existing test which does only creates.

With the above fixes, there is very little contention in a test of about
200,000 racing dnode allocations produced by tests 'large_dnode_008_pos'
and 'large_dnode_009_pos'.

name                            type data
dnode_hold_dbuf_hold            4    0
dnode_hold_dbuf_read            4    0
dnode_hold_alloc_hits           4    3804690
dnode_hold_alloc_misses         4    216
dnode_hold_alloc_interior       4    3
dnode_hold_alloc_lock_retry     4    0
dnode_hold_alloc_lock_misses    4    0
dnode_hold_alloc_type_none      4    0
dnode_hold_free_hits            4    203105
dnode_hold_free_misses          4    4
dnode_hold_free_lock_misses     4    0
dnode_hold_free_lock_retry      4    0
dnode_hold_free_overflow        4    0
dnode_hold_free_refcount        4    57
dnode_hold_free_txg             4    0
dnode_allocate                  4    203154
dnode_reallocate                4    0
dnode_buf_evict                 4    23918
dnode_alloc_next_chunk          4    4887
dnode_alloc_race                4    0
dnode_alloc_next_block          4    18

The performance is slightly improved for concurrent creates with
16+ threads, and unchanged for low thread counts.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes openzfs#5396 
Closes openzfs#6522 
Closes openzfs#6414 
Closes openzfs#6564
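As a rough illustration of the per-CPU object-ID cursor mentioned in the first bullet of the commit message above (a minimal sketch with made-up names, not the actual dmu_object_alloc_dnsize() code), concurrent allocators mostly touch only their own CPU's cursor and take a global lock only when a chunk runs out:

```c
/*
 * Hedged sketch of a per-CPU allocation cursor; all identifiers are
 * hypothetical, not the real OpenZFS ones.
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

#define NCPUS      64
#define CHUNK_SIZE 128          /* object IDs handed to one CPU at a time */

static struct cursor {
	pthread_mutex_t lock;       /* zero-initialized; fine for glibc mutexes */
	uint64_t        next_obj;   /* next free object ID in this CPU's chunk */
	uint64_t        chunk_end;  /* first ID past the current chunk */
} percpu_cursor[NCPUS];

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t global_next_chunk;

/*
 * Allocate an object ID.  Most calls only touch the calling CPU's cursor;
 * the global lock is taken only when that CPU's chunk is exhausted, which
 * is what keeps contention low for highly concurrent creates.
 */
uint64_t
alloc_object_id(void)
{
	int cpu = sched_getcpu();
	uint64_t obj;

	if (cpu < 0)
		cpu = 0;
	cpu %= NCPUS;

	pthread_mutex_lock(&percpu_cursor[cpu].lock);
	if (percpu_cursor[cpu].next_obj >= percpu_cursor[cpu].chunk_end) {
		/* Refill this CPU's chunk from the global cursor. */
		pthread_mutex_lock(&global_lock);
		percpu_cursor[cpu].next_obj  = global_next_chunk;
		percpu_cursor[cpu].chunk_end = global_next_chunk + CHUNK_SIZE;
		global_next_chunk += CHUNK_SIZE;
		pthread_mutex_unlock(&global_lock);
	}
	obj = percpu_cursor[cpu].next_obj++;
	pthread_mutex_unlock(&percpu_cursor[cpu].lock);
	return (obj);
}
```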