VERIFY(dn->dn_type != DMU_OT_NONE) failed #6522
I reverted back to 0.6.5.11 for a stable system.
Sorry you had to roll back. This is a duplicate of #5396, which has been hard to reproduce, but it looks like you found a way. Was there anything special about your workload?
I am not sure whether I have any special workload: (yes) single volume, (yes) heavy metadata workload. I will try to lay out my storage first.
I didn't attach an L2ARC previously because I remember I filed a bug report about leaking stats. Although the system has 32 GB of ECC memory, I do have memory pressure.
The constant workload is mostly caused by the CrashPlan backup service; there are also daily syncoid backups. This crash happened when my daily cron jobs were scheduled, at 12:00 AM, so I do suspect the workload caused by CrashPlan. The database used by CrashPlan is a closed, proprietary format. Here is a simple IOPS history / disk I/O graph, but it's hard to tell because CrashPlan also scans the full filesystem in the meantime. Hope this info helps.
As part of commit 50c957f this check was pulled up before the call to dnode_create(). This is racy since the dnode_phys_t in the dbuf could be updated after the check passed but before it's created by dnode_create(). Close the race by adding the original check back to detect this unlikely case. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#5396 Closes openzfs#6522
Refactor dmu_object_alloc_dnsize() and dnode_hold_impl() to simplify the code, fix errors introduced by commit dbeb879 (PR openzfs#6117) interacting badly with large dnodes, and improve performance.

* When allocating a new dnode in dmu_object_alloc_dnsize(), update the percpu object ID for the core's metadnode chunk immediately. This eliminates most lock contention when taking the hold and creating the dnode.
* Correct detection of the chunk boundary to work properly with large dnodes.
* Separate the dmu_hold_impl() code for the FREE case from the code for the ALLOCATED case to make it easier to read.
* Fully populate the dnode handle array immediately after reading a block of the metadnode from disk. Subsequently the dnode handle array provides enough information to determine which dnode slots are in use and which are free.
* Add several kstats to allow the behavior of the code to be examined.
* Verify dnode packing in large_dnode_008_pos.ksh. Since the test only performs creates, it should leave very few holes in the metadnode.
* Add test large_dnode_009_pos.ksh, which performs concurrent creates and deletes, to complement the existing test, which does only creates.

With the above fixes, there is very little contention in a test of about 200,000 racing dnode allocations produced by tests 'large_dnode_008_pos' and 'large_dnode_009_pos'.
name                          type  data
dnode_hold_dbuf_hold          4     0
dnode_hold_dbuf_read          4     0
dnode_hold_alloc_hits         4     3804690
dnode_hold_alloc_misses       4     216
dnode_hold_alloc_interior     4     3
dnode_hold_alloc_lock_retry   4     0
dnode_hold_alloc_lock_misses  4     0
dnode_hold_alloc_type_none    4     0
dnode_hold_free_hits          4     203105
dnode_hold_free_misses        4     4
dnode_hold_free_lock_misses   4     0
dnode_hold_free_lock_retry    4     0
dnode_hold_free_overflow      4     0
dnode_hold_free_refcount      4     57
dnode_hold_free_txg           4     0
dnode_allocate                4     203154
dnode_reallocate              4     0
dnode_buf_evict               4     23918
dnode_alloc_next_chunk        4     4887
dnode_alloc_race              4     0
dnode_alloc_next_block        4     18

The performance is slightly improved for concurrent creates with 16+ threads, and unchanged for low thread counts.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes openzfs#5396
Closes openzfs#6522
Closes openzfs#6414
Closes openzfs#6564
System information
Describe the problem you're observing
The system stopped working after this panic.
Describe how to reproduce the problem
Running the new ZFS 0.7.1 on my little home server for about a day.
Include any warning/errors/backtraces from the system logs