All writers block on same hash table mutex hurting performance #238
As can be seen from the stack traces below, the writer threads are waiting for the mutex in open context (not syncing context). Stack trace of the writer thread that has acquired the hash table mutex:
Stack traces of the 7 writer threads that are waiting for the mutex acquired by the thread above:
This thread: is it still alive? Does it eventually finish the task and release the mutex? It's just that the stack looks exactly like that when you get …
There is no crash; it's just that write performance is significantly degraded. In perfmon, we see writes happening at around 10 MBps on the zvol for a couple of seconds, then it drops to 0 for a few seconds, then another burst of writes, then back to 0, and so on. I suspect the way locks are acquired is hurting throughput.
Not particularly familiar with this code, but all 8 threads go down from …
where we convert it: …
which is computed to the same value each time. Even though there are 8 threads writing to different locations, they are all blocked by one mutex due to: …
Since none of the arguments change, the result is computed to the same value each time. @ahrens Did I miss anything? Are there any mitigations for multiple writing threads?
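For context, here is a condensed sketch of how the bucket mutex gets chosen, modeled on OpenZFS's dbuf_hash()/DBUF_HASH; the helper and macro names here are illustrative, not the exact ZFSin source. Identical (os, obj, level, blkid) inputs always fold to the same CRC, hence the same bucket and the same mutex:

```c
/*
 * Illustrative sketch modeled on OpenZFS's dbuf_hash(): the bucket index
 * is a pure function of (objset, object, level, blkid), so callers that
 * pass identical arguments always contend on the same bucket mutex.
 */
static uint64_t
dbuf_hash_sketch(void *os, uint64_t obj, uint8_t lvl, uint64_t blkid)
{
	uintptr_t osv = (uintptr_t)os;
	uint64_t crc = -1ULL;

	/* Fold the four inputs into a CRC64, one byte at a time. */
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (osv >> 6)) & 0xFF];
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (obj >> 0)) & 0xFF];
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (obj >> 8)) & 0xFF];
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (lvl >> 0)) & 0xFF];
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (blkid >> 0)) & 0xFF];
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (blkid >> 8)) & 0xFF];
	crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (blkid >> 16)) & 0xFF];
	return (crc);
}

/* Same inputs => same index => same mutex in h->hash_mutexes[]. */
#define	DBUF_HASH_IDX(h, os, obj, lvl, blkid) \
	(dbuf_hash_sketch(os, obj, lvl, blkid) & (h)->hash_table_mask)
```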
I would recommend using …
Unfortunately, the structs are different in Linux and Windows. Specifically, dnode_t* is missing from Windows' zvol_state_t. Thoughts?
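For illustration, the dnode-based approach would look roughly like the sketch below: hold the zvol's dnode once at open, reuse it for every write via dmu_write_by_dnode(), and release it at close. dnode_hold()/dnode_rele() and dmu_write_by_dnode() are real upstream APIs, but the zv_dn field and the helper names here are assumptions, not the actual ZFSin code:

```c
/* Sketch only: assumes zvol_state_t gains a dnode_t *zv_dn member. */

static int
zvol_hold_dnode(zvol_state_t *zv)
{
	/* ZVOL_OBJ is the fixed object number of a zvol's data object. */
	return (dnode_hold(zv->zv_objset, ZVOL_OBJ, zv, &zv->zv_dn));
}

static void
zvol_rele_dnode(zvol_state_t *zv)
{
	dnode_rele(zv->zv_dn, zv);
	zv->zv_dn = NULL;
}

/*
 * In the write path, dmu_write_by_dnode() replaces the dmu_write() call
 * that previously re-resolved ZVOL_OBJ through the dbuf hash table on
 * every request.
 */
static void
zvol_write_chunk(zvol_state_t *zv, uint64_t off, uint64_t bytes,
    const void *buf, dmu_tx_t *tx)
{
	dmu_write_by_dnode(zv->zv_dn, off, bytes, buf, tx);
}
```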
OK, it would look something like this (2 commits): https://github.com/openzfsonwindows/ZFSin/tree/bydnode/ZFSin Took the chance to clean things up again; we don't actually need … I've not had a chance to test it much.
Thanks @ahrens @lundman for the quick turnaround! Above is a snapshot with these changes. I still see writes dropping to 0 on the zvol (on the zpool the writes are still continuous), but those 0-write periods appear far less frequently than they did without this latest change. Overall, the change looks good.
@lundman I just picked up all your changes from the bydnode branch. Here are some observations:
… and 250 threads are in a blocked state with the stack trace below:
Does that give any clue as to what might have gone wrong? Stacks of all threads running zfsin code, if that helps:
Update: going by the discussion on the write amplification issue, I specified the ashift and volblocksize explicitly, and the write amplification does look under control. However, the write choppiness remains. I will break into the debugger after a few hours of Iometer writes and see what is happening with the thread counts and locking.
Ok, let's see. The … But more importantly, it should be user-selectable now, ie … Now, I've often wondered if there is some issue in the mutex/condvar code: could it be missing signals/broadcasts? Sometimes a thread that was supposed to have been signaled just sits there doing nothing until the next signal comes in. Is this something you have noticed?
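On the missed-signal theory: the usual defense in this codebase's style is that every cv_wait() sits in a loop that re-checks its predicate, so a lost cv_signal()/cv_broadcast() costs latency (the thread sleeps until the next signal) rather than correctness, which would match the "sits there doing nothing until the next one comes in" symptom. A hypothetical consumer loop, names invented for illustration:

```c
#include <sys/mutex.h>
#include <sys/condvar.h>

static kmutex_t work_lock;
static kcondvar_t work_cv;
static int work_available;

/*
 * Hypothetical consumer, illustration only.  Because the predicate is
 * re-checked in a loop around cv_wait(), a lost cv_signal() or
 * cv_broadcast() only delays this thread until the next signal arrives;
 * it never consumes a task it wasn't handed.
 */
static void
worker_wait_for_task(void)
{
	mutex_enter(&work_lock);
	while (!work_available)
		cv_wait(&work_cv, &work_lock);
	work_available = 0;	/* take the task */
	mutex_exit(&work_lock);
}
```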
ZFS spawns a bunch of taskq threads that idle waiting for something to do: … and when they are given a task: …

The taskq "number of threads" is based on a bunch of things: number of cores, writers, each pool has a number, each dataset, etc. But it is worth noting those values are all "Solaris" defaults; I have not yet looked at what tweaks Windows might want. We might be over-creating threads, but Windows does seem pretty decent at threads.

One txg will fill with data until it hits the per-txg limit, then it will wait for it to quiesce and for spa_sync to complete, then it starts again. The txg limits have many knobs you can tune and tweak (defaults sketched below): https://www.delphix.com/blog/delphix-engineering/zfs-fundamentals-transaction-groups

I'm no expert with those tunables - so experiment and report what you find. There is also a write throttle, which I believe was added so there is always a little room and other datasets are not starved out. Are we making progress at least?
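For reference, here are the knobs behind that fill/quiesce/sync cycle, with upstream defaults as they appear in OpenZFS-derived code of this vintage (the ZFSin copies may differ):

```c
/* Upstream defaults; the ZFSin values may differ. */
int zfs_txg_timeout = 5;		 /* force a txg at least every 5 s */
uint64_t zfs_dirty_data_max = 0;	 /* set at boot: ~10% of RAM, capped */
uint64_t zfs_dirty_data_sync = 64 << 20; /* start syncing a txg early */
int zfs_delay_min_dirty_percent = 60;	 /* write throttle kicks in here */
```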
Compression seems to be happening although I didn't turn it on (I am using default settings). Is this expected?
What is the downside of using the "standard" setting for sync? Can there be data loss?
Yes, up to one txg. The pool will always be consistent (pointing either to txg-1 data or to txg data, since the uberblock is updated last), but the last 5 s of writes may not be there.
OK, thanks for clarifying. Is one txg always 5 s worth of data, or can it be flushed before 5 s if it hits a size limit?
Take a look at ZFSin/ZFSin/zfs/module/zfs/dsl_pool.c line 111 (commit 6304af8) and ZFSin/ZFSin/zfs/module/zfs/dmu_tx.c line 755 (commit 0766fda).
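Those two references point at the size-based trigger: a txg can indeed close before the 5 s timeout once the open txg's dirty data crosses zfs_dirty_data_sync. A condensed copy of the OpenZFS logic of that era (illustrative; the exact ZFSin lines may differ):

```c
/*
 * Condensed from OpenZFS-era dsl_pool.c.  txg_kick() closes the open
 * txg early, without waiting for the zfs_txg_timeout (5 s) clock, and
 * writers start being delayed once dirty data passes the
 * zfs_delay_min_dirty_percent threshold.
 */
boolean_t
dsl_pool_need_dirty_delay(dsl_pool_t *dp)
{
	uint64_t delay_min_bytes =
	    zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100;
	boolean_t rv;

	mutex_enter(&dp->dp_lock);
	if (dp->dp_dirty_total > zfs_dirty_data_sync)
		txg_kick(dp);	/* close the txg before the timeout */
	rv = (dp->dp_dirty_total > delay_min_bytes);
	mutex_exit(&dp->dp_lock);
	return (rv);
}
```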
@lundman Could you also comment on the "compression" seen in the stack trace? I will put together all my findings based on the above comments and let you know. We are making good progress.
It would be interesting to know if …
Right now, I have the debugger attached and the target paused for some investigation. What can I inspect through the debugger to find out whether compression is on or off?
there's no rush - if it's doing lz4 when it shouldn't, we'll come across it again :)
@lundman Glad to report that I am not seeing writes choking on physical (512-byte sector size) hard drives attached to a VM. I have been running the Iometer workload for more than 10 hours. Here's my zpool/zvol config:
Looking forward to merging your change into our fork once it is merged upstream.
Attachment: threads.txt
I see a performance issue with all writers trying to acquire the same mutex in dbuf_hash_table while writing to a zvol (RAW disk). I am using Iometer to generate the workload.
During live debugging, I found 8 threads executing dbuf_find (as part of zvol_write_win), of which 7 are waiting for the mutex held by the 8th thread. The output of !stacks 2 zfsin is attached.
Here's the problem: all 8 threads pass the same arguments to dbuf_find:
os = 0xffffc78f`a4734180
and the remaining three arguments (object, level, blkid) are all 0,
which causes them all to map to the same mutex within the hash table.
Is this expected and unavoidable for zvol writes? Are there ways to improve it?
Below is my setup:
VM configuration:
4 vCPU
8 GB RAM
zvol settings:
8 GB (thick provisioned)
Everything is default: dedup=off, compression=off, ...
Iometer settings:
4 workers
Data pattern = full random
Number of outstanding I/Os = 64
Access spec = 4 KiB; 0% read; 100% random
Thanks,
Imtiaz