All writers block on same hash table mutex hurting performance #238

Open
imtiazdc opened this issue Apr 4, 2020 · 23 comments

Comments

@imtiazdc
Contributor

imtiazdc commented Apr 4, 2020

threads.txt

I see a performance issue where all writers try to acquire the same mutex in dbuf_hash_table while writing to a zvol (RAW disk). I am using Iometer to generate the workload.
During live debugging, I found 8 threads executing dbuf_find (as part of zvol_write_win), of which 7 are waiting for the mutex held by the 8th thread. The output of !stacks 2 zfsin is attached.

dmu_buf_impl_t *
dbuf_find(objset_t *os, uint64_t obj, uint8_t level, uint64_t blkid)
{
    dbuf_hash_table_t *h = &dbuf_hash_table;
    uint64_t hv = dbuf_hash(os, obj, level, blkid);
    uint64_t idx;
    dmu_buf_impl_t *db;
    idx = hv & h->hash_table_mask;
    mutex_enter(DBUF_HASH_MUTEX(h, idx));    // 7 threads wait here for the "mutex" held by 8th thread

Here's the problem: all 8 threads pass the same input parameters (shown below) to dbuf_find -
0xffffc78f`a4734180
0
0
0
which causes them to map to the same mutex within the hash table.

Is this expected and unavoidable for zvol writes? Are there ways to improve it?

Below is my setup:
VM configuration:
4 vCPU
8 GB RAM
zvol settings:
8GB (thick provisioned)
Everything is default. Dedup=off, Compression=off, ...
Iometer settings:
4 workers
Data pattern = full random
No of outstanding IOs = 64
Access spec = 4 KiB; 0% Read; 100% random

Thanks,
Imtiaz

@imtiazdc
Contributor Author

imtiazdc commented Apr 5, 2020

As can be seen from the stack traces below, the writer threads are waiting for the mutex in open context (not syncing context).

Stack trace of writer thread that has acquired the hash table mutex

   4.001524  ffffc78fb6226040 0000001 RUNNING    nt!DbgBreakPointWithStatus
                                        nt!KeAccumulateTicks+0x39c
                                        nt!KeClockInterruptNotify+0xb8
                                        hal!HalpTimerClockIpiRoutine+0x15
                                        nt!KiCallInterruptServiceRoutine+0x106
                                        nt!KiInterruptSubDispatchNoLockNoEtw+0xea
                                        nt!KiInterruptDispatchNoLockNoEtw+0x37
                                        ZFSin!spl_mutex_enter+0x14b
                                        ZFSin!dbuf_find+0x105
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16
Stack traces of the 7 writer threads waiting for the mutex held by the thread above

   4.000ee4  ffffc78fb60c1800 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

4.001628  ffffc78fb5c07800 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16
   
4.001688  ffffc780003e4040 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

   4.001820  ffffc78000f55040 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

   4.001840  ffffc7800230e040 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

   4.00186c  ffffc78002302040 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

   4.0015c8  ffffc7801ade5040 0000000 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForSingleObject+0x377
                                        ZFSin!spl_mutex_enter+0x12b
                                        ZFSin!dbuf_find+0x7c
                                        ZFSin!__dbuf_hold_impl+0x23f
                                        ZFSin!dbuf_hold_impl+0x87
                                        ZFSin!dbuf_hold_level+0x50
                                        ZFSin!dbuf_hold+0x29
                                        ZFSin!dnode_hold_impl+0x40c
                                        ZFSin!dnode_hold+0x44
                                        ZFSin!dmu_tx_hold_object_impl+0x44
                                        ZFSin!dmu_tx_hold_write+0x176
                                        ZFSin!zvol_write_win+0x1f4
                                        ZFSin!wzvol_WkRtn+0x267
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

@imtiazdc imtiazdc changed the title from "All writers block on same hash table mutex" to "All writers block on same hash table mutex hurting performance" on Apr 5, 2020
@lundman
Collaborator

lundman commented Apr 6, 2020

This thread:

   4.001524  ffffc78fb6226040 0000001 RUNNING    nt!DbgBreakPointWithStatus
                                        nt!KeAccumulateTicks+0x39c
                                        nt!KeClockInterruptNotify+0xb8
                                        hal!HalpTimerClockIpiRoutine+0x15
                                        nt!KiCallInterruptServiceRoutine+0x106
                                        nt!KiInterruptSubDispatchNoLockNoEtw+0xea
                                        nt!KiInterruptDispatchNoLockNoEtw+0x37
                                        ZFSin!spl_mutex_enter+0x14b

Is it still alive? Does it eventually finish the task and release the mutex? I ask because the stack looks exactly like what you see with a DPC_WATCHDOG_VIOLATION, where it has actually crashed.

@imtiazdc
Contributor Author

imtiazdc commented Apr 6, 2020

is it still alive?
Yes
It eventually finishes the task and releases the mutex?
Yes

There is no crash; it's just that write performance is significantly degraded. In perfmon, we see writes to the zvol at around 10 MBps for a couple of seconds, then the rate drops to 0 for a few seconds, then another burst of writes, then back to 0, and so on.

I suspect the way the locks are acquired is hurting throughput.

@lundman
Collaborator

lundman commented Apr 6, 2020

I'm not particularly familiar with this code, but all 8 threads come down from zvol_write_win():

                dmu_tx_hold_write(tx, ZVOL_OBJ, off, bytes);

where ZVOL_OBJ == 1. This enters
        err = dnode_hold(os, object, FTAG, &dn);
where only the object is passed along, not the offset or length.

We convert it:

        mdn = DMU_META_DNODE(os);
        blk = dbuf_whichblock(mdn, 0, object * sizeof (dnode_phys_t));

which is computed to the same value each time.

Even though there are 8 threads writing to different locations, they are all blocked by one mutex due to:

        uint64_t hv = dbuf_hash(os, obj, level, blkid);

Since none of the arguments change, the result is computed to the same value each time (see the sketch below).
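Here is a minimal sketch of why every writer lands on the same lock. The names come from the OpenZFS dnode/dbuf code, but the sequence is illustrative rather than the literal driver path:

/*
 * Every zvol write holds object ZVOL_OBJ (== 1), and dnode_hold() reaches
 * it through the meta-dnode (object 0), so dbuf_find() always sees the
 * same (os, obj, level, blkid) tuple -- which matches the 0/0/0 values
 * observed in the debugger above.
 */
dnode_t *mdn = DMU_META_DNODE(os);                 /* meta-dnode, object 0 */
uint64_t blk = dbuf_whichblock(mdn, 0,
    ZVOL_OBJ * sizeof (dnode_phys_t));             /* 512 >> 14 == blkid 0 */
uint64_t hv = dbuf_hash(os, DMU_META_DNODE_OBJECT, 0, blk);
uint64_t idx = hv & h->hash_table_mask;            /* same index every time */
mutex_enter(DBUF_HASH_MUTEX(h, idx));              /* so all writers queue on one mutex */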

@ahrens Did I miss anything? Are there any mitigations for multiple writing threads?

@ahrens
Contributor

ahrens commented Apr 6, 2020

I would recommend using dmu_tx_hold_write_by_dnode() instead of dmu_tx_hold_write(), with the dnode_t* saved in a zvol-specific struct when the zvol is opened. We should make the same change in OpenZFS as well.
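A rough sketch of that suggestion, assuming a dnode_t *zv_dn field is added to the Windows zvol_state (the Linux zvol_state_t already carries one); dnode_hold(), dnode_rele() and dmu_tx_hold_write_by_dnode() are existing OpenZFS interfaces:

/* On first open: hold the zvol's dnode once and cache it.                */
/* zv_dn is the proposed field; it does not exist in ZFSin's zvol_state.  */
int error = dnode_hold(zv->zv_objset, ZVOL_OBJ, zv, &zv->zv_dn);

/* In the write path: charge the tx against the cached dnode instead of   */
/* re-resolving object 1 through the meta-dnode (and its single hash-     */
/* bucket mutex) on every write.                                           */
dmu_tx_t *tx = dmu_tx_create(zv->zv_objset);
dmu_tx_hold_write_by_dnode(tx, zv->zv_dn, off, bytes);

/* On last close: drop the hold. */
dnode_rele(zv->zv_dn, zv);
zv->zv_dn = NULL;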

@imtiazdc
Contributor Author

imtiazdc commented Apr 6, 2020

Unfortunately, the structs are different in Linux and Windows. Specifically, dnode_t* is missing from Windows' zvol_state_t.
Windows:
https://github.com/openzfsonwindows/ZFSin/blob/master/ZFSin/zfs/include/sys/zvol.h#L59
Linux:
https://github.com/openzfs/zfs/blob/master/include/sys/zvol_impl.h#L41

Thoughts??

@lundman
Collaborator

lundman commented Apr 6, 2020

Awesome, cheers @ahrens .

@imtiazdc Let me work on porting the PR over, adapting zvol_state to our needs, and working out how to save the dnode when the zvol is opened.

@lundman
Collaborator

lundman commented Apr 7, 2020

OK it would look something like this (2 commits): https://github.com/openzfsonwindows/ZFSin/tree/bydnode/ZFSin

15e6376
65faac1

Took the chance to clean things up again; we don't actually need the dmu_(read|write)_win()-style functions, that is something OS X needs in order to pass the C++ class around.

I've not had a chance to test it much.

@imtiazdc
Contributor Author

imtiazdc commented Apr 7, 2020

[screenshot: Iometer write performance, zvol (left) vs. zpool (right)]

Thanks @ahrens @lundman for the quick turnaround!

Above is a snapshot with these changes.
To the left is write performance on zvol (8GB, thick provisioned, default settings, no dedup, etc).
To the right is write performance on zpool (used Fixed Disk Type when serving the VHD from Hyper-V to this VM).

I still see writes dropping to 0 on the zvol (on the zpool the writes are still continuous). However, those drops to 0 happen much less frequently than they did without this latest change. Overall, the change looks good.

@imtiazdc
Contributor Author

imtiazdc commented Apr 7, 2020

[screenshot: write performance with dedup on, showing significantly lower zpool writes]

With dedup ON, we can see significantly lower writes on the zpool!

@imtiazdc
Contributor Author

@lundman I just picked up all your changes from the bydnode branch. Here are some observations:

  1. The writes have gone back to being choppy. The writes looked better with the port from OpenZFS (without your sync variable change).
  2. I saw 598 zfsin threads through the debugger when I tried to investigate the write choppiness. Is that a cause for concern? Would you know what is causing it? Last time, I remember there being around 390 zfsin threads in the kernel.
  3. Of the 598, 333 threads are blocked with the stack trace below:
4.0014d8  ffffd709728c3040 0009f1a Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForMultipleObjects+0x1fe
                                        ZFSin!spl_cv_wait+0xf3
                                        ZFSin!taskq_thread_wait+0xb2
                                        ZFSin!taskq_thread+0x39e
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

and 250 threads are blocked with the stack trace below:

   4.001648  ffffd7097258b040 000018b Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForMultipleObjects+0x1fe
                                        ZFSin!spl_cv_wait+0xf3
                                        ZFSin!txg_wait_synced+0x13b
                                        ZFSin!dmu_tx_wait+0x2b6
                                        ZFSin!dmu_tx_assign+0x17b
                                        ZFSin!zvol_write+0x235
                                        ZFSin!wzvol_WkRtn+0x419
                                        ZFSin!wzvol_GeneralWkRtn+0x6d
                                        nt!IopProcessWorkItem+0x12a
                                        nt!ExpWorkerThread+0xe9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16
  4. There is 1 spa_sync thread, and it is also blocked:
   4.00160c  ffffd7097290f080 0000191 Blocked    nt!KiSwapContext+0x76
                                        nt!KiSwapThread+0x17d
                                        nt!KiCommitThreadWait+0x14f
                                        nt!KeWaitForMultipleObjects+0x1fe
                                        ZFSin!spl_cv_wait+0xf3
                                        ZFSin!zio_wait+0x1b5
                                        ZFSin!dsl_pool_sync+0x2e7
                                        ZFSin!spa_sync+0xb23
                                        ZFSin!txg_sync_thread+0x3f9
                                        nt!PspSystemThreadStartup+0x41
                                        nt!KiStartSystemThread+0x16

Does that give any clue as to what might have gone wrong?

Stacks of all threads running zfsin code if that helps:
threads.txt

@imtiazdc
Contributor Author

Update: following the discussion on the write amplification issue, I specified ashift and volblocksize explicitly, and write amplification does look under control. However, the write choppiness remains.
zpool create -o ashift=12 tank PHYSICALDRIVE2
zfs create -s -V 8g -o volblocksize=4k tank/zvol

I will break into the debugger after a few hours of Iometer writes and see what is happening with the thread count and locking.

@lundman
Collaborator

lundman commented Apr 14, 2020

OK, let's see. The sync commit change was that "sync" was previously always true, no matter what. With sync writes on, the zvol waits for the data to reach the disk before continuing. Safe, but slow. With sync=standard the writes get bunched together and flushed with spa_sync(txg). In theory, this should get you better/faster writes.

But more importantly, it should be user selectable now, i.e. zfs set sync=always pool/volume should bring back the "smooth writes" as before.
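For reference, the decision looks roughly like this in the zvol write path (field and function names follow the OpenZFS zvol code; treat it as a sketch, not the exact ZFSin source):

/* sync=always forces a ZIL commit for every write; with sync=standard the  */
/* data just rides in the open txg until spa_sync() flushes it, which is    */
/* what batches the writes together.                                         */
boolean_t sync = (zv->zv_objset->os_sync == ZFS_SYNC_ALWAYS);

/* ... dmu_tx_assign(), dmu_write(), zvol_log_write(), dmu_tx_commit() ...  */

if (sync)
        zil_commit(zv->zv_zilog, ZVOL_OBJ);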


Now, I've often wondered if there is some issue in the mutex/condvar code: could it be missing signals/broadcasts? Sometimes a thread that was supposed to have been signaled just sits there doing nothing until the next signal comes in. Is this something you have noticed?


ZFSin!taskq_thread_wait+0xb2

ZFS spawns a bunch of taskq threads that idle, waiting for something to do:
https://github.com/openzfsonwindows/ZFSin/blob/master/ZFSin/zfs/module/zfs/spa.c#L157

When a task is handed to one via taskq_dispatch*(), it runs it, then returns to idle, ready for more work.

The taskq "number of threads" is based on a bunch of things, number of cores, writers, each pool has a number, each dataset, etc. But it is worth noting those values are all "Solaris" defaults. I have not yet looked at what tweaks Windows might want. We might be over-creating threads, but Windows does seem pretty decent at threads.


One txg will fill with data until it hits the per-txg limit, then it waits for that txg to quiesce and for spa_sync to complete, and then it starts again. The txg limits have many knobs you can tune and tweak. https://www.delphix.com/blog/delphix-engineering/zfs-fundamentals-transaction-groups

I'm no expert with those tunables, so experiment and report what you find.

There is also a write throttle, which I believe was added so there is always a little room left and other datasets don't get starved out.


Are we making progress at least?

@imtiazdc
Contributor Author

Compression seems to be happening although I didn't turn it on (I am using default settings). Is this expected?

# Child-SP          RetAddr           Call Site
00 fffff803`e632cd88 fffff803`e463b49c nt!DbgBreakPointWithStatus
01 fffff803`e632cd90 fffff803`e4638b29 nt!KeAccumulateTicks+0x39c
02 fffff803`e632cdf0 fffff803`e4e28366 nt!KeClockInterruptNotify+0x469
03 fffff803`e632cf40 fffff803`e46b35f6 hal!HalpTimerClockInterrupt+0x56
04 fffff803`e632cf70 fffff803`e4766d8a nt!KiCallInterruptServiceRoutine+0x106
05 fffff803`e632cfb0 fffff803`e4767277 nt!KiInterruptSubDispatchNoLockNoEtw+0xea
06 ffff9e80`68097270 fffff80c`398408b3 nt!KiInterruptDispatchNoLockNoEtw+0x37
07 ffff9e80`68097400 fffff80c`39841890 ZFSin!LZ4_compressCtx+0x1c3 [C:\Users\imohammad\Downloads\ZFSin\ZFSin\zfs\module\zfs\lz4.c @ 519] 
08 ffff9e80`680974e0 fffff80c`39841624 ZFSin!real_LZ4_compress+0xf0 [C:\Users\imohammad\Downloads\ZFSin\ZFSin\zfs\module\zfs\lz4.c @ 858] 
09 ffff9e80`68097540 fffff80c`39967f39 ZFSin!lz4_compress_zfs+0xa4 [C:\Users\imohammad\Downloads\ZFSin\ZFSin\zfs\module\zfs\lz4.c @ 61] 
0a ffff9e80`68097590 fffff80c`3994f79e ZFSin!zio_compress_data+0x1a9 [C:\Users\imohammad\Downloads\ZFSin\ZFSin\zfs\module\zfs\zio_compress.c @ 126] 
0b ffff9e80`68097620 fffff80c`39958812 ZFSin!zio_write_compress+0x99e [C:\Users\imohammad\Downloads\ZFSin\ZFSin\zfs\module\zfs\zio.c @ 1618] 
0c ffff9e80`68097a40 fffff80c`396e16cf ZFSin!__zio_execute+0x342 [C:\Users\imohammad\Downloads\ZFSin\ZFSin\zfs\module\zfs\zio.c @ 1997] 
0d ffff9e80`68097ad0 fffff803`e471226d ZFSin!taskq_thread+0x48f [C:\Users\imohammad\Downloads\ZFSin\ZFSin\spl\module\spl\spl-taskq.c @ 1610] 
0e ffff9e80`68097b90 fffff803`e476c896 nt!PspSystemThreadStartup+0x41
0f ffff9e80`68097be0 00000000`00000000 nt!KiStartSystemThread+0x16

@imtiazdc
Contributor Author

With sync=standard the writes get bunched together and flushed with spa_sync(txg). In theory, this should get you better/faster writes.

What is the downside of using the "standard" setting for sync? Can there be data loss?

@lundman
Collaborator

lundman commented Apr 14, 2020

Yes, up to one txg. The pool will always be consistent (pointing either to the txg-1 data or to the txg data, since the uberblock is updated last), but the last ~5s of writes may not be there.

@imtiazdc
Contributor Author

OK. Thanks for clarifying. Is 1 txg always 5s worth of data or can it be flushed before 5s if it hits a size limit?

@lundman
Collaborator

lundman commented Apr 14, 2020

Take a look at:

uint64_t zfs_dirty_data_sync = 64 * 1024 * 1024;

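To answer the earlier question directly: yes, a txg can close before the 5-second timer. Simplified, the dsl_pool code of that era kicks the open txg once the dirty data passes zfs_dirty_data_sync (a sketch, not the exact source):

/* simplified from the dsl_pool dirty-data accounting */
mutex_enter(&dp->dp_lock);
if (dp->dp_dirty_total > zfs_dirty_data_sync)
        txg_kick(dp);   /* close the open txg early, ahead of the 5s timer */
mutex_exit(&dp->dp_lock);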

@imtiazdc
Contributor Author

imtiazdc commented Apr 14, 2020

@lundman Could you also comment on the "compression" seen in the stack trace?

I will put together all my findings based on the above comments and let you know. We are making good progress.

@lundman
Collaborator

lundman commented Apr 14, 2020

It would be interesting to know if zfs get compression says anything at all. I know that if lz4 is enabled as a feature, ZFS will use it for metadata, but not for the zvol data itself.
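A simplified sketch of where that choice is made, modeled on OpenZFS dmu_write_policy() (treat the exact branches as illustrative):

enum zio_compress compress;

if (level > 0 || DMU_OT_IS_METADATA(type)) {
        /* Indirect blocks and other metadata are compressed even when the  */
        /* dataset has compression=off; lz4 is used when the feature is     */
        /* active, which would explain the lz4 frames in the stack above.   */
        compress = spa_feature_is_active(os->os_spa, SPA_FEATURE_LZ4_COMPRESS)
            ? ZIO_COMPRESS_LZ4 : ZIO_COMPRESS_LZJB;
} else {
        /* Level-0 zvol data blocks follow the dataset's compression property. */
        compress = ZIO_COMPRESS_OFF;    /* with compression=off */
}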

@imtiazdc
Contributor Author

It would be interesting to know if zfs get compression says anything at all?

Right now, I have the debugger attached and the target paused for some investigation. What can I inspect through the debugger to find out whether compression is on or off?

@lundman
Collaborator

lundman commented Apr 14, 2020

There's no rush; if it's doing lz4 when it shouldn't, we'll come across it again :)

@imtiazdc
Contributor Author

@lundman Glad to report that I am not seeing writes choking on physical (512-byte sector size) hard drives attached to a VM. I have been running the Iometer workload for more than 10 hours.

Here's my zpool/zvol config:

zpool create -o ashift=12 tank PHYSICALDRIVE1
zfs create -s -V 7e -o volblocksize=4k -o sync=always tank/zvol

Looking forward to merging your change into our fork once it is merged upstream.
