Latency Spikes in realtime threads while IO on ZFS #13128

Open
poelzi opened this issue Feb 20, 2022 · 6 comments
Labels
Type: Performance Performance improvement or performance problem

Comments

@poelzi

poelzi commented Feb 20, 2022

System information

Type Version/Name
Distribution Name NixOS
Distribution Version 21.11
Kernel Version 5.10.94
Architecture x64
OpenZFS Version zfs-2.1.2-1

Describe the problem you're observing

I'm using mixxx with jackd. I had used this machine with LVM+ext4 for 5 years and recently switched to ZFS.
I now notice a lot of underruns in the realtime thread, especially when the system performs I/O. This did not happen with the stock kernel and ext4.

Describe how to reproduce the problem

Perform ZFS I/O while a realtime thread is doing work.
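
One way to put numbers on the reproduction step is a small wakeup-latency probe, run once on an idle system and once while heavy ZFS I/O is in progress. The following is a minimal sketch in Python, not what mixxx or jackd do internally: the function names `measure_wakeup_overshoot` and `summarize`, the 1 ms interval, and the SCHED_FIFO priority of 50 are all assumptions chosen for illustration. A dedicated tool such as cyclictest will give more trustworthy figures.

```python
import os
import time


def measure_wakeup_overshoot(interval_s=0.001, iterations=1000):
    """Sleep repeatedly and record how late each wakeup was, in microseconds."""
    overshoots_us = []
    for _ in range(iterations):
        start = time.monotonic_ns()
        time.sleep(interval_s)
        elapsed_ns = time.monotonic_ns() - start
        # Overshoot is the time slept beyond the requested interval.
        overshoots_us.append((elapsed_ns - interval_s * 1e9) / 1000.0)
    return overshoots_us


def summarize(overshoots_us):
    """Return the minimum, median and maximum overshoot of a sample list."""
    ordered = sorted(overshoots_us)
    return {
        "min_us": ordered[0],
        "median_us": ordered[len(ordered) // 2],
        "max_us": ordered[-1],
    }


if __name__ == "__main__":
    try:
        # Assumed priority; needs root or a suitable rtprio limit.
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))
    except (AttributeError, OSError):
        print("could not switch to SCHED_FIFO, measuring with default policy")
    print(summarize(measure_wakeup_overshoot()))
```

If the maximum overshoot grows sharply only while ZFS I/O is running, that matches the underruns described above.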

Include any warning/errors/backtraces from the system logs

@poelzi poelzi added the Type: Defect Incorrect behavior (e.g. crash, hang) label Feb 20, 2022
@szubersk
Contributor

@poelzi, this issue report is incomplete. Would you mind providing more details?

@behlendorf behlendorf added Type: Performance Performance improvement or performance problem and removed Type: Defect Incorrect behavior (e.g. crash, hang) labels Feb 24, 2022
@IvanVolosyuk

I was using qemu/kvm/vfio with realtime priority, and ZFS caused audio glitches and video stuttering. I tried isolcpus, which ZFS ignored, so I maintained my own version of CPU isolation in ZFS, but it led to system deadlocks in some situations. In the end I found that a preemptive kernel solves the latency spikes I observed with ZFS. I have been using it successfully for more than a year.

@poelzi
Author

poelzi commented Apr 20, 2022

@IvanVolosyuk I switched to a custom kernel with full preemption and it makes such a difference. Running ZFS on a voluntary-preemption kernel on a desktop system is a really bad experience, even with 5.16.20 and zfs-2.1.4-1.

I will open a NixOS ticket so the combination of ZFS and a preemptive kernel gets sorted out, but I think it would be good in the long run if ZFS behaved like other filesystems on PREEMPT_VOLUNTARY and did not cause such latency spikes.
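
For anyone comparing setups, it helps to confirm which preemption model a kernel was built with. The following is a minimal Python sketch under some assumptions: the helper names `preemption_model` and `read_kernel_config` are made up for illustration, and the config is assumed to be readable from `/proc/config.gz` (which needs `CONFIG_IKCONFIG_PROC`) or `/boot/config-<release>`. On kernels built with `CONFIG_PREEMPT_DYNAMIC` the model can also be chosen at boot, which this sketch does not detect.

```python
import gzip
import platform
from pathlib import Path


def preemption_model(config_text):
    """Classify the build-time preemption model from kernel config text."""
    enabled = {
        line.split("=", 1)[0]
        for line in config_text.splitlines()
        if line.endswith("=y")
    }
    # Check the strongest model first; each option name is matched exactly.
    for option in ("CONFIG_PREEMPT_RT", "CONFIG_PREEMPT",
                   "CONFIG_PREEMPT_VOLUNTARY", "CONFIG_PREEMPT_NONE"):
        if option in enabled:
            return option[len("CONFIG_"):]
    return "unknown"


def read_kernel_config():
    """Return the running kernel's config text, or None if it is not found."""
    proc_config = Path("/proc/config.gz")
    if proc_config.exists():
        with gzip.open(proc_config, "rt") as handle:
            return handle.read()
    boot_config = Path("/boot") / f"config-{platform.release()}"
    if boot_config.exists():
        return boot_config.read_text()
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    print(preemption_model(text) if text else "kernel config not found")
```

A result of `PREEMPT_VOLUNTARY` corresponds to the configuration where the latency spikes were reported, and `PREEMPT` to the full-preemption kernel that avoided them.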

@stale

stale bot commented Apr 26, 2023

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Apr 26, 2023
@IvanVolosyuk

still relevant

@stale stale bot removed the Status: Stale No recent activity for issue label Jul 23, 2023
@IvanVolosyuk

Actually, what would be the way to debug or profile the ZFS code to find the stacks in which ZFS kernel threads run for a long time between calls to schedule(), or similar situations?
