Current ZFS git tip constantly creates and destroys taskq kernel threads, especially z_wr_int_N threads #7274
Comments
@siebenmann is the pid churn somehow causing problems for the system? This is normal behavior for ZFS, which creates and destroys the I/O pipeline kernel threads as they are needed. If you'd prefer the threads to be long-lived, set the …
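The advice above is cut off in the source. Presumably it refers to SPL's knob for dynamic taskq threads; as a sketch, assuming the `spl_taskq_thread_dynamic` module parameter (my inference, since the comment is truncated), the threads can be kept long-lived like this:

```sh
# Sketch, assuming the truncated advice refers to the spl_taskq_thread_dynamic
# module parameter (0 = create taskq threads up front and keep them long-lived;
# 1 = default dynamic behavior). It is read at module load time, so set it in
# /etc/modprobe.d/, e.g.:
#   options spl spl_taskq_thread_dynamic=0
# or pass it on the kernel command line as spl.spl_taskq_thread_dynamic=0.
cat /sys/module/spl/parameters/spl_taskq_thread_dynamic   # check current value
```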
It's possible that PID churn or the very large PID numbers it now creates on my machine are causing problems, but I'm going to have to investigate more. I've tried booting with …
@siebenmann it may still be caused by ZFS. There are several places in the code where a pool of kthread workers will be dynamically created and then destroyed so that some work can be performed in parallel.
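One rough way to see those pools in action, using standard procps tools (illustrative, not from the original thread), is to watch the worker-thread count fluctuate under bursty I/O:

```sh
# Count the z_wr_int taskq threads once per second; with dynamic taskqs the
# count rises and falls as worker pools are created and destroyed.
watch -n1 'ps -eo comm= | grep -c "^z_wr_int"'
```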
It turns out that the PID churn I was seeing is not ZFS's fault. Much to my surprise, building Go from source and running its tests goes through over 200,000 PIDs (even on ext4), although the whole process completes in only a few minutes. This has likely been going on for some time, but it was previously masked because my machine used the normal low Linux 16-bit PID limit. (I don't know why Go is churning through so many PIDs here, but it's definitely not ZFS's problem.)
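For anyone who wants to reproduce this kind of measurement, here is a rough sketch, assuming `/proc/sys/kernel/ns_last_pid` is available (it requires CONFIG_CHECKPOINT_RESTORE) and using Go's own `all.bash` build-and-test script as the workload:

```sh
# Approximate how many PIDs a workload consumes (ignores wraparound).
before=$(cat /proc/sys/kernel/ns_last_pid)
./all.bash            # Go's build-and-test script; substitute any workload
after=$(cat /proc/sys/kernel/ns_last_pid)
echo "PIDs consumed: $((after - before))"
```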
@siebenmann ahh, that explains it. Then please go ahead and close this out if you agree there isn't an actual problem here.
System information
Describe the problem you're observing
On my 16-core machine, ZFS and/or SPL appears to be constantly creating and destroying kernel threads for taskqs, especially 'z_wr_int_N' threads; I see huge PIDs for these shortly after system boot.
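A quick way to surface those threads and their PIDs (illustrative commands, not from the original report):

```sh
# Show the highest-numbered ZFS write-interrupt taskq threads; on the
# reporter's machine these PIDs were six digits or more shortly after boot.
ps -eo pid,comm --sort=-pid | grep 'z_wr_int' | head
```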
This is on a system with generally low disk IO (and certainly low disk IO since boot). The churn is sustained and rapid; I have had Linux's extended PID space wrap around in a day or so on this system.
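For context on the numbers: Linux's default pid_max is 32768, and 64-bit systems can raise it as high as 4194304 (2^22), which is the "extended PID space" referred to above.

```sh
# Inspect and (as root) raise the PID ceiling; 4194304 is the 64-bit maximum.
cat /proc/sys/kernel/pid_max
sysctl -w kernel.pid_max=4194304
```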
I have three pools on this system. Two of the three have a single mirrored vdev; the third is a single disk (so it's non-redundant; I use ZFS to notice checksum errors). I have another four-core system with a single ZFS pool with a single mirrored vdev that also appears to be experiencing this, but at a much slower rate; presumably the rate of churn is related to how many cores the system has.
This churn appears to have been going on for some time based on my system logs, but in the past it has been less obvious because my system was restricting itself to 16-bit PIDs and there was not the glaring clue of PIDs with six or more digits.