ARC consuming 38% CPU for no reason #6531
Comments
Here's what top was showing when we managed to catch one of the systems described above by @brendangregg in this state:
Just curious, does this still happen on 0.7.x?
I think we should not even call arc_adjust() when the ARC size is below c_min, i.e. only call it if (arc_size > arc_c_min).
@gaurkuma right. At the very least, arc_adjust() should bail early (or not be called) if the arc_size is zero.
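The guard proposed in the two comments above could look roughly like the following. This is a standalone sketch, not the actual ZFS source: the `arc_counters_t` struct and the `arc_should_adjust()` helper are hypothetical names invented for illustration, and only `arc_size` and `arc_c_min` come from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two ARC counters named in the thread. */
typedef struct arc_counters {
	uint64_t arc_size;	/* bytes currently held by the ARC */
	uint64_t arc_c_min;	/* configured minimum ARC size (c_min) */
} arc_counters_t;

/*
 * Hypothetical helper: decide whether an eviction pass is worth running.
 * Returns false when the ARC is empty or already at or below its minimum,
 * so the caller can skip the arc_adjust() call entirely.
 */
static bool
arc_should_adjust(const arc_counters_t *ac)
{
	/* Redundant for unsigned values, but keeps the "empty ARC" case explicit. */
	if (ac->arc_size == 0)
		return (false);

	return (ac->arc_size > ac->arc_c_min);
}
```

In a reclaim loop, the caller would test `arc_should_adjust()` first and only then invoke the eviction pass. Whether this check alone removes the CPU cost described in the issue is not established in this thread.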
I believe the test should be whether spa_namespace_avl is empty, checked in the arc_reclaim_thread() loop.
@brendangregg is |
@richardelling a spa_namespace_avl check may not be sufficient, because I can have unused pools. For example, in our use case we create pools up front on multiple nodes in a cluster, and some of them may not get used for quite some time.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
This is a production system that has ZFS installed, but is not using ZFS. No pools, datasets, or ARC buffers.
It has suffered a performance loss because ZFS was consuming 38% CPU system-wide. This is a 4-CPU system. Here is the bottom of a CPU flame graph:
Zooming into the arc_reclaim thread:
This multilist work is new to me, but... do we really need to be selecting eviction lists using an entropy-based random function? Could this just be round robin?
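For comparison, the round-robin alternative raised above could be as small as the sketch below. It is purely illustrative and does not use the real ZFS multilist API: `rr_cursor_t` and `rr_next_sublist()` are hypothetical names, and the sketch assumes C11 atomics are available and that `num_sublists` is non-zero.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared cursor; not part of the ZFS multilist code. */
typedef struct rr_cursor {
	atomic_uint_fast64_t next;	/* monotonically increasing ticket */
} rr_cursor_t;

/*
 * Hypothetical helper: pick the next sublist index in round-robin order.
 * Each caller takes a unique ticket, so concurrent callers spread out
 * across sublists without calling a random number generator.
 * num_sublists must be non-zero.
 */
static unsigned int
rr_next_sublist(rr_cursor_t *rc, unsigned int num_sublists)
{
	uint_fast64_t ticket = atomic_fetch_add(&rc->next, 1);

	return ((unsigned int)(ticket % num_sublists));
}
```

One trade-off to keep in mind: a single shared counter avoids the cost of generating randomness, but on a busy multi-CPU system it can itself become a point of cache-line contention, which may be part of why a randomized choice was used.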
There's also the shrink_zone CPU consumer on the right, which I'd guess is related to the ARC holding onto locks while in arc_adjust().
This system was running at almost full memory capacity (about 99%), so I would think it is frequently entering arc_reclaim() and shrink_zone(). The workaround has been to reduce the Java heap size slightly.
System information