-
Notifications
You must be signed in to change notification settings - Fork 391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mas i1801 monitorworkerq #979
Conversation
The original commit to make the common riak_core_node_worker_pool behaviour contained rogue tab types that skew code formatting within github. Removing these tabs here.
Currently no way of knowing what the utilisation of different worker pools is, and how often queue events are occurring.
Changed the stats to ones which appear to be intuitively more useful. Now queue time is logged. As there may be a small number of data points only mean and max is tracked. Secondly we log threshold events. On vnode_worker pool, the threshold will be half the limit (as we don't normally expect to get close to the limit). On the node_worker pools, the threshold will be the limit.
At a cost of backwards compatibility. The renamed stats are of minimal use, and in the most case relatively recent additions, for rarely used features.
As well as monitor queue time, monitor work time. As we know our pool sizes threshold breaches can be estimated by looking at the sum of the work time (as we have a count as well as the average) Easier to understand than threshold_events.
Otherwise average will be misleading
src/riak_core_worker_pool.erl
Outdated
@@ -278,3 +291,56 @@ discard_queued_work(Q, Mod) -> | |||
ok | |||
end. | |||
|
|||
-spec poolboy_checkin(pid(), pid(), state()) -> {ok, state()}. | |||
poolboy_checkin(Pool, Worker, State) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not wrong, but changing state as a side effect of a little function called makes it harder to follow the code. I kind of like to have state updates very declarative in a small scope.
In this case that would mean that you pass the pool name in as last argument, instead of the state, and return the checkout or an error. Then you do the state update in the function calling this one...
I do see that this may give some code duplication and probably it's fine this way, but it is harder to get an idea of what happens to the state in the main "loop".
Manipulate state only in the main loop
Use Monitors[0]/Checkouts[0] everywhere
Add monitoring via riak stats to the worker queues. Monitor both queue time, and work time.