Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mas i1801 monitorworkerq #979

Merged
merged 13 commits into from
Nov 9, 2021
Merged

Conversation

martinsumner
Copy link
Contributor

Add monitoring via riak stats to the worker queues. Monitor both queue time, and work time.

The original commit to make the common riak_core_node_worker_pool behaviour contained rogue tab types that skew code formatting within github.  Removing these tabs here.
Currently no way of knowing what the utilisation of different worker pools is, and how often queue events are occurring.
Changed the stats to ones which appear to be intuitively more useful.

Now queue time is logged.  As there may be a small number of data points only mean and max is tracked.

Secondly we log threshold events. On vnode_worker pool, the threshold will be half the limit (as we don't normally expect to get close to the limit).  On the node_worker pools, the threshold will be the limit.
At a cost of backwards compatibility.  The renamed stats are of minimal use, and in the most case relatively recent additions, for rarely used features.
As well as monitor queue time, monitor work time.  As we know our pool sizes threshold breaches can be estimated by looking at the sum of the work time (as we have a count as well as the average)

Easier to understand than threshold_events.
@martinsumner martinsumner mentioned this pull request Nov 4, 2021
@martinsumner
Copy link
Contributor Author

basho/riak_test#1359

@@ -278,3 +291,56 @@ discard_queued_work(Q, Mod) ->
ok
end.

-spec poolboy_checkin(pid(), pid(), state()) -> {ok, state()}.
poolboy_checkin(Pool, Worker, State) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not wrong, but changing state as a side effect of a little function called makes it harder to follow the code. I kind of like to have state updates very declarative in a small scope.
In this case that would mean that you pass the pool name in as last argument, instead of the state, and return the checkout or an error. Then you do the state update in the function calling this one...

I do see that this may give some code duplication and probably it's fine this way, but it is harder to get an idea of what happens to the state in the main "loop".

Manipulate state only in the main loop
Use Monitors[0]/Checkouts[0] everywhere
@martinsumner martinsumner merged commit 7e0aa31 into develop-3.0 Nov 9, 2021
@martinsumner martinsumner deleted the mas-i1801-monitorworkerq branch November 9, 2021 14:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants