feat: improved leopard saturation metrics #50
Merged
pmcclory-pp merged 1 commit on May 19, 2026
29223b6 to c143056
bougyman (Member) approved these changes on May 19, 2026 and left a comment:
Feat seems reasonable to me.
This improves the Leopard saturation metrics. The original metrics assumed that the processing concurrency for a subject was always the default of 1: if the semaphore for a subject had no available permits, the subject was reported as busy on that worker; otherwise it was reported as idle.
In our prod environment, where we use Leopard, we monkey patch nats-pure to increase the processing concurrency, which breaks this assumption. The metrics now return the actual number of in-flight requests for a subject. Here, in flight means the request holds one of the subject's processing slots, so the in-flight count never exceeds the processing concurrency. An in-flight request still might not run immediately if the executor's thread pool is saturated (more on that below).
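To make the difference between the old busy/idle reading and the new slot counts concrete, here is a minimal sketch. It is not Leopard's or nats-pure's actual code: `SubjectSlots` and its methods are hypothetical names that model a per-subject semaphore using only the Ruby standard library.

```ruby
# Hypothetical model of per-subject processing slots.
# Not the nats-pure or Leopard API; names are illustrative only.
class SubjectSlots
  def initialize(capacity)
    @capacity = capacity # configured processing concurrency
    @busy = 0            # slots currently held by in-flight messages
    @pending = 0         # messages waiting to acquire a slot
    @mutex = Mutex.new
  end

  # A message arrives: take a slot if one is free, otherwise wait.
  def try_acquire
    @mutex.synchronize do
      if @busy < @capacity
        @busy += 1
        true
      else
        @pending += 1
        false
      end
    end
  end

  # A message finishes: hand its slot to a waiter, or free it.
  def release
    @mutex.synchronize do
      if @pending > 0
        @pending -= 1
      elsif @busy > 0
        @busy -= 1
      end
    end
  end

  def snapshot
    @mutex.synchronize { { busy: @busy, capacity: @capacity, pending: @pending } }
  end
end

slots = SubjectSlots.new(4)
3.times { slots.try_acquire }
stats = slots.snapshot

# Old-style reading: busy only when no permits remain.
old_busy = stats[:busy] == stats[:capacity]
puts "old metric says busy? #{old_busy}"
puts "new metric: #{stats[:busy]}/#{stats[:capacity]} slots busy"
```

With a concurrency of 4 and three messages in flight, the old boolean reading reports the subject as idle, while the slot counts show it is three quarters full.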
The second issue was that subject saturation told an incomplete story of overall worker saturation. Each worker gets a thread pool with 24 threads, and requests for all subjects are assigned to that pool. With enough subjects registered, it is possible that no individual subject is fully saturated while the thread pool itself is. This change exposes metrics for the thread pool as well.
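A small worked example of that situation, as a sketch only. The subject count, the per-subject concurrency of 4, and the `pool_report` helper are assumptions made for illustration; only the pool size of 24 comes from the description. It also simplifies by treating every in-flight request as needing exactly one pool thread.

```ruby
POOL_THREADS = 24 # per-worker pool size stated in the description

# Hypothetical workload: 10 subjects, each with 4 slots and 3 of them busy.
subjects = Array.new(10) { |i| { subject: "subject.#{i}", busy: 3, capacity: 4 } }

# Hypothetical helper: derive pool-level numbers from per-subject slot counts.
def pool_report(subjects, pool_threads)
  in_flight = subjects.sum { |s| s[:busy] }
  {
    saturated_subjects: subjects.count { |s| s[:busy] >= s[:capacity] },
    active_threads: [in_flight, pool_threads].min,
    queued_tasks: [in_flight - pool_threads, 0].max
  }
end

report = pool_report(subjects, POOL_THREADS)
puts report
```

Here no subject is at capacity, yet the 30 in-flight requests exceed the 24 threads, so the pool is full and 6 tasks hold a slot while waiting for a thread. Subject-level metrics alone would not show this.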
Here's a summary of the metric changes:

Metrics

Subject metrics (labels: `subject`, `worker`)
- `leopard_subject_busy_slots`: thread slots actively processing a message for this subject on this worker
- `leopard_subject_capacity_slots`: total thread slots allocated for this subject on this worker (reflects the actual configured processing concurrency, not the default of 1)
- `leopard_subject_pending_messages`: messages waiting to acquire a processing slot for this subject on this worker

Executor metrics (labels: `worker`)
- `leopard_executor_active_threads`: approximate number of active threads in the worker's subscription executor thread pool
- `leopard_executor_max_threads`: maximum threads available in the subscription executor thread pool
- `leopard_executor_queued_tasks`: tasks holding a semaphore permit but waiting for a free executor thread; nonzero only when the executor pool is fully saturated

Replaced metrics
- `leopard_subject_busy_instances` → `leopard_subject_busy_slots` / `leopard_subject_capacity_slots`
- `leopard_subject_total_instances` → `leopard_subject_capacity_slots`
- `leopard_subject_pending_messages`: retained, but labels changed (now includes `worker`)

Note: opened as a feat:, but I could see the argument that this is a breaking change. Let me know if it should be a major release.