Skip to content

Commit

Permalink
Improve metrics design
Browse files Browse the repository at this point in the history
- Remove "histogram" from histogram family name ans help. This is
because this generates buckets, sum and count. Sum and count are
not actually histograms, and the family name could be confusing.

- Add preventive counters creation (empty labels) to be visible in
grafana even nothing is scraped. This eases dashboard design just
starting the process but not needing to generate traffic to
"activate" those metrics.

- In enableMetrics() we provide the source label whoch shall be dynamic.

- In class name (Http2Server/Http2Client) we should pass a static
family name prefix.
  • Loading branch information
testillano authored and Eduardo Ramos Testillano (eramedu) committed Aug 13, 2023
1 parent 2548c2c commit 955168b
Show file tree
Hide file tree
Showing 4 changed files with 59 additions and 36 deletions.
27 changes: 18 additions & 9 deletions include/ert/http2comm/Http2Client.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -162,12 +162,14 @@ class Http2Client
/**
* Class constructor given host, port and secure connection indicator
*
* @param name client name. It may be used to prefix the name of metrics families
* (counters, gauges, histograms), so consider to provide a compatible name ([a-zA-Z0-9:_]).
* A good name convention would include the application name and the endpoint identification,
* for example:
* h2agent_traffic_client_myClient4
* udp_server_h2client[_traffic_client]
* @param name class name. It may be used to prefix the family name for every metric managed by
* the class (counters, gauges, histograms), so consider to provide a compatible metrics name
* ([a-zA-Z0-9:_]). You may also avoid dynamic names to build predictable grafana dashboards.
* Dynamic information shall be passed to enableMetrics() to be part of 'source' label. For
* example, the application generic name (project) and HTTP2 endpoint category would be
* appropriate class names:
*
* - h2agent_traffic_client
*
* @param host Endpoint host
* @param port Endpoint port
Expand All @@ -179,15 +181,22 @@ class Http2Client
/**
* Enable metrics
*
* The name of families created will be prefixed by the class name given in the constructor.
*
* @param metrics Optional metrics object to compute counters and histograms
* @param responseDelaySecondsHistogramBucketBoundaries Optional bucket boundaries for response delay seconds histogram
* @param messageSizeBytesHistogramBucketBoundaries Optional bucket boundaries for message size bytes histogram
* @param source Source label for prometheus metrics. If missing, class name will be taken (even being redundant with
* family name prefix as will ease metrics filtering anyway). A good source convention could be the process name and
* the endpoint identification:
*
* - h2agent[_traffic_client]_myClientToB: optional endpoint category, as it would be deducted from endpoint identifier itself or family name
* - h2agent_myClientToA
* - h2agentB_myClientToA
* - udp_server_h2client[_myClientToA]: optional endpoint identifier as could be inferred from process name, because
* 'udp-server-h2client' application has only one client
*/
void enableMetrics(ert::metrics::Metrics *metrics,
const ert::metrics::bucket_boundaries_t &responseDelaySecondsHistogramBucketBoundaries = {},
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries = {});
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries = {}, const std::string &source = "");

/**
* Send request to the server
Expand Down
38 changes: 22 additions & 16 deletions include/ert/http2comm/Http2Server.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -110,21 +110,23 @@ class Http2Server
public:

/**
* Class constructor
* Class constructor
*
* @param name Server name. It may be used to prefix the name of metrics families
* (counters, gauges, histograms), so consider to provide a compatible name ([a-zA-Z0-9:_]).
* A good name convention would include the application name and the endpoint identification,
* for example:
* h2agent_traffic_server
* h2agentB_traffic_server
* @param name class name. It may be used to prefix the family name for every metric managed by
* the class (counters, gauges, histograms), so consider to provide a compatible metrics name
* ([a-zA-Z0-9:_]). You may also avoid dynamic names to build predictable grafana dashboards.
* Dynamic information shall be passed to enableMetrics() to be part of 'source' label. For
* example, the application generic name (project) and HTTP2 endpoint category would be
* appropriate class names:
*
* @param workerThreads number of worker threads.
* @param maxWorkerThreads number of maximum worker threads which internal processing could grow to. Defaults to '0' which means that maximum equals to provided worker threads.
* @param timerIoContext Optional io context to manage response delays
* @param queueDispatcherMaxSize This library implements a simple congestion control algorithm which will indicate congestion status when queue dispatcher (when used) has no
* idle consumer threads, and queue dispatcher size is over this value. Defaults to -1 which means 'no limit' to grow the queue (this probably implies response time degradation).
* So, to enable the described congestion control algorithm, provide a non-negative value.
* - h2agent_traffic_server
*
* @param workerThreads number of worker threads.
* @param maxWorkerThreads number of maximum worker threads which internal processing could grow to. Defaults to '0' which means that maximum equals to provided worker threads.
* @param timerIoContext Optional io context to manage response delays
* @param queueDispatcherMaxSize This library implements a simple congestion control algorithm which will indicate congestion status when queue dispatcher (when used) has no
* idle consumer threads, and queue dispatcher size is over this value. Defaults to -1 which means 'no limit' to grow the queue (this probably implies response time degradation).
* So, to enable the described congestion control algorithm, provide a non-negative value.
*/
Http2Server(const std::string& name, size_t workerThreads, size_t maxWorkerThreads = 0, boost::asio::io_context *timerIoContext = nullptr, int queueDispatcherMaxSize = -1 /* no limit */);
virtual ~Http2Server();
Expand Down Expand Up @@ -160,15 +162,19 @@ class Http2Server
/**
* Enable metrics
*
* The name of families created will be prefixed by the class name given in the constructor.
*
* @param metrics Optional metrics object to compute counters and histograms
* @param responseDelaySecondsHistogramBucketBoundaries Optional bucket boundaries for response delay seconds histogram
* @param messageSizeBytesHistogramBucketBoundaries Optional bucket boundaries for message size bytes histogram
* @param source Source label for prometheus metrics. If missing, class name will be taken (even being redundant with
* family name prefix as will ease metrics filtering anyway). A good source convention could be the process name and
* the endpoint identification:
*
* - h2agent[_traffic_server]: optional endpoint category, as it would be deducted from family name
* - h2agentB
*/
void enableMetrics(ert::metrics::Metrics *metrics,
const ert::metrics::bucket_boundaries_t &responseDelaySecondsHistogramBucketBoundaries = {},
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries = {});
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries = {}, const std::string &source = "");

/**
* Sets the server key password to use with TLS/SSL
Expand Down
16 changes: 10 additions & 6 deletions src/Http2Client.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -67,18 +67,21 @@ Http2Client::Http2Client(const std::string& name, const std::string& host, const

void Http2Client::enableMetrics(ert::metrics::Metrics *metrics,
const ert::metrics::bucket_boundaries_t &responseDelaySecondsHistogramBucketBoundaries,
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries) {
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries, const std::string &source) {

metrics_ = metrics;

if (metrics_) {

ert::metrics::labels_t familyLabels = {{"source", name_}};
ert::metrics::labels_t familyLabels = {{"source", (source.empty() ? name_:source)}};

observed_requests_sents_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_resquests_sents_counter"), std::string("Requests sents observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_requests_sents_counter = &(observed_requests_sents_counter_family_ptr_->Add({{"method", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation
observed_requests_unsents_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_resquests_unsent_counter"), std::string("Requests unsents observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_requests_unsents_counter = &(observed_requests_unsents_counter_family_ptr_->Add({{"method", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation
observed_responses_received_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_responses_received_counter"), std::string("Responses received observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_responses_received_counter = &(observed_responses_received_counter_family_ptr_->Add({{"method", ""}, {"status_code", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation
observed_responses_timedout_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_responses_timedout_counter"), std::string("Responses timed-out observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_responses_timedout_counter = &(observed_responses_timedout_counter_family_ptr_->Add({{"method", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation

responses_delay_seconds_gauge_family_ptr_ = &(metrics_->addGaugeFamily(name_ + std::string("_responses_delay_seconds_gauge"), std::string("Message responses delay gauge (seconds) in ") + name_, familyLabels));
responses_delay_seconds_gauge_ = &(responses_delay_seconds_gauge_family_ptr_->Add({})); // we could create ad-hoc gauges for each http2 method, but it is probably overkilling
Expand All @@ -87,11 +90,12 @@ void Http2Client::enableMetrics(ert::metrics::Metrics *metrics,
received_messages_size_bytes_gauge_family_ptr_ = &(metrics_->addGaugeFamily(name_ + std::string("_received_messages_size_bytes_gauge"), std::string("Received messages sizes gauge (bytes) in ") + name_, familyLabels));
received_messages_size_bytes_gauge_ = &(received_messages_size_bytes_gauge_family_ptr_->Add({})); // we could create ad-hoc gauges for each http2 method, but it is probably overkilling

responses_delay_seconds_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_responses_delay_seconds_histogram"), std::string("Message responses delay histogram (seconds) in ") + name_, familyLabels));
// Omit 'histogram' from family name, because it is confusing: this kind of metric generates 'bucket', 'sum' and 'count' which are good enough to understand the concept behind (only bucket is histogram):
responses_delay_seconds_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_responses_delay_seconds"), std::string("Message responses delay (seconds) in ") + name_, familyLabels));
responses_delay_seconds_histogram_ = &(responses_delay_seconds_histogram_family_ptr_->Add({}, responseDelaySecondsHistogramBucketBoundaries)); // we could create ad-hoc histograms for each http2 method, but it is probably overkilling
sent_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_sent_messages_size_bytes_histogram"), std::string("Sent messages sizes histogram (bytes) in ") + name_, familyLabels));
sent_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_sent_messages_size_bytes"), std::string("Sent messages sizes (bytes) in ") + name_, familyLabels));
sent_messages_size_bytes_histogram_ = &(sent_messages_size_bytes_histogram_family_ptr_->Add({}, messageSizeBytesHistogramBucketBoundaries)); // we could create ad-hoc histograms for each http2 method, but it is probably overkilling
received_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_received_messages_size_bytes_histogram"), std::string("Received messages sizes histogram (bytes) in ") + name_, familyLabels));
received_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_received_messages_size_bytes"), std::string("Received messages sizes (bytes) in ") + name_, familyLabels));
received_messages_size_bytes_histogram_ = &(received_messages_size_bytes_histogram_family_ptr_->Add({}, messageSizeBytesHistogramBucketBoundaries)); // we could create ad-hoc histograms for each http2 method, but it is probably overkilling
}
}
Expand Down
14 changes: 9 additions & 5 deletions src/Http2Server.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,16 +82,19 @@ int Http2Server::getQueueDispatcherMaxSize() const {

void Http2Server::enableMetrics(ert::metrics::Metrics *metrics,
const ert::metrics::bucket_boundaries_t &responseDelaySecondsHistogramBucketBoundaries,
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries) {
const ert::metrics::bucket_boundaries_t &messageSizeBytesHistogramBucketBoundaries, const std::string &source) {

metrics_ = metrics;

if (metrics_) {
ert::metrics::labels_t familyLabels = {{"source", name_}};
ert::metrics::labels_t familyLabels = {{"source", (source.empty() ? name_:source)}};

observed_requests_accepted_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_resquests_accepted_counter"), std::string("Requests accepted observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_requests_accepted_counter = &(observed_requests_accepted_counter_family_ptr_->Add({{"method", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation
observed_requests_errored_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_resquests_errored_counter"), std::string("Requests errored observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_requests_errored_counter = &(observed_requests_errored_counter_family_ptr_->Add({{"method", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation
observed_responses_counter_family_ptr_ = &(metrics_->addCounterFamily(name_ + std::string("_observed_responses_counter"), std::string("Responses observed counter in ") + name_, familyLabels));
ert::metrics::counter_t *observed_responses_counter = &(observed_responses_counter_family_ptr_->Add({{"method", ""}, {"status_code", ""}})); // preventive creation: this way it will be scraped even not processed and will ease dashboard creation

responses_delay_seconds_gauge_family_ptr_ = &(metrics_->addGaugeFamily(name_ + std::string("_responses_delay_seconds_gauge"), std::string("Message responses delay gauge (seconds) in ") + name_, familyLabels));
responses_delay_seconds_gauge_ = &(responses_delay_seconds_gauge_family_ptr_->Add({})); // we could create ad-hoc gauges for each http2 method, but it is probably overkilling
Expand All @@ -100,11 +103,12 @@ void Http2Server::enableMetrics(ert::metrics::Metrics *metrics,
sent_messages_size_bytes_gauge_family_ptr_ = &(metrics_->addGaugeFamily(name_ + std::string("_sent_messages_size_bytes_gauge"), std::string("Sent messages sizes gauge (bytes) in ") + name_, familyLabels));
sent_messages_size_bytes_gauge_ = &(sent_messages_size_bytes_gauge_family_ptr_->Add({})); // we could create ad-hoc gauges for each http2 method, but it is probably overkilling

responses_delay_seconds_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_responses_delay_seconds_histogram"), std::string("Message responses delay histogram (seconds) in ") + name_, familyLabels));
// Omit 'histogram' from family name, because it is confusing: this kind of metric generates 'bucket', 'sum' and 'count' which are good enough to understand the concept behind (only bucket is histogram):
responses_delay_seconds_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_responses_delay_seconds"), std::string("Message responses delay (seconds) in ") + name_, familyLabels));
responses_delay_seconds_histogram_ = &(responses_delay_seconds_histogram_family_ptr_->Add({}, responseDelaySecondsHistogramBucketBoundaries)); // we could create ad-hoc histograms for each http2 method, but it is probably overkilling
received_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_received_messages_size_bytes_histogram"), std::string("Received messages sizes histogram (bytes) in ") + name_, familyLabels));
received_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_received_messages_size_bytes"), std::string("Received messages sizes (bytes) in ") + name_, familyLabels));
received_messages_size_bytes_histogram_ = &(received_messages_size_bytes_histogram_family_ptr_->Add({}, messageSizeBytesHistogramBucketBoundaries)); // we could create ad-hoc histograms for each http2 method, but it is probably overkilling
sent_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_sent_messages_size_bytes_histogram"), std::string("Sent messages sizes histogram (bytes) in ") + name_, familyLabels));
sent_messages_size_bytes_histogram_family_ptr_ = &(metrics_->addHistogramFamily(name_ + std::string("_sent_messages_size_bytes"), std::string("Sent messages sizes (bytes) in ") + name_, familyLabels));
sent_messages_size_bytes_histogram_ = &(sent_messages_size_bytes_histogram_family_ptr_->Add({}, messageSizeBytesHistogramBucketBoundaries)); // we could create ad-hoc histograms for each http2 method, but it is probably overkilling
}
}
Expand Down

0 comments on commit 955168b

Please sign in to comment.