Conversation

@laxmanchekka
Contributor

Description

Added gRPC server and client metrics monitoring.

Testing

Tested manually

Checklist:

  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules

Documentation

NA

laxmanchekka and others added 3 commits December 6, 2022 18:36
* refactor: example

* upgrade grpc-utils lib to the latest

Co-authored-by: Laxman Ch <laxman@traceable.ai>
@codecov

codecov bot commented Dec 7, 2022

Codecov Report

Merging #62 (46f57de) into main (11e9213) will not change coverage.
The diff coverage is 94.28%.

@@            Coverage Diff            @@
##               main      #62   +/-   ##
=========================================
  Coverage     69.50%   69.50%           
+ Complexity      106      105    -1     
=========================================
  Files            15       15           
  Lines           564      564           
  Branches         32       33    +1     
=========================================
  Hits            392      392           
  Misses          153      153           
  Partials         19       19           
Flag Coverage Δ
unit 69.50% <94.28%> (ø)

Impacted Files Coverage Δ
...viceframework/metrics/PlatformMetricsRegistry.java 82.78% <94.28%> (ø)

@laxmanchekka
Contributor Author

Example metrics

Server metrics

# HELP grpc_server_responses_sent_messages_total The total number of responses sent
# TYPE grpc_server_responses_sent_messages_total counter
grpc_server_responses_sent_messages_total{app="query-service",method="execute",methodType="SERVER_STREAMING",service="org.hypertrace.core.query.service.QueryService",} 135.0
grpc_server_responses_sent_messages_total{app="query-service",method="Check",methodType="UNARY",service="grpc.health.v1.Health",} 98.0

Client metrics

# HELP grpc_client_processing_duration_seconds The total time taken for the client to complete the call, including network delay
# TYPE grpc_client_processing_duration_seconds summary
grpc_client_processing_duration_seconds_count{app="query-service",method="findAttributes",methodType="SERVER_STREAMING",service="org.hypertrace.core.attribute.service.v1.AttributeService",statusCode="OK",} 1.0
grpc_client_processing_duration_seconds_sum{app="query-service",method="findAttributes",methodType="SERVER_STREAMING",service="org.hypertrace.core.attribute.service.v1.AttributeService",statusCode="OK",} 0.393107214
grpc_client_processing_duration_seconds_count{app="query-service",method="Check",methodType="UNARY",service="grpc.health.v1.Health",statusCode="OK",} 98.0
grpc_client_processing_duration_seconds_sum{app="query-service",method="Check",methodType="UNARY",service="grpc.health.v1.Health",statusCode="OK",} 0.26274974
# HELP grpc_client_processing_duration_seconds_max The total time taken for the client to complete the call, including network delay
# TYPE grpc_client_processing_duration_seconds_max gauge
grpc_client_processing_duration_seconds_max{app="query-service",method="findAttributes",methodType="SERVER_STREAMING",service="org.hypertrace.core.attribute.service.v1.AttributeService",statusCode="OK",} 0.393107214
grpc_client_processing_duration_seconds_max{app="query-service",method="Check",methodType="UNARY",service="grpc.health.v1.Health",statusCode="OK",} 0.003172604
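For readers of the sample output: dividing a summary's _sum by its _count gives the mean call duration. The sketch below does that arithmetic for the two client series shown above. The class and method names are hypothetical and exist only for illustration; they are not part of this PR.

```java
import java.util.Locale;

/** Derives a mean call duration from the _sum and _count samples of a summary. */
final class SummaryMean {

    /** Returns sumSeconds / count, or 0.0 when no calls were recorded. */
    static double meanSeconds(double sumSeconds, double count) {
        return count == 0.0 ? 0.0 : sumSeconds / count;
    }

    public static void main(String[] args) {
        // Sample values copied from the client metrics above.
        double findAttributes = meanSeconds(0.393107214, 1.0);
        double healthCheck = meanSeconds(0.26274974, 98.0);

        System.out.printf(Locale.ROOT, "findAttributes mean: %.6f s%n", findAttributes);
        System.out.printf(Locale.ROOT, "Check mean: %.6f s%n", healthCheck);
    }
}
```

For the health check series this works out to roughly 2.7 ms per call, which sits below the reported _max of 0.003172604 seconds, as expected.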

@Test
public void testMetricInitialization() {
PlatformService service = getService(Map.of("service.name", "test-service",
PlatformService service = getService(Map.of("service.name", "sample-app",
Contributor

Any reason for the changes in here? The default tag change should be transparent (and is, if I'm reading the test changes correctly).

Contributor Author

@laxmanchekka laxmanchekka Dec 7, 2022

With the new changes, these tests fail when they run with different default tags.
The tests simulate a reload scenario (lifecycle stop and start). However, commonTags has no provision for cleanup or destroy, so we would need to destroy the registry itself and recreate it.

By using the same app name in every test, we bypass the test-order dependency.

Contributor

Isn't that a caught regression though? In the prior implementation, it reset its state on start/stop and it no longer does. Seems like it would be easy enough to reset the meter registry, right?
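One way to picture the reset being asked for here is to replace the registry instance on stop, so that no tags or meters survive into the next start. The sketch below is only an illustration under stated assumptions: RegistryResetSketch and FakeRegistry are hypothetical stand-ins built on the standard library, and they are neither the Micrometer API nor the PlatformMetricsRegistry code in this PR. The stand-in only models the property described above, namely that common tags can be added but not removed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical holder that swaps in a fresh registry on stop(). */
final class RegistryResetSketch {

    /** Minimal stand-in for a registry whose common tags can only be added, never removed. */
    static final class FakeRegistry {
        private final List<String> commonTags = new ArrayList<>();

        void addCommonTags(Map<String, String> tags) {
            // Sort by key so the recorded order is deterministic.
            new TreeMap<>(tags).forEach((key, value) -> commonTags.add(key + "=" + value));
        }

        List<String> commonTags() {
            return List.copyOf(commonTags);
        }
    }

    private static FakeRegistry registry = new FakeRegistry();

    static synchronized void start(Map<String, String> defaultTags) {
        registry.addCommonTags(defaultTags);
    }

    static synchronized void stop() {
        // Replacing the instance drops everything the old one accumulated.
        registry = new FakeRegistry();
    }

    static synchronized List<String> currentTags() {
        return registry.commonTags();
    }

    public static void main(String[] args) {
        start(Map.of("app", "test-service"));
        stop();
        start(Map.of("app", "sample-app"));
        System.out.println(currentTags()); // prints [app=sample-app]
    }
}
```

Without the replacement in stop(), the second start() would leave both app tags in place, which is the kind of leftover state that makes tests depend on their execution order.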

if (tags == null || tags.isEmpty()) {
return DEFAULT_TAGS;
private static Iterable<Tag> toIterable(Map<String, String> tags) {
Set<Tag> newTags = new HashSet<>();
Contributor

Any concern about reproducible ordering? We could just as easily use a list.
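To illustrate the ordering concern: a HashSet makes no guarantee about iteration order, whereas a list built from sorted keys always comes out the same way. The sketch below goes one step beyond the suggestion by sorting as well, since a list alone only preserves whatever order the source map happens to iterate in. OrderedTags is a hypothetical helper, and plain "key=value" strings stand in for the Tag type so that the example needs only the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that converts a tag map into a deterministically ordered list. */
final class OrderedTags {

    /** Returns "key=value" entries sorted by key; a null map yields an empty list. */
    static List<String> toOrderedList(Map<String, String> tags) {
        List<String> result = new ArrayList<>();
        if (tags == null) {
            return result;
        }
        // TreeMap iterates in natural key order, so the output does not
        // depend on the iteration order of the input map.
        for (Map.Entry<String, String> entry : new TreeMap<>(tags).entrySet()) {
            result.add(entry.getKey() + "=" + entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> tags = Map.of("service", "QueryService", "app", "query-service");
        System.out.println(toOrderedList(tags)); // prints [app=query-service, service=QueryService]
    }
}
```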

Set<MeterRegistry> registries = new HashSet<>(METER_REGISTRY.getRegistries());
registries.forEach(METER_REGISTRY::remove);
meterRegistry.getRegistries().forEach(MeterRegistry::close);
meterRegistry.forEachMeter(meterRegistry::remove);
Contributor

Nit: a lot of this code (lines 430-434, I believe) is redundant now that we're just replacing the whole registry. We don't need to go through and clear things out any more.

@laxmanchekka laxmanchekka merged commit 177d547 into main Dec 8, 2022
@laxmanchekka laxmanchekka deleted the grpc-metrics branch December 8, 2022 02:49
@github-actions

github-actions bot commented Dec 8, 2022

Unit Test Results

  9 files  ±0    9 suites  ±0   11s ⏱️ -2s
31 tests ±0  31 ✔️ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 177d547. ± Comparison against base commit 11e9213.

4 participants