New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify metric name and format in client libraries #572

Open
NeoCN opened this Issue Nov 30, 2017 · 14 comments

Comments

Projects
None yet
5 participants
@NeoCN
Contributor

NeoCN commented Nov 30, 2017

running both golang and java service, and found out the metric name generated was different, for example, with prometheus:

Java:

# HELP jaeger_reporter_queue  
# TYPE jaeger_reporter_queue gauge
jaeger_reporter_queue 0.0
# HELP jaeger_spans_total  
# TYPE jaeger_spans_total counter
jaeger_spans_total{group="sampling",sampled="y",} 152.0
jaeger_spans_total{group="lifecycle",sampled="started",} 152.0
jaeger_spans_total{group="lifecycle",sampled="finished",} 152.0
# HELP jaeger_reporter_spans_total  
# TYPE jaeger_reporter_spans_total counter
jaeger_reporter_spans_total{state="success",} 63.0

Golang

# HELP jaeger:finished_spans jaeger:finished_spans
# TYPE jaeger:finished_spans counter
jaeger:finished_spans 0
# HELP jaeger:reporter_queue_length jaeger:reporter_queue_length
# TYPE jaeger:reporter_queue_length gauge
jaeger:reporter_queue_length 0
# HELP jaeger:reporter_spans jaeger:reporter_spans
# TYPE jaeger:reporter_spans counter
jaeger:reporter_spans{result="dropped"} 0
jaeger:reporter_spans{result="err"} 0
jaeger:reporter_spans{result="ok"} 0

Unify metric name and format in client libraries can help metric collector and monitor

@yurishkuro

This comment has been minimized.

Member

yurishkuro commented Nov 30, 2017

Thanks for opening this. Yes, I just recently merged a change to the Go client with renamed metrics. The objective is to change all clients to emit similar metrics.

I just booked issues for the other clients.

@jpkrohling

This comment has been minimized.

Member

jpkrohling commented Feb 1, 2018

@NeoCN , how did you get the output listed under the "java service" on your first comment?

@NeoCN

This comment has been minimized.

Contributor

NeoCN commented Feb 1, 2018

@jpkrohling impl StatsReporter with Micrometer API, then get metrics output with micrometer-registry-prometheus

@jpkrohling

This comment has been minimized.

Member

jpkrohling commented Feb 1, 2018

Is it your own implementation of the StatsReporter? And, more importantly, would you be interested in contributing this implementation? @objectiser has just mentioned Micrometer to me, so, we might solve two items in one :)

@objectiser

This comment has been minimized.

Contributor

objectiser commented Feb 1, 2018

+1 would be great to get this contributed!

@NeoCN

This comment has been minimized.

Contributor

NeoCN commented Feb 1, 2018

@jpkrohling @objectiser yes, my own implementation; I can make a pull request tomorrow, it is too late to go bed now.

@yurishkuro

This comment has been minimized.

Member

yurishkuro commented Mar 12, 2018

Some people have complained (#732 and jaegertracing/jaeger-client-java#335 (comment)) that the colon : is not a great namespace separator.

@SuperQ

This comment has been minimized.

SuperQ commented May 30, 2018

@yurishkuro Correct, the colon is "reserved" for end users, and should not be used in metric names as exposed by source. There is no technical namespace separator in Prometheus metrics right now. We've been considering it as part of OpenMetrics, but for now we don't have a solution.

@yurishkuro

This comment has been minimized.

Member

yurishkuro commented May 30, 2018

we just released 1.5 where the colon is no longer used, but that only applies to the backend metrics.

@objectiser

This comment has been minimized.

Contributor

objectiser commented Oct 4, 2018

We still have an issue with consistency across clients - only looking at Golang and Java currently:

Baggage

Go:
jaeger_client_jaeger_baggage_restrictions_updates{result="ok"} 0
jaeger_client_jaeger_baggage_restrictions_updates{result="err"} 0
jaeger_client_jaeger_baggage_truncations 0
jaeger_client_jaeger_baggage_updates{result="ok"} 0
jaeger_client_jaeger_baggage_updates{result="err"} 0

Java:
jaeger:baggage_restrictions_updates_total{result="ok",} 0.0
jaeger:baggage_restrictions_updates_total{result="err",} 0.0
jaeger:baggage_truncations_total 0.0
jaeger:baggage_updates_total{result="ok",} 263.0
jaeger:baggage_updates_total{result="err",} 0.0

Reporter

Go:
jaeger_client_jaeger_reporter_queue_length 0
jaeger_client_jaeger_reporter_spans{result="ok"} 5
jaeger_client_jaeger_reporter_spans{result="err"} 4
jaeger_client_jaeger_reporter_spans{result="dropped"} 0

Java:
jaeger:reporter_queue_length 0.0
jaeger:reporter_spans_total{result="ok",} 948.0
jaeger:reporter_spans_total{result="err",} 0.0
jaeger:reporter_spans_total{result="dropped",} 0.0

Sampler

Go:
jaeger_client_jaeger_sampler_updates{result="ok"} 0
jaeger_client_jaeger_sampler_updates{result="err"} 0
jaeger_client_jaeger_sampler_queries{result="ok"} 0
jaeger_client_jaeger_sampler_queries{result="err"} 0

Java:
jaeger:sampler_updates_total{result="ok",} 0.0
jaeger:sampler_updates_total{result="err",} 0.0
jaeger:sampler_queries_total{result="ok",} 0.0
jaeger:sampler_queries_total{result="err",} 0.0

Spans

Go:
jaeger_client_jaeger_span_context_decoding_errors 0
jaeger_client_jaeger_started_spans{sampled="n"} 0
jaeger_client_jaeger_started_spans{sampled="y"} 9
jaeger_client_jaeger_finished_spans 9

Java:
jaeger:span_context_decoding_errors_total 0.0
jaeger:started_spans_total{sampled="y",} 952.0
jaeger:started_spans_total{sampled="n",} 0.0
jaeger:finished_spans_total 951.0

Traces

Go:
jaeger_client_jaeger_traces{sampled="n",state="joined"} 0
jaeger_client_jaeger_traces{sampled="n",state="started"} 0
jaeger_client_jaeger_traces{sampled="y",state="joined"} 0
jaeger_client_jaeger_traces{sampled="y",state="started"} 9

Java:
jaeger:traces_total{sampled="y",state="started",} 0.0
jaeger:traces_total{sampled="n",state="started",} 0.0
jaeger:traces_total{sampled="y",state="joined",} 308.0
jaeger:traces_total{sampled="n",state="joined",} 0.0

Proposal

There are common metrics across the two languages with consistent tags which is good. The Go client has some additional metrics based on throttling. Haven't looked at the rpc request metrics yet.

Therefore this should just be a renaming exercise:

  • The Go client should replace jaeger_client_jaeger_ with just jaeger_client_
  • The Java client should replace prefix jaeger: with jaeger_client_ and remove the suffix _total

One question would be - should the language be a tag, to enable metrics to be segmented by language if necessary?

Thoughts on the proposal?

@SuperQ

This comment has been minimized.

SuperQ commented Oct 4, 2018

Deleted my previous comment, it was redundant. ☕️ lacking today.

The proposal looks good.

I don't see a reason to have different metric names per language. As long as metrics have the same meaning, they can remain the same.

If you had something that was java specific, I might namespace the metrics like jaeger_java_client_....

@objectiser

This comment has been minimized.

Contributor

objectiser commented Oct 4, 2018

@SuperQ Thanks for the feedback.

I was more wondering whether all the metrics produced by each language client should include a tag to identify which language client produced it? Not sure if that would be useful, but just an idea.

@SuperQ

This comment has been minimized.

SuperQ commented Oct 4, 2018

No, I don't think a label for language would make sense. It's not really necessary in the Prometheus ecosystem.

@objectiser

This comment has been minimized.

Contributor

objectiser commented Oct 5, 2018

Discussion at today's team meeting - agreed that we should standardise across languages, using jaeger_tracer_ as the default prefix, but allow the application to override the prefix if they wish.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment