Tempo support #4278

Closed
visteras opened this issue Aug 16, 2021 · 34 comments
Labels
backlog: Triaged Issue added to backlog
enhancement: This is the preferred way to describe new end-to-end features.

Comments

@visteras

visteras commented Aug 16, 2021

Is your feature request related to a problem? Please describe.
I'm always frustrated when I have to use only one product because the others are not supported...

Describe the solution you'd like
Support Tempo

Describe alternatives you've considered
I checked Jaeger and other products, but I settled on Tempo, because I can't pay for the big k8s cluster I would need with Jaeger((

Additional context
I checked the code that works with Jaeger, and if this is all that is needed to implement Tempo support, I can try to write a client for Tempo.. but I'm very bad with tests(
client interface:

kiali/jaeger/client.go

Lines 30 to 35 in f432cf1

type ClientInterface interface {
	GetAppTraces(ns, app string, query models.TracingQuery) (traces *JaegerResponse, err error)
	GetTraceDetail(traceId string) (*JaegerSingleTrace, error)
	GetErrorTraces(ns, app string, duration time.Duration) (errorTraces int, err error)
	GetServiceStatus() (available bool, err error)
}

@visteras visteras added the enhancement This is the preferred way to describe new end-to-end features. label Aug 16, 2021
@lucasponce
Contributor

lucasponce commented Aug 16, 2021

Hi @visteras,
Thanks for the issue.
Kiali supports the default addons shipped with Istio.
Perhaps you could share how you set up Istio + Tracing (with Tempo); that would give us some good information about the effort involved and help the community upvote the request.

In the past there was a similar request to support Zipkin, but it got no traction in the community.

Also, the plan is to try to evolve towards OpenTracing interfaces, so any product that follows a standard could be pluggable (but this is just a wish and an early idea people are brainstorming on).

@abonas
Contributor

abonas commented Aug 16, 2021

@jpkrohling fyi ^^

@jmazzitelli
Collaborator

As @lucasponce says above, we'd most likely enhance Kiali to support the OpenTracing standard in the future, rather than enhance Kiali to support another one-off tracing implementation. We had a similar request for Zipkin and closed that issue 12 days ago for that very reason.

@lucasponce we should have an enhancement request "Support OpenTracing" - I thought we already had one, but I couldn't find it.

@lucasponce
Contributor

The problem is that there is no single "Open Tracing" library that we can use to consume any tracing solution in that sense.
So such an issue would be set to "waiting on external"; probably this one could be moved to a discussion and used to track the progress of that effort.

@visteras
Author

I think OpenTracing is very good, but I really think OpenTracing is only about "sends to application (jaeger / tempo / etc)", but if we need to get information from tempo or jaeger, we need one client for jaeger and another for tempo... or not?

@lucasponce
Contributor

we need one client for jaeger and another for tempo... or not?

Correct, that's the problem to consume this information.
There is WIP in the community to also use a standard OpenTelemetry API for the endpoints, but as far as I know it is at an early stage and clients (like Kiali) need to use native libraries.

We're still looking at it. Kiali can already use the new gRPC interfaces for Jaeger, and once there is a standard Proto that defines the API, it should help to query other solutions.

But right now, I think Jaeger is pushing this, Zipkin is not moving in that direction (AFAIK), and if Tempo or other solutions also enable these kinds of endpoints, that could simplify how to consume the data.

Otherwise, from Kiali's perspective, you need to pull in the dependencies for every solution and code the different clients.

We'll move in that direction; also, at some point there should be a library that helps with this.

@visteras
Author

And so we settled on the fact that at the moment we need to use multiple libraries to support multiple services. Is there a tentative timeline for supporting Tempo?
I found the Tempo Query API here: https://grafana.com/docs/tempo/latest/api_docs/#query
Buuut ...: here I found Tempo requests are Jaeger requests with GRPC support: https://grafana.com/docs/tempo/latest/api_docs/pushing-spans-with-http/#retrieving-traces
And as I understand it, I can try to use Tempo "as is", but I need to understand how to add it to Kiali ...
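
As a rough sketch of what querying Tempo "as is" might look like, the snippet below builds the URL for the trace-by-ID endpoint (`/api/traces/<traceid>`, as described in the linked API docs) and fetches the raw response body. The helper names `traceByIDURL` and `fetchTrace`, the host `tempo.example`, the port `3200` and the trace ID are assumptions made for illustration; none of this is Kiali or Tempo code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// traceByIDURL builds the trace-by-ID URL for a Tempo base URL.
// The /api/traces/<traceid> path is taken from the Tempo API docs.
func traceByIDURL(baseURL, traceID string) (string, error) {
	if traceID == "" {
		return "", fmt.Errorf("empty trace ID")
	}
	u, err := url.Parse(strings.TrimRight(baseURL, "/"))
	if err != nil {
		return "", fmt.Errorf("parse base URL: %w", err)
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("base URL %q needs a scheme and a host", baseURL)
	}
	return u.String() + "/api/traces/" + url.PathEscape(traceID), nil
}

// fetchTrace performs the GET request and returns the raw response body.
// Decoding the body is left to the caller.
func fetchTrace(baseURL, traceID string, timeout time.Duration) ([]byte, error) {
	target, err := traceByIDURL(baseURL, traceID)
	if err != nil {
		return nil, err
	}
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(target)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s from %s", resp.Status, target)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Only the URL is printed here, so the example runs without a Tempo instance.
	target, err := traceByIDURL("http://tempo.example:3200/", "2f3e0cee77ae5dc9c17ade3689eb2e54")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(target)
}
```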

@lucasponce
Contributor

And so we settled on the fact that at the moment we need to use multiple libraries to support multiple services. Is there a tentative timeline for supporting Tempo?

I'd say that this request would need to get traction from the community: ask more Istio users to use Tempo within ServiceMesh scenarios, brainstorm potential side effects, etc.

As mentioned, we had a single similar request to support Zipkin; we asked for more context, but there was very little response.

And as I understand it, I can try to use Tempo "as is", but I need to understand how to add it to Kiali ...

I'd be happy if you want to tackle this and investigate whether the existing implementation works with Tempo backends.

You can start configuring a Tempo backend (setting urls, security, etc):

https://github.com/kiali/kiali-operator/blob/master/deploy/kiali/kiali_cr.yaml#L618

And also taking a look into the backend code.

If the API compatibility with Jaeger is high perhaps things could move with support from the community that may want to use this backend.

@jpkrohling
Contributor

jpkrohling commented Aug 18, 2021

I believe Tempo is able to ingest data in Jaeger format, and from the client-side of things, the future is indeed around OpenTelemetry. But Kiali needs to get data out of the storage as well and makes queries directly to Jaeger, which would make it difficult to switch to something else.

@jshaughn jshaughn added the future Valuable but not ready/feasible for backlog. May require broader discussion. label Aug 20, 2021
@stevehipwell

I naively assumed that due to Tempo using the Zipkin protocol and Jaeger query engine that it would work correctly with Kiali; it works correctly with Istio.

Has anyone looked into this further? I can see the following two integration points potentially needing a small amount of work.

  • Query traces from Tempo
    • OpenTelemetry maybe, otherwise I think it's almost identical to Jaeger
  • Use Grafana as a tracing dashboard
    • This would work for any tracing source supported by Grafana

@lucasponce
Contributor

Hi @stevehipwell,

We haven't explored this issue yet. If you'd like to collaborate (for example by creating an example of how to connect Tempo with Istio and how Tempo can be queried), that would be a good start, and Kiali could gain some pluggability on this side.

@stevehipwell

@lucasponce could you point me at the current Jaeger integration points in the code for query and dashboard?

@kvrhdn

kvrhdn commented Nov 27, 2021

Hi, Grafana Tempo maintainer here 👋 It would be awesome if you could use Tempo as tracing backend for Kiali. I can help out with getting to know Tempo and its API.

So first of all: Grafana Tempo is able to ingest a myriad of formats including OpenTelemetry, Jaeger and Zipkin. Internally we convert every format into the OpenTelemetry format and use that to store traces on s3/gcs. We have a query API specific to Tempo which returns traces in the OpenTelemetry format as well. Tempo does not use the Jaeger engine.

From a quick glance through the existing code, I think there are two options: use Tempo as a drop-in replacement for Jaeger (easiest but not ideal), or create a new Kiali client specific to Tempo.
At this point there is no OpenTelemetry backend query API which could be a common API for Jaeger, Tempo, etc. (AFAIK at least). If this becomes stable I'm fairly sure Tempo would want to support it.

Tempo as Jaeger drop-in replacement

During the initial development of Tempo, we created the tempo-query component to expose a Jaeger API. tempo-query is a Jaeger storage plugin, so it accepts the full Jaeger query API and translates these requests into Tempo queries.
This should make it possible to use Tempo directly with Kiali right now.

Some important concerns/notes:

  • tempo-query is not really actively maintained and we will not be investing in it anymore
  • We support the full Jaeger search API including service name, operation name and tags but there are a couple of differing implementation details:
    • Tempo will only query recently ingested traces: depending on how much data you are ingesting, traces will be searchable for between 15m and 1h. We are currently working on adding full backend search.
    • Tag search does a contains query not a full match like Jaeger.
  • Searching for traces using tempo-query is inefficient: the Jaeger API returns full traces in their search results, Tempo opted to only return a summary of the trace. Because of this tempo-query has to do additional queries to fetch full traces, so if you search 100 traces tempo-query will do 1 search query + 100 trace lookup queries creating a considerable load on the ingesters.

For an example to run Tempo + tempo-query, see grafana/tempo - example/docker-compose/grafana7.4
The Jsonnet libraries and Helm chart already run tempo-query as a sidecar next to the query-frontend. It's available on port 16686.

Adapt Kiali to query Tempo directly

The better solution (imo) would be to add a client specific to Tempo. Looking at the code this would require a couple of changes:

The current client uses the Jaeger data model (for instance JaegerSingleTrace) but Tempo returns traces in the OpenTelemetry format (JSON or Protobuf). The proto definition: opentelemetry-proto - opentelemetry/proto/trace/v1/trace.proto.
Luckily there is already a lot of code out there to convert between OpenTelemetry and Jaeger or Zipkin. So you can convert OpenTelemetry into Jaeger in the client, or opt to switch Kiali to the OpenTelemetry format altogether and do the conversion in the Jaeger client.

The search results of Jaeger and Tempo are different: while Jaeger returns a list of full traces, Tempo only returns metadata of the traces. See https://grafana.com/docs/tempo/latest/api_docs/#search
This might be a problem if you need the full trace data for visualisations like the workload detail: https://kiali.io/docs/features/tracing/#workload-detail

@lucasponce
Contributor

Hi @stevehipwell and @kvrhdn,

Thanks for jumping into this. I guess if it gets traction, the best approach could be to create a foundation project (perhaps under the OpenTelemetry umbrella?) to have a "generic" client that offers an abstraction layer for basic tracing query capabilities.

In that case, we could remove the direct dependency in Kiali and use this library, and with that layer, more backends could be added with minimal changes in the project.
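
One possible shape for such a layer, sketched with invented names: a small `TraceQuerier` interface plus a registry that picks a backend by provider name. Nothing here exists in Kiali or OpenTelemetry; `Register`, `NewQuerier`, `Providers` and `stubBackend` are hypothetical, and the stub returns no data.

```go
package main

import (
	"fmt"
	"sort"
)

// TraceSummary is a backend-neutral search result (hypothetical).
type TraceSummary struct {
	TraceID     string
	RootService string
}

// TraceQuerier is a hypothetical minimal query abstraction that each
// tracing backend would implement.
type TraceQuerier interface {
	Name() string
	FindTraces(service string, limit int) ([]TraceSummary, error)
}

type factory func(endpoint string) TraceQuerier

var registry = map[string]factory{}

// Register makes a backend available under a provider name.
func Register(name string, f factory) { registry[name] = f }

// Providers lists the registered provider names in sorted order.
func Providers() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// NewQuerier builds the backend selected by configuration.
func NewQuerier(provider, endpoint string) (TraceQuerier, error) {
	f, ok := registry[provider]
	if !ok {
		return nil, fmt.Errorf("unknown tracing provider %q (known: %v)", provider, Providers())
	}
	return f(endpoint), nil
}

// stubBackend stands in for real Jaeger and Tempo clients.
type stubBackend struct {
	name     string
	endpoint string
}

func (b stubBackend) Name() string { return b.name }

func (b stubBackend) FindTraces(service string, limit int) ([]TraceSummary, error) {
	return nil, nil // a real backend would query b.endpoint here
}

func init() {
	Register("jaeger", func(endpoint string) TraceQuerier {
		return stubBackend{name: "jaeger", endpoint: endpoint}
	})
	Register("tempo", func(endpoint string) TraceQuerier {
		return stubBackend{name: "tempo", endpoint: endpoint}
	})
}

func main() {
	q, err := NewQuerier("tempo", "http://tempo.example:3200")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Prints: tempo [jaeger tempo]
	fmt.Println(q.Name(), Providers())
}
```

With this shape, adding a backend means registering one more factory; the calling code only depends on the interface.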

That was something we thought about, but we haven't started.

I think the OpenTelemetry group is strong on the API to ingest traces from apps, but there is no "formal" API to query backends; perhaps this could be a good moment to push for one.

I'm happy to help here as well.

@stevehipwell

@lucasponce would it be worth Kiali reaching out to the OpenTelemetry project to get a steer on what they think the query API solution should look like? I think that alongside Grafana, Kiali would be one of the big winners of an OpenTelemetry query API allowing interchangeable backends.

In the meantime I don't think you'd go far wrong with creating a Tempo client for Kiali; I suspect the OpenTelemetry query implementation might look more like this than Jaeger's.

Adding support for Grafana as the tracing dashboard should be able to be completed independently of actual backends as it supports both Jaeger and Tempo (amongst others).

@kvrhdn

kvrhdn commented Nov 29, 2021

Oh, it seems I was mistaken about an OpenTelemetry query API: Jaeger added an OpenTelemetry-compatible endpoint a while back (jaegertracing/jaeger-idl#76), this endpoint simply returns traces from Jaeger in the OpenTelemetry format. I must have misinterpreted this as something more generic than it actually is.
Even if there is no common API, I think it's still worth supporting the OpenTelemetry format (OTLP) eventually as this will make it easier to support other backends later (e.g. Zipkin), but note this isn't a necessity to add initial support for Tempo.

@lucasponce
Contributor

Kiali reaching out to the OpenTelemetry project to get a steer on what they think the query API solution should look like? I think that alongside Grafana, Kiali would be one of the big winners of an OpenTelemetry query API allowing interchangeable backends.

It was on the agenda; thanks to this thread we can add more references and potentially push for it.

I'll create a request in the OpenTelemetry group to see how this can move, also the way to query the jaeger backend is not yet perfect and a solid abstraction layer here would really help.

I'll put it in my todo list for next week, I'll update this thread as well.

@lucasponce
Contributor

@Kampe

Kampe commented Apr 1, 2022

There any quickish solution to this for Kiali and Tempo users while community SIGs form around this?

@lucasponce
Contributor

There any quickish solution to this for Kiali and Tempo users while community SIGs form around this?

Hi @Kampe, sorry for the delay. It seems that the proposal open-telemetry/oteps#193 has some traction and interest, so I think I can work on the PoC I have and make it work for Tempo/Jaeger.

I'll ping you when there is more progress.

@imranismail

Any workaround to this issue?

@jmazzitelli
Collaborator

Any workaround to this issue?

No. This has stalled in the community. I see no progress in this OpenTelemetry issue, and this Tempo issue has been closed, and Kiali has had no community contributions in this area.

@stevehipwell

Does the announcement of TraceQL offer an alternative solution to getting Kiali support for Tempo? I'd love an OpenTelemetry native solution but the real driver here is to be able to use S3 backing storage for our trace data which means Tempo support in Kiali is the real priority.

@aljesusg aljesusg self-assigned this Jan 23, 2023
@jshaughn jshaughn added backlog Triaged Issue added to backlog and removed future Valuable but not ready/feasible for backlog. May require broader discussion. labels Jan 23, 2023
@stevehipwell

@lucasponce with TraceQL now GA has this been considered as an integration point to allow Kiali to be used with Grafana Tempo?

@jshaughn
Collaborator

@stevehipwell @aljesusg is now leading our Tempo effort. Alberto, do you have a comment?

@aljesusg
Collaborator

Hi @stevehipwell, thanks for your interest.

Our idea is to work on the integration with Tempo asap. We were waiting for the operator for that, but we can start to define the steps.

Would you be interested in contributing to this?

Thanks.

@stevehipwell

We were waiting for the operator for that

@aljesusg what exactly are you waiting for?

I'd be happy to contribute to this how and where I can, although I'm not sure how much time I will have available.

@jshaughn jshaughn added the waiting external It requires additional info to progress. For example, it can require a fix in other project. label Feb 15, 2023
@aljesusg
Collaborator

Closing in favor of #5850

@jshaughn jshaughn removed the waiting external It requires additional info to progress. For example, it can require a fix in other project. label Feb 21, 2023
@Hronom

Hronom commented Jul 4, 2023

After reading this thread and seeing how much energy people spend on chatting, I hope that in the end we will be able to drop Kiali altogether.
I think for maintainers (who know the code base) it should not be hard to implement simple support for Tempo.

I hope Grafana introduces dashboards and other features so we can have the same functionality in Grafana as we have in Kiali. Primarily I'm interested in the graph of services.
[image]

BTW I don't know why everyone here is pushing for OpenTelemetry. OpenTelemetry is primarily about standardizing how data is pushed and pre-processed; it does not provide APIs to get the data back from different back-ends. So pushing in this direction is the wrong approach. Given this focus, I'm not sure the OpenTelemetry project will do anything in the near term; they have a lot of other things to do.

@Hronom Hronom mentioned this issue Jul 4, 2023
7 tasks
@aljesusg
Collaborator

aljesusg commented Jul 4, 2023

Hi @Hronom

You can follow these instructions to use tempo in kiali

#5848 (comment)

@Hronom

Hronom commented Jul 5, 2023

@aljesusg hello, unfortunately it doesn't work.

I expected clear guidelines on which values to set in the tempo-distributed Helm chart and which values to set in the Kiali CR to make it work.

@Hronom

Hronom commented Jul 6, 2023

Finally managed to get it working; here is a temporary workaround until Tempo support is added directly to Kiali

@iblancasa

@aljesusg hello, unfortunately it doesn't work.

It works, but as explained in this comment, it follows the official documentation to deploy Tempo, not the Helm chart or the operator.

Maybe, what can be done, is to add a section for each one explaining the needed steps.

@jmazzitelli
Collaborator

Maybe, what can be done, is to add a section for each one explaining the needed steps.

Folks that have access to AWS and use Tempo/managed Prometheus, please feel free to submit PRs to the kiali.io documentation with information that you know works. The Kiali team typically doesn't have the time, resources, or expertise like our community has when it comes to specialized situations like this.

We would gladly accept (and have in the past) special documentation for different cluster vendors and scenarios (here is an example where most of the docs there were community-contributed).

The kiali.io project where PRs can be submitted is here: https://github.com/kiali/kiali.io
