Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tempo: TraceQL query response streaming #69212

Merged
merged 48 commits into from
Jul 14, 2023

Conversation

adrapereira
Copy link
Contributor

@adrapereira adrapereira commented May 29, 2023

What is this feature?

This PR adds support in Grafana for the recently added gRPC endpoint that streams the response of a TraceQL query from Tempo. grafana/tempo#2366

To use this feature, expand the Options section under the TraceQL editor and enable the "Stream response" toggle. Grafana will start streaming the response and displaying the results in the table as they are available from Tempo.

Screen.Recording.2023-05-30.at.19.15.59.mov

This feature relies on Grafana Live to communicate between the client and Grafana's backend. A new web socket channel is created for each query with the following format ds/${tempo-ds-id}/search/${query-id}, where query-id is a UUID generated for each query execution, which guarantees a new, clean channel for each query.

In the backend, Grafana uses the query parameter to start a gRPC connection to Tempo and listen to messages. Each message received is transformed into a data frame and sent to the client through the web socket.
The response data frame contains 4 fields, with the following format:

  • traces - json enconded list of traces, as received from Tempo
  • metrics - json encoded struct of the query metrics, as received from Tempo
  • state - the state of the connection, can be Pending, Streaming, Done or Error
  • error - the error message, when applicable

TODO

  • Display the query progress using the data from the metrics field

Why do we need this feature?

Users can benefit from query streaming in cases where the query takes a long time to complete.

Who is this feature for?

Users of the Tempo data source.

Which issue(s) does this PR fix?:

Fixes #67591

Special notes for your reviewer:

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

Added traceId query type that is set when performing the request but doesn't map to a tab.
Return traces, metrics, state and error in a dataframe.
Shared state type between FE and BE.
Use getStream() instead of getQueryData()
…e_streaming

# Conflicts:
#	go.mod
#	go.sum
#	pkg/services/ngalert/sender/notifier.go
#	pkg/tsdb/tempo/tempo.go
#	public/app/plugins/datasource/tempo/dataquery.cue
@knylander-grafana
Copy link
Contributor

@adrapereira Great work! I see you've added the feature toggle for streaming. Should we add doc for this feature? Is this feature considered early access/preview?

@adrapereira adrapereira enabled auto-merge (squash) July 13, 2023 14:59
Copy link
Contributor

@joey-grafana joey-grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • From conversations in standup, sounds like we're ok with manually passing the client options for now, and will look into perhaps using connect-go at a later stage. If that's the case, I'm not sure how to test that that works correctly right now?
  • On a similar note, we've updated a good few mods. Is our base test for this that the build passes or what other way has this been tested?

Not entirely sure what happened but had to restart docker and go between my local tempo devenv and also the mlt devenv a few times to see any data with the steaming toggle on (basically no query was run even after refreshing browser a few times). With the toggle off I was receiving data. Not sure why however appears to be working fine now. Prob my devenv but worth the mention.

Screenshot 2023-07-13 at 16 39 37
Screenshot 2023-07-13 at 16 39 57

pkg/tsdb/tempo/protospan_translation.go Show resolved Hide resolved
}

var dialOps []grpc.DialOption
if settings.BasicAuthEnabled {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we ok to merge this? I know you looked into using connect-go but seems like it requires an update on the backend which may not be feasible atm.

…e_streaming

# Conflicts:
#	docs/sources/setup-grafana/configure-grafana/feature-toggles/index.md
#	go.mod
#	go.sum
#	packages/grafana-data/src/types/featureToggles.gen.ts
#	pkg/services/featuremgmt/registry.go
#	pkg/services/featuremgmt/toggles_gen.csv
#	pkg/services/featuremgmt/toggles_gen.go
@adrapereira adrapereira requested a review from a team as a code owner July 14, 2023 10:42
@github-actions
Copy link
Contributor

Backend code coverage report for PR #69212
No changes

@github-actions
Copy link
Contributor

github-actions bot commented Jul 14, 2023

Frontend code coverage report for PR #69212

Plugin Main PR Difference
explore 86.71% 86.71% 0%

Copy link
Contributor

@joey-grafana joey-grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work, this is a fantastic addition 👍

IIUC it's enabled for everyone in this PR so we can get a good round of internal testing but the idea is that the toggle will either be removed or at least used in the code so that people can turn off the feature if needs be (atm the toggle is not used).

@adrapereira adrapereira merged commit c1709c9 into main Jul 14, 2023
20 checks passed
@adrapereira adrapereira deleted the andre/traceql_response_streaming branch July 14, 2023 14:10
@ricky-undeadcoders ricky-undeadcoders modified the milestones: 10.1.x, 10.1.0 Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[TraceQL] Support Tempo streaming endpoint
7 participants