[TraceQL] Add Support for Selecting Fields in TraceQL #2175

hanksterhan · 2023-03-06T22:19:33Z

Is your feature request related to a problem? Please describe.

The current Search API endpoint returns 5 useful fields for each trace: traceID, rootServiceName, rootTraceName, startTimeUnixNano, and durationMs. When doing analysis on large amounts of traces, it can be very useful to get an additional field or two per trace, instead of parsing it out of the full trace via the Query Trace API. For example, one of my current use cases involves logging both the rootServiceName and the span.http.target fields.

Describe the solution you'd like

To modify the Search API endpoint to allow extra fields to be returned alongside the 5 fields it returns right now. This could be configured via a select statement in TraceQL or via a tag in tag-based search.

Describe alternatives you've considered

I can get the Tempo UI to show this information by including a TraceQL clause like span.http.target != "" to see the field value for span.http.target when I drill into a trace, but this does not work via API.
Additional context

The text was updated successfully, but these errors were encountered:

joe-elliott · 2023-03-07T16:53:34Z

Thank you for filing this issue! We've discussed the concept of select some internally. I'd like to add it to the language, but I'd like some community input on designing this new feature in traceql.

The current Search API endpoint returns 5 useful fields for each trace: traceID, rootServiceName, rootTraceName, startTimeUnixNano, and durationMs. When doing analysis on large amounts of traces, it can be very useful to get an additional field or two per trace, instead of parsing it out of the full trace via the Query Trace API. For example, one of my current use cases involves logging both the rootServiceName and the span.http.target fields.

I feel like you're conflating trace level fields with span level fields. If we add a select operation it will target span level fields.

To modify the Search API endpoint to allow extra fields to be returned alongside the 5 fields it returns right now. This could be configured via a select statement in TraceQL or via a tag in tag-based search.

I'm a proponent of adding it into the language itself. Perhaps something like:

{ span.http.status_code >= 200 && span.http.status_code < 300 } | select(resource.namespace, resource.cluster, ...)

I can get the Tempo UI to show this information by including a TraceQL clause like span.http.target != "" to see the field value for span.http.target when I drill into a trace, but this does not work via API.

Can you double check the values returned? The API should be returning the value of span.http.target for each span returned. For anyone else reading. There is currently a "trick" that can be used to force traceql to select that @hanksterhan is referring to. Something like:

{ span.http.status_code >= 200 && span.http.status_code < 300 && span.http.target != ""}

will "trick" the engine into returning values for span.http.target while asserting the condition.

hanksterhan · 2023-03-07T17:18:21Z

@joe-elliott Thank you for the articulation, the language you use is easier to understand the target use cases.

I feel like you're conflating trace level fields with span level fields.

You are absolutely right. span.http.target is a span level field, I could have chosen a better example.

I'm a proponent of adding it into the language itself.

I agree, adding the select statements directly into the language makes sense.

Can you double check the values returned? The API should be returning the value of span.http.target for each span returned.

After reading over the Tempo Documentation again, I am realizing that I am unable to replicate the "Example of a TraceQL search" example with q={ status=error } to get span level information. This is probably why the "trick" has not been working for me from the API. Does this have to do with how my cluster is configured?

joe-elliott · 2023-03-07T20:28:35Z

This is working for me locally. I believe this would work on a Tempo 2.0 cluster started with defaults.

$ curl --get \
     --data-urlencode 'q={ status=error && resource.service.name!="" }' \
     --data-urlencode 'start=1678130192' --data-urlencode 'end=1678133792' \
     'http://localhost:3100/api/search' | jq

{
  "traces": [
    {
      "traceID": "ad087b829763c14f109f5ff943d301c7",
      "rootServiceName": "foo-service",
      "rootTraceName": "/",
      "startTimeUnixNano": "1678133787842999577",
      "durationMs": 210,
      "spanSet": {
        "spans": [
          {
            "spanID": "98f009712e8e8af1",
            "startTimeUnixNano": "1678133787843142548",
            "durationNanos": "210071468",
            "attributes": [
              {
                "key": "status",
                "value": {
                  "stringValue": "error"
                }
              },
              {
                "key": "service.name",
                "value": {
                  "stringValue": "foo-service"
                }
              }
            ]
          }
        ],
        "matched": 1
      }
    },
...

lanapouney · 2023-03-29T15:03:51Z

I am also interested in this feature being added as I have a use case where I will have to use the work around (I am so glad I found this thread, this would have been a very difficult issue to solve otherwise!). I just wanted to add that in my case I would want the value to be shown as a top (trace) level and also span level attribute, although I understand that it may be hard to please everyone!

joe-elliott · 2023-03-29T15:57:11Z

I'm actually going to start work on this "next". Looking to have something in the next couple months.

I just wanted to add that in my case I would want the value to be shown as a top (trace)

How do you see this working? If I do a query like:

{ <stuff> } | select(span.http.status.code)

the values of http.status.code would likely vary from span to span and we would want them to be displayed at the span level.

Also, any thoughts/comments on the proposed syntax?

lanapouney · 2023-03-29T17:13:07Z

Great stuff!
In this case I would expect "http.status.code" to be added to the list of attributes returned by tempo for a trace and/or span. My only thought here is that I only want to see the top level data (parent traces returned without any spanSet information attached). In my case I am looking to filter by application name, which also varies span to span but I am only interested in traces originating in the application stated, so while not really important does create extra processing power for tempo for data I will not need. Would it be possible to have trace.http.status.code for parent span attribute returns only and span.http.status.code for child/all span attribute returns? My guess is probably not due to the varying formats that different exporters are sending data.

Otherwise the syntax looks good to me, would you use && connectors for multiple attribute selects? for example
{ <stuff> } | select(span.http.status.code && span.http.target)

joe-elliott · 2023-03-31T14:41:38Z

In my case I am looking to filter by application name, which also varies span to span but I am only interested in traces originating in the application stated

Adding a "trace scope" is on our list of things to do. Then you'd be able to write something like:

{ traceRootServiceName = "a" }

Otherwise the syntax looks good to me, would you use && connectors for multiple attribute selects? for example
{ } | select(span.http.status.code && span.http.target)

We would likely use commas since it's a list of attributes:

{ <stuff> } | select(span.http.status.code, span.http.target)

Would it be possible to have trace.http.status.code ...

I have thought about ways to select the root span but was thinking more along the lines of:

{ span.http.status.code = ?? && parent = nil }

Adding root span attributes to a "trace" scope is compelling.

adirmatzkin · 2023-05-18T12:27:00Z

Would it be possible to have trace.http.status.code ...

I have thought about ways to select the root span but was thinking more along the lines of:
{ span.http.status.code = ?? && parent = nil }
Adding root span attributes to a "trace" scope is compelling.

Related to the discussion on #1989, and continuing @mdisibio comment: I think that adding the root span attributes to the trace scope is kind of weird...
IMO, the quoted query is pretty complicated and could be much simpler following @mdisibio's idea about the "root" scope.
It could be something like { root.http.status.code=??} . Thats it.

It's a little bit out of the scope (get it? 😉) of this issue though... so maybe we shouldn't get into this conversation over here 🙃

Anyway, +1 from me about the select(x,y,z) syntax 🔥

joe-elliott added traceql enhancement New feature or request labels Mar 9, 2023

joe-elliott mentioned this issue Mar 20, 2023

[Grafana Trace View - search results] Misleading column titled Name when containing ${.service.name} ${spanName} grafana/grafana#64840

Closed

joe-elliott self-assigned this May 16, 2023

joe-elliott mentioned this issue May 22, 2023

TraceQL: Selection #2494

Merged

3 tasks

joe-elliott closed this as completed in #2494 May 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TraceQL] Add Support for Selecting Fields in TraceQL #2175

[TraceQL] Add Support for Selecting Fields in TraceQL #2175

hanksterhan commented Mar 6, 2023

joe-elliott commented Mar 7, 2023

hanksterhan commented Mar 7, 2023 •

edited

joe-elliott commented Mar 7, 2023 •

edited

lanapouney commented Mar 29, 2023

joe-elliott commented Mar 29, 2023

lanapouney commented Mar 29, 2023

joe-elliott commented Mar 31, 2023

adirmatzkin commented May 18, 2023

[TraceQL] Add Support for Selecting Fields in TraceQL #2175

[TraceQL] Add Support for Selecting Fields in TraceQL #2175

Comments

hanksterhan commented Mar 6, 2023

joe-elliott commented Mar 7, 2023

hanksterhan commented Mar 7, 2023 • edited

joe-elliott commented Mar 7, 2023 • edited

lanapouney commented Mar 29, 2023

joe-elliott commented Mar 29, 2023

lanapouney commented Mar 29, 2023

joe-elliott commented Mar 31, 2023

adirmatzkin commented May 18, 2023

hanksterhan commented Mar 7, 2023 •

edited

joe-elliott commented Mar 7, 2023 •

edited