Using trace duration in traceQL #2311

amitsetty · 2023-04-07T16:41:30Z

As I understand it, I can use duration in TraceQL to query on span durations. That doesn't serve my use case: I want to find traces that take a long time, since a slow trace tells me much more than a slow span. Is this planned, or is it something I can contribute to?

joe-elliott · 2023-04-10T17:41:37Z

We are discussing adding a trace scope here:

#1989

One of the primary reasons is that it's much more efficient to search on trace duration than on span duration. For me, the biggest question is how to introduce it into the language. The change is not particularly complex, but it would touch a lot of different pieces (parser, engine, and fetch layer).

Going to add a comment in that other issue to get some discussion going.

mdisibio · 2023-04-17T17:20:48Z

A common convention is to have a root span covering the whole execution. Its span duration is the same as the trace. Would that help you in the meantime?

amitsetty · 2023-04-17T18:40:04Z

Hey,
TL;DR: it's not a matter of convention. No single actor is involved in the whole flow through the system, so there is no root span that covers the whole trace duration.

Your suggestion would work well in a case where there was a root span.

The issue is that my systems communicate through queues, file systems, and other offline asynchronous methods.
The first actor performs an action and then puts a message on a queue (while being instrumented). The second actor picks the message up from the queue and acts on it (also instrumented). What I really want to measure is the duration from the beginning of the first span to the end of the last one. Since no actor is involved in that whole process, span duration and span metrics don't let me look at my data with a service-centric view. Being able to get the trace duration would let us use TraceQL to find problematic traces and, once trace aggregations exist, to figure out whether a service is running correctly. Trace metrics would also help solve this problem, but they open up a lot of unknowns.
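
To make the gap concrete, here is a minimal Python sketch of the measurement described above. The `Span` class, the span names, and the millisecond timestamps are all made up for illustration; this is not how Tempo stores or computes anything. It only shows that, without a root span, the longest span can be far shorter than the time from the first span's start to the last span's end.

```python
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical span record; field names are illustrative only.
    name: str
    start_ms: int
    end_ms: int


def trace_duration_ms(spans):
    """Duration from the earliest span start to the latest span end."""
    return max(s.end_ms for s in spans) - min(s.start_ms for s in spans)


# Two instrumented actors connected by a queue, with no root span over both.
spans = [
    Span("producer: enqueue message", 0, 120),
    Span("consumer: process message", 5000, 5300),
]

longest_span_ms = max(s.end_ms - s.start_ms for s in spans)

print(trace_duration_ms(spans))  # 5300
print(longest_span_ms)           # 300
```

A filter on span duration would only ever see the 300 ms consumer span here, while the end-to-end time, including the wait in the queue, is 5300 ms.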

amitsetty · 2023-05-13T12:03:16Z

Hey, any plans or updates around this?

joe-elliott · 2023-05-15T12:20:06Z

I think the main blocker is deciding on a syntax. We've been very focused on performance the last month or so and haven't pushed on this. Discussion here:

#1989

mdisibio · 2023-05-25T18:47:26Z

This was merged in #2503. The trace duration can be referenced like { traceDuration > 10s } and combined with any other conditions.

zalegrala · 2023-05-30T14:47:28Z

I believe this is resolved with #2503. Reopen with a comment if not.

joe-elliott added the traceql label May 15, 2023

zalegrala closed this as completed May 30, 2023
