Compactors to distinguish 'importance' of traces by policy #1427

Open
jvilhuber opened this issue May 6, 2022 · 8 comments
Labels
keepalive Label to exempt Issues / PRs from stale workflow

Comments

@jvilhuber

jvilhuber commented May 6, 2022

Is your feature request related to a problem? Please describe.
I'm adding tail-based sampling (via otel-collector) to my tracing solution, and while that helps reduce the number of traces I have in the backend, I'd love a little more flexibility in how long some traces are kept versus others.

"Normal" traces (by some definition) are interesting at first but are rarely looked at after a short while (retention = 1 week?), whereas "odd" traces (errors, slow requests, http.status_code=500, etc.) might be interesting for a longer period (retention = 1 month?).

Describe the solution you'd like
Looking at the tail-sampling policies, I wondered whether it would make sense to have a similar (or perhaps greatly simplified) set of policies whereby error and slow traces (i.e. anything matching some policy definition) are kept for a longer period than "normal" traces (non-error and fast).
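Purely as illustration, something along these lines (none of these option names exist in Tempo today; they just mirror the shape of the tail-sampling policies):

# hypothetical config, option names invented for illustration only
retention_policies:
  - name: errors-and-slow
    retention: 720h                # keep "interesting" traces for ~1 month
    policies:
      - type: status_code
        status_codes: [ERROR]
      - type: latency
        threshold_ms: 5000
  - name: default
    retention: 168h                # everything else for ~1 week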

Describe alternatives you've considered
Using tail-based sampling we already greatly reduce the number of "normal" traces. I see no other way to selectively change the retention period of traces in the backend.

Additional context
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor
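For reference, the tail-sampling setup I'm describing looks roughly like this (policy names, thresholds, and the decision wait are just example values):

processors:
  tail_sampling:
    decision_wait: 30s             # how long to buffer spans before deciding
    policies:
      - name: errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow
        type: latency
        latency:
          threshold_ms: 5000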

@annanay25
Contributor

This is a cool feature 👍

Is the primary intention to store "interesting" traces for a longer period or to reduce storage costs by downsampling in the compactor?

From some internal metrics, storage is only about 30% of our operational cost, so if we are expecting to reduce operational cost by sampling down in the compactor, I'm not sure how effective that would be.

OTOH, if the intention is to have similar functionality to "archival of traces" as in Jaeger, then I am 100% on board.

Additionally, the backend format is in some flux at the moment (we are migrating to full Parquet blocks), and it might be worth waiting until that settles before implementing this.

@jvilhuber
Author

I think for the purposes of this ticket, I'd say it's to reduce storage costs and speed up searching by removing the less interesting traces sooner. The notion of 'downsampling' into metrics is a separate (but interesting) discussion, I suspect.

But you have a point about the new storage format: definitely something to keep on the back-burner until after that switchover (once we see how it shakes out).

@jvilhuber
Author

jvilhuber commented May 10, 2022

Another thought: I'm not really happy with tail-sampling in general. It has less to do with any particular implementation (say in otel-collector or grafana-agent), and more to do with the place it's done: the middle.

Consider: any collector-type thing (otel-collector and grafana-agent, for example) needs to wait some time for 'all' spans to come in before making a decision. If I make that "decision time" really long, I use up more memory on the collector. What's a good time? Any value I pick will inevitably be shorter than SOME traces that are just really long! And those are precisely the ones I need to look at to determine "what the heck took this request so long".

I was just looking at some traces where the interesting part is simply missing! After 30s (or whatever), otel-collector decided enough is enough, declared my "slow requests" policy matched, sampled the request, and sent it on. However, the spans that were the actual culprits took LONGER than the 30s and came in later, and thus weren't part of the trace in Tempo.

Ingesters and compactors already deal with the heavy lifting of waiting for spans, organizing them into traces, adding late spans to trace blocks, etc. And they even do this on disk instead of just in memory, allowing for more flexibility in durations and storage capacity. Duplicating most of that in some collector in the middle seems bug-prone at best.

In the backend I could configure a MUCH longer time-frame to wait for a decision (minutes, days even!): ingesters can keep adding late spans to the trace, and compactors can later go through and delete traces that are "complete" but not "interesting" (and past some expiration time).

Sorry for the rambling.

EDIT: maybe my tail-based sampling with trace-id-load-balancing is misconfigured. That's possible, too.
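For context, the trace-ID-aware load balancing I mentioned is a first-tier collector fanning out to the tail-sampling collectors, roughly like the sketch below (assuming the loadbalancing exporter from opentelemetry-collector-contrib; the hostname is a placeholder). It routes on trace ID by default, so every span of a trace should land on the same downstream collector; if that isn't working, spans of the same trace get sampled in isolation.

receivers:
  otlp:
    protocols:
      grpc:
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: otel-collector-headless.example.svc.cluster.local   # placeholder
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]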

@joe-elliott
Member

These are some good points, and they're spot on. I agree that tail-based sampling in the OTel Collector/Grafana Agent has some challenges. There are certainly use cases where it works quite well, but configurations can be brittle and it does delay moving the trace to the backend (as you have seen).

I, personally, love the originally requested feature. It's something we've talked about implementing a few times, but have always had higher priorities. Keeping traces in object storage is generally cheap and so we've often been focused on other elements of Tempo.

With new Parquet blocks and TraceQL coming up we will have our hands full for a bit, but I'd like to revisit this once those features are in place and our backend has settled. Thanks!

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity.
Please apply keepalive label to exempt this Issue.

@github-actions github-actions bot added the stale Used for stale issues / PRs label Nov 15, 2022
@joe-elliott joe-elliott added keepalive Label to exempt Issues / PRs from stale workflow and removed stale Used for stale issues / PRs labels Nov 16, 2022
@ghengao

ghengao commented Apr 11, 2023

One use case I have in mind for a feature like this is distinguishing normal vs. abnormal traces (what counts as abnormal could be left up to the agent to decide or the user to configure), with the goal of tracing back a problem that may be beyond the retention period. For example, an end user of my application may report a problem that happened 14 days ago, but my traces are only kept for 7 days. This feature could enable the maintainer to find out exactly what happened and correlate it with other metrics/trace data. One could argue that we should just investigate every error when it happens, by generating an alert that is stored somewhere else. That may be the way to go if this is not available in Tempo, but it would make things a lot easier if it already existed in Tempo.

@mdisibio
Contributor

I think TraceQL is a great fit here, since it lets you express any arbitrary condition. Would love to see something like:

retention:
  - query: { error = true || span.http.status_code >= 500 }
    retention: 30d
  - query: {}         # default case matches everything else
    retention: 7d

@rlex

rlex commented Apr 21, 2023

Yeah, Loki-style retention streams would be great here.
In Loki I use them to keep debug logs for a day, while everything else is stored for a longer amount of time.
The default case is probably not needed, since it would just use the default retention setting in Tempo?

It would be great to keep errors for several months (rarely, but sometimes needed) while keeping "hot" data for a shorter amount of time.
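For reference, the Loki setup I'm describing uses per-stream retention overrides roughly like this (the selector and periods are just examples, and retention has to be enabled on the compactor):

compactor:
  retention_enabled: true
limits_config:
  retention_period: 744h            # default retention for everything else (~31 days)
  retention_stream:
    - selector: '{level="debug"}'   # keep debug logs for just a day
      priority: 1
      period: 24h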


6 participants