[chore][processor/tailsampling] Add help information for resolving trace dropped errors (#29513)

**Description:**
A [recent
issue](#29024)
was opened by a user who was seeing high values for the
`otelcol_processor_tail_sampling_sampling_trace_dropped_too_early`
metric. The solution was to increase the number of traces held in
memory, using the `num_traces` configuration option, so that the
processor could handle the load. Neither the documentation nor the
metric's description made this clear.
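
As an illustration of the fix described above, a collector configuration that raises `num_traces` might look like the sketch below. The value `100000` and the policy named `keep-errors` are assumptions chosen for illustration, not values taken from the issue; a suitable number depends on trace volume and available memory.

```yaml
processors:
  tail_sampling:
    decision_wait: 30s      # default shown in the README
    num_traces: 100000      # assumed value, raised from the default of 50000
    policies:
      - name: keep-errors   # hypothetical policy name
        type: status_code
        status_code:
          status_codes: [ERROR]
```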

I'm not sure where the best place is to communicate the solution to
this error. The two options I see are to add information to the
README, or to add more information to the metric description itself.
Any guidance on which would be clearest to users is appreciated.

---------

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
crobert-1 and jpkrohling committed Dec 5, 2023
1 parent d826261 commit d189c00
Showing 2 changed files with 10 additions and 2 deletions.
10 changes: 9 additions & 1 deletion processor/tailsamplingprocessor/README.md
@@ -48,7 +48,7 @@ Multiple policies exist today and it is straight forward to add more. These incl

The following configuration options can also be modified:
- `decision_wait` (default = 30s): Wait time since the first span of a trace before making a sampling decision
-- `num_traces` (default = 50000): Number of traces kept in memory
+- `num_traces` (default = 50000): Number of traces kept in memory.
- `expected_new_traces_per_sec` (default = 0): Expected number of new traces (helps in allocating data structures)

Each policy will result in a decision, and the processor will evaluate them to make a final decision:
@@ -427,3 +427,11 @@ As a rule of thumb, if you want to add probabilistic sampling and...

[probabilistic_sampling_processor]: ../probabilisticsamplerprocessor
[loadbalancing_exporter]: ../../exporter/loadbalancingexporter
+
+## FAQ
+
+**Q. Why am I seeing high values for the error metric `sampling_trace_dropped_too_early`?**
+
+**A.** This is likely a load issue. If the collector is processing more traces in-memory than the `num_traces` configuration
+option allows, some will have to be dropped before they can be sampled. Increasing the value of `num_traces` can
+help resolve this error, at the expense of increased memory usage.
2 changes: 1 addition & 1 deletion processor/tailsamplingprocessor/metrics.go
@@ -30,7 +30,7 @@ var (
statCountTracesSampled = stats.Int64("count_traces_sampled", "Count of traces that were sampled or not per sampling policy", stats.UnitDimensionless)
statCountGlobalTracesSampled = stats.Int64("global_count_traces_sampled", "Global count of traces that were sampled or not by at least one policy", stats.UnitDimensionless)

-statDroppedTooEarlyCount = stats.Int64("sampling_trace_dropped_too_early", "Count of traces that needed to be dropped the configured wait time", stats.UnitDimensionless)
+statDroppedTooEarlyCount = stats.Int64("sampling_trace_dropped_too_early", "Count of traces that needed to be dropped before the configured wait time", stats.UnitDimensionless)
statNewTraceIDReceivedCount = stats.Int64("new_trace_id_received", "Counts the arrival of new traces", stats.UnitDimensionless)
statTracesOnMemoryGauge = stats.Int64("sampling_traces_on_memory", "Tracks the number of traces current on memory", stats.UnitDimensionless)
)