Use USER_KEEP/USER_REJECT for RuleSampler decisions #1769

marcotc · 2021-11-11T00:07:09Z

This PR changes how the sampling decisions yielded by the Datadog::Sampling::RuleSampler affect the span's _sampling_priority_v1 tag.

Before, 0 (for rejecting) and 1 (for keeping) were used for RuleSampler decisions. The issue with this approach is that 0 and 1 are default sampling values that allow the sampling decision to be altered downstream, either by the Agent or by the Datadog backend.
0 and 1 communicate intent, not a strong decision. This works well when the sampling rate is not strictly important: for example, it is normally desirable to keep an error span, even if it's marked with a sampling decision of 0, as that span will have valuable debugging information.

The issue with 0 and 1 is that they don't allow for strict control of sampling rates. When limiting the sampling rate is very important (e.g. network traffic constraints, strictly limiting billable costs), such strict sampling was not possible with the RuleSampler alone; custom code snippets were required (e.g. Datadog::ForcedTracing.drop(span)).

After the changes in this PR, decisions that originate from any user configuration of the RuleSampler use priority values -1 (for forced rejection) and 2 (for forced keeping). This means that RuleSampler sampling configurations will always be strictly respected by the whole Datadog processing pipeline.

It is possible for the RuleSampler to analyze a span and not make any user-configured decision (e.g. when no rules match, or no sampling rate applies). In this case, the priority is not altered.
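
As a rough illustration of the behavior described above, the following self-contained sketch mirrors the four priority values and shows a rule decision being mapped onto the forced values. `Priority`, `SketchSpan`, `SketchRule`, and `apply_rules!` are hypothetical names made up for this example and are not the ddtrace API; only the numeric values (-1, 0, 1, 2) and their meaning come from the description.

```ruby
# Hypothetical stand-in for the priority constants; values as described in this PR.
module Priority
  USER_REJECT = -1 # forced rejection: must be respected downstream
  AUTO_REJECT = 0  # default rejection: Agent/backend may still keep the span
  AUTO_KEEP   = 1  # default keep: Agent/backend may still drop the span
  USER_KEEP   = 2  # forced keep: must be respected downstream
end

# Minimal span with only the fields this sketch needs.
SketchSpan = Struct.new(:name, :sampling_priority)

# Hypothetical rule: a matcher callable plus a sample rate between 0.0 and 1.0.
SketchRule = Struct.new(:matcher, :sample_rate) do
  def match?(span)
    matcher.call(span)
  end
end

# Returns true/false when a user-configured rule made a decision,
# or nil when no rule matched (the span's priority is left untouched).
def apply_rules!(span, rules, rng: Random.new)
  rule = rules.find { |r| r.match?(span) }
  return nil if rule.nil? || rule.sample_rate.nil?

  kept = rng.rand < rule.sample_rate
  span.sampling_priority = kept ? Priority::USER_KEEP : Priority::USER_REJECT
  kept
end
```

With a rule at rate 1.0 the span ends up with priority 2, with a rule at rate 0.0 it ends up with -1, and a span that matches no rule keeps whatever priority it already had.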

ivoanjo

Overall it seems good... but I think I don't have enough context to evaluate if this change makes sense.

Is there any design doc I could reference to confirm that this is the common behavior we want? E.g., it's a bit unclear to me why we're changing this now.

lib/ddtrace/ext/priority.rb

lib/ddtrace/sampler.rb

ivoanjo · 2021-11-11T09:34:47Z

lib/ddtrace/sampler.rb

      if span.sampled
-        # If priority sampling has already been applied upstream, use that, otherwise...
-        unless priority_assigned_upstream?(span)
-          # Roll the dice and determine whether how we set the priority.
-          priority = priority_sample!(span) ? Datadog::Ext::Priority::AUTO_KEEP : Datadog::Ext::Priority::AUTO_REJECT
+        # If priority sampling has already been applied upstream, use that value.
+        return span.sampled if priority_assigned?(span)

-          assign_priority!(span, priority)
-        end
+        # Check with post sampler how we set the priority.
+        sample = priority_sample!(span)
+
+        # Check if post sampler has already assigned a priority.
+        return span.sampled if priority_assigned?(span)


Minor: Can the value of span.sampled be mutated as this method is being executed? If not, could we simplify the early returns with return true if priority_assigned?(...) to clarify what we expect from it?

It cannot. I simplified it as suggested. 👍
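
For readers following the thread, the simplified control flow discussed here might look roughly like the sketch below. It is not the merged code: `FlowSpan`, `SketchPrioritySampler`, and the injected `post_sampler` callable are hypothetical, and the handling of an unsampled span (returning false without touching the priority) is an assumption, since that branch is not shown in the diff.

```ruby
# Default priority values, as described in this PR.
AUTO_REJECT = 0
AUTO_KEEP   = 1

# Minimal span with only the fields this sketch needs.
FlowSpan = Struct.new(:sampled, :sampling_priority)

class SketchPrioritySampler
  # post_sampler: callable taking a span and returning true/false.
  # It may also assign span.sampling_priority itself (e.g. a rule-based sampler).
  def initialize(post_sampler)
    @post_sampler = post_sampler
  end

  def sample!(span)
    # Assumption: an unsampled span is reported as-is.
    return false unless span.sampled

    # Priority already assigned upstream: keep that value.
    return true if priority_assigned?(span)

    kept = @post_sampler.call(span)

    # The post sampler made its own (e.g. user-configured) decision: keep it.
    return true if priority_assigned?(span)

    # Otherwise fall back to the default, overridable priorities.
    span.sampling_priority = kept ? AUTO_KEEP : AUTO_REJECT
    true
  end

  private

  def priority_assigned?(span)
    !span.sampling_priority.nil?
  end
end
```

Because span.sampled is already known to be truthy at both early returns, returning the literal true is equivalent and makes the expectation explicit.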

ivoanjo · 2021-11-11T10:19:05Z

Is there any design doc I could reference to confirm that this is the common behavior we want? E.g., it's a bit unclear to me why we're changing this now.

Oops, just realized that you HAD already dropped a link to the design doc in Slack. Thanks.

What I did also notice is that I believe our docs still need to be updated, e.g. this section.

codecov-commenter · 2021-11-16T00:43:37Z

Codecov Report

Merging #1769 (3ddb57d) into master (9e75a24) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1769   +/-   ##
=======================================
  Coverage   98.21%   98.21%           
=======================================
  Files         931      931           
  Lines       44870    44904   +34     
=======================================
+ Hits        44069    44103   +34     
  Misses        801      801

Impacted Files	Coverage Δ
lib/ddtrace/ext/priority.rb	`100.00% <ø> (ø)`
lib/ddtrace/sampler.rb	`91.04% <100.00%> (+0.13%)`	⬆️
lib/ddtrace/sampling/rule_sampler.rb	`96.55% <100.00%> (+0.39%)`	⬆️
spec/ddtrace/integration_spec.rb	`97.16% <100.00%> (ø)`
spec/ddtrace/sampler_spec.rb	`100.00% <100.00%> (ø)`
spec/ddtrace/sampling/rule_sampler_spec.rb	`99.23% <100.00%> (+0.06%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

ivoanjo

Thanks for all the clarifications. Here's my 👍 but I think this is one of those where it makes sense for @delner to double-check if we missed something, since there may be unknown unknowns hiding.

ivoanjo · 2021-11-16T12:50:25Z

lib/ddtrace/sampler.rb

+    # NOTE: We do not advise using a pre-sampler. It can save resources,
+    # but pre-sampling at rates < 100% may result in partial traces, unless
+    # the pre-sampler knows exactly how to drop a span without dropping its ancestors.
+    #
+    # Additionally, as service metrics are calculated in the Datadog Agent,
+    # the service's throughput will be underestimated.
    attr_reader :pre_sampler, :priority_sampler


I'm actually curious, are there a lot of customers using this feature? Since it ends up being a very sharp edge...

I'm not sure it's used in Ruby, but it is used by other language clients.
Because the pre-sampler is the only way for the user to reduce traffic between ddtrace and the agent, it's used by clients with super high throughput, as their infra wouldn't handle the trace traffic otherwise (mostly due to high load on the trace agent).
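
To make the partial-trace concern concrete, here is a small sketch of one way a pre-sampler could avoid splitting traces: derive the keep/drop decision from the trace ID alone, so every span of a trace gets the same answer. Everything here (`PreSpan`, `keep_trace?`, `pre_sample`, and the hashing constants) is hypothetical and chosen for illustration; it is not taken from ddtrace, and it does nothing about the throughput underestimation mentioned in the code comment.

```ruby
# Minimal span with only the fields this sketch needs.
PreSpan = Struct.new(:trace_id, :span_id)

# Arbitrary multiplicative-hash constants (assumptions for this sketch).
HASH_FACTOR = 2_654_435_761
HASH_SPACE  = 2**32

# Deterministic per-trace decision: the same trace_id always yields the same result.
def keep_trace?(trace_id, rate)
  (trace_id * HASH_FACTOR) % HASH_SPACE < rate * HASH_SPACE
end

# Drops whole traces, never individual spans within a trace.
def pre_sample(spans, rate)
  spans.select { |span| keep_trace?(span.trace_id, rate) }
end
```

A sampler that instead rolled the dice independently for each span could keep a child while dropping its parent, which is the partial-trace outcome the comment warns about.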

delner

Overall code changes seem fine, although it will inevitably have merge conflicts with 1.0 changes. Thank you!

lib/ddtrace/sampling/rule_sampler.rb

Use USER_KEEP/USER_REJECT for RuleSampler decisions

53634f5

marcotc added the core Involves Datadog core libraries label Nov 11, 2021

marcotc self-assigned this Nov 11, 2021

marcotc requested a review from a team November 11, 2021 00:07

ivoanjo reviewed Nov 11, 2021

marcotc added 3 commits November 15, 2021 16:22

Use RFC requirement language

1569260

Simplify return on Datadog::PrioritySampler#sample!

4208d4b

Expaned on pre-sampler comments

3ddb57d

ivoanjo approved these changes Nov 16, 2021

delner approved these changes Nov 17, 2021

lib/ddtrace/sampling/rule_sampler.rb (resolved)

Directly set sampling priority on span

dad5da9

marcotc added this to the 0.54.0 milestone Nov 17, 2021

Revert "Directly set sampling priority on span"

52809fc

This reverts commit dad5da9.

marcotc merged commit 4c7414f into master Nov 17, 2021

marcotc deleted the priority-sampling-3 branch November 17, 2021 21:46

ygree mentioned this pull request Dec 13, 2021

Use USER_(KEEP/REJECT) for RuleBasedSampler DataDog/dd-trace-java#3263

Merged

buddhistpirate mentioned this pull request Feb 14, 2022

0.54.0 Broke App Analytics #1905

Closed

marcotc mentioned this pull request May 3, 2022

AppSec: fetch parsed body for Rack #1969

Merged
