Composite Samplers #250

PeterF778 · 2024-02-29T02:21:32Z

This is another approach for introducing Composite Samplers. The previous one (A Sampling Configuration proposal) proved too large, with some controversial elements.
This OTEP is a split-off from that one, focusing just on one area - new Composite Samplers.

yurishkuro

The bifurcation of AnyOf/Conjunction and RuleBased doesn't feel right. They are all in the realm of Boolean logic, with or without short circuiting, can we not come up with a more streamlined scheme?

For example, why does RuleBased sampler need to take a list of predicates, instead of one predicate and achieving multi-predicate behavior through composition?

text/0250-Composite_Samplers.md

yurishkuro · 2024-02-29T02:53:21Z

text/0250-Composite_Samplers.md

+
+The `RuleBased` sampler SHOULD NOT accept lists of different length, i.e. it SHOULD report an error. Implementations MAY allow for replacing both lists with a list of pairs (`Predicate`, `Sampler`), if this is supported by the platform.
+
+For making the sampling decision, if the `Span` kind matches the specified kind, the sampler goes through the lists in the declared order. If a Predicate matches the `Span` in question, the corresponding `Sampler` will be called to make the final sampling decision. If the `SpanKind` does not match, or none of the Predicates evaluates to `true`, the final decision is `DROP`.


If a Predicate matches the Span in question, the corresponding Sampler will be called to make the final sampling decision.

This will result in poor observability as the underlying sampler can be a primitive one so the only attributes it will be able to add will be related to its algorithm, not the fact that a specific predicate was matched. I would much rather see an ability for the user to give names to individual decision points (applies to all composite samplers). In addition to attributes these samplers could be emitting metrics with corresponding names.

Interesting ideas - do you have a specific use case in mind?
Anyway, I do not quite agree with your claim of poor observability. Most Predicates are expected to test span attributes only and therefore their underlying logic can be re-evaluated at any later time.

In the example at the top you have English description of different categories of spans being sampled. As a user, I would appreciate a mechanism to not only keep those description (maybe in a shorter form of "names"), but also to have observability where I can see which triggers fire how often (either by recording the trigger name as span attribute or emitting metrics labeled with those names). Your current mechanism does not allow for that afaict.

text/0250-Composite_Samplers.md

PeterF778 · 2024-03-04T15:25:49Z

The bifurcation of AnyOf/Conjunction and RuleBased doesn't feel right. They are all in the realm of Boolean logic, with or without short circuiting, can we not come up with a more streamlined scheme?

For example, why does RuleBased sampler need to take a list of predicates, instead of one predicate and achieving multi-predicate behavior through composition?

RuleBased is based on easy to understand if-then-else paradigm. Alternative schemas may look attractive, but I'm afraid they would provide less readable configurations.

… into composite_samplers

text/0250-Composite_Samplers.md

…w clarifying sentences.

jmacd

Haven't been able to finish this review. (Still need to understand Approach 2.) Sorry for the delay!

text/0250-Composite_Samplers.md

oertl · 2024-04-01T20:20:21Z

text/0250-Composite_Samplers.md

+- The set of span Attributes to be added to the `Span` is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s.
+- The `TraceState` to be used with the new `Span` is obtained by cumulatively applying all the potential modfications of the parent `TraceState` by the delegate samplers, with special handling of the `th` sub-key (the sampling rejection `THRESHOLD`) for the `ot` entry as described below.
+
+If the final sampling Decision is `DROP` or `RECORD_ONLY`, the `th` entry MUST be removed.


If all delegated samplers are consistent, it still makes sense to set the th value even in the case of negative sampling decisions.

One could perhaps say that, in general, for both EachOf and AnyOf, the th value must be set if and only if all delegated samplers are consistent (regardless of the sampling decision).

I like this idea, but didn't we expect that th is removed after a negative sampling decision? Similarly to how the p-values were handled?

I see. If we never put th to the trace state for negative sampling decisions, which I think makes sense, then again what is the purpose of an empty th field representing 0% probability? @jmacd

I had to read-through the latest iteration of this document before coming back to this thread--

I don't think an "empty" th field should be meaningful except to represent unknown information, in which case the field should be absent, not present w/ an empty value.

I guess the reason this might matter, the explicit unsetting of th-values for negative sampling decisions, is that for approach-1 spans where a RECORD_ONLY decision is reached, there will be a copy of the span for reading, in memory. When the span is accessed, both the former th value and the unset th value are somehow meaningful.

I would like us to come back to the invariants we believe there are, including the sampled flag.

If the sampled flag is NOT set and th is non-empty, I believe we have an inconsistency. This is why I believe we should unset th.

If the sampled flag is set and the th is unset (or empty), I believe we have unknown sampling.

jmacd · 2024-04-18T18:27:45Z

text/0250-Composite_Samplers.md

+Upon invocation of its `GetThreshold` function, the composite sampler MUST get the threshold from the delegate sampler, passing the same arguments as received.
+If using the obtained threshold value as the final threshold would entail sampling more spans than the declared target rate, the sampler SHOULD increase the threshold to a value that would meet the target rate. Several algorithms can be used for threshold adjustment, no particular behavior is prescried by the specification though.
+
+When using `ConsistentRateLimiting` in our requirements example as a replacement for `EachOf` and `RateLimiting`, we are left with no use case for any direct equivalent to `EachOf` in Approach Two.


Can you clarify this sentence for me? I am not sure I follow.

For symmetry, one would expect that we introduce something like ConsistentEachOf. But the only situation I see we needed that kind of logic (each of) was with combining one sampler with a rate-limiting sampler. So now, when we have a delegate for ConsistentRateLimiting, that need disappears. Unless we find a use case where it is is useful.

I see --
would you agree that the concept of a ConsistentEachOf sampler can be defined easily, and that it reduces to the maximum-threshold of the delegates. It's well-defined, but not necessary for the example being developed, I think.

Yes. I do not see any difficulties in defining the behavior of ConsistentEachOf, it is just its usefulness that is questionable. As soon as we find a nice use case for it, it can be added to this spec.

jmacd · 2024-04-18T18:37:56Z

text/0250-Composite_Samplers.md

+
+### Limitations of composite samplers in Approach Two
+
+While making sampling decisions with samplers from Approach Two is more efficient and avoids dealing with non-mainstream cases, it puts some limits on the capabilities of the Consistent Probability Samplers. In particular, a custom CP sampler that wishes to add a span `Attribute` or modify TraceState will be out of luck if it is used as a delegate.


This leaves me thinking about future work, possibly out-of-scope in this document. It looks like both approaches are necessary, one for historical compatibility and one for enabling future span-to-metrics and logs-to-metrics pipelines.

Do you agree that the two approaches may be complimentary, and can co-exist? This leads to two suggestions:

For the approach-one case, is there a set of rules that ensure when any non-probabilistic or CP-incompatible samplers are used, that the th value becomes "unknown" by being unset? (I think yes--this is discussed in comment threads above.)

To handle the limitations here, for approach 2, is there a way to combine approach-1 and approach-2 samplers so the th field is governed but otherwise Samplers can extend attributes, and so on? (I think yes.)

For the second case, I imagine an existing sampler that is written like:

if arbitraryPredicate(context) { result.decision = RECORD_AND_SAMPLE result.attributes = someNewAttributes() } else { result.decision = DROP }

can be rewritten, probably, as:

S = Conjunction( ConsistentRuleBased( (arbitraryPredicate => ConsistentAlwaysOn), true => ConsistentAlwaysOff, ), AttributeInserter( (result) => { result.attributes = someNewAttributes return result }))

If this works, it seems we ought to be able to construct an implementation that combines approach-1 and approach-2 samplers via, essentially, this rewrite. Does this sound possible?

Yes, I expect both approaches to be useful, targeting different users. I do not think we should abandon either one in favor of the other.
With respect to your point 1 above, the intention is to handle the th value "correctly". But we cannot just say that in all cases if any non-CP sampler is used then the th value needs to be erased. Both RuleBased and AnyOf samplers can have non-CP delegates, yet still deliver valid th values.

It is possible to use Approach Two delegates within Approach One samplers, and this is more or less what your example does. So it is a workaround for some limitations of Approach Two.

Got it. I think we're in agreement on this topic. I think our next step is really to prototype what this looks like in a prototype implementation.

jmacd · 2024-04-18T18:51:23Z

text/0250-Composite_Samplers.md

+- The set of span Attributes to be added to the `Span` is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s.
+- The `TraceState` to be used with the new `Span` is obtained by cumulatively applying all the potential modfications of the parent `TraceState` by the delegate samplers, with special handling of the `th` sub-key (the sampling rejection `THRESHOLD`) for the `ot` entry as described below.
+
+If the final sampling Decision is `DROP` or `RECORD_ONLY`, the `th` entry MUST be removed.


I had to read-through the latest iteration of this document before coming back to this thread--

I don't think an "empty" th field should be meaningful except to represent unknown information, in which case the field should be absent, not present w/ an empty value.

I guess the reason this might matter, the explicit unsetting of th-values for negative sampling decisions, is that for approach-1 spans where a RECORD_ONLY decision is reached, there will be a copy of the span for reading, in memory. When the span is accessed, both the former th value and the unset th value are somehow meaningful.

I would like us to come back to the invariants we believe there are, including the sampled flag.

If the sampled flag is NOT set and th is non-empty, I believe we have an inconsistency. This is why I believe we should unset th.

If the sampled flag is set and the th is unset (or empty), I believe we have unknown sampling.

text/0250-Composite_Samplers.md

oertl · 2024-04-19T06:26:38Z

text/0250-Composite_Samplers.md

+- If all of the delegate Decisions are `RECORD_AND_SAMPLE`, the composite sampler MUST return `RECORD_AND_SAMPLE` Decision as well.
+If any of the delegate Decisions is `DROP`, the composite sampler MUST return `DROP` Decision.
+Otherwise, if any of the delegate Decisions is `RECORD_ONLY`, the composite sampler MUST return `RECORD_ONLY` Decision.
+- If the resulting sampling Decision is `DROP`, the set of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.


If we want to have symmetry (see above):

Suggested change

- If the resulting sampling Decision is `DROP`, the set of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.

- If the resulting sampling Decision is `DROP`, the union of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. Conflicting attribute keys must be removed.

Removing conflicting attributes is an interesting concept. However, if we were to set attributes using Span.SetAttribute, the specification says
Setting an attribute with the same key as an existing attribute SHOULD overwrite the existing attribute's value, so there would be an inconsistency. Adding entries with conflicting keys to TraceState also overwrites the previous entries - another inconsistency.
And we would be left with an interesting corner case: what to do if two samplers attempt to set Attributes with the same keys and values?

Upon invocation of its shouldSample method, it MUST go through the whole list and invoke shouldSample method on each delegate sampler, passing the same arguments as received, and collecting the delegates' sampling Decisions.

I understand this means the delegate samplers must be called with the same attributes. After invocating the delegates we have different sets of attributes that must be merged to a final result. Given an initial set of attributes, individual copies of this initial set are then passed to the delegates, who may set/modify attributes. As the final step we have to combine all the resulting sets of attributes including the initial one. To achieve symmetry, one could define that an attribute is only changed/set if the modifications by the delegates are not conflicting meaning that they don't try to set different values for the same key. If we take the initial set of attributes and apply only the non-conflicting modifications using Span.SetAttribute we should be fine.

okay, so you suggest looking only at the conflicts between the delegates - it makes a lot of sense. I had thought about conflicts with the initial attributes as well, my bad. I like this idea. Unfortunately, similar treatment of TraceState modifications does not seem possible with the current implementation of Java SDK. I did not look at other platforms.

oertl · 2024-04-19T06:27:01Z

text/0250-Composite_Samplers.md

+- If all of the delegate Decisions are `RECORD_AND_SAMPLE`, the composite sampler MUST return `RECORD_AND_SAMPLE` Decision as well.
+If any of the delegate Decisions is `DROP`, the composite sampler MUST return `DROP` Decision.
+Otherwise, if any of the delegate Decisions is `RECORD_ONLY`, the composite sampler MUST return `RECORD_ONLY` Decision.
+- If the resulting sampling Decision is `DROP`, the set of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.


Suggested change

- If the resulting sampling Decision is `DROP`, the set of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.

- If the resulting sampling Decision is `DROP`, the set of span Attributes to be added to the `Span` is empty. Otherwise, it is the union of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.

text/0250-Composite_Samplers.md

oertl · 2024-04-19T06:34:19Z

text/0250-Composite_Samplers.md

+- If all of the delegate Decisions are `DROP`, the composite sampler MUST return `DROP` Decision as well.
+If any of the delegate Decisions is `RECORD_AND_SAMPLE`, the composite sampler MUST return `RECORD_AND_SAMPLE` Decision.
+Otherwise, if any of the delegate Decisions is `RECORD_ONLY`, the composite sampler MUST return `RECORD_ONLY` Decision.
+- The set of span Attributes to be added to the `Span` is the sum of the sets of Attributes as provided by those delegate samplers which produced a sampling Decision other than `DROP`. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.


Suggested change

- The set of span Attributes to be added to the `Span` is the sum of the sets of Attributes as provided by those delegate samplers which produced a sampling Decision other than `DROP`. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.

- The set of span Attributes to be added to the `Span` is the sum of the union of Attributes as provided by those delegate samplers which produced a sampling Decision other than `DROP`. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.

oertl · 2024-04-19T06:41:26Z

text/0250-Composite_Samplers.md

+If any of the delegate Decisions is `RECORD_AND_SAMPLE`, the composite sampler MUST return `RECORD_AND_SAMPLE` Decision.
+Otherwise, if any of the delegate Decisions is `RECORD_ONLY`, the composite sampler MUST return `RECORD_ONLY` Decision.
+- The set of span Attributes to be added to the `Span` is the sum of the sets of Attributes as provided by those delegate samplers which produced a sampling Decision other than `DROP`. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.
+- The `TraceState` to be used with the new `Span` is obtained by cumulatively applying all the potential modfications of the parent `TraceState` by the delegate samplers. In case of conflicting entry keys, the entry definition provided by the last delegate that uses that key takes effect. However, the `th` sub-key (the sampling rejection `THRESHOLD`) for the `ot` entry gets special handling as described below.


If we want to have symmetry:

Suggested change

- The `TraceState` to be used with the new `Span` is obtained by cumulatively applying all the potential modfications of the parent `TraceState` by the delegate samplers. In case of conflicting entry keys, the entry definition provided by the last delegate that uses that key takes effect. However, the `th` sub-key (the sampling rejection `THRESHOLD`) for the `ot` entry gets special handling as described below.

- The `TraceState` to be used with the new `Span` is obtained by cumulatively applying all the potential modifications of the parent `TraceState` by the delegate samplers. Conflicting entry keys must be removed. However, the `th` sub-key (the sampling rejection `THRESHOLD`) for the `ot` entry gets special handling as described below.

It would be nice to have, but I see it difficult to implement. In Java SDK the SamplingResult can modify a given TraceState, but the modification itself (delta) is not tangible. Also, in contrast with the Attributes, trace state modification can remove keys (at least I have not found anything that would prohibit doing that).

text/0250-Composite_Samplers.md

oertl · 2024-04-19T07:54:26Z

text/0250-Composite_Samplers.md

+
+### Limitations of composite samplers in Approach Two
+
+While making sampling decisions with samplers from Approach Two is more efficient and avoids dealing with non-mainstream cases, it puts some limits on the capabilities of the Consistent Probability Samplers. In particular, a custom CP sampler that wishes to add a span `Attribute` or modify TraceState will be out of luck if it is used as a delegate.


As it is explicitly mentioned as a limitation. I wonder why a consistent sampler would add any span attributes or change the trace state at all (other than th and rv)? What would be a use case for this?

Good question. With trace state, I imagine a head sampler can mark some traces for downstream (head) samplers and for further trace processing, effectively classifying traces according to arbitrary criteria. Obviously, this requires writing custom samplers, but these samplers can be still CP compliant.
Adding Attributes is similar, even though it works only on the Span level and only for sampled Spans.

Got it. I think this could also be achieved with Approach Two by extending the ConsistentSampler interface by methods other than getThresholdsuch as getAttributes and defining some logic how those attributes are finally added in case the overall sampling decision is positive. So I would not necessarily see that as limitation of Approach Two as we also have to define some logic for the attributes for Approach One.

Yes, and we could do a similar thing with TraceState by providing adjustTraceState. For that call, we could pass the final SamplingDecision (which can be different from what the delegate decided). One difficulty is that for RuleBased, the Predicates would get re-evaluated in order to route the call to the correct delegate.

I am thinking in the same direction. I have a sandbox where I've begun drafting my hopes&dreams for a V2 sampler API, right now I have something like:

type SamplingResult struct { Decision SamplingDecision Mutation func(trace.TraceState) ([]attribute.KeyValue, trace.TraceState) }

for an Approach-2 sampler, the idea is that a composition of sampler delegates will first evaluate the decision, then it would modify the special TraceState fields, then for the positive decision makers it would allow them (sequentially) to apply modified attributes and modify the trace state in other ways. I think it will be fine to say that attributes are applied in the same order as the original delegate list, and that later samplers clobber earlier samplers in case of conflicts. I will flush this out more, but I think we can get past the trouble associated with modifying attributes and tracestate independent of the decision & essential tracestate updates.

I modified the document to present this two-phase process.

I've begun to prototype jmacd/opentelemetry-go#877

Co-authored-by: Otmar Ertl <otmar.ertl@gmail.com>

kalyanaj · 2024-04-28T02:33:05Z

text/0250-Composite_Samplers.md

+
+Each delegate sampler MUST be given a chance to participate in the sampling decision as described above and MUST see the same _parent_ state. The resulting sampling Decision does not depend on the order of the delegate samplers.
+
+### EachOf


Suggestion: While EachOf is fine, I am wondering if we can consider "AllOf" as an alternative name - it feels a bit stronger and contrasts well with the AnyOf model.

Interestingly, I had considered AllOf originally, but decided against it because it looks too much like AnyOf ... especially in small font and for tired eyes ... but definitely an option.

kalyanaj · 2024-04-28T02:45:57Z

text/0250-Composite_Samplers.md

+
+Each delegate sampler MUST be given a chance to participate in the sampling decision as described above and MUST see the same _parent_ state. The resulting sampling Decision does not depend on the order of the delegate samplers.
+
+### Conjunction


Suggestion: Would the names Chain or Sequence better convey the intent here? Conjunction feels a bit too abstract (and seems to have more interpretations than the chaining/pipeline implied here).

Both Chain and Sequence suggest, IMHO, a possibility to have more steps (delegates) than just two. And I do not want to deal with more than 2 delegates - this kind of composition is rarely needed for two, and I do not have a use case for more than two. It is a bit like IfThen construct (I'm not suggesting this name though, this could be too confusing).
I think Chain is acceptable, and I'm open to any new suggestions.

text/0250-Composite_Samplers.md

kalyanaj · 2024-06-06T16:02:14Z

text/0250-Composite_Samplers.md

+- The sampling Decision is `RECORD_AND_SAMPLE`.
+- The set of span Attributes to be added to the `Span` is the union of the sets of Attributes as provided by both samplers.
+- The `Tracestate` to be used with the new `Span` is as provided by the Second sampler, with special handling of the `th` sub-key (the sampling rejection `THRESHOLD`) for the `ot` entry.
+If the First sampler did not provide `th` entry in the returned `Tracestate`, or if the value of the corresponding `THRESHOLD` is not `0`, then the `th` entry MUST be removed from the resulting `Tracestate`.


I didn't quite get the part about "or if the value of the corresponding THRESHOLD is not 0,". If the first sampler returned RECORD_AND_SAMPLE, it is possible that the threshold could be non-zero, correct? Can you please clarify?

Yes, this is a tricky part. If the First sampler decides to sample only some spans (i.e. the THRESHOLD is not 0), we cannot, in general, assure that the composite sampler's behavior is correct (even if we will use maximum of the THRESHOLD values from First and Second delegate). See @oertl's comments from March 7 (click on Show resolved).

Actually, giving it more thoughts, I'm inclined to relax this condition and allow any THRESHOLD values from the First delegate. It opens an opportunity for abuse by a malicious Second sampler, but such abuse (using the r value for determining sampling priority) can happen anywhere. I believe that as long as First and Second delegates are well-behaved, we can use the maximum THRESHOLD value for the final result. CC: @oertl

kalyanaj · 2024-06-06T16:03:50Z

text/0250-Composite_Samplers.md

+
+If the sampling Decision from the Second sampler is `DROP` or `RECORD_ONLY`, the Conjunction sampler MUST return a `SamplingResult` which is constructed as follows:
+
+- The sampling Decision is `DROP`.


If the sampling decision from the second sampler is RECORD_ONLY, shouldn't this composite sampler be returning that - didn't quite get why the final decision is DROP.

kalyanaj · 2024-06-06T16:06:06Z

text/0250-Composite_Samplers.md

+
+`Conjunction` is a composite sampler which takes two Samplers (delegates) as the arguments. These delegate samplers will be hereby referenced as First and Second. This kind of composition forms conditional chaining of both samplers. 
+
+Upon invocation of its `shouldSample` method, the Conjunction sampler MUST invoke `shouldSample` method on the First sampler, passing the same arguments as received, and examine the received sampling Decision. Upon receiving `DROP` or `RECORD_ONLY` decision it MUST return the SamplingResult from the First sampler, and it MUST NOT proceed with querying the Second sampler. If the sampling decision from the First sampler is `RECORD_AND_SAMPLE`, the Conjunction sampler MUST invoke `shouldSample` method on the Second sampler, effectively passing the `Tracestate` received from the First sampler as the parent trace state.


For the case where the first sampler returns DROP or RECORD_ONLY, can we also clarify the behavior of the span attributes and tracestate? For the RECORD_ONLY case, should they be retained? Currently, the spec doesn't seem to be specifying that aspect.

kalyanaj · 2024-06-06T16:21:47Z

text/0250-Composite_Samplers.md

+This proposal addresses head-based sampling as described by the [Open Telemetry SDK](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sampling).
+It introduces additional _composite samplers_. Composite samplers use other samplers (_delegates_ or _children_) to make sampling decisions. The composite samplers invoke the delegate samplers, but eventually make the final call.
+
+The new samplers proposed here are mostly compatible with Consistent Probability Samplers. For detailed description of this concept see [probability sampling](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md). However, the technical details in that document are outdated. For the current proposal see [OTEP 235](https://github.com/open-telemetry/oteps/blob/main/text/trace/0235-sampling-threshold-in-trace-state.md). Also see Draft PR 3910 [Probability Samplers based on W3C Trace Context Level 2](https://github.com/open-telemetry/opentelemetry-specification/pull/3910).


Since OTEP 235 supersedes the earlier experimental spec, my suggestion is to link only to OTEP 235 in this spec (both here and later below in the section Example - sampling configuration 2). Otherwise, it can be confusing for readers. If they need more info on the earlier version, I believe in OTEP 235 we link to the earlier version of the spec.

Good point, I'll change that.

text/0250-Composite_Samplers.md

kalyanaj · 2024-06-06T16:28:06Z

text/0250-Composite_Samplers.md

+
+## Prior art
+
+A number of composite samplers are already available as independent contributions


FYI, I have a few more here (in case you want to link to them) at:

https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/trace/links-based-sampler

https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/trace/stratified-sampling-example

https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/trace/tail-based-sampling-span-level

Thanks! Good stuff!

text/0250-Composite_Samplers.md

kalyanaj · 2024-06-06T17:27:20Z

text/0250-Composite_Samplers.md

+To make this approach possible, all Consistent Probability Samplers which participate in the samplers composition need to implement the following API, in addition to the standard Sampler API. We will use the term _Composable Sampler_ to dentote Consistent Probability Samplers which provide the new API and conform to the rules described here.
+The composite samplers in Approach Two are Composable Samplers as well.
+
+#### GetSamplingAdvice


Suggestion: The term advice feels a bit abstract to me. Suggest considering a bit more concrete name such as "GetSamplingStatePreview" or "EvaluateSamplingIntent".

Sure, how about GetSamplingIntent?

sure, that sounds good to me.

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>

Creating "Composite Samplers" OTEP.

bfd33d9

PeterF778 requested a review from a team as a code owner February 29, 2024 02:21

PeterF778 added 2 commits February 28, 2024 18:22

File rename

611a140

Merge branch 'main' into composite_samplers

16005a5

yurishkuro reviewed Feb 29, 2024

View reviewed changes

PeterF778 added 6 commits March 4, 2024 16:31

Creating "Composite Samplers" OTEP.

772ef28

File rename

395bd0c

Updated version, with examples

fcb6a4b

Merge branch 'composite_samplers' of https://github.com/PeterF778/oteps…

9495d33

… into composite_samplers

Fixing markdown-lint errors

b37f054

Fixing more markdown-lint errors

7bcd40b

oertl reviewed Mar 7, 2024

View reviewed changes

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

jmacd marked this pull request as draft March 11, 2024 16:25

Modfied the behavior of Conjunction sampler.

c72b06a

oertl reviewed Mar 13, 2024

View reviewed changes

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

Added description for Approach Two.

7cb67c1

This was referenced Mar 18, 2024

Add OpenTelemetry sampling conventions open-telemetry/semantic-conventions#793

Open

Refactor the built-in Samplers for probability sampling open-telemetry/opentelemetry-specification#2179

Closed

Added description of ConsistentRateLimited, another example, and a fe…

ab8e962

…w clarifying sentences.

jmacd mentioned this pull request Mar 21, 2024

To achieve different sampling rates for different applications and integrate them with the OTel collectors open-telemetry/opentelemetry-collector-contrib#31562

Open

jmacd reviewed Mar 28, 2024

View reviewed changes

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

oertl reviewed Apr 1, 2024

View reviewed changes

text/0250-Composite_Samplers.md Outdated Show resolved Hide resolved

oertl reviewed Apr 1, 2024

View reviewed changes

Changing description of sampler as suggested

47084f2

jmacd reviewed Apr 18, 2024

View reviewed changes

oertl reviewed Apr 19, 2024

View reviewed changes

PeterF778 and others added 2 commits April 19, 2024 10:34

Update text/0250-Composite_Samplers.md

a6d3705

Co-authored-by: Otmar Ertl <otmar.ertl@gmail.com>

Update text/0250-Composite_Samplers.md

cae3b74

Co-authored-by: Otmar Ertl <otmar.ertl@gmail.com>

jmacd mentioned this pull request Apr 19, 2024

Give samplers the span ID open-telemetry/opentelemetry-specification#3991

Open

PeterF778 and others added 3 commits April 19, 2024 11:25

Update text/0250-Composite_Samplers.md

0d3ce18

Co-authored-by: Otmar Ertl <otmar.ertl@gmail.com>

Update text/0250-Composite_Samplers.md

9a7afb3

Co-authored-by: Otmar Ertl <otmar.ertl@gmail.com>

Update text/0250-Composite_Samplers.md

129c99c

Co-authored-by: Otmar Ertl <otmar.ertl@gmail.com>

jmacd mentioned this pull request Apr 24, 2024

Project Tracking: Sampling open-telemetry/opentelemetry-specification#4012

Open

PeterF778 added 2 commits May 1, 2024 11:48

Describing a two-phase processing for Approach Two.

10ba0c1

Fixing a typo

1c15c2b

jmacd mentioned this pull request May 2, 2024

[WIP] SpanReader sampling/processing interface jmacd/opentelemetry-go#877

Open

PeterF778 added 2 commits May 3, 2024 10:17

Minor adjustments and corrections, introducing Composable Sampler.

a0bfb1b

Adding more references

e544794

kalyanaj mentioned this pull request May 10, 2024

Sampler API V2: What, Why, and How? open-telemetry/opentelemetry-specification#4044

Open

kalyanaj reviewed Jun 6, 2024

View reviewed changes

PeterF778 and others added 3 commits June 6, 2024 15:17

Update text/0250-Composite_Samplers.md

635b216

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>

Update text/0250-Composite_Samplers.md

06cc32d

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>

Update text/0250-Composite_Samplers.md

5cc2174

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>


		The `RuleBased` sampler SHOULD NOT accept lists of different length, i.e. it SHOULD report an error. Implementations MAY allow for replacing both lists with a list of pairs (`Predicate`, `Sampler`), if this is supported by the platform.

		For making the sampling decision, if the `Span` kind matches the specified kind, the sampler goes through the lists in the declared order. If a Predicate matches the `Span` in question, the corresponding `Sampler` will be called to make the final sampling decision. If the `SpanKind` does not match, or none of the Predicates evaluates to `true`, the final decision is `DROP`.


		### Limitations of composite samplers in Approach Two

		While making sampling decisions with samplers from Approach Two is more efficient and avoids dealing with non-mainstream cases, it puts some limits on the capabilities of the Consistent Probability Samplers. In particular, a custom CP sampler that wishes to add a span `Attribute` or modify TraceState will be out of luck if it is used as a delegate.

	- If the resulting sampling Decision is `DROP`, the set of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.
	- If the resulting sampling Decision is `DROP`, the union of span Attributes to be added to the `Span` is empty. Otherwise, it is the sum of the sets of Attributes as provided by the delegate samplers within their `SamplingResults`s. Conflicting attribute keys must be removed.

	- The set of span Attributes to be added to the `Span` is the sum of the sets of Attributes as provided by those delegate samplers which produced a sampling Decision other than `DROP`. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.
	- The set of span Attributes to be added to the `Span` is the sum of the union of Attributes as provided by those delegate samplers which produced a sampling Decision other than `DROP`. In case of conflicting attribute keys, the attribute definition from the last delegate that uses that key takes effect.


		Each delegate sampler MUST be given a chance to participate in the sampling decision as described above and MUST see the same _parent_ state. The resulting sampling Decision does not depend on the order of the delegate samplers.

		### EachOf


		Each delegate sampler MUST be given a chance to participate in the sampling decision as described above and MUST see the same _parent_ state. The resulting sampling Decision does not depend on the order of the delegate samplers.

		### Conjunction


		If the sampling Decision from the Second sampler is `DROP` or `RECORD_ONLY`, the Conjunction sampler MUST return a `SamplingResult` which is constructed as follows:

		- The sampling Decision is `DROP`.


		`Conjunction` is a composite sampler which takes two Samplers (delegates) as the arguments. These delegate samplers will be hereby referenced as First and Second. This kind of composition forms conditional chaining of both samplers.

		Upon invocation of its `shouldSample` method, the Conjunction sampler MUST invoke `shouldSample` method on the First sampler, passing the same arguments as received, and examine the received sampling Decision. Upon receiving `DROP` or `RECORD_ONLY` decision it MUST return the SamplingResult from the First sampler, and it MUST NOT proceed with querying the Second sampler. If the sampling decision from the First sampler is `RECORD_AND_SAMPLE`, the Conjunction sampler MUST invoke `shouldSample` method on the Second sampler, effectively passing the `Tracestate` received from the First sampler as the parent trace state.


		## Prior art

		A number of composite samplers are already available as independent contributions

Composite Samplers #250

Are you sure you want to change the base?

Composite Samplers #250

Conversation

PeterF778 commented Feb 29, 2024 • edited

yurishkuro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

PeterF778 commented Mar 4, 2024

jmacd left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmacd Apr 18, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmacd Apr 18, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

PeterF778 commented Feb 29, 2024 •

edited

jmacd Apr 18, 2024 •

edited

jmacd Apr 18, 2024 •

edited