Introduce sampling package as reference implementation for OTEP 235 (open-telemetry#29720)

**Description:** This is the `pkg/sampling` portion of
open-telemetry#24811.

**Link to tracking Issue:** 
open-telemetry#29738

open-telemetry/opentelemetry-specification#1413

**Testing:** Complete.

**Documentation:** New README added.

---------

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
Co-authored-by: Kent Quirk <kentquirk@gmail.com>
3 people authored and anthoai97 committed Feb 12, 2024
1 parent 5b57deb commit 158b707
Showing 24 changed files with 2,271 additions and 0 deletions.
27 changes: 27 additions & 0 deletions .chloggen/add_pkg_sampling.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: new_component

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg_sampling

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Package of code for parsing OpenTelemetry tracestate probability sampling fields.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [29738]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [api]
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -136,6 +136,7 @@ pkg/ottl/ @open-telemetry/collect
pkg/pdatatest/ @open-telemetry/collector-contrib-approvers @djaglowski @fatsheep9146
pkg/pdatautil/ @open-telemetry/collector-contrib-approvers @dmitryax
pkg/resourcetotelemetry/ @open-telemetry/collector-contrib-approvers @mx-psi
pkg/sampling/ @open-telemetry/collector-contrib-approvers @jmacd @kentquirk
pkg/stanza/ @open-telemetry/collector-contrib-approvers @djaglowski
pkg/translator/azure/ @open-telemetry/collector-contrib-approvers @open-telemetry/collector-approvers @atoulme @cparkins
pkg/translator/jaeger/ @open-telemetry/collector-contrib-approvers @open-telemetry/collector-approvers @frzifus
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yaml
@@ -133,6 +133,7 @@ body:
- pkg/pdatatest
- pkg/pdatautil
- pkg/resourcetotelemetry
- pkg/sampling
- pkg/stanza
- pkg/translator/azure
- pkg/translator/jaeger
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yaml
@@ -127,6 +127,7 @@ body:
- pkg/pdatatest
- pkg/pdatautil
- pkg/resourcetotelemetry
- pkg/sampling
- pkg/stanza
- pkg/translator/azure
- pkg/translator/jaeger
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/other.yaml
@@ -127,6 +127,7 @@ body:
- pkg/pdatatest
- pkg/pdatautil
- pkg/resourcetotelemetry
- pkg/sampling
- pkg/stanza
- pkg/translator/azure
- pkg/translator/jaeger
1 change: 1 addition & 0 deletions pkg/sampling/Makefile
@@ -0,0 +1 @@
include ../../Makefile.Common
23 changes: 23 additions & 0 deletions pkg/sampling/README.md
@@ -0,0 +1,23 @@
# pkg/sampling

## Overview

This package contains utilities for parsing and interpreting the W3C
[TraceState](https://www.w3.org/TR/trace-context/#tracestate-header)
and all sampling-relevant fields specified by OpenTelemetry that may
be found in the OpenTelemetry section of the W3C TraceState.

This package implements the draft specification in [OTEP
235](https://github.com/open-telemetry/oteps/pull/235), which
specifies two fields used by the OpenTelemetry consistent probability
sampling scheme.

These are:

- `th`: the Threshold used to determine whether a TraceID is sampled
- `rv`: an explicit randomness value, which overrides randomness in the TraceID

[OTEP 235](https://github.com/open-telemetry/oteps/pull/235) contains
details on how to interpret these fields. They are not meant to be
human readable, with a few exceptions. For example, the tracestate
entry `ot=th:0` indicates 100% sampling.
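
The arithmetic behind `th` can be illustrated with a minimal sketch. It assumes the encoding described in OTEP 235, where the T-value is a rejection threshold of up to 14 hexadecimal digits (56 bits) with trailing zeros omitted. The helper `tvalueToProbability` is hypothetical and is not part of this package's API.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// numHexDigits is the width of a full 56-bit threshold in hex digits.
const numHexDigits = 14

// tvalueToProbability is a hypothetical helper (not part of pkg/sampling).
// It assumes the T-value is a 56-bit rejection threshold with trailing
// zeros omitted, and returns the sampling probability that it implies.
func tvalueToProbability(tvalue string) (float64, error) {
	if len(tvalue) == 0 || len(tvalue) > numHexDigits {
		return 0, fmt.Errorf("invalid T-value length: %q", tvalue)
	}
	// Restore the omitted trailing zeros: "8" becomes "80000000000000".
	padded := tvalue + strings.Repeat("0", numHexDigits-len(tvalue))
	rejected, err := strconv.ParseUint(padded, 16, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid T-value %q: %w", tvalue, err)
	}
	// Of the 2^56 possible randomness values, `rejected` are dropped.
	const total = 1 << 56
	return float64(total-rejected) / float64(total), nil
}

func main() {
	for _, tv := range []string{"0", "8", "c"} {
		p, err := tvalueToProbability(tv)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("ot=th:%s -> probability %g\n", tv, p)
	}
}
```

Under these assumptions `th:0` rejects nothing, `th:8` rejects half of the randomness space, and `th:c` rejects three quarters of it. The sketch does not enforce lowercase hex digits, which a strict parser would.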
125 changes: 125 additions & 0 deletions pkg/sampling/common.go
@@ -0,0 +1,125 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package sampling // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/sampling"

import (
	"errors"
	"io"
	"strings"

	"go.uber.org/multierr"
)

// KV represents a key-value parsed from a section of the TraceState.
type KV struct {
	Key   string
	Value string
}

var (
	// ErrTraceStateSize is returned when a TraceState is over its
	// size limit, as specified by W3C.
	ErrTraceStateSize = errors.New("invalid tracestate size")
)

// keyValueScanner defines distinct scanner behaviors for lists of
// key-values.
type keyValueScanner struct {
	// maxItems is 32 or -1
	maxItems int
	// trim is set if OWS (optional whitespace) should be removed
	trim bool
	// separator is , or ;
	separator byte
	// equality is = or :
	equality byte
}

// commonTraceState is embedded in both W3C and OTel trace states.
type commonTraceState struct {
	kvs []KV
}

// ExtraValues returns additional values that are carried in this
// tracestate object (W3C or OpenTelemetry).
func (cts commonTraceState) ExtraValues() []KV {
	return cts.kvs
}

// trimOws removes optional whitespace on both ends of a string.
// This uses the strict definition for optional whitespace given
// in https://www.w3.org/TR/trace-context/#tracestate-header-field-values
func trimOws(input string) string {
	return strings.Trim(input, " \t")
}

// scanKeyValues is common code to scan either W3C or OTel tracestate
// entries, as parameterized in the keyValueScanner struct.
func (s keyValueScanner) scanKeyValues(input string, f func(key, value string) error) error {
	var rval error
	items := 0
	for input != "" {
		items++
		if s.maxItems > 0 && items >= s.maxItems {
			// W3C specifies max 32 entries, tested here
			// instead of via the regexp.
			return ErrTraceStateSize
		}

		sep := strings.IndexByte(input, s.separator)

		var member string
		if sep < 0 {
			member = input
			input = ""
		} else {
			member = input[:sep]
			input = input[sep+1:]
		}

		if s.trim {
			// Trim only required for W3C; OTel does not
			// specify whitespace for its value encoding.
			member = trimOws(member)
		}

		if member == "" {
			// W3C allows empty list members.
			continue
		}

		eq := strings.IndexByte(member, s.equality)
		if eq < 0 {
			// We expect to find the `s.equality`
			// character in this string because we have
			// already validated the whole input syntax
			// before calling this parser. I.e., this can
			// never happen, and if it did, the result
			// would be to skip malformed entries.
			continue
		}
		if err := f(member[:eq], member[eq+1:]); err != nil {
			rval = multierr.Append(rval, err)
		}
	}
	return rval
}

// serializer assists with checking and combining errors from
// (io.StringWriter).WriteString().
type serializer struct {
	writer io.StringWriter
	err    error
}

// write handles errors from io.StringWriter.
func (ser *serializer) write(str string) {
	_, err := ser.writer.WriteString(str)
	ser.check(err)
}

// check handles errors (e.g., from another serializer).
func (ser *serializer) check(err error) {
	ser.err = multierr.Append(ser.err, err)
}
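
The two scanner configurations described above (`,` with `=` and whitespace trimming for W3C entries, `;` with `:` for the OpenTelemetry value) can be illustrated with a simplified, standalone sketch. The `pair` type and `splitPairs` function are hypothetical stand-ins written for this example; they omit the size limit and error aggregation of the real `scanKeyValues`.

```go
package main

import (
	"fmt"
	"strings"
)

// pair is a hypothetical stand-in for the package's KV type.
type pair struct {
	Key   string
	Value string
}

// splitPairs is a simplified illustration of two-level tracestate
// scanning. It splits input on separator, optionally trims spaces and
// tabs, and splits each member at the first equality string. Empty or
// malformed members are skipped.
func splitPairs(input, separator, equality string, trim bool) []pair {
	var out []pair
	for _, member := range strings.Split(input, separator) {
		if trim {
			member = strings.Trim(member, " \t")
		}
		key, value, found := strings.Cut(member, equality)
		if member == "" || !found {
			continue
		}
		out = append(out, pair{Key: key, Value: value})
	}
	return out
}

func main() {
	// Outer level: W3C list members, separated by "," with "=".
	header := "ot=th:8;rv:abcdef01234567, vendor=opaque"
	for _, entry := range splitPairs(header, ",", "=", true) {
		fmt.Printf("W3C entry %q = %q\n", entry.Key, entry.Value)
		if entry.Key != "ot" {
			continue
		}
		// Inner level: OpenTelemetry fields, separated by ";" with ":".
		for _, field := range splitPairs(entry.Value, ";", ":", false) {
			fmt.Printf("  OTel field %q = %q\n", field.Key, field.Value)
		}
	}
}
```

Splitting at the first equality character matters at the outer level, because the `ot` value itself contains `:` and `;` characters that only the inner pass should interpret.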
89 changes: 89 additions & 0 deletions pkg/sampling/doc.go
@@ -0,0 +1,89 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

// # TraceState representation
//
// A [W3CTraceState] object parses and stores the OpenTelemetry
// tracestate field and any other fields that are present in the
// W3C tracestate header, part of the [W3C tracecontext specification].
//
// An [OpenTelemetryTraceState] object parses and stores fields of
// the OpenTelemetry-specific tracestate field, including those recognized
// for probability sampling and any other fields that are present. The
// syntax of the OpenTelemetry field is specified in [Tracestate handling].
//
// The probability sampling-specific fields used here are specified in
// [OTEP 235]. The principal named fields are:
//
//   - T-value: The sampling rejection threshold, expressed as a 56-bit
//     hexadecimal number that counts the traces sampling will reject.
//   - R-value: The sampling randomness value, which can be implicit in
//     a TraceID; otherwise it is explicitly encoded as an R-value.
//
// # Low-level types
//
// The three key data types implemented in this package represent sampling
// decisions.
//
// - [Threshold]: Represents an exact sampling probability.
// - [Randomness]: Randomness used for sampling decisions.
// - [Threshold.Probability]: a float64 in the range [MinSamplingProbability, 1.0].
//
// # Example use-case
//
// To configure a consistent tail sampler in an OpenTelemetry
// Collector using a fixed probability for all traces in an
// "equalizing" arrangement, where the effect of sampling is
// conditioned on how much sampling has already taken place, use the
// following pseudocode.
//
//	func Setup() {
//		// Get a fixed probability value from the configuration, in
//		// the range (0, 1].
//		probability := *FLAG_probability
//
//		// Calculate the sampling threshold from probability using 3
//		// hex digits of precision.
//		fixedThreshold, err = ProbabilityToThresholdWithPrecision(probability, 3)
//		if err != nil {
//			// error case: Probability is not valid.
//		}
//	}
//
//	func MakeDecision(tracestate string, tid TraceID) bool {
//		// Parse the incoming tracestate
//		ts, err := NewW3CTraceState(tracestate)
//		if err != nil {
//			// error case: Tracestate is ill-formed.
//		}
//		// For an absolute probability sample, we check the incoming
//		// tracestate to see whether it was already sampled enough.
//		if len(ts.OTelValue().TValue()) != 0 {
//			// If the incoming tracestate was already sampled at
//			// least as much as our threshold implies, then its
//			// (rejection) threshold is higher. If so, then no
//			// further sampling is called for.
//			if ThresholdGreater(ts.OTelValue().TValueThreshold(), fixedThreshold) {
//				return true
//			}
//		}
//		var rnd Randomness
//		// If the R-value is present, use it. If not, rely on TraceID
//		// randomness. Note that OTLP v1.1.0 introduces a new Span flag
//		// to convey trace randomness correctly, and if the context has
//		// neither the randomness bit set nor the R-value set, we need a
//		// fallback, which can be to synthesize an R-value or to assume
//		// the TraceID has sufficient randomness. This detail is left
//		// out of scope.
//		if rval, hasRval := ts.OTelValue().RValueRandomness(); hasRval {
//			rnd = rval
//		} else {
//			rnd = TraceIDToRandomness(tid)
//		}
//
//		return fixedThreshold.ShouldSample(rnd)
//	}
//
// [W3C tracecontext specification]: https://www.w3.org/TR/trace-context/#tracestate-header
// [Tracestate handling]: https://opentelemetry.io/docs/specs/otel/trace/tracestate-handling/
// [OTEP 235]: https://github.com/open-telemetry/oteps/pull/235
package sampling // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/sampling"
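
The final step of the pseudocode, comparing randomness against a rejection threshold, can be sketched without the package API. This is a minimal illustration under two assumptions taken from OTEP 235 rather than from this diff: implicit randomness is the 56 least-significant bits of the TraceID, and a trace is kept when its randomness is greater than or equal to the threshold. The names `traceIDRandomness` and `shouldSample` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// TraceID is a 16-byte trace identifier.
type TraceID [16]byte

// randomnessMask selects the low 56 bits of a 64-bit value.
const randomnessMask = (uint64(1) << 56) - 1

// traceIDRandomness is a hypothetical helper. It assumes the implicit
// randomness is the 56 least-significant bits (last 7 bytes) of the
// TraceID.
func traceIDRandomness(tid TraceID) uint64 {
	return binary.BigEndian.Uint64(tid[8:]) & randomnessMask
}

// shouldSample is a hypothetical helper. It assumes a trace is kept
// when its randomness is not below the rejection threshold.
func shouldSample(threshold, randomness uint64) bool {
	return randomness >= threshold
}

func main() {
	// A threshold of 0x80000000000000 rejects half of the 2^56 values.
	const half = uint64(0x80000000000000)

	// Byte 8 lies outside the low 56 bits and is masked away.
	tid := TraceID{8: 0xff, 9: 0x80}
	rnd := traceIDRandomness(tid)
	fmt.Printf("randomness=%014x sampled=%t\n", rnd, shouldSample(half, rnd))
}
```

Because every sampler compares the same randomness value against its own threshold, a trace kept at a lower probability is also kept by any sampler using a higher probability, which is what makes the scheme consistent.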
