Add "sampling.priority" support to probabilistic sampler #469

pjanotti · 2019-12-17T16:31:16Z

Adds support for the OpenTracing "sampling.priority" tag, per the semantic conventions at https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table
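For readers skimming the thread, here is a minimal sketch of how a "sampling.priority" tag could override a probabilistic decision. It is not the code from this PR. The names `mustSampleSpan` and `deferDecision` appear in the diff quoted further down; `doNotSampleSpan`, `parsePriority`, `keepSpan`, and the `map[string]int64` stand-in for span attributes are assumptions made for illustration, as is treating a negative value as "defer".

```go
package main

import "fmt"

// samplingPriority is a three-way result. mustSampleSpan and deferDecision
// are names seen in the PR diff; doNotSampleSpan is assumed for this sketch.
type samplingPriority int

const (
	deferDecision samplingPriority = iota
	mustSampleSpan
	doNotSampleSpan
)

// parsePriority maps the "sampling.priority" tag to a decision.
// attrs is a simplified, hypothetical stand-in for span attributes.
func parsePriority(attrs map[string]int64) samplingPriority {
	v, ok := attrs["sampling.priority"]
	switch {
	case !ok:
		return deferDecision // tag absent: use the default sampling mechanism
	case v > 0:
		return mustSampleSpan // hint to capture the trace
	case v == 0:
		return doNotSampleSpan // hint not to capture the trace
	default:
		return deferDecision // negative values: assumed here to defer
	}
}

// keepSpan applies the tag first and falls back to the probabilistic result.
func keepSpan(attrs map[string]int64, probabilisticKeep bool) bool {
	switch parsePriority(attrs) {
	case mustSampleSpan:
		return true
	case doNotSampleSpan:
		return false
	default:
		return probabilisticKeep
	}
}

func main() {
	fmt.Println(keepSpan(map[string]int64{"sampling.priority": 1}, false))
	fmt.Println(keepSpan(map[string]int64{"sampling.priority": 0}, true))
	fmt.Println(keepSpan(map[string]int64{}, true))
}
```

The point of the three-way result is that the tag only wins when it is present; otherwise the hash-based decision stands.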

codecov-io · 2019-12-17T16:52:54Z

Codecov Report

Merging #469 into master will increase coverage by 0.03%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #469      +/-   ##
==========================================
+ Coverage   75.73%   75.76%   +0.03%     
==========================================
  Files         120      120              
  Lines        7347     7386      +39     
==========================================
+ Hits         5564     5596      +32     
- Misses       1518     1523       +5     
- Partials      265      267       +2

Impacted Files	Coverage Δ
...babilisticsamplerprocessor/probabilisticsampler.go	`96% <100%> (+1.88%)`	⬆️
extension/pprofextension/pprofextension.go	`64% <0%> (-24.89%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c300f13...62f961e.

SergeyKanzhelev · 2019-12-17T21:11:25Z

CC: @lmolkova @lizthegrey

lizthegrey · 2019-12-17T21:12:42Z

damn, you got my hopes up because I read "sampling.probability" instead of "sampling.priority" ;) will have a look tomorrow.

SergeyKanzhelev · 2019-12-17T22:52:11Z

@lizthegrey I think the idea is the same. At least based on description.

lizthegrey · 2019-12-17T22:56:06Z

It's similar, but different. The spec appears to specify ">0" or "0", but doesn't explicitly state that the non-zero value is the previous sample rate. Nor do we use the sample rate here to multiply by our sample rate when reporting out sampling rates, nor do we propagate our sample rate as sampling.priority.
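To make the distinction concrete, here is a rough sketch of the rate bookkeeping described above, which this PR does not do. The function names and the example rates are hypothetical; the only idea taken from the comment is that an upstream sample rate would be multiplied by the local one when reporting.

```go
package main

import "fmt"

// effectiveRate composes an upstream sampling rate with a local one.
// Both arguments are probabilities in (0, 1]. Hypothetical helper.
func effectiveRate(upstream, local float64) float64 {
	return upstream * local
}

// weight is the number of original spans that one kept span stands for.
func weight(rate float64) float64 {
	return 1 / rate
}

func main() {
	upstream := 0.10 // assumed: an SDK already kept 1 span in 10
	local := 0.50    // assumed: this hop keeps half of what it receives
	r := effectiveRate(upstream, local)
	fmt.Printf("effective rate %.2f, one kept span stands for about %.0f spans\n", r, weight(r))
}
```

A boolean-style "sampling.priority" carries none of this: it says keep or drop, not how many spans a kept span represents.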

pjanotti

@lizthegrey if you have some spec/text for "sampling.probability" I can take a look to see how the collector can implement or participate in that.

fbogsany · 2020-01-07T18:42:06Z

processor/samplingprocessor/probabilisticsamplerprocessor/probabilisticsampler.go

+			return mustSampleSpan
+		}
+		return deferDecision
+	}


Is this function worth the effort? It only saves 2 lines and adds cognitive overhead. I'm also not sure whether the Go compiler will inline the call.

The compiler does inline this call, but let me put up the alternative and see what people prefer in general.

SergeyKanzhelev

LGTM from semantics perspective. Want to see more on passing sampling.priority via tracestate and left a question about setting sampling.priority by collector in case of defer decision.

SergeyKanzhelev · 2020-01-07T20:25:42Z

processor/samplingprocessor/probabilisticsamplerprocessor/testdata/config.yaml

    sampling_percentage: 15.3
+    # hash_seed allows one to configure the hashing seed. This is important in


When the sampling algo works on span inside the collector - will the sampling.priority be set on it for downstream to be able to read it?

It can be configured separately on the attributesprocessor.

How will the attributesprocessor know the seed value and the resulting hash value calculated from the TraceID?

The calculated hash is not exposed (it is a predictable algo though). If desired, the seed can be shared in the configuration as an environment variable so both processors can use the same value. @SergeyKanzhelev I think that perhaps I'm missing some context on how you need this to work; if you have more info let me know.

sampling.priority is used to communicate the calculated value of a hash, typically from an SDK to a dependent service (via tracestate-like propagation) and to the collector for additional sampling and/or better visualization. If a collector is running as a sidecar and calculates the hash, it may be useful to communicate this information to the collector that is actually collecting data, for instance in case additional sampling needs to be implemented on the whole cluster.
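As a rough illustration of why a shared seed matters in this exchange: if two components hash the same TraceID with the same seed, they reach the same decision without passing the hash along. The sketch below assumes FNV-1a from the Go standard library as a stand-in hash (not necessarily what the collector uses), and `bucketCount`, `bucket`, and `sampled` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// bucketCount is an assumed resolution for the hash space.
const bucketCount = 10000

// bucket hashes the seed followed by the trace ID into [0, bucketCount).
// FNV-1a is a stand-in chosen for this sketch only.
func bucket(seed uint32, traceID []byte) uint32 {
	var s [4]byte
	binary.LittleEndian.PutUint32(s[:], seed)
	h := fnv.New32a()
	h.Write(s[:])    // hash.Hash writes never return an error
	h.Write(traceID)
	return h.Sum32() % bucketCount
}

// sampled reports whether the trace falls below the percentage threshold.
func sampled(seed uint32, traceID []byte, percentage float64) bool {
	threshold := uint32(percentage / 100 * bucketCount)
	return bucket(seed, traceID) < threshold
}

func main() {
	traceID := []byte{0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6,
		0xa3, 0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36}

	// Same seed on both hops: the decisions agree by construction.
	sidecar := sampled(22, traceID, 15.3)
	gateway := sampled(22, traceID, 15.3)
	fmt.Println("same seed agrees:", sidecar == gateway)

	// Different seeds generally land the trace in different buckets.
	fmt.Println("buckets:", bucket(22, traceID), bucket(23, traceID))
}
```

Sharing the seed reproduces the decision, but it does not tell a downstream component what the upstream percentage was, which is the gap the tracestate discussion is about.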

pjanotti · 2020-01-07T20:36:25Z

@SergeyKanzhelev created issue #476 to track support for tracestate in the probabilistic sampler

pjanotti requested review from bogdandrutu, flands, owais, rghetia, songy23 and tigrannajaryan as code owners December 17, 2019 16:31

pjanotti self-assigned this Dec 17, 2019

pjanotti commented Dec 17, 2019

ibawt reviewed Dec 17, 2019

SergeyKanzhelev reviewed Dec 17, 2019

lizthegrey self-requested a review December 17, 2019 21:12

pjanotti commented Jan 3, 2020

Paulo Janotti added 4 commits January 3, 2020 08:05

Add "sampling.priority" support to probabilistic sampler

3d0d247

Updating per PR comments

e53e505

Follow up to PR review comments

Add local func to avoid small code duplication

eaa933e

Fix gofmt per latest lint

4b85537

This only failed after rebase and `make install-tools`

pjanotti force-pushed the add-sampling-priority-support branch from bcb8506 to 4b85537 Compare January 3, 2020 16:11

tigrannajaryan reviewed Jan 7, 2020

Improve comments: clarify "sampled" and sampling.priority overrides hashing trace id.

9c17e1e

fbogsany reviewed Jan 7, 2020

PR feedback

62f961e

* remove check for "" value for string attribute
* remove local function, duplication seems reasonable
* improve comment about hash_seed usage

SergeyKanzhelev approved these changes Jan 7, 2020

tigrannajaryan approved these changes Jan 7, 2020

owais approved these changes Jan 8, 2020

pjanotti merged commit 60b03d0 into open-telemetry:master Jan 8, 2020

pjanotti deleted the add-sampling-priority-support branch January 8, 2020 16:16

pjanotti mentioned this pull request Jan 13, 2020

Add way to communicate hash calculated by probabilistic sampler #499

Closed

MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021

Call Gosched if load an unmapped record (open-telemetry#469)

2584c3e

Signed-off-by: Bogdan Cristian Drutu <bogdandrutu@gmail.com>

hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023

Use internal pipelines for collector prometheus metrics (open-telemetry#469)

13989c6

Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this pull request Jul 5, 2024

Correct the doc about that approver is not expected to merge PR (open-telemetry#469)

47c0599
