Add "sampling.priority" support to probabilistic sampler #469

Merged

Conversation

pjanotti
Contributor

Adds support for the OpenTracing "sampling.priority" span tag, per the semantic conventions at https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table

@codecov-io

codecov-io commented Dec 17, 2019

Codecov Report

Merging #469 into master will increase coverage by 0.03%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #469      +/-   ##
==========================================
+ Coverage   75.73%   75.76%   +0.03%     
==========================================
  Files         120      120              
  Lines        7347     7386      +39     
==========================================
+ Hits         5564     5596      +32     
- Misses       1518     1523       +5     
- Partials      265      267       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...babilisticsamplerprocessor/probabilisticsampler.go | 96% <100%> (+1.88%) | ⬆️ |
| extension/pprofextension/pprofextension.go | 64% <0%> (-24.89%) | ⬇️ |


@SergeyKanzhelev
Member

CC: @lmolkova @lizthegrey

@lizthegrey
Member

damn, you got my hopes up because I read "sampling.probability" instead of "sampling.priority" ;) will have a look tomorrow.

@lizthegrey lizthegrey self-requested a review December 17, 2019 21:12
@SergeyKanzhelev
Member

@lizthegrey I think the idea is the same, at least based on the description.

@lizthegrey
Member

It's similar, but different. The spec appears to specify ">0" or "0", but it doesn't explicitly state that the non-zero value is the previous sample rate. Nor do we multiply that value by our own sample rate when reporting out sampling rates, nor do we propagate our sample rate as sampling.priority.

Contributor Author

@pjanotti pjanotti left a comment

@lizthegrey if you have a spec or some text for "sampling.probability", I can take a look to see how the collector can implement or participate in that.

@pjanotti pjanotti force-pushed the add-sampling-priority-support branch from bcb8506 to 4b85537 Compare January 3, 2020 16:11
return mustSampleSpan
}
return deferDecision
}

Is this function worth the effort? It only saves 2 lines and adds cognitive overhead. I'm also not sure whether the Go compiler will inline the call.

Contributor Author

The compiler does inline this call, but let me put up the alternative and see what people prefer in general.
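
For readers following the thread, here is a minimal sketch of the decision logic under discussion, assuming the ">0" / "0" reading of the OpenTracing convention quoted earlier in the conversation. The names `deferDecision` and `mustSampleSpan` come from the diff context above; `doNotSampleSpan`, `parsePriority`, and the plain `map[string]interface{}` attribute type are hypothetical stand-ins for illustration and are not the collector's actual API.

```go
package main

import (
	"fmt"
	"strconv"
)

// samplingPriority is the outcome of inspecting the "sampling.priority" tag.
type samplingPriority int

const (
	deferDecision   samplingPriority = iota // no usable tag: let the probabilistic sampler decide
	mustSampleSpan                          // tag > 0: always keep the span
	doNotSampleSpan                         // tag == 0: always drop the span (hypothetical name)
)

// parsePriority maps the "sampling.priority" attribute to a decision.
// attrs is a hypothetical plain map, not the collector's span attribute type.
func parsePriority(attrs map[string]interface{}) samplingPriority {
	raw, ok := attrs["sampling.priority"]
	if !ok {
		return deferDecision
	}

	var value float64
	switch v := raw.(type) {
	case int64:
		value = float64(v)
	case float64:
		value = v
	case string:
		parsed, err := strconv.ParseFloat(v, 64)
		if err != nil {
			return deferDecision // unparsable text is treated as "no tag"
		}
		value = parsed
	default:
		return deferDecision
	}

	switch {
	case value > 0:
		return mustSampleSpan
	case value == 0:
		return doNotSampleSpan
	default:
		return deferDecision // negative values are not covered by the convention
	}
}

func main() {
	fmt.Println(parsePriority(map[string]interface{}{"sampling.priority": int64(1)}) == mustSampleSpan)
	fmt.Println(parsePriority(map[string]interface{}{"sampling.priority": "0"}) == doNotSampleSpan)
	fmt.Println(parsePriority(map[string]interface{}{}) == deferDecision)
}
```

Treating an unparsable or negative value as a deferred decision is a design assumption of this sketch; it keeps malformed tags from silently forcing spans in or out.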

* remove check for "" value for string attribute
* remove local function, duplication seems reasonable
* improve comment about hash_seed usage
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

LGTM from a semantics perspective. I'd like to see more on passing sampling.priority via tracestate, and I left a question about the collector setting sampling.priority when the decision is deferred.

sampling_percentage: 15.3
# hash_seed allows one to configure the hashing seed. This is important in
Member

When the sampling algorithm runs on a span inside the collector, will sampling.priority be set on that span so that downstream components can read it?

Contributor Author

It can be configured separately on the attributesprocessor.

Member

How will the attributesprocessor know the seed value and the resulting hash value calculated from the TraceID?

Contributor Author

@pjanotti pjanotti Jan 7, 2020

The calculated hash is not exposed (though the algorithm is predictable). If desired, the seed can be shared in the configuration as an environment variable so that both processors use the same value. @SergeyKanzhelev, I think I may be missing some context on how you need this to work; if you have more info, let me know.
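
To illustrate why a shared seed is enough for two components to reach the same decision, here is a rough sketch of seeded, hash-based sampling on a trace ID. It is not the collector's implementation: FNV-1a from the Go standard library is swapped in as a stand-in hash, and `shouldSample`, `numBuckets`, and the way the seed is mixed in are all assumptions made for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// numBuckets is a hypothetical resolution for the sampling threshold.
const numBuckets = 1 << 14

// shouldSample hashes the seed together with the trace ID and keeps the span
// when the resulting bucket falls below the configured percentage.
// FNV-1a is a stand-in; the real processor may use a different hash.
func shouldSample(traceID []byte, hashSeed uint32, samplingPercentage float64) bool {
	h := fnv.New32a()

	// Mix the seed in first so that different seeds give different buckets.
	var seed [4]byte
	binary.LittleEndian.PutUint32(seed[:], hashSeed)
	h.Write(seed[:])
	h.Write(traceID)

	bucket := h.Sum32() % numBuckets
	threshold := uint32(samplingPercentage / 100 * numBuckets)
	return bucket < threshold
}

func main() {
	traceID := []byte{0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
		0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10}

	// Two independent components configured with the same seed and percentage
	// compute the same decision without exchanging the hash itself.
	sidecar := shouldSample(traceID, 22, 15.3)
	gateway := shouldSample(traceID, 22, 15.3)
	fmt.Println("decisions agree:", sidecar == gateway)
}
```

Because the function is deterministic in the trace ID, seed, and percentage, any component that knows those three inputs can recompute the decision, which is the sense in which the hash is "predictable" even though it is not exposed.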

Member

sampling.priority is used to communicate the calculated value of a hash, typically from the SDK to a dependent service (via tracestate-like propagation) and to the collector for additional sampling and/or better visualization. If a collector runs as a sidecar and calculates the hash, it may be useful to communicate this information to the collector that actually collects the data, for instance when additional sampling needs to be implemented across the whole cluster.

@pjanotti
Contributor Author

pjanotti commented Jan 7, 2020

@SergeyKanzhelev I created issue #476 to track support for tracestate in the probabilistic sampler.

@pjanotti pjanotti merged commit 60b03d0 into open-telemetry:master Jan 8, 2020
@pjanotti pjanotti deleted the add-sampling-priority-support branch January 8, 2020 16:16
MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021
Signed-off-by: Bogdan Cristian Drutu <bogdandrutu@gmail.com>
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023
Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this pull request Jul 5, 2024
8 participants