Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: Add sampling rate to distributed tracing #13248

Merged
merged 1 commit into from
Aug 30, 2021

Conversation

lilic
Copy link
Contributor

@lilic lilic commented Jul 28, 2021

In the initial PR we mentioned we should also add sampling rate, as this will allow users to control the overhead by being able to reduce the amount of traces sent.

This PR adds ExperimentalDistributedTracingSamplingRatePerMillion, which is the number of samples to collect per million spans.

It defaults to 0, I wasn't sure if we should default to sample all instead of 0, but this is how the Kubernetes implementation did it so I tried to follow the same principles as there.


const maxSamplingRatePerMillion = 1000000

func (e *Etcd) setupTracingExporter(ctx context.Context) (exporter tracesdk.SpanExporter, options []otelgrpc.Option, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can change this to method that takes ServerConfig as argument. This function doesn't need anything from embed.Etcd than e.cfg. As so we should try to keep is separate from embed.Etcd Law of Demeter and make them separate.

Best even move it to separate package server/embed/tracing this way we can avoid this function reading embed.Etcd private fields and functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the pattern we have with logging here, where we have config_logging.go. But will change that takes config instead, good suggestion!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good point. I also prefer consistent quality, instead of having each module written differently.
But this is a good candiate for a follow up PR. I would be happy to help with that.

if samplingRate == 0 {
return sampler, nil
}
if samplingRate < 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like flag value validation, please move it to embed.Config.Validate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, for both this and the below check right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. Best would be to have a separate configuration struct for tracing that is included in embed.Config. This way we can have a validate function in tracing package that is invoked by embed.Config.Validate (common pattern in K8s code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah agreed its bit different per configuration here. The cfg.setupLogging seems to be called in validate as well. But agreed we can do it down the road in similar way, especially since config is really growing and lacks testing IMO.

@lilic
Copy link
Contributor Author

lilic commented Jul 28, 2021

@serathius let me know what you think, I addressed the suggestions, thanks. Will move some of unit tests cases if this is the approach you thought of.

return exporter, options, err
}

func determineSampler(samplingRate int) (tracesdk.Sampler, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This always returns nil for error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch, removed error returns.

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good aside one small nit.

@lilic
Copy link
Contributor Author

lilic commented Jul 29, 2021

Thanks, separated the unit tests for new function and fixed the nit. Please have another look, thanks!

@lilic lilic requested a review from serathius August 2, 2021 13:51
Copy link
Contributor

@hexfusion hexfusion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few optional notes, otherwise LGTM

server/embed/config_tracing.go Outdated Show resolved Hide resolved
server/etcdmain/help.go Show resolved Hide resolved
server/embed/etcd.go Outdated Show resolved Hide resolved
@lilic
Copy link
Contributor Author

lilic commented Aug 26, 2021

@hexfusion addressed your suggestions, please have another look, thanks!

Copy link
Contributor

@hexfusion hexfusion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one very minor nit then lets merge 👍

server/etcdmain/help.go Outdated Show resolved Hide resolved
ExperimentalDistributedTracingSamplingRatePerMillion is the
number of samples to collect per million spans.
Defaults to 0.
@hexfusion
Copy link
Contributor

Proxy failure appears to be a flake and passed on the previous run (txt change diff)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants