
multi-service sampling rules for nginx ingress controller #253

Open
marcportabellaclotet-mt opened this issue Dec 8, 2022 · 14 comments

@marcportabellaclotet-mt

I have been trying to define different sampling rules per k8s nginx ingress, but I have had no success.

What I have tried:
Define sampling rules according to the Datadog docs.
My config:

- name: DD_TRACE_SAMPLING_RULES
  value: '[{"service": "nginx","sample_rate": 0.2},{"service": "nginxdebug","sample_rate": 1}]'

Changing this value (service: nginx) works for all ingresses, but we cannot leverage fine-grained per-service rules as described here.
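
For reference, one way to inject that variable into the controller is via the ingress-nginx Helm chart's controller.extraEnvs value; a minimal sketch, assuming the controller is deployed with the official chart:

controller:
  extraEnvs:
    - name: DD_TRACE_SAMPLING_RULES
      value: '[{"service": "nginx","sample_rate": 0.2},{"service": "nginxdebug","sample_rate": 1}]'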

I have added an annotation to my ingress to override the service name.

nginx.ingress.kubernetes.io/configuration-snippet: |
   opentracing_tag service.name "nginxdebug";
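
Putting the two pieces together, the intended setup looks roughly like this; a sketch where the ingress name, namespace, host, and backend are placeholders:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: debug-app                  # placeholder
  namespace: default               # placeholder
  annotations:
    # Override the service name reported for traces passing through this ingress,
    # so that the "nginxdebug" sampling rule above should apply to them.
    nginx.ingress.kubernetes.io/configuration-snippet: |
      opentracing_tag service.name "nginxdebug";
spec:
  ingressClassName: nginx
  rules:
    - host: debug.example.com      # placeholder
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: debug-app    # placeholder
                port:
                  number: 80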

Is this setup possible?
Thanks

@marcportabellaclotet-mt
Author

marcportabellaclotet-mt commented Dec 8, 2022

I was able to configure different rules by setting:

nginx.ingress.kubernetes.io/server-snippet: |
   opentracing_load_tracer /usr/local/lib/libdd_opentracing.so /etc/nginx/opentracing-debug.json;

And creating a new tracer configuration file (/etc/nginx/opentracing-debug.json) in the ingress controller:

{
  "service": "nginxdebug",
  "agent_host": "datadog.datadog.svc.cluster.local",
  "agent_port": 8126,
  "environment": "prod",
  "operation_name_override": "nginx.handle",
  "sample_rate": 1,
  "dd.priority.sampling": true
}
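
One way to get that file into the controller container is to mount it from a ConfigMap, for example via the Helm chart's controller.extraVolumes / controller.extraVolumeMounts values; a sketch, where the ConfigMap name and namespace are placeholders:

apiVersion: v1
kind: ConfigMap
metadata:
  name: opentracing-debug-config    # placeholder
  namespace: ingress-nginx          # placeholder
data:
  opentracing-debug.json: |
    {
      "service": "nginxdebug",
      "agent_host": "datadog.datadog.svc.cluster.local",
      "agent_port": 8126,
      "environment": "prod",
      "operation_name_override": "nginx.handle",
      "sample_rate": 1,
      "dd.priority.sampling": true
    }

And in the chart values:

controller:
  extraVolumes:
    - name: opentracing-debug
      configMap:
        name: opentracing-debug-config
  extraVolumeMounts:
    - name: opentracing-debug
      mountPath: /etc/nginx/opentracing-debug.json
      subPath: opentracing-debug.json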

@marcportabellaclotet-mt marcportabellaclotet-mt changed the title Allow to define different sample rules to each kubernetes nginx ingress Allow to define different sample rules in nginx using the same opentracing_load_tracer config Dec 8, 2022
@marcportabellaclotet-mt
Author

Could you please share an nginx setup where DD_TRACE_SAMPLING_RULES is used with different services?
I am not sure I understand how DD_TRACE_SAMPLING_RULES works when multiple services are involved.
Is DD_TRACE_SAMPLING_RULES able to apply per-service configuration when opentracing_tag "service.name" is set in the nginx config?

@marcportabellaclotet-mt
Author

ingress-nginx is using an outdated version of opentracing-cpp (v0.19.0).
Could this be the reason why the sampling rules are not working?

@dgoffredo
Contributor

Hi, Marc.

I don't have an example of using DD_TRACE_SAMPLING_RULES (or the equivalent dd-config.json) to configure sampling rules for multiple services in the nginx ingress controller for kubernetes.

It would be a nice example to have, though.

I looked through our code based on what you are trying to do, and noticed a few things:

  • The ability to set the trace's service by setting the "service.name" tag was added in dd-opentracing-cpp version v1.3.2. If you share your ingress-nginx version, then I can check the corresponding dd-opentracing-cpp version.
  • I think that your original technique, with the environment variable, should work, but I have not tested it.
  • Your second technique might not work, because it could cause the plugin to be loaded more than once, which is not supported.

The first thing to do is get details about the software versions that you're using.

If based on that it's "supposed to work," then I can create a reproduction on a test kubernetes cluster. However, I can't promise I would be able to work on that anytime soon.

@marcportabellaclotet-mt
Author

Thanks for the fast response.
I confirm that I am using the latest nginx ingress Helm chart version, 4.4.0, which uses DATADOG_CPP_VERSION=1.3.2.
I compiled a new version of the ingress controller rootfs using version 1.3.6, and after testing it I see the same behavior described in the messages above: using opentracing_tag service.name xxx does not work.

I have run nginx -s reload to verify which version is being used and what the Datadog tracer configuration is:

nginx: [warn] the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /etc/nginx/nginx.conf:150
info: DATADOG TRACER CONFIGURATION - {"agent_url":"http://datadog.datadog.svc.cluster.local:8126","analytics_enabled":false,"analytics_sample_rate":null,"date":"2022-12-08T22:47:58+0000","enabled":true,"env":"prod","lang":"cpp","lang_version":"201402","operation_name_override":"nginx.handle","report_hostname":false,"sampling_rules":"[{\"service\": \"nginx\",\"sample_rate\": 0.2},{\"service\": \"nginxdebug\",\"sample_rate\": 1}]","service":"nginx","version":"v1.3.6"}

Thanks for your time; it would be great if you could do some testing on kubernetes whenever you have time.
On the other hand, it seems that the nginx ingress project is moving to OpenTelemetry soon.

@marcportabellaclotet-mt
Author

marcportabellaclotet-mt commented Dec 8, 2022

Is it possible that DD_TRACE_SAMPLING_RULES is checking the root span's service name, and that setting service.name as an opentracing tag only overrides the child span? Looking at the flame graph when the service.name tag is used:

[screenshot: flame graph with the service.name tag set]

I have also tested using "opentracing_trace_locations off;", which merges the nginx services into the same span.

[screenshot: flame graph with opentracing_trace_locations off; only the nginxdebug service appears]

@dgoffredo
Contributor

Ooo, look at that.

Yes, opentracing_tag service.name will set the service name on the "current span," which, if opentracing_trace_locations is on, means the location span. The outer request span will be unaffected. It might depend on where in the nginx configuration opentracing_tag is called: in a location block it will certainly refer to the location span; perhaps in the enclosing server block it would instead refer to the request span, I'm not sure.

In your second screenshot there is only the nginxdebug service, which is promising. If you are able to configure the ingress controller that way (opentracing_trace_locations off), maybe the sample rate for nginxdebug will be as you specified in the sampling rules.
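
Concretely, something along these lines is what I have in mind; an untested sketch using a server-snippet, so that the directives apply at the server level:

nginx.ingress.kubernetes.io/server-snippet: |
  # Disable per-location spans so there is a single span per request...
  opentracing_trace_locations off;
  # ...and set the service name on that span.
  opentracing_tag service.name "nginxdebug";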

"On the other hand, it seems that the nginx ingress project is moving to OpenTelemetry soon."

We have yet to decide how we'll continue supporting the ingress controller: OpenTelemetry-only, Datadog-specific module, etc. For now we continue to maintain this OpenTracing-based plugin.

@dgoffredo
Contributor

Might as well keep this open. It's something I'd like to support.

@dgoffredo dgoffredo reopened this Dec 9, 2022
@dgoffredo dgoffredo changed the title Allow to define different sample rules in nginx using the same opentracing_load_tracer config multi-service sampling rules for nginx ingress controller Dec 9, 2022
@marcportabellaclotet-mt
Author

marcportabellaclotet-mt commented Dec 10, 2022

I have been testing defining opentracing_tag service.name at the server level instead of the location level, and it does not help.

nginx.ingress.kubernetes.io/server-snippet: |
  opentracing_tag service.name "nginxdebug";

Also using opentracing_trace_locations off; does not help.

I was also trying another approach: using name instead of service in the sampling rules.
However, I could not find a way to override the value defined in operation_name_override.
None of these:

        opentracing_tag resource.name "debug";
        opentracing_operation_name "debug";
        opentracing_location_operation_name "debug";

changes the root resource name (it is still the one defined in operation_name_override, or its default nginx.handle):

[screenshot: flame graph with the root resource name still nginx.handle]
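
The kind of name-based rule I was aiming for would look roughly like this; a sketch, and it only helps if the operation name rename above actually takes effect:

- name: DD_TRACE_SAMPLING_RULES
  value: '[{"name": "debug","sample_rate": 1},{"name": "nginx.handle","sample_rate": 0.2}]'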

Thanks!

@marcportabellaclotet-mt
Author

Is there any update on this topic?

@dgoffredo
Contributor

dgoffredo commented Mar 13, 2023

None, I'm afraid. My time has been spent on getting the new tracing library into Envoy and on other internal designs.

Work is planned to add the new tracing library to nginx-datadog, and at that point, if not sooner, I'll revisit this idea of configuring sampling in a more fine-grained way.

Sorry for the delay.

@marcportabellaclotet-mt
Author

Thanks for the update.

@dgoffredo
Contributor

I should also point out that getting the new code into the ingress controller is a separate project, also planned, but one that will take longer.

Ideally we'd figure out a way to configure your existing system to do what you want, but I haven't spent time on it yet this year.

@marcportabellaclotet-mt
Author

That would be great. Thanks again!
