Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WARN datadog.trace.agent.core.propagation.DatadogTagsFactory - Invalid datadog tags header value #5930

Closed
ivankatliarchuk opened this issue Sep 21, 2023 · 23 comments
Labels
comp: logging Tracer internal logging
Milestone

Comments

@ivankatliarchuk
Copy link

ivankatliarchuk commented Sep 21, 2023

Every request where request header contains something like Key - tracestate | value - dd=s:1;o:synthetics;t.dm:--128 logs a message

WARN datadog.trace.agent.core.propagation.DatadogTagsFactory - Invalid datadog tags header value: '_dd.p.dm=--128' invalid tag value at 9

I'm struggling to identify how the t.dm or _dd.p.dm value is actually computed and under which circumstances it could have --. I would assume any request with t.dm:--XXXX could be invalid.

@ivankatliarchuk ivankatliarchuk changed the title WARN datadog.trace.agent.core.propagation.DatadogTagsFactory - Invalid datadog tags header value: '_dd.p.dm=--128' invalid tag value at 9 WARN datadog.trace.agent.core.propagation.DatadogTagsFactory - Invalid datadog tags header value Sep 21, 2023
@bantonsson
Copy link
Contributor

Hi @ivankatliarchuk, the dm=--128, is the internal value external override 0x80 escaping into a header, which shouldn't happen. Could you provide a little bit more information about:

  • The version(s) of the tracer library you are using
  • The propagation style(s) enabled (looks like tracecontext at least)
  • If there are any other languages/runtimes, i.e. node, python, et.c., calling this service.

@ivankatliarchuk
Copy link
Author

Hi @bantonsson . Thanks for looking at it

  1. Java applications affected
The versions of the java tracer library of these apps are:

* 1.20.0~6308c0048e
* 1.2.0~a26b8ca0ff
* 1.20.1~70cd67ce90
* 1.19.0~b023c07015
* 1.18.2~0a1d7c2013

and configuration

{"version":"1.2.0~a26b8ca0ff","os_name":"Linux","os_version":"5.10.186-179.751.amzn2.x86_64","architecture":"amd64","lang":"jvm","lang_version":"17.0.5","jvm_vendor":"Amazon.com Inc.","jvm_version":"17.0.5+8-LTS","java_class_version":"61.0","http_nonProxyHosts":"null","http_proxyHost":"null","enabled":true,"service":"offer-orchestrator","agent_url":"http://10.1.208.123:8126","agent_error":false,"debug":false,"analytics_enabled":true,"sampling_rules":[{},{}],"priority_sampling_enabled":true,"logs_correlation_enabled":true,"profiling_enabled":false,"remote_config_enabled":false,"debugger_enabled":false,"appsec_enabled":"ENABLED_INACTIVE","telemetry_enabled":true,"dd_version":"0.53.0","health_checks_enabled":true,"configuration_file":"no config file present","runtime_id":"d527765f-94b9-4891-b7cd-892e1389e8f5","logging_settings":{"levelInBrackets":false,"dateTimeFormat":"'[dd.trace 'yyyy-MM-dd HH:mm:ss:SSS Z']'","logFile":"System.err","configurationFile":"simplelogger.properties","showShortLogName":false,"showDateTime":true,"showLogName":true,"showThreadName":true,"defaultLogLevel":"INFO","warnLevelString":"WARN","embedException":false},"cws_enabled":false,"cws_tls_refresh":5000}
  1. I may need more information e.g. how to capture tracecontext
    Screenshot 2023-09-22 at 08 08 49

  2. Yes, .NET applications making calls to Java applications

.Net version is package Datadog.Trace --version 2.20.0

We created a ticket in DD Support as well. The possible suggested fix is to update .Net tracer library. DD Support Request.

Let's say we do not have a significant controll over our clients. Is there is a way to fix it in Java library e.g. reject incorrect tracer or smth?

@bantonsson
Copy link
Contributor

@ivankatliarchuk The Java tracer will ignore this part of the trace information, but continue the rest of the trace. The logging is just very noisy and we probably need to fix that going forward.

As for the tracecontext part, since you referenced a header named tracestate I assumed that you had enabled tracecontext style propagation as well as datadog style propagation, but if the service that logs the warnings is running 1.2.0 then that can not be the case. There is probably a header in the requests named x-datadog-tags as well with _dd.p.dm=--128 in it.

@ivankatliarchuk
Copy link
Author

ivankatliarchuk commented Sep 22, 2023

So current DD_* related environment variables in affected (Java) service

DD_STATSD_PORT=8125
DD_AGENT_HOST=10.1.XXX.XX
DD_VERSION=0.53.0
DD_TRACE_ANALYTICS_ENABLED=true
DD_ENV=dev
DD_SERVICE_NAME=offer-orchestrator
DD_TAGS=team:offer
DD_SERVICE=offer-orchestrator
DD_LOGS_INJECTION=true
DD_AGENT_ARG=-javaagent:/dd-java-agent.jar
DD_REMOTE_CONFIG_ENABLED=false
DD_APM_PORT=8126

I have enabled DD_TRACE_DEBUG=true. Would you be able to guide me, which line your are after? For example, every debug log starts with DEBUG datadog.trace.agent.core.SOME_CLASS.

I might need to enable tracecontext as not able to find anything like context in current logs. Do I need to enable it with DD_TRACE_PROPAGATION_STYLE_INJECT=datadog explicitly for debuging purposes?

@bantonsson
Copy link
Contributor

Thanks @ivankatliarchuk that list is great. I don't think there is anything more that you need to get, or enable. There is nothing wrong with this Java tracer installation, and the only way to turn down the logging is to set the property -Ddatadog.slf4j.simpleLogger.defaultLogLevel=ERROR.

@ivankatliarchuk
Copy link
Author

ivankatliarchuk commented Sep 22, 2023

I do not think we are planning to change logger level and have only ERRORs. My understanding, some traces are not sent to Datadog due to unsupported headers. Is there a way to actually fix it or the only way is to supress message with ERROR log level?

@bantonsson
Copy link
Contributor

That logger level is only for the internal Java Tracer logging, and it was a suggestion if the logging is a problem for your service. Unfortunately right now, there is no other way to get rid of those log messages.

As for traces not being sent, the warning you see only means that a part of extra information about the trace is discarded since it is invalid. The rest of the trace will continue to propagate as normal.

@ivankatliarchuk
Copy link
Author

Cool. I let the team now.

@beardpapa
Copy link

beardpapa commented Oct 31, 2023

We've also observed this error starting to populate our java app logs. This is on a vanilla installation using the auto-implementation libraries. Versions are as follows:

  $agent_version = '7.46.0-1', # datadog-agent-7.46.0-1.x86_64.rpm
  $apm_inject_version = '0.10.7-1', # datadog-apm-inject-0.10.7-1.x86_64
  $apm_java_version = '1.18.1-1', # datadog-apm-library-java-1.18.1-1.x86_64
  $apm_js_version = '4.1.0-1', # datadog-apm-library-js-4.1.0-1.x86_64.rpm```

@gaston-haro
Copy link

Similar issue here but after latest release. After 30eeaf5 we see a lot of new WARN level logs from the agent:

[dd.trace yyyy-MM-dd HH:mm:ss:SSS
 0000] [*
] WARN datadog.trace.agent.core.propagation.ptags.W3CPTagsCodec - Invalid datadog tags header value: 'dd=t.dm:3;s:[-1-2]
' invalid tag value at 8

@ahmed-envoy
Copy link

Similar issue here but after latest release. After 30eeaf5 we see a lot of new WARN level logs from the agent:

[dd.trace yyyy-MM-dd HH:mm:ss:SSS
 0000] [*
] WARN datadog.trace.agent.core.propagation.ptags.W3CPTagsCodec - Invalid datadog tags header value: 'dd=t.dm:3;s:[-1-2]
' invalid tag value at 8

We started experiencing this same issue starting with v1.24.1, previously we were using v1.23.0

@AdrianGnlz
Copy link

Similar issue here but after latest release. After 30eeaf5 we see a lot of new WARN level logs from the agent:

[dd.trace yyyy-MM-dd HH:mm:ss:SSS
 0000] [*
] WARN datadog.trace.agent.core.propagation.ptags.W3CPTagsCodec - Invalid datadog tags header value: 'dd=t.dm:3;s:[-1-2]
' invalid tag value at 8

We started experiencing this same issue starting with v1.24.1, previously we were using v1.23.0

Same here, is there any workaround to fix this issue?

@ademar59
Copy link

We begin to have a big number of errror too with agent 1.25.1
Is it something plan?

@kwazii1231
Copy link

we also met same issue.
just to much error with latest agent jar

@bantonsson
Copy link
Contributor

Hi, sorry for not getting back to you all earlier. The reason that this is happening more now, is because the Java Datadog Tracer since version 1.24+ is now parsing the W3C tracecontext headers by default, and there is some Datadog tracer in your system sending out those headers with illegal sample values (that's the s:[-1-2]).

We will implement log throttling for this, but it would be great if you could tell us which language tracers and versions you have in your system.

Copy link
Contributor

🤖 This issue has been addressed in the latest release. See full details in the Release Notes.

@eliebleton-manomano
Copy link

eliebleton-manomano commented Jan 4, 2024

there is some Datadog tracer in your system sending out those headers with illegal sample values (that's the s:[-1-2])

We do not believe the -1 and 2 values to illegal priorities:

They're in use in existing dd-tracers:

dd-trace-py // propagation/http.py

      - ``x-datadog-sampling-priority`` integer representing the sampling decision.
        ``<= 0`` (Reject) or ``> 1`` (Keep)

dd-trace-js // ext/priority.js

  USER_REJECT: -1,
  AUTO_REJECT: 0,
  AUTO_KEEP: 1,
  USER_KEEP: 2

(And most likely more)

Such priorities also have dedicated logic in the trace-agent,code is here.

I hope that the above elements are sufficient to demonstrate that support of -1 and 2 would be beneficial to dd-trace-java.

The word "illegal" you used is normative wording - I'm not aware of any public specification for dd-trace headers. We would certainly benefit from having a specification - not for lack of alternatives, but for clarity and interop.

Cheers,

PS: Furthermore, I believe -2 or 3 should be legal as well, but it has various glitches from trace-agent to tracers that prevent their use. (i.e. who is "USER"? we run systems, we have developers, they're all being confused as "USER" today. It'd be clearer if I could say "-2/3" is from INTAKE_SYSTEM_A, "-3/4" is from COSTCONTROL_SYSTEM_A and keep -2/1 for USER/OPERATOR/DEVELOPER/HUMAN decisions)

@bantonsson
Copy link
Contributor

Hi @eliebleton-manomano, the problem is not the sampling priority -1 or 2 or any other actual number, but the fact that the tracestate contains the sampling priority [-1-2] which is definitely not the string representation of an integer, and an illegal sampling priority.

@eliebleton-manomano
Copy link

Hi @bantonsson! Thanks for the quick reply and sorry for the misunderstanding. [-1-2] is definitely an illegal representation of an integer :)

We are running into issues with this parser this morning and I was too quick to assume this was one and the same problem - turns out it isn't. I'll go on and open a new issue describing what we ran into to reduce confusion.

@eliebleton-manomano
Copy link

turns out it isn't. I'll go on and open a new issue describing what we ran into to reduce confusion.

That was also said too quick ...

We are seing a variant that has not yet been reported in this thread, but it does look quite similar:

WARN datadog.trace.agent.core.propagation.ptags.W3CPTagsCodec - Invalid datadog tags header value: 'dd=t.dm:4;s:-1' invalid tag value at 8

The only possibly helpful thing I can add with any confidence is that the calling system is running dd-trace-js .

I have yet to find out the original system that sent W3C propagation headers in the first place.

@bantonsson
Copy link
Contributor

@eliebleton-manomano The dm part of the tag can consist of {hash}-{mechanism} and if there is no hash then the mechanism should be proceeded by a - sign. I would assume that dd-trace-js is encoding it in a bad way for some reason. Please reach out to them.

@wavermartijn
Copy link

@eliebleton-manomano Have you gotten any success on fixing the issue? We bumped into the same thing when we udated the DataDog agent

@eliebleton-manomano
Copy link

eliebleton-manomano commented Jan 30, 2024

I have not and I have yet to report this in due form to the dd-trace-js folks 😓 (Assuming you indeed ran into specifically the same issue that we did, which appears to come from there). I'd like to point out that this thread is full of similar errors, but they all should be traced to whoever is responsible for sending the invalid tags header value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp: logging Tracer internal logging
Projects
None yet
Development

No branches or pull requests

10 participants