-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenTelemetry API #2332
OpenTelemetry API #2332
Conversation
…lemetry-api # Conflicts: # ext/configuration.h
…g format This will be required for the display of OpenTelemetry array attributes. Note that this also removes the parenthesis from boolean/null values in order to be consistent with other languages, i.e. system-tests. Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
…ure/opentelemetry-api
# Conflicts: # ext/ddtrace_arginfo.h # ext/serializer.c # src/Integrations/Integrations/PHPRedis/PHPRedisIntegration.php # tests/ext/sandbox/safe_to_string_metadata.phpt
if (str.s && ZSTR_VAL(str.s)[ZSTR_LEN(str.s) - 1] == ',') { | ||
ZSTR_VAL(str.s)[ZSTR_LEN(str.s) - 1] = '\0'; | ||
ZSTR_LEN(str.s)--; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: This is really caused by bad handling of the trailing comma in distributed tracing parsing and has been fixed ... on the span links branch.
src/DDTrace/OpenTelemetry/Span.php
Outdated
|
||
public function hasEnded(): bool | ||
{ | ||
return $this->hasEnded; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentionally just checking for hasEnded and not getDuration() != 0, which is only triggered when Context::current is called?
start_span();
$otelSpan = Context::current();
close_span();
var_dump($otelSpan->hasEnded()); // false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a lovely counter-example 👍 Yes, this was intentional - I didn't want to use getDuration
and rely on an instance variable, although this is an O(1)
(only because I found it to be prettier). But, as you showed, it only works if the context is updated.
PS: I guess $this->endEpochNanos
is also wrong (when calling $span->toSpanData()
) in this example because of this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking about toSpanData
, that just made me realized that... the attributes may be wrong if you set/unset attributes from the DD-span...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially if you unset
…into alex/feature/opentelemetry-api
src/DDTrace/OpenTelemetry/Span.php
Outdated
$this->context->getTraceState() | ||
); | ||
|
||
$this->spanProcessor->onEnd($this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this:
- Given that endOTelSpan is only called on Context::current(), if there's no further access to Context::current() after terminating the span, the spanProcessor->onEnd() will not be called
- this only runs for datadog-only Spans which happen to be accessed at least once by a Context::current(), right?
- I'm not sure what the best behaviour is, maybe, for consistency exclusively run it on OTel managed spans?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that endOTelSpan is only called on Context::current(), if there's no further access to Context::current() after terminating the span, the spanProcessor->onEnd() will not be called
Yes. One alternative I also tried would have been to do an OpenTelemetryIntegration, which traces a span's termination... but faced limitations. To this day, I still think this would have been the cleanest way overall (i.e., have sort of an event when a DD span is created/closed without the OTel API so that we could simply instrument them and start/end OTel spans as needed).
this only runs for datadog-only Spans, which happen to be accessed at least once by a Context::current(), right?
Yes
I'm not sure what the best behaviour is, maybe, for consistency exclusively run it on OTel managed spans?
For consistency, it may indeed make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some minor tradeoffs in the code, like Span::endOTelSpan fetching getTraceState() - which incidentally is only called when the next Context::current() call happens.
There might be theoretical divergence, but as long as we acknowledge such edge cases, I think it's fine.
I've left one more point for consideration, but overall looks good I think :-)
…sively on Span->end()
Notes: - A span wasn't finished when calling the span processors, hence the span was considered as not ended and its duration marked to 0 - We can use an instance variable $endEpochNanos to circumvent this issue, but it's a bit uglier than just running the span processors after the (dd) span was closed, considering the span cannot be written onto (ReadableSpanInterface) - With partial flushing, there may be an issue depending on whether the spans will be readable after flushed (if flushed)
The changes made on the endEpochNanos/hasEnded 'flags' to simplify the code, although not impacting at all DD's end result, impact Jaeger traces. I'll be running the span processors after the Datadog span termination. It may conflict with partial flushing in the future. Alternatively, I can just rollback the changes made and use some flags, but it fundamentally conflicts as well. We can discuss this in our meeting later on today if you care about this. |
I think that's fine. Even if we implement partial flushing in future, we probably shouldn't immediately, irreversibly flush the span, but flush it slightly delayed, so that some data may still be set on the closed span, otherwise we risk breaking currently perfectly working code from other users, which nonchalantly add some meta data to the span after closing it. |
8fc3add
to
b073989
Compare
Description
Trace API
Metrics
Log (Appenders)
Context
Context::getCurrent
). Otherwise, they aren't converted (Reason: Creating 1 OTel span for each DD Span could become expensive for massive traces, considering OTel's interface carries many instances).datadog
) and default resource informationAutoinstrumentation
Trace SDK
Profiler
Misc:
src/
directory) will now be copied to the artifacts. It is necessary for the autoloading to be done when installing withdatadog-setup.php
(the problem doesn't exist with.apk
or.deb
)Readiness checklist
Reviewer checklist