-
Notifications
You must be signed in to change notification settings - Fork 374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[distributed tracing] add support for Rack & Net/HTTP #151
Conversation
is there any ETA for this PR? |
@dri3r the plan is to have it reviewed and fully tested tomorrow, then release should follow this week. |
922183b
to
258dd99
Compare
@@ -12,6 +12,12 @@ module HTTP | |||
APP = 'net/http'.freeze | |||
SERVICE = 'net/http'.freeze | |||
|
|||
@distributed_tracing_enabled = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@palazzem I added this one because manipulating the Pin at each client call was, IMHO, really a pain at some point. For the good and the bad, this enables/disables it globally. It's harder to find a proper grain for that config with Net/HTTP (client lib) than with the server side (Rack, for instance) as there's no such thing as a "global client instance". While there's a global server instance and config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's fine, we can define that distributed tracing is globally activated for that interpreter execution. From my point of view, it's enough to write that in the documentation.
HTTP_HEADER_TRACE_ID = 'HTTP_X_DATADOG_TRACE_ID'.freeze | ||
|
||
# Header used to transmit the parent ID. | ||
HTTP_HEADER_PARENT_ID = 'HTTP_X_DATADOG_PARENT_ID'.freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am already sending these under the old name, it won't require too much change on my side but it would be nice if they were configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good feedback. Indeed configurable entries would ease up your migration process and allow fine-tuning. OTOH these should really be the same across all languages, implementations, etc. At some point it's a bit like having standard HTTP headers like WWW-Authenticate
be tuneable (randomly picked from https://tools.ietf.org/html/rfc2617 but you get the idea). Plus the fact in some areas you want them uppercased, and in some not. I fear this could become complicated. But again, will try and see if I can find a way that makes it easy without introducing any complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK @cabello we did discuss that point internally. At some point I had a test which was running this:
# Below a quick example on how to hack and modify the standard headers.
Datadog::Monkey.without_warnings do # get rid of overriden constants warning
module Datadog
module Ext
module DistributedTracing
# These values can be used to communicate trace relationships
# across multiple languages, so it's not recommended to modify
# them, but should you want to do it, you can override them like this.
HTTP_HEADER_TRACE_ID = 'x-custom-distributed-tracing-trace-id'.freeze
HTTP_HEADER_PARENT_ID = 'x-custom-distributed-tracing-parent-id'.freeze
end
end
module Contrib
module Rack
# Note the fact these are:
# 1 - uppercased
# 2 - '-' is replaced by '_'
HTTP_HEADER_TRACE_ID = 'HTTP_X_CUSTOM_DISTRIBUTED_TRACING_TRACE_ID'.freeze
HTTP_HEADER_PARENT_ID = 'HTTP_X_CUSTOM_DISTRIBUTED_TRACING_PARENT_ID'.freeze
end
end
end
end
And it really does work, passes on CI fine, and using it on a toy app shows it works. You may use this. But I'm going to pull it from the code base, as we really don't want those to change, they should be shared among all languages indeed.
472a27f
to
4c9233b
Compare
…o x-datadog-trace-id
This patch adds a test. It's not totally done yet but at least it does check the parent_id and trace_id are correctly passed. The error cases are not tested yet. They should be. Also, introduces a 'distributed_tracing_enabled' option, by default all this is disabled.
…or all http calls
…arams In recent context branch, trace_id and parent_id had been removed from args allowed in trace(), and kept only in start_span(). This caused problems as those parameters are passed to trace() in rack distributed implementation. Ultimately, if trace_id and parent_id are passed, but the span has a parent in the context, the context parent will be used and prevail over the overridden trace_id and parent_id. If you don't want that behavior -> use start_span.
b3c9349
to
46e366d
Compare
@@ -12,6 +12,12 @@ module HTTP | |||
APP = 'net/http'.freeze | |||
SERVICE = 'net/http'.freeze | |||
|
|||
@distributed_tracing_enabled = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's fine, we can define that distributed tracing is globally activated for that interpreter execution. From my point of view, it's enough to write that in the documentation.
lib/ddtrace/span.rb
Outdated
@trace_id = options.fetch(:trace_id, @span_id) | ||
# use span_id for trace_id if given trace_id is nil or zero | ||
@trace_id = options.fetch(:trace_id, 0) | ||
@trace_id = @span_id if @trace_id.zero? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of this change. Why it's not:
@trace_id = options.fetch(:trace_id, Datadog::Utils.next_id())
if the passed value is wrong, it should end with a wrong output otherwise it could hide a bug. If it's not passed, it should be another random number no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, no, it could not be a new random number, the only value we can set is span_id IMHO. This is actually to catch someone who would pass a bad header. Then the header is here so trace_id is passed, but it's interpreted as zero so in the end the constructor of Span tries to set "0" as the trace_id. In that case what we want is consider this is a standard root span, and abort that distributed tracing attempt. I think it's somewhat related to that commit: #141 (diff) in the original proposal. That marshalling of IDs into strings then converting them back into ints introduces a risk that "trace_id: 0" appears somewhere. I don't see it in the current code path but it could really happen after some refactoring. So I think the above is slightly more robust, it creates root span if trace_id was not passed as a possibly valid integer. As a side note, there are many cases in which distributed tracing can be broken because we can't control what's coming upstream. We can only try and protect ourselves against bad input.
lib/ddtrace/span.rb
Outdated
@@ -144,9 +148,14 @@ def to_s | |||
"Span(name:#{@name},sid:#{@span_id},tid:#{@trace_id},pid:#{@parent_id})" | |||
end | |||
|
|||
# DEPRECATED: remove this function in the next release, replaced by ``parent=`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a deprecation warning here. The Python client already has one and we should introduce a mechanism also for Ruby otherwise users are not aware of these deprecations.
require 'ddtrace' | ||
require 'ddtrace/contrib/rack/middlewares' | ||
|
||
require 'rack/test' | ||
|
||
Datadog::Monkey.patch_module(:http) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure here, should we automatically patch net/http
when we require the rack helpers? can't we do this in the test case setUp
handler? Just asking because AFAIR patching the library at import time was hiding a bug. Mostly a question rather than requiring an action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're actually right. The only thing is... the way the test framework (and our setup works) I could technically write another file and requiring it here only but anyway minitest is going to load it whatsoever and monkey patch http anyway. The only reasonable way I see to fix it is to write another test, elsewhere, in another directory, but then I'd copy/paste quite a bit of code. This is why I did it here. But not so proud of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I see. I don't want to spend a lot of time on it, neither duplicate our testing code. Let's keep it that way, also because having the unpatch()
method for our integrations, will simplify this task a lot.
But it's another thing that isn't required in this PR.
require 'contrib/rack/helpers' | ||
require 'contrib/http/test_helper' | ||
|
||
# Below a quick example on how to hack and modify the standard headers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really convinced that we want to provide such kind of suggestion (and your comment already suggest to not change headers). Distributed Tracing wasn't part of previous releases (at least not in the sense of stable and fully supported API), and writing this example here means suggesting to monkey patch our library to change headers name.
This should never happen because it means providing the same functionality to all other libraries (e.g. Ruby net/http with custom headers -> Django web framework). We may suggest this approach in an informal way, but I'm not sure to write it here, since it's like "you may do this, but don't!".
I'd advocate to support that functionality 100% or 0%. Providing a suggestion in a GitHub comment and go back to the problem if it's happening very often, may be enough.
@ufoot our "friend":
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to me. It has been a long work but definitely useful! In that state it's fine to ship it as experimental feature, so that we can start collecting some feedbacks.
As soon as tests pass, it's good to go!
# | ||
# Use integer values for tests, as it will catch both | ||
# a non-existing header or a badly formed one. | ||
trace_id, parent_id = Datadog::Distributed.parse_trace_headers( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I like it!
# sorts out most common errors, such as syntax, nil values, etc. | ||
# Both headers must be set, else nil values are returned, for both. | ||
# Reports problem on debug log. | ||
def parse_trace_headers(trace_id_header, parent_id_header) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's perfect, so we're having the right protection in the good place! thank you a lot!
@parent_id = options.fetch(:parent_id, 0) | ||
@trace_id = options.fetch(:trace_id, @span_id) | ||
@trace_id = options.fetch(:trace_id, Datadog::Utils.next_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
require 'ddtrace' | ||
require 'ddtrace/contrib/rack/middlewares' | ||
|
||
require 'rack/test' | ||
|
||
Datadog::Monkey.patch_module(:http) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I see. I don't want to spend a lot of time on it, neither duplicate our testing code. Let's keep it that way, also because having the unpatch()
method for our integrations, will simplify this task a lot.
But it's another thing that isn't required in this PR.
# %w[9223372036854775808 9223372036854775808] => [nil, nil], | ||
# But this one should fail: | ||
%w[18446744073709551615 18446744073709551615] => [18446744073709551615, 18446744073709551615], | ||
%w[18446744073709551616 18446744073709551616] => [nil, nil], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch!
…= trace_id asumption
…race_id asumption
This adds support for distributing tracing on top of #145 and #141