Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[distributed tracing] add support for Rack & Net/HTTP #151

Merged
merged 17 commits into from
Jul 21, 2017

Conversation

ufoot
Copy link
Contributor

@ufoot ufoot commented Jul 7, 2017

This adds support for distributing tracing on top of #145 and #141

@ufoot ufoot requested a review from palazzem July 7, 2017 10:23
@ufoot ufoot added this to the 0.8.0 milestone Jul 7, 2017
@ufoot ufoot changed the title Christian/rack dynamic tracing [distributed tracing] add support for Rack & Net/HTTP Jul 17, 2017
@driv3r
Copy link

driv3r commented Jul 17, 2017

is there any ETA for this PR?

@ufoot
Copy link
Contributor Author

ufoot commented Jul 17, 2017

@dri3r the plan is to have it reviewed and fully tested tomorrow, then release should follow this week.

@ufoot ufoot force-pushed the christian/rack_dynamic_tracing branch from 922183b to 258dd99 Compare July 17, 2017 13:07
@@ -12,6 +12,12 @@ module HTTP
APP = 'net/http'.freeze
SERVICE = 'net/http'.freeze

@distributed_tracing_enabled = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palazzem I added this one because manipulating the Pin at each client call was, IMHO, really a pain at some point. For the good and the bad, this enables/disables it globally. It's harder to find a proper grain for that config with Net/HTTP (client lib) than with the server side (Rack, for instance) as there's no such thing as a "global client instance". While there's a global server instance and config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fine, we can define that distributed tracing is globally activated for that interpreter execution. From my point of view, it's enough to write that in the documentation.

HTTP_HEADER_TRACE_ID = 'HTTP_X_DATADOG_TRACE_ID'.freeze

# Header used to transmit the parent ID.
HTTP_HEADER_PARENT_ID = 'HTTP_X_DATADOG_PARENT_ID'.freeze
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am already sending these under the old name, it won't require too much change on my side but it would be nice if they were configurable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good feedback. Indeed configurable entries would ease up your migration process and allow fine-tuning. OTOH these should really be the same across all languages, implementations, etc. At some point it's a bit like having standard HTTP headers like WWW-Authenticate be tuneable (randomly picked from https://tools.ietf.org/html/rfc2617 but you get the idea). Plus the fact in some areas you want them uppercased, and in some not. I fear this could become complicated. But again, will try and see if I can find a way that makes it easy without introducing any complexity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK @cabello we did discuss that point internally. At some point I had a test which was running this:

# Below a quick example on how to hack and modify the standard headers.
Datadog::Monkey.without_warnings do # get rid of overriden constants warning
  module Datadog
    module Ext
      module DistributedTracing
        # These values can be used to communicate trace relationships
        # across multiple languages, so it's not recommended to modify
        # them, but should you want to do it, you can override them like this.
        HTTP_HEADER_TRACE_ID = 'x-custom-distributed-tracing-trace-id'.freeze
        HTTP_HEADER_PARENT_ID = 'x-custom-distributed-tracing-parent-id'.freeze
      end
    end
    module Contrib
      module Rack
        # Note the fact these are:
        # 1 - uppercased
        # 2 - '-' is replaced by '_'
        HTTP_HEADER_TRACE_ID = 'HTTP_X_CUSTOM_DISTRIBUTED_TRACING_TRACE_ID'.freeze
        HTTP_HEADER_PARENT_ID = 'HTTP_X_CUSTOM_DISTRIBUTED_TRACING_PARENT_ID'.freeze
      end
    end
  end
end

And it really does work, passes on CI fine, and using it on a toy app shows it works. You may use this. But I'm going to pull it from the code base, as we really don't want those to change, they should be shared among all languages indeed.

@ufoot ufoot force-pushed the christian/rack_dynamic_tracing branch from 472a27f to 4c9233b Compare July 20, 2017 10:38
@ufoot ufoot changed the base branch from christian/context to master July 20, 2017 10:42
Jesse Whitham and others added 10 commits July 20, 2017 17:14
This patch adds a test. It's not totally done yet but at least it
does check the parent_id and trace_id are correctly passed.
The error cases are not tested yet. They should be.

Also, introduces a 'distributed_tracing_enabled' option, by
default all this is disabled.
…arams

In recent context branch, trace_id and parent_id had been removed from
args allowed in trace(), and kept only in start_span(). This caused
problems as those parameters are passed to trace() in rack distributed
implementation. Ultimately, if trace_id and parent_id are passed, but the
span has a parent in the context, the context parent will be used
and prevail over the overridden trace_id and parent_id. If you don't want
that behavior -> use start_span.
@ufoot ufoot force-pushed the christian/rack_dynamic_tracing branch from b3c9349 to 46e366d Compare July 20, 2017 15:14
@@ -12,6 +12,12 @@ module HTTP
APP = 'net/http'.freeze
SERVICE = 'net/http'.freeze

@distributed_tracing_enabled = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fine, we can define that distributed tracing is globally activated for that interpreter execution. From my point of view, it's enough to write that in the documentation.

@trace_id = options.fetch(:trace_id, @span_id)
# use span_id for trace_id if given trace_id is nil or zero
@trace_id = options.fetch(:trace_id, 0)
@trace_id = @span_id if @trace_id.zero?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of this change. Why it's not:

@trace_id = options.fetch(:trace_id, Datadog::Utils.next_id())

if the passed value is wrong, it should end with a wrong output otherwise it could hide a bug. If it's not passed, it should be another random number no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, no, it could not be a new random number, the only value we can set is span_id IMHO. This is actually to catch someone who would pass a bad header. Then the header is here so trace_id is passed, but it's interpreted as zero so in the end the constructor of Span tries to set "0" as the trace_id. In that case what we want is consider this is a standard root span, and abort that distributed tracing attempt. I think it's somewhat related to that commit: #141 (diff) in the original proposal. That marshalling of IDs into strings then converting them back into ints introduces a risk that "trace_id: 0" appears somewhere. I don't see it in the current code path but it could really happen after some refactoring. So I think the above is slightly more robust, it creates root span if trace_id was not passed as a possibly valid integer. As a side note, there are many cases in which distributed tracing can be broken because we can't control what's coming upstream. We can only try and protect ourselves against bad input.

@@ -144,9 +148,14 @@ def to_s
"Span(name:#{@name},sid:#{@span_id},tid:#{@trace_id},pid:#{@parent_id})"
end

# DEPRECATED: remove this function in the next release, replaced by ``parent=``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a deprecation warning here. The Python client already has one and we should introduce a mechanism also for Ruby otherwise users are not aware of these deprecations.

require 'ddtrace'
require 'ddtrace/contrib/rack/middlewares'

require 'rack/test'

Datadog::Monkey.patch_module(:http)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure here, should we automatically patch net/http when we require the rack helpers? can't we do this in the test case setUp handler? Just asking because AFAIR patching the library at import time was hiding a bug. Mostly a question rather than requiring an action.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're actually right. The only thing is... the way the test framework (and our setup works) I could technically write another file and requiring it here only but anyway minitest is going to load it whatsoever and monkey patch http anyway. The only reasonable way I see to fix it is to write another test, elsewhere, in another directory, but then I'd copy/paste quite a bit of code. This is why I did it here. But not so proud of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I see. I don't want to spend a lot of time on it, neither duplicate our testing code. Let's keep it that way, also because having the unpatch() method for our integrations, will simplify this task a lot.

But it's another thing that isn't required in this PR.

require 'contrib/rack/helpers'
require 'contrib/http/test_helper'

# Below a quick example on how to hack and modify the standard headers.
Copy link
Contributor

@palazzem palazzem Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really convinced that we want to provide such kind of suggestion (and your comment already suggest to not change headers). Distributed Tracing wasn't part of previous releases (at least not in the sense of stable and fully supported API), and writing this example here means suggesting to monkey patch our library to change headers name.

This should never happen because it means providing the same functionality to all other libraries (e.g. Ruby net/http with custom headers -> Django web framework). We may suggest this approach in an informal way, but I'm not sure to write it here, since it's like "you may do this, but don't!".

I'd advocate to support that functionality 100% or 0%. Providing a suggestion in a GitHub comment and go back to the problem if it's happening very often, may be enough.

@palazzem
Copy link
Contributor

@ufoot our "friend":

Rakefile:8:1: W: Lint/UnneededDisable: Unnecessary disabling of Metrics/BlockLength.
# rubocop:disable Metrics/BlockLength
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Copy link
Contributor

@palazzem palazzem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to me. It has been a long work but definitely useful! In that state it's fine to ship it as experimental feature, so that we can start collecting some feedbacks.

As soon as tests pass, it's good to go!

#
# Use integer values for tests, as it will catch both
# a non-existing header or a badly formed one.
trace_id, parent_id = Datadog::Distributed.parse_trace_headers(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I like it!

# sorts out most common errors, such as syntax, nil values, etc.
# Both headers must be set, else nil values are returned, for both.
# Reports problem on debug log.
def parse_trace_headers(trace_id_header, parent_id_header)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's perfect, so we're having the right protection in the good place! thank you a lot!

@parent_id = options.fetch(:parent_id, 0)
@trace_id = options.fetch(:trace_id, @span_id)
@trace_id = options.fetch(:trace_id, Datadog::Utils.next_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

require 'ddtrace'
require 'ddtrace/contrib/rack/middlewares'

require 'rack/test'

Datadog::Monkey.patch_module(:http)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I see. I don't want to spend a lot of time on it, neither duplicate our testing code. Let's keep it that way, also because having the unpatch() method for our integrations, will simplify this task a lot.

But it's another thing that isn't required in this PR.

# %w[9223372036854775808 9223372036854775808] => [nil, nil],
# But this one should fail:
%w[18446744073709551615 18446744073709551615] => [18446744073709551615, 18446744073709551615],
%w[18446744073709551616 18446744073709551616] => [nil, nil],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

@ufoot ufoot merged commit 8216e9a into master Jul 21, 2017
@palazzem palazzem deleted the christian/rack_dynamic_tracing branch October 6, 2017 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants