Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ethon & Typhoeus tracing support #778

Merged
merged 11 commits into from
Sep 3, 2019
10 changes: 10 additions & 0 deletions Appraisals
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@ elsif Gem::Version.new('2.0.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'delayed_job'
gem 'delayed_job_active_record'
gem 'elasticsearch-transport'
gem 'ethon'
gem 'excon'
gem 'hiredis'
gem 'mongo', '< 2.5'
Expand All @@ -83,6 +84,7 @@ elsif Gem::Version.new('2.0.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'sqlite3', '~> 1.3.6'
gem 'sucker_punch'
gem 'timers', '< 4.2'
gem 'typhoeus'
end
end
elsif Gem::Version.new('2.1.0') <= Gem::Version.new(RUBY_VERSION) \
Expand Down Expand Up @@ -171,6 +173,7 @@ elsif Gem::Version.new('2.1.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'delayed_job'
gem 'delayed_job_active_record'
gem 'elasticsearch-transport'
gem 'ethon'
gem 'excon'
gem 'hiredis'
gem 'mongo', '< 2.5'
Expand All @@ -189,6 +192,7 @@ elsif Gem::Version.new('2.1.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'sqlite3', '~> 1.3.6'
gem 'sucker_punch'
gem 'timers', '< 4.2'
gem 'typhoeus'
end
end
elsif Gem::Version.new('2.2.0') <= Gem::Version.new(RUBY_VERSION) \
Expand Down Expand Up @@ -308,6 +312,7 @@ elsif Gem::Version.new('2.2.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'delayed_job'
gem 'delayed_job_active_record'
gem 'elasticsearch-transport'
gem 'ethon'
gem 'excon'
gem 'grape'
gem 'graphql', '< 1.9.4'
Expand All @@ -328,6 +333,7 @@ elsif Gem::Version.new('2.2.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'sinatra'
gem 'sqlite3', '~> 1.3.6'
gem 'sucker_punch'
gem 'typhoeus'
end
end
elsif Gem::Version.new('2.3.0') <= Gem::Version.new(RUBY_VERSION) \
Expand Down Expand Up @@ -447,6 +453,7 @@ elsif Gem::Version.new('2.3.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'delayed_job'
gem 'delayed_job_active_record'
gem 'elasticsearch-transport'
gem 'ethon'
gem 'excon'
gem 'grape'
gem 'graphql'
Expand All @@ -467,6 +474,7 @@ elsif Gem::Version.new('2.3.0') <= Gem::Version.new(RUBY_VERSION) \
gem 'sinatra'
gem 'sqlite3', '~> 1.3.6'
gem 'sucker_punch'
gem 'typhoeus'
end
end
elsif Gem::Version.new('2.4.0') <= Gem::Version.new(RUBY_VERSION)
Expand All @@ -480,6 +488,7 @@ elsif Gem::Version.new('2.4.0') <= Gem::Version.new(RUBY_VERSION)
gem 'delayed_job'
gem 'delayed_job_active_record'
gem 'elasticsearch-transport'
gem 'ethon'
gem 'excon'
gem 'grape'
gem 'graphql'
Expand All @@ -500,6 +509,7 @@ elsif Gem::Version.new('2.4.0') <= Gem::Version.new(RUBY_VERSION)
gem 'sinatra'
gem 'sqlite3', '~> 1.3.6'
gem 'sucker_punch'
gem 'typhoeus'
end
end
end
6 changes: 6 additions & 0 deletions Rakefile
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ namespace :spec do
:dalli,
:delayed_job,
:elasticsearch,
:ethon,
:excon,
:faraday,
:grape,
Expand Down Expand Up @@ -221,6 +222,7 @@ task :ci do
sh 'bundle exec appraisal contrib-old rake spec:rest_client'
sh 'bundle exec appraisal contrib-old rake spec:sequel'
sh 'bundle exec appraisal contrib-old rake spec:sinatra'
sh 'bundle exec appraisal contrib-old rake spec:ethon'
# Rails minitests
sh 'bundle exec appraisal rails30-postgres rake test:rails'
sh 'bundle exec appraisal rails30-postgres rake test:railsdisableenv'
Expand Down Expand Up @@ -269,6 +271,7 @@ task :ci do
sh 'bundle exec appraisal contrib-old rake spec:rest_client'
sh 'bundle exec appraisal contrib-old rake spec:sequel'
sh 'bundle exec appraisal contrib-old rake spec:sinatra'
sh 'bundle exec appraisal contrib-old rake spec:ethon'
# Rails minitests
sh 'bundle exec appraisal rails30-postgres rake test:rails'
sh 'bundle exec appraisal rails30-postgres rake test:railsdisableenv'
Expand Down Expand Up @@ -327,6 +330,7 @@ task :ci do
sh 'bundle exec appraisal contrib rake spec:sequel'
sh 'bundle exec appraisal contrib rake spec:shoryuken'
sh 'bundle exec appraisal contrib rake spec:sinatra'
sh 'bundle exec appraisal contrib rake spec:ethon'
# Rails minitests
sh 'bundle exec appraisal rails30-postgres rake test:rails'
sh 'bundle exec appraisal rails30-postgres rake test:railsdisableenv'
Expand Down Expand Up @@ -393,6 +397,7 @@ task :ci do
sh 'bundle exec appraisal contrib rake spec:sequel'
sh 'bundle exec appraisal contrib rake spec:shoryuken'
sh 'bundle exec appraisal contrib rake spec:sinatra'
sh 'bundle exec appraisal contrib rake spec:ethon'
# Rails minitests
sh 'bundle exec appraisal rails30-postgres rake test:rails'
sh 'bundle exec appraisal rails30-postgres rake test:railsdisableenv'
Expand Down Expand Up @@ -458,6 +463,7 @@ task :ci do
sh 'bundle exec appraisal contrib rake spec:sequel'
sh 'bundle exec appraisal contrib rake spec:shoryuken'
sh 'bundle exec appraisal contrib rake spec:sinatra'
sh 'bundle exec appraisal contrib rake spec:ethon'
# Rails minitests
sh 'bundle exec rake benchmark'
end
Expand Down
23 changes: 23 additions & 0 deletions docs/GettingStarted.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ To contribute, check out the [contribution guidelines][contribution docs] and [d
- [Dalli](#dalli)
- [DelayedJob](#delayedjob)
- [Elastic Search](#elastic-search)
- [Ethon & Typhoeus](#ethon)
- [Excon](#excon)
- [Faraday](#faraday)
- [Grape](#grape)
Expand Down Expand Up @@ -327,6 +328,7 @@ For a list of available integrations, and their configuration options, please re
| Dalli | `dalli` | `>= 2.7` | *[Link](#dalli)* | *[Link](https://github.com/petergoldstein/dalli)* |
| DelayedJob | `delayed_job` | `>= 4.1` | *[Link](#delayedjob)* | *[Link](https://github.com/collectiveidea/delayed_job)* |
| Elastic Search | `elasticsearch` | `>= 6.0` | *[Link](#elastic-search)* | *[Link](https://github.com/elastic/elasticsearch-ruby)* |
| Ethon | `ethon` | `>= 0.11.0` | *[Link](#ethon)* | *[Link](https://github.com/typhoeus/ethon)* |
| Excon | `excon` | `>= 0.62` | *[Link](#excon)* | *[Link](https://github.com/excon/excon)* |
| Faraday | `faraday` | `>= 0.14` | *[Link](#faraday)* | *[Link](https://github.com/lostisland/faraday)* |
| Grape | `grape` | `>= 1.0` | *[Link](#grape)* | *[Link](https://github.com/ruby-grape/grape)* |
Expand Down Expand Up @@ -563,6 +565,27 @@ Where `options` is an optional `Hash` that accepts the following parameters:
| `service_name` | Service name used for `elasticsearch` instrumentation | `'elasticsearch'` |
| `tracer` | `Datadog::Tracer` used to perform instrumentation. Usually you don't need to set this. | `Datadog.tracer` |

### Ethon

The `ethon` integration will trace any HTTP request through `Easy` or `Multi` objects. Note that this integration also supports `Typhoeus` library which is based on `Ethon`.

```ruby
require 'ddtrace'

Datadog.configure do |c|
c.use :ethon, options
end
```

Where `options` is an optional `Hash` that accepts the following parameters:

| Key | Description | Default |
| --- | ----------- | ------- |
| `analytics_enabled` | Enable analytics for spans produced by this integration. `true` for on, `nil` to defer to global setting, `false` for off. | `false` |
| `distributed_tracing` | Enables [distributed tracing](#distributed-tracing) | `true` |
| `service_name` | Service name for `ethon` instrumentation. | `'ethon'` |
| `tracer` | `Datadog::Tracer` used to perform instrumentation. Usually you don't need to set this. | `Datadog.tracer` |

### Excon

The `excon` integration is available through the `ddtrace` middleware:
Expand Down
1 change: 1 addition & 0 deletions lib/ddtrace.rb
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ module Datadog
require 'ddtrace/contrib/dalli/integration'
require 'ddtrace/contrib/delayed_job/integration'
require 'ddtrace/contrib/elasticsearch/integration'
require 'ddtrace/contrib/ethon/integration'
require 'ddtrace/contrib/excon/integration'
require 'ddtrace/contrib/faraday/integration'
require 'ddtrace/contrib/grape/integration'
Expand Down
24 changes: 24 additions & 0 deletions lib/ddtrace/contrib/ethon/configuration/settings.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
require 'ddtrace/contrib/configuration/settings'
require 'ddtrace/contrib/ethon/ext'

module Datadog
module Contrib
module Ethon
module Configuration
# Custom settings for the Ethon integration
class Settings < Contrib::Configuration::Settings
option :analytics_enabled,
default: -> { env_to_bool(Ext::ENV_ANALYTICS_ENABLED, false) },
lazy: true

option :analytics_sample_rate,
default: -> { env_to_float(Ext::ENV_ANALYTICS_SAMPLE_RATE, 1.0) },
lazy: true

option :distributed_tracing, default: true
option :service_name, default: Ext::SERVICE_NAME
end
end
end
end
end
136 changes: 136 additions & 0 deletions lib/ddtrace/contrib/ethon/easy_patch.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
require 'ddtrace/ext/net'
require 'ddtrace/ext/distributed'
require 'ddtrace/propagation/http_propagator'
require 'ddtrace/contrib/ethon/ext'

module Datadog
module Contrib
module Ethon
# Ethon EasyPatch
module EasyPatch
def self.included(base)
base.send(:prepend, InstanceMethods)
end

# InstanceMethods - implementing instrumentation
module InstanceMethods
def http_request(url, action_name, options = {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you walk me through the flow of a request here a bit? Order the of the functions being called, sync vs async requests, etc.

I see a number of methods being patched here, each collecting and storing some information as instance variables. I'd like to understand how the state is built. Is this information not available on perform without first collecting in between all these methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure! I'll start with sync requests.

To make a sync request, the client uses Easy object. First it is created, and as a part of initialization it calls the set_attributes method which will in turn call headers=, but only if headers are provided. The client can also set the headers directly using headers= method. The problem is that as soon as headers are set they are converted to FFI pointer, that's why I store their original version. And we don't have a span yet to inject the tracing headers.

http_request method is used as a helper to populate the easy with necessary information for HTTP request. In particular, HTTP method is passed (for example, easy.http_request("www.example.com", :post, { params: { a: 1 }, body: { b: 2 } })). The problem is that this data is not preserved on easy in a way that can be used to recover the HTTP method easily. The factories set various low-level attributes (https://github.com/typhoeus/ethon/tree/master/lib/ethon/easy/http), and libcurl itself figures out HTTP method based on all these attributes. So I figured it is easier to just store the method instead of trying to recover it from libcurl attributes.

The sync way to execute an easy is calling perform. I patch the perform to create the span and to inject the headers if needed (that's why I stored the original headers on the object, to avoid reading them back from FFI). perform uses libcurl's easy_perform method and then calls the complete method on easy that I patch to finish the span.

Async requests are executed using Multi. Individual requests are represented by the same Easy objects, however the execution flow is different. Easy instance is added to the Multi using add method. I consider this to be the beginning of the request execution, that's why the span is created there. There is no easy way to know if it is the beginning of the request or if multi is still on hold because multi execution can be in progress in the other thread. I follow here the way Typhoeus uses Multi : it adds the easy objects to Multi right before executing the perform. It also has hooks for before-request event which are executed in the add call.

The execution of multi happens when the perform method is called. This method uses libcurl's multi_perform and calls complete on easy objects which finished executing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay this is pretty interesting, and exactly the kind of explanation I was looking for. (Thank you!)

I can see why you'd want to store this info before it gets turned into a format that's hard to read from; as long these objects are 1-1 with requests, then this seems okay. Just have to be careful in any scenario in which an Easy object is re-used or re-ran? Not sure if that happens, but some food for thought.

If Multi uses multiple Easy objects, would it make sense to add an additional parent span to the multi operation? Thinking about the case if users want to see the batch as an operation, in addition to its constituent parts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why you'd want to store this info before it gets turned into a format that's hard to read from; as long these objects are 1-1 with requests, then this seems okay. Just have to be careful in any scenario in which an Easy object is re-used or re-ran? Not sure if that happens, but some food for thought.

That's a good point. I added extra safety net by patching the reset method which is used before re-using easy instance. This way I can be sure that HTTP method or headers are not mistakenly passed in the next request.

return super unless tracer_enabled?

# It's tricky to get HTTP method from libcurl
@datadog_method = action_name.to_s.upcase
super
end

def headers=(headers)
return super unless tracer_enabled?

# Store headers to call this method again when span is ready
@datadog_original_headers = headers
super headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor but can you just call super here?

end

def perform
return super unless tracer_enabled?
datadog_before_request
super
end

def complete
return super unless tracer_enabled?
begin
response_options = mirror.options
response_code = (response_options[:response_code] || response_options[:code]).to_i
if response_code.zero?
return_code = response_options[:return_code]
message = return_code ? ::Ethon::Curl.easy_strerror(return_code) : 'unknown reason'
set_span_error_message("Request has failed: #{message}")
else
@datadog_span.set_tag(Datadog::Ext::HTTP::STATUS_CODE, response_code)
if Datadog::Ext::HTTP::ERROR_RANGE.cover?(response_code)
set_span_error_message("Request has failed with HTTP error: #{response_code}")
end
end
ensure
@datadog_span.finish
@datadog_span = nil
end
super
end

def reset
super
ensure
if tracer_enabled?
@datadog_span = nil
@datadog_method = nil
@datadog_original_headers = nil
end
end

def datadog_before_request(parent_span: nil)
@datadog_span = datadog_configuration[:tracer].trace(
delner marked this conversation as resolved.
Show resolved Hide resolved
Ext::SPAN_REQUEST,
service: datadog_configuration[:service_name],
span_type: Datadog::Ext::HTTP::TYPE_OUTBOUND
)
@datadog_span.parent = parent_span unless parent_span.nil?

datadog_tag_request

if datadog_configuration[:distributed_tracing]
@datadog_original_headers ||= {}
Datadog::HTTPPropagator.inject!(@datadog_span.context, @datadog_original_headers)
delner marked this conversation as resolved.
Show resolved Hide resolved
self.headers = @datadog_original_headers
end
end

def datadog_span_started?
instance_variable_defined?(:@datadog_span) && !@datadog_span.nil?
end

private

def datadog_tag_request
span = @datadog_span
uri = URI.parse(url)
method = defined?(@datadog_method) ? @datadog_method.to_s : ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be instance_variable_defined??

span.resource = "#{method} #{uri.path}".lstrip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only want to set the HTTP method here, e.g. GET, but omit the path.

Sounds crazy, and I see the value of including the path. However, the resource is meant to be a GROUP BY like key, which in practice requires values that meaningfully group traces on some user-defined dimension. HTTP paths often contain unique input (such as numeric IDs, API tokens), which tend to flatten these traces into groups too small to be meaningful. Hence, as a default behavior, we've been only setting the method as the resource for HTTP client integrations.

If the omission of path as a default behavior is an issue for users, the concession we could offer is the option to customize their span and set their own resource using a callback. See HTTP as an example: https://github.com/DataDog/dd-trace-rb/blob/master/lib/ddtrace/contrib/http/instrumentation.rb#L80. Implementing something like this is optional though, not necessary to merge this PR.

Also, it looks like the method can be nil; if you remove the path, it looks like the resource could then be an empty string, which will cause the trace to be malformed and dropped. We'll want to make sure resource is never blank, even if we have to give it some kind of default value instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I was wondering why the resource contains only method name in other instrumentations. I changed the default value of method to N/A & handled the case when it is nil.


# Set analytics sample rate
Contrib::Analytics.set_sample_rate(span, analytics_sample_rate) if analytics_enabled?

span.set_tag(Datadog::Ext::HTTP::URL, uri.path)
span.set_tag(Datadog::Ext::HTTP::METHOD, method)
span.set_tag(Datadog::Ext::NET::TARGET_HOST, uri.host)
span.set_tag(Datadog::Ext::NET::TARGET_PORT, uri.port)
rescue URI::InvalidURIError
return
end

def set_span_error_message(message)
# Sets span error from message, in case there is no exception available
@datadog_span.status = Datadog::Ext::Errors::STATUS
@datadog_span.set_tag(Datadog::Ext::Errors::MSG, message)
end

def datadog_configuration
Datadog.configuration[:ethon]
end

def tracer_enabled?
datadog_configuration[:tracer].enabled
end

def analytics_enabled?
Contrib::Analytics.enabled?(datadog_configuration[:analytics_enabled])
end

def analytics_sample_rate
datadog_configuration[:analytics_sample_rate]
end
end
end
end
end
end
15 changes: 15 additions & 0 deletions lib/ddtrace/contrib/ethon/ext.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
module Datadog
module Contrib
module Ethon
# Ethon integration constants
module Ext
APP = 'ethon'.freeze
ENV_ANALYTICS_ENABLED = 'DD_ETHON_ANALYTICS_ENABLED'.freeze
ENV_ANALYTICS_SAMPLE_RATE = 'DD_ETHON_ANALYTICS_SAMPLE_RATE'.freeze
SERVICE_NAME = 'ethon'.freeze
SPAN_REQUEST = 'ethon.request'.freeze
SPAN_MULTI_REQUEST = 'ethon.multi.request'.freeze
end
end
end
end
35 changes: 35 additions & 0 deletions lib/ddtrace/contrib/ethon/integration.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
require 'ddtrace/contrib/integration'
require 'ddtrace/contrib/ethon/configuration/settings'
require 'ddtrace/contrib/ethon/patcher'

module Datadog
module Contrib
module Ethon
# Description of Ethon integration
class Integration
include Contrib::Integration
register_as :ethon

def self.version
Gem.loaded_specs['ethon'] && Gem.loaded_specs['ethon'].version
end

def self.present?
super && defined?(::Ethon::Easy)
end

def self.compatible?
super && Gem::Version.new(RUBY_VERSION) >= Gem::Version.new('2.0.0')
end

def default_configuration
Configuration::Settings.new
end

def patcher
Patcher
end
end
end
end
end