Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Contrib::Configuration::Settings#tracer= #1079

Merged
merged 12 commits into from
Jun 22, 2020

Conversation

marcotc
Copy link
Member

@marcotc marcotc commented Jun 12, 2020

After we started our changes towards the immutability of internal tracer components (#996), we've had issues being reported related to some traces not being captured.

The main culprit of this is the proliferation of stale tracer instances across the tracer. Tracer instances cannot be eagerly cached anymore as we now create a new immutable tracer instance when ddtracer is reconfigured (through Datadog.configure{} calls). The correct tracer instance to use Datadog.tracer: we guarantee that this call always returns the active tracer.

This PR removes the ability to set, an thus cache, the tracer instance for our integrations (all code under lib/ddtrace/contrib).

This should address all reports captured in #1072 that are related to missing traces.

This PR also supersedes #1064.

Deprecation

The tracer: option has been removed from instrumentation configuration. If you are currently using it you will received a warning instructing you to remove this explicit configuration.

The global tracer instance, Datadog.tracer, is the only recommend tracer instance to be used. Caching the result of Datadog.tracer is not supported, as it can return different objects at different points in the application execution.

@marcotc marcotc added bug Involves a bug integrations Involves tracing integrations community Was opened by a community member labels Jun 12, 2020
@marcotc marcotc requested a review from a team June 12, 2020 21:33
@marcotc marcotc self-assigned this Jun 12, 2020
ddtrace.gemspec Outdated
spec.add_development_dependency 'appraisal', '~> 2.2'
spec.add_development_dependency 'yard', '~> 0.9'
spec.add_development_dependency 'webmock', '~> 2.0'
spec.add_development_dependency 'builder'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not being used for testing anymore.

@@ -18,6 +18,8 @@ def self.included(base)
# Redefines some class behaviors for a Subscriber to make
# it a bit simpler for an Event.
module ClassMethods
DEFAULT_TRACER = -> { Datadog.tracer }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to a constant to avoid creating a new Proc on every invocation.

@@ -51,8 +51,11 @@ def services
#
def shutdown!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shutdown is not a safe operation to be executed in parallel, so we add a critical zone to it.
This affects our some of our concurrency tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to #1063? Does it resolve it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same issue.
I think the solution proposed in #1063 is actually better, as the writer is supposed to be a trusted thread-safe object. The tracer shouldn't have to make the writer thread-safe explicitly.
I'll remove this change so we can implement that other solution instead.

@@ -325,7 +328,7 @@ def record(context)
def record_context(context)
trace = @context_flush.consume!(context)

write(trace) if trace && !trace.empty?
write(trace) if @enabled && trace && !trace.empty?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the @enabled check here to ease the testing around tracer#write. Before we'd had to double check if tracer#write was actually being called by also check @enabled.

Now we can safely assert on tracer#write, knowing that it is only called when the tracer actually wants to write a new trace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... it might be better to move this enabled behavior elsewhere such that we prevent spans from being recorded into the context in the first place. Not necessary to undertake right now, but perhaps when we improve disabling behavior.

@@ -41,9 +37,6 @@
it_behaves_like 'measured span for integration', false

it 'calls the instrumentation when is used standalone' do
# expect service and trace is sent
expect(spans.size).to eq(1)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this logic can now be removed because we are not using the global helpers #span and #spans.

#span, for example, asserts that there's only one span available, as to not deceive the developer by returning the first among many when that might not be the intention.

unmodified_future = ::Concurrent::Future.dup
example.run
# DEV We save an unmodified copy of Concurrent::Future.
let!(:unmodified_future) { ::Concurrent::Future.dup }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a smart trick we've been doing to unpatch concurrent-ruby, too smart for my taste I'd say :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I remember running into this and it messing up my other writer extraction PR... not sure if it helps it sidestep the problem it had. Either way, not thrilled with the complexity in this test.

@@ -136,7 +136,7 @@
end
end

context '#write' do
context '#delete' do
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test had the wrong name.

@@ -25,6 +25,18 @@ class RackTestingAPI < Grape::API

class BaseRackAPITest < MiniTest::Test
include Rack::Test::Methods
include TestTracerHelper
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some shared logic for minitests has been extracted to TestTracerHelper

Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks excellent. There's a lot changed here, but it looks like it makes many of the tests much less verbose and makes a lot of the testing code reused... this signals movement in the right direction.

I added a few questions and have some minor suggestions for changes. We should test this a bit more outside of CI, but once that looks good, I can see this being ready to merge.

@@ -12,6 +12,9 @@ class Settings
option :service_name
option :tracer do |o|
o.delegate_to { Datadog.tracer }
o.on_set do |_value|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's nice about this on_set is it'll always let us know if we accidentally set this when we shouldn't have in integration/test code.

the correct tracer internally.
).freeze

include Datadog::Patcher # DEV includes #do_once here. We should move that logic to a generic component.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree... before we had started this refactor, I had been hoping to refactor how we do deprecation warnings. e.g. Extract all the necessary functions to track and raise deprecation warnings into a module.

This is fine for the short term but we will want to get rid of this.

# We set defaults here instead of in the patcher because we need to wait
# for the Rails application to be fully initialized.
Datadog.configuration[:rails].tap do |config|
datadog_config[:rails].tap do |config|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird... functionally speaking, it think your change is basically identical. However, it seems to me that there's a collision where #[](key) resolves an integration where #[](key) might also be used to resolve a configuration option too. I guess we only use direct function calls on configuration options (e.g. configuration.tracer.hostname)?

Probably no change necessary, just pointing this out.

@@ -51,8 +51,11 @@ def services
#
def shutdown!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to #1063? Does it resolve it?

@@ -325,7 +328,7 @@ def record(context)
def record_context(context)
trace = @context_flush.consume!(context)

write(trace) if trace && !trace.empty?
write(trace) if @enabled && trace && !trace.empty?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... it might be better to move this enabled behavior elsewhere such that we prevent spans from being recorded into the context in the first place. Not necessary to undertake right now, but perhaps when we improve disabling behavior.

spec/ddtrace/contrib/configuration/settings_spec.rb Outdated Show resolved Hide resolved
spec/ddtrace/contrib/http/circuit_breaker_spec.rb Outdated Show resolved Hide resolved
before(:each) { tracer.enabled = false }
before do
Datadog.configure do |c|
c.tracer.enabled = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add an after to reset this? If we're setting a test tracer explicitly before each test, resetting this value technically will have no effect (hence why this might be working right now), but the enabled setting will likely persist on the global configuration unless it too is totally reset/replaced.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new clean up code introduced in this PR is saving this test from failure:

Datadog.configuration.reset!

I do agree that it should reset it, I'll add explicit code in here for that.

allow_any_instance_of(Datadog::Tracer).to receive(:write) do |tracer, trace|
tracer.instance_exec do
write_lock.synchronize do
@spans ||= []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why store this on the tracer itself? Is this because some tests have multiple tracers? (Otherwise it'd be best to store this in the example's context.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@delner for tests with multiple tracers, yes.
The idea is that if we write to a tracer, that tracer should give us what's been written back.

It also helps us, given the issues we are facing, to know that the tracer use for assertions in our tests is the correct instance.

We might not need this in the future, but it is a very relevant safe-guard for now.

@delner
Copy link
Contributor

delner commented Jun 17, 2020

@marcotc Looks like we still need builder for building the gem for prerelease?

@delner
Copy link
Contributor

delner commented Jun 17, 2020

Changes look good. Once CI is green I can approve this. I would merge only after our test environment is stable though.

@marcotc marcotc requested a review from delner June 18, 2020 17:27
Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay changes looks good. Let's be sure to check this is stable in our test environment, then we can merge it.

Thanks for cranking this out! Nice work! 🎉

@fledman
Copy link
Contributor

fledman commented Jun 19, 2020

Were you planning to integrate my changes from #1063 in this PR? Or open a new one?

@marcotc
Copy link
Member Author

marcotc commented Jun 22, 2020

As a separate PR, @fledman. The thread-safety is somewhat unrelated to these changes, and your proposed solution was cleaner than the one I had in this PR.

@marcotc marcotc merged commit bac69b5 into master Jun 22, 2020
@marcotc marcotc deleted the fix/tracer-delegation branch June 22, 2020 16:11
@marcotc marcotc added this to the 0.37.0 milestone Jun 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug community Was opened by a community member integrations Involves tracing integrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants