-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto instrument Resque workers by default #1400
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few suggestions, none of which are blocking. LGTM 👍
docs/GettingStarted.md
Outdated
Datadog.configure do |c| | ||
c.use :resque, options | ||
c.use :resque, auto_instrument: true, **options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to **options
seems like it would make sense for all other integrations described in this doc. Can I convince you to open a separate PR to convert all of them? ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
require 'resque' | ||
require 'ddtrace' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've noticed we suggest adding the require for most integrations, but I'm curious if we could do the require on our side instead -- if the customer explicitly states c.use :resque
, would it make sense for us to auto-require?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upside: clients don't have to write require
themselves.
Downside: it might not be transparent what's being require
d in their application: "Did ddtrace require 'active_support' or 'active_support/all'?".
Also, they might not find any require
statements by "grepping" their code anymore for a library they use: "Where is Redis require
d? I can't find it anywhere...".
By the power vested in me, I invoke @delner to chime in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this comments blocks this PR, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap, not blocking at all!
|
||
# TODO: 1.0: When moving to auto patching all workers by default | ||
# we should remove this setting, as it will be a no-op. | ||
option :workers, default: [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the plan is to remove this, I suggest adding a warning now -- that way users get a bit more time to move and hopefully have updated their config when in the future we make this a no-op.
# TODO: 1.0: Automatic patching should be the default behavior. | ||
# We should not provide this option anymore when making it the default, | ||
# as our integrations should always instrument all possible scenarios when feasible. | ||
option :auto_instrument, default: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mad scientist in me wonders, would it make sense for us to make these 1.0 intentions into code? E.g. something like option :auto_instrument, default: false, "default_for_v1": true
and then allowing customers to do something like Datadog.configure { |c| c.use_v1_defaults = true }
?
That way customers onboarding today could already use the recommended future configuration, rather than needing to start with the legacy configuration and then in future update it again for 1.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's too smart for my brain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about this, and the main issue with use_v1_defaults
is that what use_v1_defaults
means today will likely change tomorrow, when we decide to schedule the next breaking change for 1.0.
For example, if we had introduced use_v1_defaults
in the previous release, this release would change how Redis workers are configured, effectively being a breaking change for use_v1_defaults
users.
It would effectively mean that use_v1_defaults
== "every release can be a breaking release".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this PR though, I think maybe the best thing to do is, for users that have c.use resque
, but no workers
setting, to effectively become auto_instrument
.
Maybe change the semantics: remove auto_instrument
option, and allow empty workers
to be "auto-instrument".
What do you think? There's no right or wrong here: users with an explicit worker: nil
will likely see instrumentation that they didn't expect, but a nil
value is not even supported here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would effectively mean that use_v1_defaults == "every release can be a breaking release".
Yes, but you'd be signing up for those semantics for use_v1_defaults
(e.g. that the defaults may change, as they adopt the v1 conventions), so I don't see that as a big issue. (I may be underestimating how often we run into issues, but my expectation is that if we consider something as the new default, that we're reasonably confident that it won't break)
What do you think? There's no right or wrong here: users with an explicit worker: nil will likely see instrumentation that they didn't expect, but a nil value is not even supported here.
Seems pretty reasonable, and it does the right thing for new customers, I like it!
docs/GettingStarted.md
Outdated
| `auto_instrument` | Instrument all Resque jobs (recommended). | `false` | | ||
| `workers` | An array including all worker classes you want to trace (e.g. `[MyJob]`). Use `auto_instrument` instead if you'd like to instrument all jobs. | `[]` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we document here as well the 1.0 intentions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the default with latest changes, but still maintains backwards compatibility for users setting it explicitly.
# Automatically configures jobs with {ResqueJob} plugin. | ||
module Job | ||
def perform | ||
if Datadog.configuration[:resque][:auto_instrument] | ||
job = payload_class | ||
job.extend(Datadog::Contrib::Resque::ResqueJob) unless job.is_a? Datadog::Contrib::Resque::ResqueJob | ||
end | ||
ensure | ||
super | ||
end | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if we could use Resque's hooks to do this instead, but indeed there seems to be no available option for "global hooks" that apply to every job.
Would it make sense to ask upstream for it? They seem to be interested in providing this kind of functionality, so they may be open to our request.
Of course we'd still need to keep this to support older Resque versions, but at least it'd be harder to break with newer Resque versions/other gems trying to instrument it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hooks, as they have today, don't exist at a global level, only on a per-worker basis, thus why we register hooks on each worker we encounter.
I think that suggesting global hooks make sense, I'll look into that.
But, like you said, we still need to support existing versions of Resque.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's definitely a case of "let's lay the groundwork for a better approach, so that we can reap those rewards in a few years" :)
super | ||
end | ||
end | ||
|
||
# Uses Resque job hooks to create traces | ||
module ResqueJob | ||
def around_perform(*args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we perhaps rename this around_perform_datadog_tracing
, to avoid collisions? The hook documentation mentions we can (and should) use prefixes for naming the hooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
spec/ddtrace/contrib/resque/job.rb
Outdated
let(:worker) { Resque::Worker.new(queue_name) } | ||
let(:job_class) do | ||
stub_const('TestJob', Class.new).tap do |mod| | ||
mod.send(:extend, Datadog::Contrib::Resque::ResqueJob) | ||
mod.send(:define_singleton_method, :perform) do |*args| | ||
# Do nothing by default. | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we merge this file back into the instrumentation_spec
? I was scratching my head for a few minutes trying to see where the job_class
came from in that spec, and it comes from this separate file that's only used once -- by the other spec 😭
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
o.on_set do |value| | ||
unless value.nil? | ||
Datadog.logger.warn( | ||
"DEPRECATED: Resque integration now instruments all workers. \n" \ | ||
'The `workers:` option is unnecessary and will be removed in the future.' | ||
) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Should we also warn when value.empty?
E.g. if you're enabling the integration, but then not instrumenting any workers, that seems like a configuration mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the end of day, the message is the same for folks with []
or [MyWorker]
: don't configure it explicitly.
I think that the specific case for users with []
doesn't warrant it's own case, given that the deprecation warning gives them the correct instructions going forward.
On a side note, validation options is something that we should address regardless in the tracer as a whole.
# | ||
# We could also just use `around_perform` but this might override the user's | ||
# own method. | ||
def around_perform0ddtrace(*args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Maybe
def around_perform0ddtrace(*args) | |
def around_perform0_ddtrace(*args) |
just to make it a little more readable?
workers = Datadog.configuration[:resque][:workers] || [] | ||
workers.each { |worker| worker.extend(ResqueJob) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I was thinking we could instead tweak Datadog::Contrib::Resque::Job
to instrument workers in the list "just-in-time", similar to how we do it when workers is nil
. This way, we would have almost the same behavior between both options, instead of the current setup where the extend happens at different times based on the configuration. (And we could remove these two lines entirely)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to touch these lines, because this behaviour (explicit configuration) is going to be deprecate anyway. It's a low risk change, but given the low value, I thought leaving it as it, and simply removing in the future was the best thing to to day.
This PR adds dynamic auto instrumentation for Resque workers.
Currently users have to list all worker classes that need to be instrumented at configuration time. This is quite more work than required in most cases, given we can automatically instrument all jobs by patching Resque, and it can be challenging to have all worker classes loaded at application configuration time.
Users currently setting an explicit value to the
workers:
option will not be affected.Users not setting any value to
workers:
(or setting it tonil
) will now have auto instrumentation of all available workers enabled.The
workers:
option itself does not follow our tracer philosophy: if we are able to automatically instrument, we should. With this in mind, this option has been marked for deprecation. We recommend removing it a lettingddtrace
perform automatic instrumentation.