Add support for instrumenting Qless jobs #1237
Conversation
Very good work! Super clean so far!
I left a few comments, let me know if you have any questions about them.
docs/GettingStarted.md (outdated)

| Key | Description | Default |
| --- | ----------- | ------- |
| `analytics_enabled` | Enable analytics for spans produced by this integration. `true` for on, `nil` to defer to the global setting, `false` for off. | `false` |
| `service_name` | Service name used for `qless` instrumentation | `'qless'` |
| `workers` | An array including all worker classes you want to trace (e.g. `[MyJob]`) | All jobs |
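A minimal configuration sketch based on the options above (a hypothetical usage example; the option names are the ones proposed in this PR, not a guaranteed released API):

```ruby
require 'ddtrace'

# Hypothetical setup using the options documented above; `service_name`
# and `analytics_enabled` are the names proposed in this PR's diff.
Datadog.configure do |c|
  c.use :qless, service_name: 'qless', analytics_enabled: false
end
```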
By default, enabling an integration in ddtrace enables it exhaustively: we automatically instrument as much as we can. Thankfully, it seems like Qless does allow us to completely instrument all workers, as you implemented by extending `Qless::Workers::BaseWorker`.

This leaves us in a good place: all users have to do is add `c.use :qless` and they are good to go.

The option you added here, to selectively enable the tracer, is not a pattern we currently use, as more instrumentation is normally more desirable than less.

My question to you is: do you have any requirements today that would be fulfilled by selective instrumentation, as opposed to exhaustive instrumentation?
> The option you added here, to selectively enable the tracer

@marcotc which option are you referring to? `analytics_enabled` or `workers`?

Looking at all the other integrations, `analytics_enabled` mostly defaults to `false`.

By default all worker classes are instrumented, but I'm giving the option to restrict the list (like the Resque integration does). We would not use selective instrumentation. Would you prefer I remove the `workers` option?
Sorry if I wasn't clear. You are correct that I was only referring to `workers`; `analytics_enabled` is correct.

> We would not use selective instrumentation.

In this case I do suggest removing the `workers` option. We try our best to write instrumentation that is safe to enable in the whole application, so restricting instrumentation would only happen in case there's a strong use case for it.
done
```ruby
def around_perform(job)
  return super unless datadog_configuration && tracer

  tracer.shutdown! if forked?
```
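The `forked?` check in the excerpt above can be illustrated with a self-contained sketch (the `ForkAwareWorker` class and PID-tracking approach here are illustrative assumptions, not the integration's actual implementation):

```ruby
# Minimal sketch of fork detection: record the PID when the object is
# created; if the current PID differs later, we are running in a forked
# child, and parent-process state (e.g. a tracer's worker threads) is stale.
class ForkAwareWorker
  def initialize
    @boot_pid = Process.pid
  end

  def forked?
    Process.pid != @boot_pid
  end
end

worker = ForkAwareWorker.new
puts worker.forked? # prints "false" in the process that created the worker
```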
Do you think we'd be able to exercise this fork clean up in a test case?
I tried adding tests for the forked path, but I couldn't get the forked process working correctly in tests (we had a conversation about testing forked code back in #1053). I've manually tested it and the tracer is shut down.
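One way such a forked-path test can be structured (a generic sketch using only the Ruby standard library on a Unix-like OS; this is not the test that ultimately landed in the PR):

```ruby
# Sketch: run the code under test in a forked child and report the result
# back to the parent through a pipe, so the parent process can assert on it.
reader, writer = IO.pipe

pid = fork do
  reader.close
  # In the child, Process.ppid is the parent's PID, so these always differ.
  writer.write(Process.pid != Process.ppid ? 'forked' : 'same')
  writer.close
end

writer.close
result = reader.read
Process.wait(pid) # reap the child to avoid a zombie process
puts result # prints "forked"
```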
I'll set up some time so we can help you with this. We'll try to write a forking test and see what issues we encounter.
@marcotc Let me know when you have some time. Maybe this Friday? I'm in the US Central time zone.
```ruby
span.span_type = Datadog::Ext::AppTypes::WORKER
span.set_tag(Ext::TAG_JOB_ID, job.jid)
span.set_tag(Ext::TAG_JOB_QUEUE, job.queue_name)
span.set_tag(Ext::TAG_JOB_TAGS, job.tags)
```
I was reading about the purpose of `job.tags` and I noticed the Qless docs state:

> Tagging / Tracking -- Some jobs are more interesting than others. Track those jobs to get updates on their progress. Tag jobs with meaningful identifiers to find them quickly in the UI.

This seems to me like tags can carry uniquely identifiable user information (like email addresses). In contrast with `job.queue_name` (setup-time configuration) or `job.jid` (an opaque UID), I think that `job.tags` can contain information that the user might not want to have stored by default.

Would you say I'm understanding the tags concept correctly? If so, we should gate this data collection the same way you already did for `job.data` (thank you for that one, btw!).
done
👋 @sco11morgan, I'll work on your PR tomorrow to try to figure out the forking tests. I'll let you know of the outcomes (🤞 I'll see a few commits adding such a test to your branch).
@sco11morgan, I added a test for the default use case using forks. It turns out the include order was actually inverted. Could you please take a look at my commits and see if they make sense to you?
You got the forked test to work! That's great. Good catch on the flipping of the order.
This generally LGTM and I think it's good to ship.

I know we've been trying to add `error_handler` as an option for job error handling (see https://github.com/DataDog/dd-trace-rb/blob/master/docs/GettingStarted.md#sidekiq). Is that something we want to add here? I think it's fine if not; it may be something we want to add to all integrations as part of future work.
Thank you for the review, @ericmustin! I think the integration in its current state already provides enough value for users that we can confidently ship it. I agree that adding the `error_handler` option is something we can consider for all integrations as future work.
Thank you again for your work in this PR, @sco11morgan. We've just released v0.44.0, which includes these changes. Let us know if you have any feedback on this new version.
Qless is a Redis-based job queueing system inspired by Resque.
Addresses #1053