Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for instrumenting Qless jobs #1237

Merged
merged 11 commits into from
Nov 26, 2020
Merged

Add support for instrumenting Qless jobs #1237

merged 11 commits into from
Nov 26, 2020

Conversation

sco11morgan
Copy link
Contributor

qless is Redis-based job queueing system inspired by Resque.

Addresses #1053

@sco11morgan sco11morgan requested a review from a team November 6, 2020 17:00
Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good work! Super clean so far!

I left a few comments, let me know if you have any question about them.

lib/ddtrace/contrib/qless/qless_job.rb Show resolved Hide resolved
| --- | ----------- | ------- |
| `analytics_enabled` | Enable analytics for spans produced by this integration. `true` for on, `nil` to defer to the global setting, `false` for off. | `false` |
| `service_name` | Service name used for `qless` instrumentation | `'qless'` |
| `workers` | An array including all worker classes you want to trace (e.g. `[MyJob]`) | All jobs |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, enabling an instrumentation in ddtrace enables it exhaustively: we automatically instrument as much as we can.

Thankfully, it seems like Qless does allow us to completely instrument all workers, like you implemented by extending Qless::Workers::BaseWorker.

This leaves us in a good place: all users have to do is add c.use :qless and they are go to go.

The option you added here, to selectively enable the tracer, is not a pattern we currently use, as more instrumentation is normally more desirable than the contrary.

My question to you is: do you have any requirements today that would be fulfilled by selective instrumentation, vs exhaustive instrumentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The option you added here, to selectively enable the tracer

@marcotc which option are you referring to? analytics_enabled or workers?

Looking at all the other integrations, analytics_enabled mostly defaults to false.
By default all worker classes are instrumented, but I'm giving the option to restrict the list (like the resque integration does).

We would not use selective instrumentation. Would you prefer I remove the workers option?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I wasn't clear. You are correct that I was only referring to workers. analytics_enabled is correct.

We would not use selective instrumentation.

In this case I do suggest the removal of the workers option. We try our best to write instrumentation that is safe to enable in the whole application, so restricting instrumentation would only happen in case there there's a strong user case for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

spec/ddtrace/contrib/qless/job.rb Outdated Show resolved Hide resolved
def around_perform(job)
return super unless datadog_configuration && tracer

tracer.shutdown! if forked?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we'd be able to exercise this fork clean up in a test case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding tests for the forked path but I couldn't get the forked process working correctly in tests (we had a conversation about testing forked code back in #1053). I've manually tested it and the tracer is shutdown.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll set up some time so we help you with this. We'll try to write a forking test and see what issues we encounter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcotc Let me know when you have some time. Maybe this Friday? I'm in the US Central time zone

@marcotc marcotc added community Was opened by a community member feature Involves a product feature integrations Involves tracing integrations labels Nov 6, 2020
span.span_type = Datadog::Ext::AppTypes::WORKER
span.set_tag(Ext::TAG_JOB_ID, job.jid)
span.set_tag(Ext::TAG_JOB_QUEUE, job.queue_name)
span.set_tag(Ext::TAG_JOB_TAGS, job.tags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reading about the purpose of job.tags and I noticed they state:

Tagging / Tracking -- Some jobs are more interesting than others. Track those jobs to get updates on their progress. Tag jobs with meaningful identifiers to find them quickly in the UI.

This seems to me like tags can carry uniquely identifiable user information (like email addresses). In contrast with job.queue_name (setup time configuration) or job.jid (opaque UID), it think that job.tags can contain information that the user might not want to have stored by default.

Would you say I'm understanding the tags concept correctly? If so, we should gate this data collection the same way you did for job.data already (thank you for that one, btw!).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@marcotc
Copy link
Member

marcotc commented Nov 19, 2020

👋 @sco11morgan, I'll work on your PR tomorrow to try to figure out the forking tests, I'll let you know of the outcomes (🤞 I'll see a few commits adding such test to your branch).

@marcotc
Copy link
Member

marcotc commented Nov 20, 2020

@sco11morgan, I added test for the default use case using forks.

Turns out the include order was actually inverted for QlessJob and TracerCleaner, and the cleaner was shutting down the tracer before we had a chance to instrument it.

Could you please take a look at my commits and see if they make sense to you?
We can work on merging it soon after.

@sco11morgan
Copy link
Contributor Author

You got the forked test to work! That's great. Good catch on the flipping of the order - makes sense as that means the TracerCleaner is the first layer of middleware so that it can run its "after" last and cleanup the tracer.

Copy link
Contributor

@ericmustin ericmustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally lgtm and i think it's good to ship.

i know we've been trying to add error_handler as an option for job error handling https://github.com/DataDog/dd-trace-rb/blob/master/docs/GettingStarted.md#sidekiq. Is that something we want to add here? I think it's fine if not, it may be something we want to add to all integrations as part of future work.

@marcotc
Copy link
Member

marcotc commented Nov 26, 2020

Thank you for the review, @ericmustin!

I think the integration in its current state already provides enough value for users that we can confidently ship it.

I agree that adding the error_handler is definitely something we should have in the future.

@marcotc marcotc merged commit 82a79f1 into DataDog:master Nov 26, 2020
@marcotc marcotc added this to the 0.44.0 milestone Nov 27, 2020
@marcotc
Copy link
Member

marcotc commented Jan 6, 2021

Thank you again for your work in this PR, @sco11morgan. We've just released v0.44.0, which includes these changes. Let us know if you have any feedback with this new version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Was opened by a community member feature Involves a product feature integrations Involves tracing integrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants