Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add analyzed span configuration option for Celery #1383

Merged
merged 5 commits into from
Apr 28, 2020

Conversation

ericmustin
Copy link
Contributor

Summary

This PR addresses Open Issue 1217, by adding app analytics support for celery worker and producer spans. Previously it was not possible to enable spans created by Celery instrumentation as analyzed spans for use in app analytics. The only workaround was relying on the datadog-agent configuration setting which is brittle and decoupled from application settings, making it difficult to maintain. This PR exposes an environment variable DD_CELERY_ANALYTICS_ENABLED at the application level to enable analyzed spans for celery.

Example Usage

DD_CELERY_ANALYTICS_ENABLED=true ddtrace-run python app.py

Notes

  • long time listener first time caller. Tried to follow the documentation and test style of similar integrations where app analytics is disabled by default but can be enabled optionally via env var. Please let me know if there's anything I missed or any way y'all would prefer this documented. Happy to make any changes

  • Additionally, it's worth noting that this option enables both worker and producer spans for app analytics. Seemed like a reasonable enough behaviour, but if you feel like there ought to be optionality to choose only worker spans or only producer spans to be enabled for app analytics, I'm happy to refactor this. I noticed that celery's configuration offers the ability to name the worker_service_name and producer_service_name separately.

  • Lastly, it appears that algoliasearch, dogpile_cache, jinja2, and mako also cannot enable app analytics at the moment. If this approach looks good I would be happy to try to tackle those as well either in this pr or in a separate one. Just figured I would grab celery first since there's an open issue on it.

@ericmustin ericmustin requested a review from a team as a code owner April 25, 2020 11:46
Copy link
Member

@Kyle-Verhoog Kyle-Verhoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment and I think it's good to go!

👍 for docs updates
👍 for great set of tests

@@ -30,6 +30,11 @@ def trace_prerun(*args, **kwargs):
# propagate the `Span` in the current task Context
service = config.celery['worker_service_name']
span = pin.tracer.trace(c.WORKER_ROOT_SPAN, service=service, resource=task.name, span_type=SpanTypes.WORKER)
# set analytics sample rate
span.set_tag(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we only want to set the tag here if config.celery.get_analytics_sample_rate() is not None. 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I was just borrowing how the other integrations do it, like boto for instance, which just naively sets the return of get_analytics_sample_rate(), which would be None in the case that analytics_enabled is not set or set to false.

Happy to change it if setting a tag to None is an antipattern

def get_analytics_sample_rate(self, use_global_config=False):
"""
Returns analytics sample rate but only when integration-specific
analytics configuration is enabled with optional override with global
configuration
"""
if self._is_analytics_enabled(use_global_config):
analytics_sample_rate = getattr(self, 'analytics_sample_rate', None)
# return True if attribute is None or attribute not found
if analytics_sample_rate is None:
return True
# otherwise return rate
return analytics_sample_rate
# Use `None` as a way to say that it was not defined,
# `False` would mean `0` which is a different thing
return None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm fair enough. However I think it is an antipattern since TMU None will get stringified to 'None' which is meaningless for this tag 😛.

I'm sure it gets handled and dropped somewhere in the backend but I think it's cleaner if we just don't set it. Less data to process and pass around 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, appreciate the clarification, updated.

Copy link
Member

@Kyle-Verhoog Kyle-Verhoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!! 😄

@Kyle-Verhoog Kyle-Verhoog merged commit 92766da into DataDog:master Apr 28, 2020
@Kyle-Verhoog Kyle-Verhoog added this to the 0.38.0 milestone May 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants