Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: tag runtime metrics for only internal services #1311

Merged

Conversation

majorgreys
Copy link
Collaborator

The tracer currently keeps track of the set of services of the spans that have been started. This set is then used to set the service tag on any metrics written to dogstatsd.

This PR further refines this set such that it only includes services of those spans that are "internal" to the Python runtime by filtering on span_type.

@majorgreys majorgreys requested a review from a team as a code owner March 26, 2020 22:08
@@ -446,7 +446,7 @@ def start_span(self, name, child_of=None, service=None, resource=None, span_type
self._check_new_process()

# update set of services handled by tracer
if service and service not in self._services:
if service and service not in self._services and self._is_span_internal(span):
self._services.add(service)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this just be moved up to L419?

if self._runtime_worker and self._is_span_internal(span):
    if service and service not in self._services:
        self._services.add(service)

not sure what benefits we have for having it down here, but just a thought, not a required change.

self.assertRegex(gauge, 'env:tests.dog')
self.assertRegex(gauge, 'lang_interpreter:')
self.assertRegex(gauge, 'lang_version:')
self.assertRegex(gauge, 'lang:')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be a known value, no? lang:python ?

same with lang_interpreter:

Kyle-Verhoog
Kyle-Verhoog previously approved these changes Mar 27, 2020
Kyle-Verhoog
Kyle-Verhoog previously approved these changes Mar 30, 2020

@staticmethod
def _is_span_internal(span):
return not span.span_type or span.span_type in _INTERNAL_APPLICATION_SPAN_TYPES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you expect None to be seen as internal I'd rather write this explicitely span.span_type is None

@majorgreys majorgreys merged commit 3911e64 into master Apr 2, 2020
@majorgreys majorgreys deleted the majorgreys/runtime-metrics-only-internal-services branch April 2, 2020 18:10
Kyle-Verhoog added a commit that referenced this pull request Apr 17, 2020
* chore: update six to 1.14.0 (#1313)

* chore(ci): test only with grpcio wheels (#1310)

* chore(ci): test only with grpcio wheels

* fix

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* fix(profiling/pprof): provide timestamp to export() (#1312)

Currently, the Exporters have to guess what the time range is for the provided
events.
This is not optimal in cases where we don't have any events — even if it
_should_ not happen in practice.

By forcing the caller of the exporter to provide the collection time range, we
can generate data that are actually more precise.

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* ci: import our Dockerfile in the repository (#1227)

That should make it easier to manage our CI image.

Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* fmt: black celery tests (#1314)

* celery: black fmt tests

* celery: use self.assert_is_measured

Co-authored-by: Julien Danjou <julien@danjou.info>

* Add uds_path to ddtrace.opentracer.Tracer (#1275)

Fixes #1274

* logging: only use global setting for env, service and version (#1317)

* logging: update documentation (#1318)

* chore(docs): consolidate environment variable configuration (#1320)

* docs: consolidate environment variable configuration

* Update docs/configuration.rst

Co-Authored-By: Kyle Verhoog <kyle@verhoog.ca>

* remove env vars from usage

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* logging: fix typo (#1323)

* fix(tests/profiling): use a string with setenv, not a int (#1321)

* fix(profiling/http): converts nanoseconds timestamp to seconds (#1325)

* fix(profiling): ignore failure on shutdown (#1327)

On interpreter shutdown, we can see this kind of errors:

Traceback (most recent call last):
  File "ddtrace/profiling/_periodic.py", line 104, in run
    del threading._active[self._tident]
TypeError: 'NoneType' object does not support item deletion

This changes the code to use the same try/except mechanism in the above block
to avoid logging those errors on shutdown.

* aiobotocore: pin to <1.0 (#1330)

* aiobotocore: pin to <1.0

* aiobotocore: don't test 0.12 with Python 3.5

* Update error event handler name within SQLAlchemy engine (#1324)

* Updating event handler name

Update method name for handling db errors in sqlalchemy

* Listen for dbapi events

* Accommodating difference in positional args among sqlalchemy versions

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* fix: tag runtime metrics for only internal services (#1311)

* fix: tag runtime metrics only for internal services

* only add internal services

* handle case where span type not set as internal

* remove deprecated functions

* check lang

* revert to deprecated regexp asserts

* docs(opentracing): add celery example (#1329)

* Document how to propagate the tracer across celery tasks

* docs(opentracing): add celery distributed tracing example

Co-authored-by: Lie Ryan <lie.1296@gmail.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>

* fix(profiling): allow to override options after load (#1332)

Using default=os.getenv in attr attributes will make the default unchangeable
after the library is loaded.

Most options already leverage ddtrace.profiling._attr.from_env to avoid
this, but not the HTTP exporter.

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* ci: allow to use latest pytest version (#1326)

* feat(profiling): add the pip install when recompilation is needed (#1334)

If an incompatible ABI is detected, users are usually lost and don't know what
to do.
Indicate the command to use to install and compile this module.

* tracer: add support for cross process tracing (#1331)

* feat(profiling): add support for DD_VERSION (#1340)

* feat(profiling): add support for DD_SERVICE (#1339)

* fix: avoid context deadlock with logs injection enabled (#1338)

* don't try to get current span on patched internal logger

* run black

* add span directly to log record

* add test

* run black

* contextually group test a bit better

* don't add issue number

Co-Authored-By: Tahir H. Butt <tahir@tahirbutt.com>

* rename logger_arg to logger_override 1/2

Co-Authored-By: Tahir H. Butt <tahir@tahirbutt.com>

* rename logger_arg to logger_override 2/2

Co-Authored-By: Tahir H. Butt <tahir@tahirbutt.com>

* fix arg name

Co-authored-by: Tahir H. Butt <tahir@tahirbutt.com>

* fix(profiling): use formats.asbool to convert bool from env (#1342)

Using the `bool` primitive will only work with 0 or 1, whereas we want to
support true/false strings.

* feat(profiling): add support for DD_ENV environment variable (#1343)

* span: handle non-string tag keys (#1345)

* span: handle non-string keys

* span: use six.string_types for type checking

* fix(profiling): pthread_t is defined as unsigned long, not int (#1347)

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* Allow Profiler to finish upload data in the background when stopped (#1322)

* feat(profiling/periodic): allow to call a shutdown function

* feat(profiling): do not reset event list of no exporter configured

* feat(profiling): allow to stop/join collectors in 2 steps

This should make the Profiler slightly faster to stop.

* refactor: introduce Service class

This introduces a new class Service that describes a Service that can be
started, stopped and joined.

The Service tracks its own status in order to avoid starting twice the same
service.

The collectors are now inheriting from this common class as they all provide a
startable/stoppable service.

* refactor(scheduler): leverage PeriodicService class

This changes the Scheduler class to leverage the PeriodicService class.

It also makes sure the Scheduler always flush on exit.

The Profiler is now responsible for joining or not the Scheduler if it wants to
block on the last upload before returning from its `stop` method.

* feat(profiling): add support for DD_API_KEY (#1350)

* feat(profiling): add support for DD_SITE (#1351)

* tracer: stop previous writer if a new one is created (#1356)

* Updates configuration docs (#1360)

This updates the configuration docs, calling out that the DD_SERVICE, DD_ENV, and DD_VERSION env vars were added in v0.36.0

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>

* feat(profiling): allow to pass service_name to HTTP exporter (#1358)

* Add config to omit `django.user.name` tag from request root span (#1361)

* Add `user_name_enabled` config to Django contrib

* Elaborate on user_name_enabled comment

* Change test to only check for user name tag instead of all meta tags

* Check that "django.user.name" key does not exist in meta

* Change `user_name_enabled` to `include_user_name`

* feat(profiling): replace http_client by urllib for uploading (#1359)

This is slightly higher level than http_client and supports features such as
the `HTTP_PROXY` environment variable.

* test(profiling): check for thread presence rather than number of threads (#1357)

This test would fail occasionally if a thread from another test is still
running, depending on the order the tests are run.

This replaces check with a more accurate one.

* Fix task context management for asyncio in Python <3.7 (#1352)

* fix for context propagation when first span is closed

* fix tests and merge

* fix for context propagation when first span is closed

* fix tests and merge

* test without context changes

* asyncio: use context.clone()

Co-authored-by: Alexander Mohr <thehesiod@hotmail.com>
Co-authored-by: Alexander Mohr <amohr@farmersbusinessnetwork.com>

* fix: make C extensions mandatory (#1333)

If no C compiler is available, the installation do not fail and only a warning
is being print. This is wrong as the profiler won't work — at all.

Since we need a C compiler to build the profiler extensions that are part of
the wheel anyway,  let's not make wrapt or psutil optional.

Fixes #1328

* writer: allow configuration of queue maxsize via env var (#1364)

* writer: allow configuration of queue maxsize via env var

* rename to DD_TRACE_MAX_TPS

* cleanup

Co-authored-by: Julien Danjou <julien@danjou.info>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>
Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Sarah Harvey <worldwise001@users.noreply.github.com>
Co-authored-by: Robert Townley <Me@RobertTownley.com>
Co-authored-by: Lie Ryan <lie.1296@gmail.com>
Co-authored-by: Zach Hammer <zhammer@seatgeek.com>
Co-authored-by: Tahir H. Butt <tahir@tahirbutt.com>
Co-authored-by: Scott Burns <scott.s.burns@gmail.com>
Co-authored-by: Sebastian Coetzee <mail@sebastiancoetzee.com>
@Kyle-Verhoog Kyle-Verhoog added this to the 0.37.0 milestone May 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants