inferred service for auto injection #9376

Open
wants to merge 36 commits into
base: main

Conversation

wconti27
Contributor

Description

This PR adds another layer to service naming. Instead of defaulting to unnamed-python-service when neither DD_SERVICE nor a programmatic service name is provided, this change attempts to infer one from a package or module name.

The algorithm:

  • First, it tries to find a package name in a metadata file (setup.py, pyproject.toml), starting the search from the directory of the entrypoint used to start the program.
  • If no package name is found, it uses the name of the entrypoint script or module used to start the process. If the entrypoint cannot be resolved, the global service defaults to unnamed-python-service.
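
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the upward directory walk, the `max_levels` limit, and the regex-based name extraction are all assumptions made for the example (a real implementation would parse the metadata files properly).

```python
import re
from pathlib import Path
from typing import Optional

DEFAULT_SERVICE_NAME = "unnamed-python-service"
METADATA_FILES = ("pyproject.toml", "setup.py")

# Assumption: a narrow pattern for `name = "pkg"` / `name="pkg"`.
# It stands in for real TOML / setup.py parsing.
_NAME_RE = re.compile(r"""^\s*name\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def find_package_name(start_dir: Path, max_levels: int = 2) -> Optional[str]:
    """Look for a package name in start_dir and up to max_levels parents."""
    directories = [start_dir, *start_dir.parents][: max_levels + 1]
    for directory in directories:
        for filename in METADATA_FILES:
            candidate = directory / filename
            if not candidate.is_file():
                continue
            try:
                text = candidate.read_text(encoding="utf-8")
            except (OSError, UnicodeDecodeError):
                continue  # unreadable metadata: keep searching
            match = _NAME_RE.search(text)
            if match:
                return match.group(1)
    return None


def infer_service_name(entrypoint: Optional[str], max_levels: int = 2) -> str:
    """Package name first, then entrypoint name, then the default."""
    if not entrypoint:
        # The entrypoint could not be resolved.
        return DEFAULT_SERVICE_NAME
    path = Path(entrypoint).resolve()
    package = find_package_name(path.parent, max_levels)
    if package:
        return package
    return path.stem or DEFAULT_SERVICE_NAME
```

A caller would typically pass the path used to start the process (for example `sys.argv[0]`). Limiting how far the search walks upward is a design choice in this sketch to avoid picking up an unrelated metadata file from a distant parent directory.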

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@wconti27 wconti27 requested review from a team as code owners May 24, 2024 13:11
hatch.toml (outdated, resolved)
pyproject.toml (outdated, resolved)
tests/internal/test_get_module.py (outdated, resolved)
from unittest.mock import MagicMock, patch, mock_open
from pathlib import Path
import psutil
import toml
Collaborator

I recommend vendoring the toml library instead of adding it as an installation dependency. It's important to limit the number of installation dependencies to keep the customer onboarding process as simple as it can be.

ddtrace/settings/config.py (outdated, resolved)
@@ -371,12 +372,15 @@ def int_service(pin, int_config, default=None):
         return cast(str, int_config.service_name)

     global_service = int_config.global_config._get_service()
-    if global_service:
+    if global_service and global_service != DEFAULT_SERVICE_NAME:
Collaborator

This logic seems like it might be more complex than necessary. I think using DEFAULT_SERVICE_NAME as a sentinel in this way is prone to future bugs. Imagine what happens when int_config.global_config._get_service() changes to default to something other than DEFAULT_SERVICE_NAME. This code would start to behave in a confusing way.
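
One way to avoid comparing against a sentinel string, sketched here purely as an illustration of the concern above: carry an explicit flag alongside the name that records whether the user set it. The `ResolvedService` type and both functions are hypothetical and are not part of ddtrace. The fallback used when no integration default exists is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_SERVICE_NAME = "unnamed-python-service"


@dataclass(frozen=True)
class ResolvedService:
    """Hypothetical result type: a service name plus where it came from."""

    name: str
    user_defined: bool  # True only when the user set it (e.g. DD_SERVICE)


def resolve_global_service(
    user_value: Optional[str], inferred: Optional[str]
) -> ResolvedService:
    """Prefer the user's value, then an inferred name, then the default."""
    if user_value:
        return ResolvedService(user_value, user_defined=True)
    return ResolvedService(inferred or DEFAULT_SERVICE_NAME, user_defined=False)


def int_service_sketch(
    integration_service: Optional[str],
    global_service: ResolvedService,
    default: Optional[str] = None,
) -> str:
    """Pick an integration's service name without a sentinel comparison."""
    if integration_service:
        return integration_service
    if global_service.user_defined:
        return global_service.name
    # Inferred or default global name: let the integration default win,
    # falling back to the global name only when there is no default.
    return default if default is not None else global_service.name
```

With this shape, changing what the global service defaults to does not alter the branch taken, because the decision depends on `user_defined` rather than on the value of the name.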

@datadog-dd-trace-py-rkomorn

Datadog Report

Branch report: conti/inferred-service-autoinjection
Commit report: cbc0a71
Test service: dd-trace-py

❌ 874 Failed (6 Known Flaky), 130072 Passed, 42839 Skipped, 6h 9m 36.12s Total duration (3h 42m 30.8s time saved)
❄️ 6 New Flaky
⌛ 1 Performance Regression

❌ Failed Tests (874)

This report shows up to 5 failed tests.

  • test_200_request - test_aiohttp_client.py - Details

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.aiohttp.test_aiohttp_client.test_200_request'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.aiohttp.test_aiohttp_client.test_200_request.json
         - Stats File: /snapshots/tests.contrib.aiohttp.test_aiohttp_client.test_200_request_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'aiohttp.request' (2 spans):
           At snapshot compare of span 'aiohttp.request' at position 1 in trace:
            - Expected span:
     ...
    
  • test_200_request_post - test_aiohttp_client.py - Details

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.aiohttp.test_aiohttp_client.test_200_request_post'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.aiohttp.test_aiohttp_client.test_200_request_post.json
         - Stats File: /snapshots/tests.contrib.aiohttp.test_aiohttp_client.test_200_request_post_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'aiohttp.request' (2 spans):
           At snapshot compare of span 'aiohttp.request' at position 1 in trace:
            - Expected span:
     ...
    

New Flaky Tests (6)

  • test_schematization[schema_tuples0] - test_fastapi.py - Last Failure

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples0]_9'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples0]_9.json
         - Stats File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples0]_9_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'http.request' (4 spans):
           At snapshot compare of span 'http.request' at position 1 in trace:
            - Expected span:
     ...
    
  • test_schematization[schema_tuples1] - test_fastapi.py - Last Failure

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples1]_9'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples1]_9.json
         - Stats File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples1]_9_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'http.request' (4 spans):
           At snapshot compare of span 'http.request' at position 1 in trace:
            - Expected span:
     ...
    
  • test_schematization[schema_tuples2] - test_fastapi.py - Last Failure

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples2]_9'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples2]_9.json
         - Stats File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples2]_9_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'http.client.request' (4 spans):
           At snapshot compare of span 'http.client.request' at position 1 in trace:
            - Expected span:
     ...
    
  • test_schematization[schema_tuples3] - test_fastapi.py - Last Failure

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples3]_9'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples3]_9.json
         - Stats File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples3]_9_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'http.request' (4 spans):
           At snapshot compare of span 'http.request' at position 1 in trace:
            - Expected span:
     ...
    
  • test_schematization[schema_tuples4] - test_fastapi.py - Last Failure

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples4]_9'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples4]_9.json
         - Stats File: /snapshots/tests.contrib.fastapi.test_fastapi.test_schematization[schema_tuples4]_9_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'http.request' (4 spans):
           At snapshot compare of span 'http.request' at position 1 in trace:
            - Expected span:
     ...
    

⌛ Performance Regressions vs Default Branch (1)

  • test_commit_with_consume_single_message - test_kafka.py 1m 24.15s (+1m 21.06s, +2626%) - Details

@tabgok tabgok self-assigned this Jun 20, 2024
@tabgok tabgok force-pushed the conti/inferred-service-autoinjection branch from 3ad0c21 to db6eb5a Compare June 20, 2024 13:27
@datadog-dd-trace-py-rkomorn

datadog-dd-trace-py-rkomorn bot commented Jun 20, 2024

Datadog Report

Branch report: conti/inferred-service-autoinjection
Commit report: 2872776
Test service: dd-trace-py

❌ 1800 Failed (14 Known Flaky), 161852 Passed, 1452 Skipped, 7h 19m 55.23s Total duration (3m 29.26s time saved)
❄️ 65 New Flaky

❌ Failed Tests (1800)

This report shows up to 5 failed tests.

  • test_flask_ipblock_match_403[flask_appsec_good_rules_env] - test_appsec_flask_snapshot.py - Details

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.flask.test_appsec_flask_snapshot.test_flask_ipblock_match_403[flask_appsec_good_rules_env]_220'):
         - Directory: /snapshots
         - CI mode: 0
         - Trace File: /snapshots/tests.contrib.flask.test_appsec_flask_snapshot.test_flask_ipblock_match_403[flask_appsec_good_rules_env]_220.json
         - Stats File: /snapshots/tests.contrib.flask.test_appsec_flask_snapshot.test_flask_ipblock_match_403[flask_appsec_good_rules_env]_220_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'flask.request' (2 spans):
           At snapshot compare of span 'flask.request' at position 1 in trace:
            - Expected span:
     ...
    
  • test_flask_ipblock_match_403_json[flask_appsec_good_rules_env] - test_appsec_flask_snapshot.py - Details

    Expand for error
     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.contrib.flask.test_appsec_flask_snapshot.test_flask_ipblock_match_403_json[flask_appsec_good_rules_env]_220'):
         - Directory: /snapshots
         - CI mode: 0
         - Trace File: /snapshots/tests.contrib.flask.test_appsec_flask_snapshot.test_flask_ipblock_match_403_json[flask_appsec_good_rules_env]_220.json
         - Stats File: /snapshots/tests.contrib.flask.test_appsec_flask_snapshot.test_flask_ipblock_match_403_json[flask_appsec_good_rules_env]_220_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'flask.request' (2 spans):
           At snapshot compare of span 'flask.request' at position 1 in trace:
            - Expected span:
     ...
    

New Flaky Tests (65)

  • test_custom_logging_injection - test_correlation_log_context.py - Last Failure

    Expand for error
     assert {'env': '', '...7975066', ...} == {'env': '', '...7975066', ...}
       Omitting 4 identical items, use -vv to show
       Differing items:
       {'service': 'ddtrace'} != {'service': ''}
       Full diff:
         {
          'env': '',
       -  'service': '',
       +  'service': 'ddtrace',
       ?              +++++++
     ...
    
  • test_custom_logging_injection - test_correlation_log_context.py - Last Failure

    Expand for error
     assert {'env': '', '...8496628', ...} == {'env': '', '...8496628', ...}
       Omitting 4 identical items, use -vv to show
       Differing items:
       {'service': 'ddtrace'} != {'service': ''}
       Full diff:
         {
          'env': '',
       -  'service': '',
       +  'service': 'ddtrace',
       ?              +++++++
     ...
    
  • test_get_log_correlation_context_no_active_span - test_correlation_log_context.py - Last Failure

    Expand for error
     assert {'env': '', '...id': '0', ...} == {'env': '', '...id': '0', ...}
       Omitting 4 identical items, use -vv to show
       Differing items:
       {'service': 'ddtrace'} != {'service': ''}
       Full diff:
         {
          'env': '',
       -  'service': '',
       +  'service': 'ddtrace',
       ?              +++++++
     ...
    
  • test_child - test_middleware.py

@pr-commenter

pr-commenter bot commented Jun 20, 2024

Benchmarks

Benchmark execution time: 2024-06-25 18:22:13

Comparing candidate commit 2872776 in PR branch conti/inferred-service-autoinjection with baseline commit 053d891 in branch main.

Found 2 performance improvements and 1 performance regression! Performance is the same for 218 metrics, 9 unstable metrics.

scenario:otelspan-start

  • 🟩 max_rss_usage [-7.488MB; -7.413MB] or [-16.367%; -16.203%]

scenario:span-add-metrics

  • 🟩 max_rss_usage [-6.163MB; -6.091MB] or [-9.435%; -9.324%]

scenario:span-add-tags

  • 🟥 max_rss_usage [+3.654MB; +4.277MB] or [+11.621%; +13.602%]

@tabgok tabgok force-pushed the conti/inferred-service-autoinjection branch from 7ec5828 to cca3c5b Compare June 24, 2024 15:47
@tabgok tabgok requested a review from a team as a code owner June 24, 2024 17:58
@tabgok
Contributor

tabgok commented Jun 24, 2024

Questions:

  1. Python internal services. Many Python integrations use the component name as the service name (e.g. "aiohttp-web") unless the global service is set. I have set these to keep using "aiohttp-web" when the inferred service is used (this causes base_service to be the running application).
  2. Breaking changes. These changes will break anyone who has set up alarms, etc. on "unnamed-python-service". Is that OK?

@tabgok tabgok requested a review from a team as a code owner June 25, 2024 16:28

3 participants