fix: pytest deadlock when using gevent [backport 2.8] #9166

brettlangdon · 2024-05-06T16:50:45Z

Backport 9e862f9 from #9141 to 2.8.

This fix resolves an issue when using pytest + gevent where the telemetry writer was eager initialized by pytest entrypoints loading of our plugin causing a potential dead lock.

The specific case that we targeted solving with this PR:

import ddtrace

import gevent.monkey

gevent.monkey.patch_all()

def test_thing():
    pass

$ pytest test.py
$ pytest -p ddtrace -p ddtrace.pytest_bdd -p ddtrace.pytest_benchmark test.py
$ pytest -p no:ddtrace -p no:ddtrace.pytest_bdd -p no:ddtrace.pytest_benchmark test.py

Previously would dead-lock, but now will execute without problem.

We no longer get telemetry enabled and sending data just from import ddtrace.

Moving telemetry writer starting to ddtrace.bootstrap.preload will mean that any unhandled exceptions that occur before the telemetry writer can be started will not get reported.

Fixes #8281 This fix resolves an issue when using ``pytest`` + ``gevent`` where the telemetry writer was eager initialized by ``pytest`` entrypoints loading of our plugin causing a potential dead lock. The specific case that we targeted solving with this PR: ```python import ddtrace import gevent.monkey gevent.monkey.patch_all() def test_thing(): pass ``` ```shell $ pytest test.py $ pytest -p ddtrace -p ddtrace.pytest_bdd -p ddtrace.pytest_benchmark test.py $ pytest -p no:ddtrace -p no:ddtrace.pytest_bdd -p no:ddtrace.pytest_benchmark test.py ``` Previously would dead-lock, but now will execute without problem. We no longer get telemetry enabled and sending data just from `import ddtrace`. Moving telemetry writer starting to `ddtrace.bootstrap.preload` will mean that any unhandled exceptions that occur before the telemetry writer can be started will not get reported. - [x] Change(s) are motivated and described in the PR description - [x] Testing strategy is described if automated tests are not included in the PR - [x] Risks are described (performance impact, potential for breakage, maintainability) - [x] Change is maintainable (easy to change, telemetry, documentation) - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)) - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) - [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`. - [x] Title is accurate - [x] All changes are related to the pull request's stated goal - [x] Description motivates each change - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - [x] Testing strategy adequately addresses listed risks - [x] Change is maintainable (easy to change, telemetry, documentation) - [x] Release note makes sense to a user of the library - [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) (cherry picked from commit 9e862f9)

datadog-dd-trace-py-rkomorn · 2024-05-06T17:51:51Z

Datadog Report

Branch report: backport-9141-to-2.8
Commit report: f83a6a1
Test service: dd-trace-py

✅ 0 Failed, 171574 Passed, 1095 Skipped, 11h 42m 15.71s Total duration (25m 59.15s time saved)
❄️ 1 New Flaky

New Flaky Tests (1)

test_add_event_disabled_writer - test_writer.py - Last Failure

Expand for error

 assert 1 == 0
  +  where 1 = len([{'body': {'api_version': 'v2', 'application': {'env': '', 'language_name': 'python', 'language_version': '3.11.3', 'p...brary-Language': 'python', ...}, 'method': 'POST', 'url': 'http://localhost:9126/telemetry/proxy/api/v2/apmtelemetry'}])
  +    where [{'body': {'api_version': 'v2', 'application': {'env': '', 'language_name': 'python', 'language_version': '3.11.3', 'p...brary-Language': 'python', ...}, 'method': 'POST', 'url': 'http://localhost:9126/telemetry/proxy/api/v2/apmtelemetry'}] = <bound method TelemetryTestSession.get_requests of <tests.conftest.TelemetryTestSession object at 0x7fdc65e34590>>()
  +      where <bound method TelemetryTestSession.get_requests of <tests.conftest.TelemetryTestSession object at 0x7fdc65e34590>> = <tests.conftest.TelemetryTestSession object at 0x7fdc65e34590>.get_requests

pr-commenter · 2024-05-06T17:54:04Z

Benchmarks

Benchmark execution time: 2024-05-06 18:32:09

Comparing candidate commit f83a6a1 in PR branch backport-9141-to-2.8 with baseline commit d8a5718 in branch 2.8.

Found 13 performance improvements and 23 performance regressions! Performance is the same for 165 metrics, 9 unstable metrics.

scenario:coreapiscenario-context_with_data_no_listeners

🟥 max_rss_usage [+836.434KB; +909.691KB] or [+3.956%; +4.303%]

scenario:coreapiscenario-context_with_data_only_all_listeners

🟩 max_rss_usage [-564.805KB; -495.239KB] or [-2.582%; -2.264%]

scenario:coreapiscenario-core_dispatch_listeners

🟥 max_rss_usage [+851.569KB; +919.132KB] or [+4.029%; +4.349%]

scenario:coreapiscenario-core_dispatch_no_listeners

🟥 max_rss_usage [+885.094KB; +950.324KB] or [+4.189%; +4.498%]

scenario:coreapiscenario-core_dispatch_only_all_listeners

🟥 max_rss_usage [+854.248KB; +916.453KB] or [+4.044%; +4.338%]

scenario:coreapiscenario-core_dispatch_with_results_listeners

🟩 max_rss_usage [-545.390KB; -478.610KB] or [-2.495%; -2.190%]

scenario:coreapiscenario-core_dispatch_with_results_no_listeners

🟩 max_rss_usage [-570.260KB; -511.903KB] or [-2.606%; -2.339%]

scenario:coreapiscenario-get_item_exists

🟥 max_rss_usage [+671.212KB; +739.041KB] or [+3.151%; +3.470%]

scenario:coreapiscenario-get_item_missing

🟥 max_rss_usage [+683.915KB; +751.733KB] or [+3.213%; +3.531%]

scenario:coreapiscenario-set_item

🟥 max_rss_usage [+694.810KB; +761.728KB] or [+3.261%; +3.575%]

scenario:flasksimple-appsec-telemetry

🟥 execution_time [+204.582µs; +258.818µs] or [+3.268%; +4.134%]

scenario:httppropagationextract-b3_headers

🟥 max_rss_usage [+688.733KB; +743.638KB] or [+3.238%; +3.496%]

scenario:httppropagationextract-full_t_id_datadog_headers

🟥 max_rss_usage [+678.831KB; +907.550KB] or [+3.208%; +4.289%]

scenario:httppropagationextract-invalid_trace_id_header

🟥 max_rss_usage [+651.816KB; +723.211KB] or [+3.059%; +3.394%]

scenario:httppropagationextract-large_header_no_matches

🟩 max_rss_usage [-535.758KB; -469.401KB] or [-2.448%; -2.145%]

scenario:httppropagationextract-medium_header_no_matches

🟩 max_rss_usage [-504.587KB; -439.131KB] or [-2.310%; -2.010%]

scenario:httppropagationextract-none_propagation_style

🟩 max_rss_usage [-729.968KB; -443.126KB] or [-3.361%; -2.040%]

scenario:httppropagationextract-tracecontext_headers

🟥 max_rss_usage [+671.758KB; +874.892KB] or [+3.165%; +4.122%]

scenario:httppropagationextract-wsgi_invalid_span_id_header

🟥 max_rss_usage [+451.958KB; +773.565KB] or [+2.142%; +3.667%]

scenario:httppropagationextract-wsgi_invalid_trace_id_header

🟥 max_rss_usage [+632.693KB; +695.640KB] or [+2.970%; +3.266%]

scenario:httppropagationextract-wsgi_large_header_no_matches

🟥 max_rss_usage [+733.089KB; +801.683KB] or [+3.459%; +3.783%]

scenario:httppropagationextract-wsgi_medium_header_no_matches

🟥 max_rss_usage [+766.115KB; +844.841KB] or [+3.619%; +3.991%]

scenario:httppropagationinject-ids_only

🟥 max_rss_usage [+677.963KB; +722.869KB] or [+3.190%; +3.401%]

scenario:httppropagationinject-with_tags_invalid

🟩 max_rss_usage [-596.717KB; -552.211KB] or [-2.734%; -2.530%]

scenario:otelspan-start-finish

🟩 max_rss_usage [-622.051KB; -553.911KB] or [-2.686%; -2.392%]

scenario:sethttpmeta-all-disabled

🟥 max_rss_usage [+468.000KB; +767.763KB] or [+2.162%; +3.547%]

scenario:sethttpmeta-all-enabled

🟥 max_rss_usage [+653.607KB; +818.496KB] or [+3.039%; +3.806%]

scenario:sethttpmeta-obfuscation-regular-case-implicit-query

🟩 max_rss_usage [-689.259KB; -618.594KB] or [-3.078%; -2.762%]

scenario:sethttpmeta-obfuscation-send-querystring-disabled

🟩 max_rss_usage [-625.174KB; -558.160KB] or [-2.793%; -2.494%]

scenario:sethttpmeta-obfuscation-worst-case-explicit-query

🟩 max_rss_usage [-595.225KB; -538.957KB] or [-2.660%; -2.409%]

scenario:sethttpmeta-obfuscation-worst-case-implicit-query

🟩 max_rss_usage [-622.537KB; -553.015KB] or [-2.781%; -2.471%]

scenario:sethttpmeta-useragentvariant_not_exists_1

🟥 max_rss_usage [+816.702KB; +1051.893KB] or [+3.848%; +4.956%]

scenario:span-add-tags

🟥 max_rss_usage [+1.894MB; +2.061MB] or [+6.266%; +6.817%]

scenario:span-start-finish

🟩 max_rss_usage [-631.513KB; -560.833KB] or [-2.882%; -2.559%]

scenario:span-start-finish-telemetry

🟥 max_rss_usage [+851.707KB; +1024.261KB] or [+4.008%; +4.820%]

scenario:span-start-finish-traceid128

🟥 max_rss_usage [+761.022KB; +817.577KB] or [+3.585%; +3.851%]

brettlangdon requested review from a team as code owners May 6, 2024 16:50

brettlangdon requested review from erikayasuda, juan-fernandez and juanjux May 6, 2024 16:50

fix missing get_default_telemetry_env helper

f83a6a1

brettlangdon enabled auto-merge (squash) May 6, 2024 18:43

tonyredondo approved these changes May 7, 2024

View reviewed changes

erikayasuda approved these changes May 7, 2024

View reviewed changes

brettlangdon merged commit a5ee561 into 2.8 May 7, 2024
174 of 190 checks passed

brettlangdon deleted the backport-9141-to-2.8 branch May 7, 2024 11:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: pytest deadlock when using gevent [backport 2.8] #9166

fix: pytest deadlock when using gevent [backport 2.8] #9166

brettlangdon commented May 6, 2024

datadog-dd-trace-py-rkomorn bot commented May 6, 2024 •

edited

Loading

pr-commenter bot commented May 6, 2024 •

edited

Loading

fix: pytest deadlock when using gevent [backport 2.8] #9166

fix: pytest deadlock when using gevent [backport 2.8] #9166

Conversation

brettlangdon commented May 6, 2024

datadog-dd-trace-py-rkomorn bot commented May 6, 2024 • edited Loading

Datadog Report

New Flaky Tests (1)

pr-commenter bot commented May 6, 2024 • edited Loading

Benchmarks

scenario:coreapiscenario-context_with_data_no_listeners

scenario:coreapiscenario-context_with_data_only_all_listeners

scenario:coreapiscenario-core_dispatch_listeners

scenario:coreapiscenario-core_dispatch_no_listeners

scenario:coreapiscenario-core_dispatch_only_all_listeners

scenario:coreapiscenario-core_dispatch_with_results_listeners

scenario:coreapiscenario-core_dispatch_with_results_no_listeners

scenario:coreapiscenario-get_item_exists

scenario:coreapiscenario-get_item_missing

scenario:coreapiscenario-set_item

scenario:flasksimple-appsec-telemetry

scenario:httppropagationextract-b3_headers

scenario:httppropagationextract-full_t_id_datadog_headers

scenario:httppropagationextract-invalid_trace_id_header

scenario:httppropagationextract-large_header_no_matches

scenario:httppropagationextract-medium_header_no_matches

scenario:httppropagationextract-none_propagation_style

scenario:httppropagationextract-tracecontext_headers

scenario:httppropagationextract-wsgi_invalid_span_id_header

scenario:httppropagationextract-wsgi_invalid_trace_id_header

scenario:httppropagationextract-wsgi_large_header_no_matches

scenario:httppropagationextract-wsgi_medium_header_no_matches

scenario:httppropagationinject-ids_only

scenario:httppropagationinject-with_tags_invalid

scenario:otelspan-start-finish

scenario:sethttpmeta-all-disabled

scenario:sethttpmeta-all-enabled

scenario:sethttpmeta-obfuscation-regular-case-implicit-query

scenario:sethttpmeta-obfuscation-send-querystring-disabled

scenario:sethttpmeta-obfuscation-worst-case-explicit-query

scenario:sethttpmeta-obfuscation-worst-case-implicit-query

scenario:sethttpmeta-useragentvariant_not_exists_1

scenario:span-add-tags

scenario:span-start-finish

scenario:span-start-finish-telemetry

scenario:span-start-finish-traceid128

datadog-dd-trace-py-rkomorn bot commented May 6, 2024 •

edited

Loading

pr-commenter bot commented May 6, 2024 •

edited

Loading