Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: pytest deadlock when using gevent [backport 2.8] #9166

Merged
merged 2 commits into from
May 7, 2024

Conversation

brettlangdon
Copy link
Member

Backport 9e862f9 from #9141 to 2.8.

Fixes #8281

This fix resolves an issue when using pytest + gevent where the telemetry writer was eager initialized by pytest entrypoints loading of our plugin causing a potential dead lock.

The specific case that we targeted solving with this PR:

import ddtrace

import gevent.monkey

gevent.monkey.patch_all()

def test_thing():
    pass
$ pytest test.py
$ pytest -p ddtrace -p ddtrace.pytest_bdd -p ddtrace.pytest_benchmark test.py
$ pytest -p no:ddtrace -p no:ddtrace.pytest_bdd -p no:ddtrace.pytest_benchmark test.py

Previously would dead-lock, but now will execute without problem.

We no longer get telemetry enabled and sending data just from import ddtrace.

Moving telemetry writer starting to ddtrace.bootstrap.preload will mean that any unhandled exceptions that occur before the telemetry writer can be started will not get reported.

  • Change(s) are motivated and described in the PR description

  • Testing strategy is described if automated tests are not included in the PR

  • Risks are described (performance impact, potential for breakage, maintainability)

  • Change is maintainable (easy to change, telemetry, documentation)

  • Library release note guidelines are followed or label changelog/no-changelog is set

  • Documentation is included (in-code, generated user docs, public corp docs)

  • Backport labels are set (if applicable)

  • If this PR changes the public interface, I've notified @DataDog/apm-tees.

  • Title is accurate

  • All changes are related to the pull request's stated goal

  • Description motivates each change

  • Avoids breaking API changes

  • Testing strategy adequately addresses listed risks

  • Change is maintainable (easy to change, telemetry, documentation)

  • Release note makes sense to a user of the library

  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment

  • Backport labels are set in a manner that is consistent with the release branch maintenance
    policy

Fixes #8281

This fix resolves an issue when using ``pytest`` + ``gevent`` where the
telemetry writer was eager initialized by ``pytest`` entrypoints loading
of our plugin causing a potential dead lock.

The specific case that we targeted solving with this PR:

```python
import ddtrace

import gevent.monkey

gevent.monkey.patch_all()

def test_thing():
    pass
```

```shell
$ pytest test.py
$ pytest -p ddtrace -p ddtrace.pytest_bdd -p ddtrace.pytest_benchmark test.py
$ pytest -p no:ddtrace -p no:ddtrace.pytest_bdd -p no:ddtrace.pytest_benchmark test.py
```

Previously would dead-lock, but now will execute without problem.

We no longer get telemetry enabled and sending data just from `import
ddtrace`.

Moving telemetry writer starting to `ddtrace.bootstrap.preload` will
mean that any unhandled exceptions that occur before the telemetry
writer can be started will not get reported.

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

(cherry picked from commit 9e862f9)
@datadog-dd-trace-py-rkomorn
Copy link

datadog-dd-trace-py-rkomorn bot commented May 6, 2024

Datadog Report

Branch report: backport-9141-to-2.8
Commit report: f83a6a1
Test service: dd-trace-py

✅ 0 Failed, 171574 Passed, 1095 Skipped, 11h 42m 15.71s Total duration (25m 59.15s time saved)
❄️ 1 New Flaky

New Flaky Tests (1)

  • test_add_event_disabled_writer - test_writer.py - Last Failure

    Expand for error
     assert 1 == 0
      +  where 1 = len([{'body': {'api_version': 'v2', 'application': {'env': '', 'language_name': 'python', 'language_version': '3.11.3', 'p...brary-Language': 'python', ...}, 'method': 'POST', 'url': 'http://localhost:9126/telemetry/proxy/api/v2/apmtelemetry'}])
      +    where [{'body': {'api_version': 'v2', 'application': {'env': '', 'language_name': 'python', 'language_version': '3.11.3', 'p...brary-Language': 'python', ...}, 'method': 'POST', 'url': 'http://localhost:9126/telemetry/proxy/api/v2/apmtelemetry'}] = <bound method TelemetryTestSession.get_requests of <tests.conftest.TelemetryTestSession object at 0x7fdc65e34590>>()
      +      where <bound method TelemetryTestSession.get_requests of <tests.conftest.TelemetryTestSession object at 0x7fdc65e34590>> = <tests.conftest.TelemetryTestSession object at 0x7fdc65e34590>.get_requests
    

@pr-commenter
Copy link

pr-commenter bot commented May 6, 2024

Benchmarks

Benchmark execution time: 2024-05-06 18:32:09

Comparing candidate commit f83a6a1 in PR branch backport-9141-to-2.8 with baseline commit d8a5718 in branch 2.8.

Found 13 performance improvements and 23 performance regressions! Performance is the same for 165 metrics, 9 unstable metrics.

scenario:coreapiscenario-context_with_data_no_listeners

  • 🟥 max_rss_usage [+836.434KB; +909.691KB] or [+3.956%; +4.303%]

scenario:coreapiscenario-context_with_data_only_all_listeners

  • 🟩 max_rss_usage [-564.805KB; -495.239KB] or [-2.582%; -2.264%]

scenario:coreapiscenario-core_dispatch_listeners

  • 🟥 max_rss_usage [+851.569KB; +919.132KB] or [+4.029%; +4.349%]

scenario:coreapiscenario-core_dispatch_no_listeners

  • 🟥 max_rss_usage [+885.094KB; +950.324KB] or [+4.189%; +4.498%]

scenario:coreapiscenario-core_dispatch_only_all_listeners

  • 🟥 max_rss_usage [+854.248KB; +916.453KB] or [+4.044%; +4.338%]

scenario:coreapiscenario-core_dispatch_with_results_listeners

  • 🟩 max_rss_usage [-545.390KB; -478.610KB] or [-2.495%; -2.190%]

scenario:coreapiscenario-core_dispatch_with_results_no_listeners

  • 🟩 max_rss_usage [-570.260KB; -511.903KB] or [-2.606%; -2.339%]

scenario:coreapiscenario-get_item_exists

  • 🟥 max_rss_usage [+671.212KB; +739.041KB] or [+3.151%; +3.470%]

scenario:coreapiscenario-get_item_missing

  • 🟥 max_rss_usage [+683.915KB; +751.733KB] or [+3.213%; +3.531%]

scenario:coreapiscenario-set_item

  • 🟥 max_rss_usage [+694.810KB; +761.728KB] or [+3.261%; +3.575%]

scenario:flasksimple-appsec-telemetry

  • 🟥 execution_time [+204.582µs; +258.818µs] or [+3.268%; +4.134%]

scenario:httppropagationextract-b3_headers

  • 🟥 max_rss_usage [+688.733KB; +743.638KB] or [+3.238%; +3.496%]

scenario:httppropagationextract-full_t_id_datadog_headers

  • 🟥 max_rss_usage [+678.831KB; +907.550KB] or [+3.208%; +4.289%]

scenario:httppropagationextract-invalid_trace_id_header

  • 🟥 max_rss_usage [+651.816KB; +723.211KB] or [+3.059%; +3.394%]

scenario:httppropagationextract-large_header_no_matches

  • 🟩 max_rss_usage [-535.758KB; -469.401KB] or [-2.448%; -2.145%]

scenario:httppropagationextract-medium_header_no_matches

  • 🟩 max_rss_usage [-504.587KB; -439.131KB] or [-2.310%; -2.010%]

scenario:httppropagationextract-none_propagation_style

  • 🟩 max_rss_usage [-729.968KB; -443.126KB] or [-3.361%; -2.040%]

scenario:httppropagationextract-tracecontext_headers

  • 🟥 max_rss_usage [+671.758KB; +874.892KB] or [+3.165%; +4.122%]

scenario:httppropagationextract-wsgi_invalid_span_id_header

  • 🟥 max_rss_usage [+451.958KB; +773.565KB] or [+2.142%; +3.667%]

scenario:httppropagationextract-wsgi_invalid_trace_id_header

  • 🟥 max_rss_usage [+632.693KB; +695.640KB] or [+2.970%; +3.266%]

scenario:httppropagationextract-wsgi_large_header_no_matches

  • 🟥 max_rss_usage [+733.089KB; +801.683KB] or [+3.459%; +3.783%]

scenario:httppropagationextract-wsgi_medium_header_no_matches

  • 🟥 max_rss_usage [+766.115KB; +844.841KB] or [+3.619%; +3.991%]

scenario:httppropagationinject-ids_only

  • 🟥 max_rss_usage [+677.963KB; +722.869KB] or [+3.190%; +3.401%]

scenario:httppropagationinject-with_tags_invalid

  • 🟩 max_rss_usage [-596.717KB; -552.211KB] or [-2.734%; -2.530%]

scenario:otelspan-start-finish

  • 🟩 max_rss_usage [-622.051KB; -553.911KB] or [-2.686%; -2.392%]

scenario:sethttpmeta-all-disabled

  • 🟥 max_rss_usage [+468.000KB; +767.763KB] or [+2.162%; +3.547%]

scenario:sethttpmeta-all-enabled

  • 🟥 max_rss_usage [+653.607KB; +818.496KB] or [+3.039%; +3.806%]

scenario:sethttpmeta-obfuscation-regular-case-implicit-query

  • 🟩 max_rss_usage [-689.259KB; -618.594KB] or [-3.078%; -2.762%]

scenario:sethttpmeta-obfuscation-send-querystring-disabled

  • 🟩 max_rss_usage [-625.174KB; -558.160KB] or [-2.793%; -2.494%]

scenario:sethttpmeta-obfuscation-worst-case-explicit-query

  • 🟩 max_rss_usage [-595.225KB; -538.957KB] or [-2.660%; -2.409%]

scenario:sethttpmeta-obfuscation-worst-case-implicit-query

  • 🟩 max_rss_usage [-622.537KB; -553.015KB] or [-2.781%; -2.471%]

scenario:sethttpmeta-useragentvariant_not_exists_1

  • 🟥 max_rss_usage [+816.702KB; +1051.893KB] or [+3.848%; +4.956%]

scenario:span-add-tags

  • 🟥 max_rss_usage [+1.894MB; +2.061MB] or [+6.266%; +6.817%]

scenario:span-start-finish

  • 🟩 max_rss_usage [-631.513KB; -560.833KB] or [-2.882%; -2.559%]

scenario:span-start-finish-telemetry

  • 🟥 max_rss_usage [+851.707KB; +1024.261KB] or [+4.008%; +4.820%]

scenario:span-start-finish-traceid128

  • 🟥 max_rss_usage [+761.022KB; +817.577KB] or [+3.585%; +3.851%]

@brettlangdon brettlangdon enabled auto-merge (squash) May 6, 2024 18:43
@brettlangdon brettlangdon merged commit a5ee561 into 2.8 May 7, 2024
174 of 190 checks passed
@brettlangdon brettlangdon deleted the backport-9141-to-2.8 branch May 7, 2024 11:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants