ci(integrations): makes test_single_trace_too_large less flaky #6119

mabdinur · 2023-06-13T20:29:57Z

Description

Looking at the failing test below we can see test_single_trace_too_large generates over 20MB of trace data but these traces can be sent in separate payloads (yay partial flushing). This PR mocks AgentWriter.flush_queue() to ensure traces are not submitted. With this change the BufferFull exception will be consistently raised and "trace buffer (%s traces %db/%db) cannot fit trace of size %db, dropping (writer status: %s)" will always be logged.

This PR:

Removes time interval env var this not required since the mechanism that submits traces is being mocked.
Removes tracer.shutdown() this test should not submit traces to the agent.
Decreases the total size of the trace from 200,000 to 30,000.
- Generating 200,000 spans is overkill and very slow. We only need to queue 30,000 spans to fill the agent writer's buffer and log the buffer full warning.

Background

This test consistently passes locally but often fails in ci (and mainly on older versions of python). My hypothesis is that the time interval to submit traces to the agent is too short. The test does not have enough time to add the trace chunks to overflow the trace encoder's buffer.

Sample failure: https://app.circleci.com/pipelines/github/DataDog/dd-trace-py/37860/workflows/4da477db-5289-40b1-8375-5d6152f14b8f/jobs/2546383

Testing Strategy

YOLO. If test_single_trace_too_large is still flaky after this PR is merged then I have failed my reviewers and myself.

Checklist

Change(s) are motivated and described in the PR description.
Testing strategy is described if automated tests are not included in the PR.
Risk is outlined (performance impact, potential for breakage, maintainability, etc).
Change is maintainable (easy to change, telemetry, documentation).
Library release note guidelines are followed.
Documentation is included (in-code, generated user docs, public corp docs).
Backport labels are set (if applicable)

Reviewer Checklist

Title is accurate.
No unnecessary changes are introduced.
Description motivates each change.
Avoids breaking API changes unless absolutely necessary.
Testing strategy adequately addresses listed risk(s).
Change is maintainable (easy to change, telemetry, documentation).
Release note makes sense to a user of the library.
Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment.
Backport labels are set in a manner that is consistent with the release branch maintenance policy

pr-commenter · 2023-06-13T21:12:11Z

Benchmarks

Comparing candidate commit 1281c2f in PR branch munir/disable-partial-flushing-in-integrations-test with baseline commit 54110dd in branch 1.x.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 86 cases.

tests/integration/test_integration.py

emmettbutler

I'm not sure I understand why the test failing on this pull request's CI runs is a useful signal. Shouldn't we make sure it passes before merge?

mabdinur · 2023-06-14T14:07:01Z

I'm not sure I understand why the test failing on this pull request's CI runs is a useful signal. Shouldn't we make sure it passes before merge?

Sorry, I opened this PR too early. My first implementation was incorrect. I am currently working on a better solution.

This ci fix is hard to test because its timing based (I think a slower machine or slower connection to the agent causes this test to fail in ci). The best we can do is rerun ci multiple times and hope this test no longer flaky.

tests/integration/test_integration.py

…s-test

mabdinur requested review from a team as code owners June 13, 2023 20:29

mabdinur requested review from emmettbutler and majorgreys June 13, 2023 20:29

mabdinur added the changelog/no-changelog A changelog entry is not required for this PR. label Jun 13, 2023

mabdinur marked this pull request as draft June 13, 2023 20:50

mabdinur force-pushed the munir/disable-partial-flushing-in-integrations-test branch from 0aa4cd8 to 2ba1036 Compare June 14, 2023 04:11

mabdinur commented Jun 14, 2023

View reviewed changes

tests/integration/test_integration.py Outdated Show resolved Hide resolved

mabdinur force-pushed the munir/disable-partial-flushing-in-integrations-test branch from 89344a7 to 3b8b6a1 Compare June 14, 2023 04:31

mabdinur marked this pull request as ready for review June 14, 2023 04:31

mabdinur enabled auto-merge (squash) June 14, 2023 04:31

emmettbutler reviewed Jun 14, 2023

View reviewed changes

mabdinur force-pushed the munir/disable-partial-flushing-in-integrations-test branch from de32d5f to 0ca0c61 Compare June 14, 2023 14:03

mabdinur commented Jun 14, 2023

View reviewed changes

tests/integration/test_integration.py Outdated Show resolved Hide resolved

mabdinur commented Jun 14, 2023

View reviewed changes

tests/integration/test_integration.py Outdated Show resolved Hide resolved

mabdinur changed the title ~~ci(integrations): fix flaky test_single_trace_too_large [attempt #2]~~ ci(integrations): removes flaky test_single_trace_too_large Jun 14, 2023

avoid flushing large traces

1281c2f

mabdinur force-pushed the munir/disable-partial-flushing-in-integrations-test branch from c85f956 to 1281c2f Compare June 14, 2023 15:46

Merge branch '1.x' into munir/disable-partial-flushing-in-integration…

81b14cd

…s-test

mabdinur changed the title ~~ci(integrations): removes flaky test_single_trace_too_large~~ ci(integrations): makes test_single_trace_too_large less flaky Jun 14, 2023

mabdinur requested a review from emmettbutler June 14, 2023 17:49

mabdinur added 2 commits June 14, 2023 13:49

Merge branch '1.x' into munir/disable-partial-flushing-in-integration…

ef2295e

…s-test

Merge branch '1.x' into munir/disable-partial-flushing-in-integration…

c54c74b

…s-test

Yun-Kim approved these changes Jun 14, 2023

View reviewed changes

mabdinur merged commit cd6f987 into 1.x Jun 14, 2023

mabdinur deleted the munir/disable-partial-flushing-in-integrations-test branch June 14, 2023 19:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ci(integrations): makes test_single_trace_too_large less flaky #6119

ci(integrations): makes test_single_trace_too_large less flaky #6119

Uh oh!

mabdinur commented Jun 13, 2023 •

edited by Yun-Kim

Loading

Uh oh!

pr-commenter bot commented Jun 13, 2023 •

edited

Loading

Uh oh!

Uh oh!

emmettbutler left a comment

Uh oh!

mabdinur commented Jun 14, 2023 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ci(integrations): makes test_single_trace_too_large less flaky #6119

ci(integrations): makes test_single_trace_too_large less flaky #6119

Uh oh!

Conversation

mabdinur commented Jun 13, 2023 • edited by Yun-Kim Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Background

Testing Strategy

Checklist

Reviewer Checklist

Uh oh!

pr-commenter bot commented Jun 13, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

Uh oh!

Uh oh!

emmettbutler left a comment

Choose a reason for hiding this comment

Uh oh!

mabdinur commented Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mabdinur commented Jun 13, 2023 •

edited by Yun-Kim

Loading

pr-commenter bot commented Jun 13, 2023 •

edited

Loading

mabdinur commented Jun 14, 2023 •

edited

Loading