ci(integrations): makes test_single_trace_too_large less flaky #6119
emmettbutler
left a comment
I'm not sure I understand why the test failing on this pull request's CI runs is a useful signal. Shouldn't we make sure it passes before merge?
Sorry, I opened this PR too early. My first implementation was incorrect, and I am currently working on a better solution. This CI fix is hard to test because it's timing-based (I think a slower machine or a slower connection to the agent causes this test to fail in CI). The best we can do is rerun CI multiple times and hope this test is no longer flaky.
Description
Looking at the failing test below, we can see test_single_trace_too_large generates over 20MB of trace data, but these traces can be sent in separate payloads (yay partial flushing). This PR mocks AgentWriter.flush_queue() to ensure traces are not submitted. With this change the BufferFull exception will be consistently raised and "trace buffer (%s traces %db/%db) cannot fit trace of size %db, dropping (writer status: %s)" will always be logged.
This PR:
tracer.shutdown() this test should not submit traces to the agent.
Background
This test consistently passes locally but often fails in CI (mainly on older versions of Python). My hypothesis is that the time interval to submit traces to the agent is too short: the test does not have enough time to add enough trace chunks to overflow the trace encoder's buffer before they are flushed.
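To make the hypothesis above concrete, here is a small deterministic simulation. It is only a toy model: the function name, the capacity, and the timings are all invented for illustration and do not come from ddtrace. It shows how a producer that is slow relative to the flush interval might never fill the buffer, while a fast one does.

```python
# Toy model of the suspected race. Every name and number here is hypothetical;
# nothing below is taken from ddtrace.

def overflows(chunk_cost_ms, flush_interval_ms, n_chunks=20, capacity=8):
    """Return True if the buffer fills up before a periodic flush drains it."""
    buffered = 0
    now_ms = 0
    next_flush_ms = flush_interval_ms
    for _ in range(n_chunks):
        now_ms += chunk_cost_ms  # time the producer needs to build one chunk
        while now_ms >= next_flush_ms:
            # A periodic flush fired while the chunk was being built.
            buffered = 0
            next_flush_ms += flush_interval_ms
        if buffered + 1 > capacity:
            return True  # analogous to the buffer-full condition being hit
        buffered += 1
    return False


# Fast machine: all chunks arrive well inside one flush interval.
fast = overflows(chunk_cost_ms=10, flush_interval_ms=1000)
# Slow machine: a flush lands every five chunks, so the buffer never fills.
slow = overflows(chunk_cost_ms=200, flush_interval_ms=1000)
```

In this model the "fast" run overflows and the "slow" run does not, which would match a test that passes locally and fails on slower CI workers.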
Sample failure: https://app.circleci.com/pipelines/github/DataDog/dd-trace-py/37860/workflows/4da477db-5289-40b1-8375-5d6152f14b8f/jobs/2546383
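The mocking approach from the Description can be sketched with a stand-in writer. This is not the actual test or ddtrace's AgentWriter: ToyWriter, its capacity, and the flush schedule are assumptions made up for the example, and only unittest.mock.patch.object and contextlib.nullcontext are real standard-library calls. The point is that once flush_queue() is replaced by a mock, nothing drains the buffer, so the overflow happens on every run.

```python
import contextlib
from unittest import mock


class BufferFull(Exception):
    """Stand-in for the exception named in the description."""


class ToyWriter:
    """Hypothetical writer with a bounded buffer (not ddtrace's AgentWriter)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffered = 0
        self.dropped = 0

    def flush_queue(self):
        # Pretend the buffered payload was submitted to the agent.
        self.buffered = 0

    def write(self, size):
        try:
            if self.buffered + size > self.capacity:
                raise BufferFull("cannot fit chunk of size %d" % size)
            self.buffered += size
        except BufferFull:
            self.dropped += 1  # the real code logs a warning at this point


def dropped_chunks(patch_flush):
    """Write 10 chunks into a buffer of 8, with flushes landing mid-test."""
    writer = ToyWriter(capacity=8)
    patcher = (
        mock.patch.object(writer, "flush_queue")
        if patch_flush
        else contextlib.nullcontext()
    )
    with patcher:
        for i in range(10):
            writer.write(1)
            if i % 4 == 3:
                writer.flush_queue()  # a no-op while the mock is active
    return writer.dropped
```

Without the patch, the flushes keep the buffer from filling and nothing is dropped; with the patch, the last two chunks are always dropped, regardless of timing.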
Testing Strategy
YOLO. If test_single_trace_too_large is still flaky after this PR is merged then I have failed my reviewers and myself.
Checklist
Reviewer Checklist