-
Notifications
You must be signed in to change notification settings - Fork 619
test: Reduce flakiness in tests #1716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1716 +/- ##
==========================================
- Coverage 92.46% 92.45% -0.02%
==========================================
Files 156 156
Lines 10464 10464
==========================================
- Hits 9676 9674 -2
- Misses 788 790 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6df4ada to
af6889a
Compare
20ae690 to
48f27cf
Compare
48f27cf to
6734fa0
Compare
6734fa0 to
710e437
Compare
Add extra wait buffer to flaky test sensitive to timing issues Properly consume first `SystemInfoEvent` in fixture to ensure no interference with tests Take into account that `asyncio.sleep` time can sleep shorter than expected
710e437 to
ad10f55
Compare
Mantisus
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one suggestion. Otherwise, LGTM.
…ive-tests-alone-on-mac
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR reduces unit test flakiness by isolating resource-sensitive tests on macOS and adding timing tolerances/buffers where the tests depend on scheduler timing or periodic background events.
Changes:
- Added a
run_alone_on_machelper to mark specific tests asrun_aloneonly on macOS. - Increased the
asyncio.timeout(...)buffer in a flaky BasicCrawler timeout test. - Made timing-based tests more tolerant (asyncio sleep tolerance) and updated the Snapshotter fixture to consume the initial system info event before yielding.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/utils.py | Adds macOS-only decorator helper for run_alone tests. |
| tests/unit/crawlers/_basic/test_basic_crawler.py | Increases timeout buffer to reduce flakiness in test_timeout_in_handler. |
| tests/unit/browsers/test_browser_pool.py | Marks one BrowserPool test to run alone on macOS. |
| tests/unit/_utils/test_recurring_task.py | Marks RecurringTask execution test to run alone on macOS. |
| tests/unit/_statistics/test_request_max_duration.py | Adds tolerance for asyncio.sleep undersleep in duration assertion. |
| tests/unit/_autoscaling/test_snapshotter.py | Consumes initial SYSTEM_INFO event in fixture to avoid cross-test interference. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
vdusek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have 3 things to ask.
-
Aren't these tests flaky on Windows as well? I think they are, and I'd suggest running them in isolation on Windows too.
-
AFAIK the following tests are also flaky:
test_system.py::test_memory_estimation_does_not_overestimate_due_to_shared_memorytest_sitemap_request_loader.py::test_sitemap_traversal[httpx / impit]test_playwright_crawler.py::test_isolation_cookies[without use_incognito_pages]
Do you plan to investigate these later or take any action on them? I'd be in favor of adding them to the "run alone" group as well.
- Max recently added the
pytest-rerunfailurespackage with@pytest.mark.flakydecorators to six tests. This means we now have two different approaches to handling flakiness. Are you aware of this? Personally, I'd prefer removing the automatic retries and relying only on isolated execution (or I would give it a try for now).
I looked only at some recent failures. I tried to solve those that seem to be flaky everywhere and use the marker where I saw only Mac failures.
Maybe. This is not an attempt to fix all, just the ones I was aware of from my recent runs. It can be incremental. If some test is flaky too often, I probably look into it.
I see it like this:
Maybe we could do something like this in case of flaky tests (and go to next step if the first did not help) |
I will study this test separately, as it should not be linked to resources. There may be a problem with the test design or some other issue.
Some of the tests I marked with
I like this strategy. |
vdusek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could do something like this in case of flaky tests (and go to next step if the first did not help) run_alone_on_mac-> run_alone -> @pytest.mark.flaky --> skip
(not a rule, but a good enough recommendation?)
Sounds good, could you write it down somewhere, probably tests/unit/README.md?
vdusek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing, thanks.
4c409a6 to
bb3e371
Compare
Description
SystemInfoEventin the fixture to ensure no interference with testsasyncio.sleeptime can be slightly shorter than expectedIssues
test_timeout_in_handleris flaky #1652