test: Reduce flakiness in tests #1716

Pijukatel · 2026-02-05T14:22:41Z

Description

Properly consume the first SystemInfoEvent in the fixture to ensure no interference with tests
Add extra wait buffer to flaky test sensitive to timing issues
Take into account that asyncio.sleep time can be slightly shorter than expected
Run resource-sensitive tests alone on Mac. Mac executor is the most sensitive to resource-dependent tests. Reduce flakiness by running some of them alone on Mac only. This should reduce the flakiness, without hiding issues too much. (Mac executor on GitHub has its own issues, which we do not need to deal with. If the test is flaky on other platforms, it should be investigated. If flaky on Mac only, try to run it alone first. )

Issues

Closes: Test test_timeout_in_handler is flaky #1652

codecov · 2026-02-05T14:27:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.45%. Comparing base (44bc85a) to head (bb3e371).
⚠️ Report is 7 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1716      +/-   ##
==========================================
- Coverage   92.46%   92.45%   -0.02%     
==========================================
  Files         156      156              
  Lines       10464    10464              
==========================================
- Hits         9676     9674       -2     
- Misses        788      790       +2

Flag	Coverage Δ
unit	`92.45% <ø> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Add extra wait buffer to flaky test sensitive to timing issues Properly consume first `SystemInfoEvent` in fixture to ensure no interference with tests Take into account that `asyncio.sleep` time can sleep shorter than expected

Mantisus

Just one suggestion. Otherwise, LGTM.

tests/unit/request_loaders/test_sitemap_request_loader.py

…ive-tests-alone-on-mac

Copilot

Pull request overview

This PR reduces unit test flakiness by isolating resource-sensitive tests on macOS and adding timing tolerances/buffers where the tests depend on scheduler timing or periodic background events.

Changes:

Added a run_alone_on_mac helper to mark specific tests as run_alone only on macOS.
Increased the asyncio.timeout(...) buffer in a flaky BasicCrawler timeout test.
Made timing-based tests more tolerant (asyncio sleep tolerance) and updated the Snapshotter fixture to consume the initial system info event before yielding.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/unit/utils.py	Adds macOS-only decorator helper for `run_alone` tests.
tests/unit/crawlers/_basic/test_basic_crawler.py	Increases timeout buffer to reduce flakiness in `test_timeout_in_handler`.
tests/unit/browsers/test_browser_pool.py	Marks one BrowserPool test to run alone on macOS.
tests/unit/_utils/test_recurring_task.py	Marks RecurringTask execution test to run alone on macOS.
tests/unit/_statistics/test_request_max_duration.py	Adds tolerance for `asyncio.sleep` undersleep in duration assertion.
tests/unit/_autoscaling/test_snapshotter.py	Consumes initial SYSTEM_INFO event in fixture to avoid cross-test interference.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/unit/utils.py

tests/unit/_autoscaling/test_snapshotter.py

vdusek

I have 3 things to ask.

Aren't these tests flaky on Windows as well? I think they are, and I'd suggest running them in isolation on Windows too.
AFAIK the following tests are also flaky:

test_system.py::test_memory_estimation_does_not_overestimate_due_to_shared_memory
test_sitemap_request_loader.py::test_sitemap_traversal[httpx / impit]
test_playwright_crawler.py::test_isolation_cookies[without use_incognito_pages]

Do you plan to investigate these later or take any action on them? I'd be in favor of adding them to the "run alone" group as well.

Max recently added the pytest-rerunfailures package with @pytest.mark.flaky decorators to six tests. This means we now have two different approaches to handling flakiness. Are you aware of this? Personally, I'd prefer removing the automatic retries and relying only on isolated execution (or I would give it a try for now).

Pijukatel · 2026-02-06T15:09:41Z

I have 3 things to ask.

1. Aren't these tests flaky on Windows as well? I think they are, and I'd suggest running them in isolation on Windows too.

I looked only at some recent failures. I tried to solve those that seem to be flaky everywhere and use the marker where I saw only Mac failures.

AFAIK the following tests are also flaky:

test_system.py::test_memory_estimation_does_not_overestimate_due_to_shared_memory
test_sitemap_request_loader.py::test_sitemap_traversal[httpx / impit]
test_playwright_crawler.py::test_isolation_cookies[without use_incognito_pages]

Maybe. This is not an attempt to fix all, just the ones I was aware of from my recent runs. It can be incremental. If some test is flaky too often, I probably look into it.

test_sitemap_request_loader was fixed by Max yesterday

Max recently added the pytest-rerunfailures package with @pytest.mark.flaky decorators to six tests. This means we now have two different approaches to handling flakiness. Are you aware of this? Personally, I'd prefer removing the automatic retries and relying only on isolated execution (or I would give it a try for now).

I see it like this:
@pytest.mark.flaky is for unknown generic flakiness, and I think it should be used as a last resort before skipping the test. run_alone should be used for tests:

That, by design, can't run in parallel
For resource-sensitive tests. Since I saw how the Mac executor is the worst affected by those resource issues, I think it will be more common to use it than Windows or Linux.

Maybe we could do something like this in case of flaky tests (and go to next step if the first did not help) run_alone_on_mac-> run_alone -> @pytest.mark.flaky --> skip
(not a rule, but a good enough recommendation?)

Mantisus · 2026-02-06T15:32:46Z

test_playwright_crawler.py::test_isolation_cookies[without use_incognito_pages]

I will study this test separately, as it should not be linked to resources. There may be a problem with the test design or some other issue.

Max recently added the pytest-rerunfailures package with @pytest.mark.flaky decorators to six tests. This means we now have two different approaches to handling flakiness.

Some of the tests I marked with @pytest.mark.flake are resource-sensitive. And I assume that on a Mac, they may even exhibit instability with run_alone_on_mac.

Maybe we could do something like this in case of flaky tests (and go to next step if the first did not help) run_alone_on_mac-> run_alone -> @pytest.mark.flaky --> skip

I like this strategy.

vdusek

Maybe we could do something like this in case of flaky tests (and go to next step if the first did not help) run_alone_on_mac-> run_alone -> @pytest.mark.flaky --> skip
(not a rule, but a good enough recommendation?)

Sounds good, could you write it down somewhere, probably tests/unit/README.md?

vdusek

Amazing, thanks.

github-actions bot assigned Pijukatel Feb 5, 2026

github-actions bot added this to the 133rd sprint - Tooling team milestone Feb 5, 2026

github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Feb 5, 2026

Pijukatel force-pushed the run-resource-sensitive-tests-alone-on-mac branch from 6df4ada to af6889a Compare February 5, 2026 14:32

Pijukatel added the adhoc Ad-hoc unplanned task added during the sprint. label Feb 5, 2026

Pijukatel force-pushed the run-resource-sensitive-tests-alone-on-mac branch 3 times, most recently from 20ae690 to 48f27cf Compare February 6, 2026 09:06

Pijukatel changed the title ~~chore: Run resource sensititve tests alone on Mac~~ chore: Reduce flakiness in some tests Feb 6, 2026

Pijukatel changed the title ~~chore: Reduce flakiness in some tests~~ chore: Reduce flakiness in tests Feb 6, 2026

Pijukatel force-pushed the run-resource-sensitive-tests-alone-on-mac branch from 48f27cf to 6734fa0 Compare February 6, 2026 09:20

Pijukatel marked this pull request as ready for review February 6, 2026 09:24

Pijukatel force-pushed the run-resource-sensitive-tests-alone-on-mac branch from 6734fa0 to 710e437 Compare February 6, 2026 10:55

Run resource sensititve tests alone on Mac

ad10f55

Add extra wait buffer to flaky test sensitive to timing issues Properly consume first `SystemInfoEvent` in fixture to ensure no interference with tests Take into account that `asyncio.sleep` time can sleep shorter than expected

Pijukatel force-pushed the run-resource-sensitive-tests-alone-on-mac branch from 710e437 to ad10f55 Compare February 6, 2026 10:59

Pijukatel requested a review from Mantisus February 6, 2026 10:59

Mantisus approved these changes Feb 6, 2026

View reviewed changes

Mantisus reviewed Feb 6, 2026

View reviewed changes

tests/unit/request_loaders/test_sitemap_request_loader.py Outdated Show resolved Hide resolved

Merge remote-tracking branch 'origin/master' into run-resource-sensit…

7a72bad

…ive-tests-alone-on-mac

vdusek requested a review from Copilot February 6, 2026 14:32

Copilot started reviewing on behalf of vdusek February 6, 2026 14:33 View session

Copilot AI reviewed Feb 6, 2026

View reviewed changes

tests/unit/utils.py Show resolved Hide resolved

tests/unit/_autoscaling/test_snapshotter.py Show resolved Hide resolved

vdusek self-requested a review February 6, 2026 14:41

vdusek requested changes Feb 6, 2026

View reviewed changes

vdusek reviewed Feb 9, 2026

View reviewed changes

vdusek approved these changes Feb 10, 2026

View reviewed changes

Add readme entry about flaky tests

bb3e371

Pijukatel force-pushed the run-resource-sensitive-tests-alone-on-mac branch from 4c409a6 to bb3e371 Compare February 10, 2026 10:02

vdusek changed the title ~~chore: Reduce flakiness in tests~~ test: Reduce flakiness in tests Feb 10, 2026

Pijukatel merged commit 8c0dae6 into master Feb 10, 2026
32 checks passed

Pijukatel deleted the run-resource-sensitive-tests-alone-on-mac branch February 10, 2026 11:29

Pijukatel mentioned this pull request Feb 11, 2026

test: Mark flaky tests with run_alone to reduce CI failures #1733

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: Reduce flakiness in tests #1716

test: Reduce flakiness in tests #1716

Uh oh!

Pijukatel commented Feb 5, 2026 •

edited

Loading

Uh oh!

codecov bot commented Feb 5, 2026 •

edited

Loading

Uh oh!

Mantisus left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

vdusek left a comment

Uh oh!

Pijukatel commented Feb 6, 2026 •

edited

Loading

Uh oh!

Mantisus commented Feb 6, 2026

Uh oh!

vdusek left a comment

Uh oh!

vdusek left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

test: Reduce flakiness in tests #1716

test: Reduce flakiness in tests #1716

Uh oh!

Conversation

Pijukatel commented Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Issues

Uh oh!

codecov bot commented Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Mantisus left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

vdusek left a comment

Choose a reason for hiding this comment

Uh oh!

Pijukatel commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Mantisus commented Feb 6, 2026

Uh oh!

vdusek left a comment

Choose a reason for hiding this comment

Uh oh!

vdusek left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Pijukatel commented Feb 5, 2026 •

edited

Loading

codecov bot commented Feb 5, 2026 •

edited

Loading

Pijukatel commented Feb 6, 2026 •

edited

Loading