
test: fix flakey GUI tests#297

Merged
olivermeyer merged 1 commit into main from fix/sequential-gui-tests
Dec 1, 2025

Conversation

@olivermeyer
Collaborator

No description provided.

@github-actions
Contributor

github-actions bot commented Dec 1, 2025

Claude finished @olivermeyer's task


Code Review: PR #297 - Fix Flakey GUI Tests

Summary

This PR adds the @pytest.mark.sequential marker to two GUI integration tests that were experiencing flakiness. The change is minimal, targeted, and appropriate.


✅ Review Results

Blocking Issues

None - This PR passes all checks and is ready to merge.

Code Quality Checks

  • Linting: All checks pass (ruff format, ruff check, pyright, mypy)
  • Test Markers: Both tests have required category markers (@pytest.mark.integration)
  • Commit Message: Follows conventional commit format (test: fix flakey GUI tests)
  • No Breaking Changes: Only test configuration changes, no production code affected

📝 Changes Analysis

File Modified: tests/aignostics/application/gui_test.py (+2 lines)

Tests Modified:

  1. test_gui_run_results_pagination_show_more_button_hidden_when_few_results (line 461)
  2. test_gui_run_results_pagination_show_more (line 512)

Change: Added @pytest.mark.sequential decorator to force single-threaded execution
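The change can be sketched as follows. Only the `@pytest.mark.sequential` decorator is new; the test bodies below are placeholders, not the real implementations from `gui_test.py`:

```python
import pytest

@pytest.mark.integration
@pytest.mark.sequential  # added by this PR
def test_gui_run_results_pagination_show_more_button_hidden_when_few_results():
    ...  # placeholder; real test checks button visibility with few results

@pytest.mark.integration
@pytest.mark.sequential  # added by this PR
def test_gui_run_results_pagination_show_more():
    ...  # placeholder; real test exercises the "show more" pagination flow
```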


🎯 Fix Appropriateness Assessment

The sequential marker is the RIGHT solution here. Here's why:

Context from Codebase

  1. Existing Pattern: 3 other GUI tests in this file already use @pytest.mark.sequential:

    • test_gui_cli_submit_to_run_result_delete (line 87)
    • test_gui_download_dataset_via_application_to_run_cancel_to_find_back (line 187)
    • test_gui_run_download (line 349) - with explicit comment: "Helps on Linux with image analysis step otherwise timing out"
  2. Test Characteristics:

    • Both tests interact with NiceGUI components (User fixture)
    • Both tests query the same staging API for runs (Service().application_runs())
    • Both tests check DOM state (button visibility/text)
    • Tests are integration tests that mock external services but use real GUI rendering
  3. Flakiness Root Cause:

    • Resource contention when parallel tests query the same API endpoints
    • GUI rendering race conditions when multiple NiceGUI instances run concurrently
    • Shared state in staging environment (same runs being accessed by multiple test workers)

Why Sequential Fixes This

From .github/CLAUDE.md:

Sequential Tests (@pytest.mark.sequential):

  • Tests that must run in specific order
  • Have interdependencies or shared state
  • Cannot be parallelized
  • Parallelization: None (single worker)

These pagination tests:

  • ✅ Share state (query same staging runs)
  • ✅ Have GUI rendering dependencies
  • ✅ Follow existing pattern in this file
  • ✅ Are integration tests with XDIST_WORKER_FACTOR=0.2 (limited parallelization already)
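For readers unfamiliar with how a `sequential` marker maps to single-worker execution: one common way to wire it up with pytest-xdist is to route all marked tests into one scheduling group. This is a hypothetical `conftest.py` sketch; the repository's actual mechanism is not shown in this PR and may differ. `xdist_group` requires running with `pytest --dist loadgroup`:

```python
import pytest

def pytest_collection_modifyitems(config, items):
    for item in items:
        if item.get_closest_marker("sequential") is not None:
            # Pin every sequential test to the same scheduling group so
            # pytest-xdist executes them serially on a single worker.
            item.add_marker(pytest.mark.xdist_group("sequential"))
```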

📊 Test Execution Context

Before this PR:

  • Integration tests run with XDIST_WORKER_FACTOR=0.2 (1 worker on 8 CPU machine)
  • These specific tests could theoretically run in parallel with other GUI tests
  • Flakiness occurred due to race conditions

After this PR:

  • These 2 tests join 3 existing sequential GUI tests
  • All 5 sequential GUI tests run in single-worker mode
  • Eliminates race conditions without significant performance impact

Performance Impact: Negligible

  • Integration test suite: ~5 minutes total
  • These tests: 60s + 120s = 3 minutes combined (already long)
  • Sequential execution adds minimal overhead since they're I/O bound

🔍 Additional Observations

Good Practices Followed

  1. Consistency: Matches existing pattern in the file
  2. Test Coverage: Both tests already have @pytest.mark.flaky(retries=2) as fallback
  3. Documentation: Previous commit (35aeab6) added skipif for macOS ARM + Python 3.13
  4. Minimal Change: Only adds marker, doesn't modify test logic


💡 Suggestions (Non-blocking)

1. Add Explanatory Comments (Nice-to-Have)

Consider adding inline comments like the existing one on line 349:

@pytest.mark.integration
@pytest.mark.sequential  # GUI rendering races when run in parallel with other tests
@pytest.mark.skipif(...)

This documents WHY sequential is needed for future maintainers.

2. PR Description Enhancement

The PR has no description. Consider adding:

Problem: GUI pagination tests were flaky when run in parallel with other integration tests.

Root Cause: Race conditions in GUI rendering and shared staging API state.

Solution: Add @pytest.mark.sequential to force single-threaded execution, following the pattern of 3 existing GUI tests in this file.

Impact: Minimal performance impact (~3 min tests already sequential-appropriate).

3. Consider Future Test Isolation

For future work (not this PR), consider:

  • Mock API responses for pagination tests to reduce staging API dependency
  • Use separate test runs/data to avoid shared state
  • But: Current fix is pragmatic and appropriate given medical device validation requirements
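The mocking suggestion could look roughly like this. It is a minimal sketch, assuming a hypothetical page size and stubbing the `application_runs()` call named in this review with `unittest.mock`; the real `Service` API and pagination logic may differ:

```python
from unittest.mock import MagicMock

PAGE_SIZE = 10  # assumed page size, for illustration only

def fake_service(run_count):
    """Build a stand-in for Service() returning run_count fake runs."""
    service = MagicMock()
    service.application_runs.return_value = [
        MagicMock(id=f"run-{i}") for i in range(run_count)
    ]
    return service

def show_more_visible(service):
    # Pagination rule under test: the "show more" button appears only
    # when more runs exist than fit on a single page.
    return len(service.application_runs()) > PAGE_SIZE

assert show_more_visible(fake_service(3)) is False
assert show_more_visible(fake_service(25)) is True
```

With the staging API stubbed out this way, the tests would no longer share state across workers and might not need the sequential marker at all, though as the review notes, the current fix is the pragmatic choice for this PR.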

✅ Recommendation

APPROVE AND MERGE

This is a well-targeted fix that:

  • Follows existing patterns in the codebase
  • Addresses real flakiness issues
  • Has minimal performance impact
  • Maintains test coverage
  • Passes all CI checks

The sequential marker is the correct solution for GUI integration tests with shared staging state.


Test Execution Commands (for verification):

# Run these specific tests
uv run pytest tests/aignostics/application/gui_test.py::test_gui_run_results_pagination_show_more_button_hidden_when_few_results -v
uv run pytest tests/aignostics/application/gui_test.py::test_gui_run_results_pagination_show_more -v

# Run all sequential tests
make test_sequential

# Run all integration tests
make test_integration

@codecov

codecov bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.
See 6 files with indirect coverage changes.

@sonarqubecloud

sonarqubecloud bot commented Dec 1, 2025

@olivermeyer olivermeyer marked this pull request as ready for review December 1, 2025 13:45
@olivermeyer olivermeyer merged commit c9815ec into main Dec 1, 2025
41 of 42 checks passed