Skip to content

mark all rtx-based rendering tests flaky for now#5508

Merged
huidongc merged 3 commits intoisaac-sim:developfrom
huidongc:rendering-test-flaky-marker
May 6, 2026
Merged

mark all rtx-based rendering tests flaky for now#5508
huidongc merged 3 commits intoisaac-sim:developfrom
huidongc:rendering-test-flaky-marker

Conversation

@huidongc
Copy link
Copy Markdown
Collaborator

@huidongc huidongc commented May 6, 2026

Description

Mark all RTX-based rendering test cases flaky until they can produce deterministic low-res camera outputs that pass golden image testing on every CI run.

Fixes # (issue)

Type of change

  • Test change

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label May 6, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR marks all RTX-based rendering tests as flaky by moving the @pytest.mark.flaky(max_runs=3, min_passes=1) decorator from individual test functions to the parametrized test cases in rendering_test_utils.py. It also expands CI detection from GitHub Actions only to include GitLab CI. The change is mechanically correct and consolidates flaky markers at the parameter level.

Architecture Impact

Self-contained. This change affects only test infrastructure in isaaclab_tasks/test/. The flaky markers are applied to PHYSICS_RENDERER_AOV_COMBINATIONS which is consumed by multiple test modules (test_rendering_cartpole.py, test_rendering_shadow_hand.py, test_rendering_dexsuite_kuka.py, and their kitless variants). The newton_renderer (warp) combinations intentionally remain non-flaky, preserving their deterministic expectations.

Implementation Verdict

Minor fixes needed

Test Coverage

This is a test infrastructure change, not a feature or bug fix. No new tests are required. The change appropriately targets known non-deterministic behavior in RTX rendering pipelines.

CI Status

No CI checks available yet. This PR should be verified against the actual CI to confirm the flaky markers are working as intended.

Findings

🔵 Improvement: rendering_test_utils.py:202-230 — KITLESS combinations missing flaky marker for ovrtx_renderer

The KITLESS_PHYSICS_RENDERER_AOV_COMBINATIONS list contains ovrtx_renderer entries which are RTX-based, but they only have _SKIP_ON_CI_MARK and not _FLAKY_MARK. If OVRTX is also non-deterministic like isaacsim_rtx_renderer, these should have both marks. If OVRTX tests are always skipped on CI anyway, this is moot, but the inconsistency could cause issues if someone runs these locally or if CI skip conditions change.

# Current: only _SKIP_ON_CI_MARK
marks=_SKIP_ON_CI_MARK,

# Consider if OVRTX is also non-deterministic:
marks=[_SKIP_ON_CI_MARK, _FLAKY_MARK],

🔵 Improvement: rendering_test_utils.py:232-241 — newton_renderer (warp) kitless combinations inconsistent with kit-based

In PHYSICS_RENDERER_AOV_COMBINATIONS, the newton_renderer (warp) entries at lines 175-184 have no flaky marker, which is correct since warp is deterministic. The same is true in KITLESS_PHYSICS_RENDERER_AOV_COMBINATIONS at lines 232-241. This is consistent and correct — just confirming the design is intentional.

🟡 Warning: test_rendering_dexsuite_kuka.py:32 — Removed flaky marker relies on parameter-level marks working correctly

The @pytest.mark.flaky decorator was removed from the test function, relying entirely on the marks=_FLAKY_MARK in pytest.param(). This is valid pytest behavior, but note that if any test case in PHYSICS_RENDERER_AOV_COMBINATIONS is missing the _FLAKY_MARK (like the newton_renderer cases), those specific parametrized runs will NOT be flaky. Verify this is the intended behavior — it appears to be, since warp is deterministic.

🔵 Improvement: rendering_test_utils.py:68-72 — CI detection could use a helper function

The CI detection logic is duplicated conceptually (env var checks). Consider extracting to a named constant or function for clarity:

def _is_ci_environment() -> bool:
    return any(os.environ.get(var) for var in ("CI", "GITHUB_ACTIONS", "GITLAB_CI"))

This would make the intent clearer and ease future additions (e.g., Jenkins, CircleCI).

🔵 Improvement: changelog.d/huidongc-flaky-mark.skip — Empty changelog skip file

The .skip extension indicates this is intentionally not adding a changelog entry. This is appropriate for test-only changes, but consider whether this behavioral change (marking tests flaky) should be documented somewhere for future maintainers investigating test stability.

@huidongc huidongc requested review from kellyguo11 and pbarejko May 6, 2026 02:36
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 6, 2026

Greptile Summary

This PR refines how RTX-based rendering tests are marked as flaky by moving the pytest.mark.flaky decorator from the test function level down to individual pytest.param entries in the shared utility, so only non-deterministic RTX combinations retry rather than every parametrized case. It also broadens the CI-skip guard for ovrtx kitless tests to cover GitLab CI and any runner that sets the generic CI variable, not just GitHub Actions.

  • Flaky marks moved to param level: isaacsim_rtx_renderer entries in PHYSICS_RENDERER_AOV_COMBINATIONS each receive _FLAKY_MARK; warp/newton_renderer entries remain unmarked, reflecting that only the RTX path is non-deterministic.
  • CI skip guard widened: _SKIP_ON_GITHUB_ACTIONS renamed to _SKIP_ON_CI and now also checks CI=true and GITLAB_CI, keeping ovrtx kitless tests off all supported CI runners.

Confidence Score: 4/5

Safe to merge; changes are confined to test infrastructure and do not touch production code.

The refactoring is straightforward and low-risk. The only notable issue is the inconsistent truthiness check for GITLAB_CI — the other two env-var guards use == 'true' but GITLAB_CI is evaluated as a bare truthy value, which would match any non-empty string. In practice GitLab only sets it to 'true', so this is unlikely to cause a real skip on an unintended runner, but it is a latent inconsistency worth fixing.

rendering_test_utils.py — specifically the _SKIP_ON_CI definition on lines 70-72.

Important Files Changed

Filename Overview
source/isaaclab_tasks/test/rendering_test_utils.py Moves flaky retries from function-level to per-param marks (RTX only), widens CI skip guard to cover GitLab CI and generic CI; minor inconsistency in GITLAB_CI truthiness check.
source/isaaclab_tasks/test/test_rendering_dexsuite_kuka.py Removes function-level flaky decorator; flaky behavior is now handled per-param in the shared utils.
source/isaaclab_tasks/test/test_rendering_dexsuite_kuka_kitless.py Removes function-level flaky decorator; ovrtx tests remain CI-skipped, newton_warp tests no longer retry (intentional since warp renderer is deterministic).
source/isaaclab_tasks/changelog.d/huidongc-flaky-mark.skip Empty .skip changelog placeholder, suppressing changelog generation for this test-only change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pytest collects parametrized test] --> B{renderer type?}
    B -->|isaacsim_rtx_renderer| C[param has _FLAKY_MARK\nmax_runs=3, min_passes=1]
    B -->|newton_renderer warp| D[No special mark\nrun once, must pass]
    B -->|ovrtx_renderer kitless| E{CI environment?}
    E -->|CI=true OR\nGITHUB_ACTIONS=true OR\nGITLAB_CI set| F[SKIP test]
    E -->|local run| G[Run once, must pass]
    C --> H{Test passes?}
    H -->|yes| I[PASS]
    H -->|no, runs < 3| C
    H -->|no, runs = 3| J[FAIL]
Loading

Reviews (1): Last reviewed commit: "mark all rtx-based rendering tests flaky..." | Re-trigger Greptile

Comment thread source/isaaclab_tasks/test/rendering_test_utils.py Outdated
huidongc and others added 2 commits May 6, 2026 10:42
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: HuiDong Chen <huidongc@nvidia.com>
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

The new commit (db81598) appears to be a rebase or sync with no functional changes to the PR content — the diff and implementation remain identical to the previous review.

Follow-up Assessment

All previous findings still apply as no code changes were made:

  • The KITLESS ovrtx_renderer combinations still only have _SKIP_ON_CI_MARK without _FLAKY_MARK (previous improvement suggestion stands)
  • The newton_renderer (warp) combinations correctly remain non-flaky
  • The CI detection consolidation is complete

CI Status

The pre-commit check has failed (❌). This needs to be addressed before merging — likely the line at rendering_test_utils.py:69-70 exceeds the line length limit:

_SKIP_ON_CI = (
    os.environ.get("CI") == "true" or os.environ.get("GITHUB_ACTIONS") == "true" or os.environ.get("GITLAB_CI") == "true"
)

Implementation Verdict

Minor fixes needed — address the pre-commit failure.

New Findings

🔴 Critical: rendering_test_utils.py:69-70 — Pre-commit failure likely due to line length

The multi-condition line exceeds the line length limit. Split it:

_SKIP_ON_CI = (
    os.environ.get("CI") == "true"
    or os.environ.get("GITHUB_ACTIONS") == "true"
    or os.environ.get("GITLAB_CI") == "true"
)

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

The new commit (d408913) addresses the previous pre-commit failure by refactoring the CI detection logic to use a cleaner any() comprehension that fits within line length limits.

Follow-up Assessment

Previous concern addressed: The line length issue has been fixed with _SKIP_ON_CI = any(os.environ.get(name) == "true" for name in ("CI", "GITHUB_ACTIONS", "GITLAB_CI")) — this is a clean, idiomatic solution.

Implementation Verdict

Ship it — the change is correct and pre-commit now passes.

CI Status

Pre-commit passes (✅). Most CI jobs are still pending but the relevant checks that have completed look good.

@huidongc huidongc merged commit bf646bb into isaac-sim:develop May 6, 2026
34 checks passed
@huidongc huidongc deleted the rendering-test-flaky-marker branch May 6, 2026 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants