mark all rtx-based rendering tests flaky for now #5508
huidongc merged 3 commits into isaac-sim:develop from
🤖 Isaac Lab Review Bot
Summary
This PR marks all RTX-based rendering tests as flaky by moving the @pytest.mark.flaky(max_runs=3, min_passes=1) decorator from individual test functions to the parametrized test cases in rendering_test_utils.py. It also expands CI detection from GitHub Actions only to include GitLab CI. The change is mechanically correct and consolidates flaky markers at the parameter level.
Architecture Impact
Self-contained. This change affects only test infrastructure in isaaclab_tasks/test/. The flaky markers are applied to PHYSICS_RENDERER_AOV_COMBINATIONS which is consumed by multiple test modules (test_rendering_cartpole.py, test_rendering_shadow_hand.py, test_rendering_dexsuite_kuka.py, and their kitless variants). The newton_renderer (warp) combinations intentionally remain non-flaky, preserving their deterministic expectations.
Implementation Verdict
Minor fixes needed
Test Coverage
This is a test infrastructure change, not a feature or bug fix. No new tests are required. The change appropriately targets known non-deterministic behavior in RTX rendering pipelines.
CI Status
No CI checks available yet. This PR should be verified against the actual CI to confirm the flaky markers are working as intended.
Findings
🔵 Improvement: rendering_test_utils.py:202-230 — KITLESS combinations missing flaky marker for ovrtx_renderer
The KITLESS_PHYSICS_RENDERER_AOV_COMBINATIONS list contains ovrtx_renderer entries which are RTX-based, but they only have _SKIP_ON_CI_MARK and not _FLAKY_MARK. If OVRTX is also non-deterministic like isaacsim_rtx_renderer, these should have both marks. If OVRTX tests are always skipped on CI anyway, this is moot, but the inconsistency could cause issues if someone runs these locally or if CI skip conditions change.
# Current: only _SKIP_ON_CI_MARK
marks=_SKIP_ON_CI_MARK,
# Consider if OVRTX is also non-deterministic:
marks=[_SKIP_ON_CI_MARK, _FLAKY_MARK],
🔵 Improvement: rendering_test_utils.py:232-241 — newton_renderer (warp) kitless combinations consistent with kit-based
In PHYSICS_RENDERER_AOV_COMBINATIONS, the newton_renderer (warp) entries at lines 175-184 have no flaky marker, which is correct since warp is deterministic. The same is true in KITLESS_PHYSICS_RENDERER_AOV_COMBINATIONS at lines 232-241. This is consistent and correct — just confirming the design is intentional.
🟡 Warning: test_rendering_dexsuite_kuka.py:32 — Removed flaky marker relies on parameter-level marks working correctly
The @pytest.mark.flaky decorator was removed from the test function, relying entirely on the marks=_FLAKY_MARK in pytest.param(). This is valid pytest behavior, but note that if any test case in PHYSICS_RENDERER_AOV_COMBINATIONS is missing the _FLAKY_MARK (like the newton_renderer cases), those specific parametrized runs will NOT be flaky. Verify this is the intended behavior — it appears to be, since warp is deterministic.
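The parameter-level marking described above can be sketched as follows. This is a minimal illustration, assuming pytest and the `flaky` plugin are installed; `EXAMPLE_COMBINATIONS`, `marked_flaky`, the `"rgb"` AOV value, and the ids are hypothetical stand-ins, not the contents of rendering_test_utils.py.

```python
import pytest

# Same marker arguments as quoted in the review; the `flaky` plugin
# interprets them when the test session runs.
_FLAKY_MARK = pytest.mark.flaky(max_runs=3, min_passes=1)

# Hypothetical stand-in for PHYSICS_RENDERER_AOV_COMBINATIONS.
EXAMPLE_COMBINATIONS = [
    # RTX-based entry: carries the flaky mark at the parameter level.
    pytest.param("isaacsim_rtx_renderer", "rgb", marks=_FLAKY_MARK, id="rtx-rgb"),
    # Warp entry: no mark, so this case runs once and must pass.
    pytest.param("newton_renderer", "rgb", id="warp-rgb"),
]


def marked_flaky(param) -> bool:
    """Return True if a pytest.param entry carries a mark named 'flaky'."""
    return any(mark.name == "flaky" for mark in param.marks)


@pytest.mark.parametrize("renderer, aov", EXAMPLE_COMBINATIONS)
def test_render(renderer, aov):
    # Placeholder body; the real tests compare against golden images.
    assert renderer and aov
```

A helper like `marked_flaky` could be used in a quick audit to list which parametrized cases would be retried and which would not.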
🔵 Improvement: rendering_test_utils.py:68-72 — CI detection could use a helper function
The CI detection logic is duplicated conceptually (env var checks). Consider extracting to a named constant or function for clarity:
def _is_ci_environment() -> bool:
    return any(os.environ.get(var) for var in ("CI", "GITHUB_ACTIONS", "GITLAB_CI"))
This would make the intent clearer and ease future additions (e.g., Jenkins, CircleCI).
🔵 Improvement: changelog.d/huidongc-flaky-mark.skip — Empty changelog skip file
The .skip extension indicates this is intentionally not adding a changelog entry. This is appropriate for test-only changes, but consider whether this behavioral change (marking tests flaky) should be documented somewhere for future maintainers investigating test stability.
Greptile Summary
This PR refines how RTX-based rendering tests are marked as flaky by moving the
Confidence Score: 4/5
Safe to merge; changes are confined to test infrastructure and do not touch production code.
The refactoring is straightforward and low-risk. The only notable issue is the inconsistent truthiness check for GITLAB_CI: the other two env-var guards use == 'true', but GITLAB_CI is evaluated as a bare truthy value, which would match any non-empty string. In practice GitLab only sets it to 'true', so this is unlikely to cause a real skip on an unintended runner, but it is a latent inconsistency worth fixing.
rendering_test_utils.py, specifically the _SKIP_ON_CI definition on lines 70-72.
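The difference between a bare truthiness check and a strict `== "true"` comparison can be illustrated with a small sketch. The function names and the `env` parameter are hypothetical; they exist only so the two behaviours can be compared without touching the real process environment.

```python
from typing import Mapping

CI_VARS = ("CI", "GITHUB_ACTIONS", "GITLAB_CI")


def skip_on_ci_truthy(env: Mapping[str, str]) -> bool:
    """Bare truthiness: any non-empty value counts, including "false"."""
    return any(env.get(name) for name in CI_VARS)


def skip_on_ci_strict(env: Mapping[str, str]) -> bool:
    """Strict comparison: only the exact string "true" counts."""
    return any(env.get(name) == "true" for name in CI_VARS)
```

With `{"GITLAB_CI": "false"}` the truthy variant reports a CI environment while the strict variant does not; in real use, `os.environ` would be passed as `env`.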
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[pytest collects parametrized test] --> B{renderer type?}
B -->|isaacsim_rtx_renderer| C[param has _FLAKY_MARK\nmax_runs=3, min_passes=1]
B -->|newton_renderer warp| D[No special mark\nrun once, must pass]
B -->|ovrtx_renderer kitless| E{CI environment?}
E -->|CI=true OR\nGITHUB_ACTIONS=true OR\nGITLAB_CI set| F[SKIP test]
E -->|local run| G[Run once, must pass]
C --> H{Test passes?}
H -->|yes| I[PASS]
H -->|no, runs < 3| C
H -->|no, runs = 3| J[FAIL]
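The retry loop in the flowchart (max_runs=3, min_passes=1) can be approximated in plain Python. This is a simplified model of the semantics only, not the `flaky` plugin's implementation; `run_flaky` is a hypothetical name.

```python
from typing import Callable


def run_flaky(test: Callable[[], None], max_runs: int = 3, min_passes: int = 1) -> bool:
    """Rerun `test` until min_passes is reached or max_runs is used up."""
    passes = 0
    for _ in range(max_runs):
        try:
            test()
            passes += 1
        except AssertionError:
            # A failed run consumes one attempt and is otherwise ignored.
            pass
        if passes >= min_passes:
            return True
    return False


# Usage: a test that fails on its first two runs and passes on the third.
calls = {"n": 0}


def fails_twice() -> None:
    calls["n"] += 1
    assert calls["n"] >= 3


def always_fails() -> None:
    assert False
```

Under this model `run_flaky(fails_twice)` succeeds on the third attempt, while `run_flaky(always_fails)` exhausts all three runs and reports failure, matching the PASS and FAIL branches of the flowchart.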
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: HuiDong Chen <huidongc@nvidia.com>
🤖 Isaac Lab Review Bot
Summary
The new commit (db81598) appears to be a rebase or sync with no functional changes to the PR content — the diff and implementation remain identical to the previous review.
Follow-up Assessment
All previous findings still apply as no code changes were made:
- The KITLESS ovrtx_renderer combinations still only have _SKIP_ON_CI_MARK without _FLAKY_MARK (previous improvement suggestion stands)
- The newton_renderer (warp) combinations correctly remain non-flaky
- The CI detection consolidation is complete
CI Status
The pre-commit check has failed (❌). This needs to be addressed before merging — likely the line at rendering_test_utils.py:69-70 exceeds the line length limit:
_SKIP_ON_CI = (
    os.environ.get("CI") == "true" or os.environ.get("GITHUB_ACTIONS") == "true" or os.environ.get("GITLAB_CI") == "true"
)
Implementation Verdict
Minor fixes needed — address the pre-commit failure.
New Findings
🔴 Critical: rendering_test_utils.py:69-70 — Pre-commit failure likely due to line length
The multi-condition line exceeds the line length limit. Split it:
_SKIP_ON_CI = (
    os.environ.get("CI") == "true"
    or os.environ.get("GITHUB_ACTIONS") == "true"
    or os.environ.get("GITLAB_CI") == "true"
)
🤖 Isaac Lab Review Bot
Summary
The new commit (d408913) addresses the previous pre-commit failure by refactoring the CI detection logic into a single any() call over a generator expression, which fits within the line length limit.
Follow-up Assessment
✅ Previous concern addressed: The line length issue has been fixed with _SKIP_ON_CI = any(os.environ.get(name) == "true" for name in ("CI", "GITHUB_ACTIONS", "GITLAB_CI")) — this is a clean, idiomatic solution.
Implementation Verdict
Ship it — the change is correct and pre-commit now passes.
CI Status
Pre-commit passes (✅). Most CI jobs are still pending but the relevant checks that have completed look good.
Description
Mark all RTX-based rendering test cases flaky until they can produce deterministic low-res camera outputs that pass golden image testing on every CI run.
Fixes # (issue)
Type of change
Checklist
- pre-commit checks with ./isaaclab.sh --format
- config/extension.toml file
- CONTRIBUTORS.md or my name already exists there