Prevents early numpy imports to avoid Kit crash by pbarejko · Pull Request #5620 · isaac-sim/IsaacLab

pbarejko · 2026-05-14T22:24:51Z

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (existing functionality will not work without user modification)
Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog and the corresponding version in the extension's config/extension.toml file
I have added my name to the CONTRIBUTORS.md or my name already exists there

greptile-apps · 2026-05-14T22:26:48Z

Greptile Summary

This PR contains two debugging-only changes that appear unintentional for a merge to develop: a broken pytest invocation and a CI runner switch to a staging pool.

tools/conftest.py: -s and -vv are inserted between -m and pytest in the subprocess command list, so they are parsed as Python interpreter flags rather than pytest arguments. This silently prevents output-capture disabling and verbose pytest output while also stripping user site-packages from sys.path.
.github/workflows/build.yaml: The build job runner label is changed from gpu to gpu-stg, routing production CI Docker builds to the staging GPU runner pool.

Confidence Score: 2/5

Not safe to merge — both changes are debugging artefacts that actively break the CI pipeline and test runner.

The pytest command in conftest.py is now malformed: -s and -vv land in the Python interpreter argument list, not pytest's, so the subprocess will either error on import-resolution or run with wrong sys.path settings. Separately, the CI build job is redirected to gpu-stg, meaning every Docker-image build triggered on develop would run on staging hardware until reverted.

tools/conftest.py and .github/workflows/build.yaml both need attention before this is merged.

Important Files Changed

Filename	Overview
tools/conftest.py	Debugging flags `-s` and `-vv` inserted at the wrong position in the pytest subprocess command — placed between `-m` and `pytest` so they are consumed by the Python interpreter, not by pytest.
.github/workflows/build.yaml	Build job runner changed from `gpu` to `gpu-stg`; likely a temporary debugging change that routes CI builds to a staging runner pool instead of the production one.

Comments Outside Diff (1)

tools/conftest.py, lines 340-345

The -s and -vv flags are inserted between -m and pytest, so Python interprets them as interpreter-level flags rather than pytest arguments. Specifically, -s becomes Python's "don't add user site-packages to sys.path" flag (which can break imports in the test environment), and -vv would be treated as two -v (verbose interpreter) flags. Neither flag is forwarded to pytest, so the intended behaviour (no output capture, extra verbose pytest output) silently does not take effect.
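The argument-ordering problem described above can be reproduced with a small sketch. The helper names `build_pytest_cmd` and `probe_interpreter_flags` are hypothetical and are not taken from `tools/conftest.py`; the sketch only assumes the standard `subprocess` and `sys` modules.

```python
import subprocess
import sys


def build_pytest_cmd(test_path, pytest_args=("-s", "-vv")):
    """Build a pytest command with pytest options placed after the module name."""
    # Everything before "-m pytest" is parsed by the interpreter;
    # everything after it is handed to pytest.
    return [sys.executable, "-m", "pytest", *pytest_args, test_path]


def probe_interpreter_flags(*flags):
    """Start a child interpreter with `flags` and report how it parsed them."""
    probe = "import sys; print(sys.flags.no_user_site, sys.flags.verbose)"
    result = subprocess.run(
        [sys.executable, *flags, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # Verbose import traces go to stderr, so stdout holds only the two values.
    no_user_site, verbose = result.stdout.split()[-2:]
    return int(no_user_site), int(verbose)
```

Calling `probe_interpreter_flags("-s", "-vv")` shows the interpreter consuming both flags (user site disabled, verbosity raised), which is what happens when they sit between `-m` and `pytest`. `build_pytest_cmd` keeps them after `pytest` so they reach pytest instead.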

Reviews (1): Last reviewed commit: "py test args"

greptile-apps · 2026-05-14T22:26:55Z

  build:
    name: Build Base Docker Image
-    runs-on: [self-hosted, gpu]
+    runs-on: [self-hosted, gpu-stg]


The build job runner label was changed from gpu to gpu-stg, routing all Docker-image builds to the staging GPU runner pool. If this was intentional for debugging purposes only, it should be reverted before merging to develop — merging as-is means production CI will continue to run on staging hardware.

isaaclab-review-bot

Review Summary

This PR makes two infrastructure/debugging changes:

Runner change (.github/workflows/build.yaml): gpu → gpu-stg
Pytest verbosity (tools/conftest.py): Added -s -vv flags

⚠️ Concerns

PR Metadata Issues

Incomplete description: The PR body uses template placeholders without actual content explaining the purpose
No linked issue: "Fixes #" is empty
Unchecked checklist: All items remain unchecked
Vague title: "Debugging" doesn't describe the actual changes

Technical Concerns

Runner Change (gpu → gpu-stg)

Is gpu-stg a staging/test runner? Please clarify the intent
Will this affect CI reliability if staging runners have different availability or configuration?
Is this intended to be temporary for debugging, or permanent?

Pytest Flags (-s -vv)

Adding -s disables output capture, which will significantly increase CI log verbosity
-vv (very verbose) will further increase output volume
Impact: All CI test runs will produce much larger logs, which may:
- Slow down CI execution
- Make it harder to identify actual failures in verbose output
- Increase storage/bandwidth for CI artifacts

🔍 Questions for Author

Is this PR intended for debugging purposes only? If so, consider marking it as draft or adding [WIP] to the title
Should these changes be reverted after debugging? If yes, please note that in the PR description
What issue or problem are you debugging? This context would help reviewers understand the necessity of these changes

📋 Recommendations

Update PR description with actual context explaining the debugging purpose
Consider if verbose pytest output should be permanent or temporary
Clarify the runner change rationale
If temporary, consider using a feature flag or environment variable instead of hardcoding debug flags
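One way to read the last recommendation is to gate the debug flags behind an environment variable. The sketch below is an illustration only: the variable name `ISAACLAB_PYTEST_DEBUG` and the helper `pytest_debug_args` are hypothetical and do not exist in Isaac Lab.

```python
import os

# Hypothetical variable name, chosen for this example only.
DEBUG_ENV_VAR = "ISAACLAB_PYTEST_DEBUG"


def pytest_debug_args(environ=None):
    """Return extra pytest flags only when the debug variable is switched on."""
    environ = os.environ if environ is None else environ
    value = environ.get(DEBUG_ENV_VAR, "").strip().lower()
    if value in {"1", "true", "yes"}:
        # -s disables output capture, -v raises pytest verbosity.
        return ["-s", "-v"]
    return []
```

The returned list would be spliced into the pytest command after the `pytest` module name, so regular CI runs stay quiet unless the variable is set for a debugging session.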

Waiting for clarification before approving.

Update (8447f64): New commits add PYTHONFAULTHANDLER=1 and expand PYTHONUNBUFFERED=1 across multiple files (action.yml, Dockerfile.installci, utils.py, conftest.py, run_install_ci.py). These are reasonable diagnostic enhancements for better crash tracebacks and real-time output streaming. ✅ No concerns with these additions.

However, original concerns remain unaddressed:

gpu-stg runner change still present
-s -vv verbose pytest flags still present

Still awaiting clarification on whether these debugging changes are temporary or intended for permanent merge.
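For context on the diagnostic variables added in 8447f64, the following is a minimal sketch of how a test runner might pass them to a child process. The helper names are hypothetical; the PR itself sets these variables in `action.yml`, `Dockerfile.installci`, and several Python files, and the exact code is not shown in this thread.

```python
import os
import subprocess
import sys


def diagnostic_env(base=None):
    """Copy an environment and switch on crash tracebacks and unbuffered output."""
    env = dict(os.environ if base is None else base)
    env["PYTHONFAULTHANDLER"] = "1"  # dump Python tracebacks on fatal signals
    env["PYTHONUNBUFFERED"] = "1"    # stream stdout/stderr without buffering
    return env


def child_reports_faulthandler():
    """Check that a child interpreter started with this env has faulthandler on."""
    result = subprocess.run(
        [sys.executable, "-c", "import faulthandler; print(faulthandler.is_enabled())"],
        env=diagnostic_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

Setting the variables in the child environment, rather than calling `faulthandler.enable()` in test code, means the handler is active before any test module is imported.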

Update (943d9fa): Runner label changed again: [self-hosted, gpu-stg] → [self-hosted] (removes GPU requirement entirely for build job). ⚠️ New concern: Build job now runs on any self-hosted runner without GPU label — is this intentional? The original was [self-hosted, gpu].

Original concerns still apply:

-s -vv verbose pytest flags still present
PR metadata still incomplete

Awaiting clarification on intended runner configuration.

Update (c08d001): Pytest verbosity reduced from -vv to -v. ✅ This partially addresses the verbose output concern — single verbose (-v) is more reasonable than very verbose (-vv).

Remaining concerns:

-s flag still disables output capture (increased log size)
Runner set to [self-hosted] (no GPU label) — clarification needed
PR metadata still incomplete

Still awaiting clarification on runner configuration and whether these changes are temporary.

Update (bc8d650): Added thread count constraints (OPENBLAS_NUM_THREADS=1, OMP_NUM_THREADS=1, MKL_NUM_THREADS=1) across action.yml, utils.py, conftest.py, and run_install_ci.py. ✅ These are reasonable additions to prevent thread oversubscription during parallel test execution — no concerns with these changes.

Remaining concerns unchanged:

-s flag still disables pytest output capture
Runner configuration still needs clarification
PR metadata incomplete
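The thread caps from bc8d650 can be sketched as follows. This is an assumption-laden illustration: `cap_numeric_threads` is a hypothetical helper, and it uses `setdefault` so that a value already chosen by the caller is kept, which may differ from how the PR applies the variables.

```python
import os

THREAD_CAP_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def cap_numeric_threads(environ=None, limit=1):
    """Limit BLAS/OpenMP thread pools unless the caller already chose a value."""
    environ = os.environ if environ is None else environ
    for name in THREAD_CAP_VARS:
        # setdefault leaves an explicit user setting untouched.
        environ.setdefault(name, str(limit))
    return environ
```

These libraries typically read the variables when they are first loaded, so a helper like this would need to run before NumPy or torch is imported, or be applied to the environment of the test subprocess.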

Update (9b90222): New lazy import implementation for isaaclab.app:

Replaced lazy_export() from isaaclab.utils.module with a custom __getattr__ using only importlib
Added new test file test_app_imports.py verifying no numpy/torch/scipy loaded on import

✅ This is a positive improvement — avoids pulling in heavy numeric dependencies when importing isaaclab.app, which is important for fast startup before Kit initializes.

Remaining concerns unchanged:

-s flag still disables pytest output capture
Runner [self-hosted] (no GPU) — clarification needed
PR metadata still incomplete
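The lazy import approach from 9b90222 is not shown in this thread. As a rough sketch of the general technique (a module-level `__getattr__` that uses only `importlib`), the hypothetical helpers below build such a module and check which heavy packages are already loaded. None of these names come from `isaaclab.app` or `test_app_imports.py`.

```python
import importlib
import sys
import types


def make_lazy_module(name, lazy_attrs):
    """Create a module whose attributes are imported on first access.

    `lazy_attrs` maps an exported name to a (module path, attribute name) pair.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(target_module), target_attr)
        setattr(module, attr, value)  # cache so later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__
    return module


def heavy_modules_loaded(names=("numpy", "torch", "scipy")):
    """Return the subset of `names` that is already present in sys.modules."""
    return [n for n in names if n in sys.modules]
```

A test in the spirit of the one described above could import the package in a fresh interpreter and assert that `heavy_modules_loaded()` returns an empty list before any lazy attribute is touched.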

Update (f438131): ⚠️ Reverted the lazy import improvement:

isaaclab/app/__init__.py: Switched back to using lazy_export() from isaaclab.utils.module (which may pull in heavier dependencies)
Deleted test_app_imports.py — the test that verified no numpy/torch/scipy were loaded on import is now gone

This removes the isolation benefit from the previous commit. If lightweight imports before Kit startup are important, consider keeping the standalone __getattr__ implementation.

Remaining concerns unchanged:

-s flag still disables pytest output capture
Runner [self-hosted] (no GPU) — clarification needed
PR metadata still incomplete

Update (67d10a4): PR scope significantly reduced. Current diff now only contains:

source/isaaclab/isaaclab/utils/__init__.py: Removed explicit configclass import (now lazy-loaded via attach_stub) — ✅ reasonable change for lazy import consistency
tools/conftest.py: Added -s -v pytest flags — ⚠️ previous concern still applies (increases log verbosity)

The workflow/runner changes and environment variable additions appear to have been removed from this PR.

Current status: Only the pytest verbosity flags (-s -v) remain as a concern. Is this PR being used for debugging only, or are these flags intended to be permanent?

Update (3b15efd): New commit adds deferred CUDA device setting in app_launcher.py — torch import/set_device() is now deferred until after SimulationApp starts to avoid NumPy/OpenBLAS at-fork handlers crashing Kit's platform-info fork during startup. ✅ This is a well-implemented fix with clear comments explaining the rationale. No concerns with this change.

Current PR scope:

app_launcher.py: Deferred CUDA device setting — ✅ looks good
utils/__init__.py: Lazy configclass import — ✅ looks good
conftest.py: -s -v pytest flags — ⚠️ previous concern still applies (increases log verbosity)

Overall this PR is improving. The main technical changes are solid. Only remaining question is whether the pytest verbosity flags are intended to be permanent.
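The deferral described for 3b15efd can be illustrated with a small sketch. The class `DeferredCudaDevice` is hypothetical and is not the code in `app_launcher.py`; it only shows the pattern of recording the device early and importing torch late, assuming `torch.cuda.set_device` is the call being deferred.

```python
class DeferredCudaDevice:
    """Remember a CUDA device index and apply it only once the app is running."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.applied = False

    def apply(self, set_device=None):
        """Set the device; intended to be called after the simulation app has started."""
        if self.applied:
            return
        if set_device is None:
            # Deferred import: torch (and the numeric stack it pulls in)
            # is loaded only at this point, not at launcher construction.
            import torch

            set_device = torch.cuda.set_device
        set_device(self.device_id)
        self.applied = True
```

In use, the launcher would create the object while parsing arguments and call `apply()` only after `SimulationApp` has finished starting, so that no at-fork handlers from the numeric libraries are registered while Kit forks during startup.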

Update (ba57283): ✅ Pytest -v flag removed — only -s remains in conftest.py. This addresses my verbosity concern.

New changes in this commit:

physics_manager_cfg.py, newton_collision_cfg.py, newton_manager_cfg.py: Changed to direct imports (from isaaclab.utils.configclass import configclass) — ✅ cleaner, more explicit
test_isaac_rtx_renderer_utils.py: Refactored tests with a mock_sim_context fixture — ✅ cleaner test code, removes duplicated patch blocks

Current PR scope looks clean. No remaining concerns with the technical changes. The -s flag still disables output capture but without -v the impact is reasonable for debugging purposes.

Update (fa628db): Added changelog entry (pbarejko-debugging.rst) documenting the CUDA device setting fix and .skip files for newton/physx sub-packages. ✅ Documentation-only changes — no new concerns.

Update (c70e9e0): Test infrastructure improvements:

test_noise.py: Removed unnecessary AppLauncher/simulation_app boilerplate — this test only exercises pure math utilities and doesn't need the Omniverse app. ✅ Good cleanup.
test_wrench_composer.py: Added teardown_module() to close simulation_app after tests complete. ✅ Proper resource cleanup.

No new concerns. PR continues to look solid.

Update (f8b2c7a): test_wrench_composer.py now also removes the AppLauncher/simulation_app boilerplate entirely (and the teardown_module added in the previous commit). ✅ This test only exercises pure math/torch utilities and doesn't need the Omniverse app — same cleanup as test_noise.py. Good improvement for faster test execution.

No new concerns. PR looks ready for merge.

Update (ee49ecf): Changelog consolidation — moved fragment files from changelog.d/ into main CHANGELOG.rst files and bumped versions (isaaclab 5.2.0→5.2.1, isaaclab_newton 0.9.0→0.9.1, isaaclab_physx 0.7.0→0.7.1). ✅ Standard release housekeeping. No code changes, no new concerns.

Final status: PR looks good to merge.

AntoineRichard

LGTM

AntoineRichard · 2026-05-15T07:47:35Z

Consider removing this if we're going to merge this?

pbarejko · 2026-05-15T15:17:33Z

Closing because of #5633

# Description

This PR is based on and includes the changes from #5620, then adds one CI fix on top: it unsets `HUB__ARGS__DETECT_ONLY` inside the Docker test container before running Isaac Lab commands. Some base images set this flag, which prevents OmniHub from starting and makes cold Nucleus asset retrieval fall back to slow repeated retries.

This was reproduced from the failing Actions job: https://github.com/isaac-sim/IsaacLab/actions/runs/25904143763/job/76158743634

The affected `test_rsl_rl_export_flow.py` Dexsuite Kuka-Allegro export timed out at 600 s with the flag set, then completed in about 73 s with the flag unset after clearing the local KukaAllegro mirror.

Fixes # N/A

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Screenshots

N/A - CI-only change.

## Checklist

- [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation (N/A - CI-only change)
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works (validated with the affected Docker export test)
- [x] I have added a changelog fragment under `source/<pkg>/changelog.d/` for every touched package (N/A for the CI-only commit; #5620 carries its own changelog fragments)
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

## Test Plan

- `./isaaclab.sh -f`
- Docker reproduction with `HUB__ARGS__DETECT_ONLY=true`: `test_export_flow[Isaac-Dexsuite-Kuka-Allegro-Reorient-v0]` timed out after 600 s.
- Docker reproduction with `HUB__ARGS__DETECT_ONLY` unset after clearing the KukaAllegro mirror: `test_export_flow[Isaac-Dexsuite-Kuka-Allegro-Reorient-v0]` passed in 72.75 s.

---------

Co-authored-by: Piotr Barejko <pbarejko@nvidia.com>
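The CI fix in #5633 unsets the variable inside the Docker test container; how it does so is not shown here. As a stand-in written in Python, the sketch below shows the same idea for a command launched from a script. The helper names `env_without` and `run_without_detect_only` are hypothetical.

```python
import os
import subprocess


def env_without(name, base=None):
    """Return a copy of the environment with `name` removed if present."""
    env = dict(os.environ if base is None else base)
    env.pop(name, None)
    return env


def run_without_detect_only(cmd):
    """Run `cmd` with HUB__ARGS__DETECT_ONLY cleared from its environment."""
    return subprocess.run(cmd, env=env_without("HUB__ARGS__DETECT_ONLY"), check=True)
```

Working on a copy leaves the parent process environment untouched, so only the launched command sees the variable as unset.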

pbarejko requested review from hhansen-bdai and kellyguo11 as code owners May 14, 2026 22:24

github-actions Bot added bug Something isn't working infrastructure labels May 14, 2026

huidongc reviewed May 14, 2026

Comment thread tools/conftest.py

greptile-apps Bot reviewed May 14, 2026

isaaclab-review-bot Bot reviewed May 14, 2026

pbarejko requested a review from pascal-roth as a code owner May 14, 2026 22:47

github-actions Bot added the isaac-lab Related to Isaac Lab team label May 14, 2026

Update

67d10a4

pbarejko force-pushed the pbarejko/debugging branch from f438131 to 67d10a4 Compare May 15, 2026 03:08

pbarejko added 5 commits May 14, 2026 20:57

Defer use of torch

3b15efd

Explicit import of configclass

c51ba2b

Remove verbose flag

f52a750

Fix RTX Renderer utils tests

ba57283

rst files

fa628db

hujc7 mentioned this pull request May 15, 2026

[CI] Set OMP_/OPENBLAS_/MKL_NUM_THREADS=1 in test containers #5625

Closed

3 tasks

pbarejko added 3 commits May 14, 2026 22:39

Remove app for noise tests

d04e1a1

Shutdown app for wrench composer

c70e9e0

Remove app launcher from wrench composer test

f8b2c7a

hujc7 mentioned this pull request May 15, 2026

[CI][DO NOT MERGE] Test new Isaac Sim image latest-develop sha256:06197a67 #5630

Closed

AntoineRichard approved these changes May 15, 2026

Comment thread tools/conftest.py


Consider removing this if we're going to merge this?

AntoineRichard changed the title ~~Debugging~~ Prevents early numpy imports to avoid Kit crash May 15, 2026

AntoineRichard mentioned this pull request May 15, 2026

Fixes OmniHub startup in Docker tests #5633

Merged

7 tasks

Merge branch 'develop' into pbarejko/debugging

ee49ecf

pbarejko closed this May 15, 2026

hujc7 mentioned this pull request May 15, 2026

[Fix] Pin numpy!=2.3.5 to dodge OpenBLAS atfork SIGSEGV at Kit fork() #5642

Draft

3 tasks

pbarejko wants to merge 10 commits into isaac-sim:develop from pbarejko:pbarejko/debugging


3 participants
