
Feature/leapp export integration#5105

Open
frlai wants to merge 11 commits into isaac-sim:develop from frlai:feature/leapp_export_integration

Conversation

@frlai commented Mar 24, 2026

Description

FEATURE: adds new LEAPP export functionality targeting manager-based environments that use rsl_rl. Policies can be exported end to end. Also adds a new direct deployment environment that runs exported policies, bypassing all the manager classes.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

Additional TODOs:

  1. Add documentation
  2. Add integration into direct environment documentation
  3. Edit tests to work without pretrained checkpoints
  4. Type-hinting and general function name update

greptile-apps bot (Contributor) commented Mar 24, 2026

Greptile Summary

This PR introduces end-to-end LEAPP export support for RSL-RL manager-based environments and a new DirectDeploymentEnv that runs exported policies in simulation without Isaac Lab's training managers. It also adds pervasive @leapp_tensor_semantics annotations across articulation data, sensor data, and command terms so that LEAPP can automatically wire I/O during export.

Key changes and concerns:

  • CircularBuffer regression (circular_buffer.py): The refactor to support export tracing removes an important guard — calling __getitem__ when _num_pushes == 0 (e.g. right after reset() before the first append()) produces index_in_buffer = max_length, an out-of-bounds tensor index. The docstring still documents RuntimeError: If the buffer is empty but the check is gone. Additionally, _append now calls torch.cat every step, creating a fresh (max_length, batch_size, …) tensor each time instead of an in-place pointer write — a potential throughput regression for tasks with long history windows.
  • deploy.py resource leak: env.close() is inside the try block after the main loop; it is never called when KeyboardInterrupt interrupts env.step(). A finally block is needed.
  • export.py typo: Variable vilidate (line 289) is passed directly as validate=vilidate to leapp.compile_graph().
  • DirectDeploymentEnv: Well-structured; the I/O resolver correctly handles state:, command:, and write: connection strings. The hard-coded cfg.scene.num_envs = 1 in __init__ is consistent with single-env deployment intent.
  • LEAPP semantic annotations: The @leapp_tensor_semantics decorator is cleanly applied across the entire sensor/asset/command API surface; MRO-based semantic resolution in the proxy layer is correct.
  • The PR author notes that documentation, tests without pretrained checkpoints, and type-hinting are still outstanding TODOs.

Confidence Score: 3/5

  • Not yet safe to merge — the CircularBuffer out-of-bounds regression affects existing training/play workflows, not just the new export path.
  • The LEAPP export infrastructure and DirectDeploymentEnv are well-engineered, but the CircularBuffer refactor introduces a concrete out-of-bounds index regression (index_in_buffer = max_length when _num_pushes == 0) that affects any task using observation history — a broad regression outside the new feature scope. The deploy.py resource leak and export.py typo are minor but should be fixed before merge. The PR also self-notes missing documentation, tests without pretrained checkpoints, and type-hinting as outstanding TODOs.
  • source/isaaclab/isaaclab/utils/buffers/circular_buffer.py needs the empty-buffer guard restored; scripts/reinforcement_learning/deploy.py needs a finally block for env.close().

Important Files Changed

  • scripts/reinforcement_learning/deploy.py: New deployment script for LEAPP-exported policies; env.close() is never called on KeyboardInterrupt due to missing finally block.
  • scripts/reinforcement_learning/rsl_rl/export.py: New RSL-RL LEAPP export script; contains a typo (vilidate) passed directly to leapp.compile_graph(validate=…), and disables TorchScript globally before imports, which may surprise downstream users.
  • source/isaaclab/isaaclab/envs/direct_deployment_env.py: New environment that wires the LEAPP InferenceManager to scene entities; well-structured I/O resolution, but hardcodes num_envs = 1 without validation against the loaded LEAPP model.
  • source/isaaclab/isaaclab/utils/buffers/circular_buffer.py: Refactored to use torch.cat in _append (tensor allocation every step) and removed the empty-buffer guard in __getitem__, introducing a potential out-of-bounds index when _num_pushes == 0.
  • source/isaaclab/isaaclab/utils/leapp/export_annotator.py: Complex proxy-based patcher for annotation; disables training managers cleanly, uses a unified dedup cache for obs/action tensors, and correctly patches history buffers as LEAPP state.
  • source/isaaclab/isaaclab/utils/leapp/proxy.py: Proxy classes for transparent observation/action annotation; MRO-walking semantic resolution is correct and the dedup cache key (id(real_data), name) is appropriate.
  • source/isaaclab/isaaclab/utils/leapp/leapp_semantics.py: Clean semantic metadata decorator and element-name resolver helpers; no issues found.
  • source/isaaclab/isaaclab/sensors/ray_caster/ray_caster_data.py: Fields converted to private with public properties for LEAPP annotation; minor: original quat_w docstring said "(x, y, z, w)" but new docstring says "(w, x, y, z)" — verify the actual data convention matches QUAT_WXYZ.
  • source/isaaclab_rl/test/test_rsl_rl_export_flow.py: Integration tests launch export.py as a subprocess per task; gracefully skips missing checkpoints; no issues found.
  • source/isaaclab/isaaclab/assets/articulation/base_articulation_data.py: LEAPP semantic decorators added to all abstract properties; type annotations updated from `list[str] = None` to `list[str] | None = None`.
  • source/isaaclab/isaaclab/managers/manager_term_cfg.py: Added cmd_kind and element_names fields to CommandTermCfg for deployment metadata; inline comments explain intent, but formal docstrings would be cleaner for a public config class.

Sequence Diagram

sequenceDiagram
    participant U as User / Script
    participant EX as export.py
    participant EA as ExportPatcher
    participant ENV as ManagerBasedRLEnv
    participant LP as leapp (annotate)
    participant DD as DirectDeploymentEnv
    participant IM as InferenceManager (LEAPP)

    Note over U,LP: Export Flow
    U->>EX: run export.py --task --use_pretrained_checkpoint
    EX->>EA: patch_env_for_export(env, task_name, method)
    EA->>ENV: patch obs_manager, action_manager, history buffers
    EX->>LP: leapp.start(task_name)
    EX->>ENV: env.reset()
    loop validation_steps
        EX->>ENV: policy(obs)  [traced through annotated proxies]
        ENV-->>LP: annotate.input_tensors(state reads)
        ENV-->>LP: annotate.output_tensors(action writes)
        EX->>ENV: env.step(actions)
    end
    EX->>LP: leapp.stop()
    EX->>LP: leapp.compile_graph() → .onnx + .yaml

    Note over U,IM: Deployment Flow
    U->>DD: DirectDeploymentEnv(cfg, leapp.yaml)
    DD->>IM: InferenceManager(leapp.yaml)
    DD->>DD: _resolve_io() [parse state:/command:/write: connections]
    loop simulation running
        DD->>DD: _read_inputs() [scene.data.property / command_manager]
        DD->>IM: run_policy(inputs)
        IM-->>DD: outputs
        DD->>DD: _write_outputs() [entity.set_joint_*_target_*]
        DD->>DD: decimation physics loop
    end

Last reviewed commit: "precommit updates"

Comment on lines +289 to +290
vilidate = args_cli.validation_steps > 0
leapp.compile_graph(visualize=not args_cli.disable_graph_visualization, validate=vilidate)

P2 Typo: vilidate should be validate

The variable on line 289 is misspelled as vilidate and is immediately passed as validate=vilidate to leapp.compile_graph(). The code still runs (Python does not enforce variable spelling), but a reader tracing through the code will be confused by the typo.

Suggested change

Before:

    vilidate = args_cli.validation_steps > 0
    leapp.compile_graph(visualize=not args_cli.disable_graph_visualization, validate=vilidate)

After:

    validate = args_cli.validation_steps > 0
    leapp.compile_graph(visualize=not args_cli.disable_graph_visualization, validate=validate)

Comment on lines +55 to +61
try:
with torch.inference_mode():
while simulation_app.is_running():
env.step()
env.close()
except KeyboardInterrupt:
pass

P2 env.close() never called on KeyboardInterrupt

env.close() sits inside the try block after the run loop, so it is only executed when simulation_app.is_running() returns False on its own. If a KeyboardInterrupt (or any other exception) interrupts env.step(), the except KeyboardInterrupt: pass path silently swallows the interrupt without calling env.close(), leaving scene handles and the SimulationContext open.

Move env.close() into a finally block so cleanup always runs:

Suggested change

Before:

    try:
        with torch.inference_mode():
            while simulation_app.is_running():
                env.step()
        env.close()
    except KeyboardInterrupt:
        pass

After:

    env.reset()
    try:
        with torch.inference_mode():
            while simulation_app.is_running():
                env.step()
    except KeyboardInterrupt:
        pass
    finally:
        env.close()

frlai (Author) replied:
This is based on play.py. Since play.py does not implement this, for uniformity I believe deploy.py also should not.

Comment on lines 158 to 165

    # admissible lag
    valid_keys = torch.minimum(key, self._num_pushes - 1)
-   # the index in the circular buffer (pointer points to the last+1 index)
-   index_in_buffer = torch.remainder(self._pointer - valid_keys, self.max_length)
+   # The buffer is stored oldest->newest along dimension 0, so the most
+   # recent item lives at the last index.
+   index_in_buffer = (self.max_length - 1 - valid_keys).to(dtype=torch.long)
    # return output
    return self._buffer[index_in_buffer, self._ALL_INDICES]

P2 Out-of-bounds index when buffer is accessed before any push (or right after reset)

The previous implementation guarded against this with an explicit RuntimeError:

if torch.any(self._num_pushes == 0) or self._buffer is None:
    raise RuntimeError("Attempting to retrieve data on an empty circular buffer. Please append data first.")

That check was removed, but the docstring on lines 151-153 still documents RuntimeError: If the buffer is empty.

With the new code, when _num_pushes[i] == 0 for any environment index i (which is always true immediately after reset() and before the first append()):

valid_keys[i] = torch.minimum(key[i], 0 - 1)  # → -1
index_in_buffer[i] = (max_length - 1) - (-1)  # → max_length  (out of bounds!)

self._buffer[max_length, …] is an out-of-bounds index access on a tensor whose first dimension is exactly max_length, causing a silent wrong-index read or a CUDA assertion failure instead of the documented RuntimeError.

The guard should be restored (or valid_keys should be clamped to zero before computing index_in_buffer).
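A minimal, self-contained sketch of what the restored guard could look like (a toy buffer with illustrative names following the snippet above; not the actual Isaac Lab class):

```python
import torch


class CircularBuffer:
    """Minimal ring buffer illustrating the empty-buffer guard discussed above."""

    def __init__(self, max_length: int, batch_size: int):
        self.max_length = max_length
        self._num_pushes = torch.zeros(batch_size, dtype=torch.long)
        self._buffer = torch.zeros(max_length, batch_size)
        self._ALL_INDICES = torch.arange(batch_size)

    def append(self, data: torch.Tensor):
        # shift-based append, as in the PR's refactor
        self._buffer = torch.cat([self._buffer[1:], data.unsqueeze(0)], dim=0)
        self._num_pushes += 1

    def __getitem__(self, key: torch.Tensor) -> torch.Tensor:
        # restored guard: fail loudly instead of reading index max_length
        if torch.any(self._num_pushes == 0):
            raise RuntimeError(
                "Attempting to retrieve data on an empty circular buffer. Please append data first."
            )
        # clamp the lag to the number of valid entries
        valid_keys = torch.minimum(key, self._num_pushes - 1)
        # newest item lives at the last index in the shift-based layout
        index_in_buffer = (self.max_length - 1 - valid_keys).to(dtype=torch.long)
        return self._buffer[index_in_buffer, self._ALL_INDICES]
```

With the guard in place, accessing the buffer right after construction (or reset) raises the documented RuntimeError, while a single append followed by key 0 returns the most recent row.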

Comment on lines +135 to +136
def _append(self, data: torch.Tensor):
    self._buffer = torch.cat([self._buffer[1:], data.unsqueeze(0)], dim=0)

P2 _append allocates a new tensor every step

The new implementation of _append:

def _append(self, data: torch.Tensor):
    self._buffer = torch.cat([self._buffer[1:], data.unsqueeze(0)], dim=0)

allocates a brand-new (max_length, batch_size, …) tensor on every call — O(max_length × batch_size × feature_dim) bytes per environment step. The previous in-place pointer-based write was O(1) in memory.

For tasks using long history windows (e.g. history_length=50) and large batch_size, this can cause measurable throughput regression during training or play.

The motivation stated in the commit (tracing-based export compatibility) could instead be achieved by keeping the in-place write for runtime and only switching to torch.cat inside the _patch_history_buffer_append wrapper used during export tracing.
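One way to realize that split, sketched under the assumption of the pointer-based layout the reviewer describes (PointerRingBuffer and its method names are hypothetical, not the PR's code):

```python
import torch


class PointerRingBuffer:
    """Pointer-based ring buffer: appends are in-place index writes, no allocation."""

    def __init__(self, max_length: int, batch_size: int):
        self.max_length = max_length
        self._pointer = -1
        self._buffer = torch.zeros(max_length, batch_size)

    def _append(self, data: torch.Tensor):
        # advance the write pointer and overwrite one row in place: O(1) memory
        self._pointer = (self._pointer + 1) % self.max_length
        self._buffer[self._pointer] = data

    def _append_traceable(self, data: torch.Tensor):
        # shift-based variant used only while tracing for export; allocates a
        # fresh (max_length, batch_size) tensor on every call
        self._buffer = torch.cat([self._buffer[1:], data.unsqueeze(0)], dim=0)
```

During export, a patcher could swap `buf._append = buf._append_traceable` on the instance, leaving the training-time O(1) path untouched.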

@github-actions bot added the enhancement (New feature or request) and isaac-lab (Related to Isaac Lab team) labels on Mar 25, 2026
@isaaclab-review-bot (bot) left a comment

🤖 Isaac Lab Review Bot

PR: #5105 — Feature/LEAPP Export Integration
Author: @frlai | Changes: +2684/−56 across 33 files


Summary

This PR adds end-to-end LEAPP export functionality for RSL-RL policies trained in manager-based Isaac Lab environments, plus a new DirectDeploymentEnv that can replay exported policies by bypassing all manager classes. The approach uses proxy objects to intercept tensor reads/writes during a tracing pass, then annotates them with semantic metadata for LEAPP graph compilation.

Core new modules:

  • isaaclab/utils/leapp/ — export annotator, proxy chain, semantic metadata decorators, utility helpers
  • isaaclab/envs/direct_deployment_env.py — lightweight deployment env wired via YAML connection strings
  • scripts/reinforcement_learning/rsl_rl/export.py — export script
  • scripts/reinforcement_learning/deploy.py — deployment script
  • isaaclab_rl/test/test_rsl_rl_export_flow.py — integration tests (60+ tasks)

Cross-cutting changes:

  • @leapp_tensor_semantics decorators on ~150+ properties across articulation, rigid object, rigid object collection, contact sensor, frame transformer, IMU, and ray caster data classes
  • CircularBuffer rewritten from pointer-based ring to shift-based append (for traceability)
  • RayCasterData fields converted from plain attributes to @property + private backing fields
  • CommandTermCfg extended with cmd_kind and element_names
  • Minor observation function fixes for tracing compatibility (humanoid, deploy, dexsuite)

Architecture Assessment

Strengths:

  1. The proxy-based approach (_EnvProxy → _SceneProxy → _EntityProxy → _DataProxy) is genuinely elegant — it avoids modifying any concrete data class implementations and resolves semantics lazily via MRO walking.
  2. The shared dedup cache between observation and action paths ensures a single LEAPP input edge for properties read from both sides.
  3. The isaaclab_connection protocol in the YAML provides a clean deployment contract between export and inference.
  4. Separating export annotation (export_annotator.py) from deployment wiring (direct_deployment_env.py) is a good architectural choice.
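The interception idea behind strength 1 can be illustrated with a toy attribute proxy (a simplified sketch; the real _DataProxy and its decorator-attached metadata are more involved):

```python
class DataProxy:
    """Wraps a data object and records which annotated attributes are read through it."""

    def __init__(self, real_data, recorder):
        # use object.__setattr__ so these assignments don't recurse through __getattr__
        object.__setattr__(self, "_real_data", real_data)
        object.__setattr__(self, "_recorder", recorder)

    def __getattr__(self, name):
        value = getattr(self._real_data, name)
        # walk the MRO of the wrapped class to find semantic metadata, a
        # simplified stand-in for what @leapp_tensor_semantics would attach
        for klass in type(self._real_data).__mro__:
            meta = getattr(klass, "_leapp_semantics", {}).get(name)
            if meta is not None:
                self._recorder.append((name, meta))
                break
        return value
```

Reads pass through unchanged, while the recorder accumulates (attribute, semantic) pairs that an exporter could later turn into LEAPP input edges.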

Concerns:

  1. Hard dependency on leapp at import time in core modules. Every data class (base_articulation_data.py, base_rigid_object_data.py, all sensor data classes) now imports from leapp import InputKindEnum and from isaaclab.utils.leapp.leapp_semantics import ... at module level. This means leapp becomes a mandatory dependency for anyone who imports Isaac Lab, even if they never intend to export. This is a significant change to the dependency footprint. Consider making the import conditional or lazy (e.g., TYPE_CHECKING guard for the enum types, with the decorator doing a runtime check).

  2. Missing __init__.py for isaaclab/utils/leapp/ — the package directory has no __init__.py. While implicit namespace packages work in Python 3.3+, Isaac Lab consistently uses explicit __init__.py files. This should be added for consistency and to define the public API.

  3. CircularBuffer rewrite has subtle behavioral change. The shift from torch.roll to torch.cat([self._buffer[1:], data.unsqueeze(0)], dim=0) allocates a new tensor on every append (O(max_len) copy). The old implementation did O(1) pointer-based writes. For large buffers or high-frequency use, this is a performance regression. The tradeoff is valid for export traceability, but it should be documented and benchmarked. Consider keeping the original O(1) path for training and only switching to the traceable path during export.


Code Quality Issues

  1. Typo in export.py line 279: vilidate = args_cli.validation_steps > 0 — should be validate.

  2. export.py docstring says "Train an RL agent" (line 35) — should say "Export an RL agent checkpoint".

  3. RayCasterData.quat_w docstring inconsistency: The new property docstring says (w, x, y, z) but the original field and Isaac Lab convention use (x, y, z, w). The QUAT_WXYZ_ELEMENT_NAMES also uses ["qw", "qx", "qy", "qz"]. Clarify which convention is actually used.

  4. CommandTermCfg fields lack proper docstrings. The new cmd_kind and element_names fields have only inline comments. They should have proper RST-formatted docstrings like other fields in the class.

  5. _first_param_name in direct_deployment_env.py works correctly for bound methods (where inspect.signature strips self), but add a comment clarifying that bound methods are expected input.

  6. direct_deployment_env.py hardcodes num_envs = 1. This is reasonable for deployment but should be documented as a deliberate constraint.


Export Correctness

  1. ensure_torch_tensor silently swallows conversion errors. In utils.py, the except Exception catch-all means that if warp-to-torch conversion fails for any reason, the raw warp array is passed through silently. This could lead to hard-to-debug issues. At minimum, log a warning.

  2. patch_warp_to_torch_passthrough mutates a global function (wp.to_torch). This side-effect persists for the entire process lifetime with no way to unpatch. Consider using a context manager pattern.

  3. History buffer patching (_patch_history_buffer_append) replaces _append on the instance. The new CircularBuffer._append method seems designed specifically for this monkey-patching, but it is not documented as an extension point. If the buffer implementation changes, this silently breaks.

  4. torch.jit._state.disable() at module level in export.py disables TorchScript for the entire process. This is extremely aggressive — if any other code in the pipeline relies on JIT compilation, it breaks silently.
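The context-manager pattern suggested in item 2 can be sketched generically (patched_attr is a hypothetical helper; warp itself is only referenced, not imported):

```python
import contextlib


@contextlib.contextmanager
def patched_attr(obj, name, replacement):
    """Temporarily replace obj.<name>, restoring the original on exit.

    Applied as e.g. `patched_attr(wp, "to_torch", passthrough_fn)`, this keeps
    the wp.to_torch mutation scoped to the export pass instead of leaking for
    the whole process lifetime.
    """
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        # runs even if the export pass raises, so the patch cannot leak
        setattr(obj, name, original)
```

The same wrapper would also cover the torch.jit concern in item 4 if the disable call had an enable counterpart to restore.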


Test Coverage

The test file (test_rsl_rl_export_flow.py) is a solid integration test suite covering 60+ tasks. However:

  1. Tests require pretrained checkpoints — they skip if unavailable. The PR checklist explicitly notes "Edit tests to work without pretrained checkpoints" as a TODO. This means the test suite may be effectively a no-op in CI.

  2. No unit tests for the core modules: export_annotator.py, proxy.py, leapp_semantics.py, direct_deployment_env.py, or the CircularBuffer changes. The proxy chain and cache deduplication logic are complex enough to warrant targeted unit tests.

  3. No deployment tests: DirectDeploymentEnv is completely untested.


Documentation

The PR checklist acknowledges missing documentation:

  • "Add documentation" — TODO
  • "Add integration into direct environment documentation" — TODO

For a feature of this scope, documentation is critical before merge.


CI Status

labeler — pass (only job that ran)

No lint, build, or test CI jobs executed. This is concerning for a 33-file, 2600+ line PR.


Verdict

The architecture is well-designed and the proxy-based annotation approach is clever. Key items to address:

Priority | Issue
P0 | Hard leapp dependency in core modules — must be optional/lazy
P0 | Missing __init__.py for isaaclab/utils/leapp/
P1 | CircularBuffer performance regression — document or mitigate
P1 | Typo: vilidate → validate
P1 | No unit tests for core proxy/annotation logic
P2 | Documentation required per contribution guidelines
P2 | ensure_torch_tensor silent error swallowing
P2 | Global side effects (wp.to_torch mutation, torch.jit disable) need documentation
