
Update DexSuite camera mini-batches#5668

Merged
kellyguo11 merged 1 commit into
isaac-sim:develop from
ooctipus:zhengyuz/dexsuite-camera-minibatches
May 18, 2026


@ooctipus
Collaborator

@ooctipus ooctipus commented May 18, 2026

Summary

Since heterogeneous support for Newton (#5024) landed, the base camera training for DexSuite has struggled to reach a decent success rate within 2000 iterations. Increasing the number of mini-batches from 2 to 8 helps recover the sample efficiency.

sha               | description                    | SR     | verdict
------------------+--------------------------------+--------+--------
6621d49b          | parent commit                  | 0.7122 | PASS
965136dc          | PR #5024 default               | 0.0051 | FAIL
965136dc_minib8   | PR #5024 + num_mini_batches=8  | 0.6911 | PASS
  • Updated DexSuite Kuka-Allegro single-camera and duo-camera RSL-RL PPO examples to use 8 mini-batches per update.
  • Added an isaaclab_tasks changelog fragment for the example config change.
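
To put the three success rates in the table on a common scale, each can be divided by the parent commit's value. The sketch below does only that arithmetic on the SR numbers reported above. The dictionary keys are illustrative labels, and no pass/fail threshold is assumed because the PR does not state one.

```python
# Success rates (SR) copied from the table above; the keys are illustrative labels.
success_rate = {
    "parent": 0.7122,
    "pr5024_default": 0.0051,
    "pr5024_minib8": 0.6911,
}


def relative_to_parent(label: str) -> float:
    """Return a run's SR as a fraction of the parent commit's SR."""
    return success_rate[label] / success_rate["parent"]


for label, sr in success_rate.items():
    print(f"{label:>16}: SR={sr:.4f} ({relative_to_parent(label):.1%} of parent)")
```

Under this measure, the run with num_mini_batches=8 recovers roughly 97% of the parent commit's success rate, while the default configuration after #5024 reaches under 1% of it.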

Set the camera-based DexSuite Kuka-Allegro RSL-RL examples to
use 8 mini-batches per update for PPO training.
@ooctipus ooctipus requested a review from Mayankm96 as a code owner May 18, 2026 08:11
@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label May 18, 2026
@greptile-apps
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR updates the num_mini_batches hyperparameter from 2 to 8 for the single_camera and duo_camera DexSuite Kuka-Allegro RSL-RL PPO runner configs, and adds a corresponding changelog fragment.

  • Both camera-based PPO runner configs (single_camera, duo_camera) now split each rollout buffer into 8 mini-batches per update instead of 2, reducing per-update batch size — a common adjustment when CNN-based observation pipelines increase per-sample memory cost.
  • The state-only default config is unaffected and continues to use num_mini_batches=4 from ALGO_CFG.
  • A changelog fragment is included that accurately describes the change.
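
The relationship between the shared base config and the camera overrides described in these bullets can be sketched as follows. This is a minimal illustration using plain dataclasses, not the actual Isaac Lab or RSL-RL config classes: `PpoAlgoCfg`, `SINGLE_CAMERA_ALGO_CFG`, and `DUO_CAMERA_ALGO_CFG` are hypothetical names, and only the values quoted in this summary (4 for the base, 8 for the camera configs, 5 learning epochs) are taken from the PR.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PpoAlgoCfg:
    """Hypothetical stand-in for a PPO algorithm config (not the Isaac Lab class)."""

    num_learning_epochs: int = 5
    num_mini_batches: int = 4  # value the summary attributes to the shared ALGO_CFG


# Shared base used by the state-only default config; the PR leaves it unchanged.
ALGO_CFG = PpoAlgoCfg()

# Camera-based variants override only the mini-batch count (2 before the PR, 8 after).
SINGLE_CAMERA_ALGO_CFG = replace(ALGO_CFG, num_mini_batches=8)
DUO_CAMERA_ALGO_CFG = replace(ALGO_CFG, num_mini_batches=8)

print(ALGO_CFG.num_mini_batches, SINGLE_CAMERA_ALGO_CFG.num_mini_batches)
```

Deriving the camera variants from the base with `replace` keeps every other field in sync, so a later change to the base does not silently diverge from the camera configs.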

Confidence Score: 5/5

Safe to merge — only a numeric hyperparameter is changed in two example configs with no effect on any shared base config or other tasks.

The change is isolated to two camera-based example runner configs. num_steps_per_env=32 is evenly divisible by 8, so the rollout buffer splits cleanly. The base ALGO_CFG and the state-only default config are untouched, and the changelog entry matches the change.
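
As a rough illustration of what the change does to each PPO update, the sketch below computes the samples per mini-batch and the number of gradient steps per rollout. It assumes the rollout buffer is flattened to `num_envs * num_steps_per_env` samples and split with floor division, which is a common PPO convention and not confirmed here for this code path. `NUM_ENVS = 1024` is a hypothetical value, since the PR does not state the environment count.

```python
def ppo_update_shape(
    num_envs: int,
    num_steps_per_env: int,
    num_mini_batches: int,
    num_learning_epochs: int,
) -> tuple[int, int]:
    """Return (samples per mini-batch, gradient steps per rollout)."""
    total_samples = num_envs * num_steps_per_env
    mini_batch_size = total_samples // num_mini_batches  # assumed floor division
    gradient_steps = num_learning_epochs * num_mini_batches
    return mini_batch_size, gradient_steps


NUM_ENVS = 1024  # hypothetical; not stated in the PR

before = ppo_update_shape(NUM_ENVS, num_steps_per_env=32, num_mini_batches=2, num_learning_epochs=5)
after = ppo_update_shape(NUM_ENVS, num_steps_per_env=32, num_mini_batches=8, num_learning_epochs=5)

print("before:", before)  # (16384, 10)
print("after: ", after)   # (4096, 40)
```

Under these assumptions, going from 2 to 8 mini-batches makes each mini-batch four times smaller and gives four times as many gradient steps per rollout.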

No files require special attention.

Important Files Changed

Filename | Overview
---------+---------
source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/dexsuite/config/kuka_allegro/agents/rsl_rl_ppo_cfg.py | Bumps num_mini_batches from 2 to 8 for both single_camera and duo_camera PPO runner configs; base ALGO_CFG (used by the default state-only config) remains at 4.
source/isaaclab_tasks/changelog.d/zhengyuz-dexsuite-camera-minibatches.rst | New changelog fragment accurately describing the mini-batch count change for camera-based PPO examples.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Rollout Collection\nnum_steps_per_env=32 per env] --> B[Full Rollout Buffer\n32 × num_envs steps]
    B --> C{Split into mini-batches}
    C -->|Before PR\nnum_mini_batches=2| D[2 mini-batches\n16 steps/env each]
    C -->|After PR\nnum_mini_batches=8| E[8 mini-batches\n4 steps/env each]
    D --> F[PPO Update\nnum_learning_epochs=5]
    E --> F
    F --> G[Next Rollout]
    style E fill:#90EE90
    style D fill:#FFB6C1

Reviews (1): Last reviewed commit: "Update DexSuite camera mini-batches"


@isaaclab-review-bot isaaclab-review-bot Bot left a comment


Code Review: PR #5668 - Update DexSuite camera mini-batches

Summary

This is a well-motivated configuration fix that addresses a training regression introduced by PR #5024 (heterogeneous Newton backend support). The empirical evidence showing success rate recovery from 0.51% to 69.11% is compelling.

✅ Strengths

  1. Clear motivation with data: The PR provides concrete metrics demonstrating the issue and fix effectiveness
  2. Consistent changes: Both single_camera and duo_camera configs are updated symmetrically
  3. Proper changelog fragment: Follows the repository's changelog convention

📝 Observations

  1. Mini-batch ratio change: The camera-based configs now use 8 mini-batches vs. the default ALGO_CFG value of 4. This 2x multiplier for camera-based training makes sense given the higher memory/compute requirements of CNN policies, but it would be helpful to document why camera configs specifically need this.

  2. CI Status: The rendering-correctness check is failing, though this appears unrelated to the config changes in this PR. Worth confirming this is a known flaky test or pre-existing issue on develop.

  3. Divisibility consideration: With num_steps_per_env = 32 and num_mini_batches = 8, the effective mini-batch size will be (num_envs * 32) / 8. This should work fine for typical environment counts, but ensure the total samples per update is divisible by 8 for the environments where these configs are used.
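
The divisibility point in observation 3 can be checked with a few lines of arithmetic. The sketch below assumes the total sample count per update is `num_envs * num_steps_per_env`, as written in the observation. `leftover_samples` is a hypothetical helper, and the counterexample values (30 steps, 3 environments) are made up purely for contrast.

```python
def leftover_samples(num_envs: int, num_steps_per_env: int = 32, num_mini_batches: int = 8) -> int:
    """Samples that would not fit into equally sized mini-batches."""
    return (num_envs * num_steps_per_env) % num_mini_batches


# Because 32 is itself a multiple of 8, the total is divisible for any integer num_envs.
always_clean = all(leftover_samples(n) == 0 for n in range(1, 5000))
print("32 steps, 8 mini-batches always divide evenly:", always_clean)

# Hypothetical contrast: a step count that is not a multiple of 8 can leave a remainder.
print("30 steps, 3 envs leftover:", leftover_samples(3, num_steps_per_env=30))
```

With the values used in this PR, the split is therefore clean regardless of the environment count. The check only becomes relevant if num_steps_per_env is later changed to a value that is not a multiple of num_mini_batches.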

💭 Optional Suggestions (Non-blocking)

  • Consider adding a brief inline comment in the config explaining why camera-based training uses a higher mini-batch count than the default (e.g., # Higher mini-batch count for better sample efficiency with CNN policies)

Verdict

The change is straightforward, well-justified with empirical data, and follows the existing code patterns. LGTM from a code perspective.

@kellyguo11 kellyguo11 moved this to Ready to merge in Isaac Lab May 18, 2026
@kellyguo11 kellyguo11 merged commit 6fbce9e into isaac-sim:develop May 18, 2026
64 of 65 checks passed
@github-project-automation github-project-automation Bot moved this from Ready to merge to Done in Isaac Lab May 18, 2026
matthewtrepte pushed a commit to matthewtrepte/IsaacLab that referenced this pull request May 18, 2026
## Summary
Since heterogeneous support for Newton (isaac-sim#5024) landed, the base camera
training for DexSuite has struggled to reach a decent success rate within
2000 iterations. Increasing the number of mini-batches from 2 to 8 helps
recover the sample efficiency.
```
sha               | description                            | SR     | verdict
------------------+----------------------------------------+--------+--------
6621d49           | parent commit                          | 0.7122 | PASS
965136d           | PR isaac-sim#5024 default              | 0.0051 | FAIL
965136dc_minib8   | PR isaac-sim#5024 + num_mini_batches=8 | 0.6911 | PASS
```
 
- Updated DexSuite Kuka-Allegro single-camera and duo-camera RSL-RL PPO
examples to use 8 mini-batches per update.
- Added an isaaclab_tasks changelog fragment for the example config
change.
