Replace direct np.random.* calls with np.random.RandomState instances#8798
Conversation
Replace global `np.random` function calls with proper `RandomState` instances for reproducibility: - transforms/utils.py: Replace `np.random.random.__self__` (3 sites) with `np.random.RandomState()` in generate_pos_neg_label_crop_centers, weighted_patch_samples, and get_extreme_points - transforms/signal/array.py: Replace `np.random.choice` (2 sites) with `self.R.choice` in SignalRandAddSine and SignalRandAddSquarePulsePartial (classes already inherit Randomizable) - data/synthetic.py: Replace `np.random.random.__self__` (2 sites) with `np.random.RandomState()` in create_test_image_2d/3d - data/utils.py: Replace `np.random.randint` fallback with `np.random.RandomState().randint` in get_random_patch - utils/ordering.py: Replace `np.random.shuffle` with `np.random.RandomState().shuffle` in random ordering Ref Project-MONAI#6888 Signed-off-by: SexyERIC0723 <haoyuwang144@gmail.com>
📝 WalkthroughWalkthroughThis pull request refactors random number generation initialization across five modules in MONAI's data and transform layers. When no explicit random state is provided to functions, the code now creates fresh Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monai/data/synthetic.py`:
- Around line 65-66: The default for random_state in monai/data/synthetic.py
must remain tied to the global numpy RNG so callers using np.random.seed(...)
stay deterministic: when random_state is None, set rs to the global RandomState
(e.g., use np.random.random.__self__ or an equivalent access to the global RNG)
instead of creating a new np.random.RandomState(), and update the
create_test_image_2d docstring to correctly state that the default uses the
global np.random state; ensure the change is applied where rs is set and in the
docstring lines referencing the default behavior so tests like
create_test_image_2d remain deterministic.
In `@monai/data/utils.py`:
- Around line 123-124: The code creates an unseeded RandomState() when
rand_state is None which breaks reproducibility; change the default path to use
the global RNG (np.random.randint) instead of np.random.RandomState(), i.e., use
rand_int = np.random.randint if rand_state is None else rand_state.randint and
keep the existing min_corner generation using rand_int; update callers/tests to
add coverage for get_random_patch(..., rand_state=None) (or the function in this
module that defines rand_state) to assert reproducible behavior after
np.random.seed(...) so the public API remains deterministic when a caller relies
on the global seed.
In `@monai/transforms/utils.py`:
- Around line 668-669: Add unit tests that verify reproducibility when the
functions in monai.transforms.utils that default rand_state to
np.random.RandomState() are called without an explicit rand_state: set a global
seed via np.random.seed(some_int), call the target function twice (back-to-back)
and assert the outputs are identical, and also assert that supplying an explicit
RandomState yields equivalent deterministic behavior; target the three functions
that use "if rand_state is None: rand_state = np.random.RandomState()" so the
tests exercise the default-path (no rand_state argument) and confirm whether
global seeding produces reproducible results.
In `@monai/utils/ordering.py`:
- Around line 204-205: The random ordering branch (ordering_type="random")
creates an unseeded RNG via np.random.RandomState().shuffle(idx_np) which breaks
reproducibility; change this to use the global RNG (np.random.shuffle(idx_np))
or, better, accept an injected seed/RNG on the class/constructor and call
np.random.RandomState(self.rng).shuffle(idx_np) (or use self.rng.shuffle when
passing a RandomState/Generator) so ordering respects seeding; update the code
references in ordering.py where idx_np is shuffled and add unit tests that
exercise the random ordering path to assert reproducible output given the same
seed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41d8f354-47da-46e8-bca8-6a3dc28a893d
📒 Files selected for processing (5)
monai/data/synthetic.pymonai/data/utils.pymonai/transforms/signal/array.pymonai/transforms/utils.pymonai/utils/ordering.py
The previous commit replaced global np.random.* calls with unseeded np.random.RandomState() instances. This broke 12 tests because callers relying on np.random.seed() for determinism no longer get reproducible results from these public API functions. Revert transforms/utils.py, data/synthetic.py, data/utils.py, and utils/ordering.py back to the global RNG. These utility functions accept an optional rand_state parameter -- when None, they intentionally fall back to the global RNG to respect np.random.seed(). The signal/array.py fix (self.R.choice) is retained because those classes already inherit from Randomizable and use self.R for all other random operations -- np.random.choice was an inconsistency. Signed-off-by: SexyERIC0723 <haoyuwang144@gmail.com>
|
Thank you for the review, @coderabbitai! You're absolutely right — replacing the global RNG with unseeded I've pushed a follow-up commit that reverts the changes to:
These utility functions accept an optional The only change retained is in |
|
Tip For best results, initiate chat on the files or code changes.
|
atharvajoshi01
left a comment
There was a problem hiding this comment.
Clean fix. Using self.R (the RandomState instance) instead of np.random.choice ensures reproducibility when a seed is set. Both call sites in SignalFillEmpty and the square pulse transform had the same issue.
ericspod
left a comment
There was a problem hiding this comment.
Hi @haoyu-haoyu I see there was more work here that has been reverted, this seems like the right call as pointed out and the remaining changes are fine.
Description
Replace direct
np.random.*global function calls with propernp.random.RandomStateinstances for reproducibility, as requested in #6888.Changes
monai/transforms/utils.pynp.random.random.__self__np.random.RandomState()monai/transforms/signal/array.pynp.random.choiceself.R.choice(classes already Randomizable)monai/data/synthetic.pynp.random.random.__self__np.random.RandomState()monai/data/utils.pynp.random.randintfallbacknp.random.RandomState().randintmonai/utils/ordering.pynp.random.shufflenp.random.RandomState().shuffleScope
This PR covers 9 functional call sites across 5 files. The remaining
np.random.*calls in the codebase are either:np.random.seedinset_determinism()— intentionally sets global stateRandomizableinheritance, left for a follow-upChecks
Signed-off-byincluded (DCO)self.RfromRandomizablebase classRef #6888