[MAINT] Use Numpy Random Generator to replace legacy RandomState #4084
👋 @ymzayek Thanks for creating a PR! Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a GitHub draft. Please make sure it complies with our contributing guidelines. In particular, be sure it checks the boxes listed below.
For new features:
For bug fixes:
We will review it as quickly as possible. Feel free to ping us with questions if needed.
I updated the codebase where possible to use the new generator. In
I have not looked at it in detail, but could we not just expand the accepted types to also support more "modern" rng options?
Reread the top message. Ignore what I just wrote.
Yes, I should've been clearer. It turns out that the public API only lets you pass a RandomState in the cases where it is eventually passed to a sklearn function that uses
LGTM, thx.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4084      +/-   ##
==========================================
- Coverage   91.63%   91.56%   -0.07%
==========================================
  Files         143      143
  Lines       16115    16113       -2
  Branches     3353     3353
==========================================
- Hits        14767    14754      -13
- Misses        800      813      +13
+ Partials      548      546       -2
@@ -979,7 +980,7 @@ def test_decoder_multiclass_error_incorrect_cv(multiclass_data):


def test_decoder_multiclass_warnings(multiclass_data, rng):
    X, y, _ = multiclass_data
    groups = rng.binomial(2, 0.3, size=len(y))
Not sure I understand this change.
Is there any reason we cannot use the rng fixture here?
If we cannot, then it should be removed from the test's arguments, as it is not used.
@Remi-Gau In some test cases the random numbers generated by the two strategies differ enough to cause a failure, so those tests need a different seed. Where I call the function with an explicit seed like this, it is because the fixture's seed of 42 makes the test fail.
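For readers following along, here is a minimal sketch of why the same seed can behave differently under the two strategies. It only assumes NumPy; the variable names are illustrative and are not taken from the PR. The legacy RandomState and the Generator returned by default_rng use different underlying bit generators, so seeding both with 42 does not yield the same draws.

```python
import numpy as np

seed = 42  # the fixture seed mentioned in the comment above

# Legacy API versus the newer Generator API, both seeded identically.
legacy = np.random.RandomState(seed)
modern = np.random.default_rng(seed)

legacy_draws = legacy.random_sample(5)
modern_draws = modern.random(5)

# The two streams are not expected to agree, even with the same seed.
streams_match = np.allclose(legacy_draws, modern_draws)

# Each strategy is still reproducible on its own.
repeatable = np.array_equal(modern_draws, np.random.default_rng(seed).random(5))

print(streams_match, repeatable)
```

A test whose assertions happened to pass with the legacy stream for a given seed may therefore need a different seed once it draws from a Generator.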
ok but then we can remove rng from the list of arguments, no?
I may need more coffee but I don't see it being called anywhere else in the test.
Looks like I need more coffee haha
Removing it
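A rough sketch of the resulting pattern, for illustration only: the test builds its own generator from an explicit seed and no longer lists the rng fixture among its arguments. The helper name draw_groups, the test name, and the seed value 0 are hypothetical; the thread does not state the seed actually used.

```python
import numpy as np


def draw_groups(n_samples, seed):
    """Hypothetical helper: draw group labels from a test-local generator."""
    rng = np.random.default_rng(seed)
    return rng.binomial(2, 0.3, size=n_samples)


def test_groups_sketch():
    # No `rng` fixture argument: the generator is created inside the test.
    y = np.zeros(30)
    groups = draw_groups(len(y), seed=0)  # seed value is an assumption
    assert groups.shape == y.shape
    # binomial(2, p) can only produce 0, 1, or 2.
    assert set(np.unique(groups)) <= {0, 1, 2}
```

Keeping the seed next to the draw makes it obvious which tests deviate from the shared fixture.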
Except for one change I don't understand, the rest is good with me.
Thanks for taking care of this one @ymzayek
np.random.default_rng for rng fixture #4055
Changes proposed in this pull request:
numpy.random.default_rng
sklearn.utils.check_random_state internally to instantiate a RandomState object.
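As a small illustration of the last point, the sketch below shows what sklearn.utils.check_random_state returns for the inputs I am certain about: an integer seed and an existing RandomState. It assumes scikit-learn is installed. The variable names are illustrative, and nothing here is taken from the PR's diff.

```python
import numpy as np
from sklearn.utils import check_random_state

# An integer seed is turned into a legacy RandomState instance.
from_int = check_random_state(0)

# An existing RandomState is returned unchanged.
existing = np.random.RandomState(0)
passthrough = check_random_state(existing)

print(isinstance(from_int, np.random.RandomState))
print(passthrough is existing)
```

This is why parameters that end up in such sklearn functions keep accepting a RandomState, even where the rest of the codebase moves to the newer generator.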