More robust performance test #588
Conversation
Code Review
This pull request aims to make a test more robust by replacing a 3-class classification dataset with an easier 2-class dataset, reducing the likelihood of failures due to statistical fluctuations. The change is well-implemented. My feedback includes a suggestion to strengthen the performance assertions in the test to better leverage the simpler dataset and improve the test's ability to detect regressions.
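As a rough illustration of that suggestion (the estimator, metric, and threshold below are assumptions, with `LogisticRegression` standing in for whatever the test actually fits):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=60, n_classes=2, n_features=3,
                           n_informative=3, n_redundant=0, random_state=0)
clf = LogisticRegression().fit(X, y)

# An easy 2-class dataset supports a stricter accuracy floor, so the
# assertion can catch regressions a looser bound would let through.
assert accuracy_score(y, clf.predict(X)) > 0.9
```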
Pull Request Overview
This PR modifies the test_predict_logits_and_consistency test to use inline dataset generation instead of the shared X_y fixture. The test now creates its own classification dataset with specific parameters (60 samples, 2 classes, 3 features) rather than using the module-level fixture that generates a different dataset (60 samples, 3 classes, 5 features).
Key Changes:
- Removed the `X_y` fixture parameter from the test function signature
- Added an inline `sklearn.datasets.make_classification()` call with custom parameters (2 classes vs. 3, 3 features vs. 5), as sketched below
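A minimal sketch of what that inline generation might look like; the keyword arguments beyond the sample/class/feature counts, the `random_state`, and the test body are assumptions, not the PR's actual code:

```python
from sklearn.datasets import make_classification


def test_predict_logits_and_consistency():
    # Generate a test-local dataset instead of consuming the shared
    # X_y fixture: 60 samples, 2 classes, 3 features.
    X, y = make_classification(
        n_samples=60,
        n_classes=2,
        n_features=3,
        n_informative=3,  # informative + redundant must not exceed n_features
        n_redundant=0,
        random_state=42,
    )
    # ... fit the model and check logits/prediction consistency ...
```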
brendan-priorlabs left a comment
Thanks, @bejaeger!
Issue
Previously, the test was prone to statistical fluctuations because we were fitting a very small dataset.
This PR creates a dataset specific to the test and uses a few more samples, making the test more robust.