Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native HPO support and implementation on conda in Windows #5641

Open
wants to merge 13 commits into
base: branch-23.12
Choose a base branch
from

Conversation

Zekrom-7780
Copy link

Pull Request: [Issue #5380] Enhancements to Hyperparameter Optimization Functions

Description

This PR addresses Issue #5380 by introducing enhancements to the Hyperparameter Optimization (HPO) functions in the cuML module. The goal of these changes is to make it more convenient to work with various Hyperparameter Optimization methods and improve the overall consistency and functionality of the cuML module.

Changes Made

  • Introduced a new function add_sklearn_documentation to make it easier to import and document the HPO methods from scikit-learn (sklearn).
  • Added a collection of commonly used HPO methods from sklearn.model_selection to the cuML module.
  • Ensured that the documentation for these imported methods maintains a consistent structure and properly credits the scikit-learn developers.
  • Ran the pre_commit code to ensure code quality and style standards are met.

If I've inadvertently omitted any HPO methods from sklearn.model_selection, please don't hesitate to notify me so that we can include them in the cuML module.

Reviewers

cc: @wphicks

P.S.

Thank you for your review and feedback on this PR.

@Zekrom-7780 Zekrom-7780 requested a review from a team as a code owner November 3, 2023 11:07
Copy link

copy-pr-bot bot commented Nov 3, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the Cython / Python Cython or Python issue label Nov 3, 2023
@Zekrom-7780
Copy link
Author

@wphicks Regarding This pull request requires additional validation before any workflows can run on NVIDIA's runners.

What more should I do to make this go away?
I have made my final commit verified by signing with the SSH key, and I have also followed the pre_commit mentioned in the project's contribution guidelines

@csadorf
Copy link
Contributor

csadorf commented Nov 8, 2023

@wphicks Regarding This pull request requires additional validation before any workflows can run on NVIDIA's runners.

What more should I do to make this go away? I have made my final commit verified by signing with the SSH key, and I have also followed the pre_commit mentioned in the project's contribution guidelines

@Zekrom-7780 There is nothing you have to do at the moment. We will trigger CI runs once we have addressed some issues with our CI pipeline.

@wphicks wphicks added feature request New feature or request non-breaking Non-breaking change labels Nov 14, 2023
@wphicks
Copy link
Contributor

wphicks commented Nov 16, 2023

Apologies @Zekrom-7780! This came in while I was on vacation. The changes look very nice! At our internal meeting today, I'll check in with the team about what level of testing we'll want for features included in this way and follow up here afterward.

@dantegd
Copy link
Member

dantegd commented Nov 20, 2023

/ok to test

@Zekrom-7780
Copy link
Author

@wphicks @dantegd , could you tell me where I'm going wrong?

@dantegd
Copy link
Member

dantegd commented Nov 21, 2023

@Zekrom-7780 nothing wrong on your side, the PR needed the changes from #5661 so I just merged branch-23.12 and that should filx things

@dantegd
Copy link
Member

dantegd commented Nov 21, 2023

/ok to test

@Zekrom-7780
Copy link
Author

@wphicks @dantegd , I changed the code, as I saw in the tests that Sklearn didn't have a BayesSearchCV, so I added changed the CVs and added them in these changes

@wphicks
Copy link
Contributor

wphicks commented Nov 21, 2023

Thanks very much @Zekrom-7780! Could you add some very basic tests demonstrating that these methods work with a cuML estimator? No need to repeat all of the testing that already exists in sklearn; just show that they are functional from the new import location.

@Zekrom-7780
Copy link
Author

Sure @wphicks , I'll add some tests here, but where should I add them?
Also can I pick up more issues along with this one if that's fine with you?

@wphicks
Copy link
Contributor

wphicks commented Nov 22, 2023

@Zekrom-7780 You can add a new file here: https://github.com/rapidsai/cuml/tree/branch-23.12/python/cuml/tests. And of course we'd be delighted if you want to pick up more issues! Thank you so much for this contribution. It looks great, and I can't wait to see what else you'd like to work on. Feel free to ping me directly if you have questions.

@Zekrom-7780
Copy link
Author

@wphicks @dantegd , added Pytests.
I dont know like how to write them, so that took some time, but please tell me if I need to add more or different tests.

@wphicks
Copy link
Contributor

wphicks commented Nov 27, 2023

Thanks, @Zekrom-7780! In terms of tests, I was thinking something that just performs a very basic test of functionality. I.e. You could basically directly borrow from Scikit-Learn here, but rather than using their MockClassifier, create a nearly-identical MockClassifier that does inherit from cuML's base estimator class. That should give us good confidence that we won't have any issues with e.g. assumptions related to numpy instead of cupy or something like that.

Let me know if you can tackle that or if you need more help in prepping it. We can always sync up on the public RAPIDS Slack channel if you'd like to iron it out together. Really appreciate you taking the time to figure out pytest generally!

@Zekrom-7780
Copy link
Author

Thanks a lot @wphicks , I tried building a class similar to what you suggested to use cupy instead of numpy, but I wasn't able to do it.
I've joined the Slack channel, can I reach out to you regarding this?

@wphicks
Copy link
Contributor

wphicks commented Nov 30, 2023

@Zekrom-7780 Of course! I'm "William Hicks" on the Slack channel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cython / Python Cython or Python issue feature request New feature or request non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants