-
Notifications
You must be signed in to change notification settings - Fork 92
Initial KNN classifier add. #1650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4d8deac to
cdefc9f
Compare
Codecov Report
@@ Coverage Diff @@
## main #1650 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 240 242 +2
Lines 18853 18942 +89
=========================================
+ Hits 18845 18934 +89
Misses 8 8
Continue to review full report at Codecov.
|
bc458cd to
7f3b1c3
Compare
…ng with the workers assigned to the original, larger job was causing this issue.
Accidentally removed KNClassifier in merge with main.
…ther graphviz related tests are somehow broken.
freddyaboulton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chukarsten This looks good to me! I want to resolve the discussion on prediction explanations before merging though.
Also, we typically exclude estimators from AutoMLSearch until we run the performance tests on them. We do this by adding the class name to _not_used_in_automl in gen_utils.py. I think we should follow the same pattern here but what do you think @dsherry ?
| assert msg in caplog.text | ||
|
|
||
|
|
||
| @patch('evalml.pipelines.BinaryClassificationPipeline.score', return_value={"Log Loss Binary": 0.8}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come splitting this test into many tests fixes the crashing worker problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dreaded windows test bug D:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That I'm not sure. We can't say conclusively that it does. It fixed it "this time", but as @bchen1116 knows, sometimes just modifying something seems to affect the presence of the crashing worker. We were discussing maybe having a nightly testing of main to see if main, outside of tests via merging, exhibits this crash going forward. But really I guess this will be a question of whether we accept the additional lines due to splitting the test in the hopes of fixing the bug going forward, or not.
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Outdated
Show resolved
Hide resolved
evalml/tests/model_understanding_tests/prediction_explanations_tests/test_algorithms.py
Outdated
Show resolved
Hide resolved
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Show resolved
Hide resolved
bchen1116
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments. Don't forget to add the KNN classifier to the api_references.rst file and make sure the corresponding docs are updated to include the classifier.
Otherwise, agree with @freddyaboulton that this should be excluded from AutoMLSearch until perf tests are run.
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Outdated
Show resolved
Hide resolved
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Show resolved
Hide resolved
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Show resolved
Hide resolved
| assert msg in caplog.text | ||
|
|
||
|
|
||
| @patch('evalml.pipelines.BinaryClassificationPipeline.score', return_value={"Log Loss Binary": 0.8}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dreaded windows test bug D:
angela97lin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with what's already been said about perf testing and adding to AutoML, but otherwise just left some comments for cleanup! 😁
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Show resolved
Hide resolved
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Outdated
Show resolved
Hide resolved
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Show resolved
Hide resolved
evalml/pipelines/components/estimators/classifiers/kneighbors_classifier.py
Outdated
Show resolved
Hide resolved
…ate is handled in the KNN class.
…sed a few more issues.
…testing prior to performance eval.
bchen1116
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one note, but looks good to me!
One thing I want to bring to attention would be the possibility of standardizing the names. I brought up the idea to change the string name to KNN Classifier, but I see the value of keeping the naming in line with how SKLearn names it, which was KNeighborsClassifier. I think we should make these names the same, like we do for all other components (ie class LogisticRegressionClassifier with name Logistic Regression Classifier). On this note, I think it might make sense to change the name to K Neighbors Classifier (no dash), but let me know what you think!
Pull Request Description
First crack at adding KNN classifier(/regressor) to the baseline.
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rstto include this pull request by adding :pr:123.