replace BaselineClassifier/Regressor by DummyClassifier/Regressor. Issue #618 #644
@shinnar The pre-commit issue still persists. How should I run
it looks like the previous pre-commit issues were indeed resolved by the recent fix. The remaining pre-commit issues are introduced by this PR.
This PR is shaping up nicely.
It looks like DummyRegressor and DummyClassifier both have hyperparameter arguments that should be included in their schemas.
https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyRegressor.html
https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html
For the random state parameter, I would suggest copying from LogisticRegression:
https://github.com/IBM/lale/blob/master/lale/lib/sklearn/logistic_regression.py#L240-L254
note that for the constant parameter, the schema should have a "side constraint". If you have trouble with that part, I am happy to help
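As a rough illustration of the suggestion above, a `random_state` entry could look something like the following sketch. This is a simplified, hypothetical version written for illustration: the descriptions are my own wording, and the linked LogisticRegression schema may list further alternatives, so it should be checked against that file before copying.

```python
# Hypothetical, simplified sketch of a random_state hyperparameter entry.
# The description strings are illustrative, not copied from Lale.
random_state_schema = {
    "description": "Seed of the pseudo-random number generator.",
    "anyOf": [
        # None: fall back to the global random state used by np.random.
        {"description": "RandomState used by np.random.", "enum": [None]},
        # An explicit integer seed for reproducible predictions.
        {"description": "Explicit seed.", "type": "integer"},
    ],
    # The default sits next to "anyOf", not inside it.
    "default": None,
}
```

Note that `"default"` is a sibling of `"anyOf"` rather than one of its alternatives.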
_hyperparams_schema = {
    "allOf": [
        {
            "description": "This first object lists all constructor arguments with their types, but omits constraints for conditional hyperparameters.",
please add a link to the underlying sklearn docs here (an example doing this is https://github.com/IBM/lale/blob/master/lale/lib/sklearn/logistic_regression.py#L397)
lale/lib/sklearn/__init__.py
@@ -25,6 +25,7 @@
* lale.lib.sklearn. `AdaBoostClassifier`_
* lale.lib.sklearn. `BaggingClassifier`_
* lale.lib.sklearn. `DecisionTreeClassifier`_
* lale.lib.sklearn. `DummyRegressor`_
I think this was intended to be DummyClassifier
Thanks for the answer. I ran
pyright needs node to be installed. If that is not feasible, you can commit without running pyright. Note that we cannot accept the PR until it passes pyright, but it is fine to optimistically commit, and then, if the static checking fails in GitHub Actions, fix the reported problem and push a fix. Note that if you have installed the pre-commit hooks, you can skip a given test for a commit by running:
I have nodejs installed in my system but it still fails the pyright check. So I skipped the pyright check, and now I am facing an issue in the static check in the CI/CD pipeline. Any suggestion?
apologies, but we still seem to be having some CI issues. I think we figured out a fix. once it is working, I will ping you to rebase onto it. apologies again.
* Score method for individual operators.
* Score for pipelines.
* update pre-commit checkers to latest versions. Fix some problems they newly flag
* Trying to fix the pre-commit.
* Reverting back isort version to 5.7.0.
* Trying to change github action keys.
Co-authored-by: Avi Shinnar <shinnar@us.ibm.com>
This looks good. One last thing is testing.
please add appropriate lines to
https://github.com/IBM/lale/blob/master/test/test_core_classifiers.py#L131
and
https://github.com/IBM/lale/blob/master/test/test_core_regressors.py#L91
also, please rebase to the latest on master, which will hopefully fix the CI problem.
lale/lib/sklearn/dummy_regressor.py
"quantile": {
    "description": "The quantile to predict using the “quantile” strategy. A quantile of 0.5 corresponds to the median, while 0.0 to the minimum and 1.0 to the maximum.",
    "type": "float",
},
this should have a default and range constraints. I would suggest something like:
"quantile": {
    "description": "The quantile to predict using the “quantile” strategy. A quantile of 0.5 corresponds to the median, while 0.0 to the minimum and 1.0 to the maximum.",
    "anyOf": [
        {"enum": [None]},
        {
            "type": "float",
            "minimum": 0.0,
            "maximum": 1.0,
        },
    ],
    "default": None,
}
@vickyvishal
sorry, my fault. I was not careful when I expanded the schema:
that should be "type": "number", not "type": "float"
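To make the intent of the corrected `quantile` entry concrete, here is a minimal sketch with `"type": "number"` and a tiny hand-written checker. The `matches` helper is hypothetical and covers only the keywords used here (`anyOf`, `enum`, `type: number`, `minimum`, `maximum`); it is not how Lale validates schemas.

```python
import numbers

# The quantile entry with the corrected JSON Schema type name.
quantile_schema = {
    "anyOf": [
        {"enum": [None]},
        {"type": "number", "minimum": 0.0, "maximum": 1.0},
    ],
    "default": None,
}


def matches(value, schema):
    """Hypothetical mini-validator for the few keywords used above."""
    if "anyOf" in schema:
        # Valid if at least one alternative accepts the value.
        return any(matches(value, sub) for sub in schema["anyOf"])
    if "enum" in schema and value not in schema["enum"]:
        return False
    if schema.get("type") == "number":
        # Booleans are not JSON numbers, even though bool subclasses int.
        if isinstance(value, bool) or not isinstance(value, numbers.Real):
            return False
    if "minimum" in schema and value < schema["minimum"]:
        return False
    if "maximum" in schema and value > schema["maximum"]:
        return False
    return True


print(matches(0.5, quantile_schema))   # in range
print(matches(None, quantile_schema))  # the default
print(matches(1.5, quantile_schema))   # above the maximum
```

The `anyOf` split is what allows `None` as the default while still constraining real values to the interval from 0.0 to 1.0.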
lale/lib/sklearn/dummy_classifier.py
"constant": {
    "description": "The explicit constant as predicted by the “constant” strategy. This parameter is useful only for the “constant” strategy.",
    "side constraint": "",
    "type": ["int", "str"],
I think this should be "type": ["integer", "string"]
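JSON Schema accepts only a fixed set of primitive type names, so Python spellings such as `"int"`, `"str"`, or `"float"` are not valid. A small sketch of a checker that walks a schema and reports such names follows; `invalid_type_names` is a hypothetical helper, not part of Lale, and it naively treats every `"type"` key as the schema keyword.

```python
# The seven primitive type names defined by JSON Schema.
JSON_SCHEMA_TYPES = {
    "null", "boolean", "object", "array", "number", "string", "integer",
}


def invalid_type_names(schema):
    """Recursively collect "type" values that are not JSON Schema type names.

    Limitation: a hyperparameter that is itself named "type" would be
    misread as the keyword; this sketch does not handle that case.
    """
    found = []
    if isinstance(schema, dict):
        declared = schema.get("type")
        if isinstance(declared, str):
            names = [declared]
        elif isinstance(declared, list):
            names = declared
        else:
            names = []
        found += [n for n in names if n not in JSON_SCHEMA_TYPES]
        for value in schema.values():
            found += invalid_type_names(value)
    elif isinstance(schema, list):
        for item in schema:
            found += invalid_type_names(item)
    return found


print(invalid_type_names({"type": ["int", "str"]}))
print(invalid_type_names({"type": ["integer", "string"]}))
```

Running a check like this over a new schema before pushing could catch such type-name slips locally rather than in CI.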
it looks like the PR is failing because of this type. it should change to:
"constant": {
    "description": "The explicit constant as predicted by the “constant” strategy. This parameter is useful only for the “constant” strategy.",
    "anyOf": [
        {"type": ["integer", "string"]},
        {"enum": [None]},
    ],
    "default": None,
}
This is looking great. once it passes the tests, we are almost ready to merge. Thanks for your contribution!
Codecov Report
@@ Coverage Diff @@
## master #644 +/- ##
==========================================
+ Coverage 81.23% 81.29% +0.05%
==========================================
Files 308 312 +4
Lines 15816 15909 +93
==========================================
+ Hits 12848 12933 +85
- Misses 2968 2976 +8
lale/lib/sklearn/dummy_classifier.py
},
"constant": {
    "description": "The explicit constant as predicted by the “constant” strategy. This parameter is useful only for the “constant” strategy.",
    "side constraint": "",
this line should be removed
Previously it was decided to have a "side constraint" for "constant" schema. Should I remove it? If there is a property to add for "side constraint" I can add it now.
sorry for the confusion. a "side constraint", in our terminology, is something like this:
https://github.com/IBM/lale/blob/master/lale/lib/sklearn/logistic_regression.py#L320-L333
which adds additional constraints that relate more than one hyperparameter.
However, I don't think it is needed for this PR.
the "side constraint" line above should definitely be removed, as it is probably what is currently breaking the tests. I would guess that is due to the space in the key, although I did not investigate that closely.
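For readers unfamiliar with the term, here is a purely hypothetical sketch of what a side constraint relating `strategy` and `constant` could look like. It is not part of this PR and not taken from Lale; it only illustrates the idea of a constraint spanning more than one hyperparameter, expressed as an extra schema object plus an equivalent plain-Python check.

```python
# Hypothetical side constraint: strategy "constant" needs a non-None constant.
# Written as "either strategy is not 'constant', or constant is not None".
constant_side_constraint = {
    "description": "The constant strategy requires a non-None constant.",
    "anyOf": [
        {
            "type": "object",
            "properties": {"strategy": {"not": {"enum": ["constant"]}}},
        },
        {
            "type": "object",
            "properties": {"constant": {"not": {"enum": [None]}}},
        },
    ],
}


def satisfies_constraint(hyperparams):
    """Plain-Python reading of the schema above.

    Assumes both keys are present (for example after defaults are filled in);
    with missing keys the schema and this function can disagree.
    """
    return (
        hyperparams.get("strategy") != "constant"
        or hyperparams.get("constant") is not None
    )


print(satisfies_constraint({"strategy": "mean", "constant": None}))
print(satisfies_constraint({"strategy": "constant", "constant": None}))
```

In a schema of the shape used in this PR, such an object would sit as an additional element of the top-level `"allOf"` list, after the object that lists the constructor arguments.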
Thanks for the explanation. Now it's clear. I will remove that key and push again
@hirzel, I have sent you the "Developer's Certificate of Origin" over email. Thanks
@vickyvishal I think that something like #644 (comment) will fix the current build problem
I checked out the PR and found the two remaining problems, which I added as comments. With those changes made, the dummy regressor test passes locally, so this should be it.
by the way, you can try running
python -m unittest test.test_core_regressors
to run the test locally
lale/lib/sklearn/dummy_regressor.py
"relevantToOptimizer": [],
"additionalProperties": False,
"required": ["strategy", "quantile"],
"property": {
@vickyvishal this should be "properties", not "property".
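The reason this one-word slip matters: with `"additionalProperties": False`, only names listed under the `"properties"` keyword are allowed, and a misspelled `"property"` key is simply ignored as an unknown keyword. The sketch below illustrates this with a hypothetical helper, `rejected_hyperparams`, which models only these two keywords.

```python
def rejected_hyperparams(schema, hyperparams):
    """Names that "additionalProperties": False would reject.

    Hypothetical helper: considers only "properties" and
    "additionalProperties", ignoring e.g. "patternProperties".
    """
    if schema.get("additionalProperties", True) is not False:
        return set()
    declared = set(schema.get("properties", {}))
    return set(hyperparams) - declared


# Misspelled keyword: nothing is declared, so every argument is rejected.
broken = {
    "additionalProperties": False,
    "property": {"strategy": {"enum": ["mean", "median"]}},
}
# Correct keyword: "strategy" is declared and therefore accepted.
fixed = {
    "additionalProperties": False,
    "properties": {"strategy": {"enum": ["mean", "median"]}},
}

print(rejected_hyperparams(broken, {"strategy": "mean"}))
print(rejected_hyperparams(fixed, {"strategy": "mean"}))
```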
lale/lib/sklearn/dummy_regressor.py
    "enum": ["mean", "median", "quantile", "constant"],
    "default": "mean",
},
"random_state": {
the regressor does not have a random state argument. please delete. instead, please add
"constant": {
    "description": "The explicit constant as predicted by the “constant” strategy. This parameter is useful only for the “constant” strategy.",
    "anyOf": [
        {"type": ["integer", "string"]},
        {"enum": [None]},
    ],
    "default": None,
},
Thank you for your contribution!
Replacing BaselineClassifier with DummyClassifier from sklearn.
Replacing BaselineRegressor with DummyRegressor from sklearn.
Created new operators with make_operator, using the schemas defined in BaselineClassifier and BaselineRegressor.
Contributed by students of SRH University, Heidelberg.
https://github.com/tauseefhashmi
https://github.com/frankcode101
https://github.com/tanmaygaikwad
https://github.com/RajathReddy9
https://github.com/vickyvishal/