
feat: Add estimator type with reg and clf for NGBClassifier and NGBRegressor #325

Open · wants to merge 2 commits into base: master

Conversation

@NeroHin commented Apr 5, 2023

Fixed #324

  • Problem: Currently, using NGBClassifier or NGBRegressor inside sklearn.ensemble.VotingClassifier or VotingRegressor raises a ValueError with the message "The estimator NGBClassifier / NGBRegressor should be a classifier / regressor."

  • Solution: Added self._estimator_type = "classifier" to the NGBClassifier class and self._estimator_type = "regressor" to the NGBRegressor class. This attribute is how sklearn identifies an estimator's type, so the models are now correctly recognized by the voting ensembles.

  • Impact: This change resolves the ValueError when using NGBClassifier or NGBRegressor with sklearn's voting ensembles, improving the usability of the models in an ensemble learning setting.
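As a minimal sketch of the mechanism (the FakeNGB* classes and the local helper functions below are illustrative stand-ins, not the actual NGBoost or sklearn source): sklearn's historical type checks simply read the `_estimator_type` attribute, which is exactly what this PR adds.

```python
# Minimal re-implementations of sklearn's historical type checks
# (sklearn.base.is_classifier / is_regressor inspect _estimator_type).
def is_classifier(estimator):
    return getattr(estimator, "_estimator_type", None) == "classifier"

def is_regressor(estimator):
    return getattr(estimator, "_estimator_type", None) == "regressor"

# Illustrative stand-ins for the patched NGBoost classes.
class FakeNGBClassifier:
    _estimator_type = "classifier"  # the attribute this PR adds

class FakeNGBRegressor:
    _estimator_type = "regressor"  # the attribute this PR adds

print(is_classifier(FakeNGBClassifier()))  # True
print(is_regressor(FakeNGBRegressor()))   # True
```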

@NeroHin (Author) commented Apr 5, 2023

#324

@alejandroschuler (Collaborator) commented

hey @NeroHin, thanks for the suggestion.

Does it make sense to use ngboost in an ensemble for classification or regression alongside e.g. xgboost? The two are fundamentally different in the case of regression: ngboost is for probabilistic prediction; it outputs an entire predictive distribution. There's not much point in using ngboost for point prediction if that's all you need, since it's only ever about as good as xgboost but much slower (see the paper for a performance comparison). In the case of standard classification (without some parametric discrete distribution, e.g. Poisson), ngboost is basically the same as xgboost except, again, slower.

So while there is nothing fundamentally "wrong" with this change, I feel like it is enabling a use-case that doesn't really make much sense or should not be encouraged. Does that make sense or am I missing something?

@NeroHin (Author) commented Apr 6, 2023

@alejandroschuler Thanks for your reply. I've seen some recent research in which ngboost performs better than xgboost (case1 case2).

In my case, I have to use sklearn.ensemble.VotingClassifier() to handle imbalanced data, reducing the false positive rate and improving the true positive rate.

But when I add the NGBoost classifier to sklearn.ensemble.VotingClassifier(), it returns the error from issue #324. Referring to (ref1, ref2), I confirmed the error occurs because NGBoost doesn't set self._estimator_type during initialization, so I created this pull request to fix it and help anyone using ngboost with sklearn.ensemble's ensemble methods.
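To illustrate the failure mode, here is a sketch with a hypothetical UnmarkedModel standing in for the unpatched NGBClassifier (requires scikit-learn and numpy; the class name and data are made up for the example):

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression

# Stand-in for an estimator that never declares itself a classifier,
# like NGBClassifier before this PR.
class UnmarkedModel(BaseEstimator):
    def fit(self, X, y):
        return self

    def predict(self, X):
        return np.zeros(len(X), dtype=int)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

vote = VotingClassifier(
    estimators=[("lr", LogisticRegression()), ("unmarked", UnmarkedModel())]
)
try:
    vote.fit(X, y)
except ValueError as err:
    # e.g. "The estimator UnmarkedModel should be a classifier."
    print(f"ValueError: {err}")
```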

@alejandroschuler (Collaborator) commented

I don't find those papers terribly convincing and in our own paper we found that ngboost is generally only as good as xgboost or a little worse. That makes sense: ngboost isn't trying to directly compete with xgboost.

I still think it's probably a waste of time / energy to try to use ngboost in an ensemble if your goal is point prediction or classification, but I'm going to approve the PR here b/c it doesn't hurt to have the functionality and hopefully people will see this discussion and be dissuaded from using it :)

@NeroHin (Author) commented Apr 9, 2023

Thanks for your comment!

Successfully merging this pull request may close these issues.

estimator compatibility issues with sklearn