Multilabel Classification method give the same results #47

Open
3 tasks done
DariuszMajerek opened this issue Jan 13, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@DariuszMajerek

Contribution guidelines

  • I've read the contribution guidelines.
  • The documentation does not mention anything about my problem.
  • There are no open or closed issues that are related to my problem.

Description

I tried to compare three methods of multilabel classification with Random Forest. I wanted to check which method performs best: MultiOutputClassifier, ClassifierChain or the native multilabel RandomForestClassifier. To my surprise, all the results were identical. What is going wrong here? When I run the same calculations with sklearn, I get different results. Could you help me?

test.pdf

Expected behaviour

No response

Actual behaviour

No response

Steps to reproduce

No response

Python and package version

  • Python: import sys; sys.version
  • ATOM: import atom; atom.__version__

DariuszMajerek added the bug label Jan 13, 2024

@tvdboom
Owner

tvdboom commented Jan 14, 2024

What version of atom are you using? I believe that functionality was deprecated in 5.1.0. I see that the documentation was not updated accordingly, sorry for that. In the latest version, the multioutput meta-estimator is assigned by default, and setting atom.multioutput = ... no longer has any effect. The identical results therefore make sense: you are using the same estimator every time (you can check this by printing atom.rf.estimator). You can either downgrade to the previous version or pass the three estimators directly to the run method (that way you also have all three models in the same atom instance).

atom.run(["RF", MultiOutputClassifier(RandomForestClassifier()), ClassifierChain(RandomForestClassifier())])
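
For reference, the three strategies being compared can be reproduced with scikit-learn alone. The following is a minimal sketch, not taken from the thread: the train/test split, the `n_estimators` value, and the helper names `make_rf` and `positive_proba` are assumptions chosen for illustration. It fits each strategy on the same data and prints the estimator type next to its macro-averaged average precision, which makes it easy to see whether three genuinely different estimators are being evaluated.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

X, y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)


def make_rf():
    # Hypothetical helper: a fresh forest with fixed settings for each strategy.
    return RandomForestClassifier(n_estimators=50, random_state=1)


models = {
    "native": make_rf(),  # RandomForestClassifier handles multilabel targets itself
    "multioutput": MultiOutputClassifier(make_rf()),  # one independent forest per label
    "chain": ClassifierChain(make_rf(), random_state=1),  # each label also sees earlier labels
}


def positive_proba(model, X):
    # Hypothetical helper: return an (n_samples, n_labels) array of
    # positive-class probabilities regardless of the estimator's output format.
    proba = model.predict_proba(X)
    if isinstance(proba, list):
        # Native RF and MultiOutputClassifier return one (n_samples, 2) array per label.
        return np.column_stack([p[:, 1] for p in proba])
    # ClassifierChain already returns a single 2-D array.
    return np.asarray(proba)


scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = average_precision_score(
        y_test, positive_proba(model, X_test), average="macro"
    )
    print(f"{name:12s} {type(model).__name__:24s} {scores[name]:.4f}")
```

If all three rows showed the same estimator type, identical scores would be expected, which is the situation described above for atom.rf.estimator.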

@dax44

dax44 commented Jan 14, 2024

My version is 5.2.0.
Unfortunately, your example doesn't work for me. When I run your command, I get:

Training ========================= >>
Models: RF, MOC, CC
Metric: average_precision


Results for RandomForest:
Fit ---------------------------------------------
Train evaluation --> average_precision: 1.0
Test evaluation --> average_precision: 0.6468
Time elapsed: 0.155s
-------------------------------------------------
Total time: 0.155s


Results for MultiOutputClassifier:
Fit ---------------------------------------------

Exception encountered while running the MOC model.
TypeError: MultiOutputClassifier.__init__() got an unexpected keyword argument 'estimator__bootstrap'


Results for ClassifierChain:
Fit ---------------------------------------------

Exception encountered while running the CC model.
TypeError: _BaseChain.__init__() got an unexpected keyword argument 'base_estimator__bootstrap'


Final results ==================== >>
Total time: 0.160s
-------------------------------------
RandomForest --> average_precision: 0.6468 ~
Consecutive runs of model RF. The former model has been overwritten.
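
The two TypeErrors in the log above are consistent with nested hyperparameters (the `estimator__bootstrap` style names) being passed to the wrapper's constructor. That reading is an assumption, since ATOM's internals are not shown in this thread. The scikit-learn side of it can be sketched as follows: the constructor rejects double-underscore names, while `set_params` routes them to the wrapped estimator.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Passing a nested parameter name to __init__ fails, because the
# constructor only accepts the wrapper's own arguments.
try:
    MultiOutputClassifier(RandomForestClassifier(), estimator__bootstrap=False)
    init_error = None
except TypeError as exc:
    init_error = exc
print(type(init_error).__name__, init_error)

# set_params understands the <component>__<parameter> convention and
# forwards the value to the wrapped RandomForestClassifier.
moc = MultiOutputClassifier(RandomForestClassifier())
moc.set_params(estimator__bootstrap=False)
print(moc.estimator.bootstrap)  # False
```

Only MultiOutputClassifier is shown here; the same double-underscore convention applies to ClassifierChain, with the prefix matching the name of its wrapped-estimator parameter.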

@tvdboom
Owner

tvdboom commented Jan 15, 2024

I made a mistake. You have to specify in the custom model that the class doesn't need a multilabel wrapper.

from sklearn.datasets import make_multilabel_classification
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier
from atom import ATOMClassifier, ATOMModel

# Synthetic multilabel dataset with three labels
X, y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=1)

atom = ATOMClassifier(X, y=y, verbose=2, random_state=1)

# native_multilabel=True marks the custom model as not needing a multilabel wrapper
chain = ATOMModel(ClassifierChain(RandomForestClassifier()), native_multilabel=True)
multi = ATOMModel(MultiOutputClassifier(RandomForestClassifier()), native_multilabel=True)

# Train the predefined RF model and both custom models in the same atom instance
atom.run(["rf", chain, multi])

@dax44

dax44 commented Jan 15, 2024

Thanks for the quick reply. Unfortunately, this still doesn't work. ATOMModel has no native_multilabel parameter.
I get the following error:

TypeError: ATOMModel() got an unexpected keyword argument 'native_multilabel'

@tvdboom
Owner

tvdboom commented Jan 15, 2024

You are right. That functionality is part of the development branch and has not been released yet. The dev branch also contains a fix for the error you showed before (TypeError: _BaseChain.__init__() got an unexpected keyword argument 'base_estimator__bootstrap').

You can install atom from that branch using pip install git+https://github.com/tvdboom/ATOM.git@development. Then it should work.

@dax44

dax44 commented Jan 15, 2024

Yes, it works :)
Thank you for your help.
