Multi-label classification #267
Comments
@jayahm Hello, I haven't tested it yet, but it should work well with the multi-output module from scikit-learn, which transforms a general estimator into a multi-label classification (or regression) algorithm: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.multioutput In this case it would be used together with ClassifierChain or MultiOutputClassifier.
Hi, I have tested it with ClassifierChain and got the following errors: y_self has shape (200, 6918), where 6918 is the number of labels (0/1 binarized).
Hello, can you provide me with a small example of the code you used to get this error? Then I can see what can be done.
Hi, thanks for your response. I have put a simple code example here: https://www.dropbox.com/s/soaysxi2rhhj388/for_deslib.zip?dl=0
Hi, were you able to run my code? I really hope there is a way to perform multi-label classification using this library.
@jayahm Hello, according to the example you provided, you want each base model to be a multi-label classifier and to select the best among them for each new sample, correct? If that is the case, there is no support for that. To the best of my knowledge, there is no dynamic ensemble technique that performs classifier selection over multi-label models. We would first need to develop a new technique and then add it, since it would require adapting several steps of the pipeline to this context (region of competence definition, competence estimation, selection scheme, and combination). I spent some time looking for such a technique in the literature but did not find any, so there is huge potential for interesting research there...

If what you want is just a usual, classical DS technique (which works as a single-label classifier) transformed to perform multi-label classification through a classical multi-label decomposition (binary relevance or classifier chain), you can use it like this: `from sklearn.datasets import make_multilabel_classification` `knorae = KNORAE(random_state=42)`
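The snippet in the comment above is cut off in this copy of the thread, so here is a minimal sketch of what the binary-relevance route might look like. It is not the maintainer's original code. It assumes `KNORAE` is importable from `deslib.des.knora_e` and wraps it in scikit-learn's `MultiOutputClassifier`. The dataset sizes, the variable names, and the `BaggingClassifier` fallback (used only when DESlib is not installed) are illustrative assumptions.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

try:
    # DESlib's KNORA-E; with no pool given, it builds its own bagging pool.
    from deslib.des.knora_e import KNORAE
    base_estimator = KNORAE(random_state=42)
except ImportError:
    # Stand-in (assumption) so the sketch still runs without DESlib installed.
    base_estimator = BaggingClassifier(n_estimators=10, random_state=42)

# Synthetic multi-label data: y is a 0/1 indicator matrix, one column per label.
X, y = make_multilabel_classification(
    n_samples=600, n_features=20, n_classes=3, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Binary relevance: one independent clone of the base estimator per label column.
model = MultiOutputClassifier(base_estimator)
model.fit(X_train, y_train)

# Predictions come back as an indicator matrix with the same layout as y.
y_pred = model.predict(X_test)
print(y_pred.shape)
```

Each label gets its own single-label DS model here, so selection happens per label rather than over multi-label base models. Swapping `MultiOutputClassifier` for `sklearn.multioutput.ClassifierChain` should give the classifier-chain variant, though that combination is untested in this sketch.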
> According to the example you provided, you want each base model to be a multilabel classifier and select the best between them according to each new sample correct?

Yes, very true.

> I spent some time looking if there exists any technique in the literature but did not find any, so there is a huge potential for interesting research there...

Yes, I couldn't find any either. I could tell from the beginning that this task is not straightforward, since many things need to be defined in the multi-label context (region of competence definition, competence estimation, selection scheme, combination, etc.). The main difficulty might be, for example, that one sample can have 3 labels while another has 5. So I am not sure how that can be adapted to this library.

> If what you want is just to have a usual, classical DS technique (which works as a single-label classifier) that is transformed to perform multilabel classification with classical techniques that makes multi-label decomposition (binary relevance or classifier chain) you can just use like that:

Do you mean to first train multiple single-label classifiers as base classifiers (
Hi, I tried your suggestion, but with a heterogeneous pool of classifiers, using the code I wrote above. It seems that training each classifier still requires a single-label dataset. I think the code you suggested previously will generate bagging classifiers, right? If not, what are the base classifiers of the KNORAE you suggested?
Yeah, it would generate a bagging classifier. Unfortunately, the current implementation does not allow a heterogeneous pool there, due to some limitations in how scikit-learn clones classifiers (issue #89). I have a workaround in mind, but it will take some time to make everything compatible with both libraries. However, I just saw that there is a quite recent paper (published on June 20th) that proposes a DES method for multi-label classification: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4145875
Hi @Menelau, that sounds good. I'll check the paper you mentioned. Thanks for sharing. Hopefully, deslib will be capable of handling multi-label classification soon.
Hi
Can this library and its methods work with multi-label classification algorithms?