Can "OneVsRestClassifier" be useful for turning FMClassification into a multi-class classifier? #49

Open
farimahfanaei opened this issue Apr 19, 2016 · 8 comments

Comments

@farimahfanaei

Since fastFM-core only accepts {1, -1} as class labels and there is no predict_proba or decision_function, I was wondering whether it can fully satisfy the expectations of OneVsRestClassifier.
If so, is there any tip or solution for applying sgd FMClassification to a multi-class problem?
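
As a rough illustration of what a one-vs-rest wrapper has to do when the base model only accepts {1, -1} labels, the sketch below builds one binary target vector per class. The helper name `one_vs_rest_targets` is hypothetical and is not part of fastFM or scikit-learn.

```python
import numpy as np


def one_vs_rest_targets(y):
    """Return (classes, targets); targets[i] is a {-1, +1} vector for classes[i]."""
    y = np.asarray(y)
    classes = np.unique(y)
    # Row i marks samples of classes[i] with +1 and every other sample with -1.
    targets = np.where(y[None, :] == classes[:, None], 1, -1)
    return classes, targets


classes, targets = one_vs_rest_targets([0, 2, 1, 2])
print(classes)
print(targets)
```

Each row of `targets` could then be used to train one binary model, and the predicted class would be the one whose model gives the highest score.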

@ibayer
Owner

ibayer commented Apr 20, 2016

Actually, there is a predict_proba function for the als and sgd solvers, but it's not showing up in the docs (there is an open issue now). In general, I would recommend using the mcmc solver with the fit_predict_proba function.

#47

@farimahfanaei
Author

Thank you so much. I was thinking of implementing customized predict_proba and fit functions for sgd, which could solve the problem, but you said that there is an open issue.
Although I would prefer to use the sgd method, I also tried the mcmc solver as follows:
y_proba = OneVsRestClassifier(fm).fit_predict_proba(X_train, y_train, X_test)
but I get the following error:

AttributeError: 'OneVsRestClassifier' object has no attribute 'fit_predict_proba'
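
The error above shows that OneVsRestClassifier does not forward fit_predict_proba, so one possible workaround is to write the one-vs-rest loop by hand. The following is a minimal sketch under assumptions: `ovr_fit_predict_proba` is a hypothetical helper, it assumes the model's `fit_predict_proba(X_train, y, X_test)` takes {-1, +1} labels and returns one positive-class probability per test row, and `CentroidStandIn` is only a stand-in so the example runs without fastFM.

```python
import numpy as np


def ovr_fit_predict_proba(make_model, X_train, y_train, X_test):
    """Hypothetical one-vs-rest loop around a transductive binary model."""
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    scores = np.empty((X_test.shape[0], len(classes)))
    for j, c in enumerate(classes):
        # Current class becomes +1, all other classes become -1.
        y_bin = np.where(y_train == c, 1, -1)
        scores[:, j] = make_model().fit_predict_proba(X_train, y_bin, X_test)
    # Predict the class whose binary model is most confident.
    return classes[np.argmax(scores, axis=1)], scores


class CentroidStandIn:
    """Stand-in binary model (not fastFM) exposing fit_predict_proba."""

    def fit_predict_proba(self, X_train, y, X_test):
        pos = X_train[y == 1].mean(axis=0)
        neg = X_train[y == -1].mean(axis=0)
        # Closer to the positive centroid gives a larger margin.
        margin = (np.linalg.norm(X_test - neg, axis=1)
                  - np.linalg.norm(X_test - pos, axis=1))
        return 1.0 / (1.0 + np.exp(-margin))


X_train = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0],
                    [5.1, 5.0], [0.0, 9.0], [0.1, 9.0]])
y_train = np.array([0, 0, 1, 1, 2, 2])
X_test = np.array([[0.2, 0.1], [4.9, 5.2], [0.0, 8.5]])
labels, scores = ovr_fit_predict_proba(CentroidStandIn, X_train, y_train, X_test)
print(labels)
```

With fastFM, `make_model` would presumably be something like a function returning a fresh mcmc FMClassification instance, and the inputs would need to be in whatever sparse format the solver expects. The per-class scores come from independent binary models, so they are not guaranteed to sum to one.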

@ibayer
Owner

ibayer commented Apr 21, 2016

Are you talking about the OneVsRestClassifier class from sklearn? I thought you wanted to implement it yourself. Please provide a Short, Self Contained, Correct Example to help us understand the issue.

@farimahfanaei
Author

Yes, I am facing a multi-class classification problem, and I was thinking of using OneVsRestClassifier from sklearn with sgd FMClassification to solve it using fastFM.

from fastFM import sgd
from sklearn.multiclass import OneVsRestClassifier
fm = sgd.FMClassification(n_iter=1000, init_stdev=0.1, rank=2, random_state=123, l2_reg_w=0, l2_reg_V=0, l2_reg=0, step_size=0.1)
OneVsRestClassifier(fm).fit(X_train, y_train)

With the above code I get the following error, raised from sklearn.base:

RuntimeError: Cannot clone object FMClassification(init_stdev=0.1, l2_reg=None, l2_reg_V=0, l2_reg_w=0,
n_iter=1000, random_state=123, rank=2, step_size=0.1), as the constructor does not seem to set parameter l2_reg_V

I thought it might be because the sgd (or als) solver does not have a proper decision_function for OneVsRestClassifier.
That's why I asked this question, and I was wondering whether implementing customized fit and predict_proba myself could be a plausible solution for this problem.
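
For context on the clone error: scikit-learn's clone rebuilds an estimator from get_params(), so its estimator convention is that __init__ stores every argument unmodified under the same name. The sketch below is a hypothetical diagnostic for that convention. `modified_params` is an invented helper, and `OverridingStandIn` is an illustrative class, not fastFM's actual constructor.

```python
def modified_params(estimator_cls, **kwargs):
    """Hypothetical check: names whose stored value differs from the argument."""
    params = estimator_cls(**kwargs).get_params()
    return sorted(name for name, value in kwargs.items()
                  if params.get(name) != value)


class OverridingStandIn:
    """Illustrative stand-in (NOT fastFM): l2_reg silently overrides l2_reg_V."""

    def __init__(self, l2_reg=None, l2_reg_V=0.1):
        self.l2_reg = l2_reg
        # Rewriting one argument from another violates the convention.
        self.l2_reg_V = l2_reg_V if l2_reg is None else l2_reg

    def get_params(self, deep=True):
        return {"l2_reg": self.l2_reg, "l2_reg_V": self.l2_reg_V}


print(modified_params(OverridingStandIn, l2_reg=0.5, l2_reg_V=0.1))  # prints ['l2_reg_V']
```

Whether this is what happens inside FMClassification is not established here; the check only shows which constructor arguments do not survive a get_params() round trip.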

@ibayer
Owner

ibayer commented Apr 21, 2016

The "Cannot clone object FMClassification" error has been reported before in #44.

> That's why I asked this question and I was wondering if implementing customized fit and predict_proba by myself can be a plausible solution for this problem?

I think it's better to fix this clone issue first; maybe that fixes your problem too. I'll look into it, but it might take a while. Your example doesn't run. You could adapt the code from http://scikit-learn.org/stable/auto_examples/plot_multilabel.html to create a self-contained example.

@farimahfanaei
Author

farimahfanaei commented Aug 12, 2016

@ibayer I tried to solve the clone problem we faced. The problem was in the __init__() function in sgd.py, and I just fed the parameter values in manually, although I am not sure that is a good way to do it.
But my question remains: can FMClassification handle a problem with a huge number of classes, say 1,000 to 100,000, using OneVsRestClassifier? I still get the following error:

sklearn.utils.validation.NotFittedError: This OneVsRestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

I also changed _validate_class_labels(y) in base.py so that the assert on len(set(y)) allows my maximum number of classes. So I am not sure whether this method can really work for multi-class classification using OneVsRestClassifier.
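
On the question of scale, one thing that can be estimated up front is model size, since one-vs-rest trains one binary model per class and each second-order FM holds a bias, a linear weight per feature, and a rank-sized factor vector per feature. The sketch below is a back-of-the-envelope calculation only. The function name is hypothetical and n_features=10_000 is an assumed value, not a figure from this thread.

```python
def ovr_fm_parameter_count(n_classes, n_features, rank):
    """Total parameters for one-vs-rest FMs: one binary model per class."""
    # Per model: 1 bias + n_features linear weights + n_features * rank factors.
    per_model = 1 + n_features + n_features * rank
    return n_classes * per_model


# n_features is assumed for illustration; rank=2 matches the sgd snippet above.
total = ovr_fm_parameter_count(n_classes=100_000, n_features=10_000, rank=2)
print(total, "parameters, about", total * 8 / 1e9, "GB as float64")
```

This says nothing about the NotFittedError itself; it only suggests that memory and training time grow linearly with the number of classes.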

@macks22
Contributor

macks22 commented Mar 20, 2017

I was able to get the OneVsRestClassifier working with the following (not terribly elegant) patching:

import numpy as np
from fastFM import als
from sklearn.multiclass import OneVsRestClassifier

class FMClassifier(als.FMClassification):
    def fit(self, X, y, *args):
        # scikit-learn passes {0, 1} targets; fastFM expects {-1, 1}.
        y = y.copy()
        y[y == 0] = -1
        return super(FMClassifier, self).fit(X, y, *args)

    def predict_proba(self, X):
        # Repeat the 1-D probabilities into an (n_instances, 2) array.
        probs = super(FMClassifier, self).predict_proba(X)
        return np.tile(probs, 2).reshape(2, probs.shape[0]).T

clf = OneVsRestClassifier(FMClassifier(n_iter=500, random_state=42), n_jobs=-1)

Basically, scikit produces [0, 1] labels which must be converted to [-1, 1] labels. Then it extracts the probabilities from what it assumes is a (n_instances, 2) array. Broadcasting the values would be better than my tiling solution, but I didn't know the syntax off the top of my head. There may be other changes needed to make the API fully compatible; I haven't tested this with a pipeline.
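
As a possible alternative to the tiling, the two columns could be built with np.column_stack so that the first column holds the complementary probability. This is a sketch under the assumption that the base predict_proba returns a 1-D array of positive-class probabilities, as the patch above implies. `two_column_proba` is a hypothetical helper name.

```python
import numpy as np


def two_column_proba(p_pos):
    """Turn 1-D P(y=+1) values into an (n_samples, 2) array [P(neg), P(pos)]."""
    p_pos = np.asarray(p_pos, dtype=float)
    return np.column_stack([1.0 - p_pos, p_pos])


probs = two_column_proba([0.9, 0.2, 0.5])
print(probs.shape)
```

Inside the subclass, predict_proba would return `two_column_proba(...)` of the parent's output. The second column is unchanged from the tiling version, so the probabilities read from it should be the same.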

@Darinyazanr

Darinyazanr commented Mar 29, 2018

I want to use FM to solve a multi-class classification problem too. Can you give some advice? @farimahfanaei @ibayer @macks22
