MLARAM bug " return numpy.array(numpy.matrix(allranks))" raises ValueError: matrix must be 2-dimensional #68

Closed
simon-m opened this issue Sep 25, 2017 · 7 comments

@simon-m

simon-m commented Sep 25, 2017

Hello,

when running the following code, I get the error mentioned in the title:

from skmultilearn.neurofuzzy import MLARAM
mam = MLARAM(vigilance=0.9, threshold=0.02, neurons=[])
from sklearn.datasets import make_multilabel_classification
x, y = make_multilabel_classification(sparse = True, n_labels = 5,
  return_indicator = 'sparse', allow_unlabeled = False)
mam.fit(x.todense(), y.todense())
mam.predict(x.todense())

Full traceback:


Traceback (most recent call last):
  File "C:\Program Files\WinPython-32bit-3.6.1.0Qt5\python-3.6.1\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-1c3b03743990>", line 1, in <module>
    mam.predict(x.todense())
  File "C:\Program Files\WinPython-32bit-3.6.1.0Qt5\python-3.6.1\lib\site-packages\skmultilearn\neurofuzzy\MLARAMfast.py", line 141, in predict
    ranks = self.predict_proba(X)
  File "C:\Program Files\WinPython-32bit-3.6.1.0Qt5\python-3.6.1\lib\site-packages\skmultilearn\neurofuzzy\MLARAMfast.py", line 233, in predict_proba
    return numpy.array(numpy.matrix(allranks))
  File "C:\Program Files\WinPython-32bit-3.6.1.0Qt5\python-3.6.1\lib\site-packages\numpy\matrixlib\defmatrix.py", line 274, in __new__
    raise ValueError("matrix must be 2-dimensional")
ValueError: matrix must be 2-dimensional

Also, note that MLARAM does not support sparse matrices, contrary to what is stated in the code comments. As an aside, in the same method, the test

if len(X) == 0:
    return

fails on sparse matrices and should probably be replaced by something like

if scipy.sparse.issparse(X):
    if X.getnnz() == 0:
        return
elif len(X) == 0:
    return

regards,

Simon
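
A minimal sketch of a sparse-aware variant of the emptiness check suggested above, assuming the intent of `len(X) == 0` is "no samples were passed". Note that `getnnz() == 0` tests for "no stored non-zero values", which is also true for a non-empty matrix whose entries are all zero, so this sketch checks the row count instead. The helper name `has_no_samples` is hypothetical and not part of scikit-multilearn.

```python
import numpy as np
import scipy.sparse as sp


def has_no_samples(X):
    """Return True when X contains zero rows (hypothetical helper)."""
    if sp.issparse(X):
        # len() is not usable on SciPy sparse matrices, so use the shape.
        return X.shape[0] == 0
    return len(X) == 0


dense = np.zeros((3, 4))
sparse_zeros = sp.csr_matrix((3, 4))  # three samples, every feature zero
sparse_empty = sp.csr_matrix((0, 4))  # no samples at all
```

With these inputs, `sparse_zeros` has three rows but `getnnz()` returns 0, which is the case where the two checks disagree.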

@simon-m
Author

simon-m commented Sep 26, 2017

Hello,
I added support for sparse X and y for MLARAM. However, being a complete novice when it comes to working on a shared codebase, I'd rather ask:
Is it appropriate to open a pull request now?
Simon

@ChristianSch
Member

Yes, please do. If something is off, we can review your work; it can always be improved in the remote branch. We will approve your request once everything looks good and we have tested that it works as intended.
Thanks for your work! Looking forward to your pull request, cheers.

niedakh added this to Next release in Scikit-multilearn Oct 2, 2017
@niedakh
Contributor

niedakh commented May 13, 2018

I've merged your PRs, but it seems that the todense() approach still fails, mostly because neuron vectors become matrices instead of arrays when todense() is used.

@simon-m which representation is best for a neuron's vector: array, matrix, or sparse matrix?
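
The matrix-versus-array distinction can be illustrated with NumPy alone. This is a sketch of the likely mechanism behind the ValueError in the title, assuming `allranks` is a list holding one rank vector per sample; it has not been checked against the MLARAM source, and the variable names are illustrative.

```python
import numpy as np

dense_matrix = np.matrix([[0.1, 0.9], [0.8, 0.2]])
dense_array = np.asarray(dense_matrix)

# Indexing a row keeps np.matrix two-dimensional but makes an ndarray 1-D.
matrix_row = dense_matrix[0]  # shape (1, 2)
array_row = dense_array[0]    # shape (2,)

# Stacking 1-D rows gives a valid 2-D result.
ranks_from_arrays = np.array(np.matrix([array_row, dense_array[1]]))

# Stacking (1, n) matrix rows gives a 3-D array, which np.matrix rejects.
try:
    np.matrix([matrix_row, dense_matrix[1]])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Because every slice of an `np.matrix` stays 2-D, any per-sample vector derived from `todense()` output carries an extra axis unless it is converted to an array first.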

@simon-m
Author

simon-m commented May 14, 2018

Well, so far the code seems to rely on operations that cannot be applied to sparse matrices. This PR only adds todense() calls where needed to avoid crashes. Thus, until the code is rewritten so that sparse matrices can be processed directly, sparse matrices are probably not suitable.

From what I see, there are a great many calls of the form numpy.squeeze(numpy.asarray(a.todense())), which argues for using arrays instead of matrices.

Cheers
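
For reference, here is a small sketch of what the `numpy.squeeze(numpy.asarray(a.todense()))` pattern mentioned above does to a single sparse row, next to a shorter form that should give the same result. The sample values are made up for illustration.

```python
import numpy as np
import scipy.sparse as sp

# A single sparse row standing in for one sample or neuron vector.
a = sp.csr_matrix([[0.0, 0.5, 0.0, 1.0]])

# Pattern quoted in the thread: todense() returns an np.matrix of shape (1, 4),
# asarray() turns it into a plain ndarray, squeeze() drops the length-1 axis.
vec_from_todense = np.squeeze(np.asarray(a.todense()))

# Shorter form: toarray() returns an ndarray directly, ravel() flattens it.
vec_from_toarray = a.toarray().ravel()
```

Both results are 1-D arrays, which is why storing vectors as arrays avoids the repeated conversion.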

@niedakh
Contributor

niedakh commented May 31, 2018

@simon-m, OK, so I went that way: arrays are now used internally when a dense representation is passed. This is fixed in master.
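
One way to normalize inputs to plain arrays is sketched below. This is not the code that was merged into master; `to_dense_array` is a hypothetical helper that only illustrates the idea of converting sparse matrices and `np.matrix` objects to a 2-D `ndarray` before further processing.

```python
import numpy as np
import scipy.sparse as sp


def to_dense_array(X):
    """Return X as a plain 2-D ndarray (hypothetical helper, not library code)."""
    if sp.issparse(X):
        X = X.toarray()
    # np.asarray strips the np.matrix subclass; atleast_2d keeps single rows 2-D.
    return np.atleast_2d(np.asarray(X))


data = [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
from_matrix = to_dense_array(np.matrix(data))
from_sparse = to_dense_array(sp.csr_matrix(data))
from_list = to_dense_array(data)
```

After this conversion, indexing a row yields a 1-D array regardless of how the caller supplied the data.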

niedakh closed this as completed May 31, 2018
Scikit-multilearn automation moved this from Next release to Done May 31, 2018
@ghiander

ghiander commented Jun 2, 2018

Hello,

I am trying to use the classifier with a dense matrix (n_samples, n_features) that was converted from a pandas DataFrame. I tried several conversions such as "DataFrame.values" and "numpy.matrix(DataFrame)" with no success. The classifier raises the error "ValueError: matrix must be 2-dimensional".

@ghiander

ghiander commented Jun 2, 2018

I forgot to say that I can actually train the classifier using a DataFrame converted with "numpy.matrix(DataFrame)", but it fails when predicting on a test matrix.

niedakh added this to the 0.1.0 milestone Jun 6, 2018