Fast NMF algorithm for dense and sparse data #896
Comments
I would rather have an algorithm that scales with n_samples rather than n_features. Maybe averaged SGD optimization of the NMF cost function, plus projection onto the nonnegative orthant?
Their algorithm didn't give me the impression that it scales poorly with respect to n_samples (the datasets they use in their experiments are pretty large). Variable selection also seems to accelerate convergence a lot. BTW, the tricks needed to implement averaging efficiently in the sparse case may not be applicable if there's a projection step (to be verified).
OK, interesting. The code seems simple enough to implement too.
I added a preliminary implementation of this method here: One difference is that my implementation uses cyclic coordinate selection instead of greedy selection.
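For readers following the thread, here is a minimal sketch of what coordinate descent with cyclic selection can look like for NMF. It is not the implementation linked above: it assumes a dense `X`, the plain squared Frobenius loss `0.5 * ||X - W H||^2` with no regularization, and random initialization. The names `nmf_cyclic_cd` and `_cd_sweep` are made up for illustration. Because the loss is separable across the rows of `W`, updating a whole column at once is the same as visiting its coordinates one by one.

```python
import numpy as np


def _cd_sweep(W, HHt, XHt):
    """One cyclic pass over the columns of W, updated in place.

    Each coordinate gets its exact one-dimensional minimizer,
    clipped at zero to keep the factor nonnegative.
    """
    for t in range(W.shape[1]):
        if HHt[t, t] == 0.0:
            continue  # component t is identically zero in H; nothing to do
        grad = W @ HHt[:, t] - XHt[:, t]
        W[:, t] = np.maximum(0.0, W[:, t] - grad / HHt[t, t])


def nmf_cyclic_cd(X, n_components, n_iter=50, seed=0):
    """Illustrative NMF solver: alternate cyclic sweeps over W and H."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.random((n_samples, n_components))
    H = rng.random((n_components, n_features))
    for _ in range(n_iter):
        # Update W with H fixed.
        _cd_sweep(W, H @ H.T, X @ H.T)
        # Update H with W fixed: same routine on the transposed problem.
        # H.T is a view, so the in-place writes land in H.
        _cd_sweep(H.T, W.T @ W, X.T @ W)
    return W, H
```

A greedy variant would instead pick, at each step, the coordinate whose update decreases the objective the most. The cyclic version skips that bookkeeping at the cost of sometimes touching coordinates that barely move.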
I obtained a 20x speedup by numba-ing the most computationally expensive part. Computing the NMF on the full News20 dataset now takes 10 seconds.
Should we close this in favor of #4811?
Let's close it when #4852 is merged :)
#4852 is merged now :)
Here's an algorithm which I think would be a good candidate for inclusion in scikit-learn:
http://www.cs.utexas.edu/~cjhsieh/nmf/