sparse matrices fail #17

Open
swadey opened this issue Mar 2, 2014 · 5 comments

Comments


swadey commented Mar 2, 2014

I get this error when calling kmeans on a sparse matrix:

julia> kmeans(x', 50)
ERROR: no method kmeans(SparseMatrixCSC{Float32,Int32}, Int64)

Could this be due to the StoredArray change in Julia?

Author

swadey commented Mar 2, 2014

BTW, I'm on Julia HEAD: JuliaLang/julia@244cffc

Contributor

lindahua commented Mar 2, 2014

The algorithm itself is only for dense matrices.

We may add a k-means algorithm for sparse matrices sometime in the future. However, this is not very high on our priority list. A pull request may make this happen faster.

Author

swadey commented Mar 2, 2014

@lindahua is there an actual dependency on dense vectors or just that it produces dense centroids? I don't know what the implementation is doing, but if it's doing some kind of kd-tree/ball-tree for a nearest neighbor approximation, that would make sense.

Contributor

lindahua commented Mar 2, 2014

The algorithm scans each element in a dense pattern when computing the means and the distances. The pairwise distance function only accepts dense matrices, as it relies on BLAS's gemm to compute the distances very quickly.
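
For readers following along, here is a minimal sketch of how pairwise squared Euclidean distances can be reduced to one matrix product, using the expansion ‖x − c‖² = ‖x‖² + ‖c‖² − 2·cᵀx. It is not the package's actual code: the function name `pairwise_sqeuclidean` and the toy data are assumptions made for illustration, and the syntax targets current Julia 1.x rather than the 2014 HEAD mentioned above.

```julia
# Squared Euclidean distances between the columns of X (d×n samples)
# and the columns of C (d×k centers). The function name is hypothetical.
function pairwise_sqeuclidean(X::Matrix{T}, C::Matrix{T}) where {T<:AbstractFloat}
    xx = sum(abs2, X; dims=1)      # 1×n row of squared sample norms
    cc = sum(abs2, C; dims=1)      # 1×k row of squared center norms
    D = C' * X                     # k×n inner products; dense Float32/Float64 inputs go through BLAS gemm
    D .= cc' .+ xx .- 2 .* D       # broadcast the norm terms over the k×n result
    return max.(D, zero(T))        # clamp tiny negatives caused by rounding
end

X = [0.0 3.0 0.0;
     0.0 4.0 1.0]                  # three 2-D samples stored as columns
C = [0.0 3.0;
     0.0 4.0]                      # two centers stored as columns
D = pairwise_sqeuclidean(X, C)     # D[j, i] = squared distance from center j to sample i
```

The signature deliberately accepts only `Matrix{T}`, which mirrors the kind of method restriction behind the `no method` error reported above: a `SparseMatrixCSC` argument would not match it.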

Contributor

lindahua commented Mar 2, 2014

It does not use a kd-tree in any way; it just relies on BLAS to compute pairwise Euclidean distances.
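
As an illustration of what "no kd-tree" means in practice, the following is a rough sketch of one brute-force k-means iteration on dense data: every sample is compared against every center, then each mean is recomputed by visiting all elements. The function name `kmeans_step`, the empty-cluster handling, and the toy data are assumptions for this sketch, not the package's implementation, and the syntax is for current Julia 1.x.

```julia
# One brute-force k-means iteration on dense data: assign, then recompute means.
# Function and variable names are hypothetical.
function kmeans_step(X::Matrix{Float64}, C::Matrix{Float64})
    d, n = size(X)
    k = size(C, 2)
    # k×n squared distances via ‖x‖² + ‖c‖² − 2·cᵀx; C' * X is the gemm call
    D = sum(abs2, C; dims=1)' .+ sum(abs2, X; dims=1) .- 2 .* (C' * X)
    # Nearest center for every sample: a plain argmin down each column
    assignments = [argmin(view(D, :, i)) for i in 1:n]
    # Dense mean update: every element of every sample is visited
    newC = zeros(d, k)
    counts = zeros(Int, k)
    for i in 1:n
        j = assignments[i]
        newC[:, j] .+= view(X, :, i)
        counts[j] += 1
    end
    for j in 1:k
        if counts[j] > 0
            newC[:, j] ./= counts[j]
        else
            newC[:, j] .= view(C, :, j)   # assumption: an empty cluster keeps its old center
        end
    end
    return assignments, newC
end

X = [0.0 0.0 10.0 10.0;
     0.0 2.0 10.0 12.0]            # four 2-D samples stored as columns
C = [0.0 10.0;
     0.0 10.0]                     # two initial centers stored as columns
assignments, newC = kmeans_step(X, C)
```

Both loops touch all d×n entries of `X` regardless of how many are zero, which is why this formulation gains nothing from sparse storage.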
