
Sparse data structures #5

Open · mikolalysenko opened this issue Mar 5, 2015 · 2 comments
Comments

@mikolalysenko (Member)

Right now there is basic support and some proof-of-concept tooling for working with sparse matrices in SciJS, but it is not well optimized yet.

In practice, there are really three widely used sparse formats:

  • List of lists
  • Hash tables
  • Compressed sparse row/column

The first two are generally used only for constructing sparse matrices, so they don't have to be very fast. The last one, though, requires more serious consideration.
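
A minimal sketch of the hash-table construction pattern (a hypothetical helper, not an actual SciJS API): entries accumulate in a `Map` keyed by flattened index, and can then be dumped as sorted coordinate triples, the usual starting point for building a compressed format.

```js
// Hypothetical sketch of a hash-table (dictionary-of-keys) builder; not
// part of SciJS. Good for incremental construction, not for fast math.
function createDOK(rows, cols) {
  const entries = new Map();
  return {
    set(i, j, v) { entries.set(i * cols + j, v); },
    get(i, j) { return entries.get(i * cols + j) || 0; },
    // Dump as row-major-sorted (row, col, value) triples.
    toCOO() {
      return [...entries.entries()]
        .map(([k, v]) => [Math.floor(k / cols), k % cols, v])
        .sort((a, b) => (a[0] - b[0]) || (a[1] - b[1]));
    }
  };
}
```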

One interesting observation is that compressed sparse row and compressed sparse column matrices are effectively transposes of one another. In fact, I believe we can realize the same benefits as the CSR/CSC formats in ndarray by just using run-length encoding, which is essentially equivalent. It is also nice in that all the usual ndarray transpose/slice/step operations would work without any changes. Using run-length encoding as the default sparse format would give a general and efficient solution, which could be useful in iterative solvers or other applications that need high-performance tensor operations. Finally, RLE could also be useful for working with large image/voxel data that is difficult to fit into the limited heap size of JavaScript.
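
To make the RLE idea concrete, here is an illustrative sketch (my own illustration, not the proof-of-concept mentioned below): store one entry per run, so storage scales with the number of runs rather than the array length, and random access becomes a binary search.

```js
// Illustrative only: run-length encode a flat row-major array by
// recording the start index and value of each run. Long zero runs in
// sparse data collapse to a single entry.
function rleEncode(data) {
  const starts = [], values = [];
  for (let i = 0; i < data.length; i++) {
    if (i === 0 || data[i] !== data[i - 1]) {
      starts.push(i);
      values.push(data[i]);
    }
  }
  return { starts, values, length: data.length };
}

// Random access: binary search for the run containing index i,
// O(log #runs) per lookup.
function rleGet(rle, i) {
  let lo = 0, hi = rle.starts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (rle.starts[mid] <= i) lo = mid;
    else hi = mid - 1;
  }
  return rle.values[lo];
}
```

For example, `rleEncode([0, 0, 5, 5, 0])` yields `{ starts: [0, 2, 4], values: [0, 5, 0], length: 5 }`.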

I have done some preliminary work implementing this as a proof-of-concept, so the idea seems sound, but more work is necessary to get this integrated and working smoothly across the whole ecosystem.

@rreusser (Member)

Some references:

  • Numerical Recipes (column-indexed and row-indexed sparse storage): http://www.aip.de/groups/soe/local/numres/bookcpdf/c2-7.pdf
  • Compressed Row Storage (CRS): http://netlib.org/linalg/html_templates/node91.html
  • Compressed Column Storage (CCS): http://netlib.org/linalg/html_templates/node92.html
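
For concreteness, here is the CRS layout those netlib pages describe, applied to a small matrix (0-based indexing here, whereas the netlib pages use 1-based):

```js
// CRS storage of the 4x4 matrix
//
//   [ 10  0  0 -2 ]
//   [  3  9  0  0 ]
//   [  0  7  8  0 ]
//   [  0  0  4  5 ]
//
const val    = [10, -2, 3, 9, 7, 8, 4, 5]; // nonzero values, row by row
const colInd = [ 0,  3, 0, 1, 1, 2, 2, 3]; // column index of each value
const rowPtr = [ 0,  2, 4, 6, 8];          // row i lives in val[rowPtr[i]..rowPtr[i+1])
```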

I don't have strong feelings about any of these schemes vs. RLE, except that, for example, the CCS scheme stores row information that saves you from having to search in order to locate specific rows. When I tried this a long time ago, the sparse LU preconditioner was moderately challenging, since you had to perform operations the storage scheme wasn't designed for, like traversing the matrix in the non-indexed direction. I think the marginal difficulty of CRS/CCS over RLE is perhaps outweighed by the potential benefits when it comes time to implement algorithms that go beyond matrix-vector multiplication, but that's just an opinion. 😃
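
A sketch of why the index arrays matter: matrix-vector multiplication walks each row directly via the row pointer, while traversing a column (the non-indexed direction mentioned above) would require scanning `colInd`. This uses the illustrative `val`/`colInd`/`rowPtr` arrays from the previous example.

```js
// Sketch: y = A * x with CRS arrays. The indexed (row) direction is a
// direct loop; the non-indexed (column) direction would need a scan.
function crsMatVec(val, colInd, rowPtr, x) {
  const n = rowPtr.length - 1;
  const y = new Float64Array(n);
  for (let i = 0; i < n; i++) {
    for (let k = rowPtr[i]; k < rowPtr[i + 1]; k++) {
      y[i] += val[k] * x[colInd[k]];
    }
  }
  return y;
}
```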

@rreusser (Member)

FWIW, on the topic of sparse matrices: http://math.nist.gov/MatrixMarket/
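
For reference, the test matrices there are distributed in the MatrixMarket exchange format; the common coordinate variant is just a header plus 1-based (row, column, value) triples, e.g.:

```
%%MatrixMarket matrix coordinate real general
% 3x3 matrix with 4 nonzero entries, listed as 1-based triples
3 3 4
1 1 10.0
2 2 9.0
3 1 3.0
3 3 5.0
```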
