
compact dense sparsity patterns #3701

Open
Mathadon opened this issue May 19, 2024 · 3 comments

Comments

@Mathadon
Contributor

Mathadon commented May 19, 2024

When generating code for a function that uses a dense matrix as a parameter/input, the generated code contains the full sparsity pattern of that matrix. Since the codegen lists an entry for every element of the matrix, a 1000x1000 matrix has a sparsity pattern of more than 1 million entries, resulting in >8 MB of object code. In the particular case I'm looking at, the sparsity patterns are responsible for 90% of the object code size.

Furthermore, the sparsity patterns are not even used in the function body. They are only used by the function

CASADI_SYMBOL_EXPORT const casadi_int* Function_sparsity_in(casadi_int i) {

This is at least the case in my version of CasADi. Perhaps this has already changed, but I couldn't find a related issue.

Would it be possible to generate a more compact form of dense sparsity patterns for such cases?
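
For concreteness, a sketch of the kind of table the generated code carries for a dense input, scaled down here to a 3x2 matrix (the symbol name is illustrative, not copied from real codegen output; the layout is the compressed-column format described in the reply below):

```c
typedef long long int casadi_int;   /* assumed default codegen integer type */

/* Sparsity table of a dense 3x2 input: dimensions, ncol+1 column offsets,
 * then one row index per nonzero. */
static const casadi_int f_s0[] = {
  3, 2,        /* nrow, ncol */
  0, 3, 6,     /* column offsets; the last one is nnz */
  0, 1, 2,     /* row indices of column 0 */
  0, 1, 2      /* row indices of column 1 */
};
/* For a dense 1000x1000 input this grows to 2 + 1001 + 1000000 = 1001003
 * entries; at 8 bytes per casadi_int that is roughly 8 MB of static data,
 * consistent with the figure above. */
```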

@jaeandersson
Member

jaeandersson commented May 19, 2024

The sparsity pattern is stored as a vector of integers where:

  • The first two elements are the dimensions (nrow, ncol)
  • The following (ncol+1) entries are the nonzero offset for each column. Note that this means that the (ncol+3)-th entry is the total number of nonzeros (nnz)
  • The last nnz entries are the row indices of the nonzeros. Within each column the row indices are strictly increasing, with no duplicates

For a dense matrix we have nnz = nrow * ncol, which can easily be checked, and the last part is then superfluous information. So the storage needs could drop from 2 + (ncol+1) + nrow*ncol entries to 2 + (ncol+1).

The problem is that every algorithm where the sparsity is used would need to be specialized to handle the dense case separately. Many common algorithms, like matrix multiplication, already are, but not all. For maintainability reasons, I would be against implementing the more compact storage (which is easy and quick to do) unless we go through all algorithms in CasADi where sparsity is used and update them accordingly. You can of course implement it yourself in your own fork of CasADi.
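
As a rough sketch of the check mentioned above (the helper below is hypothetical, not an existing CasADi function), density can be detected from the column offsets alone, which is what would make the trailing row indices droppable:

```c
#include <stdio.h>

typedef long long int casadi_int;   /* assumed default codegen integer type */

/* Hypothetical helper: for a valid pattern, colind[ncol] == nrow*ncol already
 * implies that every column is full, so the trailing nnz row indices carry no
 * information and the table could shrink from 2+(ncol+1)+nrow*ncol entries
 * to 2+(ncol+1). */
static int sparsity_is_dense(const casadi_int* sp) {
  casadi_int nrow = sp[0], ncol = sp[1];
  const casadi_int* colind = sp + 2;
  return colind[ncol] == nrow * ncol;
}

int main(void) {
  const casadi_int dense_3x2[] = {3, 2, 0, 3, 6, 0, 1, 2, 0, 1, 2};
  const casadi_int diag_3x3[]  = {3, 3, 0, 1, 2, 3, 0, 1, 2};
  printf("%d %d\n", sparsity_is_dense(dense_3x2), sparsity_is_dense(diag_3x3));
  /* prints: 1 0 */
  return 0;
}
```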

It's possible to get an even more compact representation, or to support something like triangular patterns or higher-order tensors, by letting the third element of the sparsity vector be something other than zero (it is always zero now by definition). But again, this generalization would need to be propagated everywhere in a maintainable way.
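
Purely as a hypothetical illustration of that generalization (the marker value and its meaning below are invented, not an existing CasADi convention):

```c
typedef long long int casadi_int;   /* assumed default codegen integer type */

/* The third element, colind[0], is always 0 in the current format, so a
 * nonzero value there could tag an alternative layout. Here 1 is arbitrarily
 * taken to mean "dense, row indices omitted", so a dense pattern would be
 * just {nrow, ncol, 1}. Every consumer of sparsity vectors would have to
 * understand the tag, which is the maintenance concern described above. */
static casadi_int sparsity_nnz(const casadi_int* sp) {
  if (sp[2] == 1) return sp[0] * sp[1];   /* hypothetical dense tag */
  return sp[2 + sp[1]];                   /* standard format: colind[ncol] */
}
```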

@Mathadon
Contributor Author

Okay, thanks for the feedback! I'll leave this open for now since it seems like a valid feature.

@jaeandersson
Member

> Okay, thanks for the feedback! I'll leave this open for now since it seems like a valid feature.

It would be a useful feature. It's usually hard to get funding for features that would "benefit a lot of users a little" as opposed to "benefit a few users a lot". That's the situation with a lot of potential core improvements.
