
BLAS-like kernel for symmetric updates in LDL factorization without pivoting #31

poulson opened this issue Aug 7, 2015 · 17 comments


@poulson

poulson commented Aug 7, 2015

There is a long history of proposals to include simple modifications of ?syrk and ?herk to support C += alpha A (D A)^T, where D is diagonal. This kernel turns out to be important for Interior Point Methods, which often make use of LDL factorizations (without pivoting) of modified saddle-point systems.

Does such a routine already exist in BLIS? If not, is there a good place to start for adding support? Or a preferred name/convention?
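
For concreteness, here is one plausible reading of the requested semantics (a minimal reference sketch, not BLIS code, and a hypothetical routine name): the symmetric update C += alpha A D A^T with diagonal D, touching only the lower triangle the way ?syrk does.

```c
#include <stddef.h>

/* C += alpha * A * D * A^T, with D = diag(d), updating only the lower
 * triangle of C.  A is n x k, column-major with leading dimension lda;
 * d holds the k diagonal entries of D; C is n x n with leading dimension ldc. */
void syrk_diag_ref(size_t n, size_t k, double alpha,
                   const double *A, size_t lda,
                   const double *d,
                   double *C, size_t ldc)
{
    for (size_t j = 0; j < n; ++j)
        for (size_t i = j; i < n; ++i)      /* lower triangle only */
        {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += A[i + p*lda] * d[p] * A[j + p*lda];
            C[i + j*ldc] += alpha * sum;
        }
}
```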

@chriscoey

Seconding this, but I would suggest further generalizing from diagonal D to symmetric/hermitian.

@fgvanzee
Member

@poulson @chriscoey Can you elaborate more on what you have in mind, since I'm unfamiliar with this operation?

Specifically, I'm interested in knowing the structure of A in C += alpha A (D A)^T. (Also, what is an LDL factorization? It's not clear to me how this relates to your request, or whether it is just background/context that isn't strictly necessary.)

@rvdg
Collaborator

rvdg commented Apr 26, 2019 via email

@fgvanzee
Member

@rvdg Thanks, Robert. So is a real domain LDL factorization specified as A = L D L^T?

@devinamatthews
Member

devinamatthews commented Apr 26, 2019

@fgvanzee yes. The updates in question are the A -= L D L^T updates for A11, A21, and A22 (depending on variant).

@poulson I implemented C = A D B multiplications (with D diagonal) in TBLIS (interface and implementation). C = A B D is also possible. It should be possible to port this to BLIS and also extend to ?syrk.
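
(Absent a fused kernel, the diagonal case can be emulated by scaling the rows of B by d and then calling an ordinary gemm; a fused kernel avoids the extra workspace and memory pass. A rough sketch of the workaround, assuming a CBLAS header is available and using a hypothetical helper name:)

```c
#include <cblas.h>
#include <stdlib.h>

/* Workaround sketch: C = A * D * B with D = diag(d), by explicitly forming
 * DB and calling dgemm.  A is m x k, B is k x n, C is m x n, all column-major. */
void gemm_diag_ref(int m, int n, int k,
                   const double *A, int lda,
                   const double *d,
                   const double *B, int ldb,
                   double *C, int ldc)
{
    double *DB = malloc((size_t)k * n * sizeof *DB);
    if (!DB) return;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < k; ++i)
            DB[i + (size_t)j*k] = d[i] * B[i + (size_t)j*ldb];   /* scale row i of B */
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, 1.0, A, lda, DB, k, 0.0, C, ldc);
    free(DB);
}
```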

@chriscoey we had an undergraduate working on general 3-matrix multiplication. @rvdg did that end up as a usable product?

@poulson
Author

poulson commented Apr 26, 2019

@fgvanzee Yes, Robert (@rvdg) already clarified. Such factorizations are especially useful for symmetric quasi-(semi)definite matrices, which are, up to permutation, of the form [A, B; B', -C], where A and C are Hermitian positive-(semi)definite. In the definite case, one can show bounds (e.g., via Saunders' analysis) on the stability of such factorizations. In practice, Interior Point Methods are built on this type of factorization (with the semidefinite cases modified to be definite).

@fgvanzee
Member

I still haven't seen anyone relate C += alpha A (D A)^T and A -= L D L^T. Even after you nix the alpha, harmonize the +/-, and assume A (in the first expression) is unit lower triangular, those don't look like the same operation to me.

@fgvanzee
Member

fgvanzee commented Apr 26, 2019

Nevermind, I see now that they are equivalent.

EDIT: No, I take that back. I'm still in doubt.

@fgvanzee
Member

Let A = L. I don't see how (D A)^T = D L^T. The lhs applies the diagonal elements to the columns of A^T (because (D A)^T = A^T D) while the rhs applies the diagonal elements to the rows of L^T.

This may seem like minutiae, but I like to understand the operation before I think about implementing it.

Note: While I appreciate that there are sometimes rich stories to tell about the applications that employ these operations, I actually don't need to know how the operation is used (it just clutters my brain and doesn't stick anyway). I happily leave such knowledge-keeping to people like @poulson and @devinamatthews. :)

@poulson
Author

poulson commented Apr 26, 2019

Hi Field,

Think of an $LDL^H$ factorization as a square-root-free Cholesky factorization, where the pivots are stored in the diagonal matrix $D$. Each rank-one update of a right-looking / greedy / variant 3 algorithm is Hermitian and of the form $a_{2,1} \delta_{1,1} a_{2,1}^H$. In the blocked algorithm, these become $A_{2,1} D_{1,1} A_{2,1}^H$, where $D_{1,1}$ is the diagonal matrix of pivots from the active diagonal block, $A_{1,1} = L_{1,1} D_{1,1} L_{1,1}^H$.
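
Spelling out the blocked right-looking step (standard notation; a sketch of the variant rather than any particular implementation): factor the small diagonal block, solve for the panel, then make the symmetric trailing update,

$$A_{1,1} = L_{1,1} D_{1,1} L_{1,1}^H, \qquad L_{2,1} = A_{2,1} L_{1,1}^{-H} D_{1,1}^{-1}, \qquad A_{2,2} \leftarrow A_{2,2} - L_{2,1} D_{1,1} L_{2,1}^H .$$

The trailing update is exactly the kernel being requested, $C \leftarrow C + \alpha\, A\, D\, A^H$ with $A = L_{2,1}$, $D = D_{1,1}$, and $\alpha = -1$.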

@fgvanzee
Member

@poulson I've never heard of a pivoted Cholesky factorization, but I get the idea!

@poulson
Author

poulson commented Apr 27, 2019

@fgvanzee The diagonal entries you divide by are sometimes called pivots even if you didn't permute. Per the other question: diagonally-pivoted Cholesky is useful for computing a rank-revealing factorization of a Hermitian positive semi-definite matrix.

@fgvanzee
Member

@poulson Ah, thanks for that clarification. I'd never encountered this alternate instance/usage of "pivot." Good to know. (The President might say I am a fake linear algebraist. Although I think he would struggle to even pronounce "algebraist" correctly.)

@pauldj

pauldj commented May 10, 2019

> An LDL factorization is like a Cholesky factorization, but for a matrix that is symmetric but not necessarily positive definite. L is unit lower triangular and D is diagonal.

The D of an LDL' factorization is actually NOT diagonal. It is block-diagonal, with blocks of size 1x1 or 2x2.
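
A standard illustration (not from this thread) of why 1x1 pivots alone do not suffice in the general symmetric indefinite case:

$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

is nonsingular, yet no factorization $A = L D L^T$ with unit lower triangular $L$ and diagonal $D$ exists: the $(1,1)$ entry of $L D L^T$ is $d_1$, which would have to be $0$, forcing the off-diagonal entry $d_1 \ell_{21}$ to be $0$ rather than $1$. A single $2\times 2$ block handles it trivially, with $D = A$ and $L = I$.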

@poulson
Author

poulson commented May 10, 2019

@pauldj Quasi-diagonal 'D' -- which I find useful to refer to as 'B', as in LBL -- is only needed for the general symmetric/Hermitian-indefinite case. It is provably not needed for symmetric quasi-definite matrices. Indeed, most Interior Point Methods have highly indefinite, highly ill-conditioned sparse systems that they factor with LDL with diagonal D.

@chuckyschluz

I have been looking for an open source implementation of this myself. Intel MKL has two routines to compute A * B * A^T, where A is a general matrix, and B is Hermitian (NOT necessarily positive definite). sypr takes sparse A and B, whereas syprd takes sparse A and dense B. They work very well -- but I have no clue what's going on under the hood.

https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/blas-and-sparse-blas-routines/inspector-executor-sparse-blas-routines/inspector-executor-sparse-blas-execution-routines/mkl-sparse-sypr.html

https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/blas-and-sparse-blas-routines/inspector-executor-sparse-blas-routines/inspector-executor-sparse-blas-execution-routines/mkl-sparse-syprd.html

Aaron-Hutchinson added a commit to sifive/sifive-blis that referenced this issue Apr 4, 2023
@lorentzenchr

The original proposal to add an extension to ?syrk to support C += alpha A (D A)^T for general A and diagonal D was once discussed on the mailing list and even has an open PR #536.

I personally would be very interested to have this functionality available. One big use case is Generalized Linear Models fit by an iterative solver. The Hessian - required by second-order/Newton optimization methods - is given by $X'DX$, where $D$ is a diagonal matrix that changes in each iteration and $X$ is a fixed data matrix (which goes by many names: design matrix, feature matrix, covariates, explanatory variables, …).
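
For context, the common workaround today (when all weights are nonnegative, as in IRLS) is to scale the rows of $X$ by $\sqrt{d}$ and call dsyrk; the requested kernel would avoid the extra copy and pass over the data, and would also cover diagonals with negative entries. A rough sketch of the workaround, assuming a CBLAS header and using a hypothetical helper name:

```c
#include <cblas.h>
#include <math.h>
#include <stdlib.h>

/* Workaround sketch: H = X^T D X for X (n x p, row-major) and d >= 0,
 * by forming S X with S = diag(sqrt(d)) and calling dsyrk on the result.
 * Only the lower triangle of H (p x p) is written. */
void hessian_xtdx(int n, int p, const double *X, const double *d, double *H)
{
    double *SX = malloc((size_t)n * p * sizeof *SX);
    if (!SX) return;
    for (int i = 0; i < n; ++i) {
        double s = sqrt(d[i]);
        for (int j = 0; j < p; ++j)
            SX[(size_t)i*p + j] = s * X[(size_t)i*p + j];
    }
    cblas_dsyrk(CblasRowMajor, CblasLower, CblasTrans,
                p, n, 1.0, SX, p, 0.0, H, p);
    free(SX);
}
```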
