# `istril`, `istriu`, `diag`, etc for block diagonal matrices #31122

Open
opened this issue Feb 20, 2019 · 19 comments

Contributor

### mcognetta commented Feb 20, 2019

 Currently, block diagonal matrices don't correctly support `istril` or `istriu`:

```julia
julia> D = Diagonal([[1 2; 3 4], [5 6; 7 8]])
2×2 Diagonal{Array{Int64,2},Array{Array{Int64,2},1}}:
 [1 2; 3 4]           ⋅
          ⋅  [5 6; 7 8]

julia> A = [1 2 0 0; 3 4 0 0; 0 0 5 6; 0 0 7 8]  # full matrix of D
4×4 Array{Int64,2}:
 1  2  0  0
 3  4  0  0
 0  0  5  6
 0  0  7  8

julia> istriu(D)
true

julia> istriu(A)
false
```

These can be solved by adding

```julia
istril(D::Diagonal{<:Number}) = true
istril(D::Diagonal) = all(istril, D.diag)
```

etc.

Once this is fixed, what should the intended behavior of `tril(D, k)`, `triu(D, k)`, `diag(D, k)`, etc. be? `tril(D, k)` can just recursively check the blocks, but should we make `diag(D, k)` actually return the k-th superdiagonal of the 'full' matrix?

I realize this is quite low on the priority list, but it seems like block diagonal matrices were hastily added and a lot of support is missing.
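The proposed fix can be sketched as standalone functions (hypothetical names `blockistril`/`blockistriu`, used here to avoid redefining the stdlib methods), assuming the `all(istril, D.diag)` approach generalizes to `istriu`:

```julia
using LinearAlgebra

# Sketch of the proposed recursive definitions: a Diagonal of scalars is
# always tril/triu, while a Diagonal of matrices is tril/triu only if
# every diagonal block is.
blockistril(D::Diagonal{<:Number}) = true
blockistril(D::Diagonal) = all(istril, D.diag)
blockistriu(D::Diagonal{<:Number}) = true
blockistriu(D::Diagonal) = all(istriu, D.diag)

D = Diagonal([[1 2; 3 4], [5 6; 7 8]])
blockistriu(D)  # false, matching istriu of the flattened 4×4 matrix
```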

Contributor

### eulerkochy commented Feb 21, 2019

 - `istril` for a block diagonal matrix is true iff all of its constituent matrices are lower triangular.
- `istriu` for a block diagonal matrix is true iff all of its constituent matrices are upper triangular.
- the block diagonal matrix is diagonal iff all of its constituent matrices are diagonal.

The above points can be proven. Given this, I think this can be implemented in Julia. Use cases for block diagonal matrices include solving the Dirac equation (https://en.wikipedia.org/wiki/Dirac_equation) and fast matrix multiplication using Strassen's algorithm (I don't know if it's already implemented or not).

> I realize this is quite low on the priority list, but it seems like block diagonal matrices were hastily added and a lot of support is missing.

What exactly is missing?
Member

### andreasnoack commented Feb 21, 2019

 > I realize this is quite low on the priority list, but it seems like block diagonal matrices were hastily added and a lot of support is missing.

Yes. We should have gone through the methods in `diagonal.jl` and checked for places where we implicitly assume that the elements are scalars. It would also be good to ensure test coverage for this.

Open

Contributor

### dkarrasch commented Mar 12, 2019

 Oo! Apparently, even for general block matrices `A::Array{Array{Float64,2},2}`, `istriu`, `istril`, and `isdiag` are determined based on the block structure, not recursively:

```julia
A = [i == j ? rand(2,3) : zeros(2,3) for i in 1:3, j in 1:3]
isdiag(A), istril(A), istriu(A)  # == (true, true, true)
```

The same is true for `triu(A)` and `tril(A)`, which only look at the outer matrix. In this sense, we are introducing inconsistency when we look at band structure recursively. The big question seems to be whether this is just an oversight in `LinearAlgebra` or whether it was done on purpose. The inconsistency pointed out in the OP indicates that we would likely want the results for block matrices to be consistent with the "flattened" full matrix, which calls for recursive handling of these issues. Obviously, this would be a breaking change.

Also, it seems that symmetry/Hermitianness is determined somewhat recursively: we go through one triangle of the outer matrix and ask whether each entry is equal to its own transpose/adjoint. This comparison, I guess, goes down the hierarchy if necessary. The following work as one would expect from a recursive/flattened viewpoint:

```julia
A = [rand(2,3) for _ in 1:1, _ in 1:1]  # 1×1 Array{Array{Float64,2},2}
isdiag(A)  # true

b = rand(2,3)
A = [[b for _ in 1:1, _ in 1:1] for _ in 1:1, _ in 1:1]  # 1×1 Array{Array{Array{Float64,2},2},2}
issymmetric(A)  # false

A = [[b'b for _ in 1:1, _ in 1:1] for _ in 1:1, _ in 1:1]  # 1×1 Array{Array{Array{Float64,2},2},2}
issymmetric(A)  # true
```

In summary, the issue is more general than described in the OP.
Contributor

### dkarrasch commented Mar 14, 2019

I had a quick look at where `istril` gets called in Julia's stdlibs, and found one very prominent case:

Lines 956 to 973 in f44a37f

```julia
function (\)(A::AbstractMatrix, B::AbstractVecOrMat)
    require_one_based_indexing(A, B)
    m, n = size(A)
    if m == n
        if istril(A)
            if istriu(A)
                return Diagonal(A) \ B
            else
                return LowerTriangular(A) \ B
            end
        end
        if istriu(A)
            return UpperTriangular(A) \ B
        end
        return lu(A) \ B
    end
    return qr(A, Val(true)) \ B
end
```

If I understand that correctly, then (at least) in this case we really do want `istril` and `istriu` to look at the block structure, and not elementwise. The slightly modified example from above

```julia
A = [i == j ? rand(2,2) : zeros(2,2) for i in 1:3, j in 1:3]
isdiag(A), istril(A), istriu(A)  # == (true, true, true)
```

is diagonal, and we would want that to be recognized when solving. Admittedly, `Diagonal` matrices (which are addressed in this issue) would be caught earlier by multiple dispatch and would not be "misclassified" by an elementwise definition of diagonality. But still, the question remains: how do we interpret things like symmetry, triangular or band structure for block matrices?
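The inconsistency can be stated concretely by comparing a block matrix's classification with that of its flattened counterpart (using a hypothetical `flatten` helper, assuming equal-size blocks):

```julia
using LinearAlgebra

# Hypothetical helper: flatten a matrix of equal-size matrix blocks into
# one ordinary matrix; permutedims puts the blocks in the row-major order
# that hvcat expects.
flatten(A::AbstractMatrix{<:AbstractMatrix}) =
    hvcat(size(A, 2), permutedims(A)...)

A = [i == j ? [1 2; 3 4] : zeros(Int, 2, 2) for i in 1:3, j in 1:3]
istriu(A)           # true: the sub-diagonal *blocks* are zero matrices
istriu(flatten(A))  # false: the flattened 6×6 matrix has a 3 below its diagonal
```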

Open

Contributor

### dlfivefifty commented Mar 14, 2019

 I was surprised at first by the following behaviour:

```julia
julia> A = Diagonal([[1 2], [3 4 5; 5 6 7; 8 9 10]])
2×2 Diagonal{Array{Int64,2},Array{Array{Int64,2},1}}:
 [1 2]                      ⋅
     ⋅  [3 4 5; 5 6 7; 8 9 10]

julia> A[1,2]
1×3 Array{Int64,2}:
 0  0  0

julia> A[2,1]
3×2 Array{Int64,2}:
 0  0
 0  0
 0  0
```

But it ended up being very helpful for diagonal `BlockArray`s, see https://github.com/JuliaArrays/BlockArrays.jl/blob/0776c2e745ee63a761d6a29f78ba7d54ec2d49a3/test/test_blockarrayinterface.jl#L80

This is very similar to the discussion on whether `adjoint` should adjoint the entries as well (it does). So I think at this point things are committed to being block-aware, though this probably could be codified better and made more consistent.
Contributor

### dkarrasch commented Mar 14, 2019

 @dlfivefifty So what's the conclusion? Do we judge symmetry and band/diagonal/triangular structure recursively/elementwise? Then the use of `istril` in the above code snippet from `generic.jl` would be inconsistent, right?

Maybe, as a compromise, one could define `istril`, `istriu`, and `isdiag` as block-tril, block-triu, and block-diagonal for `AbstractMatrix`, whereas for special matrix types (`Bidiagonal`, `SymTridiagonal`, `Diagonal`, etc.) we go down recursively? That's at least what is proposed in #31313, and has been done already for `isdiag(::Diagonal)` earlier. Again, in the `generic.jl` case above we wouldn't want elementwise triangularity/diagonality.
Contributor

### dlfivefifty commented Mar 14, 2019

 I have no conclusion, and am happy with whatever the consensus is.
Member

### StefanKarpinski commented Mar 14, 2019

 We continue to have the problem that no one has ever written down what a matrix of matrices is supposed to represent. Without such a statement, questions like "should `isdiag` be recursive?" will continue to be unclear and answered differently by different people. As it is, I suspect that we are terribly inconsistent about this: some parts of the code look at it one way, other parts another way.
Contributor Author

### mcognetta commented Mar 14, 2019

 This is slightly off the main topic but still relevant. To me, it doesn't seem like having block matrices in the stdlib is worth the trouble. It would be a breaking change, but it seems like we should restrict all matrices in `LinearAlgebra` to those with numerical eltypes. One can use a third-party package if they really need block matrix support (these already exist). As Stefan said, we have an inconsistent (or nonexistent) definition, and even simple operations on block matrices fail in the current release.
Contributor

### dlfivefifty commented Mar 14, 2019

 Restricting to `<: Number` may be a bad idea, as I think there are plenty of other algebras that are not numbers that people may want to do linear algebra with (quaternions come to mind, though one might define `Quaternion <: Number`). I think the current "use at your own risk" situation is preferable to overly restricting types. That said, in the specific case of block matrices I find linear algebra ill-defined without control over the block sizes, and better left to BlockArrays.jl. Though `LinearAlgebra` could still work with something like `SMatrix{2,2}`.
Contributor

### dkarrasch commented Mar 14, 2019

 > That said, I think in the specific case of block-matrices I find linear algebra ill-defined without control on the block sizes

Yes, I noticed that consistency of block sizes is never checked. One case where block stuff works nicely is with block-diagonal matrices. Here, solving "passes" the `\` further down to the blocks, and hence only several smaller factorizations need to be computed instead of one big factorization.
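That pass-down can be sketched with a hypothetical standalone solver (not the stdlib implementation), assuming square blocks whose sizes match the right-hand side:

```julia
using LinearAlgebra

# Hypothetical block-diagonal solve: each block solves its own slice of b,
# so several small factorizations replace one big one.
function blockdiag_solve(D::Diagonal{<:AbstractMatrix}, b::AbstractVector)
    stops = cumsum([size(B, 1) for B in D.diag])
    starts = [1; stops[1:end-1] .+ 1]
    reduce(vcat, [D.diag[i] \ b[starts[i]:stops[i]] for i in eachindex(D.diag)])
end

D = Diagonal([[2.0 0.0; 0.0 4.0], [1.0 0.0; 0.0 8.0]])
blockdiag_solve(D, [2.0, 4.0, 1.0, 8.0])  # ≈ [1.0, 1.0, 1.0, 1.0]
```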
Contributor

### dkarrasch commented Mar 15, 2019

 In any case, the PR related to this issue introduces the following inconsistency:

```julia
julia> dv = [[1 2; 3 4], [5 6; 7 8]];

julia> ev = [zeros(Int, 2, 2)];

julia> Bl = Bidiagonal(dv, ev, :L)
2×2 Bidiagonal{Array{Int64,2},Array{Array{Int64,2},1}}:
 [1 2; 3 4]           ⋅
 [0 0; 0 0]  [5 6; 7 8]

julia> istril(Bl)
true
```

whereas the same matrix written as block-diagonal yields

```julia
julia> D = Diagonal(dv);

julia> istril(D)
false
```

with the PR. I'm not sure we really need a clear and universal definition of what block matrices are supposed to represent in order to improve consistency (although that would be generally appreciated, no doubt). Regardless of that definition, one may ask two different questions of a block matrix: (i) is there block-band structure, and (ii) is there band structure elementwise. As I argued above, I think for `AbstractMatrix` it makes sense to leave `istril` and friends looking at the block structure, whereas for special matrix types it makes sense to let `istril` look recursively into the matrix. One could then ask for a certain block structure via the type (`A isa UpperTriangular`) and for certain elementwise structure via `istriu(A)`. This approach, btw, is already implemented for `isdiag` on `Diagonal` matrices (#29638), which acts recursively. If this is agreeable, then I think the PR is good, but it should be complemented by corresponding changes in bidiagonal, tridiagonal, and friends.
Contributor

### dlfivefifty commented Mar 15, 2019

 It's a bad idea to have behaviour change depending on the type; it will just lead to hard-to-debug bugs. Perhaps anything that's defined in an abstract-algebra sense (`adjoint`, `Hermitian`, apparently `transpose`, etc.) should be recursive, and anything that's array-structure related (`triu`, `UpperTriangular`, etc.) should not be.
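That split is already visible in current behaviour (a sketch; note that `reshape` fills column-major, so the second block listed lands at position (2,1)):

```julia
using LinearAlgebra

A = reshape([[1 2; 3 4], [9 10; 11 12], [5 6; 7 8], [13 14; 15 16]], 2, 2)

# adjoint is recursive: the (1,2) entry of A' is the adjoint of A[2,1]
A'[1, 2] == [9 10; 11 12]'         # true

# triu is structural: it zeroes out whole sub-diagonal blocks and leaves
# the remaining blocks untouched
triu(A)[2, 1] == zeros(Int, 2, 2)  # true
```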
Contributor

### dkarrasch commented Mar 15, 2019 • edited

 Indeed, that would be a good consistency check: `istriu(triu(A))` should return `true`, but the PR related to this issue would return `false` if the diagonal blocks are not themselves upper triangular. According to that rule, we would need to make `isdiag(::Diagonal)` unconditionally true.
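That invariant can be written down as a one-line property check (a sketch under the structural definition):

```julia
using LinearAlgebra

# Under a consistent structural definition, triu of any block matrix
# should itself be classified as upper triangular, whatever the blocks are.
A = [rand(2, 2) for _ in 1:3, _ in 1:3]
istriu(triu(A))  # should be true
```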
Member

### StefanKarpinski commented Mar 18, 2019

 This business of interpreting matrices of matrices as representing block matrices in some vague, unspecified way is pretty questionable, imo. If we're going to do it, I think that we at least deserve a description of how it's supposed to work. Julia is, after all, not some type-impoverished language that can't do this in a better way with actual types.
Member

### andreasnoack commented Mar 28, 2019

 The existence of BlockArrays.jl doesn't change the fact that Julia arrays can have array elements, and we need to consider the right behavior for such arrays in our algorithms. Treating elements as scalars is not a neutral position. The issues associated with arrays of arrays are something we have gradually come to understand, and we will need to continue to discuss them. It isn't helpful to try to keep the status quo and point to external packages when the problem is completely within the scope of `LinearAlgebra`.
Contributor

### nickrobinson251 commented Apr 26, 2019

 There's a pretty alpha, soon-to-be-registered BlockDiagonal.jl package. (Obviously, as @andreasnoack says, this doesn't solve any problems in the scope of `LinearAlgebra`, but I thought it worth mentioning in case it helps anyone reading this issue.)

Open

Member

### StefanKarpinski commented Apr 29, 2019 • edited

 > The issues associated with arrays of arrays is something we gradually have understood and we will need to continue to discuss it.

All I've asked for (repeatedly) is a clarification of what the block array model even is. I have yet to get it.
Contributor

### dlfivefifty commented Apr 29, 2019

 I would say any sensible model would at the very least have the property that operations on matrix representations of the complex numbers behave the same way as complex numbers, or throw errors. Matrix multiplication satisfies this property:

```julia
julia> i = [0 1; -1 0];

julia> A = reshape([I+2i, I-i, 2I-3i, 3I+4i], 2, 2)
2×2 Array{Array{Int64,2},2}:
 [1 2; -2 1]  [2 -3; 3 2]
 [1 -1; 1 1]  [3 4; -4 3]

julia> B = [1+2im 2-3im; 1-im 3+4im]
2×2 Array{Complex{Int64},2}:
 1+2im  2-3im
 1-1im  3+4im

julia> A*[I+i, 2I-i]
2-element Array{Array{Int64,2},1}:
 [0 -5; 5 0]
 [12 5; -5 12]

julia> B*[1+im, 2-im]
2-element Array{Complex{Int64},1}:
  0 - 5im
 12 + 5im
```

`adjoint` is also fine, though FWIW `transpose` is "wrong":

```julia
julia> A'
2×2 Adjoint{Adjoint{Int64,Array{Int64,2}},Array{Array{Int64,2},2}}:
 [1 -2; 2 1]  [1 1; -1 1]
 [2 3; -3 2]  [3 -4; 4 3]

julia> B'
2×2 Adjoint{Complex{Int64},Array{Complex{Int64},2}}:
 1-2im  1+1im
 2+3im  3-4im

julia> transpose(A)
2×2 Transpose{Transpose{Int64,Array{Int64,2}},Array{Array{Int64,2},2}}:
 [1 -2; 2 1]  [1 1; -1 1]
 [2 3; -3 2]  [3 -4; 4 3]

julia> transpose(B)
2×2 Transpose{Complex{Int64},Array{Complex{Int64},2}}:
 1+2im  1-1im
 2-3im  3+4im
```

The current behaviour of `Diagonal{<:AbstractMatrix}` is also fine (though I'd be happy too if an error were thrown for off-diagonal entries). Based on this model, it makes sense that `istril` and `istriu` are not recursive.