Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`istril`, `istriu`, `diag`, etc for block diagonal matrices #31122

Open
mcognetta opened this issue Feb 20, 2019 · 19 comments

Comments

8 participants
@mcognetta
Copy link
Contributor

commented Feb 20, 2019

Currently, block diagonal matrices don't correctly support istril or istriu

julia> D = Diagonal([[1 2; 3 4], [5 6; 7 8]])
2×2 Diagonal{Array{Int64,2},Array{Array{Int64,2},1}}:
 [1 2; 3 4]      ⋅     
     ⋅       [5 6; 7 8]

julia> A = [1 2 0 0; 3 4 0 0; 0 0 5 6; 0 0 7 8] # full matrix of D
4×4 Array{Int64,2}:
 1  2  0  0
 3  4  0  0
 0  0  5  6
 0  0  7  8

julia> istriu(D)
true

julia> istriu(A)
false

These can be solved by adding

istril(D::Diagonal{<:Number}) = true
istril(D::Diagonal) = all(istril, D.diag)

etc.

Once this is fixed, what should the intended behavior of tril(D, k), triu(D, k), diag(D, k), etc. be?
tril(D, k) can just recursively check the blocks but should we make diag(D, k) actually return the k-th super diagonal of the 'full' matrix?

I realize this is quite low on the priority list, but it seems like block diagonal matrices were hastily added and a lot of support is missing.

@eulerkochy

This comment has been minimized.

Copy link
Contributor

commented Feb 21, 2019

  • istril for a block diagonal matrix is true iff any one of its constituent matrices is lower triangular.
  • istriu for a block diagonal matrix is true iff any one of its constituent matrices is upper triangular.
  • the block diagonal matrix will be diagonal iff all of its constituents matrices are diagonal
    Well, the above points can be proven.
    Given this, I think this can be implemented in Julia.

The use case for block diagonal matrices are its application in solving Dirac Equation (https://en.wikipedia.org/wiki/Dirac_equation) and fast matrix multiplication using Strassen's Algorithm (I don't know if it's already implemented or not)

I realize this is quite low on the priority list, but it seems like block diagonal matrices were hastily added and a lot of support is missing.

What exactly is missing?

@andreasnoack

This comment has been minimized.

Copy link
Member

commented Feb 21, 2019

I realize this is quite low on the priority list, but it seems like block diagonal matrices were hastily added and a lot of support is missing.

Yes. We should have gone through the methods in diagonal.jl and checked for places where implicitly assume that the elements are scalars. It would also be good to add ensure test coverage for this.

@eulerkochy eulerkochy referenced a pull request that will close this issue Mar 10, 2019

Open

Add istriu, istril, isdiag for Diagonal matrices #31313

@dkarrasch

This comment has been minimized.

Copy link
Contributor

commented Mar 12, 2019

Oo! Apparently, even for general block matrices A::Array{Array{Float64,2},2} istriu, istril, and isdiag are determined based on the block structure, not recursively:

A = [i == j ? rand(2,3) : zeros(2,3) for i in 1:3, j in 1:3]
isdiag(A), istril(A), istriu(A) # == (true, true, true)

The same is true for triu(A) and tril(A), which only look at the outer matrix.

In this sense, we are introducing inconsistency when we look at band structure recursively. The big question seems to be whether this is just oversight in LinearAlgebra or whether this was done on purpose. The inconsistency pointed out in the OP indicates that likely we would want the results for block matrices to be consistent with the "flattened" full matrix, which calls for a recursive handling of these issues. Obviously, this would be a breaking change.

Also, it seems that symmetry/Hermitianness is determined somewhat recursively: we go through one triangle of the outer matrix, and ask whether that entry is equal to its own transpose/adjoint. This comparison, I guess, goes down the hierarchy if necessary. The following work as one would expect from a recursive/flattened viewpoint:

A = [rand(2,3) for _ in 1:1, _ in 1:1] # 1×1 Array{Array{Float64,2},2}
isdiag(A) # true

b = rand(2,3)
A = [[b for _ in 1:1, _ in 1:1] for _ in 1:1, _ in 1:1] # 1×1 Array{Array{Array{Float64,2},2},2}
issymmetric(A) # false
A = [[b'b for _ in 1:1, _ in 1:1] for _ in 1:1, _ in 1:1] # 1×1 Array{Array{Array{Float64,2},2},2}
issymmetric(A) # true

In summary, the issue is more general than described in the OP.

@dkarrasch

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

I had a quick look at where istril gets called in Julia's stdlibs, and found one very prominent case:

function (\)(A::AbstractMatrix, B::AbstractVecOrMat)
require_one_based_indexing(A, B)
m, n = size(A)
if m == n
if istril(A)
if istriu(A)
return Diagonal(A) \ B
else
return LowerTriangular(A) \ B
end
end
if istriu(A)
return UpperTriangular(A) \ B
end
return lu(A) \ B
end
return qr(A,Val(true)) \ B
end

If I understand that correctly, then (at least) in this case we really do want istril and istriu to look at the block structure, and not elementwise. The slightly modified example from above

A = [i == j ? rand(2,2) : zeros(2,2) for i in 1:3, j in 1:3]
isdiag(A), istril(A), istriu(A) # == (true, true, true)

is diagonal, and we would want that to be recognized when solving. Admittedly, Diagonal matrices (which are addressed in this issue) would be caught earlier by multiple dispatch and would not be "misclassified" by an elementwise definition of diagonality. But still, the question remains: how do we interpret things like symmetry, triangular or band structure for block matrices?
See also #31154, @dlfivefifty .

@dlfivefifty

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

I was surprised at first by the following behaviour:

julia> A = Diagonal([[1 2], [3 4 5; 5 6 7; 8 9 10]])
2×2 Diagonal{Array{Int64,2},Array{Array{Int64,2},1}}:
 [1 2]               ⋅                  
        ⋅         [3 4 5; 5 6 7; 8 9 10]

julia> A[1,2]
1×3 Array{Int64,2}:
 0  0  0

julia> A[2,1]
3×2 Array{Int64,2}:
 0  0
 0  0
 0  0

But it ended up being very helpful for diagonal BlockArrays, see https://github.com/JuliaArrays/BlockArrays.jl/blob/0776c2e745ee63a761d6a29f78ba7d54ec2d49a3/test/test_blockarrayinterface.jl#L80

This is very similar to the discussion on whether adjoint should adjoint the entries as well (it does). So I think at this point things are committed to being block-aware. Though it probably could be codified better and more consistent.

@dkarrasch

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

@dlfivefifty So what's the conclusion? Do we judge symmetry, band/diagonal/triangular structure recursively/elementwise? Then the use of istril in the above code snippet from generic.jl would be inconsistent, right? Maybe, as a compromise, one could define istril, istriu and isdiag as block-tril, block-triu, and block-diagonal for AbstractMatrices, whereas for special matrix types (Bidiagonal, SymTridiagonal, Diagonal etc.) we go down recursively? That's at least what is proposed in #31313, and has been done already for isdiag(::Diagonal) earlier. Again, in the generic.jl case above we wouldn't want elementwise triangularity/diagonality.

@dlfivefifty

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

I have no conclusion, and am happy with whatever the consensus is.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Mar 14, 2019

We continue to have the problem that no one has ever written down what a matrix of matrices is supposed to represent. Without such a statement, questions like "should isdiag be recursive?" will continue to be unclear and answered differently by different people. As it is, I suspect that we are terribly inconsistent about this: some parts of the code look at it one way, other parts another way.

@mcognetta

This comment has been minimized.

Copy link
Contributor Author

commented Mar 14, 2019

This is slightly off the main topic but still relevant. To me, it doesn't seem like having block matrices in the stdlib is worth the trouble. This would be a breaking change but it seems like we should restrict all matrices in LinearAlgebra to those with numerical eltypes. One can use a third party package if they really need block matrix support (these exist already).

As Stefan said we have an inconsistent (or non existent) definition and even simple stuff on block matrices fails in the current release.

@dlfivefifty

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

Restricting to <: Number may be a bad idea as I think there are plenty of other algebras that are not numbers that people may want to do linear algebra with (quaternions come to mind but one might define Quaternion <: Number). I think the current "use at your own risk" situation is preferred to overly restricting types.

That said, I think in the specific case of block-matrices I find linear algebra ill-defined without control on the block sizes, and better left to BlockArrays.jl. Though LinearAlgebra could still work with something like SMatrix{2,2}.

@dkarrasch

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

That said, I think in the specific case of block-matrices I find linear algebra ill-defined without control on the block sizes

Yes, I noticed that consistency of block sizes is never checked. One case where block stuff works nicely is with block-diagonal matrices. Here, solve "passes" the \ further down to the blocks, and hence only several smaller factorizations instead of one big factorization need to be computed.

@dkarrasch

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2019

In any case, the PR related to this issue is introducing the following inconsistency:

julia> dv = [[1 2; 3 4], [5 6; 7 8]];
julia> ev = [zeros(Int, 2,2)];
julia> Bl = Bidiagonal(dv, ev, :L)
2×2 Bidiagonal{Array{Int64,2},Array{Array{Int64,2},1}}:
 [1 2; 3 4]      ⋅     
 [0 0; 0 0]  [5 6; 7 8]
julia> istril(Bl)
true

whereas the same matrix written as block-diagonal yields

julia> D = Diagonal(dv);
julia> istril(D)
false

with the PR.

I'm not sure we really need a clear and universal definition of what block matrices are supposed to represent to improve consistency (although that would be generally appreciated, no doubt). Regardless of that definition, one may ask two different questions on block matrices: (i) is there block-band structure or not, and (ii) is there band structure elementwise. As I argued above, I think for AbstractMatrix it makes sense to leave istril and friends to look at the block structure, whereas for special matrix types, it makes sense to let istril look recursively into the matrix. One could then ask for a certain block structure via the type (A isa UpperTriangular) and for certain elementwise structure via istriu(A). This approach, btw, is already implemented for isdiag for Diagonal matrices #29638, which acts recursively. If this is agreeable, then I think the PR is good, but should be complemented by according changes in bidiagonal, tridiagonal and friends.

@dlfivefifty

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2019

It's a bad idea to have behaviour change depending on the type, it will just lead to hard to debug bugs.

Perhaps anything that's defined in an abstract algebra sense (adjoint, Hermitian, apparently transpose, etc.) should be recursive. Anything that's array-structure related (triu, UpperTriangular, etc.) should not be recursive.

@dkarrasch

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2019

Indeed, that would be a good check for consistency: istriu(triu(A)) should return true, but the related PR here would return false if the diagonal blocks are not upper triangular themselves. According to that rule, we would need to make isdiag(::Diagonal) unconditionally true.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Mar 18, 2019

This business of interpreting of matrices of matrices as representing block matrices in some vague, unspecified way is pretty questionable, imo. If we're going to do it, I think that we at least deserve a description of how it's supposed to work. Julia is, after all, not some type-impoverished language that can't do this in a better way with actual types.

@andreasnoack

This comment has been minimized.

Copy link
Member

commented Mar 28, 2019

The existence of BlockArrays.jl doesn't change the fact that Julia arrays can have array elements and we need to consider what is the right behavior for such arrays in our algorithms. Treating elements as scalars is not a neutral position. The issues associated with arrays of arrays is something we gradually have understood and we will need to continue to discuss it. It isn't helpful to try to keep the status quo and point to external packages when the problem is completely within the scope of LinearAlgebra.

@nickrobinson251

This comment has been minimized.

Copy link
Contributor

commented Apr 26, 2019

There's a pretty alpha, soon-to-be registered BlockDiagonal.jl package (Obviously, as @andreasnoack says, this doesn't solve any problems in the scope of LinearAlgebra, but i thought it worth mentioning in case it can help anyone reading this issue.)

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Apr 29, 2019

The issues associated with arrays of arrays is something we gradually have understood and we will need to continue to discuss it.

All I've asked for (repeatedly) is a clarification of what the block array model even is. Have yet to get it.

@dlfivefifty

This comment has been minimized.

Copy link
Contributor

commented Apr 29, 2019

I would say any sensible model would at the very least have the property that operations on matrix-representations of the imaginary numbers behave the same way as imaginary numbers, or throw errors. Matrix multiplication satisfies this property:

julia> i = [0 1; -1 0];

julia> A = reshape([I+2i,I-i,2I-3i,3I+4i], 2,2)
2×2 Array{Array{Int64,2},2}:
 [1 2; -2 1]  [2 -3; 3 2]
 [1 -1; 1 1]  [3 4; -4 3]

julia> B = [1+2im 2-3im; 1-im 3+4im]
2×2 Array{Complex{Int64},2}:
 1+2im  2-3im
 1-1im  3+4im

julia> A*[I+i,2I-i]
2-element Array{Array{Int64,2},1}:
 [0 -5; 5 0]  
 [12 5; -5 12]

julia> B*[1+im,2-im]
2-element Array{Complex{Int64},1}:
  0 - 5im
 12 + 5im

adjoint is also fine, though FWIW transpose is "wrong":

julia> A'
2×2 Adjoint{Adjoint{Int64,Array{Int64,2}},Array{Array{Int64,2},2}}:
 [1 -2; 2 1]  [1 1; -1 1]
 [2 3; -3 2]  [3 -4; 4 3]

julia> B'
2×2 Adjoint{Complex{Int64},Array{Complex{Int64},2}}:
 1-2im  1+1im
 2+3im  3-4im

julia> transpose(A)
2×2 Transpose{Transpose{Int64,Array{Int64,2}},Array{Array{Int64,2},2}}:
 [1 -2; 2 1]  [1 1; -1 1]
 [2 3; -3 2]  [3 -4; 4 3]

julia> transpose(B)
2×2 Transpose{Complex{Int64},Array{Complex{Int64},2}}:
 1+2im  1-1im
 2-3im  3+4im

The current behaviour of Diagonal{<:AbstractMatrix} is also fine (though I'd be happy too if an error is thrown for off-diagonal entries).

Based on this model, it makes sense that istril, istriu are not recurrsive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.