
Fixed the algorithm for powers of a matrix. #21184

Merged
merged 3 commits into from
Apr 24, 2017
Conversation

iamnapo
Contributor

@iamnapo iamnapo commented Mar 27, 2017

Fixed #21143

I did a few small bug-fixes and also changed the ^(A::AbstractMatrix, p::Integer) algorithm a bit to make use of powm(A0::UpperTriangular{T}, p::Real).

@@ -113,6 +113,7 @@ export
ordschur,
peakflops,
pinv,
powm,
Contributor

this is an internal helper function, it shouldn't be exported

@iamnapo
Contributor Author

iamnapo commented Mar 27, 2017

fixed it.

@ararslan ararslan added domain:linear algebra Linear algebra needs tests Unit tests are required for this change labels Mar 27, 2017
@iamnapo
Contributor Author

iamnapo commented Mar 28, 2017

I changed some of the tests and added a few more, based on:
Deadman & Higham [March 2014], Higham & Lin [September 2013]

@ararslan ararslan removed the needs tests Unit tests are required for this change label Mar 28, 2017
@kshyatt
Contributor

kshyatt commented Mar 28, 2017

Is this good to go?

@andreasnoack
Member

I don't know yet. I haven't looked carefully. What is the easiest way to resume? Should I open a PR from my branch or should this PR be against master?

@tkelman
Contributor

tkelman commented Mar 28, 2017

let's change this branch PR to be against master, so we can get a more complete view of the diff including the few commits from Andreas' branch

@iamnapo iamnapo changed the base branch from anj/powm to master March 28, 2017 19:59
@tkelman
Contributor

tkelman commented Mar 28, 2017

and close/reopen to trigger CI, apparently changing the target branch doesn't do that

@tkelman tkelman closed this Mar 28, 2017
@tkelman tkelman reopened this Mar 28, 2017
function ^(A::AbstractMatrix, p::Number)
# Matrix power
^(A::AbstractMatrix, p::Integer) = p < 0 ? Base.power_by_squaring(inv(A),-p) : Base.power_by_squaring(A,p)
function ^{T}(A::AbstractMatrix{T}, p::Real)
Contributor

what's the right approach for complex p ? is there a different method for that?

Member

Not sure. Maybe we can ask @higham what the right approach is.

Contributor Author

@iamnapo iamnapo Mar 28, 2017

Only approach I found is by definition: expm(p * logm(A)). But @stevengj pointed out correctly that it's very computationally expensive, so I didn't add it.


I think that's all there is for complex p. If you do an initial Schur decomp of A then expm and logm are applied to triangular matrices, which reduces the cost.

Member

It would be reasonable to have ^(A::AbstractMatrix, p::Number) = expm(p * logm(A)) as a fallback. We can optimize it later if needed.
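The suggested fallback can be sketched as follows (a sketch only, in post-0.6 spelling where `expm`/`logm` became `exp`/`log`; `mat_pow` is a hypothetical name, not the method this PR adds):

```julia
using LinearAlgebra

# Hypothetical sketch of the general fallback, assuming the principal
# matrix logarithm of A exists (expm/logm in the 0.6-era spelling).
mat_pow(A::AbstractMatrix, p::Number) = exp(p * log(A))

A = [2.0 1.0; 0.0 3.0]
# For an integer-valued p this agrees with repeated squaring up to roundoff.
mat_pow(A, 2.0 + 0im) ≈ A^2
```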

base/intfuncs.jl Outdated
@@ -193,7 +193,6 @@ end

^{T<:Integer}(x::T, p::T) = power_by_squaring(x,p)
^(x::Number, p::Integer) = power_by_squaring(x,p)
^(x, p::Integer) = power_by_squaring(x,p)
Contributor Author

not sure if this should be here.

Member

This fallback method shouldn't be removed by this PR.

if isinteger(p)
return A^Integer(real(p))
return p < 0 ? Base.power_by_squaring(inv(A),-Integer(p)) : Base.power_by_squaring(A,Integer(p))
Member

@stevengj stevengj Mar 29, 2017

The promotion here is still wrong if p is not an integer, as explained in #21143. A^2.0 should produce a floating-point matrix even if A is integer.

The return type should be something like Matrix{Base.promote_op(^, eltype(A), typeof(p))}.
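The suggested promotion rule can be illustrated with a small helper (`power_eltype` is a hypothetical name used only for this sketch):

```julia
# Hypothetical helper showing the suggested result element type:
# A^2.0 promotes to floating point even for an integer matrix,
# while A^2 stays integer.
power_eltype(A::AbstractMatrix, p::Number) = Base.promote_op(^, eltype(A), typeof(p))

power_eltype([1 2; 3 4], 2.0)  # Float64
power_eltype([1 2; 3 4], 2)    # Int
```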

Member

Also, note that power_by_squaring is pretty inefficient for matrices because it generates a lot of temporary matrices. We should really have a specialized _power_by_squaring!(Aᵖ::Matrix, A::Matrix, p::Integer) function that minimizes the number of temporary arrays and uses a pre-allocated output array. But that can be a separate PR, since it is just a performance optimization.

Member

@stevengj stevengj Mar 29, 2017

If A is an integer matrix, A^2 should be an integer matrix, but A^2.0 should be a floating-point matrix.

We definitely should not use diagonalization for integer p, because it will be way slower even than what we are doing now, despite the extra temporary matrices (which mostly waste memory, not time, for large A). e.g.

julia> A = randn(1000,1000); A = A'*A; # random SPD matrix

julia> @time A^4;
  0.428707 seconds (114.28 k allocations: 20.806 MiB, 20.50% gc time)

julia> @time A^4.5;
  3.669913 seconds (2.70 M allocations: 172.916 MiB, 2.36% gc time)

@@ -310,15 +310,23 @@ kron(a::AbstractMatrix, b::AbstractVector)=kron(a,reshape(b,length(b),1))
kron(a::AbstractVector, b::AbstractMatrix)=kron(reshape(a,length(a),1),b)

# Matrix power
function ^{T}(A::AbstractMatrix{T}, p::Integer)
# non-inversible matrix to negative power is undefined.
if (p < 0) && (det(A) == 0)
Member

@stevengj stevengj Mar 29, 2017

This check does not seem like a good idea. First, inv(A) will already throw an error if A is singular, and just as quickly as det will. Second, computing det(A) means that you compute the LU factorization of A twice. Third, you could get false positives because of underflow (e.g. det([1e-300 0; 0 1e-300]) == 0 even though the matrix is not singular).

(Basically, you should forget everything you learned in high school about the determinant as a computational tool.)
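The underflow false positive is easy to demonstrate (a standalone sketch, not code from the PR):

```julia
using LinearAlgebra

# This matrix is far from singular, yet its determinant (1e-600)
# underflows to 0.0 in Float64, while inv succeeds without trouble.
A = [1e-300 0.0; 0.0 1e-300]
det(A) == 0.0          # true: the product of pivots underflows
all(isfinite, inv(A))  # true: the inverse is perfectly computable
```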

if isinteger(p)
return p < 0 ? Base.power_by_squaring(inv(A),-Integer(p)) : Base.power_by_squaring(A,Integer(p))
if isinteger(p) && !(typeof(p) <: Int64)
return float.(A^Integer(p))
Member

@stevengj stevengj Mar 29, 2017

This is not right. First, Matrix{Integer}^Integer should be an integer matrix (even if p is not Int64). Second, for Matrix{Int}^Float64 we need to convert to floating point and then exponentiate in order to avoid overflow problems. Third, calling float.(X) will make an extra copy even if the matrix is already floating point.

We want something like:

convert(Matrix{promote_op(^, eltype(A), typeof(p))}, A)^Integer(p)

although that isn't quite right either: if A is not a Matrix, but is some other AbstractMatrix type, then we probably want the same type. Maybe

T = Base.promote_op(^, eltype(A), typeof(p))
return (T == eltype(A) ? A : copy!(similar(A, T), A))^Integer(p)

@iamnapo
Contributor Author

iamnapo commented Mar 29, 2017

One thing I'm not sure about is what should happen if A is singular and p is a negative real.

@@ -285,7 +285,6 @@ logm(D::Diagonal) = Diagonal(log.(D.diag))
logm(D::Diagonal{<:AbstractMatrix}) = Diagonal(logm.(D.diag))
sqrtm(D::Diagonal) = Diagonal(sqrt.(D.diag))
sqrtm(D::Diagonal{<:AbstractMatrix}) = Diagonal(sqrtm.(D.diag))
^(D::Diagonal, p::Real) = Diagonal((D.diag).^p)
Member

Why was this method removed?

end
end
if np_real_eigs
warn("Matrix with nonpositive real eigenvalues, a nonprincipal matrix power will be returned.")
Member

Maybe pass once=true here so that people don't see repeated warnings?

Contributor Author

You mean in something like (A^0.5) * (A^0.5)? If so, I'm not sure how to come up with the right key for the warning.

Member

The default key is the warning message, which seems fine here.

warn("Matrix with nonpositive real eigenvalues, a nonprincipal matrix power will be returned.")
end

if isreal(A) && ~np_real_eigs
Member

!np_real_eigs

sqrt_diag!(A0, A, s)

# Compute the Gauss-Legendre quadrature formula
x, w = Base.QuadGK.gauss(Float64, m)
Member

This won't work now that QuadGK was removed from Base, no?

Member

Since it looks like m ≤ length(theta), we can just precompute these and store them in a const array.
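One way to precompute the tables without QuadGK is the Golub–Welsch recurrence (a sketch under that assumption; the function and table names are illustrative, not what the PR uses):

```julia
using LinearAlgebra

# Golub–Welsch: Gauss–Legendre nodes and weights on (-1, 1) from the
# eigendecomposition of the symmetric tridiagonal Jacobi matrix, so the
# tables can be built once (e.g. into a const) instead of calling
# Base.QuadGK.gauss at runtime.
function gauss_legendre(m::Integer)
    β = [k / sqrt(4k^2 - 1) for k in 1:m-1]   # Jacobi off-diagonal
    F = eigen(SymTridiagonal(zeros(m), β))
    F.values, 2 .* abs2.(F.vectors[1, :])     # nodes, weights (μ₀ = 2)
end

const GAUSS_TABLE = Dict(m => gauss_legendre(m) for m in 2:8)
```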

@iamnapo
Contributor Author

iamnapo commented Mar 30, 2017

@stevengj
If I put once=true then the warning will only show once, even if later I have a different matrix.
I removed ^(D::Diagonal, p::Real) because it created ambiguities with ^(x, p::Integer), for some reason.
Regarding logm: #21179

@@ -512,6 +512,49 @@ end
@test_throws ArgumentError diag(zeros(0,1),2)
end

@testset "New matrix ^ implementation" for elty in (Float64, Complex{Float64})
A11 = convert(Matrix{elty}, rand(10, 10))
Contributor

You might want to just have rand(elty,10,10) here so that you actually make complex matrices.

# Check whether the matrix has nonpositive real eigs
np_real_eigs = false
for i = 1:n
if imag(d[i]) < eps() && real(d[i]) <= 0
Contributor

should this be the eps of the element type? not sure how likely it would be to matter

Contributor Author

There's also typemin(), for example, but I thought the default eps() is generally a safe condition for producing a warning. Do you think I should change it nevertheless?

Member

@stevengj stevengj Apr 5, 2017

This is not a good way to do the test because it depends on overall scaling of the matrix (if you multiply the whole matrix by 1e±100 the result will change).

Also, shouldn't you be checking abs(imag), not imag? And shouldn't the warning be for real < 0, not real ≤ 0?

I'm not sure what a robust test would be here. Maybe abs(imag(d[i])) ≤ ɛ && real(d[i]) < 0, where ɛ = eps(maxabs(d))?
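That suggestion can be sketched as follows (a hedged sketch; the helper name is illustrative and `maxabs(d)` from the comment is spelled `maximum(abs, d)` here):

```julia
# Scale-invariant version of the check: compare against eps of the
# largest eigenvalue magnitude, so rescaling A by 1e±100 does not
# change whether the warning fires.
function has_nonpos_real_eig(d)
    ε = eps(maximum(abs, d))
    any(λ -> abs(imag(λ)) ≤ ε && real(λ) < 0, d)
end

has_nonpos_real_eig([-1.0 + 0im, 2.0 + 0im])           # true
has_nonpos_real_eig(1e100 .* [-1.0 + 0im, 2.0 + 0im])  # still true
```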

@tkelman
Contributor

tkelman commented Apr 5, 2017

It evidently isn't tested, but tightening ^(A::AbstractMatrix, p::Number) to ^(A::AbstractMatrix, p::Real) and not having a method for other subtypes of Number could be breaking

retmat = Base.power_by_squaring(A, p)
end
for i = 1:n
retmat[i,i] = real(retmat[i,i])
Contributor

I know there are separate issues where this has been discussed, but just checking - any imaginary components here should only ever be round-off, right? Does blas hemm touch the diagonal imaginary components or wouldn't it leave them alone? Is the issue round-off for julia-generic or non-blas element types?

Member

Yes, the imaginary part is guaranteed to be zero, so any nonzero imaginary part is purely due to roundoff errors.

The errors arise for any values, including blas floats, because multiplying by the matrix of eigenvectors uses the general matmul routine. I don't think there is a specialized BLAS routine for matrix * real diagonal * matrix', which is the operation here.

Member

@stevengj stevengj Apr 13, 2017

Sorry, for Hermitian^p the operation is hermitian * hermitian, but again there is no specialized BLAS method for this because in general the result would not be Hermitian. (The result is Hermitian here because the matrices being multiplied are A^p and A^q, i.e. different powers of the same Hermitian matrix.) So the roundoff errors still arise for any values, including BLAS floats.

Member

For example, note the nonzero imaginary parts on the diagonal of:

julia> A = rand(Complex128,10,10); A = Hermitian(A + A')
10×10 Hermitian{Complex{Float64},Array{Complex{Float64},2}}:
 0.0861346-0.0im        0.837547-0.743436im        1.09493-0.163298im 
  0.837547+0.743436im    1.96064-0.0im             0.491744-0.792954im 
  0.869364-0.131544im    1.06965-0.252339im        0.370136+0.0155668im
   0.83727+0.303993im    1.18724+0.108695im         1.38671+0.191601im 
   1.50475+0.357721im    1.00663-0.421604im         1.02963-0.13435im  
    1.1999+0.0748608im  0.370293+0.503285im        1.13011-0.0552782im
   1.56458+0.546159im   0.512528-0.0406777im       0.721986+0.445042im 
  0.394002+0.645965im   0.935305-0.473984im        0.971132+0.493203im 
  0.741598-0.175156im    1.91631+0.111909im        0.868631+0.834707im 
   1.09493+0.163298im   0.491744+0.792954im      0.00985939-0.0im      

julia> imag(diag(A^2 * A^3))
10-element Array{Float64,1}:
 0.0        
 5.68434e-14
 0.0        
 0.0        
 0.0        
 4.54747e-13
 0.0        
 5.68434e-14
 0.0        
 0.0        

end
end

function expm{T<:Real}(A::Symmetric{T})
Contributor

would this, logm, and sqrtm for Symmetric of non-Real element types become no-method errors here?

Contributor Author

No, these are covered by Hermitian methods

Contributor

what about complex symmetric, non hermitian?

Contributor Author

oooh I'm sorry, for some reason I thought symmetric matrices are always hermitian

Contributor

it's a bit of a niche type, a fair number of routines even in blas or lapack aren't always implemented for c/z symmetric

@stevengj
Member

LGTM. The Travis failure seems unrelated.

end
end
function ^{T<:Real}(A::Symmetric{T}, p::Real)
F = eigfact(full(A))
Contributor

why is it we need the full call here?

Member

(In general, a lot of the full calls seem odd to me; we seem better off adding additional methods elsewhere, but I thought that could be left for another PR. This particular one is especially egregious though because eigfact should be especially efficient for real-symmetric matrices. What happens if you just delete all of the full calls?)

@jebej
Contributor

jebej commented Apr 21, 2017

Thanks for pushing with this. It would be really great to have it for 0.6.0


function ^(A::AbstractMatrix, p::Number)
# Matrix power
^{T}(A::AbstractMatrix{T}, p::Integer) = p < 0 ? Base.power_by_squaring(inv(A),-Integer(p)) : Base.power_by_squaring(A,Integer(p))
Contributor

Is Integer(p) needed here?

@tkelman
Contributor

tkelman commented Apr 21, 2017

The tightening of the allowed power type here from Number to Real would currently break [1 2; 3 4]^im

@iamnapo
Contributor Author

iamnapo commented Apr 21, 2017

@tkelman I'm not sure which line you are referring to.

@tkelman
Contributor

tkelman commented Apr 21, 2017

it's collapsed right now, but the line at #21184 (comment)

@iamnapo
Contributor Author

iamnapo commented Apr 21, 2017

Currently [1 2; 3 4]^im produces a MethodError. What do you think should happen?

(I plan on working on ^{T}(A::AbstractMatrix{T}, p::Complex) in another PR.)

@tkelman
Contributor

tkelman commented Apr 21, 2017

On master it gives a consistent answer with http://www.wolframalpha.com/input/?i=N%5B%7B%7B1,+2%7D,+%7B3,+4%7D%7D%5EI%5D. I don't know of a use case for this off the top of my head, but that doesn't seem like a good reason to change it from consistent with Wolfram to an error.

@stevengj
Member

@iamnapo, we should just add the ^(A::AbstractMatrix, p::Number) = expm(p*logm(A)) fallback routine and a test.

retmat = retmat * powm!(UpperTriangular(float.(A)), real(p - floor(p)))
end
else
S,Q,d = schur(complex(full(A)))
Contributor

another full that may have been missed, or was this one needed?

Contributor Author

This one is needed

Contributor

good to know - what was happening without it?

Contributor Author

if A is symmetric with complex values, but not hermitian

Member

@iamnapo, do you mean that there was a MethodError for schur(::Symmetric{<:Complex})? It would be better to solve that by adding a method to schur or schurfact that calls full.

Member

Looks like we just need:

schur(A::Symmetric{<:Complex}) = schur(full(A))

Contributor Author

@iamnapo iamnapo Apr 22, 2017

Yes. I'll add it.

@eval begin

function ($funm)(A::Symmetric)
function ($funm){T<:Real}(A::Symmetric{T})
Contributor

is the problem for Symmetric{<:Complex} that isposdef doesn't work?

Member

@stevengj stevengj Apr 22, 2017

No, the issue is that Symmetric{<:Complex} may have ill-conditioned eigenvectors F.vectors, so it is not reliable to use diagonalization to compute matrix functions. Whereas for real-symmetric matrices the eigenvectors are unitary, which we are exploiting here: F.vectors' == inv(F.vectors). The complex analogue is a Hermitian matrix, not complex-symmetric.
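The property being exploited can be checked directly (a small sketch, not part of the PR):

```julia
using LinearAlgebra

# For real symmetric A the eigenvector matrix is orthogonal
# (V' == inv(V)), so diagonalization gives a stable matrix power:
# A^p = V * Λ.^p * V'. No such guarantee exists for complex-symmetric
# (non-Hermitian) matrices, whose eigenvectors can be ill-conditioned.
A = Symmetric([4.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 2.0])
F = eigen(Matrix(A))
F.vectors' * F.vectors ≈ I                       # orthogonality
Ap = F.vectors * Diagonal(F.values .^ 3) * F.vectors'
Ap ≈ Matrix(A)^3
```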

Contributor

Cool. Useful to write that down somewhere, I wouldn't have remembered that detail from a linear algebra textbook.

@iamnapo iamnapo changed the title Fixed the algorithm for real powers of a matrix. Fixed the algorithm for powers of a matrix. Apr 22, 2017
@@ -107,6 +107,7 @@ function schur(A::StridedMatrix)
SchurF = schurfact(A)
SchurF[:T], SchurF[:Z], SchurF[:values]
end
schur(A::AbstractMatrix) = schur(full(A))
Member

The problem with this definition is that it will produce an infinite loop for any AbstractMatrix type where full(A) === A and there is no more-specific schur method.

I would just add schur(A::Triangular{<:Complex}) = schur(full(A)) (and anything else as needed).

Contributor Author

@iamnapo iamnapo Apr 22, 2017

Isn't full(A) always of type StridedMatrix?

@@ -111,8 +111,7 @@ schur(A::Symmetric) = schur(full(A))
schur(A::Hermitian) = schur(full(A))
schur(A::UpperTriangular) = schur(full(A))
schur(A::LowerTriangular) = schur(full(A))


schur(A::Tridiagonal) = schur(full(A))
Member

These methods can obviously be made vastly more efficient. For example, an upper-triangular matrix is already in Schur form. But that can be left for another PR.
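For instance, the triangular case could short-circuit entirely (a sketch of the deferred optimization; `schur_triu` is a hypothetical name, not what the PR adds):

```julia
using LinearAlgebra

# An upper-triangular T is its own Schur factor with Z = I, so a
# specialized method could return the pieces directly instead of
# densifying and calling LAPACK.
schur_triu(T::UpperTriangular) =
    (Matrix(T), Matrix{eltype(T)}(I, size(T)...), diag(T))

S, Z, vals = schur_triu(UpperTriangular([1.0 2.0; 0.0 3.0]))
Z * S * Z' == [1.0 2.0; 0.0 3.0]   # trivially satisfied with Z = I
```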

Contributor Author

Of course. I just added these definitions to make it work for now. I plan to work on schur() after this PR is closed.

@tkelman
Contributor

tkelman commented Apr 22, 2017

Do the original 2 commits by a different author pass on their own? Any failing commits should be squashed here, but git is bad at attribution if you squash together commits by different authors. Otherwise this lgtm now, thanks for the epic persistence @iamnapo !

@iamnapo
Contributor Author

iamnapo commented Apr 22, 2017

I don't think there will be a problem, but in case there is, you can fix authorship from the command line with --amend, if that's what you mean.

mfasi and others added 3 commits April 23, 2017 08:02
fixed Matrix^Real_Number functionality

reverted logm changes

remove ambiguous Diagonal^Real method
…ntegers.

added a few more test for ^;now uses random 10x10 matrix as well.

fixed some ambiguities, corrected the algorithm a bit.

correctly promotes Matrix{Int}^Float64

reverted some wrong changes

reverted logm changes

speedup if A is diagonal, fixed some tests

small changes based on feedback

powm is now scale-invariant

powm scales in-place, fixed testing a bit, corrected some bugs

removed full() calls

added rollback for Matrix^Complex

added schur() support for Symmetric,Hermitian,Triangular,Tridiagonal matrices
and be uniform about space after # in comments
@Sacha0
Member

Sacha0 commented Apr 23, 2017

thanks for the epic persistence @iamnapo !

Epic persistence indeed! Thanks @iamnapo! :)

@iamnapo
Contributor Author

iamnapo commented Apr 24, 2017

is this good to go?

Successfully merging this pull request may close these issues.

incorrect code for Matrix^Number