Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define factorize(Adjoint) to make e.g. inv(Adjoint) work (in most cases) #26302

Merged
merged 1 commit into from Mar 7, 2018

Conversation

andreasnoack
Copy link
Member

Fixes #26299. It will still fail for complex symmetric matrices and it will require a few more definitions to handle that (relatively rare) case so I've added a @test_broken for now.

@@ -1222,6 +1222,7 @@ function factorize(A::StridedMatrix{T}) where T
end
qrfact(A, Val(true))
end
factorize(A::Adjoint) = adjoint(factorize(parent(A)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why prefer adjoint(factorize(parent(A))) over factorize(copy(A))? Prefer avoiding an allocation to generating an object that may dispatch more nicely downstream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you say, it saves an allocation and I think this within the core business of Adjoint since the main thing to do after factorize is solving and most solves ought to have methods for Adjoint.

Copy link
Member

@Sacha0 Sacha0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Andreas! :)

@garrison
Copy link
Sponsor Member

garrison commented Mar 2, 2018

Thanks! Probably makes sense to do a similar thing for Transpose as well?

@stevengj
Copy link
Member

stevengj commented Mar 3, 2018

And ConjMatrix

@andreasnoack
Copy link
Member Author

@garrison I've added a version for Transpose as well.
@stevengj ConjArray was deprecated in #25217

@fredrikekre
Copy link
Member

I think the user should do inv(copy(A')) (or inv(A)' here instead of doing the materialization automatically.

@mbauman
Copy link
Sponsor Member

mbauman commented Mar 5, 2018

Could you expand on that, @fredrikekre? Adjoints are themselves AbstractMatrixes, supporting all sorts of abstract implementations beyond those specifically targeting ::Adjoint.

B = complex.(A, randn(n, n))
B = B + transpose(B)

@test_broken inv(B')*B' ≈ I
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a note about why this is broken and what would be required to fix it?

@fredrikekre
Copy link
Member

Could you expand on that, @fredrikekre?

There is just so many cases where Adjoint/Tranpose does not work right now (see e.g. #25331), and is the solution really to add methods for these in all cases? For #25331 this would require many combination of methods.

In this particular case the 0.7 equivalent of inv(A') is inv(copy(A')) and I think it might be nice to be explicit about the materialization of the lazy adjoint here, and not "fool" the user that inv can somehow take advantage of a lazy transpose. But on the other hand Adjoint <: AbstractArray so we can not stop them from dispatching to methods accepting AbstractArrays.

Perhaps it would have been different if Adjoint/Transpose weren't AbstractArrays, in that case we could choose to implement only methods were we can take advantage of the lazyness.

@andreasnoack
Copy link
Member Author

andreasnoack commented Mar 5, 2018

I think it might be nice to be explicit about the materialization of the lazy adjoint here, and not "fool" the user that inv can somehow take advantage of a lazy transpose.

Generally, there is a tension here between more and less experienced users. Many non-expert users would like to avoid taking Adjoint into account and just get a result even if it has some unnecessary allocations. People more interested in the implementation might prefer an error here but I'm not completely convinced. Specifically for this PR, there is actually an advantage from using Adjoint over the inv(copy(A')) so I think the critique doesn't apply in this particular case.

@mbauman
Copy link
Sponsor Member

mbauman commented Mar 5, 2018

For #25331 this would require many combination of methods.

That's not true at all. Check out #26331.

The totally generic AbstractArray implementations will be pretty terrible until we have a way to improve iteration order over row-major arrays. But I believe that, too, will come in time.

@Jutho
Copy link
Contributor

Jutho commented Mar 7, 2018

I think the user should do inv(copy(A')) (or inv(A)' here instead of doing the materialization automatically.

I am confused. The current implementation does not materialize A', is that correct?

On the Adjoint <: AbstractArray issue being raised in this topic, one inconsistency with this is that Factorization is not a subtype of AbstractArray, but then if you take its adjoint, it gets wrapped in a Adjoint object and so the adjoint of a Factorization suddenly does become a subtype of AbstractArray.

@andreasnoack
Copy link
Member Author

I am confused. The current implementation does not materialize A', is that correct?

That is correct.

Your second point is also correct. Although a bit annoying, it only seems to be a theoretical and not a practical concern. If it becomes a real problem, I think we should just make Factorization a subtype of AbstractMatrix. Otherwise, we'd have to give up adjoint solves which would be unfortunate.

@andreasnoack andreasnoack merged commit 8613942 into master Mar 7, 2018
@andreasnoack andreasnoack deleted the anj/adjfact branch March 7, 2018 12:58
@StefanKarpinski
Copy link
Sponsor Member

I think we should just make Factorization a subtype of AbstractMatrix.

Any reason not to do this?

@stevengj
Copy link
Member

stevengj commented Mar 7, 2018

Making it a subtype of AbstractMatrix would require that we have getindex, but for an n×n Factorization object getindex would be O(n) rather than O(1).

In general, making Adjoint and Transpose subtypes of AbstractArray seems problematic for any kind of "matrix-free" data structure, where you want to be able to take the adjoint but don't have explicit matrix storage or any efficient way to obtain individual entries.

@andreasnoack
Copy link
Member Author

Ref: #1412 and #10064

For QR, I think indexing would as costly as O(n^2).

In general, making Adjoint and Transpose subtypes of AbstractArray seems problematic for any kind of "matrix-free" data structure, where you want to be able to take the adjoint but don't have explicit matrix storage or any efficient way to obtain individual entries.

Which potential issues are there here? Isn't it just that you can hit an operation that takes forever or maybe consumes all your memory. We already have these issues with other AbstractArrays such as SparseMatrixCSC and DistributedArrays. Is there a feasible alternative? If we don't have Adjoint<:AbstractMatrix then Adjoitn{<:AbstractMatrix} can't benefit from AbstractMatrix methods.

@Jutho
Copy link
Contributor

Jutho commented Mar 7, 2018

Are there, except for the <:Factorizations, other types of objects which are wrapped in Adjoint but are not themselves <:AbstractArray?

Theoretically, it should be possible to make the adjoint of a qr factorization be such that it is the lq factorization with matrices L=Adjoint(R), Q=Adjoint(Q) and vice versa. But I am not sure if that is easy to express in the way the information of the QR factorization is encoded.

The adjoint of an SVD factorization is again an SVD factorization. Not sure about the others though (pivoted LU, ...).

@andreasnoack
Copy link
Member Author

Theoretically, it should be possible to make the adjoint of a qr factorization be such that it is the lq factorization with matrices L=Adjoint(R), Q=Adjoint(Q) and vice versa. But I am not sure if that is easy to express in the way the information of the QR factorization is encoded.

Initially, I thought this couldn't work but now I'm in doubt. At least for the factorizations that are parametric on the array type this might actually work. I'll have to try it out. Our QR is probably a good test case.

@andreasnoack
Copy link
Member Author

I've just tried to play around with this and, unfortunately, I don't think this can really work for the QR. It would require more type parameters for the factorization since the field for storing the blocked loadings isn't parametric. Even if we did that, it would be quite awkward compared to just wrapping the factorization in Adjoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants