Support for qr decomposition pullback #469

rkube · 2021-07-12T19:29:02Z

Added an rrule for the QR decomposition. @sethaxen

rkube · 2021-08-25T20:03:01Z

I've ported this pullback to CUDA.jl: https://gist.github.com/rkube/b17ef683409d76a3f01bcc590b85de6e
Where would be a good place for that code?

oxinabox · 2021-09-21T17:36:34Z

pokes @sethaxen
(I can't really review this)

oxinabox · 2021-09-21T17:41:31Z

src/rulesets/LinearAlgebra/factorization.jl

+        ∂T = d === :R ? Ȳ : nothing
+
+        ∂F = Tangent{LinearAlgebra.QRCompactWY}(; factors=∂factors, T=∂T)


∂factors isn't defined

oxinabox · 2021-09-21T17:45:35Z

src/rulesets/LinearAlgebra/factorization.jl

+        # R. Schreiber and C. van Loan, Sci. Stat. Comput. 10, 53-57 (1989).
+        # Instead of backpropagating Q̄ and R̄ through (factors)bar and T̄, we re-use factors to carry Q̄ and T to carry R̄
+        # in the Tangent object.
+        ∂T = d === :R ? Ȳ : nothing


We do not use nothing to represent "not used".
We use ZeroTangent for "not used" (and NoTangent when there is no meaningful tangent space).
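
A minimal sketch of what this suggestion could look like, assuming ChainRulesCore is available. The names A, F, Ȳ, and d here are illustrative stand-ins and not the PR's actual variables:

```julia
using LinearAlgebra
using ChainRulesCore

A = randn(4, 3)
F = qr(A)          # a QRCompactWY for a dense Float64 matrix
Ȳ = randn(3, 3)    # stand-in cotangent, for illustration only
d = :Q             # pretend the pullback was requested for the Q component

# ZeroTangent() marks "this field received no cotangent". Unlike `nothing`,
# it composes with accumulation: ZeroTangent() + X returns X.
∂T = d === :R ? Ȳ : ZeroTangent()
∂F = Tangent{typeof(F)}(; T=∂T)

accumulated = ∂F.T + Ȳ   # works; `nothing + Ȳ` would throw a MethodError
```

Fields that are left out of the Tangent constructor (here factors) also read back as ZeroTangent(), so only the fields that actually carry a cotangent need to be filled in.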

sethaxen · 2021-09-21T18:49:39Z

pokes @sethaxen
(I can't really review this)

Yeah, sorry, I took the deep dive studying the various QR parameterizations a few weeks back in prep for reviewing this but haven't had the chance to yet. Sorry for the delay, @rkube!

sethaxen · 2021-10-08T22:25:38Z

So this is a really tricky set of rules to define, perhaps trickier than any of the other rules we have in ChainRules currently. Here are just a few complications:

The signatures for qr are all changing with Julia v1.7 (below I use the 1.7 signatures)
qr can produce 4 different types in the standard library, summarized below:

# returns QRCompactWY via LAPACK.geqrt!
qr(A::StridedMatrix{<:BlasFloat}, pivot = NoPivot(); kwargs...)
qr!(A::StridedMatrix{<:BlasFloat}, ::NoPivot; kwargs...)

# returns QR via qrfactUnblocked!
qr(A::AbstractMatrix, pivot = NoPivot())
qr!(A::AbstractMatrix, ::NoPivot)

# returns QRPivoted via qrfactPivotedUnblocked!
qr(A::AbstractMatrix, ::ColumnNorm)
qr!(A::AbstractMatrix, ::ColumnNorm)

# returns SuiteSparse.SPQR.QRSparse
qr(A::SparseMatrixCSC, pivot = NoPivot())

None of the QR objects explicitly generates the Q matrix. Instead, they represent it in a compact form, where factors contains the Householder reflectors in the strict lower trapezoid and R in the upper trapezoid. Computing rules in terms of these compact elements is challenging, roughly as challenging as implementing the qr functions themselves.
Calling .Q on one of these factorizations produces an AbstractQ <: AbstractMatrix object that has essentially the same fields. Because the AbstractQ objects are AbstractMatrix subtypes, they hit all of our AbstractMatrix rules by default and therefore end up with AbstractMatrix cotangents unless we write custom rrules for every function one might call on a QR object.
The AbstractQ objects are weird. For an n×k matrix A, size(qr(A).Q) == (n, n). However, Q is also allowed to be multiplied by matrices with size (k, m). So consider code like the following:

using LinearAlgebra

A = randn(10, 5)
Q, _ = qr(A)
v = randn(5)
w = randn(10)
y = Q*w + Q*v
@assert size(y) == (10,)

This is completely allowed, but note that the cotangent of Q will be ∂Q = ∂y * w' + ∂y * v'. This adds two matrices of size (10, 10) and (10, 5), respectively. The addition is performed by the AD engine and will error, so it is necessary to use ProjectTo to pad the (10, 5) matrix with zeros to make it (10, 10). That is very wasteful for very tall matrices, where one may never use the (10, 10) version.
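
The size mismatch can be reproduced with plain arrays. This is only a sketch of the padding idea: pad_columns is a hypothetical helper written for this illustration, not ProjectTo or any ChainRules API.

```julia
∂y = randn(10)
w = randn(10)
v = randn(5)

∂Q_full = ∂y * w'   # (10, 10) contribution from Q*w
∂Q_thin = ∂y * v'   # (10, 5) contribution from Q*v

# Adding the two contributions directly fails.
naive_add_failed = try
    ∂Q_full + ∂Q_thin
    false
catch err
    err isa DimensionMismatch
end

# Hypothetical helper: append zero columns until X has n columns.
# Q*v behaves like Q*[v; 0], so the missing columns of the cotangent are zero.
pad_columns(X, n) = hcat(X, zeros(eltype(X), size(X, 1), n - size(X, 2)))

∂Q = ∂Q_full + pad_columns(∂Q_thin, 10)
```

For an n×k input with n much larger than k, the padded matrix stores n*(n - k) explicit zeros, which is the waste described above.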

I don't think we can address a subset of these complications one at a time. Once we start adding rules, which will override AD systems' default behavior of differentiating through the qr! fallback (for operator-overloading ADs), we will need more rules to make sure all of our rules compose nicely. I need to think more about whether this can be handled without a tremendous amount of complication.
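
For reference, the compact storage mentioned in the list above can be inspected directly. This is a small sketch using only the LinearAlgebra standard library; the sizes and the names A, F, R_from_factors, and Q_thin are arbitrary choices for illustration.

```julia
using LinearAlgebra

A = randn(6, 3)
F = qr(A)   # dense Float64 input, so this is a QRCompactWY

# The upper trapezoid of `factors` holds R; the entries below the diagonal
# hold the Householder reflectors that implicitly define Q.
R_from_factors = triu(F.factors[1:3, :])

# Q is only materialized on request; Matrix(F.Q) gives the thin (6, 3) factor.
Q_thin = Matrix(F.Q)
```

Here R_from_factors matches F.R and Q_thin * F.R reconstructs A, while F itself never stores a dense Q.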

Support for qr decomposition pullback

4ea8dbc

sethaxen self-requested a review July 12, 2021 19:38

Kolaru mentioned this pull request Jul 22, 2021

Implement QR pullback #306

Open

rkube mentioned this pull request Aug 30, 2021

rrule for casting LinearAlgebra.QRCompactWYQ into a Matrix #516

Open

GiggleLiu mentioned this pull request Sep 11, 2021

Implement lowrank_svd in NiSparseArrays jieli-matrix/SparseADRules.jl#35

Closed

jieli-matrix mentioned this pull request Sep 17, 2021

[Review]: rrule on qr jieli-matrix/SparseADRules.jl#37

Merged

oxinabox reviewed Sep 21, 2021

View reviewed changes

sethaxen closed this Sep 28, 2021

sethaxen reopened this Sep 28, 2021

sethaxen mentioned this pull request Jul 21, 2022

rrule for QR-decomposition #651

Closed
