
RowVector, pinv, and quasi-division #23067

Merged
merged 5 commits into from
Aug 22, 2017

Conversation

martinholters
Member

No tests yet, I first would like feedback whether we actually want (all of) this.

Demo:

julia> c = [1; 2]
2-element Array{Int64,1}:
 1
 2

julia> r = [3; 4]'
1×2 RowVector{Int64,Array{Int64,1}}:
 3  4

julia> s = r*c
11

julia> pinv(c)      # returns Matrix on master
1×2 RowVector{Float64,Array{Float64,1}}:
 0.2  0.4

julia> pinv(c)*c    # returns Vector on master
1.0

julia> pinv(r)      # throws MethodError on master
2-element Array{Float64,1}:
 0.12
 0.16

julia> s/c          # throws MethodError on master
1×2 RowVector{Float64,Array{Float64,1}}:
 2.2  4.4

julia> r\s          # throws MethodError on master
2-element Array{Float64,1}:
 1.32
 1.76

Closes #23028.

@martinholters martinholters added domain:linear algebra Linear algebra needs tests Unit tests are required for this change labels Aug 1, 2017
@martinholters
Member Author

For consistency, we'd then probably also want r/r and c\c (which currently return a RowVector and a Vector, respectively) to return a scalar, and r\r and c/c (which currently throw a DimensionMismatch) to return a Matrix.

The rationale for the proposed change is:
a) if it works for an n×1 or 1×n matrix, it should work for the corresponding (row) vector
b) x/y should have the same type as x*pinv(y), and likewise x\y and pinv(x)*y

Leaving pinv(::AbstractVector) alone (returning a matrix) and adding a pinv(::RowVector) that also returns a matrix would make (r*c)/c == (r*c)*pinv(c) and r*(c/c) == r*(c*pinv(c)) have different types, which also strikes me as undesirable. Opinions?
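For a vector, the pseudoinverse has a simple closed form (the conjugated transpose divided by the squared norm), which makes rationale b) easy to check by hand. A minimal sketch, using a hypothetical helper `vecpinv` rather than the methods this PR adds, reproducing the demo values above:

```julia
# Hypothetical helper mirroring what pinv(::AbstractVector) computes:
# for nonzero v, pinv(v) == v' / sum(abs2, v).
vecpinv(v::AbstractVector) = v' / sum(abs2, v)

c = [1.0, 2.0]
r = [3.0 4.0]            # 1×2 row
s = (r * c)[]            # scalar inner product: 11.0

@assert vecpinv(c) ≈ [0.2 0.4]       # matches pinv(c) in the demo above
@assert s * vecpinv(c) ≈ [2.2 4.4]   # rationale b): s/c should equal s*pinv(c)
```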

CC @andyferris as the father of RowVector 😄

@andyferris
Member

I do like consistency.

One thing I'm wondering about is what / and \ should really be doing when there are multiple answers or no answer. I do get that the current L2-minimization behaviour for matrices is convenient, but it's already quite annoying (to me anyway) that square matrix division is treated as exact while rectangular division is not: I can get run-time errors when I expect the QR behaviour but hit a singular square matrix at run time (typically when expecting rectangular matrices whose sizes depend on run-time data).

So for instance, should vec1/vec2 behave strictly or non-strictly? Outside of rectangular matrix division, are there examples of non-strict / methods in Julia or in packages?

Cc @andreasnoack

@martinholters
Member Author

If / and \ are not strict for rectangular matrices, they probably shouldn't be for vectors.

Relatedly, just noticed this:

# \(A::StridedMatrix,x::Number) = inv(A)*x Should be added at some point when the old elementwise version has been deprecated long enough
# /(x::Number,A::StridedMatrix) = x*inv(A)

which dates back to February 2014 - maybe now is the time.

@andyferris
Member

If / and \ are not strict for rectangular matrices, they probably shouldn't be for vectors.

Right, I don't disagree, and the method would be next-to-useless if it were strict because of numerical rounding errors, etc.

What I was asking was the (heretical?) question of whether non-strict / and \ for rectangular matrices has been a good thing or not. (Since in practice I find I can't use these operators in my code at all unless I can be 100% assured that the matrix isn't square, as shown below.)

julia> [0.0 1.0 0.0; 0.0 1.0 0.0] \ [0.0, 1.0] # rectangular - works
3-element Array{Float64,1}:
 0.0
 0.5
 0.0

julia> [0.0 1.0; 0.0 1.0] \ [0.0, 1.0] # square - doesn't work
ERROR: Base.LinAlg.LAPACKException(1)
Stacktrace:
 [1] chklapackerror(::Int64) at .\linalg\lapack.jl:34
 [2] trtrs!(::Char, ::Char, ::Char, ::Array{Float64,2}, ::Array{Float64,1}) at .\linalg\lapack.jl:3291
 [3] \(::Array{Float64,2}, ::Array{Float64,1}) at .\linalg\generic.jl:815

julia> reshape([1.0, 1.0], (2,1)) \ [0.0, 1.0] # rectangular - works
1-element Array{Float64,1}:
 0.5

So the reason I'm discussing this is that I just want to check that we aren't propagating badness in the name of consistency.

@andyferris
Member

An example of a non-strict / in another setting, Number, would be if 5/3 === 2, since 2 is the integer that minimizes the L2 error. We just don't see this behaviour for numbers; we tend to throw an InexactError (or similar) whenever the division isn't exact.

Just pondering out loud here. Maybe we need a ~/ operator or something... Or maybe I'm worrying overly much...

@StefanKarpinski
Sponsor Member

Regarding the last issue, integer division produces floats, so division of integer matrices should produce float matrices, rendering the InexactError concern moot. Division of integer matrices should produce a float matrix whose values are as close as possible to the true division result.
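This already holds for square systems today (a quick check; integer inputs promote to floating point before solving):

```julia
# Integer matrix division promotes to Float64, so no InexactError can arise:
# the result is the float solution closest to the true one.
A = [2 0; 0 2]
b = [1, 1]
x = A \ b

@assert x isa Vector{Float64}
@assert x ≈ [0.5, 0.5]
```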

@andyferris
Member

andyferris commented Aug 1, 2017

Regarding the last issue, integer division produces floats, so division of integer matrices should produce float matrices, rendering the InexactError concern moot. Division of integer matrices should produce a float matrix whose values are as close as possible to the true division result.

Right, this example was a bit of a distraction, sorry. The precise way of saying what I mean is this:

  • I expect (a / b) * b to be approximately a or else an error
  • I expect b * (b \ a) to be approximately a or else an error

And apart from overconstrained problems involving rectangular matrices I haven't seen this violated elsewhere (but someone please correct me if I'm wrong). Underconstrained rectangular matrix problems aren't an issue - they satisfy the above. My speculation is that perhaps overconstrained matrix (residual L2 minimization) problems could instead be solved with ~/ and ~\ operators or something along those lines (which may also work for singular square matrices).
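A concrete overconstrained example where the second expectation fails silently today (a sketch; the 3×1 system has no exact solution, so `\` returns the least-squares minimizer):

```julia
# Overconstrained: three equations, one unknown, no exact solution.
b = reshape([1.0, 1.0, 1.0], 3, 1)   # 3×1 matrix
a = [0.0, 1.0, 0.0]

x = b \ a                 # least-squares solution, x == [1/3]
@assert x ≈ [1/3]
@assert !(b * x ≈ a)      # b * (b \ a) is [1/3, 1/3, 1/3], far from a
```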

@andreasnoack
Member

I think it is fair to ask if the current polyalgorithm for \ is the right solution but I also think it should be handled separately from this PR. If we decide to change the polyalgorithm then the code in this PR won't make it any harder to implement the change.

@andyferris The potential runtime errors from square \ are annoying but they are also easy to avoid for an expert like you. You just do qrfact(A, Val{true})\b instead of A\b.

We use the LU in the square case because it is quite a bit faster than the QR. You could argue, though, that users who want speed could just do lufact(A)\b. If the main goal was to avoid the runtime error then we could also create Infs in the result if the LU failed but then there'd still be behavioral differences between the square and non-square cases.

Finally, if we decide that \ is generally minimizing the norm of the solution when A is singular, we'd also need to adjust solvers for matrices with special structure. Primarily Diagonal and Triangular. Not impossible but we should be fairly sure that it is really the right thing to do. I tend to think that the current default is reasonable. In particular, because we allow expert users to customize the behavior via the Factorization.
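For reference, a sketch of that workaround on a singular square matrix (spelled `qrfact(A, Val{true}) \ b` at the time of this PR; shown here with the later `qr(A, ColumnNorm())` spelling):

```julia
using LinearAlgebra

A = [0.0 1.0; 0.0 1.0]   # singular square matrix: plain A \ b errors in the LU path
b = [0.0, 1.0]

# Pivoted QR detects the rank deficiency and returns a least-squares solution.
x = qr(A, ColumnNorm()) \ b
@assert A * x ≈ [0.5, 0.5]   # the projection of b onto the column space
```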

@martinholters
Member Author

So there are two questions here:

  1. Should we keep the current behavior of / and \ for rectangular matrices? If not, should we give that operation a new name or operator? Probably yes, but which one?
  2. Assuming we keep this "non-strict" behavior in one way or another, do the changes proposed in this PR make sense?

I'd prefer to discuss only the second question here. Care to open a dedicated issue for the first question, @andyferris?

@martinholters
Member Author

This is getting more and more involved... If we allow '/' for rectangular matrices, I'd expect (a/a)*a to work dimension-wise, but

julia> zeros(3,4) / zeros(3,4) * zeros(3,4) # OK
3×4 Array{Float64,2}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> zeros(3,2) / zeros(3,2) * zeros(3,2) # not OK
ERROR: DimensionMismatch("A has dimensions (3,2) but B has dimensions (3,2)")

Before I start changing that... is there a particular reason for the current behavior I'm not aware of?

@andreasnoack
Member

This is on 0.6, right? There was a recent bug fix for zero matrices, #22831.

@martinholters
Member Author

This is on 0.6, right?

It was on an apparently-not-recent-enough master, so yeah, at least that seems to be resolved, phew.

@andreasnoack
Member

Just to be clear: I think we should go ahead with this PR and continue the general discussion of \ elsewhere.

@martinholters
Member Author

Fixed some loose ends so that the following passes:

let c = [1.0, 2.0], r = [3.0, 4.0]', cm = reshape(c, :, 1), rm = reshape(r', 1, :), m = [1.0 2.0; 3.0 4.0]
    for a in (r, rm), b in (c, cm) # inner products
        @test (a*b)/b ≈ a*(b/b) ≈ (a*b)*pinv(b) ≈ a*(b*pinv(b))
        @test typeof((a*b)/b) == typeof(a*(b/b)) == typeof((a*b)*pinv(b)) == typeof(a*(b*pinv(b)))
        @test a\(a*b) ≈ (a\a)*b ≈ (pinv(a)*a)*b ≈ pinv(a)*(a*b)
        @test typeof(a\(a*b)) == typeof((a\a)*b) == typeof((pinv(a)*a)*b) == typeof(pinv(a)*(a*b))
    end
    for (a, b) in ((c, r), (cm, rm)) # outer products
        @test (a*b)/b ≈ a*(b/b) ≈ (a*b)*pinv(b) ≈ a*(b*pinv(b))
        @test typeof((a*b)/b) == typeof(a*(b/b)) == typeof((a*b)*pinv(b)) == typeof(a*(b*pinv(b)))
        @test a\(a*b) ≈ (a\a)*b ≈ (pinv(a)*a)*b ≈ pinv(a)*(a*b)
        @test typeof(a\(a*b)) == typeof((a\a)*b) == typeof((pinv(a)*a)*b) == typeof(pinv(a)*(a*b))
    end
    @test (m*c)/c ≈ m*(c/c) ≈ (m*c)*pinv(c) ≈ m*(c*pinv(c))
    @test typeof((m*c)/c) == typeof(m*(c/c)) == typeof((m*c)*pinv(c)) == typeof(m*(c*pinv(c)))
    @test r\(r*m) ≈ (r\r)*m ≈ (pinv(r)*r)*m ≈ pinv(r)*(r*m)
    @test typeof(r\(r*m)) == typeof((r\r)*m) == typeof((pinv(r)*r)*m) == typeof(pinv(r)*(r*m))
end

This verifies the consistency I was aiming for. I'll add these tests, plus some that directly verify the results (so that they are not just consistently wrong...), in the next few days.

@@ -795,6 +795,19 @@ function inv(A::AbstractMatrix{T}) where T
A_ldiv_B!(factorize(convert(AbstractMatrix{S}, A)), eye(S0, checksquare(A)))
end

function pinv(v::AbstractVector{T}, tol::Real=real(zero(T))) where T
Member Author

Should probably be tol::Real=eps(real(float(one(T))))*length(v) for consistency with pinv(::StridedMatrix{T}) (and likewise below)

Member Author

Hm, I was under the impression that pinv(::StridedMatrix{T}, tol) ignores all singular values below tol, but the threshold is actually tol * maximum(singular_values), so for the vector case it only matters whether tol < 1. So with the default tol computed consistently with the matrix case, an all-zero vector would be returned if the input is all zeros, which makes sense, or if length(v) > 1/eps(real(float(one(T)))), which strikes me as pretty bizarre.

I think I'd rather keep the tol=0 default than aim for maximum consistency here. Opinions?

Member Author

pinv(::Diagonal) also uses tol=0 by default, so there is precedent for inconsistency here. Users striving for consistency across types should probably give tol explicitly; otherwise, exploiting the information the type conveys to choose a sensible default seems pretty reasonable.

Member

it only matters whether tol < 1

I don't follow the reasoning here

Member Author

A vector has one singular value s, which obviously is the maximum one. That singular value will be ignored (resulting in a zero vector) if s <= s*tol, i.e. tol >= 1 or s == 0.
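That identity is easy to confirm numerically (a sketch treating the vector as an n×1 matrix):

```julia
using LinearAlgebra

# An n-vector viewed as an n×1 matrix has exactly one singular value,
# equal to its Euclidean norm (hence trivially the maximum one).
v = [1.0, 2.0]
@assert svdvals(reshape(v, :, 1)) ≈ [norm(v)]
@assert norm(v) ≈ sqrt(5)
```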

Member

A vector has one singular value s, which obviously is the maximum one.

Of course.

I can see that the length(v) > 1/eps(real(float(one(T)))) condition is a bit weird although it might not be a big practical concern since it would be 32 PB for Float64. However, I did some simulations and I'm wondering if using the maximum length is the right thing to do. It looks like it is the minimum that determines the error and using the minimum would fix the bizarre case for vectors.

@@ -842,10 +855,11 @@ function (\)(A::AbstractMatrix, B::AbstractVecOrMat)
return qrfact(A,Val(true)) \ B
end

(\)(a::AbstractVector, b::AbstractArray) = reshape(a, length(a), 1) \ b
(\)(a::AbstractVector, b::AbstractArray) = pinv(a) * b
Member Author

But should this use tol=0 or the default? (Likewise below)

function pinv(v::AbstractVector{T}, tol::Real=real(zero(T))) where T
res = similar(v, typeof(zero(T) / (abs2(one(T)) + abs2(one(T)))))'
den = sum(abs2, v)
if den <= tol^2
Member Author

For consistency with pinv(::StridedMatrix, tol), this should be iszero(den) || tol >= one(tol). Not sure how much sense that makes, though. Opinions?

Contributor

Did you decide against consistency with the matrix case then? And to stick with default tolerance above?

Also broaden from StridedVector to AbstractVector while at it and don't
employ full matrix SVD.
Also start testing consistency between division and multiplication with
pseudo-inverse involving vectors.
Let \(::AbstractVector, ::AbstractMatrix) return a RowVector and
\(::AbstractVector, ::AbstractVector) return a scalar.
@martinholters martinholters removed the needs tests Unit tests are required for this change label Aug 4, 2017
@martinholters martinholters changed the title RFC: RowVector, pinv, and quasi-division RowVector, pinv, and quasi-division Aug 4, 2017
@martinholters
Member Author

Good from my side. Travis failure is good old arnoldi.

@andreasnoack andreasnoack merged commit f1d3556 into master Aug 22, 2017
@andreasnoack andreasnoack deleted the mh/row_pinv branch August 22, 2017 03:22
(/)(A::AbstractVecOrMat, B::AbstractVecOrMat) = (B' \ A')'
# \(A::StridedMatrix,x::Number) = inv(A)*x Should be added at some point when the old elementwise version has been deprecated long enough
# /(x::Number,A::StridedMatrix) = x*inv(A)
/(x::Number, v::AbstractVector) = x*pinv(v)
Contributor

This changes the result between two versions without any deprecation and has just caught me completely by surprise.

One more piece of evidence that giving a completely different meaning to operations that used to have elementwise meanings makes no sense.

Member

Which versions? It is an error on 0.6.

Member Author

Ah, there used to be /(x::Number, r::Range):

julia> VERSION
v"0.6.0"

julia> 1 / (1:4)
4-element Array{Float64,1}:
 1.0     
 0.5     
 0.333333
 0.25

julia> VERSION
v"0.7.0-DEV.2158"

julia> 1 / (1:4)
1×4 RowVector{Float64,Array{Float64,1}}:
 0.0333333  0.0666667  0.1  0.133333

The old method was removed in #22932 without adding an appropriate deprecation. So I guess we could add a deprecated /(x::Number, r::Range) to still do the element-wise operation, which would be more specific than the method defined here. @yuyichao is that the case that was bothering you?

Contributor

If I may add: when dividing a scalar by a vector, taking the pseudoinverse of the "vector" and then multiplying it by a constant seems very far from intuitive behavior to me. If one is doing this operation and is aware it is taking the pseudoinverse, then explicitly calling pinv seems more appropriate than relying on this (in)convenience.
