
re-allow vector*(1-row matrix) and transpose thereof #20423

Merged · 5 commits · Feb 5, 2017

Conversation

stevengj
Member

stevengj commented Feb 3, 2017

As discussed in #20389, I think that forbidding this case will cause more confusion than it is worth.

stevengj added the domain:linear algebra label Feb 3, 2017
*(::RowVector, ::RowVector) = throw(DimensionMismatch("Cannot multiply two transposed vectors"))
@inline *(vec::AbstractVector, rowvec::RowVector) = vec .* rowvec
*(vec::AbstractVector, rowvec::AbstractVector) = throw(DimensionMismatch("Cannot multiply two vectors"))
*(mat::AbstractMatrix, rowvec::RowVector) = throw(DimensionMismatch("Cannot right-multiply matrix by transposed vector"))
Sponsor Member

Is this implemented somewhere?

Member

So you want to allow this one too, Steven? In this case it would return a matrix, another generalized form of the outer product.

Member Author

@stevengj Feb 3, 2017

@StefanKarpinski, yes, it is implemented and I included a test: it falls back on the generic matrix×matrix since RowVector is a subtype of AbstractMatrix. I allow it because it is the transpose of the other case that we allow, and it is weird to allow one but not the other (x*A but not A'*x').
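A minimal sketch of the symmetry being described (the values here are hypothetical, and the behavior is as of this PR):

```julia
v = [1.0, 2.0, 3.0]
R = [4.0 5.0]    # a 1-row matrix

# v * R acts like an outer product, giving a 3×2 matrix...
A = v * R

# ...and its transpose, R' * v', is the (1-column matrix) * (row vector)
# case, which falls through to the generic matrix*matrix method.
B = R' * v'
```

Here `B` equals `A'`, which is why allowing one form but forbidding the other would be inconsistent.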

@tkelman
Contributor

tkelman commented Feb 3, 2017

We don't usually implicitly convert between different dimensionalities in many other places; I think we should deprecate this. Otherwise, are we going to allow matrix * rowvector too, if the matrix happens to have 1 column?

@andyferris
Member

Tony, Stefan - to clarify, the line removed here will enable matrix * rowvector -> matrix.

(While I am in favour of deprecation, we do at least need to reimplement the method to go with that.)

@stevengj
Member Author

stevengj commented Feb 3, 2017

@tkelman, there was a long discussion about this in the other issue. Short answer (a) we have allowed this particular thing for a long time and many users expect it and (b) yes, this PR allows matrix×rowvector for 1-column matrices too — that is even more "natural" in terms of code — it requires no special methods, because rowvector is already a subtype of matrix. (Disabling matrix×rowvector requires special-case code.)

(Also (c): any short deprecation message will simply cause intense confusion for people coming from fields where they are used to thinking of vectors as equivalent to 1-column matrices when doing linear algebra, and (d) it's hard to say what to replace such code with without knowing where the 1-row matrix came from: e.g. x * ones(1,n) should be replaced with x * ones(n)'. A generic deprecation would probably call reshape, and that would give users the misleading impression that we are requiring them to write ugly reshape code for what should be a simple operation.)
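As a sketch of point (d), assuming the 1-row matrix happens to come from `ones` (names and sizes here are hypothetical):

```julia
x = [1.0, 2.0, 3.0]
n = 4

# vector times a 1-row matrix: the form this PR re-allows
M1 = x * ones(1, n)

# the vector-transpose spelling a deprecation message would suggest instead
M2 = x * ones(n)'
```

Both give the same 3×4 outer-product-like result; the point is that a generic deprecation message could not know which rewrite is appropriate for the user's code.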

@tkelman
Contributor

tkelman commented Feb 3, 2017

We haven't had a rowvector type for all this time that we've allowed dimensionally questionable operations. It's not that hard for people to grasp that in Julia a 1-D vector and a 1-column matrix are different types and behave differently. I thought much of the point of #4774 was to allow us to make the distinction clearer.

@stevengj
Member Author

stevengj commented Feb 3, 2017

@tkelman, just because they are different types and can follow different rules doesn't mean that they should. Many people learned linear algebra by thinking of vectors as following the same rules as one-column matrices, and think of A*x in that way, and learn the outer product x*y' as following the matrix*matrix rules for 1col*1row. Why can't we support this, rather than forcing users to comprehend a higher level of abstraction in thinking of their vector spaces?

A RowVector is a subtype of AbstractMatrix that acts in other ways like 1-row matrix, and vector*rowvector is defined, so in that sense also it is weird for that to work but vector*(1-row matrix) to not work.
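A hypothetical illustration of that equivalence (a sketch, not taken from the PR's test suite; in 0.6 `y'` produces a RowVector, and later releases use Adjoint with the same behavior shown here):

```julia
x = [1.0, 2.0]
y = [3.0, 4.0, 5.0]

outer_rowvec = x * y'                        # outer product via a row vector
outer_matrix = x * reshape(y, 1, length(y))  # same product via an explicit 1-row matrix
```

Both are 2×3 with identical entries; forbidding only the second spelling would make the two behave differently for no visible reason.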

What do you gain by imposing your notion of purity on these operations? A lot of users will be confused and inconvenienced; what is the corresponding gain?

@tkelman
Contributor

tkelman commented Feb 3, 2017

Using the type system to write code that more clearly reflects when it does and does not work, and catches bugs earlier by allowing fewer questionable operations. Users having to think about the dimensionality of their arrays isn't a big ask. The vector-vs-matrix distinction in Julia is one of the many useful differences between us and Matlab, and not all that abstract or complicated.

@stevengj
Member Author

stevengj commented Feb 3, 2017

(Another clue: this PR is a net deletion of code, even counting the additional tests. Supporting fewer operations here requires more code, a further sign that a restrictive view of matvec isn't worth it.)

@stevengj
Member Author

stevengj commented Feb 3, 2017

@tkelman, this is not a Matlab thing. This is a way that some people think about linear algebra. Why do we need to try (and probably fail) to convince them that their way of thinking about linear algebra is "wrong"? (They will no doubt conclude instead that Julia is overly pedantic and annoying.)

@tkelman
Contributor

tkelman commented Feb 3, 2017

In terms of realizing that linear algebra as code: Matlab has no distinction, we do. The question is whether we want that distinction to be reflected in what operations we allow, or to partially hide it only for matvec.

@stevengj
Member Author

stevengj commented Feb 3, 2017

Complex and real numbers are different types in Julia. Are we "partially hiding" that distinction by allowing complex*real? I don't think this is the right way to frame the question.

Many people think of linear algebra as allowing them to multiply vectors and one-row matrices. It doesn't cost us anything to support that viewpoint, and in fact it saves us code. It will confuse them if we forbid it, not because of Matlab, but because this is how they think of matvec operations.

@stevengj
Member Author

stevengj commented Feb 3, 2017

See also #20389 (comment) by @StefanKarpinski

@fredrikekre
Member

Should it be more than twice as slow? That is also something to consider for people who would use this method.

julia> using BenchmarkTools

julia> for N in (10, 100, 1_000, 10_000)
           x = rand(N); y = rand(N);

           yt = y'; ymat = reshape(y, 1, length(y));

           x*yt;
           x*ymat;

           @btime $x*$yt;
           @btime $x*$ymat;
       end
  163.458 ns (1 allocation: 896 bytes)
  326.870 ns (3 allocations: 992 bytes)
  5.138 μs (2 allocations: 78.20 KiB)
  9.544 μs (4 allocations: 78.30 KiB)
  556.765 μs (2 allocations: 7.63 MiB)
  1.333 ms (4 allocations: 7.63 MiB)
  88.486 ms (2 allocations: 762.94 MiB)
  289.008 ms (4 allocations: 762.94 MiB)

@mbauman
Sponsor Member

mbauman commented Feb 3, 2017

I'm in favor of this PR for now. We should re-evaluate this in 0.6 to make a consistent decision alongside trailing singleton dimensions and partial indexing. Either we allow our arrays to masquerade as other dimensionalities as convenient, or we don't.

@stevengj
Member Author

stevengj commented Feb 3, 2017

@fredrikekre, the speed is a separate issue; it's not intrinsic to what operations we allow (since the underlying calculations are essentially identical), it's just a question of what specialized methods we provide.

(Right now, if you treat a RowVector as a matrix in a matmul, it goes through generic code that is not as optimized. I think it is fine for the less-preferred way of expressing something to be slower by a constant factor ... the kinds of operations we're talking about are often not that performance critical anyway.)

@StefanKarpinski
Sponsor Member

We should re-evaluate this in 0.6 to make a consistent decision alongside trailing singleton dimensions and partial indexing. Either we allow our arrays to masquerade as other dimensionalities as convenient, or we don't.

Note that allowing indexing with 1 into non-existent dimensions is quite different than partial linear indexing – in particular, partial linear indexing leaks (possibly invalid) assumptions about memory layout of arrays, whereas allowing indexing into non-existing dimensions with a 1 does not.

@mbauman
Sponsor Member

mbauman commented Feb 3, 2017

Yes, but trailing singleton dimensions are quite like inserting implicit singleton dimensions when you use more than one but fewer than N indices to index into an N-dimensional array. That's what I meant by partial indexing. We need better vocabulary here.

@StefanKarpinski
Sponsor Member

Ok, let's go with this now and then reconsider the whole thing in the 0.6 cycle.

@andyferris
Member

andyferris commented Feb 4, 2017

OK, I agree to merge this for v0.6 and revisit trailing dimensions and this together in the next development cycle (I think implied trailing singleton dimensions may be a hole in the argument in this comment).

@stevengj you missed some methods in other files (triangular and diagonal - see #20429 for exactly where), which really should be dealt with in this PR. There may also be other matrix * rowvector disambiguation methods scattered around linalg/ and sparse/, not sure right now (sorry, I don't have much time at the moment).

Complex and real numbers are different types in Julia. Are we "partially hiding" that distinction by allowing complex*real? I don't think this is the right way to frame the question.

Here, the interface is defined by Number. The question is always which types (or traits) define the interface, and what methods that interface should consist of, and how does this implement concepts users need? We are talking about what interfaces are supported by AbstractVector, AbstractMatrix and RowVector.

@stevengj
Member Author

stevengj commented Feb 4, 2017

@andyferris, done. Note that since diagonal and triangular matrices in Julia are always square, the only function of those methods was to throw an error in the 1×1 case, which hardly seems worth the trouble.
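To see why, a sketch (written with the modern `LinearAlgebra` module name; in 0.6 these types lived in Base): a Diagonal matrix is always square, so vector * Diagonal can only ever succeed when everything is 1×1.

```julia
using LinearAlgebra

v = [2.0]
D = Diagonal([3.0])

# the only shape that can succeed: a length-1 vector times a 1×1 Diagonal,
# which goes through the generic vector-times-matrix fallback
M = v * D

# any longer vector necessarily mismatches, e.g.
# [1.0, 2.0] * Diagonal([3.0, 4.0]) throws a DimensionMismatch
```

So a dedicated error method for this pairing only ever guards the degenerate 1×1 case.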

@@ -80,6 +80,7 @@ function (*){T,S}(A::AbstractMatrix{T}, x::AbstractVector{S})
TS = promote_op(matprod, T, S)
A_mul_B!(similar(x,TS,size(A,1)),A,x)
end
(*)(A::AbstractVector, B::AbstractMatrix) = reshape(A,length(A),1)*B
Contributor

so this method didn't previously exist?

Member

It did. AFAICT, it had a similar implementation modulo inlining. (I have trouble with the mobile version of GitHub, sorry).

@andyferris
Member

@stevengj Thanks, it's still good to remove those error messages I wrote, since they would be nonsensical.

@andyferris
Member

@stevengj I think that the tests can be fixed by similar test changes as in #20429.

@stevengj
Member Author

stevengj commented Feb 4, 2017

@andyferris, I think it should be fixed now: I fixed it by adding generic methods rather than specialized methods for triangular, and by deleting some specialized triangular methods.

@@ -81,6 +81,13 @@ function (*){T,S}(A::AbstractMatrix{T}, x::AbstractVector{S})
A_mul_B!(similar(x,TS,size(A,1)),A,x)
end

# these will throw a DimensionMismatch unless B has 1 row (or 1 col for transposed case):
A_mul_B(a::AbstractVector, B::AbstractMatrix) = A_mul_B(reshape(a,length(a),1),B)
Member

@andyferris Feb 4, 2017

A_mul_B? Is this meant to be *?

Member Author

whoops, right, this doesn't exist

A_mul_B(a::AbstractVector, B::AbstractMatrix) = A_mul_B(reshape(a,length(a),1),B)
A_mul_Bt(a::AbstractVector, B::AbstractMatrix) = A_mul_Bt(reshape(a,length(a),1),B)
A_mul_Bt(A::AbstractMatrix, b::AbstractVector) = A_mul_Bt(A,reshape(b,length(b),1))
A_mul_Bc(a::AbstractVector, B::AbstractMatrix) = A_mul_Bc(reshape(a,length(a),1),B)
Member

Should At_mul_B(::RowVector, ::AbstractMatrix) and friends be part of this list too? Or I guess it works as generic matrix-matrix multiplication?

Member Author

Since RowVector is an AbstractMatrix, it should always work with the generic fallback.

@andyferris
Member

Cool - now LGTM. Thanks Steven.

stevengj merged commit e3bb870 into JuliaLang:master Feb 5, 2017
stevengj deleted the vecmat branch February 5, 2017 23:00
andyferris pushed a commit to andyferris/julia that referenced this pull request Feb 8, 2017