
Introduce RowVector as the transpose of a vector #19670

Merged
merged 2 commits into JuliaLang:master on Jan 13, 2017

Conversation

@andyferris (Member) commented Dec 21, 2016

This is my take on #4774 (taking vector transposes seriously). The transpose of an AbstractVector becomes a RowVector, a single concrete type which is a lightweight wrapper or "view" of the original vector and which, for all purposes outside of linear algebra, is a 1×n sized AbstractArray. The inner product v' * v is a scalar, v'' returns v, and so on. For details of my thoughts while developing this, consider the README for the v0.5-compatible companion package here.
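For instance, a sketch of the intended behavior (the exact display is illustrative):

julia> v = [1, 2, 3];

julia> v'                 # ctranspose of a real vector gives a RowVector
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> v' * v             # the inner product is a scalar, not a 1×1 matrix
14

julia> v'' === v          # transposing twice returns the original vector
true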

I've also taken the liberty of moving most of the definition of transpose and ctranspose to the LinAlg module since ctranspose and transpose are quite arguably specific to linear algebra (as opposed to permutedims which makes sense for an array).

There are a few optimizations left to be made involving the map and reduce used in the inner product, but I don't mind if they are done in this PR or separately later (as "feature freeze" is coming soon) (done). Opinions, comments, feedback, etc. are very welcome, and since I think this is my first serious PR to Julia - forgive me if I messed something up :)

@andyferris (Member Author)

OK, there's definitely some fallout in other tests that needs to be addressed.

@andyferris changed the title from "Introduce RowVector as the transpose of a vector" to "WIP Introduce RowVector as the transpose of a vector" on Dec 21, 2016
@stevengj (Member)

How does this compare to #6837? cc @andreasnoack


# Some overloads from Base
@inline transpose(r::Range) = RowVector(r)
@inline ctranspose(r::Range) = RowVector(conj(r)) # is there such a thing as a complex range?
Member:

I like the approach in #6837 (which doesn't make a copy for ctranspose) better.

@stevengj (Member)

It's also unfortunate that this doesn't handle (2d) matrices.

@andyferris (Member Author)

Hi Steven,

I agree completely with what you are saying - we need to handle conjugation and matrices in a similar manner.

AFAICT #6837 doesn't extend well to making conj(::Vector) a view, or to conj(::Array{T,3}), which seems a shame. I should have discussed this in the OP, but I prepared this PR in the context that the future would hold three separate "wrapper" types:

RowVector
TransposedMatrix # non-conjugating, also living in LinAlg
ConjArray   # Any dimension, living in Base
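
Roughly, as a sketch (field names and exact supertypes are illustrative; only RowVector is actually in this PR):

immutable RowVector{T,V<:AbstractVector} <: AbstractMatrix{T}         # transposed vector
    vec::V
end
immutable TransposedMatrix{T,M<:AbstractMatrix} <: AbstractMatrix{T}  # non-conjugating transpose
    mat::M
end
immutable ConjArray{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}  # lazily conjugated array
    arr::A
end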

However, the final two are optimizations and not semantically important in the linear algebra sense (though the view vs. copy semantic certainly is important for programmers). I think for us as developers it is going to be much easier to have a modular system of simple views rather than type parameters, or an interface that asks "are you conjugated?" of a complicated type (dispatch would be difficult).

I was hoping we could deal with this in a process that goes something like this:

  • A PR for RowVector
  • A PR for ConjArray (perhaps just Conj)
  • A PR for TransposedMatrix (perhaps just Transpose)
  • A PR to remove A_mul_Bc etc. from the parser and from LinAlg

However, I don't see all this happening in 9 days! I thought I'd see if there was an appetite to deal with the linear algebra (row vector) semantics in v0.6 or not, with the "software" side of things (optimizations, view vs. copy, etc) to come later.

@stevengj (Member)

Why should Transpose be distinct from RowVector, as opposed to @andreasnoack's approach of just Transpose{A<:AbstractArray{T,N}} being a subtype of AbstractArray{T,N} for any N?

It's true that ConjArray extends to higher-dimensional arrays, but I'm not sure how important that is, particularly given the fusing conj.(A) syntax for avoiding temporaries. Whereas making the conjugation a field (or type parameter) of Transpose as in #6837 has the advantage of reducing the number of types we need to deal with, and probably reducing code complexity.

@andyferris (Member Author) commented Dec 22, 2016

Maybe I misunderstood - in the Transpose type I thought the transposition was mandatory and the conjugation was optional. In which case we don't have conjugate views in 1D or 2D either - I feel that would be quite unfortunate.

A transposed matrix is just a matrix, but the transpose of a vector in some senses is not "really" a matrix or a vector (at least, that is what I'm aiming for here). To me, it made sense to separate the two concepts at the (primary) type level.

For example, users might want to dispatch on row vectors (or dual vectors) separately, and while Transpose{A} where A <: AbstractVector is possible, it seems a little wordy. I guess a typealias can help here. The other argument for separation is that I can't think of a method that is specialized on Transpose which shouldn't be specialized separately for Transpose{A <: AbstractVector} and Transpose{A <: AbstractMatrix} (there might be a small number, I'm not sure, but it seems the concepts would be unrelated).

> Transpose as in #6837 has the advantage of reducing the number of types we need to deal with, and probably reducing code complexity.

Hmm, this is where I disagree. I think that a modular system would help deal with each operation atomically, and thus reduce code complexity significantly. We still need just a single definition ctranspose(x) = transpose(conj(x)). With RowVector, TransposedMatrix and ConjArray being fully-functional views, for any combination of conj, transpose and ctranspose you just need five versions of * for generic matrix/vector multiplication (matrix-vector, matrix-matrix, rowvector-matrix, rowvector-vector and vector-rowvector) to replace all the generic versions of A_mul_Bc. We would still need some specializations for sparse and strided structures, but that hardly seems to be worsened in any way, an example being:

*(A::Transpose{Conj{M}} where M<:StridedMatrix, B::StridedVector) = ...
*(A::Transpose{Conj{M}} where M<:StridedMatrix, B::StridedMatrix) = ...
...

(I do realize that BLAS is handled in A_mul_B!, but this illustrates that it's pretty easy to map these types directly to the appropriate BLAS call with TRANSA = 'C', TRANSB = 'N', or whatever it is.) For sparse arrays, it might be just

*(A::Transpose{M} where M<:AbstractSparseMatrix, B::AbstractSparseMatrix)

for a generic Julia implementation. Anyway, it seems prettier to separate the Conj part to me.
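
For concreteness, the three RowVector-specific generic methods might look roughly like this (a sketch; the bodies assume this PR's RowVector with its wrapped vector in the vec field, while the matrix-vector and matrix-matrix methods remain the existing generic ones):

*(rowvec::RowVector, mat::AbstractMatrix) = transpose(mat.' * transpose(rowvec))           # rowvector-matrix -> RowVector
*(rowvec::RowVector, vec::AbstractVector) = sum(rowvec[i] * vec[i] for i = 1:length(vec))  # rowvector-vector -> scalar
*(vec::AbstractVector, rowvec::RowVector) = kron(vec, rowvec)                               # vector-rowvector -> matrix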

As to broadcast fusion (like conj.()), if you could suggest a version that includes fused matrix multiplication I think that would be fantastic (that's a serious request, not tongue-in-cheek - it would be seriously awesome!). But you still have the disadvantage of not being able to split your operations over multiple lines of code (or even multiple functions!), which can be quite powerful.

@stevengj (Member)

We have a lot of separate code for Matrix and Vector, but they are still subtypes of the same type.

With a separate Conj type, you would conceivably have to handle Transpose{Conj} and Conj{Transpose} separately, although I guess that conj(A::Transpose) could conceivably canonicalize to Transpose{Conj}.

I guess it's not terrible to have separate Conj and Transpose types, but I'd really strongly prefer that (a) there be only one Transpose type of varying dimensions (not completely separate TransposedMatrix and RowVector types) and (b) that Conj and Transpose be implemented in the same PR to prevent regressions in A'*B and similar operations.

@andyferris (Member Author) commented Dec 22, 2016

OK - I think I could support that; Transpose and Conj would work fine. Upon further thought, having transpose(vector) be "the object that, when transposed again, returns vector" neatly sidesteps many of the subtle semantic issues of RowVector vs DualVector and so on, and still lets us achieve a scalar inner product.

The question now is, do we work like mad to make this happen this week for v0.6? And does anyone know if jb/subtype will merge soon so we can use the new syntax? (I was really enjoying those signatures I wrote above 😄 )

PS - this PR does already attempt to "canonicalize" conj(transpose(x)) = transpose(conj(x)) since I knew we would want it in the future, but we could go to extremes and have the inner constructor of Conj throw an error if it wraps Transpose, to make it more concrete?
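
For illustration, with hypothetical Conj and Transpose wrappers (neither exists yet, and the parent field name is illustrative), that canonicalization could be a single method:

conj(x::Transpose) = Transpose(Conj(x.parent))  # rewrap so that Conj{Transpose} never arises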

@vtjnash (Sponsor Member) commented Dec 22, 2016

It should be merged pretty soon; it just needs some performance testing and review.

Transpose{M where M<:AbstractSparseMatrix}

This type does not have any subtypes (e.g. does not have where on the "outside"); I don't think that's what you want. What you probably wanted is one of:

(*){M <: AbstractSparseMatrix}(A::Transpose{M}, B::AbstractSparseMatrix)
(*)(A::Transpose{M}, B::AbstractSparseMatrix) where M <: AbstractSparseMatrix

@andyferris (Member Author)

You're right... I'm still in a bit of a headspin with the new syntax :)

@andyferris (Member Author) commented Dec 22, 2016

I think this is the one I like:

(*)(A::(Transpose{M} where M <: AbstractArray), B::AbstractSparseMatrix)

I don't see that the brackets should be necessary, though. (edit: xref #18457 (comment))

@@ -0,0 +1,183 @@
immutable RowVector{T,V<:AbstractVector} <: AbstractMatrix{T}
@stevengj (Member) commented Dec 22, 2016:

@alanedelman and I both tend to think that Transpose{T,N,AbstractArray{T,N}} <: AbstractArray{T,N}, so that a RowVector (AbstractRowVector? = Transpose{T,1,AbstractArray{T,1}}) would be a subtype of AbstractVector, i.e. a "1d" array object (an element of the dual space, with conjugation).

Member:

Of course, that necessitates a bit of care, since the assumption AbstractVector == ColumnVector appears in a number of methods, but it doesn't seem too bad. There would need to be specialized +, *, and (most annoying) various broadcast_foo methods.

@@ -349,6 +349,15 @@ A_mul_B!(C::AbstractVector, A::BiTri, B::AbstractVector) = A_mul_B_td!(C, A, B)
A_mul_B!(C::AbstractMatrix, A::BiTri, B::AbstractVecOrMat) = A_mul_B_td!(C, A, B)
A_mul_B!(C::AbstractVecOrMat, A::BiTri, B::AbstractVecOrMat) = A_mul_B_td!(C, A, B)

@inline \(::Diagonal, ::RowVector) = error("Cannot left-divide matrix by transposed vector")
Sponsor Member:

I don't think you need to inline a function that just throws an error.

@stevengj (Member) commented Dec 22, 2016

We had a long discussion today with @JeffBezanson, @andreasnoack, and @alanedelman. We initially leaned towards making a RowVector an AbstractVector, but after Alan left we began leaning in the other direction, making a RowVector an AbstractMatrix as in this proposal. Making it an AbstractMatrix seems less likely to cause regressions, and will require less code to be modified, than making it an AbstractVector (since there is lots of linear-algebra code that assumes every AbstractVector is a column vector).

@inline check_tail_indices(i1, i2, i3, is...) = i3 == 1 ? check_tail_indices(i1, i2, is...) : false

# Some conversions
convert(::Type{AbstractVector}, rowvec::RowVector) = rowvec.vec
Sponsor Member:

Are we sure we want this? There aren't currently any conversions between arrays with different numbers of dimensions. If there were, I suspect we'd only allow converting (n,)-size arrays to (n,1), not (1,n)-size arrays to (n,).

Member Author:

I agree, this isn't very useful - you can just type .'

@inline *(::RowVector, ::RowVector) = error("Cannot multiply two transposed vectors")
@inline *(vec::AbstractVector, rowvec::RowVector) = kron(vec, rowvec)
@inline *(vec::AbstractVector, rowvec::AbstractVector) = error("Cannot multiply two vectors")
@inline *(mat::AbstractMatrix, rowvec::RowVector) = error("Cannot right-multiply matrix by transposed vector")
Member:

Why are all of these error definitions necessary? We already throw errors for multiplying matrices and vectors of incompatible shapes, and since RowVector <: AbstractMatrix the existing code should just work (i.e. continue to throw errors), no?

Member Author:

I wanted users to see that semantically we only allow

  • matrix * matrix
  • matrix * vector
  • rowvector * matrix
  • rowvector * vector
  • vector * rowvector

So it's less about the specific shapes and more about the semantics of the type. I'm also deprecating (well, there should be a deprecation warning, not an error, but I haven't got to it yet) the vector * matrix method in favor of the last item on the list.

Ideally, there should only be four nice error message methods (a single one for each of vector * matrix, matrix * rowvector, vector * vector, rowvector * rowvector), but they are repeated (a) because of ambiguities and (b) because of all the different A_mul_Bc methods.

Member:

@andyferris, but there are already error messages for multiplications that deviate from these. If you want to improve the error messages, can't that be a separate PR?


# Multiplication #

@inline *(rowvec::RowVector, vec::AbstractVector) = reduce(+, map(At_mul_B, transpose(rowvec), vec))
Sponsor Member:

sum( rowvec.vec[i].' * vec[i] for i = 1:length(vec) ) to fuse the map and reduce?

Member Author:

Right! I always forget to use generators.

end
A
end
@inline transpose(sv::SparseVector) = TransposedVector(sv)
Member:

-> RowVector

Member Author:

thanks


# Multiplication #

@inline *(rowvec::RowVector, vec::AbstractVector) = reduce(+, map(At_mul_B, transpose(rowvec), vec))
@stevengj (Member) commented Dec 22, 2016:

I guess you are doing this in order to support vectors of matrices and similar, but it seems like we will also want to define specialized methods for RowVector{<:Number} * AbstractVector{<:Number}. e.g. for BLAS scalars we want to call BLAS.dotu.

(Even in the general case, can't you use mapreduce with zip(rowvec.vec, vec) to avoid the temporary arrays, or sum with a generator as suggested above?)
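
Those two suggestions might look roughly like this (a sketch only, using 0.6-era names and assuming this PR's RowVector with its wrapped vector in the vec field):

# Generic case: fuse the map and reduce over a zip, avoiding temporary arrays
*(rowvec::RowVector, vec::AbstractVector) =
    mapreduce(t -> At_mul_B(t[1], t[2]), +, zip(rowvec.vec, vec))

# BLAS complex scalars: the unconjugated dot product maps onto BLAS.dotu
function *(rowvec::RowVector{Complex128}, vec::StridedVector{Complex128})
    length(rowvec) == length(vec) || throw(DimensionMismatch("vector lengths differ"))
    return Base.LinAlg.BLAS.dotu(length(vec), rowvec.vec, 1, vec, 1)
end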

Member Author:

Yes, and yes - agreed.

@musm (Contributor) commented Dec 22, 2016

> We had a long discussion today with @JeffBezanson, @andreasnoack, and @alanedelman. We initially leaned towards making a RowVector an AbstractVector, but after Alan left we began leaning in the other direction, making a RowVector an AbstractMatrix as in this proposal. Making it an AbstractMatrix seems less likely to cause regressions, and will require less code to be modified, than making it an AbstractVector (since there is lots of linear-algebra code that assumes every AbstractVector is a column vector).

Care to share the preference for RowVec <: AbstractMatrix over RowVec <: AbstractVector? Couldn't that also potentially break future code for functions taking x::AbstractMatrix? I don't see why RowVec <: AbstractMatrix should be preferred over RowVec <: AbstractVector.

@stevengj (Member) commented Dec 22, 2016

@musm, the basic arguments went like this, as I recall:

If you make RowVector <: AbstractVector:

  • Pro: there is some mathematical elegance, because it is natural to think of a row vector as belonging to the dual space of the column vector, and as such (à la Riesz) it is natural to think of the row vector as a "single-index object". (You can also think of it as raised vs. lowered indices, which doesn't change the number of indices.)

  • Con: For any code that cares about the "shape" of a vector (row vs. column vector), e.g. linear-algebra code or broadcast, it can't just operate on AbstractVector — it will have to check whether the AbstractVector is a row vector (or define a RowVector method) lest it yield wrong/unexpected results. This is a lot of code, and will be a continuing imposition on more and more code as time passes (as people write new linear-algebra libraries, for example, they will always have to check this on every AbstractVector argument). Worse, unmodified code will silently do the wrong thing.

If you make RowVector <: AbstractMatrix:

  • Con: not quite as elegant mathematically. There is a natural isomorphism from 1×n matrices (linear operators on n-component column vectors) to elements of the column-vector dual space, but they are not quite the same thing (one produces a scalar, the other a 1×1 matrix). Still, we can define specialized methods to get key properties "right", like RowVector * AbstractVector = scalar etc. and transpose(transpose(x)) === x.

  • Pro: Linear-algebra code and broadcast and other shape-dependent code will mostly work as-is, with no additional checks/methods. At worst, if someone doesn't write a specialized RowVector method, then it will continue to do what it does now—might be slightly suboptimal (e.g. somewhere you might get a 1×1 matrix instead of a scalar), but it will still mostly do what you want. And the core operations that need to depend on RowVector, like *(RowVector, AbstractVector), can mostly be modified once at an abstract level (e.g. to call dot) and subsequent AbstractVector types (assuming they implement dot etc.) will mostly inherit the "correct" behavior without having to write specialized RowVector methods. Moreover, failures are loud: if some operation yields a matrix where a scalar or column vector was expected, it will generally die with a MethodError (this is what happens now if you do e.g. y -= y * (y'x) where x and y are vectors).

Put another way, there are very few methods that take an AbstractVector argument where it is useful to be able to pass a RowVector and have it do the same thing. In contexts that don't care about "shape" (row vs column), like string processing of an array of bytes from a file, it is not likely that anyone would want to pass a row vector in the first place. So you don't gain much by that approach, and you pay a huge price by imposing a new requirement on every function that does care about shape.

@stevengj (Member) commented Dec 22, 2016

Another argument in favor of RowVector <: AbstractMatrix, from @JeffBezanson, is that it can be viewed as an incremental step in the right direction. x' is a matrix already, and this just makes it a "better" matrix. If we merge this, and it turns out that it is not enough—that there are lots of places where we would want it to be an AbstractVector, then we can take the next step of changing the dimensionality.

@andyferris (Member Author)

Well said, @stevengj. I agree 100% with the above, but couldn't express it as elegantly.

I also do view this as an incremental step. I have a "feeling" that a 1D dual vector would be a bit easier to deal with in a language with traits, so this could be revisited later if a row-shaped vector seems wrong in the future.

@andyferris (Member Author) commented Dec 22, 2016

Regarding changing this to a general Transpose type, would people want either of

  • typealias RowVector{T, V <: AbstractVector} Transpose{V, T}
  • typealias TransposedMatrix{T, M <: AbstractMatrix} Transpose{M, T}

to be exported?

@stevengj (Member)

If RowVector <: AbstractMatrix and TransposedMatrix <: AbstractMatrix, then maybe they should be separate types after all.

@andyferris (Member Author)

Oh dear - now we go full circle! 😄

I think either way can be made to work. For example, if they are separate types, we can even do

typealias Transpose{A,T} Union{RowVector{T,A}, TransposedMatrix{T,A}}

lol

@stevengj (Member)

Sounds like we're ready to merge.

@stevengj merged commit 3d6fa35 into JuliaLang:master on Jan 13, 2017
@stevengj (Member)

Would be great to have a Conj PR to go with this one.

@andyferris (Member Author)

Aim to do so within a week (hopefully less).

@andyferris (Member Author)

See #20047

if length(rowvec) != length(vec)
throw(DimensionMismatch("A has dimensions $(size(rowvec)) but B has dimensions $(size(vec))"))
end
sum(@inbounds(return rowvec[i]*vec[i]) for i = 1:length(vec))
Member:

Should we be calling dot for two real vectors? This should produce the same result, of course, but is there a performance difference (especially when BLAS is called)?

Member Author:

Yes, good idea, I will address this in the new PR, along with the complex case.

I suppose I should go measure this, but is there much difference between native Julia and BLAS for level-1 operations, and if so, why? (BLAS might be multithreaded?) And what about the outer product, which is now broadcast - maybe there is a faster BLAS outer product? (Would it be a rank-1 update on a zeroed array, or a direct op?)

Member:

BLAS might be multithreaded, but mainly it is the fact that it is hand-coded for SIMD. @simd can make up for this gap, but not always, and in generic code like you used above you don't have @simd anyway.

Member:

For broadcast operations, the advantage of BLAS1 is wiped out by fusion in typical circumstances. Of course, it would be great if broadcast could exploit SIMD well too, but we are dependent on the compiler for this.

rv = v.'

@test (rv*d)::RowVector == [2,6,12].'
@test_throws Exception d*rv
Contributor:

Please make these tests check for a specific exception type! This test would pass even if your implementation had a typo in it.

# This is broken by the introduction of RowVector... see brittle comment above.
#@testset "issue #16548" begin
# @test which(\, (SparseMatrixCSC, AbstractVecOrMat)).module == Base.SparseArrays
#end
Member:

Rather than simply commenting this test out, replace it with another test or set of tests that capture its spirit?

@test (cz*mat')::RowVector == [-2im 4 9]
@test_throws Exception [1]*reshape([1],(1,1))' # no longer permitted
@test cz*cz' === 15 + 0im
@test_throws Exception z*vz'
Contributor:

For example, vz here is a typo, and this test still passed because Exception is far too broad a type to be testing for.

Member Author:

Yes, I had to chuckle when I found this after you wrote about this possibility earlier. FYI, it's being addressed in #20047.

@StefanKarpinski (Sponsor Member)

@andyferris: I want to say that this is – despite my waffling about whether we should do this or something else – a pleasure to work with. Thanks for all the hard work on bringing it to fruition!

@andyferris (Member Author)

Thanks Stefan - that means a lot to me :)

andyferris pushed a commit to andyferris/julia that referenced this pull request Feb 3, 2017
This used to be the outer product but was removed by JuliaLang#19670. It makes
much more sense to deprecate this than make it a breaking change.
@inline size(rowvec::RowVector, d) = ifelse(d==2, length(rowvec.vec), 1)
@inline indices(rowvec::RowVector) = (Base.OneTo(1), indices(rowvec.vec)[1])
@inline indices(rowvec::RowVector, d) = ifelse(d == 2, indices(rowvec.vec)[1], Base.OneTo(1))
linearindexing{V<:RowVector}(::Union{V,Type{V}}) = LinearFast()
Contributor:

shouldn't this be looking at the linearindexing of the wrapped type?

Sponsor Member:

No, this is what we want. It only wraps vectors, but is itself a two-dimensional object. If it passed this through, then it'd have pessimized performance for LinearSlow types unnecessarily.

Contributor:

Then are all AbstractVector types LinearFast? That currently isn't the case for e.g. SparseVector.

Sponsor Member:

Right, this is something that'd be nice to change at some point. LinearFast (read: index-with-one-dimension) and LinearSlow (read: index-with-N-dimensions) represent the same thing when N=1. It'd be worth seeing if the performance of SparseVector changes if you make it LinearFast — it may end up hitting simpler methods that way. If so, we could change the default for all vectors. And maybe eventually re-work this to be clearer (#20175 (comment)).

@Jutho (Contributor) commented Aug 4, 2017

Hi Andy, I don't think I ever thanked you for this. I think we are all happy with this solution.

One final comment that I wanted to make, rather now than after Julia v1.0, is that I did come up with one other solution to the issue of (vector) transposes. I didn't want to pollute that infamous issue any further now that it has finally found its peace, so I am posting it here. That solution is to have two parameters, N1 and N2, in (Abstract)Array, where N1 represents the number of normal indices and N2 the number of dual indices.

Linear algebra would then primarily take place with Vector{T} = Array{T,1,0}, Matrix{T} = Array{T,1,1} and RowVector{T} = Array{T,0,1}, whereas most other multidimensional code would probably use Array{T,N,0}, but would actually only care about the total dimensionality N1+N2.

@andyferris (Member Author)

Thank you, @Jutho! :)

Yes, we've certainly considered more complex combinations of dual and normal dimensions as type parameters. I remember a chat with @jiahao and @mbauman where this was discussed, and Jiahao's talk at this year's JuliaCon ended by mentioning this.

My understanding is that v1.0 will happen as soon as practically possible (to just implement any foreseeable breaking changes and other "essential" features, bug fixes/correctness/consistency issues and performance bottlenecks). I recommend we play with this stuff in community packages and if we come up with an awesome design for dual spaces and multi linear algebra, we can propose changes for v2.0. For obvious reasons, you'd be a great person to help out with or to discuss such a design :)

For my part, I'm slightly confused about the complex case. If I have a complex matrix, the adjoint (ctranspose?) essentially takes the dual of both indices, no? What is the object (and the elements of the resulting array) where I take the dual of just one space but not the other? I'm sure there is a simple diagram for this, of course...

@Jutho (Contributor) commented Aug 4, 2017

Actually transpose is the harder one. With the use of two parameters N1 and N2, the convention would be that the first N1 indices are normal, and the next N2 indices are dual.

If you have a matrix / linear map f:A->B between vector spaces A and B, that's an element of B ⊗ dual(A), i.e. N1=N2=1 in that language. Ctranspose maps f to a map f':B->A, i.e. an element of A ⊗ dual(B), so still a matrix with N1=N2=1.

Transpose, on the other hand, maps f to f.':dual(B)->dual(A), an element of dual(A) ⊗ B. That's the harder one to represent if you cannot actually encode which indices are dual and which are not. For real vector spaces, dual(A)=A and therefore there is no distinction, though it is useful to keep it in practice, so that rowvector * vector = scalar and vector * vector = error.
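
In symbols, the three maps above are:

\begin{aligned}
f      &: A \to B,                               & f      &\in B \otimes \mathrm{dual}(A) \\
f'     &: B \to A,                               & f'     &\in A \otimes \mathrm{dual}(B) \\
f^\top &: \mathrm{dual}(B) \to \mathrm{dual}(A), & f^\top &\in \mathrm{dual}(A) \otimes B
\end{aligned}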

I've been reading a bit about diagrammatic notation and category theory to formalize my understanding: https://arxiv.org/abs/0908.3347v1

@andyferris (Member Author)

OK - thanks, I'll check out that reference.

So the discussion has been that we're more-or-less ditching transpose for linear algebra to replace it with adjoint (formerly ctranspose) and conj(adjoint(mat)) for the "non-conjugating transpose". So this might work out well.

@martinholters (Member)

I wonder whether a syntactically nice way of getting a row of a matrix as a RowVector would be worthwhile. Not that I've missed it personally, though. But getting it via A[5,:].' could be a mathy context where you actually need transpose instead of adjoint.

@StefanKarpinski (Sponsor Member)

That's a fair point – you just want a row slice there, no conjugate.

@andyferris (Member Author)

We definitely need a way of getting a RowVector straight out of a matrix. The best I've thought up is row(A, 5).

@StefanKarpinski (Sponsor Member) commented Aug 7, 2017

It's going to be really confusing that A[5,:] and row(A,5) do different things, I'm afraid.

@andyferris (Member Author)

These "taking seriously" issues highlight that getting what we want for arrays and what we want for linear algebra simultaneously will be challenging.

On one hand we want APL indexing. So, if we introduce dual vectors as 1D arrays, then having dual vectors come out of indexing by default at least satisfies the AbstractArray interface. OTOH we might have "data" in the array, and we don't want recursive adjoint and other linear algebra ideas creeping into array indexing - users would be confused why, for this one indexing operation, they get a dual vector instead of whatever similar indicates.

So I'm wondering if we should leave the AbstractArray interface alone and introduce the necessary functions to LinAlg that we need to do linear algebra. The row function would be a linear algebra function, like adjoint. Data applications would use indexing (and transpose).

@StefanKarpinski (Sponsor Member)

Ideally we could just have A[5,:].' give us a row vector.
