Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Operators for tensor sum and tensor product in Julia 0.5 #13333

Closed
andyferris opened this issue Sep 27, 2015 · 15 comments
Closed

RFC: Operators for tensor sum and tensor product in Julia 0.5 #13333

andyferris opened this issue Sep 27, 2015 · 15 comments

Comments

@andyferris
Copy link
Member

Following the discussion on Arrays in Julia 0.5 #13157 , I want to suggest including new operators for the tensor sum (or direct sum) operation and for the tensor product (outer product).

The tensor sum (direct sum) is a way of combining both vector spaces as well as tensors (vectors, matrices or higher order arrays) of the same order. The tensor sum of two vector spaces V1 ⊕ V2 simply concatenates their two bases. Similarly, the tensor sum of two vectors v1 ⊕ v2 is their concatenation, vcat(v1,v2). For higher order tensors (higher dimensional arrays), all the vector spaces associated with each dimension are concatenated, and the elements of the arrays are filled in their original positions, resulting in a sparse (block diagonal) tensor. For example, the tensor sum of two matrices A ⊕ B is [A 0; 0 B]. Care would need to be taken for row vectors stored as 1-by-n arrays, but hopefully this will be resolved separately in Julia 0.5. For tensor sum, I suggest we use the usual symbol (use \oplus in the REPL), but we could also consider ++ or other operations (please suggest). The tensor sum operator might be useful in other concatenation contexts, such as string concatenation (since ⊕ is non-commutative and is basically the correct monoid for concatenating vectors of characters), but this is not the main point at this stage.

The tensor product (outer product) on vector spaces V1 ⊗ V2 creates a new basis with elements that are all possible outer products of the basis elements of V1 and of V2, and so the dimension of the resulting vector space becomes the product of the two individual dimensions. Julia already implements kron() which is the same idea, but restricted to only vectors and matrices. We could generalize kron and use (\otimes in the REPL) or ** (or something else) to implement this. Tensor products might be similar to other operations in other contexts - for example the outer join in relational algebra behaves similarly, so we could implement that for DataFrames.

Tensors and multilinear algebra are described by bi-monoidal categories (with the two monoids ⊗ and ⊕) so these to operations will make that a little simpler to write and reason about. Besides, these operators may have use in other fields where concatenation or outer products make sense. What is the policy on adding new operators to Julia? Is this frowned upon? Can we use unicode symbols, or must we continue with multiple ASCII symbols (which I find a bit ugly...)? Does anyone else think this is useful? If the community wants this feature, I could work on it and generate a PR (though the parser confuses me...)

@mbauman
Copy link
Sponsor Member

mbauman commented Sep 27, 2015

This can all be done in a package on 0.4, as these operators already exist (but aren't assigned a meaning):

julia> module TensorOperators
       export ++, , 
       ++(A::AbstractArray, B::AbstractArray) = vcat(A, B)
       const  = ++
       (A::AbstractArray, B::AbstractArray) = kron(A, B)
       end
TensorOperators

julia> using TensorOperators

julia> ([1,2,3] ++ [4,5,6])  [1 2 3 4 5 6]
6x6 Array{Int64,2}:
 1   2   3   4   5   6
 2   4   6   8  10  12
 3   6   9  12  15  18
 4   8  12  16  20  24
 5  10  15  20  25  30
 6  12  18  24  30  36

@andyferris
Copy link
Member Author

Excellent, thank you @mbauman, that makes implementation easy (I hadn't looked at operators in 0.4 yet... silly me)

The only thing left is this: do people feel these linear algebra primitives (⊕ and ⊗) should be included in Base, since they are pretty fundamental (and IMO useful)?

@KristofferC
Copy link
Sponsor Member

I don't think it should be included in Base. Depending on the context these operators could do different things, for example Voigt tensor.

@andyferris
Copy link
Member Author

Sure, if you have a tensor stored in some different storage format, you would have to implement the operators themselves for your type - if you wanted to use them. Same would go for hcat and vcat, though, right? The output shouldn't be ambiguous given the input types (the tensor sum of two symmetric matrices is still a symmetric matrix, for instance, but you can't just concatenate the Voigt vector representations without further information like a block-diagonal representation or padding a larger matrix with zeros. But Julia would already know they are some VoigtArray{T,N} type, so that's ok).

For vectors, I would think of ⊕ as a convenient way of accessing vcat and for vectors/matrices ⊗ is convenient way of accessing kron. I would hope extensions to other types would be unambiguous and mathematically correct. We use other common math symbols like ∈ (\in) in Base as duplicating existing functions or keywords for convenience and to add a mathematical style to the language, so why not these also?

@nalimilan
Copy link
Member

I think the trend is to reduce the size of Base by moving code to separate modules, not the other way. The plan is to have a set of modules installed/loaded by default, among which e.g. sparse matrices, linear algebra, etc. So if you create a Tensors.jl package and it becomes popular, it will be easy to ship it with Julia by default later. Being in Base doesn't give you any special privilege other than that.

@stevengj
Copy link
Member

See also #3250 and https://github.com/Jutho/TensorOperations.jl ... but that is a little different, focused on contractions.

@stevengj
Copy link
Member

In general, for defining these in Base I'm worried that the meaning of ⊕ and ⊗ is not universally agreed upon in different mathematical contexts. (In contrast, there is no widespread meaning for other than in as far as I know.)

e.g. Trefethen's Numerical Linear Algebra book uses them to denote floating-point (rounded) addition and multiplication (as opposed to exact addition and multiplication).

Even in the tensor context, whereas in some contexts you want of two matrices to mean kron, in other contexts you might want A ⊗ B to produce some sort of Tensor(A, B) object that encodes a linear operation acting directly on matrices (equivalent to the kron(A,B) matrix acting on matrices reshaped into vectors, but potentially more efficient by avoiding the need to store kron(A,B)).

@stevengj
Copy link
Member

Note that const ⊗ = kron works perfectly fine in 0.3 as well. 0.3 is when the Unicode infix operators were added (#6929).

@ScottPJones
Copy link
Contributor

is also used to denote XOR as well, but does that necessarily mean that Julia couldn't pick say the most common usage?
Would the use of it for tensor (direct) sum and for tensor (outer) product be considered the most frequent case?
For , searching with Google, I've only found it used as meaning "earth", "XOR", and "direct (or tensor) sum".
Is that particular book's convention used elsewhere?

@hayd
Copy link
Member

hayd commented Sep 28, 2015

They're also used in groups and rings. I don't think julia needs to claim them.

@ScottPJones
Copy link
Contributor

Are groups and rings represented in Julia with different types (in which case, where is the problem?), or all they all just Vector{SomeT} or Array{SomeT,dim}?

@jiahao
Copy link
Member

jiahao commented Sep 28, 2015

  1. Tensor sum is not always synonymous with direct sum. For example, the mathematical literature uses the former term to mean the particular quantity

    screen shot 2015-09-28 at 1 43 50 pm

    Reference: http://www.emis.ams.org/journals/AMUC/_vol-80/_no_1/_kubrusly/kubrusly.pdf

  2. It's quite common to use ⊕ and ⊗ as exact addition and multiplication in the numerical analysis literature. In addition to Trefethen's book, Knuth's TAoCP uses this exact same notation to introduce floating point arithmetic. Also common is their use to define a general tropical algebra over a scalar semiring. It's best to avoid operator punning, which is exactly what would happen if you wanted ⊕ to mean XOR, direct sum, max, or exact addition depending on the types of its operands.

  3. Groups and rings are generic algebraic structures exhibited by quite a few base Julia types and it is common to use different notation other ⊕ and ⊗ for more familiar types, e.g. + and * on reals, to mean the same thing as ⊕ and ⊗ in a formal algebraic setting. See my julia-dev post on the formal algebra of types where I had explored the idea of computing algebraic structures of Julia types in an IPython notebook.

@Jutho
Copy link
Contributor

Jutho commented Sep 28, 2015

I agree that this should not be in Base; specific implementations belong in a package (e.g. I also explored this last year also in my mostly unfinished TensorToolbox.jl ). I don't see that there is a unique generalisation to tensors anyway, so this could at most be some shorthand notation for kron etc, or you need to make a specific choice which might actually only match with kron up to a reshape and permutedims.

@stevengj
Copy link
Member

Closing as the consensus seems clearly against.

@andyferris
Copy link
Member Author

Thank you all very much for you comments.

I would have closed this some time ago but I've been travelling. (If I had of known the symbols were so ambiguous, I wouldn't have opened it in the first place...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants