Skip to content

Row vectors

toivoh edited this page Apr 13, 2012 · 67 revisions

Row vectors

The scope of this suggestion is the workings of matrix algebra (in the case ndim <= 2). The underlying idea is that there is a fundamental difference between e g a row matrix, which is a matrix that happens to have only one row, and a row vector; an actual vector. A RowVector type is introduced to make this distinction; to the effect that common operations that return scalar results conceptually will return actual scalars, obviating the need to downconvert from 1x1 matrices, ndim=0 matrices, column matrices and the like in Julia code.

Assumptions: The operations of * for matrix multiplication and tranpose ' belong primarily to the domain of ndim <= 2 arrays.

Suggestion:

Create a RowVector type to represent a transposed, but still 1d, column vector. (Subtype of AbstractArray?) This will behave exactly like a 1xn Array (row matrix) in all cases except those given below, and promote to one in cases when no other behavior has been specified here.
Introduce hasdim to check the presence of a particular dimension:

hasdim(A::Any, k::Integer)         = false
hasdim(A::Array{T, n}, k::Integer) = (1 <= k <= n)
hasdim(A::RowVector, k::Integer)   = (k == 2)

Transpose: column'::RowVector, row'::Array{T,1}. This lets x'' = x even for column vectors.

Concatenation: [x y z] invokes row concatenation rowcat: produces a RowVector if all elements are scalars and/or row vectors; otherwise it behaves like hcat. hcat is still invoked by [x y z;] and [x;]; vcat is still invoked by [x].

Multiplication x*y: If either operand is scalar (or ::Array{T,0}), perform scalar multiplication (the other operand can have any ndim);
otherwise matrix multiplication on x*y, regarding them as matrices (error if the matrix dimensions don't match). The result type follows from

hasdim(x*y,1) = hasdim(x,1)
hasdim(x*y,2) = hasdim(y,2)

i e

typeof(x*y):          hasdim(y,2) = false   hasdim(y,2) = true
hasdim(x,1) = false   Scalar                RowVector
hasdim(x,1) = true    ColumnVector          Matrix

Addition: row + scalar (scalar including Array{T,0}) and row1 + row2 produce RowVectors; in all other cases row vectors are promoted to row matrices. Error if the operands have different sizes.

Motivation:

There is a difference in identity between scalars, vectors and matrices (and it's useful!)

Motivating example: Consider writing Matlab code for operations in an n-dimensional vector space.
Depending on their dimensions, one would classify variables as

1x1: scalar,         1xn: row vector 
nx1: column vector,  nxn: matrix   

Different operations on an object will make sense depending on this classification; the reason for most cases when operations are overloaded based on singleton dimensions in Matlab is to try to support this distinction!

Can we still distinguish between scalars, vectors and matrices when it happens that n=1? I believe so:

When a 1x1 or 1xn matrix is created, it's usually as a special case of a procedure that would produce non-singleton matrices. When column vectors are created, it is usually by a procedure that would never produce a multicolumn matrix; the intended type is mostly apparent from the construction!

  • Scalars are commonly produced by numeric literals, indexing an element of an array, ...
  • Row/column vectors are commonly produced by row/column concatenation, column vector construction: Array(T, n), matrix slicing, zeros(n), ...
  • Matrices are commonly produced by matrix concatenation, eye, zeros(m, n), ...

Propagating this classification will often infer the correct classification of computed results, e g produce actual scalars for

 x´*y,   x'*A*x   # x column vector, A matrix

and actual column vectors for

x'',     x*y'*z   # x, y, z column vectors

This should save a lot of trouble of e g having to convert 1x1 matrices or zero-d arrays into scalars.
With stricter semantics that would disallow e g row_vector + row_matrix, or column_vector*row_matrix, it could also catch a number of bugs.

Adding a special row vector type

The extra RowVector type undeniably adds some degree of complexity, but I think it can be more than worth it to avoid spurious addition of singleton dimensions to results. I don't see how this could be accomplished without keeping track of the fact that x' in x'y is a transposed column vector rather than a 1-row matrix.

The intent of RowVector is to be minimally intrusive, functioning almost completely as a row matrix ::Array{T,2}. The type difference serves to tag the result as a row vector, so that operations that care can propagate this information.
Also the redefined operations are intended to be minimally intrusive: no row vectors are created (and nothing will behave differently) unless column vectors are transposed, or scalars concatenated using [a b c] rather than [a b c;].

Futher benefits: Having a special RowVector type allows to enforce row vector arguments to a function with r::RowVector.

Discussion:

Some common use cases not caught by the rules: a row vector is intended, but it's actually created as a row matrix:

Gives a row matrix:        Use instead:
Array(T, (1,n))     -->    RowVector(T, n)
zeros(1,n)          -->    zeros(n)'
[1;]                -->    [1]'

These would have to be changed manually to take full benefit of the system. With strict compatibility rules, many such constructions would show up as errors.

Strict or loose size compatibility in multiplication and addition: A stricter alternative would be to
give an error on the matrix product x*y if hasdim(x, 2) != hasdim(y, 1), (one is given if size(x, 2) != size(y, 1)) and
give an error on the sum x+y if hasdim(x, k) != hasdim(y, k) for any k.

  • This would definitely help catch some errors that would otherwise corrupt the classification of results, avoiding to e g pull a scalar out of a 1x1 matrix down the line, or go hunting for the inconsistency backwards from the point of error.

  • It could also help to catch some otherwise hard-to-find bugs.

  • It preserves distributivity. Allowing scalar + vector = vector2 destroys distributivity regarding classification:

      scalar2 = row*(scalar + column)  !=  row*scalar + row*column = row2
    

The advantage with the looser compatibility rules given in the suggestion is that they are consistent with the regular Julia behavior of adding trailing singleton dimensions when needed to make operands compatible. (The promotion RowVector -> row matrix could be seen as promoting the missing first dimension to a singleton.)

Cons:

It seems that RowVector might end up so capable that it would be a serious alternative to column vectors for representing plain vectors. We don't want this! Having fewer operations defined on it than on column vectors would create some preference, anyway. It might be enough that row vectors must be indexed with r[1,n], ndims(r) = 2, etc.

Clone this wiki locally