Vectors vs. matrices

mars0i edited this page Jan 13, 2014 · 10 revisions


core.matrix supports both vectors (1D arrays) and matrices (2D arrays). This page explains the differences between the two.

Matrix properties

  • Can be thought of as a 2D nested array, e.g. [ [1 2 3] [4 5 6] [7 8 9] ]
  • Always have dimensionality 2; equivalent to a 2D array
  • Are indexed by [row column]
  • Can be transposed, which reverses rows and columns
  • Support some matrix-only operations, e.g. det, trace
  • Can represent a row vector (e.g. [ [1 2 3] ]) or a column vector (e.g. [ [1] [2] [3] ])
  • A slice of a matrix is a 1D vector
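The matrix properties above can be checked at the REPL. This is a minimal sketch, assuming core.matrix is on the classpath and the default persistent-vector implementation is in use; the alias `m` and the name `a` are chosen here for illustration only.

```clojure
(require '[clojure.core.matrix :as m])

;; A 3x3 matrix built from nested vectors (illustrative data).
(def a (m/matrix [[1 2 3] [4 5 6] [7 8 9]]))

(m/dimensionality a)             ; => 2
(m/shape a)                      ; row count and column count
(m/mget a 0 2)                   ; element at row 0, column 2
(m/transpose a)                  ; rows and columns swapped
(m/trace a)                      ; matrix-only operation: sum of the diagonal
(m/dimensionality (m/slice a 0)) ; => 1, a slice of a matrix is a 1D vector
```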

Vector properties

  • Can be thought of as a 1D array, e.g. [1 2 3]
  • Always have dimensionality 1; equivalent to a 1D array
  • Are indexed by [position]
  • Transposing a vector is the identity function: if you call transpose on a vector, you will get the same vector back unchanged
  • Support some vector-only operations, e.g. distance, dot-product
  • A slice of a vector is a 0D scalar
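The vector properties can be explored the same way. Again a sketch under the assumption that the default persistent-vector implementation is active; `m` and `v` are illustrative names, and `dot` is used here as the core.matrix function for the dot product.

```clojure
(require '[clojure.core.matrix :as m])

;; A 1D vector (illustrative data).
(def v (m/matrix [1 2 3]))

(m/dimensionality v)            ; => 1
(m/vec? v)                      ; => true
(m/mget v 1)                    ; element at position 1
(m/transpose v)                 ; the same vector, unchanged
(m/dot v v)                     ; vector-only operation: dot product
(m/distance v [4 6 3])          ; vector-only operation: Euclidean distance
(m/dimensionality (m/mget v 1)) ; => 0, a single element is a scalar
```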

Why have both?

This topic has been discussed a few times in the Numerical Clojure group. It boils down to there being two different perspectives:

  • The array programming viewpoint, where all possible shapes of array are supported: in this case 1D vectors are a distinct type and don't have a row/column distinction
  • The mathematical matrix viewpoint, where everything is a 2D matrix. "Row vectors" and "column vectors" are really 1xN and Nx1 matrices in terms of their behaviour.

Technically, you can always emulate vectors by using matrices: use either an Nx1 column matrix or a 1xN row matrix. Indeed, some mathematics systems (like MATLAB) expect you to do just this.
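As a small illustration of this emulation, core.matrix provides `row-matrix` and `column-matrix` to build the 2D equivalents of a 1D vector. The sketch below only compares shapes; the alias `m` is an assumption of the example.

```clojure
(require '[clojure.core.matrix :as m])

(m/shape (m/matrix [1 2 3]))        ; one dimension of length 3
(m/shape (m/row-matrix [1 2 3]))    ; 1 row, 3 columns
(m/shape (m/column-matrix [1 2 3])) ; 3 rows, 1 column

;; Only the first value is a vector in the core.matrix sense.
(m/vec? (m/matrix [1 2 3]))         ; => true
(m/vec? (m/row-matrix [1 2 3]))     ; => false
```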

However, there are some important reasons that we need to have both:

  • core.matrix is designed to be a general purpose N-Dimensional API, so it needs to support both the 1D and 2D cases in general.
  • Many underlying matrix implementations have both vectors and matrices. We need to support them.
  • Having distinct vector and matrix types can catch some type errors at runtime (e.g. passing a matrix where a vector is required).
  • In some cases, using vectors rather than matrices offers higher performance (due to a simpler data structure and less complex indexing).
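One way the vector/matrix distinction can surface such errors is an explicit check with `vec?`. The function `squared-length` below is hypothetical, written only for this illustration, and the precondition is one possible style rather than anything core.matrix requires.

```clojure
(require '[clojure.core.matrix :as m])

;; Hypothetical helper: accepts only 1D vectors.
(defn squared-length
  [v]
  {:pre [(m/vec? v)]}   ; rejects matrices, including 1xN and Nx1 ones
  (m/dot v v))

(squared-length [1 2 3])     ; fine: the argument is a 1D vector
;; (squared-length [[1 2 3]]) ; fails the precondition with an AssertionError
```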

Which should I use?

Usually this will be clear from what is needed for your problem. If you need a 2D array of numbers then use a matrix. If you need a 1D array of numbers then use a vector.

If in doubt, the array programming approach (a mix of 1D vectors and 2D matrices) is generally recommended: this is the most idiomatic style supported by core.matrix and is likely to result in slightly cleaner and more logical code. It also has the potential to be very slightly more efficient, as 1D vectors can benefit from more efficient indexing techniques.

Note: Matrix multiplication with vectors

Matrix multiplication with mmul treats a vector as a 1xN matrix (row vector) when it is the first argument, and as an Nx1 matrix (column vector) when it is the second argument. The dimensionality of the result, however, differs from what the equivalent matrix arguments would produce: as the examples below show, the dimension implied for the vector is dropped from the result. This allows mmul to implement the inner product.

(use 'clojure.core.matrix)                       ; brings matrix and mmul into scope
(mmul (matrix [1 2 3]) (matrix [1 2 3]))         ; vector times vector
; => 14
(mmul (matrix [[1 2 3]]) (matrix [[1] [2] [3]])) ; multiplication of matrices as row and column vectors
; => [[14]]
(mmul (matrix [1 2 3]) (matrix [[1] [2] [3]]))   ; vector times column vector
; => [14]
(mmul (matrix [[1 2 3]]) (matrix [1 2 3]))       ; row vector times vector
; => [14]
(mmul (matrix [[1 2 3] [4 5 6] [7 8 9]]) (matrix [[1] [2] [3]])) ; matrix times column vector
; => [[14] [32] [50]]
(mmul (matrix [[1 2 3] [4 5 6] [7 8 9]]) (matrix [1 2 3]))       ; matrix times vector
; => [14 32 50]
(mmul (matrix [1 2 3]) (matrix [[1 2 3]]))      ; vector treated as row vector times row vector
; => RuntimeException Mismatched vector sizes ...
(mmul (matrix [[1] [2] [3]]) (matrix [1 2 3]))  ; column vector times vector treated as column vector
; => RuntimeException Mismatched vector sizes ...

Further discussion / reading: