# Array-Based Data Types

## Arrays and Matrices

Physicists like lists and arrays of numbers:

A 1-D list of numbers (`undef` skips initialization, so the entries hold whatever happened to be in memory):

In [1]:
v = Vector{Float64}(undef, 3)

3-element Vector{Float64}:
 0.0
 0.0
 2.6769032496e-314

In [2]:
fill!(v, 0)

3-element Vector{Float64}:
 0.0
 0.0
 0.0

Note the convention here: functions that mutate their arguments should have names ending in `!`. This warns the user and is "only" a convention: the compiler won't complain if you ignore it, but the users of your library might.
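
To make the convention concrete, here is a minimal sketch of a non-mutating function next to a mutating counterpart. The names `scaled` and `scale!` are hypothetical, chosen only for illustration:

```julia
# Hypothetical helper names, for illustration only.

# Non-mutating: returns a new array and leaves `v` untouched.
scaled(v, c) = c .* v

# Mutating: overwrites the contents of `v` in place, hence the `!`.
function scale!(v, c)
    v .*= c
    return v
end

v = [1.0, 2.0, 3.0]
w = scaled(v, 2)   # `v` is unchanged, `w` holds the doubled values
scale!(v, 2)       # `v` itself now holds the doubled values
```

Returning the mutated argument, as `fill!` does, is a common habit because it lets calls be chained.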

Arrays can be multi-dimensional:

In [3]:
m = Matrix{Float64}(undef, 2, 2)

2×2 Matrix{Float64}:
 0.0           2.31605e-314
 2.31609e-314  0.0

In [4]:
fill!(m, 1)

2×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0

The `Matrix` type is a 2D "array", but we can specify any number of dimensions -- 3D for example:

In [5]:
a = Array{Float64, 3}(undef, 2, 5, 2)

2×5×2 Array{Float64, 3}:
[:, :, 1] =
 2.317e-314  2.317e-314  2.23241e-314  2.23241e-314  2.23241e-314
 2.317e-314  2.317e-314  2.317e-314    2.317e-314    2.317e-314

[:, :, 2] =
 2.23241e-314  2.23241e-314  2.23241e-314  2.23241e-314  2.23241e-314
 2.317e-314    2.317e-314    2.317e-314    2.23241e-314  2.23241e-314

In [6]:
fill!(a, -1)

2×5×2 Array{Float64, 3}:
[:, :, 1] =
 -1.0  -1.0  -1.0  -1.0  -1.0
 -1.0  -1.0  -1.0  -1.0  -1.0

[:, :, 2] =
 -1.0  -1.0  -1.0  -1.0  -1.0
 -1.0  -1.0  -1.0  -1.0  -1.0

A one-liner to get a matrix of zeros (use `ones` for a matrix of ones):

In [7]:
zeros((2,3))

2×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
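
A few related constructors from Base, shown as a short sketch. The variable names are arbitrary:

```julia
o  = ones(2, 3)        # 2×3 Matrix{Float64} filled with 1.0
zi = zeros(Int, 3)     # the element type can be given as the first argument
c  = fill(3.5, 2, 2)   # 2×2 matrix where every entry is 3.5
s  = similar(o)        # same shape and element type as `o`, left uninitialized
```

Like `Array{T}(undef, ...)`, `similar` does not initialize its result, so fill it before reading from it.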

## Indices start at 1 (And why shouldn't they?!)

In [8]:
v = [1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

In [9]:
v[1]

1

So doing things C-style will get you an error:

In [10]:
v[0]

LoadError: BoundsError: attempt to access 3-element Vector{Int64} at index [0]

The `end` keyword gives you the _last_ index of an array:

In [11]:
v[end]

3
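
As a brief sketch of related indexing tools: `end` can be used in arithmetic, and Base offers ways to query the index range or test an index without triggering a `BoundsError`. The vector below is made up for the example:

```julia
v = [10, 20, 30]

second_last = v[end - 1]                # 20: `end` works inside expressions
bounds = (firstindex(v), lastindex(v))  # (1, 3)
ok = checkbounds(Bool, v, 0)            # false: index 0 is out of bounds
fallback = get(v, 0, nothing)           # returns `nothing` instead of throwing
```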

## Working with ND Arrays

The `size` function returns an ordered tuple containing the size of each dimension:

In [12]:
size(a)

(2, 5, 2)
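
Some closely related queries, sketched on a freshly created array of the same shape:

```julia
a = zeros(2, 5, 2)

n2  = size(a, 2)   # 5: size along the second dimension only
nd  = ndims(a)     # 3: number of dimensions
len = length(a)    # 20: total number of elements (2 * 5 * 2)
ax  = axes(a)      # (Base.OneTo(2), Base.OneTo(5), Base.OneTo(2))
```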

Loops written inside brackets `[ ... ]` (array comprehensions) get stored as ND arrays, with one dimension per loop variable:

In [13]:
m = [(i, j) for i=1:3, j=10:14]

3×5 Matrix{Tuple{Int64, Int64}}:
 (1, 10)  (1, 11)  (1, 12)  (1, 13)  (1, 14)
 (2, 10)  (2, 11)  (2, 12)  (2, 13)  (2, 14)
 (3, 10)  (3, 11)  (3, 12)  (3, 13)  (3, 14)
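
A few more comprehension forms, as a small sketch with made-up variable names:

```julia
squares     = [i^2 for i in 1:5]                 # 1-D: [1, 4, 9, 16, 25]
table       = [i + j for i in 1:2, j in 1:3]     # 2-D: 2×3 matrix
odd_squares = [i^2 for i in 1:5 if isodd(i)]     # filtered: [1, 9, 25]
```

A comprehension with an `if` filter always produces a 1-D vector, since the number of surviving elements is not known in advance.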

**A word of caution:** arrays are stored in column-major format in memory. That means that elements which are adjacent along the _first_ index are contiguous in memory (just like Fortran, and _unlike_ C/Python). If you iterate over an array in Julia, make sure that the _inner_ loop iterates over the _first_ index. This is something to keep in mind when iterating over several variables in a loop.

In [14]:
for i in CartesianIndices(m)
    println(i)
end

CartesianIndex(1, 1)
CartesianIndex(2, 1)
CartesianIndex(3, 1)
CartesianIndex(1, 2)
CartesianIndex(2, 2)
CartesianIndex(3, 2)
CartesianIndex(1, 3)
CartesianIndex(2, 3)
CartesianIndex(3, 3)
CartesianIndex(1, 4)
CartesianIndex(2, 4)
CartesianIndex(3, 4)
CartesianIndex(1, 5)
CartesianIndex(2, 5)
CartesianIndex(3, 5)


We can interrogate the memory layout of a multi-dimensional array by flattening it to 1-D with `[:]` (the `vec` function gives the same ordering, but shares memory with the original array instead of copying it):

In [15]:
m[:]

15-element Vector{Tuple{Int64, Int64}}:
 (1, 10)
 (2, 10)
 (3, 10)
 (1, 11)
 (2, 11)
 (3, 11)
 (1, 12)
 (2, 12)
 (3, 12)
 (1, 13)
 (2, 13)
 (3, 13)
 (1, 14)
 (2, 14)
 (3, 14)
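
The same layout information can be queried directly. The sketch below rebuilds the 3×5 matrix from above and uses `strides` and `LinearIndices` from Base:

```julia
m = [(i, j) for i in 1:3, j in 10:14]

st = strides(m)              # (1, 3): one step along a column moves 1 element,
                             # one step along a row moves 3 elements
k  = LinearIndices(m)[2, 3]  # 8: position of element [2, 3] in memory order
same_order = vec(m) == m[:]  # true: both list the elements column by column
```

Here `m[k]` and `m[2, 3]` refer to the same element, `(2, 12)`.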

Notice that the "double for loop" notation uses the _second_ index as the "inner" index when written in this order -- this would be inefficient memory access!

In [16]:
for i=1:3, j=10:14
    println((i, j))
end

(1, 10)
(1, 11)
(1, 12)
(1, 13)
(1, 14)
(2, 10)
(2, 11)
(2, 12)
(2, 13)
(2, 14)
(3, 10)
(3, 11)
(3, 12)
(3, 13)
(3, 14)
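
To get memory order with the compact notation, list the ranges the other way round: the rightmost range varies fastest. The following is a minimal sketch that rebuilds the 3×5 matrix from above; `visited` and `linear` are illustrative names:

```julia
m = [(i, j) for i in 1:3, j in 10:14]

# The rightmost range (i, the first index of m) is the inner loop.
visited = Tuple{Int, Int}[]
for j in 1:size(m, 2), i in 1:size(m, 1)
    push!(visited, m[i, j])
end

# `eachindex` lets Julia choose an efficient iteration order for the array.
linear = [m[k] for k in eachindex(m)]
```

Both `visited` and `linear` end up in the same order as `vec(m)`.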


We can demonstrate the effect of this on performance by comparing two copy loops: one whose inner loop runs along a row (second index), and one whose inner loop runs down a column (first index):

In [31]:
using BenchmarkTools

In [60]:
m = [i*j for i=1:1000, j=1:1000];

In [66]:
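function iterate_rows(m)
    x, y = size(m)
    z = zeros(x, y)
    # Inner loop runs over the second index j: consecutive accesses
    # are a whole column apart in memory (strided access).
    for i=1:x
        for j=1:y
            z[i, j] = m[i, j]
        end
    end
    z
end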

iterate_rows (generic function with 1 method)

In [67]:
@benchmark iterate_rows(m)

BenchmarkTools.Trial: 601 samples with 1 evaluation.
 Range (min … max):  6.875 ms … 15.983 ms  ┊ GC (min … max): 0.00% … 31.34%
 Time  (median):     7.569 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.305 ms ±  1.598 ms  ┊ GC (mean ± σ):  7.21% ± 11.83%

In [68]:
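function iterate_cols(m)
    x, y = size(m)
    z = zeros(x, y)
    # Inner loop runs over the first index i: consecutive accesses
    # are adjacent in memory (contiguous access).
    for j=1:y
        for i=1:x
            z[i, j] = m[i, j]
        end
    end
    z
end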

iterate_cols (generic function with 1 method)

In [69]:
@benchmark iterate_cols(m)

BenchmarkTools.Trial: 1838 samples with 1 evaluation.
 Range (min … max):  1.426 ms … 9.704 ms  ┊ GC (min … max):  0.00% … 61.99%
 Time  (median):     2.030 ms             ┊ GC (median):     0.00%
 Time  (mean ± σ):   2.712 ms ± 1.622 ms  ┊ GC (mean ± σ):  24.87% ± 24.52%

### Broadcast Operations

Literal arrays are declared as follows:

In [17]:
m_0 = [1 0; 1 0]

2×2 Matrix{Int64}:
 1  0
 1  0

Alternatively:

In [18]:
m_1 = [ [1 0]
        [1 0] ]

2×2 Matrix{Int64}:
 1  0
 1  0

In [19]:
m_2 = [ [0 1]
        [0 0] ]

2×2 Matrix{Int64}:
 0  1
 0  0

We can perform basic arithmetic operations on array data as expected:

In [20]:
m_3 = 2 * m_1 + m_2

2×2 Matrix{Int64}:
 2  1
 2  0

**Important:** there is a difference between the matrix multiplication operator (i.e. the `*(A,B)` method defined for `Array`) and the element-wise multiplication operation (i.e. `*(A[i], B[i])` for all `i`). The `.` syntax is shorthand for "broadcasting" the multiplication operation over the entire array.

In [21]:
m_1 * m_2

2×2 Matrix{Int64}:
 0  1
 0  1

In [22]:
m_1 .* m_2

2×2 Matrix{Int64}:
 0  0
 0  0
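
To see where the two results come from, the sketch below spells out both definitions by hand with comprehensions. `A` and `B` hold the same values as the two matrices multiplied above:

```julia
A = [1 0; 1 0]
B = [0 1; 0 0]

# Matrix product: entry (i, j) sums A[i, k] * B[k, j] over k.
matmul = [sum(A[i, k] * B[k, j] for k in 1:2) for i in 1:2, j in 1:2]

# Element-wise product: entry (i, j) is simply A[i, j] * B[i, j].
elementwise = [A[i, j] * B[i, j] for i in 1:2, j in 1:2]
```

`matmul` agrees with `A * B` and `elementwise` agrees with `A .* B`.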

$\Rightarrow$ The "dot" operator performs a broadcast. This works for any function, not just operators, either by calling `broadcast` explicitly or by using the dot syntax:

In [23]:
f(x::Number) = x^2 + 1

f (generic function with 1 method)

In [24]:
broadcast(f, m_3)

2×2 Matrix{Int64}:
 5  2
 5  1

In [25]:
f.(m_3)

2×2 Matrix{Int64}:
 5  2
 5  1
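
Broadcasting also combines arrays of different shapes by expanding dimensions of size 1, and the `@.` macro adds dots to every call in an expression. A short sketch with made-up inputs:

```julia
col = [1, 2, 3]   # 3-element column vector
row = [10 20]     # 1×2 row matrix

shifted = col .+ 1       # scalar is applied to every element: [2, 3, 4]
grid    = col .+ row     # 3×2 matrix: [11 21; 12 22; 13 23]
fused   = @. col^2 + 1   # same as col.^2 .+ 1: [2, 5, 10]
```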

### Beware of Abstract Containers

**Every** concrete data type is a first-class citizen $\Rightarrow$ there is no penalty for using a data type that is not `double`, `int`, etc. (provided it is well designed). But abstract types are a different story:

It is possible to create a container without specifying the element type (its element type is then `Any`):

In [26]:
v_a = Vector(undef, 3)

3-element Vector{Any}:
 #undef
 #undef
 #undef

This can hold any sort of data (and the type of each element can change during execution):

In [27]:
fill!(v_a, 1.1)

3-element Vector{Any}:
 1.1
 1.1
 1.1

In [28]:
v_a[2]="asdf"
v_a

3-element Vector{Any}:
 1.1
  "asdf"
 1.1
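
Whether a container is abstract can be checked with `eltype` and `isconcretetype` from Base. The sketch below also shows how array literals choose their element type. The variable names are arbitrary:

```julia
t_any   = eltype(Vector(undef, 3))   # Any
c_float = isconcretetype(Float64)    # true
c_any   = isconcretetype(Any)        # false
c_real  = isconcretetype(Real)       # false: `Real` is abstract too

t_promoted = typeof([1, 2.5])   # Vector{Float64}: the Int is promoted
t_mixed    = typeof([1, "a"])   # Vector{Any}: no common concrete type
```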

This flexibility can result in a performance hit: `Any` arrays are containers with an abstract element type $\Rightarrow$ they contain _pointers_ to data instead of the data itself, and should be avoided where performance is important:

In [32]:
v_abstract = Vector(undef, 10000)
v_concrete = Vector{Float64}(undef, 10000)

fill!(v_abstract, 10)
fill!(v_concrete, 10);
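
Before benchmarking, it can help to look at what two such vectors actually hold, and at how an abstract container could be narrowed. The following is a sketch on shorter vectors; names ending in `_small` and the `narrowed`/`converted` variables are made up for the example:

```julia
abstract_small = Vector(undef, 4)
concrete_small = Vector{Float64}(undef, 4)
fill!(abstract_small, 10)   # stores the Int 10 as-is
fill!(concrete_small, 10)   # converts 10 to the Float64 10.0

t_abs = typeof(abstract_small[1])    # Int64
t_con = typeof(concrete_small[1])    # Float64

# Two ways to obtain a concretely typed copy of the abstract vector:
narrowed  = identity.(abstract_small)                  # Vector{Int64}
converted = convert(Vector{Float64}, abstract_small)   # Vector{Float64}
```

Narrowing with `identity.` only yields a concrete element type when all elements share one type.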

In [33]:
@benchmark sqrt.(v_abstract)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  288.257 μs …   3.851 ms  ┊ GC (min … max): 0.00% … 89.99%
 Time  (median):     370.265 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   396.427 μs ± 221.855 μs  ┊ GC (mean ± σ):  5.03% ±  7.89%

In [34]:
@benchmark sqrt.(v_concrete)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  14.923 μs …   5.462 ms  ┊ GC (min … max):  0.00% … 99.02%
 Time  (median):     45.358 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   54.975 μs ± 210.334 μs  ┊ GC (mean ± σ):  15.75% ±  4.08%