# Arrays

This notebook illustrates how to create and reshuffle arrays. Other notebooks focus on matrix algebra and functions applied to arrays.

## Load Packages and Extra Functions

In [1]:
using Printf, DelimitedFiles

include("src/printmat.jl");

# Scalars, Vectors and Multi-dimensional Arrays

*are treated as different things* in Julia, even if they happen to "look" similar. For instance, a $1 \times 1$ array is not a scalar and an $n \times 1$ array is not a vector.

However, we first present some common features of all arrays (vectors or multi-dimensional arrays).

# Creating Arrays

The typical ways of getting an array are 

* hard coding the contents
* reading in data from a file
* as a result from computations
* allocating the array and then changing the elements
* (often not so smart) growing the array by adding rows (or columns,..)
* by list comprehension

The next few cells give simple examples.

## 1. Hard Coding the Contents or Reading from a File

In [2]:
z = [11 12;                       #typing in your matrix
     21 22]
printblue("A matrix that we typed in:")
printmat(z)

x = readdlm("Data/MyData.csv",',',skipstart=1)  #read matrix from file
printblue("First four lines of x from csv file:")
printmat(x[1:4,:])

#to create a vector: [1,2] or [1;2]
#to create a 2x3x2 array:  [1 2 3;4 5 6;;;11 12 13;14 15 16]

[34m[1mA matrix that we typed in:[22m[39m
    11        12    
    21        22    

[34m[1mFirst four lines of x from csv file:[22m[39m
197901.000     4.180     0.770    10.960
197902.000    -3.410     0.730    -2.090
197903.000     5.750     0.810    11.710
197904.000     0.050     0.800     3.270



## 2a. Allocating an Array and Then Changing the Elements: Fill

An easy way to create an array is to use the `fill()` function.

```
A = fill(0,10,2)             #10x2, integers (0)
B = fill(0.0,10)             #vector with 10 elements, floats (0.0)
C = fill(NaN,10,2)           #10x2, floats (NaN)
D = fill("",3)               #vector with 3 elements, strings ("")
E = fill(Date(1),3)          #vector with 3 elements, dates (0001-01-01) 
```

In contrast, do *not* use `fill([1,2],7)`, since all 7 arrays will refer to the same underlying array (changing one changes all). Instead, use a comprehension (see below).

In [3]:
x = fill(0.0,3,2)     #creates a 3x2 matrix filled with 0.0
printblue("so far, x is filled with 0.0. For instance, x[1,1] is $(x[1,1])")

for i in 1:size(x,1), j in 1:size(x,2)
    x[i,j] = i/j
end

printblue("\nx after some computations:")
printmat(x)

[34m[1mso far, x is filled with 0.0. For instance, x[1,1] is 0.0[22m[39m

[34m[1mx after some computations:[22m[39m
     1.000     0.500
     2.000     1.000
     3.000     1.500



## 2b. Allocating an Array and Then Changing the Elements: A More General Approach (extra)

You can also create an array by 

```
A = Array{Int}(undef,10,2)       #10x2, integers
F = Array{Any}(undef,3)          #vector with 3 elements, can include anything
```

The ```undef``` signals that the matrix is yet not initialized. This is more cumbersome than `fill()`, but sometimes more flexible.

In [4]:
F    = Array{Any}(undef,3)
F[1] = [1;2;3;4]             #F[1] contains a vector
F[2] = "Sultans of Swing"    #F[2] a string
F[3] = 1978                  #F[3] an integer

printmat(F)

[1, 2, 3, 4]
Sultans of Swing
  1978    



## 3a. Growing an Array

Growing an array (vector, matrix or higher-dimensional array) is done by `[A;B]` to stack vertically, `[A;;B]` (which is the same as `[A B]`) to stack horizontally, or `[A;;;B]` which is to stack along dimension 3. Alternatively, use the `vcat`, `hcat` and `cat` functions to do the same things. Still, this is not very fast, so avoid it in a tight loop.

In [5]:
A = [11 12;
     21 22]
B = [1 2;
     0 10]

z = [A;B]                #same as vcat(A,B)
printblue("\n","stacking A and B vertically")
printmat(z)

z2 = [A B]                 #same as hcat(A,B)
printblue("\n","stacking A and B horizontally")
printmat(z2) 


[34m[1mstacking A and B vertically[22m[39m
    11        12    
    21        22    
     1         2    
     0        10    


[34m[1mstacking A and B horizontally[22m[39m
    11        12         1         2    
    21        22         0        10    



## 3b. Growing a Vector

There are special (faster) functions for growing a *vector* (not a matrix):
```
push!(old vector,new_element_1,new element_2)       #or pushfirst!()
```

If you instead want to append all elements of a vector, then do
```
append!(old vector,vector1_to_append,vector2_to_append)     #or prepend!()
```

In [6]:
B = Float64[]                 #empty vector, to include floats
for i = 1:3
    x_i = 2.0 + 10^i
    push!(B,x_i)              #adding an element at the end
end 
printblue("a vector with 3 elements:")
printmat(B)

[34m[1ma vector with 3 elements:[22m[39m
    12.000
   102.000
  1002.000



## 4. List Comprehension and map (extra)

List comprehension is a simple way of creating an array from repeated calculations. It is similar to the combination of pre-allocation and a "for loop."

(You can achieve the same thing with ```map``` (for instance, by ```map(i->collect(1:i),1:3)```).)

In [7]:
A = [collect(1:i) for i=1:3]         #this creates a vector of vectors

printblue("A[1] is vector with 1 element, A[2] a vector with 2 elements,...")
printmat(A)

[34m[1mA[1] is vector with 1 element, A[2] a vector with 2 elements,...[22m[39m
       [1]
    [1, 2]
 [1, 2, 3]



# Using Parts of a Matrix

The most common way to use parts of an array is by indexing. For instance, to use the second column of `A`, do `A[:,2]`.

Notice that `x = A[1,:]` gives a (column) vector (yes, it does), while `z = A[1:1,:]` gives a $1 \times k$ matrix.

Also notice that `z = A[1,:]` creates an independent copy, so changing (elements of) `z` will *not* change `A`.

A shortcut to loop over all rows of `A` is `for i in eachrow(A)`. There is also `eachcol()`.

In [8]:
A = [11 12;
     21 22]
printblue("A:")
printmat(A)

printblue("\nsecond column of A:")
printmat(A[:,2])

printblue("\n","first row of A (as a vector): ")
printmat(A[1,:])                          #notice 1 makes it a vector

printblue("\n","first row of A (as a 1x2 matrix): ")
printmat(A[1:1,:])                        #use 1:1 to keep it as a 1x2 matrix

for i in eachrow(A)          #looping over all rows
    printblue("another row: ")
    printmat(i)
end

[34m[1mA:[22m[39m
    11        12    
    21        22    


[34m[1msecond column of A:[22m[39m
    12    
    22    


[34m[1mfirst row of A (as a vector): [22m[39m
    11    
    12    


[34m[1mfirst row of A (as a 1x2 matrix): [22m[39m
    11        12    

[34m[1manother row: [22m[39m
    11    
    12    

[34m[1manother row: [22m[39m
    21    
    22    



## Performance Tips (extra)

In case you do not need an independent copy, then `y = view(A,1,:)` creates a *view* of the first row of `A`. This saves memory and is sometimes faster. Notice, however, that changing `y` by `y .= [1,2]` will now change also the first row of `A`. Notice that the dot `.` is needed for this. (In contrast, `y = [1,2]` would create a new `y` and not affect `A.`)

To make a *copy or a view?* If you need to save memory: a view. Instead, if you need speed: try both. (Copies are often quicker when you need to do lots of computations on the matrix, for instance, in a linear regression.)

In [9]:
printblue("\n","view of first row of A (although it prints like a column vector): ")
y = view(A,1,:)
printmat(y)

y .= [1,2]                    #changing y and thus the first row of A, notice the dot (.)
printblue("A after changing y by y .= [1,2]")
printmat(A)


[34m[1mview of first row of A (although it prints like a column vector): [22m[39m
    11    
    12    

[34m[1mA after changing y by y .= [1,2][22m[39m
     1         2    
    21        22    



## Performance Tips (extra)

Avoid creating and destroying lots of arrays in loops: it takes time. If possible re-use the existing arrays instead. The next cell provides an illustration.

In [10]:
function fn1(N)
  for i = 1:N
    tmp = zeros(N,N)         #create a new tmp in each loop
    tmp[i,i] = i             #do something with tmp
  end
  return nothing
end

function fn2(N)
  tmp = zeros(N,N)
  for i = 1:N
    tmp .= 0.0             #re-use the existing tmp, reset to zeros
    tmp[i,i] = i           #do something with tmp
  end
  return nothing
end


using BenchmarkTools       #a package for benchmarking computations 

@btime fn1(300)            #timing
@btime fn2(300)

  8.944 ms (900 allocations: 206.02 MiB)
  2.667 ms (3 allocations: 703.21 KiB)


# Splitting up an Array (extra)

Sometimes you want to create new variables from the columns (or rows) of a matrix. The next cell shows an example.

In [11]:
printblue("A simple way...which works well when you want to create a few variables")
x1 = A[:,1]
x2 = A[:,2]      #or (x1,x2) = (A[:,1],A[:,2])
printmat(x2)

printblue("Another way")
(z1,z2) = [A[:,i] for i in 1:2]
printmat(z2)

[34m[1mA simple way...which works well when you want to create a few variables[22m[39m
     2    
    22    

[34m[1mAnother way[22m[39m
     2    
    22    



# An Array of Arrays (extra)

If `x1` and `x2` are two arrays, then `y=[x1,x2]` is a vector (of arrays) where `y[1] = x1` and `y[2] = x2`.

In this case `y[1]` is actually a view of `x1` so changing elements of one changes the other.

If you instead want to stack `x1` and `x2` into a single matrix, use `[x1 x2]`, `[x1;x2]` or one of the `cat` functions discussed above.

In [12]:
x1 = ones(3,2)
x2 = [1,2,3]
y = [x1,x2]               #a vector of arrays

for i in 1:length(y)
    printblue("y[$i]:")
    printmat(y[i])
end

[34m[1my[1]:[22m[39m
     1.000     1.000
     1.000     1.000
     1.000     1.000

[34m[1my[2]:[22m[39m
     1    
     2    
     3    



# Arrays are Different...

Vectors and matrices (arrays) can take lots of memory space, so **Julia is designed to avoid unnecessary copies of arrays**. In short, notice the following. Let `A` be an array and you do one of the following computations:

* `B = A`, `B = reshape(A,n,m)`, `B = vec(A)`, or `B = A'`, and then followed by `B[1] = -999`
  
* `B = f1!(A)` where `f1!` is a function like
```
function f1!(B)
    B[1] = -999     #change some elements of B inside the function
  return B
end
```

* `B = [A,A]` (an array of arrays) followed by `B[1][1] = -999`

then also `A` will change. 

Notice that in all cases you are changing some *elements* of `B`, not redefining the entire `B` (like in `B = [1,2,3]`). Other ways to change some *elements* are `B[:] = [1,2]` or `B .= [1,2]` so the same behaviour applies to those cases.

If you do not like this behaviour, then use `copy(A)` to create an independent copy of the array.

In [13]:
function f1!(B)            #! is a convention for indicating that the function changes the input
    B[1] = -999            #changing ELEMENTS of B, affects outside value
    #B = B/2               #this would NOT affect the outside value
  return B
end

A  = [1.0,2.0]
printblue("original A:")
printmat(A)

B = f1!(A)
printblue("A after calling f1!(A): ")
printmat(A)

[34m[1moriginal A:[22m[39m
     1.000
     2.000

[34m[1mA after calling f1!(A): [22m[39m
  -999.000
     2.000

