# III. Working with arrays

Julia has built-in functions for creating arrays of random numbers. We can use the `rand` function to generate uniformly distributed random floats in the half-open interval [0, 1).

In [1]:
a = rand(20);
show(a)

[0.48922170677183385, 0.23398820595670644, 0.3511448150688934, 0.2516944582440892, 0.5921796350799822, 0.9103196842639933, 0.6356949498027129, 0.15023800215840244, 0.23179112898203824, 0.9450957619342832, 0.8256604575028219, 0.5693926499018602, 0.4004953160232465, 0.659404301050567, 0.5934412844980641, 0.6916471595570342, 0.275160495655973, 0.1269230780611743, 0.683779800718552, 0.5833469693461528]

In [2]:
typeof(a)

Array{Float64,1}

The `rand` function can also sample from an arbitrary collection of values. Here we draw three values (with replacement) from the integers 1 through 20.

In [3]:
rand(1:20, 3)

3-element Array{Int64,1}:
 10
  3
  5
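
As a small sketch of the same idea, `rand` can draw from any collection, not only a range. The vector `primes` and the seed value below are made up for illustration; `Random.seed!` comes from the `Random` standard library and is used here only so that repeated runs give the same draws.

```julia
using Random  # standard library; provides Random.seed!

Random.seed!(1234)          # arbitrary seed, chosen only for reproducibility

primes = [2, 3, 5, 7, 11]   # hypothetical collection to sample from
draws  = rand(primes, 4)    # four draws, with replacement
coin   = rand(Bool, 3)      # three random true/false values

println(draws)
println(coin)
```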

Here we use indexing to get the first three elements of __a__.

In [4]:
a[1:3]

3-element Array{Float64,1}:
 0.48922170677183385
 0.23398820595670644
 0.3511448150688934 

The __end__ keyword can be used to go to the last element along the dimension. The following will get the elements of __a__ starting from index 14 to the end.

In [5]:
a[14:end]

7-element Array{Float64,1}:
 0.659404301050567 
 0.5934412844980641
 0.6916471595570342
 0.275160495655973 
 0.1269230780611743
 0.683779800718552 
 0.5833469693461528

You can use a stride index to skip over elements in the array. For example, the following gets every 2nd element of __a__, starting with the second element.

In [6]:
show(a[2:2:end])

[0.23398820595670644, 0.2516944582440892, 0.9103196842639933, 0.15023800215840244, 0.9450957619342832, 0.5693926499018602, 0.659404301050567, 0.6916471595570342, 0.1269230780611743, 0.5833469693461528]
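
Because __a__ holds random values, the indexing patterns can be easier to check on a vector with known contents. The sketch below uses a hypothetical vector `v` that is not part of the lesson.

```julia
v = collect(10:10:100)      # [10, 20, ..., 100], ten elements

first_three = v[1:3]        # [10, 20, 30]
evens       = v[2:2:end]    # every 2nd element, starting at index 2
tail        = v[end-1:end]  # the last two elements: [90, 100]
reversed    = v[end:-1:1]   # a negative stride walks backwards

println(evens)              # [20, 40, 60, 80, 100]
```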

You can use `length` to get the number of elements in the array:

In [7]:
length(a)

20

Julia provides a few very useful functions that are easy to understand in the context of one-dimensional arrays: `map`, `filter`, `reduce`, and `mapreduce`.

In [8]:
a = randn(15)
show(a)

[-0.9324118099711859, 0.4494861785132772, -1.9199083892308229, 0.23792681677809793, -0.012643964275971462, 0.3042749140951158, -0.7072116260068939, 1.577918548746474, -1.0801253299863138, -1.4114393782767616, -1.071505863466822, 0.2617045687700058, 1.0211372482706103, 1.0876196821174606, 1.4937735588634948]

The `map` function applies a function element-wise to an array. Here we take the exponential of every element of __a__. The first argument to `map` is the function you want to apply to every element of the second argument. The function can be an anonymous function, a user-defined function, a built-in function, etc.

In [9]:
exp_a = map(exp, a)
show(exp_a)

[0.39360326840218096, 1.5675065599829474, 0.1466203935221383, 1.2686163479959065, 0.9874356348045037, 1.3556416898746582, 0.4930169984067904, 4.8448609669860385, 0.3395529668093627, 0.2437921214031746, 0.34249238213568384, 1.2991426784100435, 2.7763503689267206, 2.9672027741084324, 4.45387079146031]

The `filter` function will only return elements that satisfy a specified condition. Here we return elements of __a__ greater than zero.

In [10]:
filt_a = filter(x -> x > 0, a)
show(filt_a)

[0.4494861785132772, 0.23792681677809793, 0.3042749140951158, 1.577918548746474, 0.2617045687700058, 1.0211372482706103, 1.0876196821174606, 1.4937735588634948]

You can apply a reduction operation using `reduce`. Here we apply `reduce` to an array using the multiplication operator:

In [11]:
red_a = reduce(*, a)

-0.0005829342476231348

You can easily combine the `map` and `reduce` functions in Julia by using the `mapreduce` function. In what follows, the first argument is the mapping (square each element of __a__), the second argument is the reduction to apply (here, summation), and the last argument is the array that `mapreduce` is applied to.

In [12]:
eucnormsq = mapreduce(x -> x ^ 2, +, a)

16.72924856652884

There is also a useful `|>` operator that can be used to pass the result of one function as input to another function. For example, we can rewrite the above expression for __eucnormsq__ using this "pipe-greater-than" syntax:

In [13]:
eucnormsq = map(x -> x ^ 2, a) |> sum

16.72924856652884

What we did above first was to apply the mapping to __a__ (i.e. squaring each element of __a__) and then we passed the result of `map` as input into the `sum` function which summed up the squared elements.
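
To make the four functions easy to verify by hand, here is a minimal sketch on a small integer vector `v` (a made-up input, not the random __a__ used above). The last line uses `sum` with a function as its first argument, which is another way to write the same reduction.

```julia
v = [1, -2, 3, -4]

squares   = map(x -> x^2, v)            # [1, 4, 9, 16]
positives = filter(x -> x > 0, v)       # [1, 3]
product   = reduce(*, v)                # 1 * -2 * 3 * -4 = 24
normsq    = mapreduce(x -> x^2, +, v)   # 1 + 4 + 9 + 16 = 30

# Two equivalent ways to get the same sum of squares
normsq_pipe = map(x -> x^2, v) |> sum   # builds the intermediate array first
normsq_sum  = sum(abs2, v)              # abs2(x) is x^2 for real x
```

The `mapreduce` and `sum(abs2, v)` forms avoid allocating the intermediate array of squares that the `map ... |> sum` form creates.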

In Julia, you'll likely often be working with multidimensional arrays.

In [14]:
A = [1 2 3; 4 5 6]

2×3 Array{Int64,2}:
 1  2  3
 4  5  6

Generating random matrices and indexing into them work the same way as before. Below we generate an 8 by 10 matrix of random numbers, each distributed according to a standard normal distribution.

In [15]:
A = randn(8, 10)

8×10 Array{Float64,2}:
  0.052261   0.245111  -0.0832386  …  -0.476504  1.70146    0.29428  
 -1.81725    0.402847   0.160868      -1.18155   0.043404   0.255165 
  0.309988   0.945666   0.287139      -0.130548  0.786618   0.653011 
  2.10265   -0.765735  -1.45954        0.216652  0.440409  -0.0089227
  0.247881  -1.32097   -0.873619       0.769605  0.228119  -0.500876 
  0.15384   -0.719572  -0.888789   …   2.02883   0.698597   0.190656 
  0.423731   0.565354  -1.89544        1.05145   0.475646   0.265792 
  0.40453    0.848234   0.263717       0.321172  0.496757   0.22709  

If we wanted all the rows but only columns 6 through 10 from our matrix __A__:

In [16]:
A[:, 6:10]

8×5 Array{Float64,2}:
 -0.48165   -0.05413   -0.476504  1.70146    0.29428  
 -0.423648   1.33057   -1.18155   0.043404   0.255165 
 -1.00506   -0.550859  -0.130548  0.786618   0.653011 
 -0.284291   0.80626    0.216652  0.440409  -0.0089227
 -0.971473   0.687713   0.769605  0.228119  -0.500876 
  0.129743  -1.44741    2.02883   0.698597   0.190656 
 -0.475728   0.467158   1.05145   0.475646   0.265792 
  0.9355    -2.48807    0.321172  0.496757   0.22709  

If you wanted rows two through four and only columns 1, 4, and 8 through 10 of __A__:

In [17]:
A[2:4, union(1, 4, 8:10)]

3×5 Array{Float64,2}:
 -1.81725   -0.277849  -1.18155   0.043404   0.255165 
  0.309988   0.634486  -0.130548  0.786618   0.653011 
  2.10265   -0.708673   0.216652  0.440409  -0.0089227
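
The same row and column selections can be checked on a matrix whose entries encode their position. In this sketch the matrix `M` is hypothetical: entry `ij` sits in row `i`, column `j`.

```julia
M = [11 12 13 14;
     21 22 23 24;
     31 32 33 34]

col2    = M[:, 2]              # all rows, column 2 -> [12, 22, 32]
block   = M[2:3, [1, 4]]       # rows 2-3, columns 1 and 4
same    = M[2:3, union(1, 4)]  # union builds the same list of column indices
lastrow = M[end, :]            # `end` refers to the last index of that dimension

println(block)                 # [21 24; 31 34]
```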

You can also use boolean indexing to extract elements. Here a random 8 x 10 matrix of booleans is generated:

In [18]:
mask = rand(Bool, 8, 10)

8×10 Array{Bool,2}:
 0  1  1  1  0  0  0  1  0  1
 1  0  0  0  1  1  0  0  1  1
 0  0  1  0  1  1  0  1  0  1
 1  0  1  1  1  0  0  0  0  1
 0  0  1  0  1  0  1  0  0  1
 0  0  0  0  1  0  1  1  1  1
 1  0  0  1  1  0  1  1  0  0
 1  1  1  0  1  0  1  0  0  0

The following statement returns the elements of __A__ that correspond to the entries of __mask__ that are *true*.

In [19]:
A[mask]

39-element Array{Float64,1}:
 -1.8172537984266395  
  2.102649021655136   
  0.4237309346732946  
  0.40453028773749167 
  0.24511148274128544 
  0.8482340577630565  
 -0.08323860268359817 
  0.2871385862755773  
 -1.4595398328479714  
 -0.8736188707377037  
  0.2637165640969634  
  0.0693355670289256  
 -0.7086727549338354  
  ⋮                   
 -0.4765042726474644  
 -0.13054789972892253 
  2.028831584254096   
  1.0514464093139515  
  0.043403952274139575
  0.6985966577211443  
  0.2942801700353856  
  0.25516504789215766 
  0.6530108280571782  
 -0.008922699747143594
 -0.5008762893193759  
  0.1906557317892771  

Similarly if you wanted to return the elements of __A__ that were, say, greater than zero you could do something like the following:

In [20]:
A[A .> 0]

46-element Array{Float64,1}:
 0.0522609960451947 
 0.3099876832056938 
 2.102649021655136  
 0.24788102929147882
 0.15383973714882138
 0.4237309346732946 
 0.40453028773749167
 0.24511148274128544
 0.40284743536916967
 0.9456664156893394 
 0.5653543951135697 
 0.8482340577630565 
 0.1608678889479038 
 ⋮                  
 0.7866184487203483 
 0.44040926100676325
 0.22811866622875532
 0.6985966577211443 
 0.47564613698981667
 0.4967567273301734 
 0.2942801700353856 
 0.25516504789215766
 0.6530108280571782 
 0.1906557317892771 
 0.26579222952232545
 0.2270898684562241 

Note the dot notation used above which is necessary here to do an element-wise comparison.
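
A minimal sketch of boolean indexing on a small, made-up matrix `M` shows two details that are hard to see in the random example: the selected elements come back as a vector in column-major order, and a mask can also be used on the left-hand side of an assignment.

```julia
M = [1  2;
     3 -4]

mask   = M .> 0        # element-wise comparison gives a Boolean array
picked = M[mask]       # column-major order: [1, 3, 2]
npos   = count(mask)   # number of true entries: 3

clipped = copy(M)            # work on a copy so M stays unchanged
clipped[clipped .< 0] .= 0   # set every negative entry to zero

println(picked)
println(clipped)             # [1 2; 3 0]
```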

One thing to be aware of when you assign an array to a new variable is that no copy is made: the new name simply refers to the same array in memory as the original.

In [21]:
B = A

8×10 Array{Float64,2}:
  0.052261   0.245111  -0.0832386  …  -0.476504  1.70146    0.29428  
 -1.81725    0.402847   0.160868      -1.18155   0.043404   0.255165 
  0.309988   0.945666   0.287139      -0.130548  0.786618   0.653011 
  2.10265   -0.765735  -1.45954        0.216652  0.440409  -0.0089227
  0.247881  -1.32097   -0.873619       0.769605  0.228119  -0.500876 
  0.15384   -0.719572  -0.888789   …   2.02883   0.698597   0.190656 
  0.423731   0.565354  -1.89544        1.05145   0.475646   0.265792 
  0.40453    0.848234   0.263717       0.321172  0.496757   0.22709  

In [22]:
isequal(B, A)

true

The `===` operator tests whether __B__ and __A__ point to the same location in memory:

In [23]:
B === A

true

Now let's change some elements of __B__. What do you think will happen to __A__?

In [24]:
B[1, 1:end] .= 999;

In [25]:
B

8×10 Array{Float64,2}:
 999.0       999.0       999.0       …  999.0       999.0       999.0      
  -1.81725     0.402847    0.160868      -1.18155     0.043404    0.255165 
   0.309988    0.945666    0.287139      -0.130548    0.786618    0.653011 
   2.10265    -0.765735   -1.45954        0.216652    0.440409   -0.0089227
   0.247881   -1.32097    -0.873619       0.769605    0.228119   -0.500876 
   0.15384    -0.719572   -0.888789  …    2.02883     0.698597    0.190656 
   0.423731    0.565354   -1.89544        1.05145     0.475646    0.265792 
   0.40453     0.848234    0.263717       0.321172    0.496757    0.22709  

Note that even though we changed the elements of __B__ the elements of the original array __A__ also changed.

In [26]:
A

8×10 Array{Float64,2}:
 999.0       999.0       999.0       …  999.0       999.0       999.0      
  -1.81725     0.402847    0.160868      -1.18155     0.043404    0.255165 
   0.309988    0.945666    0.287139      -0.130548    0.786618    0.653011 
   2.10265    -0.765735   -1.45954        0.216652    0.440409   -0.0089227
   0.247881   -1.32097    -0.873619       0.769605    0.228119   -0.500876 
   0.15384    -0.719572   -0.888789  …    2.02883     0.698597    0.190656 
   0.423731    0.565354   -1.89544        1.05145     0.475646    0.265792 
   0.40453     0.848234    0.263717       0.321172    0.496757    0.22709  

If you want to avoid this behavior then you can use the `copy` function to make a copy of the original array:

In [27]:
C = copy(A);

In [28]:
isequal(C, A)

true

In [29]:
C === A

false

What the above shows is that __C__ points to a different location in memory than __A__, so you can change __C__ without affecting __A__.
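
The difference between a second name, a copy, and a true view (created with `view`) can be sketched on a short vector. The names `x`, `alias`, `dup`, `window`, and `slice` are illustrative only.

```julia
x = [1, 2, 3, 4]

alias  = x             # a second name for the same array
dup    = copy(x)       # an independent array with equal contents
window = view(x, 2:3)  # a view: a window onto elements 2-3 of x
slice  = x[2:3]        # indexing with a range on the right-hand side copies

alias[1]  = 100        # changes x[1]
window[1] = 200        # changes x[2]
dup[4]    = -1         # x is unaffected
slice[2]  = -1         # x is unaffected

println(x)             # [100, 200, 3, 4]
```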

Let's move on and look at some basic functions and operations that you can use with arrays.

To check the dimension of an array you can use the `ndims` function:

In [30]:
A = randn(8, 10)

8×10 Array{Float64,2}:
  1.46816   -0.00323373  -0.984092  …   0.394842  -0.477097   0.185557 
 -0.134089   1.37991     -0.915052      1.96685    1.2546     1.79164  
  0.164207  -0.478126    -0.35573      -0.338896  -0.418068   0.678582 
 -0.4873    -1.14405     -0.703901     -1.11022   -0.39306   -1.54468  
  0.963068   0.120289    -0.811619     -2.04663    1.34058    0.270189 
  1.55257   -0.751351     0.763494  …  -1.22276    0.562637  -0.0520285
  0.543123  -1.7885      -0.81668       0.4301    -2.02303   -1.25026  
  2.65575   -2.60739      0.249087      0.384562   0.101902  -0.942166 

In [31]:
ndims(A)

2

To get the number of rows and columns use `size`:

In [32]:
size(A)

(8, 10)

As before, `length` returns the number of elements in the matrix.

In [33]:
length(A)

80

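The `reshape` function returns an array with a new shape that contains the same elements, taken in column-major order. The result shares its data with the original array, and __A__ itself keeps its 8×10 shape: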

In [34]:
A

8×10 Array{Float64,2}:
  1.46816   -0.00323373  -0.984092  …   0.394842  -0.477097   0.185557 
 -0.134089   1.37991     -0.915052      1.96685    1.2546     1.79164  
  0.164207  -0.478126    -0.35573      -0.338896  -0.418068   0.678582 
 -0.4873    -1.14405     -0.703901     -1.11022   -0.39306   -1.54468  
  0.963068   0.120289    -0.811619     -2.04663    1.34058    0.270189 
  1.55257   -0.751351     0.763494  …  -1.22276    0.562637  -0.0520285
  0.543123  -1.7885      -0.81668       0.4301    -2.02303   -1.25026  
  2.65575   -2.60739      0.249087      0.384562   0.101902  -0.942166 

In [35]:
C = reshape(A, 2, 40)

2×40 Array{Float64,2}:
  1.46816    0.164207  0.963068  …   0.678582   0.270189   -1.25026 
 -0.134089  -0.4873    1.55257      -1.54468   -0.0520285  -0.942166

In [36]:
size(C)

(2, 40)
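
Two properties of `reshape` are easier to see on a tiny example: elements are laid out column by column, and the reshaped array shares memory with its source. The sketch below uses a hypothetical six-element vector `v`.

```julia
v = collect(1:6)        # [1, 2, 3, 4, 5, 6]
R = reshape(v, 2, 3)    # filled column by column: [1 3 5; 2 4 6]

R[1, 1] = 100           # writes through to v, because the data is shared

independent = copy(reshape(v, 3, 2))  # copy if you need a separate array
independent[1, 1] = 0                 # v and R are unaffected

println(v)              # [100, 2, 3, 4, 5, 6]
```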

In [37]:
A = randn(5, 5)

5×5 Array{Float64,2}:
 -0.696523  -0.471784   0.820062   0.128029   -0.657384 
  0.431981   0.120459   0.445679  -0.784718   -0.391408 
  1.7356    -0.178117  -1.98435   -0.905685    1.33795  
 -1.87878    0.848242  -0.194669  -0.0984287   0.0338903
  1.20802    0.739109   0.245352   0.308893   -0.356952 

There are a few ways to initialize arrays. Here we use constructor notation with `undef` to allocate an array without initializing its contents. The string array holds `#undef` placeholders, while the integer array holds whatever values happened to be in memory:

In [38]:
InitStringArray = Array{String}(undef, 5, 5)

5×5 Array{String,2}:
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef

In [39]:
InitIntArray = Array{Int64}(undef, 5, 5)

5×5 Array{Int64,2}:
 140549209058800  140549209059120  …  140549209059760  140549209060080
 140549209058864  140549209059184     140549209059824  140549209060144
 140549209058928  140549209059248     140549209059888  140549209060208
 140549209058992  140549209059312     140549209059952  140549209060272
 140549209059056  140549209059376     140549209060016  140549209060336

The `zeros` function is available to create a matrix of zeros; the `fill` function can create a matrix with an arbitrary element.

In [40]:
zeros(4, 5)

4×5 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
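
`zeros` returns `Float64` entries by default, but an element type can be passed as the first argument. The following sketch collects a few related ways to build small arrays with known contents; all variable names are illustrative.

```julia
Z = zeros(Int, 2, 3)           # 2×3 matrix of integer zeros
O = ones(2, 2)                 # 2×2 matrix of 1.0 (Float64 by default)
F = fill(7, 2, 2)              # every entry is 7

U = Array{Float64}(undef, 3)   # contents are arbitrary until assigned
U .= 0.5                       # assign every element before reading them

S = [i + j for i in 1:2, j in 1:3]   # comprehension: S[i, j] = i + j

println(S)                     # [2 3 4; 3 4 5]
```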

In [41]:
fill("foo", 4, 5)

4×5 Array{String,2}:
 "foo"  "foo"  "foo"  "foo"  "foo"
 "foo"  "foo"  "foo"  "foo"  "foo"
 "foo"  "foo"  "foo"  "foo"  "foo"
 "foo"  "foo"  "foo"  "foo"  "foo"

You can use the following constructor notation to create an identity matrix:

In [42]:
using LinearAlgebra #this package contains linear algebra functionality

Imatfirst = Array{Float64}(I, 5, 5)

5×5 Array{Float64,2}:
 1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0

To create a sparse (structured) version of the identity matrix you can use `Diagonal`, which stores only the diagonal entries. Given an input matrix, `Diagonal` builds a diagonal matrix from its main diagonal.

In [43]:
Imatsec = Diagonal(ones(5, 5))

5×5 Diagonal{Float64,Array{Float64,1}}:
 1.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅   1.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅   1.0
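
As a rough sketch of why the `Diagonal` form is smaller, the example below builds a 3×3 identity both ways and compares the bytes used by the stored numbers. The names `dense_I` and `diag_I` are made up here; `diag` is the field in which `Diagonal` keeps its entries.

```julia
using LinearAlgebra

dense_I = Matrix{Float64}(I, 3, 3)   # stores all 9 entries
diag_I  = Diagonal(ones(3))          # built from a vector; stores 3 entries

x = [1.0, 2.0, 3.0]

same_values = (dense_I == diag_I)    # element-wise equal
y_dense = dense_I * x                # both behave as the identity
y_diag  = diag_I * x
y_unif  = I * x                      # `I` itself needs no size at all

println(sizeof(dense_I))             # 72 bytes (9 × 8)
println(sizeof(diag_I.diag))         # 24 bytes (3 × 8)
```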

You can see that the `Diagonal` version (`Imatsec`, 88 bytes) takes up less space than the dense version (`Imatfirst`, 240 bytes):

In [44]:
varinfo() # see what objects are in your session

| name            |      size | summary                                |
|:--------------- | ---------:|:-------------------------------------- |
| A               | 240 bytes | 5×5 Array{Float64,2}                   |
| B               | 680 bytes | 8×10 Array{Float64,2}                  |
| Base            |           | Module                                 |
| C               | 680 bytes | 2×40 Array{Float64,2}                  |
| Core            |           | Module                                 |
| Imatfirst       | 240 bytes | 5×5 Array{Float64,2}                   |
| Imatsec         |  88 bytes | 5×5 Diagonal{Float64,Array{Float64,1}} |
| InitIntArray    | 240 bytes | 5×5 Array{Int64,2}                     |
| InitStringArray | 240 bytes | 5×5 Array{String,2}                    |
| Main            |           | Module                                 |
| a               | 160 bytes | 15-element Array{Float64,1}            |
| eucnormsq       |   8 bytes | Float64                                |
| exp_a           | 160 bytes | 15-element Array{Float64,1}            |
| filt_a          | 104 bytes | 8-element Array{Float64,1}             |
| mask            | 120 bytes | 8×10 Array{Bool,2}                     |
| red_a           |   8 bytes | Float64                                |


As mentioned before, if you want to do element-wise operations on an array you use dot notation. To demonstrate let's first generate a random matrix.

In [45]:
A = randn(4, 5)

4×5 Array{Float64,2}:
 -0.869221  -0.385225   -0.600609   1.00794   -1.48471  
 -1.10019   -1.79143    -0.241786   0.435377   0.339543 
  1.50909    0.777673   -1.49495    0.113104   0.0440349
  0.681267  -0.0608676   0.311107  -0.673308  -1.2748   

Now we square every element of __A__ using the dot syntax:

In [46]:
A.^2

4×5 Array{Float64,2}:
 0.755545  0.148398    0.360731   1.01595    2.20436   
 1.21042   3.20921     0.0584604  0.189553   0.11529   
 2.27735   0.604776    2.23486    0.0127926  0.00193907
 0.464125  0.00370486  0.0967876  0.453344   1.62511   

Similarly, we can do element-wise division between two matrices. Below we create a new random matrix __B__ and then divide the elements of __A__ by their corresponding elements in __B__.

In [47]:
B =  randn(4, 5)

4×5 Array{Float64,2}:
 -0.130132   -1.50535   -1.94737   -1.00755    0.158515
 -0.0305581   1.35545    1.36826    0.431192   0.37851 
  0.198628    0.172688  -0.271589  -0.225997  -0.958321
  1.15803     0.197988   1.74994    0.774213  -0.248607

In [48]:
A ./ B

4×5 Array{Float64,2}:
  6.67951    0.255904   0.30842   -1.00039   -9.36635  
 36.0033    -1.32164   -0.176711   1.0097     0.897053 
  7.59754    4.50334    5.50444   -0.500467  -0.0459501
  0.588299  -0.307431   0.177782  -0.869668   5.12776  
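
The dot syntax is easiest to verify on small integer matrices. In this sketch `X` and `Y` are hypothetical inputs; note in particular that `X .* Y` (element-wise) and `X * Y` (matrix product) give different results.

```julia
X = [1 2;
     3 4]
Y = [10 20;
     30 40]

sq       = X .^ 2     # [1 4; 9 16]
ratio    = Y ./ X     # [10.0 10.0; 10.0 10.0]
elemprod = X .* Y     # element-wise product: [10 40; 90 160]
matprod  = X * Y      # matrix product: [70 100; 150 220]
shifted  = X .+ 1     # a scalar is broadcast to every element
roots    = sqrt.(sq)  # a dot after a function name applies it element-wise
```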

Many basic functions can be applied to arrays: `sum`, `sort`, `mean` (available after `using Statistics`), etc.

In [49]:
A = [[1 -1 2 3]; [4 -3 1 0]; [7 -3 -3 2]]

3×4 Array{Int64,2}:
 1  -1   2  3
 4  -3   1  0
 7  -3  -3  2

To sum all the elements of **A**:

In [50]:
A

3×4 Array{Int64,2}:
 1  -1   2  3
 4  -3   1  0
 7  -3  -3  2

In [51]:
sum(A)

10

In [52]:
sum(A, dims = 1) #sums each column

1×4 Array{Int64,2}:
 12  -7  0  5

In [53]:
sum(A, dims = 2) #sums each row

3×1 Array{Int64,2}:
 5
 2
 3
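
The `dims` keyword works the same way for `mean`, which lives in the `Statistics` standard library. A small sketch on a made-up 2×2 matrix `M`; note that reductions along a dimension keep that dimension with length 1, and `vec` flattens the result into a plain vector.

```julia
using Statistics  # standard library that provides mean

M = [1 2;
     3 4]

colsum     = sum(M, dims = 1)    # 1×2 matrix: [4 6]
rowsum     = sum(M, dims = 2)    # 2×1 matrix with entries 3 and 7
colmean    = mean(M, dims = 1)   # 1×2 matrix: [2.0 3.0]
total_mean = mean(M)             # 2.5

rowsum_vec = vec(rowsum)         # [3, 7] as a one-dimensional array
```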

The `sort` function will sort the array along the indicated dimension.

In [54]:
A

3×4 Array{Int64,2}:
 1  -1   2  3
 4  -3   1  0
 7  -3  -3  2

In [55]:
sort(A, dims = 1) #sort each column in ascending order

3×4 Array{Int64,2}:
 1  -3  -3  0
 4  -3   1  2
 7  -1   2  3

In [56]:
sort(A, dims = 1, rev = true) #sort each column in descending order

3×4 Array{Int64,2}:
 7  -1   2  3
 4  -3   1  2
 1  -3  -3  0

In [57]:
sort(A, dims = 2) #sort each row in ascending order

3×4 Array{Int64,2}:
 -1   1  2  3
 -3   0  1  4
 -3  -3  2  7
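
For one-dimensional arrays no `dims` argument is needed. The sketch below, on a hypothetical vector `v`, also shows two related functions: `sortperm`, which returns the indices that would sort the array, and `sort!`, which sorts in place.

```julia
v = [3, 1, 2]

sorted = sort(v)              # new vector [1, 2, 3]; v is unchanged
desc   = sort(v, rev = true)  # [3, 2, 1]
order  = sortperm(v)          # [2, 3, 1]: v[order] is sorted

before = copy(v)              # keep the original order for comparison
sort!(v)                      # the `!` version modifies v itself

println(v)                    # [1, 2, 3]
```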

# Exercise 3
* Create a 5 by 8 random array called *B* using **randn**.
* Find the elements of *B* that are less than 0.2.
* Retrieve the number of rows and columns of *B*.
* Multiply every element of *B* by 3 and assign that to a new array called *C*.
* Sort each row of *C* in ascending order.

In this lesson we covered:
* Single and multi-dimensional arrays.
* Array indexing.
* Applying functions to arrays.