# III. Working with arrays

Julia has built-in functions that can be used to create arrays of random numbers. We can use the `rand` function to generate random floats between 0 and 1.

In [1]:
a = rand(20);
show(a)

[0.9502793786615584, 0.6489044338385268, 0.07896123123192056, 0.27613725200094685, 0.16017186707719278, 0.5882327899003803, 0.552095086652632, 0.8231710665592555, 0.018850993756785117, 0.9371906592003292, 0.009127078433834868, 0.8169514939171134, 0.0808069470042374, 0.7375705077008241, 0.9589434299012154, 0.8830576340985039, 0.5388683535159571, 0.6661376860029673, 0.9547808334558365, 0.813868845950106]

In [2]:
typeof(a)

Array{Float64,1}

The `rand` function can be used to sample from an arbitrary set of numbers. Here we sample three values from the values 1 through 20.

In [3]:
rand(1:20, 3)

3-element Array{Int64,1}:
  5
 15
 12
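The first argument to `rand` does not have to be a range; any indexable collection works. The sketch below uses two made-up vectors (`primes` and `sides`, both hypothetical names chosen for illustration). The drawn values will differ from run to run.

```julia
# Hypothetical collections, chosen only for illustration.
primes = [2, 3, 5, 7, 11]
draws = rand(primes, 4)      # four values sampled (with replacement) from primes

# Without a count argument, rand returns a single element instead of an array.
sides = [:heads, :tails]
coin = rand(sides)

println(draws)
println(coin)
```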

Here we use indexing to get the first three elements of __a__.

In [4]:
a[1:3]

3-element Array{Float64,1}:
 0.9502793786615584 
 0.6489044338385268 
 0.07896123123192056

The __end__ keyword can be used to go to the last element along the dimension. The following will get the elements of __a__ starting from index 14 to the end.

In [5]:
a[14:end]

7-element Array{Float64,1}:
 0.7375705077008241
 0.9589434299012154
 0.8830576340985039
 0.5388683535159571
 0.6661376860029673
 0.9547808334558365
 0.813868845950106 

You can use a strided range to skip over elements of the array. For example, `2:2:end` selects every second element of __a__, starting with the second element.

In [6]:
show(a)

[0.9502793786615584, 0.6489044338385268, 0.07896123123192056, 0.27613725200094685, 0.16017186707719278, 0.5882327899003803, 0.552095086652632, 0.8231710665592555, 0.018850993756785117, 0.9371906592003292, 0.009127078433834868, 0.8169514939171134, 0.0808069470042374, 0.7375705077008241, 0.9589434299012154, 0.8830576340985039, 0.5388683535159571, 0.6661376860029673, 0.9547808334558365, 0.813868845950106]

In [7]:
show(a[2:2:end])

[0.6489044338385268, 0.27613725200094685, 0.5882327899003803, 0.8231710665592555, 0.9371906592003292, 0.8169514939171134, 0.7375705077008241, 0.8830576340985039, 0.6661376860029673, 0.813868845950106]
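The general form of a strided range is `start:step:stop`. As a small sketch on a deterministic vector (the name `v` is just for illustration), here are a few other strides, including a negative one:

```julia
v = collect(10:10:100)        # [10, 20, ..., 100]

odd_positions = v[1:2:end]    # elements at indices 1, 3, 5, 7, 9
every_third   = v[1:3:end]    # elements at indices 1, 4, 7, 10
reversed      = v[end:-1:1]   # a negative step walks the array backwards

println(odd_positions)        # [10, 30, 50, 70, 90]
println(every_third)          # [10, 40, 70, 100]
println(reversed)
```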

You can use `length` to get the number of elements in the array:

In [8]:
length(a)

20

Julia provides a few very useful functions that are easy to understand in the context of one-dimensional arrays: `map`, `filter`, `reduce`, `mapreduce`.

In [9]:
a = randn(15)
show(a)

[-0.09087650081447488, 0.309385718103435, 0.645389177402225, -0.44654574569972055, -1.5621250770077366, 0.5494123853169575, 0.45797836123914576, -0.4224191599812829, 0.2511982073362174, -1.2411314116266987, -0.8393720726627579, -1.2405491123080576, -0.5224046873071889, -0.6877102127875346, 0.6279727478559269]

The `map` function applies a function elementwise to an array. Here we take the exponential of every element of __a__. The first argument to `map` is the function you want to apply to every element of the second argument. The function can be an anonymous function, a user-defined function, a built-in function, etc.

In [10]:
exp_a = map(exp, a)
show(exp_a)

[0.9131304748059139, 1.362587843953529, 1.9067289407108685, 0.6398344898393181, 0.20968999001367045, 1.7322348321244125, 1.5808747945196868, 0.6554592395223409, 1.2855648674683333, 0.28905699041970806, 0.4319816913964958, 0.28922535712342895, 0.5930926294914288, 0.5027258874524138, 1.8738080448421146]
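To make the three kinds of function arguments concrete, here is a minimal sketch on a small vector. The names `x` and `halve` are hypothetical and only serve the example.

```julia
x = [1.0, 4.0, 9.0]

# A built-in function.
roots = map(sqrt, x)            # [1.0, 2.0, 3.0]

# An anonymous function.
shifted = map(t -> t + 1, x)    # [2.0, 5.0, 10.0]

# A user-defined function (hypothetical name).
halve(t) = t / 2
halves = map(halve, x)          # [0.5, 2.0, 4.5]

println(roots)
println(shifted)
println(halves)
```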

The `filter` function will only return elements that satisfy a specified condition. Here we return elements of __a__ greater than zero.

In [11]:
filt_a = filter(x -> x > 0, a)
show(filt_a)

[0.309385718103435, 0.645389177402225, 0.5494123853169575, 0.45797836123914576, 0.2511982073362174, 0.6279727478559269]

You can apply a reduction operation using `reduce`. Here we apply `reduce` to an array using the multiplication operator:

In [12]:
red_a = reduce(*, a)

-9.853679363380804e-5

You can easily combine the `map` and `reduce` functions in Julia by using the `mapreduce` function. In the call below, the first argument is the function to map (here, squaring each element of __a__), the second argument is the reduction operator (here `+`, i.e. a sum), and the last argument is the array that `mapreduce` is applied to.

In [13]:
eucnormsq = mapreduce(x -> x ^ 2, +, a)

8.8373962685682
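As a quick check of what `mapreduce` does, the sketch below computes the same quantity three ways on a tiny vector (the variable names are illustrative only). All three should agree.

```julia
v = [3.0, 4.0]

step_by_step = reduce(+, map(x -> x^2, v))   # map first, then reduce
fused        = mapreduce(x -> x^2, +, v)     # same computation in one call
shorthand    = sum(x -> x^2, v)              # sum also accepts a function as its first argument

println(step_by_step)   # 25.0
println(fused)          # 25.0
println(shorthand)      # 25.0
```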

There is also a useful `|>` operator that can be used to pass the result of one function as input to another function. For example, we can rewrite the above expression for __eucnormsq__ using this "pipe-greater-than" syntax:

In [14]:
eucnormsq = map(x -> x ^ 2, a) |> sum

8.8373962685682

Above, we first applied the mapping to __a__ (squaring each element) and then passed the result of `map` to the `sum` function, which added up the squared elements. Two other functions worth mentioning for one-dimensional arrays are `push!` and `pop!`, which add an element to the end of an array and remove its last element, respectively.
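Since `push!` and `pop!` are mentioned without a demonstration, here is a minimal sketch. The trailing `!` is a Julia naming convention for functions that modify their argument; the variable names are made up for the example.

```julia
stack = [1, 2, 3]

push!(stack, 4)           # append to the end; stack is now [1, 2, 3, 4]
last_item = pop!(stack)   # remove and return the last element

println(stack)       # [1, 2, 3]
println(last_item)   # 4
```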

In Julia, you'll often work with multidimensional arrays. Unlike one-dimensional arrays, multidimensional arrays have a fixed size, so functions such as `push!` do not apply to them.

In [15]:
A = [1 2 3; 4 5 6]

2×3 Array{Int64,2}:
 1  2  3
 4  5  6

Generating random matrices and indexing work the same as before. Below we generate an 8 by 10 matrix of random numbers, each drawn from a standard normal distribution.

In [16]:
A = randn(8, 10)

8×10 Array{Float64,2}:
  0.187882   1.98988    -0.391685  …  0.222676    0.450649  -0.762168 
 -0.179144  -1.38998     0.207925     0.135873   -1.93039   -0.581057 
 -0.154895   1.28376     0.568639     0.110747   -0.429426  -0.0925819
  0.842184   0.340546   -0.379539     0.274593    0.341614   0.373682 
 -0.535877  -0.210729   -0.596742     0.534008    0.945624   1.08506  
 -1.34526    0.390527   -0.480117  …  0.619854   -0.584752  -0.265408 
 -0.65545    0.0505797   1.35169      0.0459197  -0.114564  -1.40022  
 -0.909184  -0.189127   -0.120878     0.294521   -0.688271   0.150616 

If we wanted all the rows but only columns 6 through 10 from our matrix __A__:

In [17]:
A[:, 6:10]

8×5 Array{Float64,2}:
  0.390114   -1.08683   0.222676    0.450649  -0.762168 
  0.18279    -0.220802  0.135873   -1.93039   -0.581057 
 -0.478474    1.05734   0.110747   -0.429426  -0.0925819
 -0.349286   -0.568616  0.274593    0.341614   0.373682 
 -0.0299098  -1.39258   0.534008    0.945624   1.08506  
  2.02454    -1.3943    0.619854   -0.584752  -0.265408 
 -1.19623     0.824063  0.0459197  -0.114564  -1.40022  
 -0.649258   -1.46184   0.294521   -0.688271   0.150616 

If you wanted rows two through four and only columns 1, 4, and 8 through 10 of __A__:

In [18]:
A[2:4, union(1, 4, 8:10)]

3×5 Array{Float64,2}:
 -0.179144   1.07754  0.135873  -1.93039   -0.581057 
 -0.154895  -1.66101  0.110747  -0.429426  -0.0925819
  0.842184   0.95858  0.274593   0.341614   0.373682 
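An index does not have to be a single range: any vector of integers works. As an alternative sketch to `union`, the column indices can also be built by concatenation. The matrix `M` below is a deterministic stand-in so the result is easy to verify by hand.

```julia
M = reshape(1:40, 4, 10)   # 4×10 matrix filled column by column with 1, 2, ..., 40

cols = [1; 4; 8:10]        # vertical concatenation gives [1, 4, 8, 9, 10]
sub = M[2:3, cols]         # rows 2 and 3, selected columns only

println(cols)
println(size(sub))         # (2, 5)
println(sub[1, :])         # [2, 14, 30, 34, 38]
```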

You can also use boolean indexing to extract elements. Here a random 8 x 10 matrix of booleans is generated:

In [19]:
mask = rand(Bool, 8, 10)

8×10 Array{Bool,2}:
 0  1  0  1  0  1  1  1  1  0
 1  0  0  1  0  0  0  0  0  1
 1  0  1  0  0  1  1  1  0  0
 0  0  1  1  1  1  1  1  1  1
 1  1  0  0  1  0  1  0  1  0
 0  1  0  0  1  0  1  0  0  1
 1  0  1  0  1  1  1  0  0  0
 1  0  1  0  0  0  0  0  1  0

The following statement returns the elements of __A__ that correspond to the elements of __mask__ that are *true*.

In [20]:
A[mask]

39-element Array{Float64,1}:
 -0.1791437248201305 
 -0.15489458207326007
 -0.5358771376623253 
 -0.6554500287715558 
 -0.9091840883192736 
  1.9898761893432797 
 -0.21072893782665658
  0.39052720443909555
  0.5686386314913191 
 -0.37953942050208317
  1.3516922291012283 
 -0.12087781544665728
  0.5886737291316595 
  ⋮                  
 -1.3943018968029302 
  0.8240625161550317 
  0.22267637976410337
  0.11074709537884922
  0.27459299085864264
  0.4506487195739312 
  0.3416143141996625 
  0.9456239663485425 
 -0.6882708510961195 
 -0.5810567708994436 
  0.3736822346782938 
 -0.26540757247811403

Similarly if you wanted to return the elements of __A__ that were, say, greater than zero you could do something like the following:

In [21]:
A[A .> 0]

36-element Array{Float64,1}:
 0.18788202245308927
 0.842184190650599  
 1.9898761893432797 
 1.28376194788543   
 0.3405464475196761 
 0.39052720443909555
 0.05057968438875279
 0.2079253031701892 
 0.5686386314913191 
 1.3516922291012283 
 0.5886737291316595 
 1.0775425259285203 
 0.9585795084341849 
 ⋮                  
 0.11074709537884922
 0.27459299085864264
 0.5340075467021094 
 0.6198543175023281 
 0.04591969230670784
 0.29452105359253977
 0.4506487195739312 
 0.3416143141996625 
 0.9456239663485425 
 0.3736822346782938 
 1.0850612793377061 
 0.15061556113513877

Note the dot notation used above which is necessary here to do an element-wise comparison.
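Conditions can also be combined with the element-wise `.&` and `.|` operators. The following is a small sketch on a hand-written matrix (names are illustrative). Note that the selected elements come back as a vector in column-major order, which is also why the outputs above are one-dimensional.

```julia
M = [-2.0 0.5; 3.0 -0.1]

positive   = M[M .> 0]                  # column-major order: [3.0, 0.5]
in_unit    = M[(M .> 0) .& (M .< 1)]    # each comparison needs its own parentheses
n_positive = count(M .> 0)              # number of true entries in the mask

println(positive)     # [3.0, 0.5]
println(in_unit)      # [0.5]
println(n_positive)   # 2
```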

One thing to be aware of when you assign one array to another variable is that no copy is made: the new variable is just another name for the same array in memory.

In [22]:
B = A

8×10 Array{Float64,2}:
  0.187882   1.98988    -0.391685  …  0.222676    0.450649  -0.762168 
 -0.179144  -1.38998     0.207925     0.135873   -1.93039   -0.581057 
 -0.154895   1.28376     0.568639     0.110747   -0.429426  -0.0925819
  0.842184   0.340546   -0.379539     0.274593    0.341614   0.373682 
 -0.535877  -0.210729   -0.596742     0.534008    0.945624   1.08506  
 -1.34526    0.390527   -0.480117  …  0.619854   -0.584752  -0.265408 
 -0.65545    0.0505797   1.35169      0.0459197  -0.114564  -1.40022  
 -0.909184  -0.189127   -0.120878     0.294521   -0.688271   0.150616 

In [23]:
isequal(B, A)

true

The `===` operator tests whether __B__ and __A__ refer to the same object in memory:

In [24]:
B === A

true

Now let's change some elements of __B__. What do you think will happen to __A__?

In [25]:
B[1, 1:end] .= 999;

In [26]:
B

8×10 Array{Float64,2}:
 999.0       999.0        999.0       999.0       …  999.0       999.0      
  -0.179144   -1.38998      0.207925    1.07754       -1.93039    -0.581057 
  -0.154895    1.28376      0.568639   -1.66101       -0.429426   -0.0925819
   0.842184    0.340546    -0.379539    0.95858        0.341614    0.373682 
  -0.535877   -0.210729    -0.596742    0.96938        0.945624    1.08506  
  -1.34526     0.390527    -0.480117   -0.538183  …   -0.584752   -0.265408 
  -0.65545     0.0505797    1.35169     1.38384       -0.114564   -1.40022  
  -0.909184   -0.189127    -0.120878   -2.21054       -0.688271    0.150616 

Note that even though we changed the elements of __B__ the elements of the original array __A__ also changed.

In [27]:
A

8×10 Array{Float64,2}:
 999.0       999.0        999.0       999.0       …  999.0       999.0      
  -0.179144   -1.38998      0.207925    1.07754       -1.93039    -0.581057 
  -0.154895    1.28376      0.568639   -1.66101       -0.429426   -0.0925819
   0.842184    0.340546    -0.379539    0.95858        0.341614    0.373682 
  -0.535877   -0.210729    -0.596742    0.96938        0.945624    1.08506  
  -1.34526     0.390527    -0.480117   -0.538183  …   -0.584752   -0.265408 
  -0.65545     0.0505797    1.35169     1.38384       -0.114564   -1.40022  
  -0.909184   -0.189127    -0.120878   -2.21054       -0.688271    0.150616 

If you want to avoid this behavior then you can use the `copy` function to make a copy of the original array:

In [28]:
C = copy(A);

In [29]:
isequal(C, A)

true

In [30]:
C === A

false

What the above shows is that __C__ points to a different location in memory than __A__, so you can change __C__ without affecting __A__.
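The difference between assignment and `copy` can be condensed into a few lines. This is a minimal sketch on a small vector; `orig`, `alias`, and `dup` are hypothetical names.

```julia
orig  = [1, 2, 3]
alias = orig          # another name for the same array
dup   = copy(orig)    # an independent array with the same contents

alias[1] = 100        # visible through orig as well
dup[2]   = -1         # does not touch orig

println(orig)         # [100, 2, 3]
println(dup)          # [1, -1, 3]
println(alias === orig, " ", dup === orig)   # true false
```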

Let's move on and look at some basic functions and operations that you can use with arrays.

To check the number of dimensions of an array you can use the `ndims` function:

In [31]:
A = randn(8, 10)

8×10 Array{Float64,2}:
  0.602441  2.63243     -0.354204   …   0.128718  -0.462093   0.0201338
  0.300617  0.0349001   -0.0521646      0.428311  -0.551617  -0.129815 
  0.15071   1.12909     -2.16815        1.10208    0.744446  -1.08608  
  0.551117  0.0866609    0.850275       0.173404   1.34914    2.20036  
  0.546104  1.20459     -2.58967       -0.757207   1.23035    0.912463 
  1.70966   1.95799     -1.28433    …   1.42438   -0.207948  -0.0488128
 -0.697557  0.659376    -0.137753      -1.26174    2.02935    0.648953 
 -0.26325   0.00588833   0.363996       0.79115   -0.301912  -0.148045 

In [32]:
ndims(A)

2

To get the number of rows and columns use `size`:

In [33]:
size(A)

(8, 10)

As before, `length` returns the number of elements in the matrix.

In [34]:
length(A)

80

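The `reshape` function returns an array with the same data arranged in a new shape (the elements are read in column-major order). The original array keeps its shape: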

In [35]:
A

8×10 Array{Float64,2}:
  0.602441  2.63243     -0.354204   …   0.128718  -0.462093   0.0201338
  0.300617  0.0349001   -0.0521646      0.428311  -0.551617  -0.129815 
  0.15071   1.12909     -2.16815        1.10208    0.744446  -1.08608  
  0.551117  0.0866609    0.850275       0.173404   1.34914    2.20036  
  0.546104  1.20459     -2.58967       -0.757207   1.23035    0.912463 
  1.70966   1.95799     -1.28433    …   1.42438   -0.207948  -0.0488128
 -0.697557  0.659376    -0.137753      -1.26174    2.02935    0.648953 
 -0.26325   0.00588833   0.363996       0.79115   -0.301912  -0.148045 

In [36]:
C = reshape(A, 2, 40)

2×40 Array{Float64,2}:
 0.602441  0.15071   0.546104  -0.697557  …  -1.08608   0.912463    0.648953
 0.300617  0.551117  1.70966   -0.26325       2.20036  -0.0488128  -0.148045

In [37]:
size(C)

(2, 40)
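One detail worth knowing is that `reshape` on an `Array` does not copy the data, so the reshaped array and the original share memory. The sketch below, on a small hand-written matrix, illustrates both the column-major fill order and the sharing.

```julia
M = [1 2 3; 4 5 6]       # 2×3
R = reshape(M, 3, 2)     # 3×2, elements taken in column-major order: 1, 4, 2, 5, 3, 6

println(R)               # [1 5; 4 3; 2 6]

R[1, 1] = 99             # R shares memory with M ...
println(M[1, 1])         # ... so this prints 99
```

If an independent reshaped array is needed, wrapping the call as `copy(reshape(M, 3, 2))` is one option.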

There are a few ways to initialize arrays. Here we use constructor notation along with `undef` to allocate an array without initializing its contents: an uninitialized `String` array shows `#undef` entries, and an uninitialized integer array contains whatever values happened to be in memory:

In [38]:
InitStringArray = Array{String}(undef, 5, 5)

5×5 Array{String,2}:
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef

In [39]:
InitIntArray = Array{Int64}(undef, 5, 5)

5×5 Array{Int64,2}:
 140051130359280  140051130359600  …  140051130360240  140051130360560
 140051130359344  140051130359664     140051130360304  140051130360624
 140051130359408  140051130359728     140051130360368  140051130360688
 140051130359472  140051130359792     140051130360432  140051130360752
 140051130359536  140051130359856     140051130360496  140051130360816
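Because the contents of an `undef` array are arbitrary, the entries should be written before they are read. A minimal sketch of how one might check and fill such arrays (variable names are illustrative):

```julia
s = Array{String}(undef, 3)

println(isassigned(s, 1))    # false: slot 1 holds no String yet
s[1] = "filled"
println(isassigned(s, 1))    # true

n = Array{Int64}(undef, 3)   # contents are whatever was in memory
fill!(n, 0)                  # overwrite every entry before use
println(n)                   # [0, 0, 0]
```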

The `zeros` function is available to create a matrix of zeros; the `fill` function can create a matrix with an arbitrary element.

In [40]:
zeros(4, 5)

4×5 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0

In [41]:
fill("foo", 4, 5)

4×5 Array{String,2}:
 "foo"  "foo"  "foo"  "foo"  "foo"
 "foo"  "foo"  "foo"  "foo"  "foo"
 "foo"  "foo"  "foo"  "foo"  "foo"
 "foo"  "foo"  "foo"  "foo"  "foo"

You can use the following constructor notation to create an identity matrix:

In [42]:
using LinearAlgebra # standard library with linear algebra functionality

Imatfirst = Array{Float64}(I, 5, 5)

5×5 Array{Float64,2}:
 1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0

To create a sparse version of the identity matrix you can use `Diagonal`. Given an input matrix, `Diagonal` builds a diagonal matrix from its diagonal entries and stores only those entries.

In [43]:
Imatsec = Diagonal(ones(5, 5))

5×5 Diagonal{Float64,Array{Float64,1}}:
 1.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅   1.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅   1.0

You can see the sparse version takes up less space:

In [44]:
varinfo() # see what objects are in your session

| name            |      size | summary                                |
|:--------------- | ---------:|:-------------------------------------- |
| A               | 680 bytes | 8×10 Array{Float64,2}                  |
| B               | 680 bytes | 8×10 Array{Float64,2}                  |
| Base            |           | Module                                 |
| C               | 680 bytes | 2×40 Array{Float64,2}                  |
| Core            |           | Module                                 |
| Imatfirst       | 240 bytes | 5×5 Array{Float64,2}                   |
| Imatsec         |  88 bytes | 5×5 Diagonal{Float64,Array{Float64,1}} |
| InitIntArray    | 240 bytes | 5×5 Array{Int64,2}                     |
| InitStringArray | 240 bytes | 5×5 Array{String,2}                    |
| Main            |           | Module                                 |
| a               | 160 bytes | 15-element Array{Float64,1}            |
| eucnormsq       |   8 bytes | Float64                                |
| exp_a           | 160 bytes | 15-element Array{Float64,1}            |
| filt_a          |  88 bytes | 6-element Array{Float64,1}             |
| mask            | 120 bytes | 8×10 Array{Bool,2}                     |
| red_a           |   8 bytes | Float64                                |
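The size difference can also be checked programmatically. The sketch below assumes `Base.summarysize` (an unexported function in Base) as the measuring tool and builds the diagonal matrix directly from a vector of ones rather than from a full matrix; the variable names are hypothetical.

```julia
using LinearAlgebra

dense_eye = Matrix{Float64}(I, 5, 5)
diag_eye  = Diagonal(ones(5))            # built from a vector of diagonal entries

println(dense_eye == diag_eye)           # true: same values
println(Base.summarysize(dense_eye))     # stores all 25 entries
println(Base.summarysize(diag_eye))      # stores only the 5 diagonal entries
```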


As mentioned before, if you want to do element-wise operations on an array you use dot notation. To demonstrate let's first generate a random matrix.

In [45]:
A = randn(4, 5)

4×5 Array{Float64,2}:
 -0.253533  -1.02351    -0.458424  -0.25999    0.245817
 -1.24669    0.742885   -0.832631   0.434332   1.06566 
 -1.23018    0.252181   -0.205416   0.133172  -2.81548 
 -0.124236   0.0964811  -1.38302    1.48817   -0.652766

Now we square every element of __A__ using the dot syntax:

In [46]:
A.^2

4×5 Array{Float64,2}:
 0.0642792  1.04757    0.210153   0.0675947  0.0604261
 1.55424    0.551878   0.693274   0.188644   1.13563  
 1.51334    0.0635955  0.0421957  0.0177348  7.92692  
 0.0154347  0.0093086  1.91275    2.21465    0.426104 

Similarly, we can do element-wise division between two matrices. Below we create a new random matrix __B__ and then divide the elements of __A__ by their corresponding elements in __B__.

In [47]:
B = randn(4, 5)

4×5 Array{Float64,2}:
 -0.111426   0.302788   0.813317   -0.33638    0.184128
 -0.792704  -0.700075   0.0621022   0.613906   0.378331
 -0.290877   1.50201   -0.931566    0.52897    1.14763 
 -0.816456   0.624224  -1.51396     1.28557   -0.445204

In [48]:
A ./ B

4×5 Array{Float64,2}:
 2.27536   -3.38028    -0.563648  0.772905   1.33503
 1.57271   -1.06115   -13.4074    0.707489   2.81674
 4.22921    0.167896    0.220506  0.251757  -2.45331
 0.152165   0.154562    0.913514  1.1576     1.46622
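The dot matters most for operators that also have a matrix meaning. As a small sketch with hand-written matrices (names chosen for illustration), compare element-wise and matrix multiplication, and note that the dot syntax also applies to scalars and ordinary functions:

```julia
P = [1 2; 3 4]
Q = [10 20; 30 40]

elementwise = P .* Q     # [10 40; 90 160]
matrix_prod = P * Q      # [70 100; 150 220]
shifted     = P .+ 1     # scalar is broadcast: [2 3; 4 5]
roots       = sqrt.(P)   # any function can be broadcast with a trailing dot

println(elementwise)
println(matrix_prod)
println(shifted)
println(roots)
```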

Many basic functions can be applied to arrays: `sum`, `mean`, `sort`, etc. (Note that `mean` is provided by the `Statistics` standard library, so it requires `using Statistics`.)

In [49]:
A = [1 -1 2 3; 4 -3 1 0; 7 -3 -3 2]

3×4 Array{Int64,2}:
 1  -1   2  3
 4  -3   1  0
 7  -3  -3  2

To sum all the elements of **A**:

In [50]:
A

3×4 Array{Int64,2}:
 1  -1   2  3
 4  -3   1  0
 7  -3  -3  2

In [51]:
sum(A)

10

In [52]:
sum(A, dims = 1) #sums each column

1×4 Array{Int64,2}:
 12  -7  0  5

In [53]:
sum(A, dims = 2) #sums each row

3×1 Array{Int64,2}:
 5
 2
 3
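Other reductions accept the same `dims` keyword. The following sketch uses `mean` (from the `Statistics` standard library) and `maximum` on a matrix with the same values as __A__; the name `M` is only for illustration.

```julia
using Statistics   # standard library that provides mean

M = [1 -1 2 3; 4 -3 1 0; 7 -3 -3 2]

overall   = mean(M)                # mean of all 12 elements
col_means = mean(M, dims = 1)      # 1×4 matrix of column means
row_max   = maximum(M, dims = 2)   # 3×1 matrix of row maxima

println(overall)
println(col_means)
println(row_max)
```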

The `sort` function will sort the array along the indicated dimension.

In [54]:
A

3×4 Array{Int64,2}:
 1  -1   2  3
 4  -3   1  0
 7  -3  -3  2

In [55]:
sort(A, dims = 1) #sort each column in ascending order

3×4 Array{Int64,2}:
 1  -3  -3  0
 4  -3   1  2
 7  -1   2  3

In [56]:
sort(A, dims = 1, rev = true) #sort each column in descending order

3×4 Array{Int64,2}:
 7  -1   2  3
 4  -3   1  2
 1  -3  -3  0

In [57]:
sort(A, dims = 2) #sort each row in ascending order

3×4 Array{Int64,2}:
 -1   1  2  3
 -3   0  1  4
 -3  -3  2  7
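For one-dimensional arrays, `sort` needs no `dims` argument and has a few useful relatives. A brief sketch, with a made-up vector `v`:

```julia
v = [3, -4, 1]

sorted_copy = sort(v)             # [-4, 1, 3]; v itself is unchanged
by_abs      = sort(v, by = abs)   # sort by absolute value: [1, 3, -4]
perm        = sortperm(v)         # indices that would sort v: [2, 3, 1]

sort!(v)                          # in-place version; v is now [-4, 1, 3]

println(sorted_copy)
println(by_abs)
println(perm)
println(v)
```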

# Exercise 3
* Create a 5 by 8 random array called *B* using **randn**.
* Find the elements of *B* that are less than 0.2.
* Retrieve the number of rows and columns of *B*.
* Multiply every element of *B* by 3 and assign that to a new array called *C*.
* Sort each row of *C* in ascending order.

In this lesson we covered:
* Single and multi-dimensional arrays.
* Array indexing.
* Applying functions to arrays.