# 2.1 Arrays

Great, so we already have a gradient descent for 1D functions. The next step is to extend it to functions of more than one variable. For example, consider the function

$$ f(x,y) = x^2 + 2y^2 $$

Clearly, the current implementation cannot minimize it. To address this, we have to introduce *arrays*.

Arrays are pretty similar to those in Matlab. Below are some common operations:

In [2]:
# Define a vector
v = [1, 2, 3, 4]

4-element Array{Int64,1}:
 1
 2
 3
 4

In [3]:
# Define a _row_ matrix (notice that it is already a 2-dimensional array)
v = [1 2 3 4]

1×4 Array{Int64,2}:
 1  2  3  4

In [4]:
# Define a matrix
A = [1 2; 3 4]

2×2 Array{Int64,2}:
 1  2
 3  4

In [5]:
# or equivalently
B = [1 2;
     3 4]

A == B

true

Accessing elements is straightforward:

In [6]:
println(v[1])
println(A[1, 2])
println(A[end, end])

1
2
4
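
Whole rows and columns can be picked out in the same way. The following is a small sketch using the matrix `A` defined above; the variable names are only illustrative:

```julia
A = [1 2; 3 4]

first_row = A[1, :]     # every column of row 1, returned as a vector
second_col = A[:, 2]    # every row of column 2
last_entry = A[end]     # a single index walks the matrix column by column

println(first_row)
println(second_col)
println(last_entry)
```

Note that `A[1, :]` gives back a one-dimensional array, not a 1×2 matrix.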


Julia sees one- and two-dimensional arrays as vectors and matrices, so `*` is the matrix multiplication. For elementwise multiplication one must use `.*`:

In [7]:
A * A

2×2 Array{Int64,2}:
  7  10
 15  22

In [8]:
A .* A

2×2 Array{Int64,2}:
 1   4
 9  16

This *a dot implies elementwise operation* rule extends to all functions, even user-defined ones! This is extremely useful:

In [9]:
g(x) = x^2
g.(A)

2×2 Array{Int64,2}:
 1   4
 9  16

In [10]:
sqrt.(A)

2×2 Array{Float64,2}:
 1.0      1.41421
 1.73205  2.0    
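
The dot also works for functions of several arguments, and the `@.` macro saves typing when a whole expression should be elementwise. A minimal sketch, where `h` is a made-up function used only for illustration:

```julia
A = [1 2; 3 4]
h(x, y) = x + 2y           # hypothetical two-argument scalar function

shifted = h.(A, 10)        # the scalar 10 is paired with every entry of A
fused = @. sqrt(A) + A^2   # @. puts a dot on every call and operator
```

Here `fused` is the same as writing `sqrt.(A) .+ A.^2` by hand, and it is again a `Float64` array because of the square root.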

In the `sqrt.(A)` example, note how Julia has changed the *type* of the Array from `Int64` to `Float64`. It is usually convenient to define the type of your arrays from the beginning; thus we can do:

In [11]:
A = [1. 2; 3 4] # the dot after the 1 makes it a float, which promotes the rest of the entries

2×2 Array{Float64,2}:
 1.0  2.0
 3.0  4.0

Of course there are also functions for creating arrays of zeros, ones...

In [12]:
zeros(Float64, 2, 2), ones(Float64, 2, 2)

([0.0 0.0; 0.0 0.0], [1.0 1.0; 1.0 1.0])
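
A few other constructors are worth knowing. This is only a short sketch, and the variable names are illustrative:

```julia
Z = zeros(2, 3)                      # Float64 is the default element type
F = fill(7, 2, 2)                    # 2×2 matrix where every entry is 7
S = [i + j for i in 1:2, j in 1:3]   # comprehension: entry (i, j) is i + j
```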

## 2.1.1 Important differences with respect to Matlab

There are of course plenty of differences between Julia and Matlab, but since arrays are so vital in scientific computing code, we may as well state some of the most important ones here:

* Julia does not automatically grow arrays in an assignment statement. 
* Maybe the most important of all: Julia arrays are **not** copied when assigned to another variable. After `A = B`, changing elements of `B` will modify `A` as well:

In [58]:
A = [1,2,3]
B = A
A[1] = 0
B

3-element Array{Int64,1}:
 0
 2
 3
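
Both points can be checked in a few lines. The sketch below uses `copy` to get an independent array, and shows that assigning past the end of an array fails instead of growing it:

```julia
A = [1, 2, 3]
B = copy(A)        # independent copy: B has its own storage
A[1] = 0           # B is not affected

# Assigning past the end does not grow the array, it throws a BoundsError
grew = try
    A[4] = 10
    true
catch err
    err isa BoundsError ? false : rethrow()
end

push!(A, 10)       # growing has to be requested explicitly
```

After running this, `B` still holds the original values, `grew` is `false`, and `A` has four elements.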

An extensive list can be found here: https://docs.julialang.org/en/v1/manual/noteworthy-differences/#Noteworthy-differences-from-MATLAB-1

## 2.1.2 Important differences with respect to Python

Some other differences with respect to Python are:

* Julia has 1-based indexing, instead of 0-based indexing as in Python.
* Julia's slice indexing includes the last element, unlike in Python. `a[2:3]` in Julia is `a[1:3]` in Python.

In [60]:
A = [1, 2, 3]
A[2:3] # this would be A[1:3] in Python

2-element Array{Int64,1}:
 2
 3
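
As a rough cheat sheet, here are some common Python indexing idioms next to their Julia counterparts. The Python expressions in the comments are given from memory for comparison only:

```julia
A = [10, 20, 30, 40]

first_el = A[1]              # Python: A[0]
last_el = A[end]             # Python: A[-1]
all_but_last = A[1:end-1]    # Python: A[:-1]
every_other = A[1:2:end]     # Python: A[::2]; Julia ranges read start:step:stop
```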

An extensive set of differences can be found at https://docs.julialang.org/en/v1/manual/noteworthy-differences/#Noteworthy-differences-from-Python-1

## 2.1.3 Arrays as lists and concatenation

Thinking back to our gradient descent, apart from extending it to multidimensional functions, it would also be nice to keep a history of the iterations. For this we need some kind of list to store them in.

If you come from Python it may surprise you that there are no lists in Julia. Instead, list operations are done using arrays. For example, pushing an element to the end of an array can be done as:

In [34]:
C = [5, 1]
push!(C, 3)
C

3-element Array{Int64,1}:
 5
 1
 3
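
To keep a history you typically start from an empty array and `push!` onto it inside a loop. The sketch below uses a made-up update rule (halving a number) purely for illustration; `halving_history` is not part of any library:

```julia
# Hypothetical example: record every iterate of a simple update rule
function halving_history(x0, n)
    history = Float64[]      # empty vector that can only hold Float64 values
    x = x0
    for _ in 1:n
        x = x / 2            # stand-in for a real iteration step
        push!(history, x)    # store the new iterate
    end
    return history
end

history = halving_history(1.0, 4)

# The same pattern works for vector-valued iterates
points = Vector{Float64}[]
push!(points, [1.0, 2.0])
```

Declaring the element type up front (`Float64[]` rather than `[]`) keeps the array concretely typed, which is generally better for performance.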

You may have noticed the `!` at the end of `push!`. By convention, Julia functions that modify their arguments have names ending in `!`. Thus, we have two `sort`s: the first doesn't change the array `C` itself:

In [35]:
sort(C), C

([1, 3, 5], [5, 1, 3])

While `sort!` *does* modify it:

In [36]:
sort!(C), C

([1, 3, 5], [1, 3, 5])
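
You can follow the same convention in your own code. A minimal sketch, with the hypothetical names `double!` and `double`:

```julia
# Named with `!` because it overwrites its argument
function double!(v)
    for i in eachindex(v)
        v[i] *= 2
    end
    return v
end

# Non-mutating counterpart: work on a copy and leave the input alone
double(v) = double!(copy(v))

C = [1, 3, 5]
D = double(C)    # C is unchanged at this point
double!(C)       # now C itself holds the doubled values
```

The `!` is just part of the name; Julia does not enforce anything, so it is up to you to keep the promise.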

Anyway, some users will feel comforted to know that arrays can also be expanded by concatenation:

In [37]:
C = [C;2]

4-element Array{Int64,1}:
 1
 3
 5
 2

In [38]:
D = [1 2]; D = [D 3]

1×3 Array{Int64,2}:
 1  2  3

Note that we use a semicolon to concatenate vertically and a space to concatenate horizontally. Using a comma instead returns an array containing the given elements:

In [39]:
D = [1 2]; D = [D, 3]

2-element Array{Any,1}:
  [1 2]
 3     
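
The bracket syntax has function equivalents, `vcat` and `hcat`, and there is also `append!` for adding several elements in place. A short sketch:

```julia
C = [1, 3, 5]
C2 = vcat(C, 2)        # same as [C; 2]

D = [1 2]
D2 = hcat(D, 3)        # same as [D 3]

E = [1, 2]
append!(E, [3, 4])     # adds every element of the second collection, in place
```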

# 2.2 Packages

As you may remember, our stopping criterion in `gradient_descent` involved checking `abs(Df(x))`. Since now `Df(x)` will be an array, we need a way to compute its norm. We could of course code it ourselves, but Julia comes with a rich library environment that has a lot of this work already done. 

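In particular, the function `norm` lives in the package `LinearAlgebra`. It is part of Julia's standard library, so `using LinearAlgebra` is normally enough; the `Pkg.add` call below just shows how packages are installed in general: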

In [9]:
using Pkg
Pkg.add("LinearAlgebra")

  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Installed OpenBLAS_jll ───── v0.3.9+2
 Installed PlotThemes ─────── v1.0.3
 Installed ColorVectorSpace ─ v0.8.5
 Installed ArrayInterface ─── v2.7.0
  Updating `~/.julia/environments/v1.3/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.3/Manifest.toml`
  [4fba245c] ↑ ArrayInterface v2.6.2 ⇒ v2.7.0
  [c3611d14] ↑ ColorVectorSpace v0.8.4 ⇒ v0.8.5
  [4536629a] ↑ OpenBLAS_jll v0.3.9+1 ⇒ v0.3.9+2
  [ccf2f8ad] ↑ PlotThemes v1.0.2 ⇒ v1.0.3


In [40]:
using LinearAlgebra # since we only need the function `norm` we could also do `using LinearAlgebra: norm`

a = [1, 1]
norm(a)

1.4142135623730951
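
By default `norm` computes the Euclidean norm, but it accepts a second argument `p` to select other norms. A small sketch:

```julia
using LinearAlgebra: norm

a = [3.0, -4.0]

n2 = norm(a)           # Euclidean (2-)norm, the default
n1 = norm(a, 1)        # sum of absolute values
ninf = norm(a, Inf)    # largest absolute value
```

For a stopping criterion any of these would do; the default is the usual choice.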

Great! We are again prepared for coding. Let us first define the function we want to minimize and its gradient. We make it grow faster in the $y$ direction than in the $x$ direction to check that everything works properly:

![surface plot](surface_plot.png)

In [41]:
g(x) = x[1]^2 + 2*x[2]^2
Dg(x) = [2*x[1], 4*x[2]]

Dg (generic function with 1 method)
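
When writing gradients by hand it is easy to slip, so a quick sanity check against finite differences can help. The following is only a sketch: `numerical_gradient` is a hypothetical helper written for this check, and the step `h` is an arbitrary choice.

```julia
using LinearAlgebra: norm

g(x) = x[1]^2 + 2*x[2]^2
Dg(x) = [2*x[1], 4*x[2]]

# Central-difference approximation of the gradient (hypothetical helper)
function numerical_gradient(f, x; h = 1e-6)
    grad = zeros(length(x))
    for i in eachindex(x)
        step = zeros(length(x))
        step[i] = h                                    # perturb only coordinate i
        grad[i] = (f(x + step) - f(x - step)) / (2h)
    end
    return grad
end

x0 = [1.0, -2.0]
err = norm(numerical_gradient(g, x0) - Dg(x0))   # should be close to zero
```

If `err` is not tiny, the analytic gradient and the function probably disagree.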

#### Exercise 2

Build the function $g(x) = x_1^2 + 2x_2^2$ and its gradient using only matrix and scalar-matrix operations (Hint: the transpose of the vector `x` is `x'`).

In [42]:
# Your solution goes here

#### Exercise 3
Bring the gradient descent of the first notebook into the multidimensional realm, and make it output the _history_ of `x`s and `f(x)`s (Hint: consider using the concatenation syntax explained above. You may also use `push!`.)

In [61]:
""" Gradient descent for multidimensional functions. Stops when the norm of the gradient is smaller than `TOL`,
    or when the maximum number of iterations `maxiter` has been reached."""
function gradient_descent(f, Df, x; alpha = 0.1, TOL = 1e-10, maxiter = 1000, verbose = false)
    
    # Your code goes here

end

gradient_descent

In [62]:
# Let's test it
xn, fn = gradient_descent(g, Dg, [1., 1], alpha = 0.01, verbose = true); # We can add a semicolon to mute the output

Iter. 100,	x = [0.13261955589475316, 0.016870319358849667],	f(x) = 0.01815716195626071
Iter. 200,	x = [0.017587946605721556, 0.0002846076752695776],	f(x) = 0.0003094978688633571
Iter. 300,	x = [0.002332505667951425, 4.801422373777547e-6],	f(x) = 5.440628798339146e-6
Iter. 400,	x = [0.00030933586580571244, 8.10015288223532e-8],	f(x) = 9.568869099626508e-8
Iter. 500,	x = [4.102398514547257e-5, 1.3665216597881629e-9],	f(x) = 1.6829673609507166e-9
Iter. 600,	x = [5.4405826910255245e-6, 2.3053656811411595e-11],	f(x) = 2.959994001894948e-11
Iter. 700,	x = [7.215276602924866e-7, 3.889225527978334e-13],	f(x) = 5.206021645674525e-13
Iter. 800,	x = [9.56886778737698e-8, 6.561247671558511e-15],	f(x) = 9.156323073230169e-15
Iter. 900,	x = [1.2690189963775444e-8, 1.1069034361170077e-16],	f(x) = 1.6104092131670702e-16
Iter. 1000,	x = [1.6829673572159529e-9, 1.8673814466701936e-18],	f(x) = 2.8323791254544486e-18


# Bonus: the use of `...`

If you used `push!`, though, you may run into a little annoyance: *the output is an array of arrays*. We can solve this by doing `hcat(xn...)`:

In [69]:
xn = [[1,2],[3,4]]
hcat(xn...)

2×2 Array{Int64,2}:
 1  3
 2  4

Here we have used `...` to make Julia pass the elements of `xn` as separate arguments to `hcat`. The same works for any function; for example:

In [21]:
add(a,b) = a + b

add([2, 3]...)

5
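
The same `...` can be used in a function *definition* to accept any number of arguments, and `reduce(hcat, xn)` is an alternative to splatting when the array of arrays is long. A brief sketch, where `total` is a made-up function:

```julia
# `args...` in a definition collects all the arguments into a tuple
total(args...) = sum(args)
t = total(1, 2, 3)

xn = [[1, 2], [3, 4], [5, 6]]
M1 = hcat(xn...)          # splat: each inner vector becomes one argument
M2 = reduce(hcat, xn)     # same result without splatting
```

Both `M1` and `M2` are 2×3 matrices whose columns are the vectors in `xn`.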

Well, enough of seeing raw data. In the next notebook we will learn how to visualize it!