# Introduction: basic Julia usage

## Why should I learn Julia?


[Julia](https://julialang.org/) is a relatively new programming language, developed at MIT, with version 1.0 released in August 2018. Even though it is so recent, it has taken the scientific community by storm, and many serious large-scale projects have started using Julia.

The [Julia documentation](https://docs.julialang.org/en/v1/) outlines the main facts and features of Julia. Check out this [GitHub page](https://github.com/Datseris/whyjulia-manifesto) for a nice summary of the reasons to use Julia.

# Basics of Julia

In this block we give an overview of basic Julia syntax, data structures, iteration, and functions. The block assumes familiarity with programming, in the sense of reasoning about code, and with the concept of an interactive development environment (or dynamic programming language) where a program may be written and executed interactively, line by line. It does not, however, assume familiarity with any specific programming language.

## Basic syntax


### Assignment

Assignment of variables in Julia is done with the `=` sign.

In [None]:
x = 1

1

In [None]:
x

1

You can assign *anything* to a variable binding. This includes functions, modules, data types, or whatever you can come up with.

Variable names can include practically any Unicode character. Additionally, most Julia editing environments offer "LaTeX completion": typing e.g. `\delta` and then pressing TAB creates the corresponding Unicode character using the LaTeX syntax.

In [None]:
δ = 4 # type `\delta` and then press tab!

4

You can assign multiple variables to multiple values using commas.

In [None]:
😺, 😀, 😞 = 1, 0, -1

(1, 0, -1)

In [None]:
😺 + 😞 == 😀 # legitimate code. Not good for readability though ;)

true

Strings are created between double quotes `"` while the single quotes `'` are used for characters only.

In [None]:
안녕하세요 = "人人生而自由，在尊严和权利上一律平等。"

"人人生而自由，在尊严和权利上一律平等。"

In [None]:
char = '안' # for characters, Julia prints their Unicode information

'안': Unicode U+C548 (category Lo: Letter, other)

Since assignment returns the value, by default this value is printed. This is **AMAZING**, but you can also silence printing by adding `;` to the end of the expression:

In [None]:
x = 3;

Lastly, you can interpolate any expression into a string using `$(expression)`

In [None]:
"the value of the cat face (😺) is $(😺)"

"the value of the cat face (😺) is 1"

In [None]:
"I am doing math inside a string: $(π^2 - x)"

"I am doing math inside a string: 6.869604401089358"

### Math operations

Basic math operators are `+, -, *, /` and `^` for power.

In [None]:
x = 3
y = x^2.6

17.398638404385867
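A few related operators are worth knowing alongside these. The snippet below is a minimal sketch; the variable names are only for illustration.

```julia
q = 7 / 2             # `/` always returns a floating-point number, here 3.5
d = 7 ÷ 2             # integer division (type `\div` and then TAB), here 3
r = 7 % 2             # remainder of the division, here 1
p = 2^10              # an integer power of an integer stays an integer, here 1024
f = 1 // 3 + 1 // 6   # `//` builds exact rational numbers, here 1//2
```

Note that `/` gives a float even when the division is exact, so `4 / 2` is `2.0` and not `2`.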

Most Julia operators have an `=` version, which updates a variable using its own current value:

In [None]:
x += 3 # x = x + 3
x -= 3
x *= 2
x /= 2

3.0

Literal numbers can multiply anything without having to put `*` in between, as long as the number is on the left side:

In [None]:
5x - 12.54y * 1.2e-5x

14.992145558678724

## Type basics

Everything that exists in Julia has a certain **Type** (e.g. numbers can be integers, floats, or rationals). This "type system" is instrumental for the inner workings of Julia, and is mainly what enables Julia to have performance matching static languages like C.

The type system also enables **Multiple Dispatch**, one of Julia's greatest features, which we will cover in the second lecture.

To find the type of a thing in Julia you simply use `typeof(thing)`:

In [None]:
x = 3
typeof(x)

Int64

In [None]:
typeof(1.5)

Float64

In [None]:
typeof(1.5f0)

Float32

In [None]:
typeof("asdf")

String
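Types can also be compared and queried, not just displayed. The following is a small sketch using the built-in `isa`, `<:` and `supertype`; the variable names are made up for illustration, and `Int64` assumes a 64-bit machine.

```julia
x = 3

is_int        = x isa Integer          # true: an Int64 is a kind of Integer
is_number     = typeof(x) <: Number    # true: `<:` reads "is a subtype of"
float_parent  = supertype(Float64)     # AbstractFloat
rational_type = typeof(1 // 2)         # a Rational number type
mixed_type    = typeof(x + 1.5)        # Float64: mixing an integer and a float gives a float
```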

## Basic collection datastructures

Indexing a collection (like an array or a dictionary) in Julia is done with brackets: `collection[index]`.

In **ordered collections** (where the elements are specified by their order rather than some key), indexing is done using the positive integers. This means that **indexing in Julia starts from 1, which is exceptionally good,** because the index matches the element order: the 5th element has index 5.


### Tuples
Tuples are **immutable ordered collections** of elements of any type. They are most useful when the elements are not all of the same type, and they are intended only for small collections.

Syntax:

```julia
(item1, item2, ...)
```

In [None]:
myfavoritethings = ("purple", '🥁', π)

("purple", '🥁', π)

In [None]:
myfavoritethings[1]

"purple"

You can extract multiple values into variables from any collection using commas.

In [None]:
a, b, c = myfavoritethings
c

π = 3.1415926535897...

The type of a tuple is determined by the types of its constituents.

In [None]:
typeof(myfavoritethings)

Tuple{String, Char, Irrational{:π}}

### Dictionaries
Dictionaries are **unordered mutable collections** of key-value pairs. They are intended for sets of relational data, and typically you want all keys to share one type and all values to share one type.

Syntax:
```julia
Dict(key1 => value1, key2 => value2, ...)
```

A good example of a dictionary is a contacts list, where we associate names with phone numbers.

In [None]:
myphonebook = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")

Dict{String, String} with 2 entries:
  "Jenny"        => "867-5309"
  "Ghostbusters" => "555-2368"

In [None]:
myphonebook["Jenny"]

"867-5309"

New entries can be added to the above dictionary, because it is mutable *(I will talk in more detail about mutability in a moment, but for now mutable means that "you can change its values without creating a copy or a new collection")*. The key of the entry must be of type `String` and the value of the entry must be of type `String`, because these are the types in the original dictionary.

In [None]:
myphonebook["Buzz Lightyear"] = "∞ and beyond"

myphonebook # this displays the phonebook

Dict{String, String} with 3 entries:
  "Jenny"          => "867-5309"
  "Buzz Lightyear" => "∞ and beyond"
  "Ghostbusters"   => "555-2368"

Dictionaries have a specific type for keys and values. First type is the type of key, second is the type of value.

In [None]:
typeof(myphonebook)

Dict{String, String}
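Besides indexing, a few built-in functions cover most everyday dictionary use. This is a minimal sketch with made-up data (`ages` and the names in it are hypothetical):

```julia
ages = Dict("Ada" => 36, "Alan" => 41)   # hypothetical data, a Dict{String, Int64}

has_ada   = haskey(ages, "Ada")          # check whether a key exists
grace_age = get(ages, "Grace", 0)        # key is missing, so the default 0 is returned
ages["Grace"] = 85                       # add a new entry

all_names = sort(collect(keys(ages)))    # keys are unordered, so sort them for display
total_age = sum(values(ages))            # iterate over the values only
```

Using `get` with a default avoids the `KeyError` that `ages["Grace"]` would throw for a missing key.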

### Named tuples

_(optional subsection that is typically skipped)_

These are exactly like tuples but also assign a name to each variable they contain.
Hence, they are an **immutable collection of ordered _and_ named elements**. 
They sit between the `Tuple` and `Dict` types in their use.

Their syntax is:
```julia
(key1 = val1, key2 = val2, ...)
```
For example:

In [None]:
nt = (x = 5, y = "str", z = 5/3)

(x = 5, y = "str", z = 1.6666666666666667)

These objects can be accessed with `[1]` like normal tuples, but also with the syntax `.key`:

In [None]:
nt[1]

5

In [None]:
nt.x # equivalent to nt[:x]

5

(named tuples are useful to know, because keyword arguments to functions are essentially named tuples)
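The connection to keyword arguments can be sketched as follows. The function `area` and the named tuple `opts` are hypothetical and exist only for this illustration; functions and keyword arguments are covered later in this block.

```julia
# hypothetical function with two keyword arguments
area(; width = 1, height = 1) = width * height

opts = (width = 3, height = 4)     # a named tuple
a1 = area(; opts...)               # splat the named tuple into keyword arguments
a2 = area(width = 3, height = 4)   # the equivalent explicit call
k  = keys(opts)                    # the names, as a tuple of symbols
```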

### Arrays

The standard Julia `Array` is a **mutable and ordered collection of items of the same type**.
The dimensionality of the Julia array is important. A `Matrix` is an array of dimension 2. A `Vector` is an array of dimension 1. The *element type* of what an array contains is irrelevant to its dimension!

**i.e. a Vector of Vectors of Numbers and a Matrix of Numbers are two totally different things!**

The syntax to make a vector is enclosing elements in brackets:

In [None]:
fibonacci = [1, 1, 2, 3, 5, 8, 13]

7-element Vector{Int64}:
  1
  1
  2
  3
  5
  8
 13

In [None]:
mixture = [1, 1, 2, 3, "Ted", "Robyn"]

6-element Vector{Any}:
 1
 1
 2
 3
  "Ted"
  "Robyn"

As mentioned, the type of the elements of an array must be the same. Yet above we mix numbers with strings! I wasn't lying though; the above vector is an **unoptimized** version that can hold **any** thing. You can see this in the type of the vector, `Vector{Any}`.

Arrays of other data structures, e.g. vectors or dictionaries, or anything, as well as multi-dimensional arrays are possible:

In [None]:
vec_vec_num = [[1, 2, 3], [4, 5], [6, 7, 8, 9]] # vector of vectors, which is NOT a matrix

3-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5]
 [6, 7, 8, 9]

If you want to make a matrix, two ways are the most common: (1) specify each entry one by one

In [None]:
matrix = [1 2 3; # elements in same row separated by space
          4 5 6; # semicolon means "go to next row"
          7 8 9]

3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

or (2), you use a function that initializes a matrix. E.g. `rand(n, m)` will create an `n×m` matrix with uniformly random numbers

In [None]:
R = rand(4, 3)

4×3 Matrix{Float64}:
 0.0489836  0.924948   0.492329
 0.929548   0.96086    0.911561
 0.0304435  0.860544   0.965275
 0.676483   0.0664199  0.84543

In [None]:
R[1, 2] # two dimensional indexing

0.9249476688379149
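The difference between a vector of vectors and a matrix shows up in their dimensionality and in how they are indexed. A minimal sketch, with variable names chosen only for illustration:

```julia
vv = [[1, 2, 3], [4, 5, 6]]   # Vector of Vectors: one-dimensional
M  = [1 2 3; 4 5 6]           # Matrix: two-dimensional

dims_vv, dims_M = ndims(vv), ndims(M)   # 1 and 2
size_vv, size_M = size(vv), size(M)     # (2,) and (2, 3)

same_entry = vv[2][3] == M[2, 3]        # nested indexing vs. two-dimensional indexing

# One possible conversion: place each inner vector as a column of a matrix
cols = reduce(hcat, vv)                 # a 3×2 matrix
```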

Since arrays are mutable we can change their entries _in-place_ (i.e., without creating a new array):

In [None]:
fibonacci = [1, 1, 2, 3, 5, 8, 13]
fibonacci[1] = 15
fibonacci

7-element Vector{Int64}:
 15
  1
  2
  3
  5
  8
 13

We can add or remove elements from any mutable collection with functions like `push!, pop!, delete!`. We'll cover functions in more detail in a moment!

In [None]:
push!(fibonacci, 21)

8-element Vector{Int64}:
 15
  1
  2
  3
  5
  8
 13
 21
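The other mutating functions work along the same lines. Here is a small sketch on a fresh vector and dictionary (the names `v`, `removed` and `d` are just for illustration):

```julia
v = [1, 2, 3]
push!(v, 4)          # append at the end: [1, 2, 3, 4]
removed = pop!(v)    # remove and return the last element
pushfirst!(v, 0)     # insert at the front: [0, 1, 2, 3]
deleteat!(v, 2)      # delete the element at index 2: [0, 2, 3]

d = Dict("a" => 1, "b" => 2)
delete!(d, "a")      # remove the entry with key "a"
```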

Lastly, for multidimensional arrays, the `:` symbol is useful, which means to "select all elements in this dimension".

In [None]:
x = rand(3, 3)

3×3 Matrix{Float64}:
 0.771966  0.589994  0.575424
 0.521611  0.50391   0.367429
 0.496781  0.970638  0.551084

In [None]:
x[:, 1] # it means to select the first column

3-element Vector{Float64}:
 0.7719655557101898
 0.5216106167933873
 0.4967807235899393

### Ranges
Ranges are useful shorthand notations that define a "vector" (one dimensional array) with equi-spaced entries. They are created with the following syntax:
```julia
start:stop # mainly for integers
start:step:stop
range(start, stop, length)
range(start, stop; step = ...)
```

In [None]:
r = 0:0.01:5

0.0:0.01:5.0

Ranges always include the first element and step until they _do not exceed_ the ending element. If possible, they include the stop element (as above).

In [None]:
r[end-3] # use `end` as index for the final element

4.97
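The following sketch illustrates the stopping rule and the `range` function; the variable names are only for illustration.

```julia
r1 = 0:2:5                     # 5 cannot be reached with step 2
elements = collect(r1)         # [0, 2, 4]
final = last(r1)               # 4, the last element that does not exceed 5

r2 = range(0, 1; length = 5)   # 5 equi-spaced points from 0 to 1
spacing = step(r2)             # 0.25

n = length(0:0.01:5)           # 501 elements, both endpoints included
```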

Ranges are not unique to numeric data, and can be used with anything that extends their interface, e.g.

In [None]:
letterrange = 'a':'z'

'a':1:'z'

As ranges are printed in this short form, to see all their elements you can use `collect`, to transform the range into a `Vector`.

In [None]:
collect(letterrange)

26-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
 'd': ASCII/Unicode U+0064 (category Ll: Letter, lowercase)
 'e': ASCII/Unicode U+0065 (category Ll: Letter, lowercase)
 'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
 'g': ASCII/Unicode U+0067 (category Ll: Letter, lowercase)
 'h': ASCII/Unicode U+0068 (category Ll: Letter, lowercase)
 'i': ASCII/Unicode U+0069 (category Ll: Letter, lowercase)
 'j': ASCII/Unicode U+006A (category Ll: Letter, lowercase)
 ⋮
 'r': ASCII/Unicode U+0072 (category Ll: Letter, lowercase)
 's': ASCII/Unicode U+0073 (category Ll: Letter, lowercase)
 't': ASCII/Unicode U+0074 (category Ll: Letter, lowercase)
 'u': ASCII/Unicode U+0075 (category Ll: Letter, lowercase)
 'v': ASCII/Unicode U+0076 (category Ll: Letter, lowercase)
 'w': ASCII/Unicode U+0077 (category Ll: Letter, lowercase)
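 'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)
 'y': ASCII/Unicode U+0079 (category Ll: Letter, lowercase)
 'z': ASCII/Unicode U+007A (category Ll: Letter, lowercase)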

Ranges are cool because they **do not store all elements in memory** like `Vector`s. Instead they produce the elements on the fly when necessary, and therefore are in general preferred over `Vector`s if the data is equi-spaced. 
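One rough way to see this is to compare the memory footprint of a range with that of the equivalent `Vector`. This is only a sketch: `sizeof` reports the size of the object's own data in bytes, and the exact numbers depend on the machine.

```julia
r = 1:1_000_000
v = collect(r)             # the same numbers, all stored in memory

range_bytes  = sizeof(r)   # only the start and stop values are stored
vector_bytes = sizeof(v)   # one stored value per element

same_contents = r == v     # true: the elements are identical
same_sum = sum(r) == sum(v)
```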

Lastly, ranges are typically used to index into arrays. One can type `A[1:3]` to get the first 3 elements of `A`, or `A[end-2:end]` to get the last three elements of `A`. If `A` is multidimensional, the same type of indexing can be done for any dimension:

In [None]:
A = rand(4, 4)

4×4 Matrix{Float64}:
 0.802819  0.0419949  0.740674  0.949413
 0.786347  0.449015   0.277375  0.281903
 0.85438   0.619798   0.337144  0.503436
 0.247161  0.589174   0.988236  0.0420344

In [None]:
A[1:3, 1]

3-element Vector{Float64}:
 0.8028193710189508
 0.7863467226277839
 0.854380155952781

In [None]:
A[1:3, 1:3]

3×3 Matrix{Float64}:
 0.802819  0.0419949  0.740674
 0.786347  0.449015   0.277375
 0.85438   0.619798   0.337144

## Iteration
Iteration in Julia is high-level. This means that not only does it have an intuitive and simple syntax, but it also works with anything that can be iterated. Iteration can also be extended (more on that later).


### `for` loops

A `for` loop iterates over a container and executes a piece of code, until the iteration has gone through all the elements of the container. The syntax for a `for` loop is

```julia
for *var(s)* in *loop iterable*
    *loop body*
end
```

*You will notice that all Julia code blocks end with `end`.*

In [None]:
for n in 1:5
    println(n)
end

1
2
3
4
5


The nature of the iteration variable depends on what the iterated container holds. For example, when iterating over a dictionary one iterates over key-value pairs.

In [None]:
for pair in myphonebook # each element is a `key => value` pair
    println(pair)
end

"Jenny" => "867-5309"
"Buzz Lightyear" => "∞ and beyond"
"Ghostbusters" => "555-2368"


Most of the time in such a scenario however, the variables that compose the iterable are decomposed directly after the `for` keyword, which also makes the code cleaner. For example:

In [None]:
for (key, val) in myphonebook # destructure each pair into key and value
    println("The number of $key is $val")
end

The number of Jenny is 867-5309
The number of Buzz Lightyear is ∞ and beyond
The number of Ghostbusters is 555-2368


In the context of `for` loops, the `enumerate` iterator is often useful. It takes an iterable and returns pairs of the index and the corresponding value.

In [None]:
for (i, v) in enumerate(rand(3))
    println("value of index $(i): $(v)")
end

value of index 1: 0.8317918607127136
value of index 2: 0.2110565380858982
value of index 3: 0.2528987856979812
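A related built-in iterator is `zip`, which walks through several iterables in lockstep. The sketch below uses made-up data (`labels`, `scores`) and collects the results in a vector instead of printing them:

```julia
labels = ["a", "b", "c"]   # hypothetical data
scores = [10, 20, 30]

lines = String[]           # an empty vector of strings
for (label, score) in zip(labels, scores)
    push!(lines, "$label => $score")
end
lines
```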


### A note on command termination

Julia has a modern syntax parser that automatically understands when a command starts and ends. It does not require indentation (like Python) or the `;` character (like C) to establish the end of a command. The following two code snippets are both valid and syntactically equivalent Julia code:

In [None]:
        for (key,
val) in
                myphonebook
  println(
"The number of $key is $val"
        ) end

The number of Jenny is 867-5309
The number of Buzz Lightyear is ∞ and beyond
The number of Ghostbusters is 555-2368


In [None]:
for (key, val) in myphonebook println("The number of $key is $val") end

The number of Jenny is 867-5309
The number of Buzz Lightyear is ∞ and beyond
The number of Ghostbusters is 555-2368


However, code readability is important, so it is strongly recommended to properly indent your code even if it is not enforced by the language!

### `while` loops

A `while` loop executes a code block until a boolean condition check (that happens at the start of the block) becomes `false`. Then the loop terminates (without executing the block again). The syntax for a standard `while` loop is

```julia
while *condition*
    *loop body*
end
```

In [None]:
n = 0
while n < 5
    n += 1
    println(n)
end

1
2
3
4
5
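A `while` loop is the natural choice when the number of iterations is not known in advance. The sketch below counts how many times a number can be halved (with integer division) before reaching 1. The function name `halvings` is hypothetical, and the loop is wrapped in a function (covered later in this block) because, when code runs as a script rather than interactively, a loop cannot silently reassign a global variable.

```julia
# hypothetical helper: count the integer halvings needed to bring `n` down to 1
function halvings(n)
    steps = 0
    while n > 1
        n ÷= 2        # integer division, updates `n`
        steps += 1
    end
    return steps
end

h64 = halvings(64)    # 64 → 32 → 16 → 8 → 4 → 2 → 1
h10 = halvings(10)    # 10 → 5 → 2 → 1
```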



## Conditionals

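Conditionals execute a specific code block depending on the outcome of a given boolean check.
The `&` and `|` operators are the boolean `and` and `or` operators; they always evaluate both sides. Their short-circuiting counterparts `&&` and `||` evaluate the right side only when needed.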
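The difference between `&`, `|` and the short-circuiting `&&`, `||` can be sketched as follows. The variable names are made up for illustration.

```julia
x = 5

# `&` and `|` evaluate both sides; the parentheses are needed because these
# operators bind more tightly than comparisons such as `>` and `<`
both   = (x > 0) & (x < 10)
either = (x < 0) | (x == 5)

# `&&` and `||` short-circuit: the right side runs only when it is needed
events = String[]
x > 0 && push!(events, "positive")       # left side is true, so push! runs
x < 0 && push!(events, "negative")       # left side is false, push! is skipped
x > 0 || push!(events, "not positive")   # left side is true, push! is skipped
events
```

This is why `condition && action` is commonly used as a compact one-line `if`.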

### `if` block

In Julia, the syntax

```julia
if *condition 1*
    *option 1*
elseif *condition 2*
    *option 2*
else
    *option 3*
end
```

evaluates the conditions sequentially and executes the code-block of the first true condition.

In [None]:
x, y = 5, 6
if x > y
    x
else
    y
end

6
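As the output above suggests, an `if` block is an expression that returns the value of the branch that was executed, so its result can be assigned to a variable. A minimal sketch, with made-up variable names:

```julia
x = -3
label = if x > 0
    "positive"
elseif x < 0
    "negative"
else
    "zero"
end
```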

### Ternary operator

The ternary operator (named for having three arguments) is a convenience syntax for small `if` blocks with only two clauses. 

Specifically, the syntax

```julia
condition ? if_true : if_false
```

is syntactically equivalent to

```julia
if condition
    if_true
else
    if_false
end
```

For example

In [None]:
5 == 5.0 ? "yes" : "no"

"yes"

### `break` and `continue`
The keywords `continue` and `break` are often used with conditionals to skip an iteration or completely stop the iteration code block.

In [None]:
N = 1:100
for n in N
    isodd(n) && continue
    println(n)
    n > 10 && break
end

2
4
6
8
10
12


### List comprehension
The list comprehension syntax 
```julia
[expression(a) for a in collection if condition(a)]
```
is available as a convenient way to make a `Vector`. The `if` part is optional.

In [None]:
[    a^2 for a in 1:10 if iseven(a)      ]

5-element Vector{Int64}:
   4
  16
  36
  64
 100
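Two common variations of this syntax are sketched below: using several loop variables to build a matrix, and dropping the brackets to get a generator that is consumed without creating an intermediate array. The variable names are only for illustration.

```julia
squares = [a^2 for a in 1:4]             # plain comprehension, no `if` part
table = [i * j for i in 1:3, j in 1:3]   # two loop variables give a 3×3 Matrix
total = sum(a^2 for a in 1:4)            # generator: no intermediate Vector is built
```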

## Functions
Functions are the bread and butter of Julia, which heavily supports functional programming.


### Function declaration

Functions can be declared in two ways. First, the verbose form:

In [None]:
function f(x)
    # function body
    return x^2 # While `return` is not necessary, it is recommended for clarity
end

f (generic function with 1 method)

Or, you can define functions with the short form (best used for functions that only take up a single line of code)

In [None]:
f(x) = x^2  # equivalent to the above

f (generic function with 1 method)

Functions are called using their name and parentheses `()` enclosing the calling arguments:

In [None]:
f(5)

25

Functions in Julia support optional positional arguments, as well as keyword arguments. The **positional** arguments are **always given by their order**, while **keyword** arguments are **always given by their keyword**. Keyword arguments are all the arguments defined in a function after the symbol `;`. Example:

In [None]:
function g(x, y = 5; z = 2, w = 1)
    return x*z*y*w
end

g (generic function with 2 methods)

In [None]:
g(5) # give x. default y, z, w

50

In [None]:
g(5, 3) # give x, y. default z, w

30

In [None]:
g(5; z = 3) # give x, z. default y, w

75

In [None]:
g(2, 4; w = 0.1, z = 1.5) # give everything

1.2000000000000002

In [None]:
g(2, 4, 2) # keyword arguments can't be specified by position

MethodError: no method matching g(::Int64, ::Int64, ::Int64)

Closest candidates are:
  g(::Any, ::Any; z, w)
   @ Main c:\Users\datse\OneDrive - University of Exeter\Teaching\Zero2Hero-JuliaWorkshop\1-JuliaIntro.ipynb:1
  g(::Any)
   @ Main c:\Users\datse\OneDrive - University of Exeter\Teaching\Zero2Hero-JuliaWorkshop\1-JuliaIntro.ipynb:1


### Passing by reference: mutating vs. non-mutating functions

You can divide Julia values into two categories: **mutable** and **immutable**. Mutable means that the values of your data can be changed in-place, i.e. literally in the place in memory where the data is stored. Immutable data cannot be changed after creation, and thus the only way to change part of immutable data is to make a brand new immutable object from scratch. Use `ismutable(v)` to check whether a value `v` is mutable or not (the older `isimmutable` is deprecated).

For example, `Vector`s are mutable in Julia:

In [None]:
x = [5, 5, 5]
x[1] = 6 # change first entry of x
x

3-element Vector{Int64}:
 6
 5
 5

But e.g. `Tuple`s are immutable:

In [None]:
x = (5, 5, 5)
x[1] = 6

MethodError: no method matching setindex!(::Tuple{Int64, Int64, Int64}, ::Int64, ::Int64)

In [None]:
x = (6, 5, 5)

(6, 5, 5)

Julia **passes values by reference**. This means that if a mutable object is given to a function, and this object is mutated inside the function, the final result is kept at the passed object. E.g.:

In [None]:
function add3!(x)
    x[1] += 3
    return x
end

x = [5, 5, 5]
add3!(x)
x

3-element Vector{Int64}:
 8
 5
 5
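It helps to distinguish *mutating* an object from *rebinding* a name. The sketch below uses two hypothetical functions, `rebind` and `setfirst!`, to show that only mutation is visible to the caller, and that plain assignment of an array to a second name does not create a copy.

```julia
function rebind(x)
    x = [0, 0, 0]   # rebinding: the local name `x` now points to a new array
    return x
end

function setfirst!(x)
    x[1] = 0        # mutation: changes the array that the caller passed in
    return x
end

a = [5, 5, 5]
rebind(a)                # `a` is untouched
after_rebind = copy(a)   # snapshot of `a` for comparison
setfirst!(a)             # `a` is now [0, 5, 5]

b = a                    # no copy is made: `a` and `b` are the same array
b[2] = 7                 # so `a[2]` is now 7 as well
c = copy(a)              # an independent copy
c[3] = 9                 # `a[3]` stays 5
```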

**By convention**, functions with name ending in `!` alter their (mutable) arguments and functions lacking `!` do not. Typically the first argument of a function that ends in `!` is mutated.

For example, let's look at the difference between `sort` and `sort!`.

In [None]:
v = [3, 5, 2]

3-element Vector{Int64}:
 3
 5
 2

In [None]:
sort(v)

3-element Vector{Int64}:
 2
 3
 5

In [None]:
v

3-element Vector{Int64}:
 3
 5
 2

`sort(v)` returns a sorted array that contains the same elements as `v`, but `v` is left unchanged.

On the other hand, when we run `sort!(v)`, the contents of `v` are sorted in place, within the array `v` itself.

In [None]:
sort!(v)

3-element Vector{Int64}:
 2
 3
 5

In [None]:
v

3-element Vector{Int64}:
 2
 3
 5

### Functions as arguments

Functions, like literally anything else in Julia, are objects that can be passed around like any other value. This includes giving them as arguments to other functions.

A typical application of this is `findall` and related functions, which find the indices of all elements in a collection for which a given function returns `true`.

In [None]:
expression(x) = (x < 0.5) | (x > 1.5)
x = 0:0.1:2
valid_indices = findall(expression, x)

10-element Vector{Int64}:
  1
  2
  3
  4
  5
 17
 18
 19
 20
 21
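When the function is only needed once, it is common to pass an *anonymous function*, written as `argument -> expression`, instead of defining a named one first. A minimal sketch using the built-in `findall`, `filter` and `map` (variable names are only for illustration):

```julia
x = 0:0.1:2
idxs = findall(v -> v > 1.5, x)      # indices of the elements larger than 1.5

evens = filter(iseven, 1:10)         # keep the elements for which `iseven` is true
squares = map(v -> v^2, [1, 2, 3])   # apply the function to every element
```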

### The help system

In the Julia REPL, typing `?` followed by a function (or type) name displays its documentation string. Alternatively, you can use `@doc` followed by the function name, which also works in environments that do not support the `?` help mode.

For example

In [None]:
@doc cos # in the Julia REPL you can also type `? cos`

## Broadcasting

_This subsection can be skipped for the sake of time_


Broadcasting is a convenient syntax for applying any function over the elements of an iterable input. That is, the result is a new iterable whose elements are the results of applying the function to the elements of the input.

Broadcasting is done via the simple syntax of adding a dot `.` before the parenthesis in the function call: `g.(x)`.

In [None]:
h(x, y = 1) = x + y

h (generic function with 2 methods)

In [None]:
x = [1, 2, 3]
h.(x) # without 2nd argument, `h` is just `x + 1`

3-element Vector{Int64}:
 2
 3
 4

In [None]:
y = [1, 2, 3]
h.(x, y) # each element of `x` is added to the corresponding element of `y`

3-element Vector{Int64}:
 2
 4
 6

For reference, here is the single-argument call applied to a vector `x` once more:

In [None]:
x = [1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

In [None]:
h.(x)

3-element Vector{Int64}:
 2
 3
 4

Broadcasting can be useful when the number of operations is small and one can easily reason about the way the operations would be broadcasted across input(s). 

A typical example of broadcasting is to make an exponential range, which doesn't have a pre-made function in Julia:

In [None]:
exp_range = 10.0 .^ (-3:3)

7-element Vector{Float64}:
    0.001
    0.010000000000000002
    0.1
    1.0
   10.0
  100.0
 1000.0

*(notice that for infix operators (like `+, -`) the `.` is put before the operator)*
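When an expression contains many dots, the `@.` macro adds them to every function call and operator in the expression. A small sketch, with variable names chosen for illustration:

```julia
x = [1, 2, 3]

y1 = 2 .* x .+ 1             # explicit dots on each infix operator
y2 = @. 2x + 1               # `@.` inserts the dots automatically
y3 = sqrt.(abs.(x .- 2))     # nested broadcasted calls; the scalar 2 is applied to every element
```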

# Exercises - basics

**Important note for all exercises: when an exercise says _"use function `function_name` to do something"_, you need to first learn how to use the function. For this, you access the function's documentation string, using the help mode (type `?` or `@doc` in the Julia console and then type the function name)!**


## Babylonian square root
To get the square root of $y$, Babylonians used the algorithm $x_{n+1} = \frac{1}{2}(x_n + \frac{y}{x_n})$ iteratively, starting from some value $x_0$, to converge to $x_n \to \sqrt{y}$ as $n\to \infty$. Implement this algorithm in a function `babylonian(y, ε, x0 = 1)` (where `x0` is an optional argument with a default value) that iterates until the estimate is within a convergence tolerance `ε` of the built-in `sqrt(y)`. The function should return the number of steps it took to reach the square root value within the given tolerance.

_Hint: for this exercise you only need a `while` code block without any other code structures such as `for, if, ...`._


## Counting nucleotides
Create a function that, given a DNA strand (as a `String`, e.g. `"AGAGAGATCCCTTA"`), counts how many times each nucleotide (A, G, T or C) is present in the strand and returns the result as a dictionary mapping the nucleotides to their counts. The function should throw an error (using the `error` function) if an invalid nucleotide is encountered. Test your result with `"ATATATAGGCCAX"` and `"ATATATAGGCCAA"`.

*Hint: Strings are iterables! They iterate over the characters they contain.*

## Fibonacci numbers
Using recursion (a function that calls itself), create a function that, given an integer `n`, returns the `n`-th [Fibonacci number](https://en.wikipedia.org/wiki/Fibonacci_number). Apply it using `map` to the range `1:8` to get the result `[1,1,2,3,5,8,13,21]`.


## Hamming distance

Create a function that calculates the Hamming distance of two DNA strands of equal length, given as strings. This distance is defined by counting (sequentially) the number of non-equal letters in the two strands, e.g. `"ATA"` and `"ATC"` have a distance of 1, while `"ATC"` and `"CAT"` have a distance of 3.

*Hint: this exercise has a one-liner solution, using the `zip` and `count` functions.*