# Starting up
In this course we are using Julia with Visual Studio Code as the text editor. To get started with this Julia basics tutorial, start up Visual Studio Code. Then, start up the Julia REPL by pressing **Ctrl+Shift+P** and searching for **Julia: Start REPL**. The REPL can be used to execute lines of code, which will be used in the first part of this tutorial.

Remember: Google is your friend!


# Basic Julia syntax
The following pieces of code can be executed directly in the Julia REPL. To start the REPL, press **Ctrl+Shift+P** and find the command **Julia: Start REPL**.

## Operators
Operators are special characters that are used for either mathematical or logical operations. Additionally, you will frequently see the "#" sign, which starts a comment: text that is not executed. Text editors usually show comments in a different color (see the code blocks below).

### Mathematical operators:

In [1]:
#Addition
24 + 21

45

In [2]:
#Subtraction
123 - 432  

-309

In [3]:
#Division
234 / 2

117.0

In [4]:
#Remainder after division
234 % 2

0

In [5]:
#Multiplication
3 * 12

36

In [None]:
#Power
4^2

### Logical operators:
The overview below shows the most commonly used operators, which can be used to compare numeric values, booleans, and other objects. Using these operators always returns a boolean value, either *true* or *false*, which correspond to 1 and 0, respectively.

| **Operator** | **Meaning**      | **Example**  | **Result** |
|--------------|------------------|--------------|------------|
| ==           | equality         | 4 == 6       | false      |
| !=           | inequality       | 4 != 6       | true       |
| <            | less than        | false < true | true       |
| >            | greater than     | 6 > 6        | false      |
| <=           | less or equal    | 4 <= 6       | true       |
| >=           | greater or equal | 6 >= 6       | true       |
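As a small illustration of the statement that comparisons return booleans that correspond to 1 and 0, the sketch below stores the result of two comparisons and uses booleans in arithmetic. The variable names `a` and `b` are chosen only for this example.

```julia
# Comparisons evaluate to a Bool
a = 10 > 3              # true
b = 10 == 3             # false
println(typeof(a))      # Bool

# Bool values behave like the numbers 1 and 0 in arithmetic
println(true == 1)      # true
println(true + true)    # 2
println(false * 5)      # 0
```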


#### **E1** 
What would the following evaluations return?

In [None]:
#a
4 == 8
#b
3 < 2^2
#c
4 - 6 >= 12
#d
12%3 == 0
#e
(6^2)/18 >= 2.01

### Boolean operators
Boolean operators are generally used when evaluating multiple expressions. For example, you can use them in case you want expression 1 AND expression 2 to be true. The two most commonly used are:

| **Operator** | **meaning** | **example**         |**result**           |
|--------------|-------------|---------------------|-----------|
| &&           | AND         | 4 > 3 && 4 <=4      | true   |
| \|\|         | OR          | 6 > 0 \|\| 6 < 0    | true |
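A minimal sketch of how these two operators combine comparisons. The variable `temperature` and its value are hypothetical and only serve the example. The last two lines show that Julia stops evaluating as soon as the outcome is known, so the right-hand side is skipped.

```julia
temperature = 21    # hypothetical value, only for this example

# AND: both comparisons must be true
comfortable = temperature >= 18 && temperature <= 25
println(comfortable)    # true

# OR: at least one comparison must be true
extreme = temperature < 0 || temperature > 35
println(extreme)        # false

# The right-hand side is only evaluated when it can still change the outcome
false && println("this is never printed")
true || println("this is never printed either")
```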

#### **E2** 
What would the following evaluations return?

In [6]:
#a
true && false
#b
4 == 2 || 4 < 6
#c
true == 1 && false == 0
#d
6%4 <= 2 && 9 >= 9
#e
5 > 4 && 3^3 == 26

## Variables
Variables are used to store information, so that it can be accessed later through the name it was given. Once you assign a value to a variable in the REPL, it is saved in the current workspace under that variable name and can be used later on in the code. Simply typing the variable name in the REPL and pressing enter shows the contents of that variable.

In [7]:
x = 2

y = "Hello world"

var = 78.4

any_name_youComeUpWith = 942.0

942.0

In [None]:
# using the variables
var + any_name_youComeUpWith

### Variable types
Variables can have different types:

| Data type  | Expression | Examples           | Description                             |
|------------|------------|--------------------|-----------------------------------------|
| Integers   | Int64      | 1, -99             | Whole numbers                           |
| Floats     | Float64    | 1.23, 4.0          | Numbers with a decimal point            |
| Booleans   | Bool       | true, false        | Logical values                          |
| Characters | Char       | 'A', '$', '\u20AC' | Text data with exactly one character    |
| Strings    | String     | "Hello", "A", ""   | Text data with any number of characters |
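The types in the table can be checked with the typeof() function. The sketch below assumes a 64-bit installation of Julia, where whole numbers are stored as Int64 by default.

```julia
println(typeof(1))          # Int64 (assuming a 64-bit system)
println(typeof(1.23))       # Float64
println(typeof(true))       # Bool
println(typeof('A'))        # Char
println(typeof("Hello"))    # String
println(typeof("A"))        # String: double quotes always create a String
```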

#### **E3**
Create the following three variables:

- Variable x with value 1.234.
- Variable y with value 1//2.
- Variable z with value x + y*im.

What are the types of these three variables? You can make use of the function typeof(variable_name)

In [None]:
# Bank balance
balance = 100.0

# Interest rate
interest_rate = "1.05"

new_balance = balance * interest_rate

Running the last line currently returns an error. The error tells you that there is no method that can multiply a Float64 with a String. Running the following lines of code shows you what the types of your variables are and where the issue comes from.

It is important to be aware of the types of the variables you are working with to avoid errors and bugs in your code, because these types frequently cannot be used interchangeably. Try running the following code: 

In [None]:
typeof(balance)         #returns Float64
typeof(interest_rate)   #returns String

Luckily, it is possible to convert between variable types.

| From             | To      | Function               |
|------------------|---------|------------------------|
| Integer          | Float   | Float64(x)             |
| Float            | Integer | Int64(x)               |
| Integer or float | String  | string(x)              |
| String           | Float   | parse(Float64, string) |
| String           | Integer | parse(Int64, string)   |

Using this information the variable *interest_rate* can actually be converted from a String to a Float to execute the calculation.

In [9]:
new_balance = balance * parse(Float64, interest_rate)
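The conversions from the table can be tried out as follows. This is a minimal sketch with made-up values. One thing to keep in mind is that Int64(x) only works when the float has no fractional part, so a value such as 4.7 has to be rounded first.

```julia
a = Float64(3)              # Integer -> Float: 3.0
b = Int64(4.0)              # Float -> Integer: 4
c = string(12)              # Integer -> String: "12"
d = parse(Int64, "42")      # String -> Integer: 42
e = parse(Float64, "2.5")   # String -> Float: 2.5

# Int64(4.7) would throw an InexactError, because 4.7 is not a whole number.
# Rounding first makes the conversion possible.
f = Int64(round(4.7))       # 5
```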

#### **E5**
Convert the following variables

a) x with a value of 8.73 to a string and save it in y

b) now convert y back into a Float

#### **E6**
Use rounding functions (i.e., round(), ceil(), and floor()) to solve the following tasks:

a) Round 1252.1518 to the nearest larger integer and convert the resulting value to Int64.

b) Round 1252.1518 to the nearest smaller integer and convert the resulting value to Int16.

c) Round 1252.1518 to 2 digits after the decimal point.

d) Round 1252.1518 to 3 significant digits.

### Strings and characters
As can be seen in the section above, strings consist of a series of characters and are enclosed in double quotes (""), while individual characters are enclosed in single quotes (''). You can work with strings in a variety of ways, including extracting information and attributes. For example, bracket indexing can be used to extract individual characters from a string.

In [11]:
str = "Hello world"
str[5]                  #extracts the 5th character from the string, returning a 'o'

Alternatively, a larger block of characters can also be obtained from the string by specifying the range, meaning the first specified index until the last specified index.

In [None]:
#extracting the 5th till the 8th character
str[5:8]        # extracting the string "o wo"
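Note that indexing with a single number and indexing with a range return different types, even when the range contains only one position. A small sketch using the same string:

```julia
str = "Hello world"

c = str[5]      # a single index returns a Char: 'o'
s = str[5:5]    # a range returns a String: "o"

println(typeof(c))      # Char
println(typeof(s))      # String
println(length(str))    # 11, the number of characters in the string
```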

### Common string operations
Besides mathematical operations, the operators * and ^ can also be used with strings. The multiplication operator glues multiple strings together. In the example below, variable s3 will contain the full string "Hello world".

In [None]:
s1 = "Hello "
s2 = "world"
s3 = s1 * s2

#### **E7**
Now try to execute the following string modification and see what comes out of it:

In [None]:
#a 
s1^4

#b 
#obtain the same string as in a, using the * operator

#c
#convert the following vector of char to a string
cs = ['h','e','l','l','o']

#d
string("This is", " an alternative", " approach", " to ", "paste strings together")

#### **E8**
Using the string generated in 7d, check the following:

a) Whether the string contains the substring "roach". Check Google to see if there is already a function for this.

# Saving code
So far we have been running our code in the REPL. However, these lines will disappear when VS Code is closed. To avoid this, we can work in a .jl file that can be saved and re-run the next time you start up your session. To create a new .jl file go to: File -> New file -> choose Julia file.

When working in a Julia text file, you can either run all the code in that file with the run button at the right top of your screen or run your code line by line. To run your code line by line you can do either of the following:
1. have your typing cursor anywhere on the line of code that you would like to run and press shift + enter or ctrl + enter
2. Select the code, as you would when you copy and paste text, and again press shift + enter or ctrl + enter

Using the latter approach also allows you to only run a specific part on a line instead of the full line.



# Using packages
By default, Julia only has a basic selection of functions readily available. However, there are many packages out there that contain existing functions, which removes the need for us to program these again. For example, the package Statistics contains functions such as mean, median, and std (standard deviation). Without loading the package into your workspace you won't be able to use these functions. To load a package into your workspace the *using* keyword can be used. However, *using* a specific package can only be done after it has been installed.

Here are 2 different approaches to work with the package manager to install and update packages. You can see that for the first approach we need to load the package manager functions by running *using* Pkg.

In [None]:
# This can always be used to install packages
using Pkg                      #load the package manager from Julia
Pkg.add("LinearAlgebra")       #installing the LinearAlgebra package (only needs to be done once)
Pkg.update("LinearAlgebra")    #updating a package

# OR in the REPL
# you can type ] to get into the package manager (julia> will change into (@v1.6) pkg>) and then execute the following:
# add LinearAlgebra           #installs a package
# up                          #will update all installed packages

#to get out of the package manager press backspace

When you require certain packages in the script you are writing, the default location to write the *using* lines is at the start of your file.
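As a minimal sketch of what loading a package looks like in practice, the example below loads the Statistics package mentioned earlier and calls the three functions named there. The vector `data` is made up for this example, and the sketch assumes that Statistics is available in your Julia installation.

```julia
using Statistics    # placed at the start of the file

data = [2.0, 4.0, 4.0, 6.0]     # made-up example values

println(mean(data))     # 4.0
println(median(data))   # 4.0
println(std(data))      # the sample standard deviation
```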

# Arrays
So far, in the variable section, you learned how single values can be saved under a variable name. But often we are working with larger datasets that contain multiple values. For this we can use arrays, which are a collection of values stored between brackets []. Depending on the number of dimensions arrays can be called differently:
* Vector: 1-dimensional array
* Matrix: 2-dimensional array
* Array: >= 3 dimensional array

To create a vector, the values are placed between [] and separated by commas:

In [None]:
vec = [1, 2, 3, 4]

While for a 2-dimensional matrix, the columns (vertical lines of values) are separated with a space and the rows (horizontal lines of values) are separated with a semicolon (;):

In [None]:
mat = [1 2;3 4;5 6]
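To inspect what kind of array was created, the functions typeof(), size(), ndims(), and length() can be used. The sketch below uses the short names `v` and `m` (chosen only for this example) for the same vector and matrix, and assumes a 64-bit system for the Int64 element type.

```julia
v = [1, 2, 3, 4]
m = [1 2; 3 4; 5 6]

println(typeof(v))  # Vector{Int64}
println(size(v))    # (4,)

println(typeof(m))  # Matrix{Int64}
println(size(m))    # (3, 2): 3 rows and 2 columns
println(ndims(m))   # 2
println(length(m))  # 6, the total number of elements
```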

#### **E9**
Create a vector of positive integers that contains all odd numbers smaller than 10. Then change the first element to 4 and the last two elements to 1.

#### **E10**
Change the commas in the vector for variable *vec* into semicolons. What happened to the type and shape of the array?

#### **E11**
Change the commas in the vector for variable *vec* into spaces. What happened to the type and shape of the array?

#### **E12**
Create two vectors: vector of all odd positive integers smaller than 10 and vector of all even positive integers smaller than 10. Then concatenate these two vectors horizontally and fill the third row with 4.

Hint: check out the functions hcat() and vcat()

## Indexing
Now you know how to make a collection of data entries in an array, but how can you obtain specific values from this collection? After all, you frequently only need a single value or a smaller part of your array. To do so, you place brackets behind the variable name and specify the position of the element you want to access. For example, if we have the following vector:

In [None]:
x = [100, "julia", 14.235, 'K']

The second element can be accessed by typing x[2], returning "julia".

Alternatively, a range of values can be selected by using the colon (:). Here the range 2:4 means from 2 until 4 in steps of 1. Running the following will show that 2:4 corresponds to the vector [2,3,4]:

In [None]:
collect(2:4)

So the following code will obtain the 2nd, 3rd, and 4th element in our vector:

In [None]:
x[2:4]

#### **E13**
Find out how to select the 1st and 4th index from vector x

#### **E14**
Given the above vector x, use range indexing to obtain all elements in the vector that are at an odd-numbered index.

#### **E15**
Given the above vector x, find out how to access the last index, assuming that you do not know the length of this vector or that this might change in the future.


### Matrices
These indexing examples used vectors, which only have a single dimension. What if we are looking at a matrix with multiple dimensions? For this you use the same indexing approach for each dimension, with the dimensions separated by a comma (row first, then column). To see how this works, run the following examples:

In [None]:
mat = [1 2 3;4 5 6;7 8 9;10 11 12]
mat[2,3]            # 2nd row 3rd column -> 6
mat[:,2]            # all elements in the second column
mat[4,:]            # all elements in the 4th row
mat[2:4,:]          # all elements from the 2nd till the 4th row
mat[2:4,[1,3]]      # the elements from the 2nd till the 4th row from the 1st and 3rd column

mat[4,3] = 1        # replaces the value 12 with a 1 in the matrix

#### **E16**
Using the *z* matrix defined below,

a) What is the index of the element with value 22? Check this by running z[row,column] and obtaining the value 22.

b) Obtain all elements in the 3rd column.

c) Obtain all elements on the first 2 rows.

d) Obtain all elements that are in the 4th column on the first three rows.

e) Obtain the elements on the 5th row that are on the 2nd and 4th column.


In [None]:
z = reshape(2:2:48, 6, :)

#### **E17**
a) Find out how to make a matrix that contains all 1's with 50 rows and 70 columns

b) Replace the 1 at the 3rd row and 5th column with a 5

c) Multiply all elements in the first three rows by 6 and save these values back in those 3 rows

d) After b) and c) does matrix_name[3,5] == 30? This should be the case

### Dot operator
In exercise 17c a matrix was multiplied by 6, which was likely achieved by only using the * operator. Multiplication by a single number is one of the few operations that works this way for vectors and matrices. If you try to do this with addition, you will get the following error:


ERROR: MethodError: no method matching +(::Matrix{Int64}, ::Int64)

For element-wise addition, use broadcasting with dot syntax: **array .+ scalar**


This tells you that you should be using .+ for element-wise addition, meaning that the operation (e.g., adding 6) will be performed for each element in your matrix. On top of that, this is a great example of how error messages often contain a lot of information on where things go wrong, or even what the solution might be.
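The difference between operators with and without a dot can be seen in the small sketch below. The matrix `m` is made up for this example. Note in particular that for two matrices, * performs matrix multiplication, while .* multiplies element by element.

```julia
m = [1 2; 3 4]

println(m * 6)      # [6 12; 18 24], multiplying by a single number works without a dot
println(m .+ 6)     # [7 8; 9 10], element-wise addition needs the dot
println(m .* m)     # [1 4; 9 16], each element multiplied by itself
println(m * m)      # [7 10; 15 22], matrix multiplication
```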

#### **E18**
Using the matrix k below,

a) Subtract 1 from every element in the matrix and save the values in **k**.

b) Divide all elements by 7 and save the values in **y**.

c) Divide each element in **y** by each element in **k**.

d) What do you get when not using a . before the division sign in c)?

In [None]:
k = reshape(2:3:100, 3, :)

# Flow control
Flow control statements use boolean values to decide what will be executed next. For example, if a statement is true, perform this operation. If false, perform something else. These statements are generally called if-else statements. For example:

In [None]:
if true
    println("Do something")
else
    println("Do something else")
end

Whenever the condition after the if keyword evaluates to true, the if block is executed; otherwise, the else block is executed. The example below shows an if-else statement that checks if a number is even or odd.

In [None]:
number = 5

if number % 2 == 0
   println("The number is even")
else
   println("The number is odd")
end

In the expression after if, we are using the modulus operator, %, to check if the number returns 0 as a remainder when divided by 2. It evaluates to false because the remainder of dividing 5 by 2 is 1. When there are more than 2 conditions the **elseif** keyword is used:

In [None]:
number = 5

if number % 2 == 0
   println("The number is even")
elseif number % 2 == 1
   println("The number is odd")
else
   println("The number is not even or odd")
end
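One detail worth knowing about the % operator used above: for negative numbers the remainder takes the sign of the first number, so a negative odd number gives -1 instead of 1. The mod function always returns a non-negative result for a positive divisor. A minimal sketch to compare the two:

```julia
println(5 % 2)          # 1
println(-5 % 2)         # -1, the remainder keeps the sign of -5
println(mod(-5, 2))     # 1, mod follows the sign of the divisor
```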

#### **E19**

Write the following:

a) An if-else statement that prints whether a number is positive, negative, or 0.

b) An if-else statement that checks if a variable contains an integer, float, string, character, or something else.

c) An if-else statement that checks whether a given year is a leap year or not.

**Leap year**: A year may be a leap year if it is evenly divisible by 4. Years that are divisible by 100 (century years such as 1900 or 2000) cannot be leap years unless they are also divisible by 400. 

# Repeated evaluation
## For loop
For loops can be used when repeated evaluations of the same code are required. The example below shows the basic functionality of for loops. After the keyword for comes a collection of values for which the code needs to be executed, such as a range (e.g., 1:11) or, as in the example below, a vector. Then, for each of the values in this collection, the code is executed, which is a simple print function in the example below. Run this code to see the output:

In [None]:
numbers = [-1,2,3,-6,8,0]
for i = numbers    #iterator
    println(i)  #code to execute for each value of i
end

The for loop is started by writing the *for* keyword. Then, we choose a temporary variable name. In this case, we are choosing *i*, whose value changes as we go through the vector *numbers*. Then, to specify we are looping through *numbers*, we write its name after the assignment operator =.

Inside the loop the println() function is used to print each element in *numbers*. Here, *i* is called a temporary variable because it is not accessible once the loop finishes.
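A for loop works the same way with a range. The sketch below is only an illustration: it loops over the range 1:5 and stores the square of each value in a vector called `squares` (a name chosen for this example), using the push! function to add elements. Writing `for i in 1:5` is equivalent to `for i = 1:5`.

```julia
squares = Int[]             # an empty vector that can hold integers

for i in 1:5                # i takes the values 1, 2, 3, 4, 5
    push!(squares, i^2)     # add the square of i to the end of the vector
end

println(squares)            # [1, 4, 9, 16, 25]
```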

#### **E20**
Make a for-loop that evaluates whether each element in *numbers* is positive, negative, or zero (if-else statement from previous exercise).

#### **E21**
Use a for or while loop to print all integers between 1 and 100 that can be divided by both 3 and 7.

Hint: use the mod function.

#### **E22**
Make a for-loop that only prints the values on the diagonal of a random matrix z. Notice that the size of the matrix varies. To obtain information about the shape of matrices and vectors, the functions size(var_name) and length(var_name) can be used.

In [None]:
k = rand(3:6)
z = rand(1:20, k,k) # generates a matrix with integer values between 1 and 20 with a random size k

## While loop
In case there is no clear range over which the iteration must take place, a while loop can be used, which continues until its condition is no longer true. The example below prints the variable i and adds 1 to it until i = 9, for which i <= 8 yields false.

In [None]:
i = 1
while i <= 8
    println(i)
    global i = i + 1    # global: update the i defined above (required when run from a .jl file)
end

#### **E23**
Using a while loop, calculate the sum of only the even numbers in the following vector:

In [None]:
v = rand(1:20, rand(10:20))

# Exercises 

#### **E24**
This time use a for loop to calculate the sum of the even numbers in the above vector v.


#### **E25**
Write a loop that prints whether the letters in the following vector are capitalized or not.

Hint: Google is your friend and likely knows the function that can evaluate if a character is upper or lower case.

In [None]:
letters = ['I', 'n', 'C', 'h', 'I', 'K', 'e', 'y']

#### **E26**
Using a loop convert the following vector of words/strings to a single sentence/string. Do not forget the spaces between words. 

In [None]:
v = ["This", "is", "going", "to", "be", "a", "sentence", "with", "spaces"]

#### **E27**
Making use of the *continue* keyword, write a for loop that prints all letters in a string except for the vowels.

Hint: the keyword *continue* can be used to skip the remainder of code of a current iteration in a for or while loop.

#### **E28**
Using the *break* keyword, write a while loop that starts with i=1, prints the value of i and adds 3 to it every iteration, and stops (breaks) once a whole number is obtained when i is divided by 5.

Hint: the keyword *break* can be used to exit a for or while loop early. To exit a function, the *return* keyword is used, which is explained later on.
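As a small illustration of both keywords (deliberately not one of the exercises), the sketch below skips negative numbers with *continue* and stops at the first zero with *break*. The vector `values_in` and the result vector `kept` are made-up names for this example.

```julia
values_in = [3, -1, 4, -5, 0, 9]    # made-up example data
kept = Int[]                        # collects the values that were not skipped

for n in values_in
    if n < 0
        continue        # skip the rest of this iteration and go to the next value
    end
    if n == 0
        break           # leave the loop completely, the 9 is never reached
    end
    push!(kept, n)
end

println(kept)           # [3, 4]
```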

#### **E29**
Use two for loops (i.e., nested for loops) to iterate over the elements in the matrix below and count the number of 1's present.

In [None]:
dfgs = rand([1,0], 5,8)

# Thinking like a computer
So far we have done relatively small and simple exercises. However, in data science you generally have bigger tasks that you want to program, which cannot be done in a few lines of code. For these tasks it becomes very important to translate the goal into smaller steps that the computer is able to understand. Therefore, the aim of the next few exercises is to break up the following goals into steps that are small enough for the computer to execute once they are converted to code. Feel free to also code the exercises afterwards.

#### **E30** RNA transcript
Obtain the RNA sequence from a given DNA sequence. A reminder of the pairs:

|DNA | complementary RNA pair|
|----|----|
|G  | C|
|C  | G|
|T  | A|
|A  | U|

In [None]:
DNA = "GTCTATGAC"

#### **E31** Rooting
Obtaining the square root of a positive number without using the sqrt() function.

#### **E32** Roman numerals
Converting a number between 1 and 300 to a roman numeral. 

| Roman numeral           | value     |
|-----------------|---------|
| I | 1   | 
| V | 5   |
| X | 10  |
| L | 50  |
| C | 100 |

Placing a roman numeral of a lower value before one with a higher value subtracts that lower value from the higher one (e.g., IV = 5-1 and IIV = 5-1-1).

For example, 199 is CXCIX:

100 = C, 90 = XC, 9 = IX

208 is written as CCVIII:

200 = CC, 8 = VIII

Complete description of the numbers: http://www.novaroma.org/via_romana/numbers.html


# Functions
There might be cases where you write code that needs to be applied more than once. In these cases, both for cleanliness of the code itself and for convenience, the code can be put in a function. These functions can then be used in the same manner as, for example, the sum() function, which comes by default with Julia.

The default layout for a function is:

In [None]:
function FunctionName(input1, input2, ...)  #number of inputs depends on the number of variables that the function needs
    
    #code to execute

    return out1, ...        #variables that come from the function
end

A more specific example would be a function that calculates y for a straight line given the inputs a, b, and x.

In [None]:
function f(x, a, b)
    y = a*x + b
    return y
end

z = f(1,2,3)  # returns the value in y (5) and saves it in z.

Alternatively, for such simple functions, an **inline** function could also be used. Inline functions are a more compact version and can only be used if the code that needs to be executed in the function consists of a single expression.

In [None]:
f(x,a,b) = a*x + b

Or if there are multiple outputs

In [None]:
function f(x, a, b)
    y1 = a*x + b
    y2 = b*x + a
    return y1, y2
end

z1, z2 = f(2,2,3) # returns the values 7 and 8 and saves it in z1 and z2, respectively.
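When a function returns multiple values, they are returned together as a tuple, which can either be kept as a whole or split over separate variables. The function `sum_and_product` below is a hypothetical example written only to show this.

```julia
# Hypothetical example function that returns two values
function sum_and_product(a, b)
    return a + b, a * b
end

result = sum_and_product(3, 4)
println(result)         # (7, 12), a tuple holding both outputs
println(result[1])      # 7, the first output

s, p = result           # split the tuple over two variables
println(p)              # 12
```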

#### **E33**

a) Write a one-line function that returns true if the input argument is an even number and false otherwise.

b) Write a function that calculates C in the Pythagorean theorem given A and B and returns its value. (A^2 + B^2 = C^2)

c) Convert the if-else statement from **E19**, which evaluates if a value is positive, negative, or 0, to a function that returns the outcome of a given Float.


Or, another example would be to grab the code from a previous for-loop exercise and build it into a function.

In [None]:
function EvenOrUneven(numbers)
    for i in numbers
        if i % 2 == 0
            println("The number $i is even")
        else
            println("The number $i is uneven")
        end
    end
end

EvenOrUneven([1, 3, 6, 2, 3])
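The `$i` inside the strings of this function is string interpolation: Julia replaces `$name` with the value of the variable `name`, and `$(...)` with the result of the expression between the parentheses. A short sketch with made-up variables:

```julia
name = "Julia"
n = 3

msg = "Hello $name, 1 + 2 = $(1 + 2) and n squared is $(n^2)"
println(msg)    # Hello Julia, 1 + 2 = 3 and n squared is 9
```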

Apart from printing results in the REPL, the function could also save the results and return those values with the return keyword. For this, the variable res is created with the same number of elements as the input variable numbers. Then, in the for loop, the result of the evaluation is saved in res, 1 for even and 0 for an uneven number. Notice that in this case i is used as index values to obtain the value from numbers and save the results in res in the same respective location.

In [None]:
function IsEven(numbers)
    res = zeros(length(numbers))
    for i = 1:length(numbers)
        if numbers[i] % 2 == 0
            res[i] = 1              # is even
        else
            res[i] = 0              # is uneven
        end
    end
    return res
end

out = IsEven([1, 3, 6, 2, 3])
println(out)

The need for the return keyword is explained in the next section: Scope.

# Scope
One thing to consider when building functions is the scope of the variables. A variable can have a local or a global scope. Variables with a local scope can, for example, only be seen within the function in which they were created, while global variables can be seen throughout the file. Functions are able to read global variables, but relying on this makes code harder to follow, so values are normally passed to a function as arguments, for example the variable numbers in the case below. This is also why we need to return res to be able to use it outside the function in the global scope, where it is saved in the variable out. When running the code below you will notice that println(res) actually returns an error, since in the global scope this variable is not defined; it only exists within the local scope of the function.

In [None]:
function IsEven(numbers)
    res = zeros(length(numbers))
    for i = 1:length(numbers)
        if rem(numbers[i],2) == 0
            res[i] = 1
        end
    end
    return res
end

out = IsEven([1, 3, 6, 2, 3])
println(res)    # returns an UndefVarError: res only exists inside the function
println(out)

Another example of a local scope variable is the temporary variable that is used as iterator in a for loop, as well as any variable that is first created inside a for or while loop. Once the loop is finished, these variables are not present in the global workspace.
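A minimal sketch to check this yourself, using the @isdefined macro, which reports whether a variable exists at the place where it is called. The function `double_plus_one` and its variable `doubled` are hypothetical names used only here.

```julia
# Hypothetical example function with a local variable
function double_plus_one(v)
    doubled = 2 * v         # doubled only exists inside the function
    return doubled + 1
end

answer = double_plus_one(4)
println(answer)                 # 9
println(@isdefined(doubled))    # false, doubled is not known outside the function
println(@isdefined(answer))     # true, answer was created outside the function
```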

# More exercises
Whenever you want more information on a function in Julia, type a ? followed by the function name in the REPL and press enter.

#### **E34**
Write a function that takes a vector as input, and cyclically shifts the elements by one position and prints the vector until the original vector has been re-obtained.

#### **E35**
You are given a file containing an unknown amount of numbers. Each number is one of the numbers 1 to 9. A number can appear zero or more times and can appear anywhere in the file. Some sample data are:

file = [5, 3, 7, 7, 7, 4, 3, 3, 2, 2, 2, 6, 7, 4, 7, 7, 2, 2, 9, 6, 6, 6, 6, 6, 8, 5, 5, 3, 7, 9, 9, 9]

Write a function that takes a vector with the data, goes through it once, and prints the number that appears the most in consecutive positions and the number of times it appears. Ignore the possibility of a tie. For the above data, output should be 6, which appears consecutively 5 times.


#### **E36**
Now remember exercise 26:

Using a loop convert the following vector of words/strings to a single sentence/string. Do not forget the spaces between words. 

To achieve the same goal, we can also do this without a for loop, using a simple dot operation and a function. Figure out (Google) how to achieve this. This is a perfect example of how there are always multiple ways to reach the same goal in coding.

In [None]:
v = ["This", "is", "going", "to", "be", "a", "sentence", "with", "spaces"]



Finally, to further practice the basics with more exercises, the following site is a great starting point: https://mathigon.org/course/programming-in-julia/introduction

Also, on Exercism you can find more of these assignments with different levels of difficulty, which are a great way to practice your basic programming skills. https://exercism.org/tracks/julia

Julia documentation page: https://docs.julialang.org/en/v1/ 