# Getting Started in Julia
## Julia Basics
To get started in Julia, we're going to look at two basic concepts: **loops** and **functions**. We won't cover the full Julia syntax, but it is easy to pick up (especially if you are familiar with Python). The documentation is available [here](https://docs.julialang.org/en/v1/).

### Jupyter notebooks

Jupyter notebooks are a convenient way to write and run code in interactive Julia sessions. This can be really useful when you are exploring data or writing code that runs over multiple lines.
Notebooks can also display formatted text (like this cell) and images. These notebooks are very commonly used for Julia and Python projects since they allow you to create a narrative around your code and make it easier for other users (and yourself!) to follow.

You can learn more about how to navigate Jupyter notebooks (creating cells, changing between code and text, etc.) [here](https://nbviewer.jupyter.org/github/ipython/ipython/blob/3.x/examples/Notebook/Index.ipynb).

### For loops
A **for** loop is a piece of code that repeats the same calculation multiple times.

In [None]:
for i=1:5
    println(i)
end

In Julia, for loops can iterate over any array.

In [None]:
v = ["alpha","beta"]
for i in v
    println(i)
end

### If statement
An **if** statement evaluates a condition and only runs code if the condition is `true`. If the condition is `false`, the inner code is ignored.

We can include several conditions in the same if statement by adding an **elseif** condition. If we add an **else** statement at the end, this code will run if none of the previous conditions were `true`. 

In [None]:
# if the number is less than 3, print it
# if the number is equal to 4, add 2 and print the result
# for other numbers print "skip"
x = 4

if x<3
    println(x)
elseif x==4
    println(x+2)
else
    println("skip")
end

#### Exercise
Write a for loop to evaluate the if statement (above) on the numbers 1 to 5.

In [None]:
for x=1:5
    if x<3
        println(x)
    elseif x==4
        println(x+2)
    else
        println("skip")
    end
end

### While loops
A **while** loop combines the ideas of **for** and **if**. As long as the **while** condition is `true`, it runs the code inside the loop and then returns to check the condition again. It stops once the condition is no longer `true`.

Note: when writing while loops, make sure that the condition will eventually be `false` to avoid an infinite loop!            

In [None]:
# while loop
i = 1
while i<5
    println(i)
    i += 1 # this is just a quicker way of saying i = i + 1
end

## Functions
We've already learned how to use built-in functions in `R` and Julia. You can also write your own functions to store and re-use code that you'll need to run many times. In Julia, functions are especially beneficial because Julia's just-in-time compiler compiles each function the first time it is called, so later calls run efficiently.

Each function has a few basic components:
* name (this is what you will use to run the function)
* arguments/input data
* body (code that runs in the function)
* output

In [None]:
function addOne(x) # name(arguments)
    # body of the function
    y = x + 1
    return y # output
end

addOne(5)

In [None]:
# function
function Maximum(a, b) # name(arguments)
    # body of the function
    if a > b
        println("$a is bigger than $b")
        return a # output
    elseif b > a
        println("$b is bigger than $a")
        return b # output
    elseif b == a
        println("$a is equal to $b")
        return a # output
    end
end

In [None]:
m = Maximum(2,1)
println("m=$m")

#### Exercise 1: Practicing Functions
Write a function that takes in a 1D array of numbers and returns another array containing only the even numbers from the first array. Hint: a number x is even if we divide it by 2 and get a remainder of 0. In Julia, use x%2 to calculate the remainder.
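
One possible solution is sketched below. The function name `keep_evens` is our own choice for illustration, not something built into Julia. It loops over the input and keeps every element whose remainder after division by 2 is 0.

```julia
# Hypothetical helper name; any name would work.
function keep_evens(numbers)
    evens = eltype(numbers)[]   # empty array with the same element type as the input
    for x in numbers
        if x % 2 == 0           # a remainder of 0 means x is even
            push!(evens, x)     # append x to the result
        end
    end
    return evens
end

keep_evens([1, 2, 3, 4, 5, 6])
```

Julia's built-in `filter(iseven, numbers)` should give the same result in one line; writing the loop out by hand is just practice with functions, loops, and if statements.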

#### Advanced Exercise: Handling errors
Functions can behave in unexpected ways if their input doesn't match the expected type. For example, try running one of the functions above with strings as inputs.
Look up the try/catch statement here: https://docs.julialang.org/en/v1/manual/control-flow/#Exception-Handling-1. Can you implement this to provide a warning if the function code doesn't succeed?
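
As a starting point, here is a minimal sketch of one way to do this. The name `safe_add_one` and the choice to return `nothing` on failure are assumptions made for this example; you may prefer to rethrow the error or return a default value instead.

```julia
# Hypothetical wrapper: tries to add 1 and warns instead of crashing.
function safe_add_one(x)
    try
        return x + 1
    catch err
        # `+` is not defined for a String and an Int, so Julia throws an error here
        @warn "Could not add 1 to the input" input = x exception = err
        return nothing
    end
end

safe_add_one(5)        # works as usual
safe_add_one("five")   # logs a warning and returns `nothing`
```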

## Intro to Data Manipulation and Visualization in Julia
In this section, we will learn how to read in data and conduct data manipulation and visualization in Julia. This is an important step in solving a real-world optimization problem, as real-world data can be messy and difficult to work with.

## DataFrames
Julia has a tabular data structure similar to data frames in `R`, provided by the `DataFrames` package. You will need to load the packages `DataFrames` and `CSV` first:

In [None]:
using DataFrames, CSV

Now let's read in the data:

In [None]:
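# Current versions of CSV.jl need the target type (here DataFrame) as the second argument
iris = CSV.read("iris.csv", DataFrame);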

To view the first few rows of the data, you can use `first()`, or index the data frame much as you did in `R`.

To subset rows, pass in the indices in the first dimension. If you are not subsetting to particular columns, just pass in `:` in the second dimension (as opposed to leaving it blank in `R`).

In [None]:
iris[1:5,:]
first(iris,5)

To index a column by its name, simply put a `:` in front of the name to make it into a Julia symbol.
We could also write the column name like this: `Symbol("SepalLength")`.


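To select all rows, you can type either `[:,:columnName]` or `[!,:columnName]`. The `:` form returns a copy of the column, while `!` returns the column stored in the data frame without copying it.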

In [None]:
iris[!,:SepalLength]
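
The difference between the two row selectors matters once you start modifying columns. The sketch below uses a small made-up data frame `df` (not the iris data) to show that `:` hands back a copy while `!` hands back the stored column.

```julia
using DataFrames

# Small made-up table used only for this illustration
df = DataFrame(a = [1, 2, 3])

copied = df[:, :a]    # `:` returns a copy of the column
shared = df[!, :a]    # `!` returns the column stored in the data frame

copied[1] = 100       # does not change df
shared[2] = 200       # changes df

df[!, Symbol("a")]    # same column, selected with an explicitly constructed symbol
```

After running this, only the change made through `shared` shows up in `df`.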

In [None]:
# Build a small lookup table with one price per species
species_price = DataFrame(Species = ["setosa", "versicolor", "virginica"],
                          Price = [2.5, 3.1, 3.2])

In [None]:
# Left join: keep every row of iris and attach the matching Price for its species
leftjoin(iris, species_price, on = :Species)
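
A left join keeps every row of the first table and fills in `missing` where the second table has no match. The sketch below illustrates this with two made-up tables (`flowers` and `prices` are hypothetical names, and the values are invented for the example), using the `leftjoin` function from DataFrames.jl.

```julia
using DataFrames

# Made-up data: "unknown" has no entry in the price table
flowers = DataFrame(Species = ["setosa", "virginica", "unknown"],
                    SepalLength = [5.1, 6.3, 4.9])
prices = DataFrame(Species = ["setosa", "versicolor", "virginica"],
                   Price = [2.5, 3.1, 3.2])

# Keep every row of `flowers`; rows without a matching species get `missing` as Price
joined = leftjoin(flowers, prices, on = :Species)
```

Note that "versicolor" does not appear in the result, because a left join only keeps species that occur in the first table.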


## Plotting in Julia

Julia also has extensive support for plotting. 

* `Plots.jl` is a powerful and concise tool for plotting. It provides a common interface to many other plotting packages with simple and consistent syntax.
* `StatsPlots.jl` offers DataFrames integration for `Plots`. You can pass in a data frame, and map aesthetics to the column names directly.

Using these would be somewhat similar to working with `ggplot2` in `R`. 

Here is an example of a scatter plot based on the `iris` data, where the x axis is the `SepalLength`, y axis is `SepalWidth`, and the grouping (therefore the colors) are based on the `Species`.

In [None]:
using Plots
using StatsPlots
pyplot()
scatter(iris[!,:SepalLength],iris[!,:SepalWidth],group=iris[!,:Species])

We can make the plot more interesting by adding a few custom settings. For example:
* Give it a title
* Provide xlabel and ylabel
* Change the transparency, shape, and size of the dots
* Change the background color to dark grey

In [None]:
scatter(iris[!,:SepalLength],iris[!,:SepalWidth],group=iris[!,:Species],
        title = "Sepal length vs. width",
        xlabel = "Length", ylabel = "Width",
        m=(0.5, [:cross :hex :star7], 12),
        bg=RGB(.2,.2,.2))

You can also draw a box plot on top of a violin plot, grouped by species. Note that the `!` in `boxplot!` means the function modifies the existing plot, so the box plot is added to the violin plot instead of starting a new figure.

In [None]:
violin(iris[!,:Species],iris[!,:SepalLength])
boxplot!(iris[!,:Species],iris[!,:SepalLength], leg=false,
    xlabel = "Species", ylabel = "Sepal Length")

There are many other types of plots and custom options. You can explore more from [the tutorial](https://juliaplots.github.io/tutorial/).

#### Exercise: Plotting Icecream data

This time, we are going to read in a dataset directly from the package `RDatasets`. Use the following syntax 
```dataset("Ecdat", "Icecream")```

and save it as a dataframe called `icecream`. 

The dataset describes ice cream consumption. The columns are:
* `Cons`: consumption level of ice cream
* `Income`: income level
* `Price`: price of ice cream
* `Temp`: outside temperature at time of measurement

Inspect the first few rows of the data.

In [None]:
using RDatasets
icecream = dataset("Ecdat", "Icecream")
first(icecream,5)

##### Question 1: How is income related to consumption?

In [None]:
scatter(icecream[!,:Income], icecream[!,:Cons],
    xlabel = "Income", ylabel = "Consumption")

##### Question 2: Do you see a positive relationship between the temperature and revenue?
Hint: start by creating the `Revenue` variable as the product between `Price` and `Cons`. 



In [None]:
# Revenue = price times consumption, computed element by element
icecream[!,:Revenue] = icecream[!,:Price] .* icecream[!,:Cons]
scatter(icecream[!,:Temp], icecream[!,:Revenue],
    xlabel = "Temperature", ylabel = "Revenue")

##### Question 3: How does consumption relate to income?
Create a new variable `IncomeGroup` that assigns a label to each row based on how much income was recorded (e.g. you could have 'low', 'medium' and 'high' groups). Then, plot the distribution of consumption across the different groups.

In [None]:
# Label an income value as "low", "medium", or "high"
function get_income_group(x)
    if x < 80
        return "low"
    elseif x < 85
        return "medium"
    else
        return "high"
    end
end

icecream[!,:IncomeGroup] = map(get_income_group,icecream[!,:Income])

In [None]:
boxplot(icecream[!,:IncomeGroup], icecream[!,:Cons], leg=false,
    xlabel = "Income group", ylabel = "Consumption")