# Introduction to Julia

Abhi Gupta

6/26/2017

Based on Pearl Li's notebook from [QuantEcon's RBA/RBNZ Julia workshops](https://github.com/QuantEcon/RBA_RBNZ_Workshops).

Exercises taken from QuantEcon's [Julia Essentials](https://lectures.quantecon.org/jl/julia_essentials.html) and [Vectors, Arrays, and Matrices](https://lectures.quantecon.org/jl/julia_arrays.html) lectures.

### Outline

1. Syntax Review
2. Types and Multiple Dispatch
3. Additional Exercises

## Syntax Review

Most of the syntax covered here will be fairly familiar to users of MATLAB, but is worth covering in one place nonetheless.

### Hello World

In [None]:
println("Hello world!")

### Variable Assignment

In [None]:
# Assign the value 10 to the variable x
x = 10

In [None]:
# Variable names can have Unicode characters
# To get ϵ in the REPL, type \epsilon<TAB>
ϵ = 1e-4

In Julia, a variable name is just a reference to some data, not the piece of data itself. Multiple names can be associated with the same piece of data, unlike in MATLAB, where the name of a piece of data is bound to the data itself.
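A minimal sketch of what "a name is just a reference" means in practice. The names `u`, `v`, and `w` are illustrative and do not appear elsewhere in this notebook.

```julia
# Two names bound to the same array
u = [1, 2, 3]
v = u            # no copy is made; v refers to the same data as u
v[1] = 100       # mutate the data through v
println(u[1])    # the change is visible through u as well

# copy creates an independent array
w = copy(u)
w[2] = -1
println(u[2])    # u is unaffected by changes to w
```

If you need an independent array rather than a second name for the same one, `copy` is the usual tool.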

Variable names are case-sensitive. By convention, they are in snake_case.

### Booleans

Comparisons:

In [None]:
0 == 1

In [None]:
2 != 3

In [None]:
3 <= 4

Boolean operators:

In [None]:
true && false

In [None]:
true || false

In [None]:
!true

### Strings

In [None]:
# Strings are written using double quotes
str = "This is a string"

In [None]:
# Strings can also contain Unicode characters
fancy_str = "α is a string"

In [None]:
# String interpolation using $
# The expression in parentheses is evaluated and the result is 
# inserted into the string
"2 + 2 = $(2+2)"

### Functions

In [None]:
# Regular function definition
function double(x)
    y = 2x # a number written directly before a variable multiplies it; no * needed
    return y
end

In [None]:
# Inline function definition
inline_double(x) = 2x

In [None]:
# Functions can refer to variables that are in scope when the
# function is defined
a = 5
add_a(x) = x + a
add_a(1)

In [None]:
# Functions can return multiple values (as a tuple)
duple_of(x) = x, x + 1
a, b = duple_of(3)

In [None]:
# Optional arguments - no more varargin!
function foo(x, y = 0, override = 0)
    if override == 0
        return x + y
    else
        return override
    end
end

# Call with one argument
foo(5)

In [None]:
# Call with two arguments
foo(5, 3)

In [None]:
# If we want to specify override, we must also specify y
foo(5, 3, 100)

In [None]:
# Keyword arguments allow arguments to be identified by name
# instead of only by position
function join_strings(string1, string2; separator = ",")
    return string1 * separator * string2
end

# Call without keyword argument
join_strings("ciao", "mondo")

In [None]:
# Call with keyword argument
join_strings("ciao", "mondo"; separator = " ")

### Arrays

Explicit array construction:

In [None]:
A = [1, 2]

In [None]:
B = [1 2 3; 4 5 6]

One-dimensional arrays `Array{Int64,1}` are also called (type-aliased) `Vector{Int64}`s. Two-dimensional arrays are called `Matrix{Int64}`s.

Note that `A` is a `Vector{Int64}` of length 2, which is distinct from a `Matrix{Int64}` of size $2 \times 1$ (like a MATLAB "column vector") or a `Matrix{Int64}` of size $1 \times 2$ ("row vector").
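The distinction can be checked directly with `size` and `ndims`. This is a small sketch; the names `vec1`, `col`, and `row` are illustrative only.

```julia
vec1 = [1, 2]                # one-dimensional Vector
col  = reshape(vec1, 2, 1)   # 2×1 Matrix ("column vector" in MATLAB terms)
row  = [1 2]                 # 1×2 Matrix ("row vector")

println(size(vec1))   # (2,)
println(size(col))    # (2, 1)
println(size(row))    # (1, 2)
println(ndims(vec1), " vs ", ndims(col))
```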

Built-in array constructors:

In [None]:
zeros(2)

In [None]:
ones(2)

In [None]:
using LinearAlgebra
Matrix{Float64}(I, 2, 2) # 2×2 identity matrix

In [None]:
fill(true, 2)

Matrix operations:

In [None]:
# Matrix transpose
B'

In [None]:
# Matrix addition
B + B

In [None]:
# Add a matrix to a vector using broadcasting
B .+ A

In [None]:
# Matrix inverse
using LinearAlgebra
C = 4 * Matrix{Float64}(I, 2, 2)
inv(C)

In [None]:
# Elementwise operations
B .> 3

Access array elements using square brackets:

In [None]:
# First row of B
B[1, :]

In [None]:
# Element in row 2, column 3 of B
B[2, 3]

### Control Flow

If statements:

In [None]:
x = -3
if x < 0
    println("x is negative")
elseif x > 0 # optional and unlimited
    println("x is positive")
else         # optional
    println("x is zero")
end

While loops:

In [None]:
i = 3
while i > 0
    println(i)
    i = i - 1
end

For loops:

In [None]:
# Iterate through ranges of numbers
for i = 1:3
    println(i)
end

In [None]:
# Iterate through arrays
cities = ["Boston", "New York", "Philadelphia"]
for city in cities
    println(city)
end

In [None]:
# Iterate through several arrays in lockstep using zip
states = ["MA", "NY", "PA"]
for (city, state) in zip(cities, states)
    println("$city, $state")
end

In [None]:
# Iterate through arrays and their indices using enumerate
for (i, city) in enumerate(cities)
    println("City $i is $city")
end

### Exercises
1. Consider the polynomial $$p(x) = \sum_{i=0}^n a_i x^i$$ Using `enumerate`, write a function `p` such that `p(x, coeff)` computes the value of the polynomial with coefficients `coeff` evaluated at `x`.

2. Write a function that takes two 1-d arrays `x` and `y` and computes their inner product using `zip`.

3. Write a function that takes two sequences `seq_a` and `seq_b` as arguments and returns `true` if every element in `seq_a` is also an element of `seq_b`, else `false`. By “sequence” we mean an array, tuple or string (many Julia operations will work on all 3 types).

## Types and Multiple Dispatch

A **data type** is a classification identifying the kind of data you have. An object’s type determines the possible values it can take on, which operations and functions can be applied to it, and how the computer stores it.

Examples:

- Numeric types: `Int64`, `Float64`
- String type: `String`
- `Bool`
- `Array`

Names of types are written in UpperCamelCase.
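As a rough illustration of how a type fixes the possible values, the storage, and the applicable operations, the following sketch uses only built-in functions:

```julia
# The type fixes the range of representable values ...
println(typemin(Int64), " to ", typemax(Int64))

# ... how many bytes are used to store one value ...
println(sizeof(Int64), " bytes for an Int64, ", sizeof(Bool), " for a Bool")

# ... and what an operation means for that kind of data
println(1 + 2)          # integer addition
println("ab" * "cd")    # * on strings is concatenation
```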

A **concrete instance** (also an object or a value) of a type `T` is a piece of data in memory that has type `T`.

Variables are not data, but are simply names that point/refer to a specific piece of data. The underlying data that a variable refers to has a specific type.

In [None]:
# What is the type of 10?
typeof(10)

In [None]:
# Is 10 an Int64?
isa(10, Int64)

In [None]:
# What is the type of the elements of an array?
X = [1.0, 2.0, 3.0]
eltype(X)

### Composite Types

A **composite type** is a collection of named fields that can be treated as a single value. They bear a passing resemblance to MATLAB structs.

All fields must be declared ahead of time. The double colon, `::`, constrains a field to contain values of a certain type. This is optional for any field.
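To see what leaving off `::` does, here is a sketch with a hypothetical type `Tagged` (not used elsewhere in this notebook), written in Julia 1.x syntax where mutable composite types are declared with `mutable struct`. An unannotated field behaves as if it were declared `::Any`.

```julia
# Hypothetical type: `label` has no annotation, so it accepts any value
mutable struct Tagged
    value::Float64
    label
end

t = Tagged(1, "first")     # the default constructor converts 1 to 1.0
println(typeof(t.value))

t.label = :second          # allowed, because the field is unconstrained
println(fieldtype(Tagged, :label))
```

Annotating fields with concrete types generally helps the compiler, so unannotated fields are best kept for cases where the flexibility is really needed.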

In [None]:
# Type definition
mutable struct Parameter
    value::Float64
    transformation::Function # Function is a type!
    tex_label::String
    description::String
end

When a type with $n$ fields is defined, a constructor (function that creates an instance of that type) that takes $n$ ordered arguments is automatically created. Additional constructors can be defined for convenience.

In [None]:
# Creating an instance of the Parameter type using the default
# constructor
β = Parameter(0.9, identity, "\\beta", "Discount rate") # escape the backslash: "\b" is a backspace

In [None]:
# Alternative constructors end with an appeal to the default
# constructor
function Parameter(value::Float64, tex_label::String)
    transformation = identity
    description = "No description available"
    return Parameter(value, transformation, tex_label, description)
end

α = Parameter(0.5, "\\alpha") # escape the backslash: "\a" is the bell character

In [None]:
# Find the fields of an instance of a composite type
fieldnames(typeof(α))

In [None]:
# Access a particular field using .
α.value

In [None]:
# Fields are modifiable and can be assigned to, like 
# ordinary variables
α.value = 0.75

### Subtyping

Types are hierarchically related to each other. All are subtypes of the `Any` type.

There are two main kinds of types in Julia:

1. Concrete types: familiar types that you can create instances of, like `Int64` or `Float64`.
2. Abstract types: nodes in a type graph that serve to group similar kinds of objects. Abstract types cannot be instantiated and do not have explicitly declared fields. For example, `Integer` or `Number`.
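The two kinds of types can be told apart programmatically. The sketch below uses only built-in introspection functions:

```julia
# Concrete types can be instantiated; abstract types only group other types
println(isconcretetype(Int64))     # true
println(isabstracttype(Integer))   # true

# Walk up the hierarchy from Int64
println(supertype(Int64))          # Signed
println(Int64 <: Integer && Integer <: Number && Number <: Any)  # true

# Trying to instantiate an abstract type fails
try
    Number()
catch err
    println("Cannot instantiate Number: ", typeof(err))
end
```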

In [None]:
# Define an abstract type
abstract type Model end

In [None]:
# Define concrete subtypes of that abstract type
mutable struct VAR <: Model
    n_lags::Int64
    variables::Vector{Symbol}
    data::Matrix{Float64}
end

In [None]:
# Check subtyping relation
VAR <: Model

In [None]:
# Instances of the VAR type are also instances of the Model type
model = VAR(1, [:gdp, :inflation], zeros(10, 2)) # zeros(10, 2) is placeholder data
isa(model, Model)

In [None]:
# Why does this throw an error?
3 <: Number

### Parameterized Types

**Parameterized types** are data types that are defined to handle values identically regardless of the type of those values.

Arrays are a familiar example. An `Array{T,1}` is a one-dimensional array filled with objects of any type `T` (e.g. `Float64`, `String`).

In [None]:
# Defining a parametric type
mutable struct Duple{T} # T is a parameter to the type Duple
    x::T
    y::T
end

This single declaration defines an unlimited number of new types: `Duple{String}`, `Duple{Float64}`, etc. are all immediately usable.
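A sketch of how the type parameter gets filled in, using a hypothetical immutable type `Pt` (Julia 1.x `struct` syntax) so that it does not clash with `Duple`. The parameter can be inferred from the arguments or written explicitly.

```julia
# Hypothetical parametric type, analogous to Duple
struct Pt{T}
    x::T
    y::T
end

p = Pt(3, -15)             # T is inferred from the arguments
q = Pt{Float64}(3, -15)    # T is given explicitly; the arguments are converted

println(typeof(p))
println(typeof(q))
println(Pt{Float64} <: Pt) # every Pt{T} is a subtype of the unparameterized Pt
```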

In [None]:
Duple(3, -15)

In [None]:
Duple("Broadway", "42nd St")

In [None]:
# What happens here?
Duple(1.5, 3)

We can also restrict the type parameter `T`:

In [None]:
# T can be any subtype of Number, but nothing else
mutable struct PlanarCoordinate{T<:Number}
    x::T
    y::T
end

In [None]:
# This throws an error because String is not a subtype of Number
PlanarCoordinate("4th Ave", "14th St")

### Why Use Types?

You can write all your code without thinking about types at all. If you do this, however, you’ll be missing out on some of the biggest benefits of using Julia.

If you understand types, you can:

- Write faster code
- Write expressive, clear, and well-structured programs (keep this in mind when we talk about functions)
- Reason more clearly about how your code works

Even if you only use built-in functions and types, your code still takes advantage of Julia’s type system. That’s why it’s important to understand what types are and how to use them.

In [None]:
# Example: writing type-stable functions
function sumofsins_unstable(n::Float64)  
    sum = 0  
    for i in 1:n  
        sum += sin(3.4)  
    end  
    return sum 
end  

function sumofsins_stable(n::Float64)  
    sum = 0.0  
    for i in 1:n  
        sum += sin(3.4)  
    end  
    return sum 
end

# Compile and run
sumofsins_unstable(1e5)
sumofsins_stable(1e5)

In [None]:
@time sumofsins_unstable(1e5)

In [None]:
@time sumofsins_stable(1e5)

In `sumofsins_stable`, the compiler can infer that `sum` is a `Float64` throughout, so it can generate specialized code that saves time and memory. In `sumofsins_unstable`, `sum` starts out as an `Int64` and becomes a `Float64` after the first iteration, so the compiled code has to account for both possibilities inside the loop. For a look at the resulting LLVM intermediate representation, see [this post on writing type-stable code](http://www.johnmyleswhite.com/notebook/2013/12/06/writing-type-stable-code-in-julia/).
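One way to check type stability without reading IR is to ask the compiler which return type it infers. The sketch below uses two hypothetical helpers, `acc_unstable` and `acc_stable`, that mirror the pattern above with an integer loop bound. It relies on `Base.return_types`, a reflection helper that is not exported, so treat it as an illustration rather than a recommended workflow.

```julia
# Hypothetical helper functions mirroring the pattern above
function acc_unstable(n::Int)
    s = 0                # s starts as an Int64 ...
    for i in 1:n
        s += sin(3.4)    # ... and is rebound to a Float64 here
    end
    return s
end

function acc_stable(n::Int)
    s = 0.0              # s is a Float64 from the start
    for i in 1:n
        s += sin(3.4)
    end
    return s
end

# Ask the compiler which return type it infers for an Int argument
rt_unstable = Base.return_types(acc_unstable, (Int,))[1]
rt_stable   = Base.return_types(acc_stable, (Int,))[1]

println(rt_unstable, " is concrete? ", isconcretetype(rt_unstable))
println(rt_stable, " is concrete? ", isconcretetype(rt_stable))
```

Interactively, the `@code_warntype` macro gives a more detailed view and highlights variables whose type could not be inferred concretely.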

### Exercise
Write a function `solve_discrete_lyapunov` that solves the discrete Lyapunov equation $$S = ASA' + \Sigma \Sigma'$$ using the iterative procedure $$S_0 = \Sigma \Sigma'$$ $$S_{t+1} = A S_t A' + \Sigma \Sigma'$$ taking in as arguments the $n \times n$ matrix $A$, the $n \times k$ matrix $\Sigma$, and a number of iterations. 
You can assume that your $A$ and $\Sigma$ matrices are of type `Matrix{Float64}` and the number of iterations is an `Integer`. Make sure to check your code for type stability!

### Multiple Dispatch

So far we have defined functions over argument lists of any type. Methods allow us to define functions “piecewise”. For any set of input arguments, we can define a **method**, a definition of one possible behavior for a function.

In [None]:
# Define one method of the function print_type
function print_type(x::Number)
    println("$x is a number")
end

In [None]:
# Define another method
function print_type(x::String)
    println("$x is a string")
end

In [None]:
# Define yet another method
function print_type(x::Number, y::Number)
    println("$x and $y are both numbers")
end

In [None]:
# See all methods for a given function
methods(print_type)

Julia uses **multiple dispatch** to decide which method of a function to execute when a function is applied. In particular, Julia compares the types of _all_ arguments to the signatures of the function’s methods in order to choose the applicable one, not just the first (hence "multiple").

In [None]:
print_type(5)

In [None]:
print_type("foo")

In [None]:
# This throws an error because no method of print_type has been
# defined for this set of arguments
print_type([1, 2, 3])
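To see that dispatch really does look at every argument and not just the first, here is a sketch with a hypothetical function `pair_kind`. Swapping the argument order selects a different method.

```julia
# Hypothetical function with one method per combination of argument types
pair_kind(x::Number, y::Number) = "two numbers"
pair_kind(x::Number, y::String) = "a number, then a string"
pair_kind(x::String, y::Number) = "a string, then a number"
pair_kind(x, y) = "something else"   # fallback for any other combination

println(pair_kind(1, 2.0))
println(pair_kind(1, "a"))
println(pair_kind("a", 1))
println(pair_kind([1], :sym))
```

Julia picks the most specific method whose signature matches all of the argument types, so the untyped fallback is only used when nothing more specific applies.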

How is multiple dispatch useful for economic research? Recall that we defined the type `VAR` earlier, and made it a subtype of our abstract type `Model`. Let's define another subtype of `Model`:

In [None]:
# Define a general linear model
mutable struct GLM <: Model
    y_variables::Vector{Symbol}
    x_variables::Vector{Symbol}
    y_data::Matrix{Float64} # Nt x Ny
    x_data::Matrix{Float64} # Nt x Nx
end

Now we can use the same function name, `estimate`, to define different estimation behaviors for the different subtypes of `Model`:

In [None]:
using Distributions

function estimate(model::GLM)
    # Estimate a general linear model using OLS
end

function estimate(model::VAR)
    # Estimate a VAR using maximum likelihood
end

function estimate(model::VAR, prior::Distribution)
    # Estimate a Bayesian VAR
end

In [None]:
methods(estimate)

### Exercise

Implement the function `estimate(model::GLM)` using the given `GLM` type. That is, return a matrix of size $N_x \times N_y$ of coefficients estimated using OLS. You may find the `pinv`, `inv`, or `qr` functions helpful.

Test it on the following model:

In [None]:
β = ones(2, 1)                     # Nx x Ny
x_data = rand(1000, 2)             # Nt x Nx
y_data = x_data*β + randn(1000, 1) # Nt x Ny
model = GLM([:y1], [:x1, :x2], y_data, x_data)
# β_hat = estimate(model)

### Writing Julian Code

As we've seen, you can use Julia just like you use MATLAB and get faster code. However, to write faster and _better_ code, attempt to write in a “Julian” manner:

- Define composite types as logically needed
- Write type-stable functions for best performance
- Take advantage of multiple dispatch to write code that looks like math
- Add methods to existing functions
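As a sketch of the last two points, the example below adds methods to the existing functions `+` and `*` for a hypothetical type `Vec2` (Julia 1.x `struct` syntax), so that code using it reads like the underlying math.

```julia
# Hypothetical two-dimensional vector type
struct Vec2
    x::Float64
    y::Float64
end

# Add methods to functions that already exist in Base
Base.:+(a::Vec2, b::Vec2) = Vec2(a.x + b.x, a.y + b.y)
Base.:*(c::Number, a::Vec2) = Vec2(c * a.x, c * a.y)

u = Vec2(1.0, 2.0)
w = Vec2(0.5, -1.0)
println(2u + w)   # 2u is parsed as 2 * u, which dispatches to the method above
```

Qualifying the names as `Base.:+` and `Base.:*` extends the existing functions rather than creating new, unrelated ones.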

### Just-in-Time Compilation

How is Julia so fast? Julia is just-in-time (JIT) compiled, which means (according to [this Stack Overflow answer](http://stackoverflow.com/questions/95635/what-does-a-just-in-time-jit-compiler-do), with emphasis mine):

> A JIT compiler runs after the program has started and compiles the code (usually bytecode or some kind of VM instructions) on the fly (or just-in-time, as it's called) into a form that's usually faster, typically the host CPU's native instruction set. _A JIT has access to dynamic runtime information whereas a standard compiler doesn't and can make better optimizations like inlining functions that are used frequently._

> This is in contrast to a traditional compiler that compiles all the code to machine language before the program is first run.

In particular, Julia uses type information at runtime to optimize how your code is compiled. This is why writing type-stable code makes such a difference in speed!
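A rough way to observe this is to time the first and second call of a function. The function `cubic` below is hypothetical, and the measured times depend on the machine, so no particular numbers should be expected.

```julia
# Hypothetical function used only to observe compilation overhead
cubic(x) = x^3 - 2x + 1

t_first  = @elapsed cubic(1.5)   # first call with a Float64: includes compilation
t_second = @elapsed cubic(1.5)   # same argument type: reuses the compiled code
t_int    = @elapsed cubic(2)     # new argument type: a new specialization is compiled

println("first Float64 call:  ", t_first)
println("second Float64 call: ", t_second)
println("first Int call:      ", t_int)
```

The first call for each new combination of argument types is typically the slow one, because that is when Julia compiles a specialized version of the function.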

## Additional Exercises


1. Write a function `linapprox` that takes as arguments:
   - A function `f` mapping some interval $[a, b]$ into $\mathbb{R}$
   - Two scalars `a` and `b` providing the limits of this interval
   - An integer `n` determining the number of grid points
   - A number `x` satisfying $a \leq x \leq b$

   and returns the piecewise linear interpolation of `f` at `x`, based on `n` evenly spaced grid points `a = point[1] < point[2] < ... < point[n] = b`. Aim for clarity, not efficiency.
