# Getting Started with Julia
* Composed by Anh Tran, PhD Candidate, Department of Economics, University of Bologna.
* This note is adapted mainly from the book **Mastering Julia** by Malcolm Sherrington and from various other sources, which are cited where they are used.
* I assume no prior experience with Julia or with Python (whose scripting style Julia resembles), so I start from the very beginning. I focus on the topics most relevant to econometric problems and leave out many others.
* Sometimes I work on a workstation in my office, but most of the code is run in IJulia (or JuliaBox, a very similar kernel) on my laptops, either an ASUS F555L or a VAIO T13.
* Editors: many editors are available for Julia, but I chose Atom (a cross-platform one) with the Juno package for Julia.
* OS: I found that Julia starts a bit faster on Linux (on the VAIO) and a bit slower on Windows (on the ASUS), but the performance is almost the same.

## A note on packages for Julia 
* Statistics packages are hosted on GitHub by the JuliaStats group (https://Github.com/JuliaStats), and there is a Google group called julia-stats (https://groups.google.com/forum/#!forum/julia-stats).
    * Basic statistics is provided by `Stats.jl` and `StatsBase.jl`. There is support for time series, cluster analysis, hypothesis testing, MCMC methods and more. 
* For mathematics: 
    * The core includes: random number generators & exotic functions 
    * Packages are available for elemental calculus operations, ODE solvers, Monte-Carlo methods, programming and optimization. 
        * Packages for optimization methods are listed at the GitHub page: https://Github.com/JuliaOpt/


## String

```julia 
methodswith(String) # lists all methods that take a String argument
```
- A string is a sequence of characters that can be extracted by indexing, starting from 1
    - if ``str = "Julia"``, then ``str[1]`` returns `'J'` and ``str[end]`` returns `'a'` (both of type `Char`).
    - two functions report the size of a string:
        - `endof(str)` returns the last valid index, e.g. ``endof("Julia")``
        - `length(str)` returns the number of characters, e.g. ``length("Julia")``

#### Example
```julia
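s = "hello"
println(s)
endof(s)              # 5: index of the last character
x = length(s)         # 5: number of characters
ch = s[x-1]           # 'l': indexing returns a Char
println(ch, " - length of s is ", x)
# Unicode
s2 = "I am the α: the beginning"
s2[12]                # ':' because α occupies two bytes (indices 10 and 11)
length(s2)            # 25 characters
endof(s2)             # 26 bytes
# length() and endof() give two different values here,
# so indexing by character count can fail on Unicode strings;
# to visit every character it is safest to loop over the string
for c in s2
    println(c)
end
# substring: obtained by taking a range of indices
s2[3:5]               # "am "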
```
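
The gap between byte indices and character counts can be sketched as below. This is only an illustration and assumes Julia 1.x rather than the 0.4 release used in these notes (there `endof` is called `lastindex`); the variable names are mine.

```julia
# Assumes Julia 1.x, where `endof` was renamed `lastindex`.
s2 = "I am the α: the beginning"

nchars = length(s2)        # number of characters
last_i = lastindex(s2)     # byte index at which the last character starts

# Valid indices skip the second byte of α.
idx = collect(eachindex(s2))

# Collect the characters by iterating instead of indexing by position.
chars = [ch for ch in s2]

println("characters: ", nchars, ", last index: ", last_i)
println("10th character: ", chars[10])
```

Iterating (or using `eachindex`) never lands in the middle of a multi-byte character, which is why it is the safer habit for non-ASCII text.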

### Symbol type of Strings 
* Names prefixed with __`:`__ (and written without quotes), such as `:green`, are of type `Symbol`
    * used for IDs or keys 
    * cannot be concatenated 
    * should be used only if expected to remain constant over the course of the execution of the program 
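
A small sketch of how symbols are typically used, assuming Julia 1.x; the dictionary keys and values are hypothetical and only serve as an illustration.

```julia
color = :green

# Two symbols with the same name are the same object.
same = (:green === Symbol("green"))

# Symbols are convenient as dictionary keys (hypothetical settings).
settings = Dict(:alpha => 0.05, :method => "ols")

# Symbols cannot be concatenated directly; convert to a String first.
as_text = string(color) * "ish"

println(typeof(color), " ", settings[:alpha], " ", as_text)
```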
    
### Constructing Strings
* Julia has an elegant string interpolation mechanism for constructing strings 
* Strings can be concatenated
    * with the `*` operator or with the `string()` function:
        * Eg1: `"ABC" * "CDE"` returns "ABCCDE"
        * Eg2: `string("abc","def","ghi")` returns "abcdefghi"
* **Special characters**
    * escape sequences such as `\n`, `\t`, `\"` are supported
    * Eg:
    ```julia 
    s = "This is a double quote \" character."
    println(s) # prints: This is a double quote " character.
    ```
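
The interpolation mechanism mentioned above has no example yet, so here is a minimal sketch (assuming Julia 1.x; the variable names are mine). A `$` inside a string inserts the value of a variable, and `$( ... )` inserts the value of any expression.

```julia
name = "Julia"

# Interpolation of a variable and of expressions.
greeting = "Hello, $(name)!"
summary  = "$name has $(length(name)) characters; 2 + 3 = $(2 + 3)"

# Concatenation with `*` and with `string()`.
joined1 = "ABC" * "CDE"
joined2 = string("abc", "def", "ghi")

println(greeting)
println(summary)
```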

### Formatting Numbers & Strings 
* The `@printf` macro takes a format string and one or more variables, which are substituted into the string as it is formatted
    * the format string contains **placeholders** for the variables
* `@sprintf` works the same way but returns the formatted string instead of printing it

```julia 
name = "Pascal"
@printf("Hello, %s \n", name)  
@sprintf("Hello, %s", name)
# @sprintf("Hello, %s \n", name) also works, but the returned string then ends with a newline, so \n is usually left out
```
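
For econometric output the numeric placeholders matter most. The sketch below assumes Julia 1.x, where the macros live in the `Printf` standard library (in 0.4 they are available without any `using`); the numbers are made-up values, not results.

```julia
using Printf   # needed in Julia 1.x; not needed in Julia 0.4

beta = 0.123456      # hypothetical coefficient estimate
se   = 0.0451        # hypothetical standard error
n    = 250           # hypothetical sample size

# %.3f and %.4f fix the number of decimals, %d formats an integer.
line = @sprintf("beta = %.3f (se = %.4f), n = %d", beta, se, n)
println(line)

# %6.2f pads the number to a width of six characters.
@printf("t-statistic: %6.2f\n", beta / se)
```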



## Logical and Arithmetic Operators 
* Decimal literals do not require prefixes
* Binary, octal and hexadecimal literals need the prefixes `0b`, `0o` and `0x`
    * I guess these types are rarely used in economics, so I won't go into detail
* For operating on bits: `~` (not), `|` (or), `&` (and), `$` (xor)
* Arithmetic shifts are performed with the `<<` (left) and `>>` (right) operators
    * Arithmetic shifts preserve the sign bit.
    * This is irrelevant when dealing with *unsigned* integers

#### Example
```julia 
x=0xbb31; y=0xaa5f; 
z1=x$y 
z2=z1<<8  # the top two nibbles discarded 
typeof(z2)
```
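
The same operations written for Julia 1.x, as a sketch: the main assumption is the version, since `$` no longer means xor there and `xor()` (or `⊻`) is used instead. The variable names are mine.

```julia
x = 0xbb31; y = 0xaa5f          # UInt16 literals (four hex digits)

z_and = x & y
z_or  = x | y
z_xor = xor(x, y)                # written x $ y in Julia 0.4
z_not = ~x

z_shl = z_xor << 8               # bits pushed past the 16-bit width are dropped
z_shr = z_xor >> 4

# On signed integers >> keeps the sign, while >>> shifts in zeros.
neg_arith   = -8 >> 1
neg_logical = -8 >>> 1

println(string(z_xor, base = 16))
```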

## Booleans 
* `Bool` in Julia is a logical type. 
* A variable is of type `Bool` when it is assigned:
    * the constant `true` or `false`, or
    * a logical expression, e.g. `p = (2 > 3)`, after which `typeof(p)` returns `Bool`
* True/False
    * Unlike some languages, Julia does not treat 0, empty strings or nulls as false and everything else as true; a condition must be an actual `Bool`.
    * A `Bool` value may be promoted to an integer, in which case `true` corresponds to 1 and `false` to 0

#### Example 
```julia 
x        # 0xbb31 (defined in the previous example)
p=(2>3)  # false
x+p      # 0xbb31, since false is promoted to 0
p=(2<3)  # true
x+p      # 0xbb32, since true is promoted to 1
```
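
Both points, promotion in arithmetic and the strictness of conditions, can be checked with a short sketch. It assumes Julia 1.x, and the helper `is_accepted_as_condition` is a hypothetical name introduced only for this illustration.

```julia
p = (2 > 3)
q = (2 < 3)

# Bool is promoted in arithmetic: false counts as 0 and true as 1.
count_true = p + q + true

# Hypothetical helper: does Julia accept `v` as a condition?
function is_accepted_as_condition(v)
    try
        return v ? true : true      # only a Bool gets past the `?`
    catch err
        err isa TypeError || rethrow()
        return false                # non-Bool values raise a TypeError
    end
end

println(count_true)
println(is_accepted_as_condition(0), " ", is_accepted_as_condition(""))
```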

## Arrays 
* An indexable collection of (commonly) homogeneous values such as integers, floats or booleans.
* The index in Julia starts at **1**, not 0.
* Create arrays:
    1. by enumerating its values
    2. by defining a range as: `[start:step:end]`
        * if `step` is given as a float, the element type of the array will be that of the step
    3. by using functions: `zeros, ones, rand`.
        * values are usually returned as floating-point numbers. A little work is needed to obtain integer results.
    4. by using a list comprehension, e.g. for a series: i) create an uninitialized array, ii) assign the first values, iii) use the definition of the series to fill in the remaining values
    5. by using `Int64[]` for a completely empty array. The size of an array is fixed at creation, but functions can alter it, e.g. `push!()` adds a value and increases the length by one.
        * *Note*: the trailing `!` is a naming convention borrowed from Lisp-like languages for functions that modify their arguments, so one shouldn't use it in the names of ordinary variables.
        
#### Example
**Arrays**
```julia
S1=[1, 2, 3, 1, 3, 10, 9] # 7-element Array{Int64,1}
S2=[1:3:9] # 3-element Array{Int64,1}
S3=[1:0.2:8] # 36-element Array{Float64,1}
             # type of S3's elements is Float64, not Int64, as the step is a float
S4=int(zeros(12)) #note the int() before using function zeros 
                  # 12-element Array{Int64,1}
S5=rand(1:12, 19) #19-element Array{Int64,1}
S6=int(ones(12)) 
#list comprehension for Fibonacci series  
A1=Array(Int64,18); A1[1]=0; A1[2]=1; [A1[i]=A1[i-1]+A1[i-2] for i=3:length(A1)]
    # the comprehension returns a 16-element Array{Any,1}; A1 itself now holds 18 Fibonacci numbers
#empty array
A2=Int[] ; A3=Int64[] #same 
push!(A3,123) # 1-element Array{Int64,1}
```
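
Several of the constructors above changed after Julia 0.4 (`int()`, `Array(Int64, n)` and `[a:b]` as a way to expand a range). For comparison, here is a sketch of the same five ways of building arrays, assuming Julia 1.x; the array names beyond `S1`...`S5` are mine.

```julia
S1 = [1, 2, 3, 1, 3, 10, 9]        # 1. by enumeration
S2 = collect(1:3:9)                # 2. from a range: 1, 4, 7
S3 = collect(1:0.2:8)              #    a float step gives Float64 elements
S4 = zeros(Int, 12)                # 3. integer zeros, no conversion needed
S5 = rand(1:12, 19)                #    19 random integers between 1 and 12

# 4. Fibonacci numbers: create an uninitialized array, set the first two
#    values, then fill in the rest from the recurrence.
F = Vector{Int}(undef, 18)
F[1] = 0; F[2] = 1
for i in 3:length(F)
    F[i] = F[i-1] + F[i-2]
end

# A comprehension builds an array directly from an expression.
squares = [i^2 for i in 1:5]

# 5. Start from an empty array and grow it with push!
A3 = Int[]
push!(A3, 123)
```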

**Matrices** 
```julia
A=[1 2 3 4 4] # 1x5 Array{Int64,2}. With commas, A would be a 5-element Array{Int64,1}
A=[1 2 3 4 4; 1 2 3 4 5] # 2x5 Array{Int64,2}
A=reshape(A, 5,2) # reshape to 5x2
# identity matrix
I=eye(4) # returns a square (real) matrix with ones on the leading diagonal and zeros elsewhere

```

## Operations on Matrices 
* We are going to look at the simplest of operations. Operations on matrices are particularly important for econometrics. 
* **Matrix Operations**
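    * transpose: `transpose()`
    * addition and subtraction: `+, -`
    * product (an m×p matrix times a p×q matrix gives an m×q matrix): `*`
    * division: `/`, where `A / B` is equivalent to `A * inv(B)` when `B` is square and invertible (a non-square `B` is also accepted)
* **Elemental Operations** (same shape/size matrices)
    * work on the elements of the matrix
    * product `A(ij) * C(ij)`: `.*`
    * division `A(ij) / C(ij)`: `./`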
    * comparison: `.==`
* **Other**
    * inverse: `inv(A)`
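
The operations listed above can be tried on two small matrices. This is a minimal sketch assuming Julia 1.x, where the identity `I` comes from the `LinearAlgebra` standard library; the matrices are arbitrary values chosen for the illustration.

```julia
using LinearAlgebra   # standard library in Julia 1.x; provides I

A = [2.0 1.0; 1.0 3.0]
B = [1.0 0.0; 4.0 2.0]

At   = transpose(A)     # transpose
S    = A + B            # matrix addition
P    = A * B            # matrix product
E    = A .* B           # elementwise product
Ainv = inv(A)           # inverse
X    = A / B            # right division, A * inv(B)
same = A .== At         # elementwise comparison (A is symmetric here)

println(P)
println(E)
```

For solving a linear system it is usually better to write `A \ b` than `inv(A) * b`, since it avoids forming the inverse explicitly.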

# Real and Rational Numbers 
I consider two number types: `Real` and `Rational` and Julia's associated operations. I leave `Complex` aside for a moment.

## Reals 
### Operators and Built-in Functions 
* For a comprehensive list of built-in functions, let's have a look at https://github.com/JuliaLang/julia/blob/release-0.4/doc/manual/mathematical-operations.rst
    * Example: `exp(), log(), sin(), cos(), gamma, bessel, zeta and hankel`
    * Note: the multiplication sign `*` can be omitted when there's no ambiguity. E.g., if `x` is a variable, then `2*x` is equivalent to `2x`

### Special Values 
* There are three special values in dealing with real numbers: `Inf, -Inf` and `NaN`
    * `NaN` is "not a number" and not equal to any floating point value including itself. 
- Example 
```julia 
1.0/0.0 #Inf
-1.0/0.0 #-Inf
0.0/0.0 #NaN
typemin(Float64) #-Inf
typemax(Float64) #Inf
```
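
Because `NaN` is not equal to itself, it has to be detected with `isnan` rather than with `==`. A short sketch (it should behave the same in any recent Julia version; the data vector is made up):

```julia
x = 0.0 / 0.0                 # NaN

nan_equals_itself = (x == x)  # comparison with == always fails for NaN
nan_detected      = isnan(x)  # the reliable test
inf_detected      = isinf(1.0 / 0.0)

# NaN propagates through arithmetic, so one bad value contaminates a sum.
data  = [1.0, 2.0, NaN, 4.0]  # made-up data with one missing-like value
total = sum(data)
clean = sum(filter(!isnan, data))

println(total, " ", clean)
```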

## Rationals 
* represent exact ratios of integers and are constructed with `//`
* if the numerator and denominator have a common factor, the number is reduced to its simplest form
* **Functions**
    * return the numerator and the denominator: `num(), den()`
    * convert a rational to a float: `float()`
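
A sketch of rational arithmetic, assuming Julia 1.x, where `num()` and `den()` are called `numerator()` and `denominator()`:

```julia
r = 6 // 8                       # reduced automatically to lowest terms

top    = numerator(r)            # num(r) in Julia 0.4
bottom = denominator(r)          # den(r) in Julia 0.4

s = 1 // 3 + 1 // 6              # exact arithmetic on fractions
f = float(r)                     # conversion to a floating-point number

# Floating point cannot represent 1/10 exactly; rationals can.
exact_sum   = (1 // 10 + 2 // 10 == 3 // 10)
inexact_sum = (0.1 + 0.2 == 0.3)

println(r, " ", s, " ", f)
```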



# More about Matrices 
I'm going to look at multi-dimensional arrays and sparse arrays, but first of all at vectorization and devectorization.

## Vectorized and Devectorized Code
The general ideas are as follows:
* Vectorization: uses operations on whole arrays to perform the calculation
* Devectorization: loops **through the arrays** and performs a series of operations on scalars.

In the current Julia 0.4.5, devectorized code still performs better than vectorized code, though the gap seems to have narrowed compared with previous versions.

Let's look at the following code
```julia 
function vecprod1(a,b,c,N) 
    for i=1:N 
        c=a.*b   # allocates a new array on every pass; the caller's array is not updated
    end
end 

function vecprod2(a,b,c,N) 
    for i=1:N, j=1:length(c)
        c[j]=a[j]*b[j]
    end 
end 
```
* `vecprod1` is vectorized code and `vecprod2` is devectorized. To me, the former actually looks neater and more readable.

* Let's see the huge difference in performance (timings in seconds)

```julia 
A=rand(2); B=rand(2); C=zeros(2); 
@elapsed vecprod1(A,B,C,100000000) # 23.75
@elapsed vecprod2(A,B,C,100000000) # 1.06

#increase the size
A=rand(32); B=rand(32); C=zeros(32); 
@elapsed vecprod1(A,B,C,100000000) # 60.03
@elapsed vecprod2(A,B,C,100000000) # 24.12
```
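
Much of the gap comes from the vectorized version allocating a fresh array on every pass. In Julia 1.x (an assumption; this is not available in 0.4.5) the dotted assignment `.=` writes the result into an existing array, which keeps the vectorized notation without the allocation. A sketch, with function names of my own choosing and no timings quoted because they depend on the machine:

```julia
# Vectorized: allocates a new array for the result on every call.
vecprod_alloc(a, b) = a .* b

# Devectorized: explicit loop writing into a preallocated array.
function vecprod_loop!(c, a, b)
    for j in eachindex(c, a, b)
        c[j] = a[j] * b[j]
    end
    return c
end

# In-place broadcast: vectorized notation, result written into c.
function vecprod_bcast!(c, a, b)
    c .= a .* b
    return c
end

A = rand(32); B = rand(32)
C1 = zeros(32); C2 = zeros(32)
vecprod_loop!(C1, A, B)
vecprod_bcast!(C2, A, B)

# Time 10,000 calls of each in-place version (results vary by machine).
t_loop  = @elapsed for _ in 1:10_000; vecprod_loop!(C1, A, B); end
t_bcast = @elapsed for _ in 1:10_000; vecprod_bcast!(C2, A, B); end
println("loop: ", t_loop, " s, broadcast: ", t_bcast, " s")
```
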
## Multi-dimensional Arrays
**To be continued**


## Sparse Matrices 
* For a complete discussion on arrays and matrices, have a look at the Julia docs at: http://docs.julialang.org/en/release-0.4/manual/arrays/
* Sherrington only provides a brief discussion of sparse matrices, which I think are very important for econometric applications, so this part relies mostly on the Julia documentation.

* Dense vs. Sparse matrices 
    * Ordinary (dense) matrices, where every entry occupies one cell `cell[i,j]`, are inefficient if the values are mostly zeros.
    * It is better to store only the nonzero entries as tuples `(i,j,v)`, where `v` is the value at row `i` and column `j`. Matrices stored this way are called sparse matrices and can be created as follows.

### Construct sparse matrices 
Let's look at how to create a sparse matrix.
* **From dense matrices**: many functions that create dense matrices have sparse counterparts, obtained by adding an `sp` prefix.

```julia 
spzeros(3,10)
3x10 sparse matrix with 0 Float64 entries:

zeros(3,10)
3x10 Array{Float64,2}:
 0.0  0.0  ..  0.0
 ...
 0.0  0.0 ..   0.0
 
speye(3,3)
3x3 sparse matrix with 3 Float64 entries:
	[1, 1]  =  1.0
	[2, 2]  =  1.0
	[3, 3]  =  1.0
```

* **By `sparse()` function**: 
    * The function takes as its input a vector `I` of row indices, a vector `J` of column indices, and a vector `V` of nonzero values. 
    * ```sparse(I,J,V)``` constructs a sparse matrix such that `S[I[k],J[k]]=V[k]`.    
*Example*: create an identity matrix identical to the one generated by `speye(3,3)`
```julia 
I=[1,2,3]; J=[1,2,3];
V=[1,1,1]
M=sparse(I,J,V)
3x3 sparse matrix with 3 Int64 entries:
	[1, 1]  =  1
	[2, 2]  =  1
	[3, 3]  =  1
```
* To **retrieve the inputs** used to generate the sparse matrix, use `findn()` to recover the `I` and `J` vectors, or `findnz()` to recover `I`, `J` and `V`.
*Example*: 
```julia
findnz(M)
([1,2,3],[1,2,3],[1,1,1])
```

* **Convert a dense matrix into a sparse matrix**
    * The `sparse()` function can convert a dense matrix into a sparse one
    
```julia 
sparse(eye(5))
   5x5 sparse matrix with 5 Float64 entries:
   [1, 1]  =  1.0
    ...
   [5, 5]  =  1.0

M1=[0 2 0; 0 0 1; 0 0 3]
M2=sparse(M1)
    3x3 sparse matrix with 3 Int64 entries:
	[1, 2]  =  2
	[2, 3]  =  1
	[3, 3]  =  3
```
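
In Julia 1.x the sparse functions moved to the `SparseArrays` standard library and `speye` was removed, so the constructions above look slightly different there. A sketch under that assumption (the names `rows`, `cols`, `vals` and `Id` are mine):

```julia
using SparseArrays, LinearAlgebra   # standard libraries in Julia 1.x

rows = [1, 2, 3]; cols = [1, 2, 3]; vals = [1.0, 1.0, 1.0]
M = sparse(rows, cols, vals)        # 3x3 sparse identity from (i, j, v) triplets

Z  = spzeros(3, 10)                 # sparse matrix with no stored entries
Id = sparse(1.0I, 3, 3)             # replacement for speye(3, 3)

M1 = [0 2 0; 0 0 1; 0 0 3]
M2 = sparse(M1)                     # dense -> sparse
(i, j, v) = findnz(M2)              # recover row indices, column indices, values
back = Matrix(M2)                   # sparse -> dense

println(nnz(M2), " stored entries out of ", length(M1))
```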
    
## Data Arrays and Data Frames 
* The package `DataFrames` in Julia works similarly to data frames in R and the `pandas` package in Python. To install, use the familiar command 
```julia 
Pkg.add("DataFrames")
```
* The package introduces three basic types to Julia's base 
    - `NA` for missing value
    - `DataArray` extends the `Array` type to contain missing values 
    - `DataFrame` for the representation of tabular datasets 
* Working on DataFrames will be considered separately in the section of statistical computing. 

## Sets 
**To be continued**

# Types and Dispatch 
## Functions 
### First-class objects 
* What can you do with functions?
    * assign them to other identifiers
    * pass them as arguments to other functions
    * return them as values from other functions
    * store them in collections
    * apply them (with `map`) to a set of values at runtime
* A function's argument list, written with the `()` notation, is a tuple of dummy variables.
    * Arguments are of type `Any` by default. Specifying types restricts the values a method accepts and determines which method is dispatched; the compiler specializes the generated code on the concrete argument types either way.
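
Each item in the list above can be shown in one line. A minimal sketch assuming Julia 1.x; all the function names here (`square`, `apply_twice`, `make_adder`) are hypothetical and exist only for the illustration.

```julia
square(x) = x * x

f = square                               # assign to another identifier
apply_twice(g, x) = g(g(x))              # pass a function as an argument
make_adder(k) = x -> x + k               # return a function as a value
ops = Dict("sq" => square, "abs" => abs) # store functions in a collection
squared = map(square, [1, 2, 3])         # apply to a set of values

add2 = make_adder(2)
println(f(3), " ", apply_twice(square, 3), " ", add2(5), " ", squared)
```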
    
### Examples - Some special syntax for functions 
* `->` is a special syntax to set up an anonymous function. It looks very similar to the mathematical expression of a function, e.g., `f: Rn -> Rm`
* **Example 1**: a squaring function
```julia
map(x->x*x,([1 2 3]))
1x3 Array{Int64,2}:
 1  4  9
```
* **Example 2**: Hailstone sequence

```julia
function hailstone(n)
    k=1                            # set the counter (num. of elements)
    a=[n]                          #creating an array with the single entry [n]
    while n>1                      # while-end loops until the val of n reaches 1
        n=(n % 2 ==0) ? n>>1: 3n+1 # halve n if it is even, otherwise 3n+1 (explained below)
        push!(a,n)                 # each new value is added into the array & so increasing its length
        k+=1
    end
    return(k,a)
end
hailstone (generic function with 1 method)

hailstone(127)
(d,s)=hailstone(1000) # d is the num. of elements in the sequence
(112,[1000,500,250,125,376,188,94,47,142,71  …  80,40,20,10,5,16,8,4,2,1])
```

#### **More about syntax**
* Let's look at the statement `n=(n % 2 ==0) ? n>>1: 3n+1`, which encapsulates the logic of the Hailstone sequence.
    * `(condition) ? statement-1: statement-2` is shorthand for an `if else end` block
    * `n>>1` is a bitshift to the right and, in this case, equivalent to `div(n,2)`. Both halve `n` when `n` is even.
* Julia evaluates logical expressions from left to right, so `||` behaves like the `orelse` operator of other languages and `&&` like `andthen`.
    * This gives **short-circuit evaluation**, which is used in another couple of popular constructs:
```julia
    (condition) || (statement) # if the condition is true, return true; otherwise evaluate the statement
    (condition) && (statement) # if the condition is true, evaluate the statement; otherwise return false
```      
   * The constructs return a value: `||` returns `true` if the condition is met, and `&&` returns `false` if it is not; otherwise each returns the value of the statement.
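
The typical use of these constructs is as a one-line guard inside a function. A sketch assuming Julia 1.x; the functions `safe_sqrt`, `describe` and `half_or_triple` are hypothetical names used only here.

```julia
function safe_sqrt(x)
    x >= 0 || return nothing      # right-hand side runs only when x < 0
    return sqrt(x)
end

function describe(n)
    n % 2 == 0 && return "even"   # right-hand side runs only when n is even
    return "odd"
end

# Ternary operator as shorthand for if/else (one Hailstone step).
half_or_triple(n) = (n % 2 == 0) ? n >> 1 : 3n + 1

# The value of the whole expression is the last operand that was evaluated.
v1 = true || error("never evaluated")
v2 = false && error("never evaluated")
v3 = (2 > 1) && "yes"

println(safe_sqrt(4.0), " ", describe(7), " ", half_or_triple(7), " ", v3)
```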

### Passing arguments 
* Most functions take a set of arguments, and we can designate an argument as optional (rather than mandatory) by providing a default value.

#### Default and Optional arguments
* If the type of an argument is not specified, `Any` is assumed. Operating on an `Any` argument, however, can raise an exception, as illustrated below.
    * When basic operators are applied to numbers of different types (not specified in advance), one type can be promoted to another. E.g., when a real number is multiplied by a complex number, the real number is promoted to a complex one and the result is complex.
* As for exceptions, an example is the multiplication of two arrays.
```julia
sq(x) = x*x   # assumed definition: a squaring function with an untyped argument
a = [1.0,2,3]; println(sq(a))
ERROR: '*' has no method matching *(::Array{Float64,1},
::Array{Float64,1})
in sq at none:1
```
* If we want some (or all) of a function's arguments to take default values, this can be done with the `arg = value` syntax.
```julia 
func(x, p=0.0)= exp(p*x)*sin(x)
t=linspace(0.0, 20*pi)
w=zeros(length(t))
for i=1:length(w)
    w[i]=func(t[i], 0.1)  # p=0.1 is the given value, otherwise p takes the default of 0.0
end
using PyPlot
plot(t,w)
methods(func)    # list the methods available for func
Out[29]: 2 methods for generic function func:
• func(x) at In[28]:1
• func(x, p) at In[28]:1
```
*(Figure: plot of `w` against `t` produced by the code above.)*

* **Remark**: 
    * In writing a function, optional arguments (those with default values) must come after the mandatory ones.
    * If there are several optional parameters, values for all of the preceding ones must be given in order to specify one that comes later in the list. See the example below:
```julia 
y(x1, x2, x3, a=2.2, b=sqrt(a), c=sqrt(b*a), d=mean([a b c])) = a*x1 + b*x2 + c*x3 + d 
y(1,1,1)                   # all parameters (a,b,c,d) take default values 
              # Out[48]: 7.3195367113201835
y(0,2.2, 2, 1.1)           # set a=1.1 instead of the default of 2.2
              # Out[49]: 5.5298812455335815
y(0,2.2, 2, 1.1, 3, 2, 1)  # ignore the default definitions of (b,c,d) and reset their values, but before setting d, values of b,c must be specified
              # Out[50]: 11.600000000000001
```
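
A smaller sketch of the same rules, assuming Julia 1.x; `growth` and `shifted` are hypothetical functions made up for the illustration. It also shows that each optional argument adds one method, and that a default may refer to an earlier argument.

```julia
# One mandatory argument followed by two optional ones.
growth(x, p = 0.0, scale = 1.0) = scale * exp(p * x)

a = growth(1.0)              # p and scale take their defaults
b = growth(1.0, 0.5)         # override p only
c = growth(1.0, 0.0, 2.0)    # to set scale, p must be given as well

# Each optional argument generates an extra method.
n_methods = length(methods(growth))

# A default value may depend on an earlier argument.
shifted(u, v = 2u) = u + v

println(a, " ", b, " ", c, " ", n_methods, " ", shifted(1))
```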

The help entry for `mean` (obtained with `?mean`) reads:

```
mean(v[, region])
```

Compute the mean of whole array `v`, or optionally along the dimensions in `region`. Note: Julia does not ignore `NaN` values in the computation. For applications requiring the handling of missing data, the `DataArray` package is recommended.

For example, `mean([1 2 4])` returns `2.3333333333333335`.