# Julia Lesson
By Andrew Ma and Luke Miller

### What is Julia?
Julia is a high-level, high-performance dynamic programming language whose syntax looks like Ruby or Python crossed with MATLAB. It is meant to bridge the gap between mathematics and programming while also being very efficient at crunching numbers. Most of Julia's base library is written in Julia (woo metaprogramming).

In [1]:
# variable
x = 10
println(x)

# super hard math
y = x + 1
println(y)

# reassigning a variable
x = x + 1
println(x)

# unicode names
δ = 0.00001
println(δ)

안녕하세요 = "Hello"
println(안녕하세요)

10
11
11
1.0e-5
Hello


Stylistic Conventions:
- Names of variables are in lower case.
- Word separation can be indicated by underscores ('_'), but use of underscores is discouraged unless the name would be hard to read otherwise.
- Names of Types and Modules begin with a capital letter and word separation is shown with upper camel case instead of underscores.
- Names of functions and macros are in lower case, without underscores.
- Functions that write to their arguments have names that end in !. These are sometimes called “mutating” or “in-place” functions because they are intended to produce changes in their arguments after the function is called, not just return a value.
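
As a small illustration of the `!` convention, here is a sketch comparing `sort` (returns a new array) with `sort!` (changes its argument). The helper `doubleall!` is a made-up name for this example, not something from Julia's library.

```julia
# sort returns a new array; sort! (note the !) reorders its argument in place.
v = [3, 1, 2]
w = sort(v)
println(v)   # v is unchanged by sort
sort!(v)
println(v)   # v itself is now sorted

# Hypothetical in-place helper that follows the same naming convention.
function doubleall!(values)
    for i in eachindex(values)
        values[i] *= 2
    end
    return values
end

doubleall!(v)
println(v)   # every element of v has been doubled
```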

[Numeric Types](http://docs.julialang.org/en/stable/manual/integers-and-floating-point-numbers/)

In [2]:
# Overflow example
x = typemax(Int64)
println(x)
println(x+1)

9223372036854775807
-9223372036854775808
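
The wraparound above happens because `Int64` arithmetic is modular. A minimal sketch of two ways to avoid it, assuming you are happy to pay for either arbitrary precision (`big`) or an explicit overflow check (`Base.checked_add`):

```julia
x = typemax(Int64)
println(x + 1 == typemin(Int64))   # true: the sum wraps around

# Arbitrary-precision integers do not overflow.
println(big(x) + 1)

# Base.checked_add throws an OverflowError instead of wrapping.
overflowed = try
    Base.checked_add(x, 1)
    false
catch err
    err isa OverflowError
end
println(overflowed)
```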


In [3]:
# Coefficients
x = 3
println(2x^2 - 3x + 1)
println(1.5x^2 - .5x + 1)
println(2^2x)

10
13.0
64
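
The last line of that cell is easy to misread: a numeric coefficient binds tighter than `^`, so `2^2x` means `2^(2x)`, not `(2^2)x`. A quick check:

```julia
x = 3
println(2^2x == 2^(2x))      # true: the coefficient 2x is the exponent
println((2^2)x)              # 12: parentheses force (2^2) * x
println(2x^2 == 2 * (x^2))   # true: the exponent applies to x only
```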


In [4]:
# zero and one functions (additive and multiplicative identity for a value's type)

println(zero(1.0))
println(one(0))

0.0
1


[Mathematical Operations](http://docs.julialang.org/en/stable/manual/mathematical-operations/)

In [5]:
# Char
a = 'a'
println(a)

# String (named str so that Base's string function is not shadowed)
str = "I'm a string"
println(str)

a
I'm a string


In [6]:
# Functions
function e(x,y)
    x+y
end

function e2(x,y)
    x+y, x-y
end

f(x,y) = x + y
g = f 

∑(x,y) = x + y

println(e(1,2))
println(e2(1,2))
println(f(1,2))
println(g(1,2))
println(∑(1,2))

3
(3,-1)
3
3
3
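
Two things from that cell are worth a closer look: a function that returns `x+y, x-y` really returns a tuple, and functions are ordinary values that can be passed around. A short sketch, where `sumdiff` and `apply` are names invented for this example:

```julia
# Same body as e2 above, under a hypothetical name.
sumdiff(x, y) = (x + y, x - y)

# A returned tuple can be unpacked straight into variables.
s, d = sumdiff(1, 2)
println(s)
println(d)

# Functions can be passed as arguments like any other value.
apply(op, a, b) = op(a, b)
println(apply(+, 1, 2))
println(apply(sumdiff, 1, 2))
```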


In [7]:
# Functions continued
println(+(1, 2, 3))
h = +
println(h(1,2,3))

println(map(x -> x^2 + 2x - 1, [1,3,-1]))

bar(a,b,x...) = (a,b,x)
println(bar(1,2,3,4,5,6))

function optionalArg(x,y,z=0)
    x+y+z
end

println(optionalArg(1,2))
println(optionalArg(1,2,3))

6
6
[2,14,-2]
(1,2,(3,4,5,6))
3
6


In [8]:
# Scope
module A
a = 1 # a global in A's scope
end

module B
# b = a # would error as B's global scope is separate from A's
    module C
    c = 2
    end
b = C.c # can access the namespace of a nested global scope
        # through a qualified access
import A # makes module A available
d = A.a
# A.a = 2 # would error with: "ERROR: cannot assign variables in other modules"
end

B

In [9]:
# Method
k(x::Number, y::Number) = 2x - y;
println(k(1,2))

0


In [10]:
# Things with types and arrays

num = 12
println(typeof(num))
println(convert(UInt8, num))

numArray = Any[1 2 3; 4 5 6]
println(typeof(numArray))
println(numArray)
convert(Array{Float64}, numArray)

# Define your own conversion
import Base.convert
convert(::Type{Bool}, x::Real) = x==0 ? false : x==1 ? true : throw(InexactError())
println(convert(Bool, 1))
println(convert(Bool, 0))

Int64
12
Array{Any,2}
Any[1 2 3; 4 5 6]
true
false
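
`convert` only succeeds when the value can be represented exactly in the target type. A minimal sketch of what happens when it cannot, using only built-in conversions:

```julia
println(convert(Float64, 12))   # an Int converts to a Float64 without loss

# 300 does not fit in a UInt8 (0 to 255), so convert throws an InexactError.
failed = try
    convert(UInt8, 300)
    false
catch err
    err isa InexactError
end
println(failed)

# Converting a whole array converts each element.
floats = convert(Array{Float64}, Any[1 2 3; 4 5 6])
println(eltype(floats))
```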


## Type System and Polymorphism


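Julia is dynamically typed, but with some of the advantages of static typing!
You can add type annotations with `::` that tell the compiler what type a value is expected to have; if the value does not have that type, Julia throws a `TypeError`.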

In [32]:
1+2

3

In [11]:
(1+2)::AbstractFloat

LoadError: TypeError: typeassert: expected AbstractFloat, got Int64

In [12]:
(1+2)::Int

3
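
The same `::` syntax also appears in function signatures, where it restricts which arguments a method accepts. A small sketch, with `half` as a made-up function name:

```julia
# A failed type assertion raises a TypeError, which can be caught.
result = try
    (1 + 2)::AbstractFloat
catch err
    err
end
println(result isa TypeError)

# An annotation on an argument restricts which values the method accepts.
half(x::AbstractFloat) = x / 2
println(half(3.0))
println(applicable(half, 3))   # false: no method of half accepts an Int
```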

Julia has a nice way to call a different method based on the types of the arguments passed in: multiple dispatch.
Julia determines which method to dispatch the call to at run time, using the types of all the arguments.

Example function headers:
- `function collide(me::Circle, other::Rectangle)`
- `function collide(me::Polygon, other::Circle)`
- `function collide(me::Polygon, other::Rectangle)`

Then when you call `collide(me, other)`, it is dispatched to the method whose signature matches the argument types.
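
Here is a runnable sketch of the `collide` idea. The shape types and the returned strings are placeholders invented for illustration, and the block is written in Julia 1.x syntax (`abstract type`, `struct`) rather than the older `type` keyword used in this lesson's cells.

```julia
# Hypothetical shape types, used only to illustrate dispatch.
abstract type Shape end
struct Circle <: Shape end
struct Rectangle <: Shape end
struct Polygon <: Shape end

collide(me::Circle, other::Rectangle) = "circle hits rectangle"
collide(me::Polygon, other::Circle) = "polygon hits circle"
collide(me::Polygon, other::Rectangle) = "polygon hits rectangle"
# Fallback for any pair of shapes without a more specific method.
collide(me::Shape, other::Shape) = "generic collision"

# The element type is the abstract Shape, so the method is picked
# from the run-time types of both arguments.
shapes = Shape[Polygon(), Circle()]
println(collide(shapes[1], shapes[2]))
println(collide(Circle(), Circle()))
```

The first call picks the `(Polygon, Circle)` method; the second falls through to the generic `(Shape, Shape)` fallback because no more specific method exists.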

In [13]:
type Point
    x::Float32
    y::Float32
end

type Vector2D
    x::Float32
    y::Float32
end

type UnitVector2D
    x::Float32
    y::Float32

    # norm has no method for Vector2D, so compute the length directly
    UnitVector2D(v::Vector2D) = (len = sqrt(v.x^2 + v.y^2); new(v.x/len, v.y/len))
end

In [14]:
# Union types
VecOrUnit = Union{Vector2D, UnitVector2D}
dot(u::VecOrUnit, v::VecOrUnit) = u.x*v.x + u.y*v.y

dot (generic function with 1 method)
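
To see the union type in action, here is a self-contained sketch in Julia 1.x syntax. It assumes `hypot` for the vector length and uses the name `dot2d` (invented here) so that it cannot clash with any library `dot`.

```julia
struct Vector2D
    x::Float32
    y::Float32
end

struct UnitVector2D
    x::Float32
    y::Float32
    # Inner constructor: scale an ordinary vector to length 1.
    function UnitVector2D(v::Vector2D)
        len = hypot(v.x, v.y)
        return new(v.x / len, v.y / len)
    end
end

# One method covers every combination of the two types.
const VecOrUnit = Union{Vector2D, UnitVector2D}
dot2d(u::VecOrUnit, v::VecOrUnit) = u.x * v.x + u.y * v.y

v = Vector2D(3, 4)
u = UnitVector2D(v)
println(dot2d(v, u))   # v projected onto its own direction, i.e. roughly its length
println(dot2d(u, u))   # roughly 1 for a unit vector
```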

![Array methods](http://i.gyazo.com/029e6e3380e170f11b597c10e746601a.png)

In [15]:
# Generate random 4x4 array
randomArray = rand(4,4)

4×4 Array{Float64,2}:
 0.277027  0.0456322  0.21335   0.574605 
 0.205414  0.0257954  0.719418  0.284063 
 0.548858  0.891233   0.172288  0.0619572
 0.810177  0.959946   0.841337  0.317503 

In [16]:
# Broadcasting applies a binary operation element by element across arrays
broadcast(+, randomArray, randomArray)

4×4 Array{Float64,2}:
 0.554053  0.0912644  0.426701  1.14921 
 0.410828  0.0515908  1.43884   0.568127
 1.09772   1.78247    0.344576  0.123914
 1.62035   1.91989    1.68267   0.635006
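
Broadcasting also works when the arguments have different shapes: scalars and single rows are stretched to match. A small sketch with a hand-made integer matrix so the results are easy to check:

```julia
A = [1 2; 3 4]

# The dotted operator is shorthand for the broadcast call.
println(broadcast(+, A, A) == A .+ A)

# A scalar is added to every element.
println(broadcast(+, A, 10))

# A 1×2 row is stretched down the rows of A.
row = [100 200]
println(broadcast(+, A, row))
```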

### Why use Julia?
![Julia benchmark times relative to C](http://i.gyazo.com/7e6d4b7b87a2d80f8d48e688026b5e94.png)
Julia is fast! In the figure, the benchmark times are relative to C, where C = 1.0.
You can even call C code directly if you need even more speed.
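
As a minimal sketch of such a direct call, the block below asks the C standard library for the length of a string. It assumes the C function `strlen` is visible in the running process, which is normally the case on Linux and macOS.

```julia
# ccall(function name, return type, tuple of argument types, arguments...)
len = ccall(:strlen, Csize_t, (Cstring,), "hello")
println(len)
```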

In [1]:
# Calling C code
t = ccall( (:clock, "libc"), Int32, ())
println(t)

path = ccall((:getenv, "libc"), Cstring, (Cstring,), "SHELL")
unsafe_string(path)

3665644


"/bin/bash"

Julia is designed for parallelization and does not impose any particular style of parallelization on its users. The following example demonstrates how to count the number of heads in a large number of coin tosses in parallel.

In [18]:
nheads = @parallel (+) for i=1:100000000
  rand(Bool)
end

49997452

In [19]:
@time nheads = @parallel (+) for i=1:100000000
  rand(Bool)
end

  3.035767 seconds (200.02 M allocations: 2.981 GB, 7.63% gc time)


49997575
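
In Julia 1.x the `@parallel` macro shown here is called `@distributed` and lives in the `Distributed` standard library. A sketch of the same coin-toss count under that assumption, with a smaller loop so it finishes quickly (with no worker processes added, it simply runs on the main process):

```julia
using Distributed

n = 1_000_000
# Each iteration contributes 0 or 1; (+) reduces the results into one count.
nheads = @distributed (+) for i in 1:n
    Int(rand(Bool))
end
println(nheads / n)   # fraction of heads, expected to be close to 0.5
```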

### DataFrames

In [2]:
using DataFrames

In [21]:
# DataArray
dv = @data([NA, 3, 2, 5, 4])
println(mean(dv))

println(mean(dropna(dv)))

convert(Array, dropna(dv))

println(dv)

# converting na's
dv = @data([NA, 3, 2, 5, 4])
println(convert(Array, dv, 11))

NA
3.5
[NA,3,2,5,4]
[11,3,2,5,4]
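
Julia 1.x has a built-in `missing` value that plays a role similar to `NA` here. The sketch below redoes the cell above with plain arrays and the `Statistics` standard library; it is a swapped-in equivalent, not the DataArrays API used in this lesson.

```julia
using Statistics

dv = [missing, 3, 2, 5, 4]

# A missing value propagates through the mean.
println(mean(dv))

# skipmissing drops the missing entries first.
println(mean(skipmissing(dv)))

# coalesce replaces each missing entry with a default value.
println(coalesce.(dv, 11))
```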


In [22]:
df = DataFrame(A = 1:10, B = ["M", "F", "F", "M", "F", "M", "F", "F", "M", "M"])

| Row | A | B |
|---|---|---|
| 1 | 1 | M |
| 2 | 2 | F |
| 3 | 3 | F |
| 4 | 4 | M |
| 5 | 5 | F |
| 6 | 6 | M |
| 7 | 7 | F |
| 8 | 8 | F |
| 9 | 9 | M |
| 10 | 10 | M |


In [23]:
println(head(df))
println(tail(df))
println(df[1:3, :])

6×2 DataFrames.DataFrame
│ Row │ A │ B   │
├─────┼───┼─────┤
│ 1   │ 1 │ "M" │
│ 2   │ 2 │ "F" │
│ 3   │ 3 │ "F" │
│ 4   │ 4 │ "M" │
│ 5   │ 5 │ "F" │
│ 6   │ 6 │ "M" │
6×2 DataFrames.DataFrame
│ Row │ A  │ B   │
├─────┼────┼─────┤
│ 1   │ 5  │ "F" │
│ 2   │ 6  │ "M" │
│ 3   │ 7  │ "F" │
│ 4   │ 8  │ "F" │
│ 5   │ 9  │ "M" │
│ 6   │ 10 │ "M" │
3×2 DataFrames.DataFrame
│ Row │ A │ B   │
├─────┼───┼─────┤
│ 1   │ 1 │ "M" │
│ 2   │ 2 │ "F" │
│ 3   │ 3 │ "F" │


In [24]:
describe(df)

A
Min      1.0
1st Qu.  3.25
Median   5.5
Mean     5.5
3rd Qu.  7.75
Max      10.0
NAs      0
NA%      0.0%

B
Length  10
Type    String
NAs     0
NA%     0.0%
Unique  2



In [25]:
println(mean(df[:A]))
println(median(df[:A]))

5.5
5.5


In [26]:
df2 = DataFrame(A = 1:4, B = randn(4))
println(df2)
colwise(cumsum, df2)

4×2 DataFrames.DataFrame
│ Row │ A │ B        │
├─────┼───┼──────────┤
│ 1   │ 1 │ -1.11442 │
│ 2   │ 2 │ -2.34239 │
│ 3   │ 3 │ -0.53598 │
│ 4   │ 4 │ 1.0139   │


2-element Array{Any,1}:
 DataArrays.DataArray{Int64,1}[[1,3,6,10]]                             
 DataArrays.DataArray{Float64,1}[[-1.11442,-3.45681,-3.99279,-2.97889]]

### Example

In [3]:
dataframe = readtable("train.csv")
head(dataframe)

| Row | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S |
| 2 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 3 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 |  | S |
| 4 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 5 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S |
| 6 | 6 | 0 | 3 | Moran, Mr. James | male |  | 0 | 0 | 330877 | 8.4583 |  | Q |


![alt-img](https://i.gyazo.com/52718ce716290c7bc07e26f2282ecc86.png)

In [4]:
dataframe[:familysize] = dataframe[:SibSp] + dataframe[:Parch]
head(dataframe)

| Row | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | familysize |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S | 1 |
| 2 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C | 1 |
| 3 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 |  | S | 0 |
| 4 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S | 1 |
| 5 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S | 0 |
| 6 | 6 | 0 | 3 | Moran, Mr. James | male |  | 0 | 0 | 330877 | 8.4583 |  | Q | 0 |


In [5]:
dataframe[:Age] = convert(Array, dataframe[:Age], mean(dropna(dataframe[:Age])))
head(dataframe)

| Row | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | familysize |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S | 1 |
| 2 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C | 1 |
| 3 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 |  | S | 0 |
| 4 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S | 1 |
| 5 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S | 0 |
| 6 | 6 | 0 | 3 | Moran, Mr. James | male | 29.69911764705882 | 0 | 0 | 330877 | 8.4583 |  | Q | 0 |


In [6]:
head(dataframe[dataframe[:Sex] .== "male", :])

| Row | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | familysize |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S | 1 |
| 2 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S | 0 |
| 3 | 6 | 0 | 3 | Moran, Mr. James | male | 29.69911764705882 | 0 | 0 | 330877 | 8.4583 |  | Q | 0 |
| 4 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S | 0 |
| 5 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.075 |  | S | 4 |
| 6 | 13 | 0 | 3 | Saundercock, Mr. William Henry | male | 20.0 | 0 | 0 | A/5. 2151 | 8.05 |  | S | 0 |
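
The row filter above works by building a boolean mask with `.==` and indexing with it. The same mechanism can be sketched on plain arrays; the two small arrays below are hand-copied from a few of the rows shown earlier, purely for illustration.

```julia
sex = ["male", "female", "female", "male"]
age = [22.0, 38.0, 26.0, 35.0]

# Element-wise comparison gives one true/false entry per row.
mask = sex .== "male"
println(mask)

# Indexing with the mask keeps only the entries where it is true.
println(age[mask])
```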


### Accessing classic public datasets

In [7]:
using RDatasets
iris = dataset("datasets", "iris")
head(iris)

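| Row | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |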
