# <img src="https://github.com/JuliaLang/julia-logo-graphics/raw/master/images/julia-logo-color.png" height="100" /> _for Pythonistas_

> TL;DR: _Julia looks and feels a lot like Python, only much faster. It's dynamic, expressive, extensible, with batteries included, in particular for Data Science_.

This notebook is an **introduction to Julia for Python programmers**.

It will go through the most important Python features (such as functions, basic types, list comprehensions, exceptions, generators, modules, packages, and so on) and show you how to code them in Julia.

# Getting Started with Julia in Colab/Jupyter
You can run this notebook either in Google Colab or in Jupyter on your own machine.

## Running on Google Colab
1. Work on a copy of this notebook: _File_ > _Save a copy in Drive_ (you will need a Google account). Alternatively, you can download the notebook using _File_ > _Download .ipynb_, then upload it to [Colab](https://colab.research.google.com/).
2. Execute the following cell (click on it and press Ctrl+Enter) to install Julia, IJulia (the Jupyter kernel for Julia) and other packages. You can update `JULIA_VERSION` and the other parameters, if you know what you're doing. Installation takes 2-3 minutes.
3. Reload this page (press Ctrl+R, or ⌘+R, or the F5 key) and continue to the _Checking the Installation_ section.

* _Note_: If your Colab Runtime gets reset (e.g., due to inactivity), repeat steps 2 and 3.

In [None]:
%%shell
set -e

#---------------------------------------------------#
JULIA_VERSION="1.7.1" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools PyCall PyPlot"
JULIA_PACKAGES_IF_GPU="CUDA"
JULIA_NUM_THREADS=4
#---------------------------------------------------#

if [ -n "$COLAB_GPU" ] && [ -z "$(which julia)" ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  wget -nv $URL -O /tmp/julia.tar.gz # -nv means "not verbose"
  tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
  rm /tmp/julia.tar.gz

  # Install Packages
  if [ "$COLAB_GPU" = "1" ]; then
      JULIA_PACKAGES="$JULIA_PACKAGES $JULIA_PACKAGES_IF_GPU"
  fi
  for PKG in `echo $JULIA_PACKAGES`; do
    echo "Installing Julia package $PKG..."
    julia -e 'using Pkg; pkg"add '$PKG'; precompile;"' &> /dev/null
  done

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  julia -e 'using IJulia; IJulia.installkernel("julia", env=Dict(
      "JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'
  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  mv -f $KERNEL_NAME "$KERNEL_DIR"/julia  

  echo ''
  echo "Successfully installed `julia -v`!"
  echo "Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then"
  echo "jump to the 'Checking the Installation' section."
fi

Installing Julia 1.7.1 on the current Colab Runtime...
2022-02-17 17:06:20 URL:https://storage.googleapis.com/julialang2/bin/linux/x64/1.7/julia-1.7.1-linux-x86_64.tar.gz [123374573/123374573] -> "/tmp/julia.tar.gz" [1]
Installing Julia package IJulia...
Installing Julia package BenchmarkTools...
Installing Julia package PyCall...
Installing Julia package PyPlot...
Installing IJulia kernel...
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mInstalling julia kernelspec in /root/.local/share/jupyter/kernels/julia-1.7

Successfully installed julia version 1.7.1!
Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then
jump to the 'Checking the Installation' section.




## Running This Notebook Locally
If you prefer to run this notebook on your machine instead of Google Colab:

* Download this notebook (File > Download .ipynb)
* Install [Julia](https://julialang.org/downloads/)
* Run the following command in a terminal to install `IJulia` (the Jupyter kernel for Julia), and a few packages we will use:
```bash
julia -e 'using Pkg
            pkg"add IJulia; precompile;"
            pkg"add BenchmarkTools; precompile;"
            pkg"add PyCall; precompile;"
            pkg"add PyPlot; precompile;"'
```

* Next, go to the directory containing this notebook:

    ```bash
cd /path/to/notebook/directory
```

* Start Jupyter Notebook:

    ```bash
julia -e 'using IJulia; IJulia.notebook()'
```

    Or replace `notebook()` with `jupyterlab()` if you prefer JupyterLab.

    If you do not already have [Jupyter](https://jupyter.org/install) installed, IJulia will offer to install it. If you agree, it will automatically install a private Miniconda (just for Julia), and install Jupyter and Python inside it.

* Lastly, open this notebook and skip directly to the next section.

## Checking the Installation
The `versioninfo()` function should print your Julia version and some other information about the system. Always include this output when you ask for help or file an issue about Julia.

In [None]:
versioninfo()

Julia Version 1.7.1
Commit ac5cc99908 (2021-12-22 19:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, broadwell)
Environment:
  JULIA_NUM_THREADS = 4
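
The same information can also be queried programmatically. The sketch below uses only `VERSION` and `Threads.nthreads()` from Base; the thread count reflects `JULIA_NUM_THREADS` only if the kernel was started with that variable set, which is an assumption about your setup.

```julia
# Query the running Julia session (Base only, no packages needed).
println("Julia version: ", VERSION)              # VERSION is a VersionNumber
println("Threads available: ", Threads.nthreads())

# Version numbers compare semantically, so this guards against old releases.
is_supported = VERSION >= v"0.7.0"
println("Meets the minimum version used by the install cell: ", is_supported)
```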


In [None]:
?versioninfo

search: versioninfo



```
versioninfo(io::IO=stdout; verbose::Bool=false)
```

Print information about the version of Julia in use. The output is controlled with boolean keyword arguments:

  * `verbose`: print all additional information

See also: [`VERSION`](@ref).


In [None]:
using Pkg
Pkg.add("Distributions")
using Distributions

[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m   Installed[22m[39m IrrationalConstants ─ v0.1.1
[32m[1m   Installed[22m[39m Rmath ─────────────── v0.7.0
[32m[1m   Installed[22m[39m PDMats ────────────── v0.11.5
[32m[1m   Installed[22m[39m DataAPI ───────────── v1.9.0
[32m[1m   Installed[22m[39m StatsFuns ─────────── v0.9.15
[32m[1m   Installed[22m[39m OrderedCollections ── v1.4.1
[32m[1m   Installed[22m[39m SpecialFunctions ──── v2.1.2
[32m[1m   Installed[22m[39m ChainRulesCore ────── v1.12.0
[32m[1m   Installed[22m[39m QuadGK ────────────── v2.4.2
[32m[1m   Installed[22m[39m LogExpFunctions ───── v0.3.6
[32m[1m   Installed[22m[39m Rmath_jll ─────────── v0.3.0+0
[32m[1m   Installed[22m[39m OpenSpecFun_jll ───── v0.5.5+0
[32m[1m   Installed[22m[39m Compat ────────────── v3.41.0
[32m[1m   Installed[22m[39m StatsAPI ──────────── v1.2.1
[32m[

In [None]:
using Pkg
Pkg.add("Graphs")
using Graphs

[32m[1m   Resolving[22m[39m package versions...
[32m[1m   Installed[22m[39m ArnoldiMethod ─ v0.2.0
[32m[1m   Installed[22m[39m SimpleTraits ── v0.9.4
[32m[1m   Installed[22m[39m Inflate ─────── v0.1.2
[32m[1m   Installed[22m[39m StaticArrays ── v1.3.5
[32m[1m   Installed[22m[39m Graphs ──────── v1.6.0
[32m[1m    Updating[22m[39m `~/.julia/environments/v1.7/Project.toml`
 [90m [86223c79] [39m[92m+ Graphs v1.6.0[39m
[32m[1m    Updating[22m[39m `~/.julia/environments/v1.7/Manifest.toml`
 [90m [ec485272] [39m[92m+ ArnoldiMethod v0.2.0[39m
 [90m [86223c79] [39m[92m+ Graphs v1.6.0[39m
 [90m [d25df0c9] [39m[92m+ Inflate v0.1.2[39m
 [90m [699a6c99] [39m[92m+ SimpleTraits v0.9.4[39m
 [90m [90137ffa] [39m[92m+ StaticArrays v1.3.5[39m
[32m[1mPrecompiling[22m[39m project...
[32m  ✓ [39m[90mInflate[39m
[32m  ✓ [39m[90mSimpleTraits[39m
[32m  ✓ [39m[90mStaticArrays[39m
[32m  ✓ [39m[90mArnoldiMethod[39m
[32m  ✓ [39mGraphs
  

In [None]:
Base.Dict{Symbol,V}(a::NamedTuple) where V = 
  Dict{Symbol,V}(n=>v for (n,v) in zip(keys(a), values(a)))
Base.convert(::Type{Dict{Symbol,V}}, a::NamedTuple) where V =
    Dict{Symbol,V}(a)
Base.isequal(a::Dict{Symbol,<:Any}, nt::NamedTuple) =
  length(a) == length(nt) &&
  all(a[n] == v for (n,v) in zip(keys(nt), values(nt)))
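
To see what these definitions do, here is a minimal, self-contained sketch of the same conversion written without extending `Base`. The variable names (`nt`, `d1`, `d2`) are illustrative only.

```julia
# A named tuple is a compact way to write an assignment of values to names.
nt = (x=1, y=2, z=1)

# Same construction as in the Dict{Symbol,V} method: zip names with values.
d1 = Dict{Symbol,Int}(n => v for (n, v) in zip(keys(nt), values(nt)))

# Base already offers an equivalent route through `pairs`.
d2 = Dict(pairs(nt))

println(d1 == d2)   # compares the two dictionaries entry by entry
println(d1[:y])     # look up a value by variable name
```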

In [None]:
struct Variable
  name::Symbol
  r::Int # number of possible values 
end

const Assignment = Dict{Symbol,Int}
const FactorTable = Dict{Assignment,Float64}
struct Factor
  vars::Vector{Variable}
  table::FactorTable
end 

variablenames(φ::Factor) = [var.name for var in φ.vars]

select(a::Assignment, varnames::Vector{Symbol}) =
  Assignment(n=>a[n] for n in varnames)

function assignments(vars::AbstractVector{Variable})
  names = [var.name for var in vars]
  # `product` lives in Base.Iterators and is not exported, so qualify it.
  return vec([Assignment(n=>v for (n,v) in zip(names, vals))
              for vals in Iterators.product((1:v.r for v in vars)...)])
end


function normalize!(φ::Factor)
  z = sum(p for (a,p) in φ.table)
  for (a,p) in φ.table
    φ.table[a] = p/z
  end
  return φ
end

normalize! (generic function with 1 method)
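
The enumeration in `assignments` relies on a Cartesian product over each variable's value range. The sketch below shows that idea on its own with plain dictionaries; `names`, `sizes`, and the unnormalized `weights` are made-up illustration values, and `Iterators.product` is called with its module prefix because `product` is not exported from `Base`.

```julia
names = [:x, :y]
sizes = [2, 3]            # x takes values 1:2, y takes values 1:3

# One dictionary per combination of values.
all_assignments = vec([Dict(zip(names, vals))
                       for vals in Iterators.product((1:r for r in sizes)...)])
println(length(all_assignments))

# Normalizing a table: divide every entry by the total so the values sum to 1.
weights = Dict(a => Float64(a[:x] + a[:y]) for a in all_assignments)
z = sum(values(weights))
normalized = Dict(a => w / z for (a, w) in weights)
println(sum(values(normalized)))
```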

In [None]:
struct BayesianNetwork 
  vars::Vector{Variable}
  factors::Vector{Factor}
  graph::SimpleDiGraph{Int64}
end

In [None]:
# Algorithm 2.3
function probability(bn::BayesianNetwork, assignment)
  subassignment(φ) = select(assignment, variablenames(φ))
  probability(φ) = get(φ.table, subassignment(φ), 0.0)
  return prod(probability(φ) for φ in bn.factors)
end

probability (generic function with 1 method)

In [None]:
# Example 2.3
X = Variable(:x, 2)
Y = Variable(:y, 2)
Z = Variable(:z, 2)
φ = Factor([X, Y, Z], FactorTable(
  (x=1, y=1, z=1) => 0.08, (x=1, y=1, z=2) => 0.31,
  (x=1, y=2, z=1) => 0.09, (x=1, y=2, z=2) => 0.37,
  (x=2, y=1, z=1) => 0.01, (x=2, y=1, z=2) => 0.05,
  (x=2, y=2, z=1) => 0.02, (x=2, y=2, z=2) => 0.07,
)) 

Factor(Variable[Variable(:x, 2), Variable(:y, 2), Variable(:z, 2)], Dict(Dict(:y => 2, :z => 2, :x => 1) => 0.37, Dict(:y => 1, :z => 2, :x => 1) => 0.31, Dict(:y => 2, :z => 1, :x => 2) => 0.02, Dict(:y => 1, :z => 1, :x => 1) => 0.08, Dict(:y => 1, :z => 2, :x => 2) => 0.05, Dict(:y => 2, :z => 2, :x => 2) => 0.07, Dict(:y => 2, :z => 1, :x => 1) => 0.09, Dict(:y => 1, :z => 1, :x => 2) => 0.01))

In [None]:
# Example 2.5
B = Variable(:b, 2); S = Variable(:s, 2)
E = Variable(:e, 2)
D = Variable(:d, 2); C = Variable(:c, 2)
vars = [B, S, E, D, C]
factors = [
  Factor([B], FactorTable((b=1,) => 0.99, (b=2,) => 0.01)),
  Factor([S], FactorTable((s=1,) => 0.98, (s=2,) => 0.02)),
  Factor([E,B,S], FactorTable(
    (e=1,b=1,s=1) => 0.90, (e=1,b=1,s=2) => 0.04,
    (e=1,b=2,s=1) => 0.05, (e=1,b=2,s=2) => 0.01,
    (e=2,b=1,s=1) => 0.10, (e=2,b=1,s=2) => 0.96, 
    (e=2,b=2,s=1) => 0.95, (e=2,b=2,s=2) => 0.99)),
  Factor([D, E], FactorTable(
    (d=1,e=1) => 0.96, (d=1,e=2) => 0.03,
    (d=2,e=1) => 0.04, (d=2,e=2) => 0.97)),
  Factor([C, E], FactorTable(
    (c=1,e=1) => 0.98, (c=1,e=2) => 0.01, (c=2,e=1) => 0.02, (c=2,e=2) => 0.99))
]
graph = SimpleDiGraph(5)
add_edge!(graph, 1, 3); add_edge!(graph, 2, 3)
add_edge!(graph, 3, 4); add_edge!(graph, 3, 5) 
bn = BayesianNetwork(vars, factors, graph)

BayesianNetwork(Variable[Variable(:b, 2), Variable(:s, 2), Variable(:e, 2), Variable(:d, 2), Variable(:c, 2)], Factor[Factor(Variable[Variable(:b, 2)], Dict(Dict(:b => 1) => 0.99, Dict(:b => 2) => 0.01)), Factor(Variable[Variable(:s, 2)], Dict(Dict(:s => 2) => 0.02, Dict(:s => 1) => 0.98)), Factor(Variable[Variable(:e, 2), Variable(:b, 2), Variable(:s, 2)], Dict(Dict(:b => 1, :s => 1, :e => 1) => 0.9, Dict(:b => 1, :s => 1, :e => 2) => 0.1, Dict(:b => 2, :s => 2, :e => 2) => 0.99, Dict(:b => 1, :s => 2, :e => 2) => 0.96, Dict(:b => 2, :s => 2, :e => 1) => 0.01, Dict(:b => 1, :s => 2, :e => 1) => 0.04, Dict(:b => 2, :s => 1, :e => 1) => 0.05, Dict(:b => 2, :s => 1, :e => 2) => 0.95)), Factor(Variable[Variable(:d, 2), Variable(:e, 2)], Dict(Dict(:d => 2, :e => 1) => 0.04, Dict(:d => 2, :e => 2) => 0.97, Dict(:d => 1, :e => 2) => 0.03, Dict(:d => 1, :e => 1) => 0.96)), Factor(Variable[Variable(:c, 2), Variable(:e, 2)], Dict(Dict(:e => 2, :c => 1) => 0.01, Dict(:e => 1, :c => 1) => 0.98, D
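
As a hand check of the chain rule behind `probability`, the sketch below recomputes the joint probability of one full assignment from the tables of Example 2.5. It is self-contained and stores each table as a dictionary keyed by named tuples rather than the `Factor` type; the helpers `scope` and `joint` are hypothetical names introduced for this illustration.

```julia
# Conditional probability tables from Example 2.5, keyed by named tuples.
tables = [
  Dict((b=1,) => 0.99, (b=2,) => 0.01),
  Dict((s=1,) => 0.98, (s=2,) => 0.02),
  Dict((e=1,b=1,s=1) => 0.90, (e=1,b=1,s=2) => 0.04,
       (e=1,b=2,s=1) => 0.05, (e=1,b=2,s=2) => 0.01,
       (e=2,b=1,s=1) => 0.10, (e=2,b=1,s=2) => 0.96,
       (e=2,b=2,s=1) => 0.95, (e=2,b=2,s=2) => 0.99),
  Dict((d=1,e=1) => 0.96, (d=1,e=2) => 0.03,
       (d=2,e=1) => 0.04, (d=2,e=2) => 0.97),
  Dict((c=1,e=1) => 0.98, (c=1,e=2) => 0.01,
       (c=2,e=1) => 0.02, (c=2,e=2) => 0.99),
]

# Variable names of a table, read off its first key.
scope(table) = keys(first(keys(table)))

# Multiply one entry per table, selecting only the fields each table needs.
joint(tables, a) = prod(get(t, NamedTuple{scope(t)}(a), 0.0) for t in tables)

p = joint(tables, (b=1, s=1, e=1, d=1, c=1))
println(p)   # 0.99 * 0.98 * 0.90 * 0.96 * 0.98, roughly 0.8215
```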

In [None]:
# Algorithm 3.1 : implementation of the factor product
function Base.:*(φ::Factor, ψ::Factor)
  φnames = variablenames(φ)
  ψnames = variablenames(ψ)
  ψonly = setdiff(ψ.vars, φ.vars)
  table = FactorTable()
  for (φa,φp) in φ.table
    for a in assignments(ψonly)
      a = merge(φa, a)
      ψa = select(a, ψnames)
      table[a] = φp * get(ψ.table, ψa, 0.0)
    end
  end
  vars = vcat(φ.vars, ψonly)
  return Factor(vars, table)
end
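
The factor product pairs every row of one table with every compatible row of the other and multiplies their values. The sketch below expresses this as a simple nested-loop join over named-tuple keys; it is a different formulation from the `Base.:*` method above (which enumerates assignments of the variables unique to `ψ`), and the table entries and the name `factor_product` are assumptions made for illustration.

```julia
φ = Dict((x=1,y=1) => 0.3, (x=1,y=2) => 0.4,
         (x=2,y=1) => 0.2, (x=2,y=2) => 0.1)
ψ = Dict((y=1,z=1) => 0.2, (y=1,z=2) => 0.0,
         (y=2,z=1) => 0.3, (y=2,z=2) => 0.5)

function factor_product(φ, ψ)
  out = Dict{Any,Float64}()
  for (a, p) in φ, (b, q) in ψ
    # Rows are compatible when they agree on every shared variable.
    if all(a[n] == b[n] for n in keys(a) if haskey(b, n))
      out[merge(a, b)] = p * q
    end
  end
  return out
end

table = factor_product(φ, ψ)
println(length(table))             # 2 × 2 × 2 joint assignments
println(table[(x=1, y=2, z=2)])    # φ(x=1,y=2) * ψ(y=2,z=2)
```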

In [None]:
# Algorithm 3.2

function marginalize(φ::Factor, name)
  table = FactorTable()
  for (a, p) in φ.table
    a′ = delete!(copy(a), name)
    table[a′] = get(table, a′, 0.0) + p
  end 
  vars = filter(v -> v.name != name, φ.vars)
  return Factor(vars, table)
end 

marginalize (generic function with 1 method)
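
As a check on Algorithm 3.2, summing `z` out of the Example 2.3 table should leave a table over `x` and `y` whose entries are pairwise sums. The sketch below is self-contained and uses named-tuple keys; `drop` and `sum_out` are hypothetical helper names, not part of the notebook's API.

```julia
table = Dict(
  (x=1, y=1, z=1) => 0.08, (x=1, y=1, z=2) => 0.31,
  (x=1, y=2, z=1) => 0.09, (x=1, y=2, z=2) => 0.37,
  (x=2, y=1, z=1) => 0.01, (x=2, y=1, z=2) => 0.05,
  (x=2, y=2, z=1) => 0.02, (x=2, y=2, z=2) => 0.07,
)

# Remove one field from a named tuple.
drop(a, name) = NamedTuple{filter(n -> n != name, keys(a))}(a)

function sum_out(table, name)
  out = Dict{Any,Float64}()
  for (a, p) in table
    a′ = drop(a, name)
    out[a′] = get(out, a′, 0.0) + p   # accumulate rows that now coincide
  end
  return out
end

marginal = sum_out(table, :z)
println(marginal[(x=1, y=1)])   # 0.08 + 0.31
```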

In [None]:
# Algorithm 3.3

in_scope(name, φ) = any(name == v.name for v in φ.vars)

function condition(φ::Factor, name, value)
  if !in_scope(name, φ)
    return φ   
  end
  table = FactorTable()
  for (a, p) in φ.table
    if a[name] == value
      table[delete!(copy(a), name)] = p
    end
  end
  vars = filter(v -> v.name != name, φ.vars)
  return Factor(vars, table)
end

function condition(φ::Factor, evidence)
  for (name, value) in pairs(evidence)
    φ = condition(φ, name, value)
  end
  return φ
end

condition (generic function with 2 methods)
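
Conditioning keeps only the rows consistent with the evidence and drops the observed variable. The sketch below applies that to the Example 2.3 table with evidence `y = 2`, then renormalizes to obtain a conditional distribution. The renormalization step and the helper names `drop` and `restrict` are additions for illustration, since `condition` above returns an unnormalized factor.

```julia
table = Dict(
  (x=1, y=1, z=1) => 0.08, (x=1, y=1, z=2) => 0.31,
  (x=1, y=2, z=1) => 0.09, (x=1, y=2, z=2) => 0.37,
  (x=2, y=1, z=1) => 0.01, (x=2, y=1, z=2) => 0.05,
  (x=2, y=2, z=1) => 0.02, (x=2, y=2, z=2) => 0.07,
)

# Remove one field from a named tuple.
drop(a, name) = NamedTuple{filter(n -> n != name, keys(a))}(a)

function restrict(table, name, value)
  out = Dict{Any,Float64}()
  for (a, p) in table
    if a[name] == value            # keep rows that match the evidence
      out[drop(a, name)] = p
    end
  end
  return out
end

conditioned = restrict(table, :y, 2)
z = sum(values(conditioned))       # total mass of the rows with y = 2
posterior = Dict(a => p / z for (a, p) in conditioned)
println(posterior[(x=1, z=2)])     # 0.37 / 0.55
```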

In [None]:
using LinearAlgebra



In [None]:
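# Map 1-based subscripts x to a linear index for an array of size siz.
function sub2ind(siz, x)
  k = vcat(1, cumprod(siz[1:end-1]))
  return dot(k, x .- 1) + 1
end

# For each variable i, count how often each value k occurs together with
# each parent instantiation j. D holds one sample per column.
function statistics(vars, G, D::Matrix{Int})
  n = size(D, 1)                 # number of variables
  r = [vars[i].r for i in 1:n]   # number of values of each variable
  # number of parent instantiations of each variable (1 if no parents)
  q = [prod([r[j] for j in inneighbors(G,i)]) for i in 1:n]
  M = [zeros(q[i], r[i]) for i in 1:n]
  for o in eachcol(D)
    for i in 1:n
      k = o[i]
      parents = inneighbors(G,i)
      j = 1
      if !isempty(parents)
        j = sub2ind(r[parents], o[parents])
      end
      M[i][j,k] += 1.0
    end
  end
  return M
end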

statistics (generic function with 1 method)
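
`sub2ind` maps a parent instantiation to a row index in column-major order, the same convention as Julia's built-in `LinearIndices`. The sketch below repeats the function so that it runs on its own and compares it against `LinearIndices` for two binary parents; the sample inputs are chosen for illustration.

```julia
using LinearAlgebra   # for dot

function sub2ind(siz, x)
  k = vcat(1, cumprod(siz[1:end-1]))   # strides: 1, siz[1], siz[1]*siz[2], ...
  return dot(k, x .- 1) + 1
end

siz = [2, 2]                             # two parents with two values each
println(sub2ind(siz, [1, 2]))            # stride-weighted offset plus one
println(LinearIndices((2, 2))[1, 2])     # built-in equivalent

# Check that the two agree for every instantiation.
agree = all(sub2ind(siz, [i, j]) == LinearIndices((2, 2))[i, j]
            for i in 1:2, j in 1:2)
println(agree)
```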

# Example 4.1

In [None]:
G = SimpleDiGraph(3)
add_edge!(G, 1, 2)
add_edge!(G, 3, 2)
vars = [Variable(:A,2), Variable(:B,2), Variable(:C,2)]
D = [1 2 2 1; 1 2 2 1; 2 2 2 2]
M = statistics(vars, G, D)

3-element Vector{Matrix{Float64}}:
 [2.0 2.0]
 [0.0 0.0; 0.0 0.0; 2.0 0.0; 0.0 2.0]
 [0.0 4.0]

In [None]:
θ = [mapslices(x->normalize(x,1), Mi, dims=2) for Mi in M]

3-element Vector{Matrix{Float64}}:
 [0.5 0.5]
 [NaN NaN; NaN NaN; 1.0 0.0; 0.0 1.0]
 [0.0 1.0]

# Algorithm 4.2

In [None]:
function prior(vars, G)
  n = length(vars)
  r = [vars[i].r for i in 1:n]
  q = [prod([r[j] for j in inneighbors(G,i)]) for i in 1:n]
  return [ones(q[i], r[i]) for i in 1:n]
end

prior (generic function with 1 method)

In [None]:
α = prior(vars, G)

3-element Vector{Matrix{Float64}}:
 [1.0 1.0]
 [1.0 1.0; 1.0 1.0; 1.0 1.0; 1.0 1.0]
 [1.0 1.0]

In [None]:
M + α

3-element Vector{Matrix{Float64}}:
 [3.0 3.0]
 [1.0 1.0; 1.0 1.0; 3.0 1.0; 1.0 3.0]
 [1.0 5.0]

In [None]:
θ = [mapslices(x->normalize(x,1), Mi, dims=2) for Mi in M + α]

3-element Vector{Matrix{Float64}}:
 [0.5 0.5]
 [0.5 0.5; 0.5 0.5; 0.75 0.25; 0.25 0.75]
 [0.16666666666666666 0.8333333333333333]
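
The effect of the prior is easiest to see on a single node. In the sketch below, `M` holds made-up counts for a node with two parent instantiations (one of them never observed) and `α` is a uniform prior of ones, mirroring `prior` above. Row-normalizing `M` alone gives `NaN` for the unobserved row, while `M + α` yields a well-defined distribution. The helper name `rownormalize` is hypothetical.

```julia
using LinearAlgebra   # for normalize

# Row 1: parent instantiation never observed. Row 2: observed twice with value 1.
M = [0.0 0.0; 2.0 0.0]
α = ones(2, 2)        # uniform pseudocounts

# Divide each row by its 1-norm so that it sums to 1.
rownormalize(A) = mapslices(x -> normalize(x, 1), A, dims=2)

mle = rownormalize(M)            # first row is 0/0
smoothed = rownormalize(M + α)   # pseudocounts remove the division by zero

println(mle)
println(smoothed)
```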

## 4.3 Nonparametric Learning

In [None]:
# Algorithm 4.3

gaussian_kernel(b) = x->pdf(Normal(0,b), x)

function kernel_density_estimate(φ, O)
  return x -> sum([φ(x - o) for o in O])/length(O)
end

kernel_density_estimate (generic function with 1 method)
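
A kernel density estimate places one kernel on each observation and averages them. The sketch below reproduces the two functions without the Distributions package by writing the zero-mean Gaussian density out by hand (assumed equivalent to `pdf(Normal(0,b), x)`); the observations `O` and the bandwidth are made-up values.

```julia
# Zero-mean Gaussian density with standard deviation b, written explicitly.
gaussian_kernel(b) = x -> exp(-x^2 / (2b^2)) / (b * sqrt(2π))

# Average of one kernel centered on each observation.
kernel_density_estimate(φ, O) = x -> sum(φ(x - o) for o in O) / length(O)

O = [1.0, 1.5, 3.0]                       # hypothetical observations
p = kernel_density_estimate(gaussian_kernel(0.5), O)

println(p(1.2))   # density near two of the observations
println(p(5.0))   # density far from the data

# A crude Riemann sum as a sanity check that the estimate integrates to about 1.
xs = -5:0.01:10
total = sum(p.(xs)) * 0.01
println(total)
```

A smaller bandwidth `b` makes the estimate spikier around each observation, while a larger one smooths it out.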

### 4.4.2 Expectation-Maximization