# MCMC2.0: Iterators, Generators, and Channels

Iterators and generators are two important tools in modern functional programming, and Julia additionally provides "channels." I will show typical implementations of each for MCMC.

## Iterators

Iterators are often used to save memory. If we wish to do something to each element of a large list, we do not have to store all the elements between processing steps. An iterator returns the elements one by one, so we can operate on each element in turn and discard old data as we go. In fact, we have already used iterators in ordinary "for" loops.

In [1]:
iter = 1 : 2

1:2

This is already an iterator that counts from 1 to 2.

In [2]:
element, state = iterate(iter)

(1, 1)

These are the first element and the initial state of the iterator "1:2".

In [3]:
element, state = iterate(iter, state)

(2, 2)

These are the next item and the next state.

In [4]:
result = iterate(iter, state)
result == nothing

true

When there is nothing left to iterate, iterate returns nothing.
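Any type can take part in this protocol by defining its own `iterate` methods. The following is a minimal sketch of that idea; the type `Squares` is hypothetical and introduced only for illustration.

```julia
# Hypothetical iterator over the squares 1^2, 2^2, ..., n^2.
# Only the upper bound n is stored, never the squares themselves.
struct Squares
    n::Int
end

# iterate(s) starts from state 1; iterate(s, state) continues from `state`.
# Returning `nothing` signals that the iteration is finished.
function Base.iterate(s::Squares, state::Int = 1)
    state > s.n && return nothing
    return state^2, state + 1
end

# Optional traits, defined here so that collect and length work.
Base.length(s::Squares) = s.n
Base.eltype(::Type{Squares}) = Int

squares = collect(Squares(5))
```

Once `iterate` is defined, a `for` loop over `Squares(5)` works in the same way as a loop over a range.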

In [5]:
iter = 1 : 5
for i in iter
    println(i)
end

1
2
3
4
5


This is equivalent to the following:

In [6]:
iter_result = iterate(iter)
while iter_result !== nothing
    (i, state) = iter_result
    println(i)
    iter_result = iterate(iter, state)
end

1
2
3
4
5


If we use an iterator instead of a list, we do not need to store the whole list anymore.

In [7]:
list = collect(iter)
for i in 1 : length(list)
    println(list[i] ^ 2)
end

1
4
9
16
25


Collecting the list first is pointless here; the loop should be rewritten as:

In [8]:
for i in iter
    println(i ^ 2)
end

1
4
9
16
25


In this way, we can (trivially) avoid allocating memory for the list. Alternatively, if memory does not matter, iterating over the list directly also works:

In [9]:
for i in list
    println(i ^ 2)
end

1
4
9
16
25


## IterTools

~ under construction ~

In [10]:
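using IterTools
using Distributions
# Gibbs1 returns a function that maps the current sample (x, y) to the next one.
function Gibbs1(a::Float64, b::Float64, c::Float64)::Function
    xy::Tuple{Float64, Float64} -> begin
        # Draw x from its normal conditional given y, then y given the new x.
        x = rand(Normal(b / a * xy[2], 1 / sqrt(a)))
        x, rand(Normal(b / c * x, 1 / sqrt(c)))
    end
end
# iterated(f, x0) lazily yields x0, f(x0), f(f(x0)), ...
samples = iterated(Gibbs1(1.0, 0.8, 1.0), (0.0, 0.0))
collect(Iterators.take(samples, 10))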

10-element Array{Tuple{Float64,Float64},1}:
 (0.0, 0.0)                                
 (0.4739005172993937, -1.0898215077285935) 
 (-1.2688200092225754, -0.6419541240487407)
 (0.22890195779437084, -0.4022747694649951)
 (-0.3144508539987875, 0.3073409581425116) 
 (0.35257427800241015, -1.1288639269196465)
 (-1.1329820158068977, -1.610605834704651) 
 (-0.06448259403362, -0.4480419449216983)  
 (0.31058844713367567, 1.3201575886598336) 
 (-0.060514341635201196, 1.570308116363705)
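To make the role of `iterated` clearer, here is a small Base-only sketch of an iterator that repeatedly applies a function to its previous output. The type `Iterated` is a hypothetical stand-in written for illustration; it is not the implementation used by IterTools.jl.

```julia
# Hypothetical stand-in for IterTools.iterated, for illustration only.
# It yields x0, f(x0), f(f(x0)), ... without ever storing the sequence.
struct Iterated{F,T}
    f::F
    x0::T
end

# The sequence never ends, and the element type depends on what f returns.
Base.IteratorSize(::Type{<:Iterated}) = Base.IsInfinite()
Base.IteratorEltype(::Type{<:Iterated}) = Base.EltypeUnknown()

# The first call returns the seed; later calls apply f to the previous value.
Base.iterate(it::Iterated) = (it.x0, it.x0)
function Base.iterate(it::Iterated, x)
    y = it.f(x)
    return y, y
end

doubling = Iterated(x -> 2x, 1)
first_five = collect(Iterators.take(doubling, 5))
```

Because the state is just the previous sample, a Markov chain update such as `Gibbs1` fits this pattern directly, and `Iterators.take` decides how many samples are actually computed.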

## Generator 1 (generator expression)

An iterator is most useful when combined with a generator. In general, we call anything that creates an iterator a generator. Generators include generator expressions, generator functions, and so on. Julia supports generator expressions as follows:

In [11]:
gene = (x^2 for x in 1 : 5)

Base.Generator{UnitRange{Int64},getfield(Main, Symbol("##5#6"))}(getfield(Main, Symbol("##5#6"))(), 1:5)

This is very similar to a list comprehension:

In [12]:
list = [x^2 for x in 1 : 5]

5-element Array{Int64,1}:
  1
  4
  9
 16
 25

The list comprehension allocates memory for the whole list. The generator computes each element on demand and so saves memory.

In [13]:
for i in gene
    println(i)
end

1
4
9
16
25
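A generator can also be passed directly to a reduction such as `sum`, so that no intermediate array is built. The snippet below is a small sketch of this; the variable names are only illustrative.

```julia
n = 1_000

# The generator feeds sum one element at a time; no array of squares is built.
lazy_total = sum(x^2 for x in 1:n)

# The comprehension first allocates an n-element array, then sums it.
eager_total = sum([x^2 for x in 1:n])

# A generator expression may also carry a filter clause.
even_squares = collect(x^2 for x in 1:10 if iseven(x))
```

Both totals are the same; the difference is only in how much memory is used along the way.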


## Generator 2 (generator function = resumable function)

An easier, simpler, and often more useful way to create an iterator is a generator function; Python and many other modern languages support this type of function. Julia does not natively support Python-style generator functions, but you can simply use ResumableFunctions.jl. This is the easiest way to reproduce the behavior of generator functions.

In [14]:
using ResumableFunctions
@resumable function Ising()::Vector{Int64}
    N = 8
    σ = ones(Int64, N)
    β = 1.0
    while true
        for i in 1 : 1000
            j = rand(1 : N)
            ΔβE = 2β * σ[j] * (σ[mod1(j + 1, N)] + σ[mod1(j - 1, N)])
            -ΔβE > log(rand()) && (σ[j] = -σ[j])
        end
        @yield σ
    end
end

Ising (generic function with 1 method)

With the @resumable macro, you can define a resumable function. Instead of a single "return," this function yields a value at each @yield point. Calling it directly returns an iterator, i.e. iterate(iter) is already defined once we call iter = Ising(). Each call to iter() then resumes the function and returns the value yielded at the next @yield point.

In [15]:
iter = Ising()
@show iter();

iter() = [-1, -1, -1, -1, -1, -1, -1, -1]


In [16]:
for i in 1 : 10
    println(iter())
end

[1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, -1, -1, 1]
[-1, -1, -1, 1, 1, 1, 1, 1]
[1, 1, -1, -1, -1, -1, -1, 1]
[1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, -1, 1, 1]
[-1, -1, 1, 1, 1, 1, 1, -1]
[-1, -1, -1, -1, -1, -1, -1, -1]
[-1, -1, -1, -1, -1, -1, -1, -1]
[1, 1, 1, 1, 1, -1, -1, -1]


ResumableFunctions are very fast and memory-efficient because the @resumable macro rewrites the function into an ordinary iterator before JIT compilation.
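To make the phrase "an ordinary iterator" concrete, here is a hand-written sketch of what such an iterator could look like for the Ising sampler above. This is not the code that ResumableFunctions.jl generates; the type `IsingChain` and its fields are hypothetical and chosen for illustration.

```julia
# Hypothetical hand-written counterpart of the Ising() resumable function.
struct IsingChain
    N::Int        # number of spins on the ring
    β::Float64    # inverse temperature
    sweep::Int    # Metropolis updates between two yielded samples
end

# The chain never terminates, and every element is a spin configuration.
Base.IteratorSize(::Type{IsingChain}) = Base.IsInfinite()
Base.eltype(::Type{IsingChain}) = Vector{Int}

# The state is the current spin configuration; it starts with all spins up.
function Base.iterate(chain::IsingChain, σ::Vector{Int} = ones(Int, chain.N))
    N, β = chain.N, chain.β
    for _ in 1:chain.sweep
        j = rand(1:N)
        ΔβE = 2β * σ[j] * (σ[mod1(j + 1, N)] + σ[mod1(j - 1, N)])
        -ΔβE > log(rand()) && (σ[j] = -σ[j])
    end
    # Hand out a copy so that later updates do not alter earlier samples.
    return copy(σ), σ
end

configs = collect(Iterators.take(IsingChain(8, 1.0, 1000), 3))
```

The explicit state argument plays the role that the saved local variables play in a resumable function.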

## Channels

Julia has no native generator functions, but it natively supports Channels, which are designed for communication between tasks in concurrent and parallel computing. In particular, a Channel is a more powerful and more general tool for creating an iterator from a function. It generalizes the concept of a generator function, but in most cases ResumableFunctions.jl is faster and more memory-efficient. However, in my experience Channels work better if you need postprocessing based on lazy evaluation.

In [17]:
function Gibbs2(a::Float64, b::Float64, c::Float64)::Channel
    Channel(ctype = Tuple{Float64,Float64}) do channel::Channel{Tuple{Float64,Float64}}
        N = 10
        x = 0.0
        y = 0.0
        put!(channel, (x, y))
        for i in 1 : N
            x = rand(Normal(b / a * y, 1 / sqrt(a)))
            y = rand(Normal(b / c * x, 1 / sqrt(c)))
            put!(channel, (x, y))
        end
    end
end

Gibbs2 (generic function with 1 method)

In [18]:
for z in Gibbs2(1.0, 0.8, 1.0)
    println(z)
end

(0.0, 0.0)
(0.9829611089484098, 1.0146254514164113)
(0.8302512487643459, 1.9382240080615676)
(0.8533806870959914, 0.3470739424909939)
(-0.4284540298917679, 0.6206711615112139)
(1.053350461232235, -0.32532115715575716)
(-0.6587307481735953, -1.1097739365418695)
(0.1739851697004564, 0.049982859507384236)
(0.9141017684126425, -1.1977370587585305)
(-1.5634071950282509, -2.1294161592055127)
(-0.2965097930145839, 0.05643498312732537)


This code is acceptable if the number of tasks is just one.
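The lazy postprocessing mentioned at the start of this section can be sketched with a single producer task as follows. The function name `gibbs_channel` is hypothetical, the sampler uses `randn` instead of Distributions.jl so that the snippet needs no packages, and the burn-in length of 100 is an arbitrary choice for illustration.

```julia
# Hypothetical endless producer for the same Gibbs updates as above.
function gibbs_channel(a::Float64, b::Float64, c::Float64)
    Channel() do channel
        x, y = 0.0, 0.0
        while true
            # Normal(μ, s) is sampled here as μ + s * randn().
            x = b / a * y + randn() / sqrt(a)
            y = b / c * x + randn() / sqrt(c)
            put!(channel, (x, y))   # blocks until a consumer takes the sample
        end
    end
end

chain = gibbs_channel(1.0, 0.8, 1.0)
burned = Iterators.drop(chain, 100)            # lazily skip the burn-in
samples = collect(Iterators.take(burned, 5))   # only now are samples produced
close(chain)                                   # stop the otherwise endless producer
```

The producer only advances when the consumer asks for the next element, so burn-in and truncation can be decided entirely on the consumer side.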

~ under construction ~

In [19]:
function xupdate(a::Float64, b::Float64, c::Float64, input::Channel{Tuple{Float64,Float64}}, output::Channel{Tuple{Float64,Float64}})
    N = 10
    x = 0.0
    y = 0.0
    for i in 1 : N
        (x, y) = take!(input)
        x = rand(Normal(b / a * y, 1 / sqrt(a)))
        put!(output, (x, y))
    end
    close(output)
end
function yupdate(a::Float64, b::Float64, c::Float64, input::Channel{Tuple{Float64,Float64}}, output::Channel{Tuple{Float64,Float64}})
    N = 10
    x = 0.0
    y = 0.0
    for i in 1 : N
        (x, y) = take!(input)
        y = rand(Normal(b / c * x, 1 / sqrt(c)))
        println((x, y))
        put!(output, (x, y))
    end
    close(output)
end
ch1 = Channel{Tuple{Float64,Float64}}(32)
ch2 = Channel{Tuple{Float64,Float64}}(32)
@async xupdate(1.0, 0.8, 1.0, ch1, ch2)
@async yupdate(1.0, 0.8, 1.0, ch2, ch1)
put!(ch1, (0.0, 0.0));

(0.4679852522868461, 0.8662685765831253)
(-0.27529622827669376, -1.3598648848350114)
(0.10737675984452144, 1.5337060029205214)
(-0.2695036999543239, 1.3476995398361922)
(1.6711865609457512, 1.611619713413844)
(1.1178329939196314, -1.1159872152067298)
(-1.2769155587808125, 0.23981805611572993)
(-0.6696316662763916, -0.060843259569185015)
(-0.2678181059722951, -0.006382054276658111)
(0.39487876062161253, -1.3051756012016937)


The difference between Channels and ResumableFunctions lies in how the code is compiled. See: https://white.ucc.asn.au/2017/11/18/Lazy-Sequences-in-Julia.html

## RemoteChannels

It will be discussed in MCMC4.0.

## Lazy

Lazy.jl is sometimes useful.

In [20]:
using Lazy
takewhile(x -> x < 3, 1 : 5)



(1 2)

However, the coding style of Lazy.jl is different from ordinary Julia, so I won't use this package unless it is necessary.

**Exercise**: implement takewhile and dropwhile for iterators. These functions are sometimes useful for MCMC.