# MCMC0.5: Introduction to Functional Programming

## Why functional?

One may claim that C/Fortran (or sometimes C++ without functional programming extensions) is enough to do physics because these languages are fast, simple, and have enough functionality. I agree with all three points, but I think this is acceptable only when you are working entirely by yourself. Usually, code written in a procedural style is not easy to read and is not human-oriented. In procedural programming, you always have to use your brain to mimic how the computer works (is a human being a slave of computers?). Most computational physicists have somehow gotten used to this way of thinking (already brainwashed) and write code that is as readable to the machine as possible. Of course, this is fine if you are working by yourself and your collaborators do not care about your code, but today it is more important to share code (via GitHub) and discuss physics on a Jupyter Notebook (maybe in the old days it was enough to have one expert for computation, and the others did not even have to see the code). In that case, it is important to write readable, understandable, and well-organized code for everybody, even without any comments!

I want to define functional programming as human-oriented programming. If you want to get a mean value, why not apply a "mean" function to the collection? Some (brainwashed) people may wonder whether it is ok not to accumulate the sum at each iteration to save memory, and so on. Such concerns will be resolved later by the concept of lazy evaluation (actually in MCMC6.0). Using this black box, you no longer have to care about the step at which functions are actually called. Then, you can apply statistical operations like mean, std, etc. directly to the collection of data (called an iterator) afterwards. This is exactly how we think when we process big data, and you can now write code as you think.

More strictly, functional programming may be defined as writing code by composing pure functions. In other words, such a program is a deterministic finite-state machine (FSM). This notion is closely related to Markov-chain Monte Carlo (MCMC) because MCMC is nothing but a nondeterministic FSM. From a physicist's view, it is a "quantum" version of functional programming, and every technique in functional programming is applicable to MCMC. However, completely pure functional programming is impossible in MCMC because we need some randomness (isn't it exactly like quantum mechanics?), and software cannot produce any true random numbers by itself! From now on, I will take a pragmatic stance: I use functional programming as much as possible, but add a (pseudo)nondeterministic function to make it work as MCMC. In this sense, I do not insist on completely functional programming, but use it as a tool to make code readable. As long as it does not diminish the readability of the code, I sometimes use a destructive function, but only when needed.

Julia completely enables you to write code as you think. You can use Unicode for variables, most linear algebra functions already exist, and Julia even discards class-based object-oriented programming, which is unnecessary for MCMC! Let's look at the HMC function in MCMC1.0 and compare the function to the algorithm written above it. If you are good at elementary mathematics, you can write code exactly as if you were doing math in your notebook. As in [this pdf](https://github.com/MCSMC/MCMC_sample_codes/blob/master/review-MCMC.pdf), people unfortunately still believe that such "how we think" programming in Mathematica is useless. However, today in 2018, Julia and Jupyter Notebook work as a super-fast version of Mathematica/MATLAB, and Julia now runs at a speed comparable to C/Fortran. Of course, if you wish to use low-level operations like SSE and AVX, you still need such procedural languages, but for most purposes just calling BLAS/LAPACK functions is enough. Moreover, Julia now supports GPGPU, so even for GPGPU programming Julia can be the first choice, instead of writing CUDA/OpenCL directly.

In this series of notebooks, I will introduce not only MCMC but also functional programming, Julia-like programming, and in addition a Bayesian way of thinking. Think Bayesian, think functional, and last but not least [think Julia](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html)!

## Think functional

In functional programming, a program is regarded as one big function, which is constructed just by composing a sequence of functions. In order to keep those functions pure, it is important to follow these rules when you write a function.

1. Do not use/write a destructive function.
1. Do not include a side effect in functions. In other words, functions must not do anything other than return a unique value determined by the input.
1. (optional) static typing

In Julia, by convention, every destructive (mutating) function has ! at the end of its name:

In [1]:
a = [1, 2, 3]
push!(a, 4)

4-element Array{Int64,1}:
 1
 2
 3
 4

Therefore, the first condition is satisfied if you do not use any built-in functions with "!". In some cases you need such destructive functions for speed, and then you should be pragmatic. It is recommended to add ! to the name when you define a new destructive function. In some sense assignment itself can be regarded as a destructive operation (or a side effect), but it is usually unavoidable, and we accept destructive assignment as pragmatists.

Then, how can we replace destructive functions like "push!" and "pop!"? In most cases, destructive operations on arrays can be avoided by using iterators and generators (see MCMC2.0). However, for matrices and tensors we sometimes still need a destructive function to perform operations in a memory-efficient way. In that case, we of course accept destructive functions as pragmatists, or you can formally write a pure wrapper function around the destructive operation to make your program look like pure functional code.
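As a minimal sketch of such a pure wrapper, the function below copies its input before calling the destructive `push!`, so the caller's array is never modified. The name `pushed` is a hypothetical helper, not part of Base, and the extra copy costs memory, which is exactly the trade-off mentioned above.

```julia
# `pushed` is a hypothetical name for illustration; it is not a Base function.
# It hides the destructive push! behind a copy, so the argument stays untouched.
pushed(a, x) = push!(copy(a), x)

a = [1, 2, 3]
b = pushed(a, 4)   # new array with the extra element

a                  # still [1, 2, 3]
b                  # [1, 2, 3, 4]
```

For small arrays `vcat(a, 4)` gives the same result without any explicit mutation.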

Then, what is a side effect?

In [2]:
println("Hello, World!")

Hello, World!


println is a "function," but it does more than a pure function is supposed to do. It prints words on your screen and returns nothing! Anything a function does beyond what a mathematically pure function should do, including such I/O operations, is called a side effect. Some side effects are avoidable and others are unavoidable. It is important to reduce side effects as much as possible, but we as pragmatists accept useful side effects like I/O operations, assignment, control syntax, etc. On the other hand, the most important thing to keep in mind is not to rewrite a global variable inside a function. Global variables should always be defined with "const", and it is even recommended to pass them to functions as arguments.
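The advice about global variables can be sketched as follows. All names here (`step_count`, `count_step!`, `next_step`, `BETA`, `boltzmann_weight`) are made up for illustration and do not come from the other notebooks.

```julia
# Impure: the result depends on, and rewrites, hidden global state.
step_count = 0
function count_step!()
    global step_count += 1
    return step_count
end

# Pure: everything the function needs comes in through its arguments.
next_step(count) = count + 1

# A global that is really needed is declared `const`
# and is still passed explicitly as an argument.
const BETA = 2.0
boltzmann_weight(energy, β) = exp(-β * energy)

w = boltzmann_weight(0.5, BETA)
```

Calling `count_step!()` twice gives two different answers, while `next_step(3)` always returns the same value for the same input.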

Unfortunately, doing completely pure functional programming in Julia is not realistic. This is mostly a matter of speed in numerical calculations. Most pure functional languages are not well suited to numerics and are several times slower than C/Fortran. Speed and referential transparency are in most cases a trade-off, so you should not insist on purity. In modern pure functional languages, referential transparency is almost always guaranteed, but in Julia you have to write referentially transparent code consciously, which is a bit exhausting. Writing hybrid code in the two styles may be important from a pragmatist's view.

## Higher-order functions

The most important feature of functional programming is higher-order functions.

### map

The most important higher-order function is map. See how it works.

In Julia you will not see this function very often because there is a very nice short-hand notation.
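A small sketch of map, assuming the short-hand meant here is Julia's dot (broadcast) syntax. The function `square` is a made-up example.

```julia
square(x) = x^2          # made-up example function
a = [1, 2, 3]

mapped = map(square, a)  # explicit higher-order call: [1, 4, 9]
doubled = map(x -> 2x, a)  # the same with an anonymous function: [2, 4, 6]

dotted = square.(a)      # dot syntax applies square elementwise: [1, 4, 9]
powered = a .^ 2         # operators can be dotted as well: [1, 4, 9]
```

For a single array argument, the `map` call and the dotted call return the same array, so the choice is mostly a matter of readability.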

The problem is that map in Julia does not do lazy evaluation. Every call to map allocates a new array in memory to store the result.

In order to do lazy evaluation, use Base.Generator instead.
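As a hedged illustration, a generator can be built either by calling `Base.Generator` directly or with a generator expression. Nothing is computed until a consumer such as `sum` or `collect` iterates over it.

```julia
# Build the generator explicitly: no squares are computed at this point.
squares = Base.Generator(x -> x^2, 1:5)

# A generator expression creates the same kind of object.
also_squares = (x^2 for x in 1:5)

# The values are produced one by one while summing,
# without storing an intermediate array.
total = sum(squares)               # 55

# collect forces the evaluation and stores the result in an array.
stored = collect(also_squares)     # [1, 4, 9, 16, 25]
```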

Note that IterTools.imap has a more sophisticated implementation of this lazy evaluation. I will come back to this point later in MCMC2.0 and MCMC6.0.

### filter

The next important higher-order function is filter.
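A minimal sketch of filter: it takes a predicate (a function returning true or false) and keeps only the elements for which the predicate holds. `Iterators.filter` is the lazy counterpart in Base.

```julia
a = [1, 2, 3, 4, 5, 6]

evens = filter(iseven, a)          # eager: returns a new array [2, 4, 6]
large = filter(x -> x > 3, a)      # with an anonymous predicate: [4, 5, 6]

# Lazy version: elements are tested only when something iterates over it.
lazy_evens = Iterators.filter(iseven, a)
even_sum = sum(lazy_evens)         # 2 + 4 + 6 = 12
```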

### reduce

In [3]:
reduce(+, [1, 2, 3])

6

You should be careful if the function is not associative.

In [4]:
reduce(-, [3, 2, 1])

0

Reduction is done from left to right here, but reduce does not guarantee this order in general; use foldl or foldr when the order matters. Since reduce(+, a) is the same as sum(a), reduce is usually used in combination with map.

In [None]:
mapreduce(x -> x^2, +, [1, 2, 3])  # same as reduce(+, map(x -> x^2, [1, 2, 3]))
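To make the ordering issue concrete, here is a short sketch with `foldl` and `foldr`, which fix the direction of the reduction, followed by a `mapreduce` call with actual arguments.

```julia
# foldl always groups from the left: (3 - 2) - 1
left = foldl(-, [3, 2, 1])              # 0

# foldr always groups from the right: 3 - (2 - 1)
right = foldr(-, [3, 2, 1])             # 2

# mapreduce applies abs2 to each element and sums the results
# without building the intermediate array that map would create.
sq_sum = mapreduce(abs2, +, [1, -2, 3])  # 1 + 4 + 9 = 14

# For + there is an equivalent short form.
same = sum(abs2, [1, -2, 3])
```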

accumulate is a cousin of reduce:

In [5]:
accumulate(+, [1, 2, 3])

3-element Array{Int64,1}:
 1
 3
 6

If you have time, try writing a lazy version of accumulate!
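One possible answer, as a sketch only: a small custom iterator that carries the running total in its iteration state. The type name `LazyAccumulate` is hypothetical, and newer Julia versions also ship `Iterators.accumulate`, which may be preferable in practice.

```julia
# Hypothetical type for illustration: wraps an operator and any iterable.
struct LazyAccumulate{F,I}
    op::F
    itr::I
end

# First step: the first element is emitted as is.
function Base.iterate(acc::LazyAccumulate)
    next = iterate(acc.itr)
    next === nothing && return nothing
    value, inner_state = next
    return value, (value, inner_state)
end

# Later steps: combine the running total with the next element.
function Base.iterate(acc::LazyAccumulate, state)
    total, inner_state = state
    next = iterate(acc.itr, inner_state)
    next === nothing && return nothing
    value, inner_state = next
    total = acc.op(total, value)
    return total, (total, inner_state)
end

# The length is not known in advance, so collect grows the result as needed.
Base.IteratorSize(::Type{<:LazyAccumulate}) = Base.SizeUnknown()

finite = collect(LazyAccumulate(+, [1, 2, 3]))
# Works on an infinite iterator too, because nothing is computed until asked for.
partial = collect(Iterators.take(LazyAccumulate(+, Iterators.countfrom(1)), 4))
```

The element type is left unspecified here for brevity, so `collect` returns an array of `Any`; a more careful version would also define `eltype`.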

### foreach

map, filter, and reduce are the three most important functions. Compared to those, foreach is less important because it can be replaced by a "for loop." foreach just shortens your code.

### zip

zip for arrays is not a higher-order function in a strict sense, but I will introduce it here because it is usually used in combination with other higher-order functions.

In [7]:
foreach(println, zip(1 : 2, 3 : 6))

(1, 3)
(2, 4)


### Currying

## Anonymous function

## Recursion

In most cases, writing functions by recursion is not a good strategy in Julia because tail call optimization is not supported (as of Julia 1.0). Writing an explicit loop is much better in most cases.
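As a hedged illustration, the two made-up functions below compute the same sum. The recursive one is written in tail-call form, but since Julia does not eliminate tail calls, every call still adds a stack frame and a very deep recursion can overflow the stack. The loop has no such limit.

```julia
# Tail-recursive form: `acc` carries the partial result.
# Julia still creates one stack frame per call.
sum_to_rec(n, acc = 0) = n == 0 ? acc : sum_to_rec(n - 1, acc + n)

# Explicit loop: constant stack usage regardless of n.
function sum_to_loop(n)
    total = 0
    for i in 1:n
        total += i
    end
    return total
end

rec_result = sum_to_rec(100)     # 5050
loop_result = sum_to_loop(100)   # 5050
```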

## Type stability

## Tips: Nondeterministic unit test

~ under construction ~