# Exploring Some Fundamentals


## A Bit of Context

### High School

Back when I was young, I discovered computers and programming with some high school friends who were working wonders with the help of a French copy of the [PC Bible](https://www.amazon.fr/Pc-Bible-Knorr/dp/0201883546) book.

More specifically, part of their coding was done with an obscure language they called assembly.

I was still fresh from BASIC and learning Pascal (with Turbo Pascal 6 then 7, anyone? ;-) ). Including x86 assembly in such development environments was easy, so I rapidly learned some of it, and discovered its counterparts: registers, bus, ports, cache, RAM, microcode, cycles, etc.

I do not code in assembly anymore. But what I learned at that time still serves me today. In my opinion, having a reasonable model of how computers work when programming is underrated.

### Engineering School

A few years later, when I was studying Computer Science, some serious theoretical background came along, with terms like [Von Neumann architecture](https://en.wikipedia.org/wiki/Von_Neumann_architecture) or [Turing machine](https://en.wikipedia.org/wiki/Turing_machine).

In those years, I was also supposed to learn about Lambda Calculus and the theoretical work of Church et al. on computability. I think I completely missed the point at that time. I was rapidly lost when trying to "relate" this knowledge to anything I knew about computers.

For a reason: I had never been exposed to any functional programming/reasoning before.

### Back to Present

Fast-forward... I've now coded in a functional fashion with Swift, Kotlin, or Rust, had several looks into Haskell and Elm, and I'm leaning towards functional style or patterns in any language that does not prevent me from doing so, such as Python.

So when I found this video: https://www.youtube.com/watch?v=5C6sv7-eTKg, I thought it was a good time to return to functional programming "assembly" fundamentals.

What I found interesting in this video:

 * The speaker, [David Beazley](https://dabeaz.com/), does not take all those results for granted, and takes a questioning and exploratory tone that I find pleasing.

 * He uses Python, a "usual" (non-primarily functional) and "easy" language to illustrate the various steps.

 * He spares me from decrypting the oh-so-obscure-to-me mathematical notations ( $(\lambda x.x)y$... ) that pop up in a vast majority of the papers supposed to explain those concepts.

All of this means fewer barriers and less cognitive load than the material I used for my previous attempts.

One more thing: David Beazley is bright, clever and very educational. Most of all, he regularly acknowledges that this topic is mind-bending, and I think I needed to hear that.

So what follows is my latest attempt at understanding parts of lambda calculus and the Y combinator. Time to exorcise a series of personal failures!

I hope you will find this interesting. Or at least entertaining.

So here we go.

## Diving into the Maelström

The idea is to build a computing model based exclusively on functions. In a way, it is a kind of game where you have to find out how to create and use the computing concepts that you use every day without thinking too much about them.

What can make this hard are the rules that we choose to do so.

### The Rules

Beazley starts by setting a few simple rules, stating what is allowed and what is not, while trying to build the various concepts we need.

Those rules are quite simple:

1. Everything is a function, nothing else.

2. Each function has a single argument (that is a function, per previous rule).

3. Its return value is a function, for the same reason.

He also provides sound pieces of advice, like this one:

> When lost (like I have been), remember. Think of the various concepts as "behaviours".

HTH!

### Boolean Values

> How would you define booleans with such rules?

Ouch...

For starters, let's break a few rules, by using non-function arguments. To be a bit more concrete, we will consider electrical levels (engineering!) as Python strings.

In [None]:
HIGH = "5V"
LOW = "GND"  # i.e. ground, 0 Volts.

Now we want something that returns "5V" when TRUE, and "GND" when FALSE.

We could start with something like:

```python
assert TRUE("5V", "GND") == "5V"
```

and

```python
assert FALSE("5V", "GND") == "GND"
```

Minimal code to achieve this could be:

In [None]:
def TRUE(left, right):
    return left


def FALSE(left, right):
    return right

Does it run as expected?

Well, it seems so:

In [None]:
assert TRUE("5V", "GND") == "5V"
assert FALSE("5V", "GND") == "GND"

Now let's try to get back to our initial rules. At least 2 rules are broken here:

* we want single argument functions (rule-2),

* we want functions as arguments (rule-1).

To address the first broken rule, we'll use a simple trick that seems to be called *currying* (https://en.wikipedia.org/wiki/Currying).

The principle is to split a multi-argument function into several single-argument functions called in cascade. Instead of writing `f(a, b, c)`, we will code `f(a)(b)(c)` and expect the same return value.

To achieve this, the idea is to make each non-terminal function return another function that will in turn be called with the next argument. In Python, it means converting:

In [None]:
def add_ko(a, b):  # Two args here, rule-2 is broken!
    return a + b

to:

In [None]:
def add_ok(a):  # Single arg, OK for rule-2.
    
    def f(b):  # Single arg, OK too for rule-2.
        return a + b

    return f

`add_ok(a)` returns the `f` function, which is then called with the `b` argument.

Does it work?

Let Python tell us what it thinks about this:

In [None]:
# Rule-2 broken!
assert add_ko(3, 5) == 8

# Rule-2 compliant \o/.
assert add_ok(3)(5) == 8

With this, we alter the initial `TRUE` and `FALSE` definitions.

In [None]:
def TRUE(left):
    def f(right):
        return left
    return f


def FALSE(left):
    def f(right):
        return right
    return f

assert TRUE("5V")("GND") == "5V"
assert FALSE("5V")("GND") == "GND"

This code is now conformant to the "single argument" rule-2. Great.
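
The same trick scales to any number of arguments. Here is a small sketch with three of them, to match the `f(a)(b)(c)` form mentioned earlier. The `volume_ko` and `volume_ok` names are mine, purely for illustration, and the integers break rule-1 on purpose.

```python
def volume_ko(a, b, c):  # Three args, rule-2 is broken!
    return a * b * c


def volume_ok(a):  # Single arg.
    def g(b):  # Single arg.
        def h(c):  # Single arg.
            return a * b * c
        return h
    return g


assert volume_ko(2, 3, 4) == 24
assert volume_ok(2)(3)(4) == 24

# A partially applied call is itself a function, waiting for the rest.
waiting_for_c = volume_ok(2)(3)
assert waiting_for_c(4) == 24
```

Each level "remembers" the arguments received so far, which is all we need to fake multiple arguments.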

Remember that rule-1 is also broken: the `TRUE` and `FALSE` code above is called with strings as arguments, instead of functions. But if we look at the code closely, we can notice that only the assertion part fails us.

If "legal", i.e. single argument functions, were to be provided as the `left` and `right` arguments of `TRUE` and `FALSE`, all our rules would be standing.

In other words, our function definitions are OK. It is the calling code that is not.

So let's just remove that faulty code:

In [None]:
def TRUE(left):  
    def f(right):  
        return left  
    return f  

def FALSE(left):  
    def f(right):  
        return right  
    return f

And here we are. We can consider that, at this point, we have booleans.

Not convinced? Remember the piece of advice at the beginning. What we have constructed is functions that have the behaviour of booleans.
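
If it helps, here is one way to "peek" at that behaviour. The `to_bool` helper is an assumption of mine, not something from the talk, and it cheats by feeding Python's own `True` and `False` to our functions.

```python
def TRUE(left):
    def f(right):
        return left
    return f


def FALSE(left):
    def f(right):
        return right
    return f


def to_bool(b):
    # Hypothetical decoder: breaks rule-1 on purpose, for inspection only.
    return b(True)(False)


assert to_bool(TRUE) is True
assert to_bool(FALSE) is False

# The very same functions also act as a tiny if/else selector.
assert TRUE("then-branch")("else-branch") == "then-branch"
assert FALSE("then-branch")("else-branch") == "else-branch"
```

Nothing in `TRUE` or `FALSE` knows about "truth". They only select, and selecting is the behaviour we care about.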

### Boolean Operators

#### Logical Not

Let's start with the simplest boolean operator: `NOT`.

What we want here is easy to state:

```python
assert NOT(TRUE) == FALSE  
assert NOT(FALSE) == TRUE  
```

With our electrical levels, it translates to:

```python
assert NOT(TRUE)("5V")("GND") == "GND"  
#      ^^FALSE^^  
assert NOT(FALSE)("5V")("GND") == "5V"  
#      ^^^TRUE^^^  
```

We can notice a few interesting facts about the `NOT` operator:

 * `NOT` is a function (rule-1 OK).

 * `NOT` takes a single argument (rule-2 OK).

 * Both `TRUE` and `FALSE` are functions, so
 
   * `NOT` takes a "legal" function as argument.

   * `NOT` returns a "legal" function.

That's rule-3!

The `NOT` interface is well on track. It is now only a matter of finding a suitable implementation.

How to do this from this reasonable starting point:

```python
def NOT(f):  
    return ...  # The clever part goes here.  
```

?

We need to remember that `TRUE` returns the `left` argument (hint: on the left) and `FALSE` returns the `right` one (re-hint: on the right).

First, notice some ~~funny~~twisted, let's say... identities:

In [None]:
assert TRUE(FALSE)(TRUE) == FALSE  # Remember? TRUE returns left.
#             ^---------------^

If we write the same code, but using the named arguments, it becomes:

In [None]:
assert TRUE(left=FALSE)(right=TRUE) == FALSE  # TRUE returns left.
#                  ^---------------------^

Conversely:

In [None]:
assert FALSE(FALSE)(TRUE) == TRUE  # FALSE returns right.
#                     ^--------^

From this "finding", we can parametrize the very first function call of those "identities".

In [None]:
def NOT(f):
    return f(FALSE)(TRUE)  # Here, f is either TRUE or FALSE, that
    #        ^^^^^  ^^^^     will in turn choose left or right.
    #        left   right

`f` being `TRUE` will choose the left function, that is `FALSE`. And *vice versa*.

Is that OK for Python?

In [None]:
assert NOT(TRUE) is FALSE
assert NOT(FALSE) is TRUE
# `is` instead of `==` is OK here, as TRUE and FALSE are always the same objects.

# Let's get more "physical".
assert NOT(TRUE)("5V")("GND") == "GND"
assert NOT(FALSE)("5V")("GND") == "5V"

We have a `NOT` function/behaviour! \o/

#### Logical And

Similarly, we would like to define an `AND` operator that plays well with our `TRUE` and `FALSE` functions.

Like this:

```python
assert AND(TRUE)(TRUE) is TRUE  
assert AND(TRUE)(FALSE) is FALSE  
assert AND(FALSE)(TRUE) is FALSE  
assert AND(FALSE)(FALSE) is FALSE  
```

First remark, `AND` has to deal with 2 arguments. So it will have to look like this:

```python
def AND(x):  
    def f(y):  
        ...  
    return f  
```

I'll drop the hint immediately: the point here is to think about the "binary operator shortcuts" that usual programming languages make.

*Binary operator shortcut*?

Let me explain (the usual name for this is *short-circuit evaluation*): in C or JavaScript, when you write `x && y`, what happens depends on `x`. More specifically, if `x` is `false`, `y` is never evaluated, as `x && y` will be `false` anyway.

Conversely, when `x` is true in C-ish `x && y`, computation has to reach for `y` to find out the final value. There are 2 possibilities at this point: either `y` is `true` and `x && y` is `true`, or `y` is `false`, and so is `x && y`. It means the result **is** the `y` value.

Building the `AND` function will use a similar behaviour.
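
Python's own `and` operator takes the same shortcut, so we can observe it directly. This is plain Python, outside of our rules, and the `explode` function is just an illustrative tripwire of mine.

```python
def explode():
    raise RuntimeError("the right operand was evaluated")


# Left operand is falsy: the right one is never evaluated, nothing explodes.
assert (False and explode()) is False

# Left operand is truthy: the result *is* the right operand.
assert (True and False) is False
assert (True and True) is True
assert (True and "right operand") == "right operand"
```

So `x and y` gives back `x` when `x` is false, and `y` otherwise. Keep that shape in mind for what follows.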

If we recap, we want:

 * our `AND(x)(y)` function to return `FALSE` when `x` is `FALSE`. If you look carefully, it means two things:

    * if `x` is `FALSE` and we want to return `FALSE`, returning `x` is OK.

    * If we consider that `x` is the *left* argument of `AND(x)(y)` function, and `y` the *right* one, we want the *left* one (i.e. `FALSE`).

 * `AND(x)(y)` function to return `y` when `x` is `TRUE`. Which is the *right* argument.

How can we get the *left* or the *right* arguments? Do we have some functions to get them for us?

Hell yes we have! They are called `TRUE` and `FALSE`.

Say we'd like to get the right argument, we'd use the `FALSE` function:

In [None]:
assert FALSE("whatever")(FALSE) is FALSE  # Left-y part is bypassed by FALSE.
#                          ^---------^

And for the *left* argument, `TRUE` it will be:

In [None]:
assert TRUE(TRUE)("whatever") is TRUE  # Right-y part is bypassed by TRUE.
#             ^--------------------^

assert TRUE(FALSE)("whatever") is FALSE  # Right-y part is bypassed by TRUE.
#             ^--------------------^

From these examples and the "shortcut" mechanism we want to emulate, we can infer that our `x` (which is either `TRUE` or `FALSE`) should be called first, with our `y` as its *left* argument and the `FALSE` value as its *right* one.

Still there?

Bear with me. Let's write this down in Python:

In [None]:
def AND(x):
    def f(y):
        return x(y)(FALSE)
        #      ^ x being FALSE will choose FALSE, x being TRUE  
        #        will choose y (which in turn is TRUE or FALSE).

    return f

In [None]:
assert AND(TRUE)(TRUE) is TRUE
assert AND(TRUE)(FALSE) is FALSE
assert AND(FALSE)(TRUE) is FALSE
assert AND(FALSE)(FALSE) is FALSE

Nice.

Wait, we can do even better.

The `FALSE` value in the definition of `f` is selected only when `x` is `FALSE`. So we can replace it with `x` in the implementation, without changing the outcome:

In [None]:
def AND(x):
    def f(y):
        return x(y)(x)
        #           ^--- Selected only if x is FALSE, so it is always FALSE when selected.
    return f

In [None]:
assert AND(TRUE)(TRUE) is TRUE
assert AND(TRUE)(FALSE) is FALSE
assert AND(FALSE)(TRUE) is FALSE
assert AND(FALSE)(FALSE) is FALSE

#### Logical Or

Its implementation is similar to the one of `AND`. Intuitively, we use the same trick: we mimic the shortcut taken by the usual C-ish `x || y`. In this case, if `x` is `TRUE`, `y` does not need to be evaluated.

In [None]:
assert TRUE(TRUE)("anything") is TRUE

If `x` is `FALSE`, `x || y` value is up to `y`.

In [None]:
assert FALSE("anything")(TRUE) is TRUE
assert FALSE("anything")(FALSE) is FALSE

Possible implementation:

In [None]:
def OR(x):
    def f(y):
        return x(x)(y)

    return f

If you are lost, let me reformulate `OR` behaviour once again:

* If `x` is `TRUE`, left function (`x` as `TRUE`) is returned.
* If `x` is `FALSE`, let `y` on the right (that can be either `TRUE` or `FALSE`) decide.

In [None]:
assert OR(TRUE)(TRUE) is TRUE
assert OR(TRUE)(FALSE) is TRUE
assert OR(FALSE)(TRUE) is TRUE
assert OR(FALSE)(FALSE) is FALSE

We have boolean logic (well, boolean logic behaviour at least)!
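
As a sanity check, here is a sketch that tests De Morgan's laws on our functions, for the four possible combinations of arguments. The definitions are repeated so the snippet stands alone, and `itertools.product` is only used for the (off-the-rules) checking loop.

```python
from itertools import product


def TRUE(left):
    def f(right):
        return left
    return f


def FALSE(left):
    def f(right):
        return right
    return f


def NOT(f):
    return f(FALSE)(TRUE)


def AND(x):
    def f(y):
        return x(y)(x)
    return f


def OR(x):
    def f(y):
        return x(x)(y)
    return f


# De Morgan's laws, checked on the four possible (x, y) combinations.
for x, y in product((TRUE, FALSE), repeat=2):
    assert NOT(AND(x)(y)) is OR(NOT(x))(NOT(y))
    assert NOT(OR(x)(y)) is AND(NOT(x))(NOT(y))
```

If our three operators obey the same laws as "real" booleans, the behaviour argument gets a bit more convincing.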

### Some Python Syntax

Did you know that the initial rules we chose for this journey are leading us
towards [Lambda Calculus](https://en.wikipedia.org/wiki/Lambda_calculus)?

> Lambda calculus (also written as λ-calculus) is a formal system in
> mathematical logic for expressing computation based on function abstraction
> and application using variable binding and substitution. It is a universal
> model of computation that can be used to simulate any Turing machine.
>
> It was introduced by the mathematician Alonzo Church in the 1930s as part
> of his research into the foundations of mathematics.

If you've done a bit of Python, you have already made the link with
Python's `lambda` keyword.

It is a way to declare functions. More precisely, *anonymous functions*.

It means there are 2 ways of declaring functions in Python.

In [None]:
def add(a, b):
    return a + b

assert add(3, 5) == 8

or

In [None]:
lambda a, b: a + b

assert (lambda a, b: a + b)(3, 5) == 8

2 ways of declaring, but same usage and same result.

In the second case, using the function is not very convenient, so we give it a name with a usual Python assignment.

In [None]:
add = lambda a, b: a + b

assert add(3, 5) == 8

A Python `lambda` function definition is somewhat limited, as it is restricted to a single expression. But in our case, it is a convenient way to shorten our code.

Especially when you want only functions with a single argument, as

In [None]:
def add(a):
    def f(b):
        return a + b
    return f

becomes first:

In [None]:
def add(a):
    return lambda b: a + b

and finally:

In [None]:
add = lambda a: lambda b: a + b

Let's rewrite our previous work with this syntax:

In [None]:
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

NOT = lambda f: f(FALSE)(TRUE)

AND = lambda x: lambda y: x(y)(x)
OR = lambda x: lambda y: x(x)(y)

And ensure that all is working as expected:

In [None]:
assert TRUE("5V")("GND") == "5V"
assert FALSE("5V")("GND") == "GND"

assert NOT(TRUE) is FALSE
assert NOT(FALSE) is TRUE

assert NOT(TRUE)("5V")("GND") == "GND"
assert NOT(FALSE)("5V")("GND") == "5V"

assert AND(TRUE)(TRUE) is TRUE
assert AND(TRUE)(FALSE) is FALSE
assert AND(FALSE)(TRUE) is FALSE
assert AND(FALSE)(FALSE) is FALSE

assert OR(TRUE)(TRUE) is TRUE
assert OR(TRUE)(FALSE) is TRUE
assert OR(FALSE)(TRUE) is TRUE
assert OR(FALSE)(FALSE) is FALSE

Same result, shorter code.

It may not seem easier to read at first, but this is only the beginning of
our journey. You will get used to it rapidly.

We can now go on!

### Numbers

#### Numeration

> How can we "count" things with those rules?

Maybe another formulation can help us:

> What could we "count", that would help us to build "numbers"?

As the rules restrict us to functions, we could count *how many times a **function** is applied to **something***.

A bit surprisingly:

 * The **something**, to which a function is applied several times, is not really playing an important role. It will just be an argument of our *number functions* (because, remember, our numbers must be functions too). We will call this argument `x`, because it is much shorter than, for example, `dominique`.

 * Similarly, the exact **function** that is called several times is not the heart of this principle, so we'll abstract it by making it another argument of our number functions. It is usually called `f`.

Following this idea, we define functions that act ("behaviours"...) as numbers this way:

In [None]:
ONE = lambda f: lambda x: f(x)
TWO = lambda f: lambda x: f(f(x))
THREE = lambda f: lambda x: f(f(f(x)))

In `ONE`, `f` is applied to `x` once. In `TWO`, `f` is applied to `x` twice... you get the idea.

In a way, our numbers are a kind of API that takes 2 arguments: `f` and `x`. It may not make a lot of sense at the moment, but keeping this in mind will help later.
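
To make this "API" idea a bit more tangible, here is a sketch of two conversion helpers. Both `church` and `to_int` are hypothetical names of mine, and both cheat (Python ints, a `for` loop), so treat them as inspection tools rather than as part of the game.

```python
def church(k):
    """Hypothetical helper: build the number function for a Python int k >= 0."""
    def number(f):
        def apply_k_times(x):
            for _ in range(k):  # Apply f exactly k times.
                x = f(x)
            return x
        return apply_k_times
    return number


def to_int(n):
    """Hypothetical helper: count how many times n applies its f."""
    return n(lambda i: i + 1)(0)


THREE = lambda f: lambda x: f(f(f(x)))

assert to_int(THREE) == 3
assert to_int(church(3)) == 3
assert to_int(church(0)) == 0

# Any f and x will do: the number only decides how many times f runs.
repeat_ab = lambda s: s + "ab"
assert church(3)(repeat_ab)("") == THREE(repeat_ab)("") == "ababab"
```

The hand-written `THREE` and the generated `church(3)` are different Python objects, but they behave the same, and behaviour is what counts here.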

Just like booleans and signal levels, we can illustrate their expected behaviour by choosing an `f` and an `x` that break the rules.

Here is an example:

In [None]:
incr = lambda x: x + 1  # This will be our `f`.

assert ONE(incr)(0) == 1
assert TWO(incr)(0) == 2
assert THREE(incr)(0) == 3

Here is another one, with a different `f` function and another *initial `x` value*, so you are convinced `f` and `x` are not "central" to our number definitions:

In [None]:
concat = lambda x: "*" + x  # Another `f`.

assert ONE(concat)("") == "*"
assert TWO(concat)("") == "**"
assert THREE(concat)("") == "***"

#### Exploration

We have numbers, and as they are functions, we may want to try a few funny things.

How about this?

In [None]:
TWO(TWO)(incr)(0)

Multiplication?

Well, could have been, but nope.

In [None]:
TWO(THREE)(incr)(0)

Exponentiation it is!
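
Which one is the base and which one is the exponent? A short sketch, unfolding the calls by hand, suggests that the number being *called* is the exponent and its argument is the base.

```python
TWO = lambda f: lambda x: f(f(x))
THREE = lambda f: lambda x: f(f(f(x)))
incr = lambda x: x + 1

# TWO(THREE) means "apply THREE twice", i.e. THREE(THREE(f)).
# THREE(f) repeats f 3 times; repeating *that* 3 times gives 3 * 3 calls.
assert TWO(THREE)(incr)(0) == THREE(THREE(incr))(0) == 3**2

# Swapping the roles swaps base and exponent.
assert THREE(TWO)(incr)(0) == TWO(TWO(TWO(incr)))(0) == 2**3
```

`TWO(TWO)` could not tell us this, since $2^2 = 2 \times 2 = 2 + 2$.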

In [None]:
FOUR = lambda f: lambda x: f(f(f(f(x))))

assert FOUR(THREE)(incr)(0) == 3**4

With the same number definition, defining `ZERO` is a matter of never calling the `f` function:

In [None]:
ZERO = lambda f: lambda x: x

assert ZERO(incr)(0) == 0

#### Successor

We have numbers, but we need something more to really be able to count. We have to build a way to go from one of our numbers to another one.

The first link we want to build is the `SUCC` function, that finds the number that immediately follows a given one. You know, like *five* comes just after *four*.

Intuitively, it means our generic `f` function is called one additional time.

The function `SUCC` will take a number function as argument, we will call it `n`. This number is the kind we defined previously, with what I called its "API", that wants 2 arguments that are `f` and `x`.

Let's try this:

In [None]:
def SUCC(n):  
    #    ^---------- The number function whose successor is wanted  
    return lambda f: lambda x:   f(   n(f)(x)   )  
    #      ^^^^number API^^^^^   ^    ^^^^^^^ "old/previous number", that applies f n times  
    #                            |  
    #                    f applied once more  

In short

In [None]:
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))

If you are like me, this one starts to be a bit hairy. What did we do here?

1. `n`, our number-function whose successor we want, is applied to `f`. By definition, it applies its first API argument (`f`) `n` times.

2. It is applied to what? to `x`, the second argument of our numbers "API", also by definition.

3. then `f` is applied once more, so it means the result is the `n + 1` number function, as `f` was previously applied `n` times.

4. Lastly, we provide the necessary envelope of functions and parameters (i.e. all the leading `lambda` expressions), to bring `n`, `f`, and `x` into scope.

(and if it may comfort you, even as I'm writing this, it is not "obvious" to me, the way non-functional code can be).
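
The steps above can be replayed by hand on `TWO`, with `incr` and `0` as our usual off-the-rules `f` and `x`. The intermediate variable names are mine.

```python
TWO = lambda f: lambda x: f(f(x))
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
incr = lambda x: x + 1

after_n_calls = TWO(incr)(0)         # Steps 1 and 2: f applied n times to x.
one_more_call = incr(after_n_calls)  # Step 3: f applied once more.

assert after_n_calls == 2
assert one_more_call == 3

# Step 4: SUCC wraps exactly this computation in the number "API".
assert SUCC(TWO)(incr)(0) == one_more_call
```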

Does it work?

Let's check:

In [None]:
assert SUCC(TWO)(incr)(0) == THREE(incr)(0)
assert SUCC(SUCC(TWO))(incr)(0) == FOUR(incr)(0)

### Arithmetics

We have numbers, we can count, we want math operations!

#### The Easy Part

First, addition. Adding means taking the successor of a base number several times.

In [None]:
ADD = lambda a: lambda b: a(SUCC)(b)  

In other words, `SUCC` is applied `a` times on top of `b`.
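
Unrolled by hand, it looks like this. The snippet also shows `ADD_DIRECT`, an alternative formulation that is not from this notebook: it skips `SUCC` and chains the `f` calls directly, which I believe is the other common way to write addition on such numbers.

```python
ZERO = lambda f: lambda x: x
TWO = lambda f: lambda x: f(f(x))
THREE = lambda f: lambda x: f(f(f(x)))
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
ADD = lambda a: lambda b: a(SUCC)(b)
incr = lambda x: x + 1

# ADD(TWO)(THREE) is TWO(SUCC)(THREE), i.e. SUCC(SUCC(THREE)).
assert ADD(TWO)(THREE)(incr)(0) == SUCC(SUCC(THREE))(incr)(0) == 5

# Adding ZERO applies SUCC zero times, so b comes back untouched.
assert ADD(ZERO)(THREE) is THREE

# Alternative (assumption, not the notebook's definition):
# apply f b times, then a more times on top of the result.
ADD_DIRECT = lambda a: lambda b: lambda f: lambda x: a(f)(b(f)(x))
assert ADD_DIRECT(TWO)(THREE)(incr)(0) == 5
```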

In [None]:
assert ADD(TWO)(THREE)(incr)(0) == 5
assert ADD(TWO)(THREE)(incr)(0) == ADD(FOUR)(ONE)(incr)(0)

Nice :-)

We are going to leave subtraction for later, you will understand why.

Now, multiplication.

The main idea is to apply `f` `a` times, then to repeat that whole thing `b` times. No need for `SUCC` here.

In [None]:
MUL = lambda a: lambda b: lambda f: b(a(f))

In [None]:
assert MUL(FOUR)(THREE)(incr)(0) == 12

Awesome.
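
To see the "groups of `f`" at work, here is a sketch using a string-building `f`, so that every call leaves a visible trace. The `star` and `group_of_two` names are illustrative only.

```python
TWO = lambda f: lambda x: f(f(x))
THREE = lambda f: lambda x: f(f(f(x)))
MUL = lambda a: lambda b: lambda f: b(a(f))

star = lambda s: s + "*"

group_of_two = TWO(star)  # a(f): one "group", i.e. f applied a times.
assert group_of_two("") == "**"

# b(a(f)): the whole group is repeated b times.
assert THREE(group_of_two)("") == "******"
assert MUL(TWO)(THREE)(star)("") == "******"

# Swapping the operands gives the same count, as expected.
assert MUL(THREE)(TWO)(star)("") == MUL(TWO)(THREE)(star)("")
```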

Division is mathematically and functionally definable with basic steps,
so it is possible, but outside of the scope of this notebook.

#### The Less Easy Part, Subtraction

We have to make a detour before being able to build subtraction.

We will need...

### Data Structures

Our way to programming requires being able to aggregate information.

And the simplest way to start is to use the good, old lispy operators:

* `CONS` to combine two "values" into a *tuple* of 2 values, i.e. a *couple*.
* `CAR` to get the first value of the couple.
* `CDR` to get the second one.
 

With functions, the concept of "storage" is a bit convoluted. To stay compliant with our rules, we can consider that `lambda _: "whatever"` is a function that "stores" `"whatever"`.

(Small parenthesis: calling a function parameter `_` in Python tells everybody that we recognize a parameter should be there, because our rules require it, but that we do not care about it.)

Make `"whatever"` a compliant function and we are good.

To create a couple, we are going to use a trick similar to the one we used for booleans (see [part 1](#Boolean-Values)).

In [None]:
CONS = lambda a: lambda b: lambda s: s(a)(b)

Let's take simple values to begin with, and build our first couple:

In [None]:
p = CONS(2)(3)

You may not realize it, but we already have functions that allow selection.

Remember, we made electrical switches at the beginning of our journey, we even called them `TRUE` and `FALSE`. Those 2 functions can be used as the `s` argument we did not provide when we created the couple.

In [None]:
assert p(TRUE) == 2
assert p(FALSE) == 3

From there, we use them to implement the missing `CAR` and `CDR` functions, where the `c` argument is a couple like `p`:

In [None]:
CAR = lambda c: c(TRUE)
CDR = lambda c: c(FALSE)

A few assertions will ensure we are correct:

In [None]:
assert CAR(CONS(2)(3)) == 2
assert CDR(CONS(2)(3)) == 3

At top level, it seems to be limited to couples, i.e. aggregating only 2 values.

It is not, as composition gives us (linked) lists.

In [None]:
assert CAR(CDR(CONS(2)(CONS(3)(4)))) == 3

And couples of couples can make trees, and... well, you get it.
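
Here is a small sketch of both shapes, a tree and a list, still with plain integers as off-the-rules leaves. The `tree` and `chain` names are mine.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
CONS = lambda a: lambda b: lambda s: s(a)(b)
CAR = lambda c: c(TRUE)
CDR = lambda c: c(FALSE)

# A couple of couples: a tiny binary tree with four leaves.
tree = CONS(CONS(1)(2))(CONS(3)(4))
assert CAR(CAR(tree)) == 1
assert CDR(CAR(tree)) == 2
assert CAR(CDR(tree)) == 3
assert CDR(CDR(tree)) == 4

# Couples nested on the right: a linked list, walked with CDR.
chain = CONS(1)(CONS(2)(CONS(3)(4)))
assert CAR(chain) == 1
assert CAR(CDR(chain)) == 2
assert CAR(CDR(CDR(chain))) == 3
assert CDR(CDR(CDR(chain))) == 4
```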

### Predecessor

We are on the way to define what is required to build the subtraction operation.

To do this, we will use couples, composed of a number and its predecessor.

How do we find the predecessor?

Well, the trick is that we don't. We do this the other way around.

The couple is in fact a number and its successor, but in reverse order. Finding the successor is a solved problem, thanks to the `SUCC` function.

Let's define our couple:

In [None]:
COUPLE = lambda p: CONS(SUCC(CAR(p)))(CAR(p))

It is a data structure (thanks to `CONS`) composed of the successor of number `p` and the number `p` itself.

Or is it?

Well, not exactly.

Did you notice the 2 `CAR`s within `COUPLE` definition?

`p` here is not one of the *number functions* we previously defined.

If we are to apply `CAR` to `p`, `p` is supposed to be a `COUPLE`!

How are we supposed to build and use that?

First, we will use "alternative" numbers, that are `COUPLE`s, instead of previous *number-functions*.

Second, we are going to use what we already have to do so.

In this new scheme, here is what the *four* number is like:

In [None]:
FOUR_ = FOUR(COUPLE)(CONS(ZERO)(ZERO))

`CONS(ZERO)(ZERO)` is our initial value.

`COUPLE` applied to `CONS(ZERO)(ZERO)` is `CONS(ONE)(ZERO)`.

`COUPLE` applied to `CONS(ONE)(ZERO)` is `CONS(TWO)(ONE)`.

Do this 4 times, and you have `CONS(FOUR)(THREE)`.

And by definition, `FOUR(COUPLE)` applies `COUPLE` four times.
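
The successive couples can be printed out with a little decoder. `show` is a hypothetical helper of mine, and the plain Python loop merely stands in for what `FOUR(COUPLE)` does on its own.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
CONS = lambda a: lambda b: lambda s: s(a)(b)
CAR = lambda c: c(TRUE)
CDR = lambda c: c(FALSE)
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
COUPLE = lambda p: CONS(SUCC(CAR(p)))(CAR(p))
incr = lambda x: x + 1


def show(c):
    """Hypothetical decoder: turn a couple of number functions into Python ints."""
    return (CAR(c)(incr)(0), CDR(c)(incr)(0))


c = CONS(ZERO)(ZERO)
steps = [show(c)]
for _ in range(4):  # A plain Python loop stands in for FOUR(COUPLE).
    c = COUPLE(c)
    steps.append(show(c))

assert steps == [(0, 0), (1, 0), (2, 1), (3, 2), (4, 3)]
```

At every step the old first value slides into second position, so the second value always lags one behind.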

Wow, strange.

Does it even work?

In [None]:
assert CAR(FOUR_)(incr)(0) == 4
assert CDR(FOUR_)(incr)(0) == 3

It seems so.

We now have the tool to get the predecessor of any number.

Unconvinced?

Check again:

In [None]:
assert CAR(FOUR_)(incr)(0) == 4
assert CDR(FOUR_)(incr)(0) == 3  # <-- Looks a lot like 4's predecessor.

That leads us to this (far-from-obvious at first, at least for me) predecessor function for the `n` number:

In [None]:
PRED = lambda n: CDR(n(COUPLE)(CONS(ZERO)(ZERO)))

What is happening here?

1. We take the `n` number.

2. We build the `(n, n-1)` couple, by applying `COUPLE` `n` times to
   the `CONS(ZERO)(ZERO)` initial value.

3. And we only keep the second part of the resulting structure.

I'm not even kidding:

In [None]:
EXP = FOUR(THREE)  # Exponentiation, remember?
assert EXP(incr)(0) == 81
assert PRED(EXP)(incr)(0) == 80

Now, it is important to notice that the point of all this exploration is about what is possible, not its effectiveness. Which, in some way, is fortunate, because this will be even more true when building subtraction.

### Subtraction

When you have the `PRED` function, the definition of subtraction looks a lot like the addition one.

In [None]:
SUB = lambda x: lambda y: y(PRED)(x)

In [None]:
assert SUB(FOUR)(TWO)(incr)(0) == 2

Fantastic!

Highly inefficient, but possible. And fantastic!
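
One thing worth checking is what happens when we subtract too much. We have no negative numbers, and with the definitions above the predecessor of `ZERO` comes out as `ZERO`, so the result simply bottoms out. A sketch, with all definitions repeated to keep it standalone:

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
CONS = lambda a: lambda b: lambda s: s(a)(b)
CAR = lambda c: c(TRUE)
CDR = lambda c: c(FALSE)
ZERO = lambda f: lambda x: x
TWO = lambda f: lambda x: f(f(x))
FOUR = lambda f: lambda x: f(f(f(f(x))))
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
COUPLE = lambda p: CONS(SUCC(CAR(p)))(CAR(p))
PRED = lambda n: CDR(n(COUPLE)(CONS(ZERO)(ZERO)))
SUB = lambda x: lambda y: y(PRED)(x)
incr = lambda x: x + 1

assert SUB(FOUR)(TWO)(incr)(0) == 2

# COUPLE is applied zero times, so we get the second ZERO of the initial couple.
assert PRED(ZERO)(incr)(0) == 0

# Going "below zero" therefore stops at zero.
assert SUB(TWO)(FOUR)(incr)(0) == 0
```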

At this point, we have boolean logic, numbers and arithmetics...

We miss just a little something to be able to do some... (solemn music) programming.

Say you want to write a basic program. OK, you have booleans, numbers, operations, but you also need control flow.

So here come...

### Tests

In other words, conditions, i.e. some kind of if-clause, but with functions.

Now this will start to look like we can actually write a computer program!

Promising, isn't it?

The simplest test that comes to one's mind when dealing with numbers is testing for zero value.

What we want is a function that returns `FALSE` or `TRUE` when given a *function-number* `n`.

To do this, we have to remember that `ZERO(f)(x)` never calls `f`.

In [None]:
def f(_):
    raise Exception()
    
assert ZERO(f)(0) == 0

Conversely, any other number-function calls `f` at least once.

So `ZERO(f)("whatever")` will return `"whatever"`. Replace `"whatever"`
with `TRUE` and we have half of what we want.

In [None]:
assert ZERO(f)(TRUE) == TRUE

To get the other half, notice that whatever the (strictly positive) number of times the `f` function is called, it should always return `FALSE`.

If only we had a function like `lambda _: FALSE`, that always returns `FALSE`, to be used instead of `f`...

Oh wait!

We got it now:

In [None]:
IS_ZERO = lambda n: n(lambda _: FALSE)(TRUE)

In [None]:
assert IS_ZERO(ZERO) == TRUE
assert IS_ZERO(ONE) == FALSE
assert IS_ZERO(THREE) == FALSE

Tada! 🎉
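
Since `IS_ZERO` returns one of our booleans, and our booleans are selectors, the result can be used directly as an if/else. A sketch, with strings as off-the-rules branches and a `describe` name of my own:

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
ZERO = lambda f: lambda x: x
THREE = lambda f: lambda x: f(f(f(x)))
IS_ZERO = lambda n: n(lambda _: FALSE)(TRUE)

# IS_ZERO(n) picks the first branch for zero, the second one otherwise.
describe = lambda n: IS_ZERO(n)("zero")("not zero")

assert describe(ZERO) == "zero"
assert describe(THREE) == "not zero"
```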

## At last, some code

### Factorial!

Factorial is a mathematical operation that is easy to implement in a recursive way.

$$ n! = n(n-1)(n-2)\cdots (2)(1) $$

In "traditional" Python, we could write:

In [None]:
def fact(n):
    if n == 0:
        return 1
    else:
        return n * fact(n - 1)

In [None]:
assert fact(3) == 6

Indeed, $3! = 6$.

Good, good.

Now, we translate exactly this code into our basic functions:

In [None]:
FACT = lambda n: IS_ZERO(n) (ONE) (MUL(n)(FACT(PRED(n))))  
#                    if TRUE ^^^   ^^^^^^^^^^^^^^^^^^^^^ if FALSE

It is almost a part-for-part translation.

If you are still reading at this point, you may find it somewhat elegant.

Does it work?

Well... in theory, yes.

Of course, that means no, and we are going to explain why and how to make it work.

But let me state first that our code is correct.

It does not work the way it is intended because of the Python interpreter.

Look:

In [None]:
try:
    FACT(THREE)
except RecursionError:
    print("Oh no! :lemming_emoji:")

#### Problem

`RecursionError`! Python cannot have an infinite depth of function calls. The usual limit is around 1000 calls.

OK, but why does it trigger so many recursive calls?

Because Python is an **eager** language: it evaluates all expressions before use, even when they are not actually used.

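It makes sense intuitively: Python evaluates the value of each argument of a function before calling the function. Even when the corresponding parameter is not used inside the function. Here, `MUL(n)(FACT(PRED(n)))` is evaluated before `IS_ZERO(n)` gets a chance to choose, even when `n` is `ZERO`, so the recursion never stops.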
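
This eagerness is easy to observe. In the sketch below, `noisy` is an illustrative function of mine that records every call it receives.

```python
calls = []


def noisy(label):
    calls.append(label)  # Leave a trace each time we are evaluated.
    return label


TRUE = lambda x: lambda y: x

# TRUE only ever returns its left argument...
result = TRUE(noisy("left"))(noisy("right"))
assert result == "left"

# ...yet Python evaluated both arguments before TRUE could choose.
assert calls == ["left", "right"]
```

Replace `noisy("right")` with a recursive call and you get our runaway `FACT`.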

The vast majority of industry programming languages are eager. So Python is not alone with this "limitation".

Alternatively, non-eager languages are called **lazy**. Some of them, like Haskell, are lazy by default and still let you force eager evaluation where you need it.

The problem is identified. We now have to look for a solution.

#### Solution

We somehow have to find a way to make Python lazy.

And guess what?

This can be achieved with functions.

*Moar lambdaaaaaaas!*

Just remember that, starting at this point, we are going to introduce changes that break the initial rules we gave ourselves.

But we are doing so because of Python, not because we made mistakes.

In a way, we are switching from the Computer Science (with nice theories) domain to the Software Engineering one (reality can be sometimes unpleasant). Compromises...

To fix the problem, we introduce explicitly lazy versions of `TRUE` and `FALSE` functions:

In [None]:
LAZY_TRUE = lambda x: lambda y: x()
#                                ^^ This is not allowed by our initial rules.
LAZY_FALSE = lambda x: lambda y: y()
#                                 ^^ Same here.

Then we replace the eager versions in our good, old `IS_ZERO` test function.

In [None]:
IS_ZERO = lambda n: n(lambda _: LAZY_FALSE) (LAZY_TRUE)  
#                           here^^^^^^^^^^and^^^^^^^^^there 

Let's define `FACT` again with this new code.

In [None]:
FACT = lambda n: IS_ZERO(n)(lambda: ONE)(lambda: MUL(n)(FACT(PRED(n))))

And...

In [None]:
assert FACT(THREE)(incr)(0) == 6

So pleasing.
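
Why does wrapping the branches in `lambda:` help? Because an argument-less function is just a value until somebody calls it. A sketch, where `noisy` is again an illustrative call recorder of mine:

```python
calls = []


def noisy(label):
    calls.append(label)  # Leave a trace each time we are evaluated.
    return label


LAZY_TRUE = lambda x: lambda y: x()
LAZY_FALSE = lambda x: lambda y: y()

# Both branches are passed wrapped, but only the chosen one is unwrapped.
assert LAZY_TRUE(lambda: noisy("left"))(lambda: noisy("right")) == "left"
assert calls == ["left"]

assert LAZY_FALSE(lambda: noisy("left"))(lambda: noisy("right")) == "right"
assert calls == ["left", "right"]
```

In `FACT`, the branch that is never unwrapped is the recursive call, which is exactly what stops the runaway recursion at zero.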

We actually succeeded at writing and executing a program only made of single-argument functions.

But in fact, we did not.

### We Cheated

What?

Nooooooooo.

What happened?

We defined **names**.

We use `FACT` to implement `FACT`.

In [None]:
FACT = lambda n: IS_ZERO(n) (ONE) (MUL(n)(FACT(PRED(n))))
#                                         ^^^^ here

*THAT* is against the rules.

What are we going to do?

## No References

What we want to achieve is to find out how to define `FACT` without using the `FACT` name.

Our problem is that this name is not a function argument, but a reference to an external pre-existing function (a kind of global). Cheating, I told you.

First, a bit of exploration.

Remember, usual `factorial` can be written like this:

In [None]:
fact = lambda n: 1 if n == 0 else n * fact(n - 1)

In [None]:
assert fact(5) == 120

Removing a reference name can start by making it an (additional function) argument:

In [None]:
fact = (lambda f: lambda n: 1 if n == 0 else n * f(n - 1))(fact)
#              ^                                 ^         ^^^^
#              |--------------->-----------------|           |
#              |---------------<-----------------------------|

In [None]:
assert fact(5) == 120

Nothing really fancy here.

At this point, we are still using the `fact` name (as an argument).

Next step, try substitution, i.e. we replace `fact` by its implementation:

In [None]:
fact = (lambda f: lambda n: 1 if n == 0 else n * f(n - 1))(
    #   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    #                       |        ^
    #                       Get this |
    #                       and
    #                       put it here |
    #                       |           v
    # vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    lambda f: lambda n: 1 if n == 0 else n * f(n - 1)
)

There, no more `fact` argument. How great is that?

Well, it is great if it works.

In [None]:
assert fact(0) == 1

is a good start.

Unfortunately, the next assertion fails with a `TypeError`:

In [None]:
assert fact(1) == 1

Can you spot the problem?

In the outer `n * f(n - 1)`, `n` is an integer (`int`), and it is
multiplied by `f(n - 1)`, which should evaluate to another `int`.

Except `f(n - 1)` is a function, not an `int`.

Here, the value of the `f` argument is the whole substituted term,
which takes 2 arguments (`f`, then `n`).
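
A quick check makes the problem visible. This is only a sketch: `substituted` is a hypothetical name for the substituted term, introduced so that we can inspect it.

```python
# The term we passed as `f` (named here only to look at it).
substituted = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

# Inside fact(1), the recursive call is f(0), i.e. substituted(0):
# the 0 lands in the `f` parameter, and we get back the inner `lambda n`.
partial = substituted(0)
returns_function = callable(partial)

try:
    1 * partial  # what `n * f(n - 1)` tries to compute for n == 1
    outcome = "ok"
except TypeError:
    outcome = "TypeError"
```

So the recursive call hands back a function that is still waiting for its `n`, and multiplying an `int` by it cannot work.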

To fix this, both arguments have to be provided.

Remember, the first argument is the function we want to recurse with.
Which is `f` itself.

New attempt:

In [None]:
fact = (lambda f: lambda n: 1 if n == 0 else n * f(f)(n - 1))(
    #                                            ^^^^ here...
    lambda f: lambda n: 1 if n == 0 else n * f(f)(n - 1)
    #                                        ^^^^ ... and there.
)

In [None]:
assert fact(0) == 1
assert fact(5) == 120

This definition is not really beautiful with all these repetitions, but please do not ignore the rejoicing fact that we have now completely dropped references to any pre-existing name.

What do I mean?

All names can be eliminated from the computation!

In [None]:
assert (lambda f: lambda n: 1 if n == 0 else n * f(f)(n - 1))(
    lambda f: lambda n: 1 if n == 0 else n * f(f)(n - 1)
)(5) == 120

See? No `fact` in there. Recursive definition, but no names. How crazy is that?

Crazy, yes.

And how elegant is that?

Maybe less so... it looks like there is a bit of duplicated code.
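
To make the duplication visible, here is a sketch that temporarily gives the repeated term a name. The names `half` and `fact_named` are hypothetical, and naming things is of course against our rules: this is only a reading aid.

```python
# The term that appears twice in the anonymous definition.
half = lambda f: lambda n: 1 if n == 0 else n * f(f)(n - 1)

# The anonymous definition is exactly "apply `half` to itself".
fact_named = half(half)

# How fact_named(2) unfolds:
#   half(half)(2)
#   -> 2 * half(half)(1)
#   -> 2 * (1 * half(half)(0))
#   -> 2 * (1 * 1)
result = fact_named(2)
```

Each recursive step rebuilds the function by applying the term to itself again, which is why the term has to be written twice.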

Could we go further and eliminate the repetition?

Yes.

How?

By...

### Abstracting the Recursion

Consider our previous Python factorial definition:

In [None]:
fact = (lambda f: lambda n: 1 if n == 0 else n * f(n - 1))(fact)

Wouldn't it be nice to separate the ~~business~~calculation part from the we-deal-with-recursion part?

Taking the "core" part first, something like:

In [None]:
# fact = (lambda f: lambda n: 1 if n == 0 else n * f(n - 1))(fact)  
R =       lambda f: lambda n: 1 if n == 0 else n * f(n - 1) 

on the one hand, and making it recursive like this:

In [None]:
# fact = (lambda f: lambda n: 1 if n == 0 else n * f(n - 1))(fact)  
# fact = (                       R                         )(fact)  # That condenses to:  
fact = R(fact)  

on the other hand.

Could it be that simple?

Well no.

Yes, this works:

In [None]:
assert fact(5) == 120

But we're cheating again.

```python
fact = R(fact)
```

is valid Python only if `fact` is defined beforehand.
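
A small sketch shows what happens with a name that is not bound yet. `fresh_fact` is a hypothetical name, chosen because nothing defines it.

```python
R = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

try:
    # The right-hand side is evaluated first, before `fresh_fact` exists.
    fresh_fact = R(fresh_fact)
    outcome = "ok"
except NameError:
    outcome = "NameError"
```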

References...

But the next idea starts from this attempt.

### Fixed Point

If you are still there, congratulations.

My tone may be playful and cheerful, but most of the previous steps were not (and for some are still not) obvious to me. We came a long way.

The end of this journey is arriving soon. But this last step is not the simplest one.

So bear with me, the end is worth it.

(Note that I could be lying, but you won't be sure unless you read on, so at this point, why not trust me? Did I ever deceive you?)

It is time to talk about the *fixed-point*.

This is an official (read: *mathematical*) definition:

> "`x` is a fixed point of `f`" means `f(x) == x` (and therefore `f(f(x)) == f(x) == x`).

For example, consider $\sqrt{1} = 1$: 1 is a fixed point of the square root function.

Under this definition, `fact = R(fact)` means that `fact` is a fixed point of `R`.
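
Python cannot compare two functions for mathematical equality, so the following is only a sketch: it checks the fixed-point claims on a few sample inputs. `known_fact` is a hypothetical stand-in for "a factorial we already have", borrowed from `math.factorial`.

```python
import math

# Numbers: 0 and 1 are fixed points of the square root function.
assert math.sqrt(0) == 0
assert math.sqrt(1) == 1

R = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

# Functions: feeding a factorial to R gives back a function that
# behaves like factorial again (checked on sample inputs only).
known_fact = math.factorial
through_r = R(known_fact)
same_on_samples = all(through_r(n) == known_fact(n) for n in range(10))
```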

PSA: The next steps will look like code, but they are not actual, runnable Python.

We will use it though, as it is a good way to illustrate the reasoning.

Take a deep breath and let's go.

We will deal with a function called `Y`, that has a ~~unique~~interesting property: we suppose that `Y(R)` returns a fixed point of `R`.

That implies, per fixed-point definition, that

```python
        Y(R) = R(Y(R))
```

Applying `R`, i.e. writing `R(...)`, is the same as writing `(lambda f: R(f))(...)`.
So we can write it as:

```python
        Y(R) = (lambda f: R(f))(Y(R))  
```

We now want to eliminate what is a recursive call to `Y(R)`, so we use the same trick
as in our `fact` implementation, i.e. we repeat the function definition instead of its
name:


```python
        Y(R) = (lambda f: R(f))(lambda f: R(f)) 
```

Remember that when we did this with `fact`, it failed miserably, because the repeated term was not a single-argument function but a 2-argument one, whose first argument is the term itself. We solved this by "doubling the call", i.e. using `f(f)` instead of just `f` at both call sites.

Doing the same, we get:

```python
        Y(R) = (lambda f: R(f(f)))(lambda f: R(f(f)))
```

Next step, we want `R` to become an argument. We do this by wrapping everything in an enclosing `lambda` with an `r` parameter, which receives the value of `R`:

```python
        Y(R) = ( lambda r: (lambda f: r(f(f)))(lambda f: r(f(f))) )(R)
```

Finally, because `R` is now the one and only argument on both sides of the assignment, we can drop it without changing the definition of `Y`:

In [None]:
Y = lambda r: (lambda f: r(f(f)))(lambda f: r(f(f)))

And...

Please welcome the **Y combinator**, invention (or discovery?) of Haskell B. Curry!

It is usually written with different lambda parameter names (`f` -> `x` and `r` -> `f`), i.e. same formula, different letters:

In [None]:
Y = lambda f: (lambda x: f(x(x)))(lambda x: f(x(x)))  

Oh, if you are curious: a *combinator* is a function with no free variables.
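
As a sanity check, the fixed-point property we assumed at the start of the derivation can be recovered from the final formula with one substitution step. It is written here in lambda calculus notation, where $\lambda x.\,M$ corresponds to Python's `lambda x: M`, and equality is understood up to such substitution steps:

$$
\begin{aligned}
Y\,R &= (\lambda x.\, R\,(x\,x))\,(\lambda x.\, R\,(x\,x)) \\
     &\to R\,\big((\lambda x.\, R\,(x\,x))\,(\lambda x.\, R\,(x\,x))\big) \\
     &= R\,(Y\,R)
\end{aligned}
$$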

Now we can play with it.

If we choose to define:

In [None]:
R = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

Then `fact = Y(R)` should be sufficient to get our factorial function back.

Theoretically, it is.

In ~~practice~~Python, it is not.

In [None]:
try:
    fact = Y(R)
except RecursionError:
    pass

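Bitten by Python's eager evaluation again: the argument `x(x)` is evaluated before `f` is ever called, and evaluating it triggers the very same evaluation, until the stack overflows.

Previously, we wrapped some intermediate function calls in `lambda`s.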

We will do the same here to let Python give us something useful.

Let it be clear that this is only a Python language and runtime ~~limitation~~requirement.

More specifically:

In [None]:
# Y=lambda f: (lambda x: f(          x(x)   ))(lambda x: f(          x(x)   ))  # <-- previous
Y = lambda f: (lambda x: f(lambda z: x(x)(z)))(lambda x: f(lambda z: x(x)(z)))  # <-- adapted

In [None]:
fact = Y(R)

In [None]:
assert fact(5) == 120
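
If the one-liner is hard to parse, here is a sketch of the same adapted combinator spelled with named inner functions. The names `fix`, `step`, `self_apply` and `fact_step` are hypothetical. The variant with the extra `lambda z` is often called the Z combinator.

```python
def fix(step):
    def self_apply(x):
        # `lambda z: x(x)(z)` postpones the self-application `x(x)`
        # until the recursive function is actually called with `z`.
        return step(lambda z: x(x)(z))

    return self_apply(self_apply)


calls = []


def fact_step(recurse):
    def fact_body(n):
        calls.append(n)  # record each value the recursion visits
        return 1 if n == 0 else n * recurse(n - 1)

    return fact_body


fact_via_fix = fix(fact_step)
result = fact_via_fix(4)
```

The `calls` list is only there to show that the recursion visits each value once, down to the base case.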

Does our "recursion" abstraction work for anything other than factorial?

The answer is yes!

Here is Fibonacci:

In [None]:
fib = Y(lambda f: lambda n: 1 if n <= 2 else f(n - 1) + f(n - 2))
assert fib(10) == 55
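
The same `Y` should also handle curried functions of several arguments. As a sketch, here is Euclid's greatest common divisor. The name `gcd` and the example itself are only an illustration.

```python
# Same adapted (eager-friendly) combinator as above, repeated so the
# snippet runs on its own.
Y = lambda f: (lambda x: f(lambda z: x(x)(z)))(lambda x: f(lambda z: x(x)(z)))

# Curried two-argument recursion: gcd(a)(b).
gcd = Y(lambda f: lambda a: lambda b: a if b == 0 else f(b)(a % b))

result = gcd(48)(18)
```

The recursive call `f(b)` returns a function that is still waiting for its second argument, so `f(b)(a % b)` plays the role of a two-argument call.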

Here it is. My mind is more than bent. My head exploded (well, figuratively, of course).

What about yours?

## Conclusion


We have come to the end of this journey through some ~~assembly~~fundamentals of functional programming.

I don't have any idea of how I could make this useful on a day-to-day basis. **UPDATE**: I've been bitten by Python's eagerness in some production code and used `lambda`s to properly deal with it.

I cannot even pretend to make sense of all of this.

But I went much farther than the last time I tried, and I'm convinced it will make me a better programmer.

And if I'm wrong, at least perhaps you found this interesting, or even educational.

### Links

 * The video I used as base for this notebook: https://www.youtube.com/watch?v=5C6sv7-eTKg

 * Do you prefer JavaScript to Python? Try https://lucasfcosta.com/2018/05/20/Y-The-Most-Beautiful-Idea-in-Computer-Science.html
        
 * *Y-combinator*, in Python:
 
   * https://lptk.github.io/programming/2019/10/15/simple-essence-y-combinator.html
   * https://david.ae/posts/the-z-and-y-combinators-in-python/
  
 * Previous derivations and more, still in Python: https://matt.might.net/articles/python-church-y-combinator/, with a fun realization:
 
   > This post is a proof that the indentation-sensitive constructs in Python are strictly optional