Playing around with different Y Combinator implementations
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
c++
scheme
.gitignore
README.md

README.md

What is the Y-Combinator?

Let's start with writing a simple recursive factorial function:

(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))

This can be written in a more explicit style:

(define factorial
  (lambda (n)
    (if (= n 0)
        1
        (* n (factorial (- n 1))))))

Notice that when defining the factorial function we make a recursive call. We will call this kind of definition an explicit recursive definition. You will see soon what we call an implicit recursive definition (spoiler: a recursive function which is generated through non-recursive means).

Goal - Eliminating explicit recursion

Can we still implement the factorial function if we are restricted from making recursive calls? The answer is yes and this will lead us directly to the Y-Combinator.

What is the Y-Combinator and what does it do?

The Y-Combinator is a higher-order function which takes a single argument - a non-recursive function and returns a version of that function which is recursive.

More generally, Y-Combinator gives us a way to get recursion in a programming language that supports first-class functions but that doesn't have recursion built into it.

Even though we ofter refer to Y as the Y-Combinator, in fact there are an infinite number of Y-Combinators. We will only be concerned with two of them, one lazy and one strict.

Note: Scheme is a dynamic and strongly typed programming language.

What is a combinator?

A combinator is just a lambda expression with no free variables. Examples:

(lambda (x) x)
(lambda (x) (lambda (y) x))
(x (lambda (y) y))  ; <-- not a combinator, it's a function application

So, is our previous definition of factorial a combinator? Let's look at it again:

(define factorial
  (lambda (n)
    (if (= n 0)
        1
        (* n (factorial (- n 1))))))

The answer is no, because the factorial name in its definition (the recursive call) is a free variable - it doesn't appear as a formal argument of the lambda expression.

Back to the puzzle

What we want to do is come up with a version of this that does the same thing but doesn't have the recursive call.

It would be nice if you could save all of the function except for the offending recursive call and put something else there. That might look like this:

(define sort-of-factorial
  (lambda (n)
    (if (= n 0)
        1
        (* n (<???> (- n 1))))))

It's a tried-and-true principle of functional programming that if you don't know exactly what you want to put somewhere in a piece of code, just abstract it out and make it a parameter of a function.

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1))))))

Notice that this trick is not something specific to the factorial function. You can do the same thing with any recursive function. Let's take, for example, the fibonacci function:

(define fibonacci
  (lambda (n)
    (cond ((= n 0) 0)
          ((= n 1) 1)
          (else (+ (fibonacci (- n 1)) (fibonacci (- n 2)))))))

(define almost-fibonacci
  (lambda (f)
    (lambda (n)
      (cond ((= n 0) 0)
            ((= n 1) 1)
            (else (+ (f (- n 1)) (f (- n 2))))))))

So the Y-Combinator will give us recursion wherever we need it as long as we have the appropriate almost- function available.

Now, let's define the identity function:

(define identity (lambda (x) x))

What happens if we define the factorial (named factorial0 - you will see why) like this:

(define factorial0 (almost-factorial identity))

The interesting thing is that factorial0 will correctly compute the factorial for all natural numbers up to and including zero (you'll see soon why we express it this way).

Now, what if we do this:

(define factorial1 (almost-factorial factorial0))

which is equivalent to:

(define factorial1
  (almost-factorial
    (almost-factorial identity)))

Yes, that correctly computes the factorial for all natural numbers up to and including 1. You can do this ad infinitum.

(define factorial
  (almost-factorial
    (almost-factorial
      (almost-factorial
        (almost-factorial
          ...))))...)

And that would indeed give us the factorial function we need. One way to look at this is that almost-factorial takes in a crappy factorial function and outputs a factorial function that is slightly less crappy, in that it will handle exactly one extra value of the input correctly.

What we have shown is that if we could define an infinite chain of almost-factorials, that would give us the factorial function. Another way of saying this is that the factorial function is the fixpoint of almost-factorial.

Fixpoints of functions

The fixpoint of a function f is a point x such that f(x) = x. That is, you give the function a value and take back the same thing.

Fixpoints don't have to be real numbers. They can be any type of first-class citizens as long as the function that generates them can take as input the same thing (read type) as it outputs.

If you have a higher-order function like almost-factorial that takes as its input a function

(define almost-factorial
  (lambda (f) ; <-- This lambda here
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1))))))

and produces another function with the same type

(define almost-factorial
  (lambda (f)
    (lambda (n)                ; \
      (if (= n 0)              ;  \ This
          1                    ;  / lambda
          (* n (f (- n 1)))))) ; /

then it is possible to compute its fixpoint, which will obviously be a function with the same type.

fixpoint-function = (almost-factorial fixpoint-function) ; <-- Think x = f(x)

By repeatedly substituting the right-hand side of this equation into the fixpoint-function on the right, we get:

fixpoint-function =
    (almost-factorial
      (almost-factorial fixpoint-function))

    = (almost-factorial
        (almost-factorial
          (almost-factorial fixpoint-function)))

    = ...

    = (almost-factorial
        (almost-factorial
          (almost-factorial
            (almost-factorial
              (almost-factorial
                ...)))))

As we saw above, this will be the factorial function we want. Thus, the fixpoint of almost-factorial will be the factorial function.

That's well and good, but just knowing that factorial is the fixpoint of almost-factorial doesn't tell us how to compute it. Wouldn't it be nice if there was some magical higher-order function that would take as its input a function like almost-factorial and would output its fixpoint function?

That function is exactly the Y-Combinator. Let's derive it.

Eliminating (most) explicit recursion (lazy version)

Let's start by specifying what Y does:

Y f = fixpoint-of-f

What do we know about the fixpoint of f? We know that

f fixpoint-of-f = fixpoint-of-f

by the definition of what a fixpoint of a function is. Therefore, we have:

Y f = fixpoint-of-f = f fixpoint-of-of

and we can substitute further and get:

Y f = f (Y f)

And that's it. It we want it to be expressed as a Scheme function, we would have to write it like this:

(define Y
  (lambda (f)
    (f (Y f))))

There are 2 problems with this approach:

  • It will only work in a language with lazy evaluation
  • It is not a combinator - we still have a recursive call

Nevertheless, if you're using lazy Scheme, you can indeed define factorials like this:

(define Y
  (lambda (f)
    (f (Y f))))

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define factorial (Y almost-factorial))

Eliminating (most) explicit recursion (strict version)

For programming languages with eager evaluation, we could use a trick to delay the evaluation of the recursive part of the Y definition from above. And the solution in Scheme is the following definition of the Y-Combinator:

(define Y
  (lambda (f)
    (f (lambda (x) ((Y f) x)))))

Unfortunately this doesn't work (?) with functions with more than one argument. There must be an implementation for each number of arguments:

; For two arguments functions
(define Y2
  (lambda (f)
    (f (lambda (x y) ((Y2 f) x y)))))

; For three arguments functions
(define Y3
  (lambda (f)
    (f (lambda (x y z) ((Y3 f) x y z)))))

; ... and so on

Actually deriving the Y-Combinator

The lazy (normal order)

At this point we want to define not just Y, but a Y combinator.

A way to think about what we want to achieve is that you should be able to replace the name of a combinator with its definition everywhere it's found and have everything still working (that's not possible with the recursive version we have so far).

Getting back to our original recursive definition of the factorial function:

(define (factorial n)
    (if (= n 0)
        1
        (* n (factorial (- n 1)))))

another way (other than the previous almost-factorial approach) of achieving recursion without explicitly calling factorial in its definition is to pass itself as an argument:

(define (part-factorial self)
  (lambda n)
    (if (= n 0)
        1
        (* n ((self self) (- n 1)))))

Now, the only thing we have to do is define the factorial function:

(define factorial (part-factorial part-factorial))
(factorial 5) ; Gives 120

Notice that we've already defined a version of the factorial function without using explicit recursion anywhere! This is a crucial step.

Let's try to get back something like our almost-factorial function by pulling out the (self self) call using a let expression:

(define (part-factorial self)
  (let ((f (self self)))
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define factorial (part-factorial part-factorial))
(factorial 5) ; Gives 120

This works fine in a lazy language. In a strict language, the (self self) will send us into an infinite loop. Notice that we didn't have this problem with the previous version, that's because the call was wrapped inside a lambda definition, while here it is a let binding.

Note that in a lazy language, the (self self) call in the let binding will never be evaluated unless f is actually needed (when n is other than 0).

It turns out that any let expression can be converted into an equivalent lambda expression:

(let ((x <expr1>)) <expr2>) <==> ((lambda (x) <expr2>) <expr1>)

This leads us to:

(define (part-factorial self)
  ((lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1))))))
    (self self)))

If you look closely, you'll see that we have our old friend, the almost-factorial function embedded inside the part-factorial function. Let's pull it outside:

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define (part-factorial self)
  (almost-factorial
    (self self)))

(define factorial (part-factorial part-factorial))
(factorial 5) ; Gives 120

We want to get rid of the part-factorial function, so first let's rewrite it:

(define part-factorial
  (lambda (self)
    (almost-factorial (self self))))

Then, we can substitute it directly into factorial, using a let binding:

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define factorial
  (let ((part-factorial (lambda (self) (almost-factorial (self self)))))
    (part-factorial part-factorial)))

(factorial 5) ; Gives 120

We can rewrite the factorial function more concisely by renaming part-factorial:

(define factorial
  (let ((x (lambda (self) (almost-factorial (self self)))))
    (x x)))

Now let's use the same let - lambda equivalence:

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define factorial
  ((lambda (x) (x x))
   (lambda (self)
     (almost-factorial (self self)))))

(factorial 5) ; Gives 120

To get this definition more concise, we can rename self to x (they won't conflict as they are in different scopes):

(define factorial
  ((lambda (x) (x x))
   (lambda (x)
     (almost-factorial (x x)))))

This works fine, but it's too specific to the factorial function. Let's change it to a generic make-recursive function that makes recursive functions from non-recursive ones (sounds familiar?):

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define (make-recursive f)
  ((lambda (x) (x x))
   (lambda (x) (f (x x)))))

(define factorial (make-recursive almost-factorial))
(factorial 5) ; Gives 120

The make-recursive is in fact our Y-Combinator, also know as normal-order Y combinator.

(define almost-factorial
  (lambda (f)
    (lambda (n)
      (if (= n 0)
          1
          (* n (f (- n 1)))))))

(define (Y f)
  ((lambda (x) (x x))
   (lambda (x) (f (x x)))))

(define factorial (Y almost-factorial))

Let's expand it a little bit more:

; Notice the similiarity between this Scheme definition and the lambda-calculus
; syntax: https://upload.wikimedia.org/math/9/0/a/90a21d15fde5ec13d1379791fa4a6548.png
(define Y
  (lambda (f)
    ((lambda (x) (x x))
     (lambda (x) (f (x x))))))

and now let's apply the inner lambda on its argument - the other inner lambda:

(define Y
  (lambda (f)
    ((lambda (x) (f (x x)))
     (lambda (x) (f (x x))))))

What this means is that, for a given function f (which is a non-recursive function like almost-factorial), the corresponding recursive function can be obtained first by computing

(lambda (x) (f (x x)))

and then applying this lambda expression to itself.

The strict (applicative-order) Y-Combinator

As we did before, the only trick here is to wrap the (x x) call into another lambda function. The definition of the applicative-order Y-Combinator becomes:

(define Y
  (lambda (f)
    ((lambda (x) (f (lambda (y) ((x x) y))))
     (lambda (x) (f (lambda (y) ((x x) y)))))))

(define factorial (Y almost-factorial))

As before, we need to add new implementations for each number of arguments needed.