# Introduction

This notebook explores the dual concepts of data and codata, and of recursion and corecursion, and how they can be applied to common programming tasks.

**Data** represents finite values. A value $x$ of a data type $T$ is built by finitely many applications of the _constructors_ of $T$. For example, any finite stack is made by creating an empty stack and then pushing a finite number of values.

**Codata** (usually) represents infinite values. A codata type is defined by specifying a set of _destructor_ functions. A destructor is a function that extracts some part of the possibly infinite structure (like `head` and `tail` for a stream).
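
The contrast can be sketched in a few lines of Python. All names here (`empty`, `push`, `first`, `rest`, `naturals`) are hypothetical and chosen only for illustration; a stream is modelled, as an assumption of this sketch, by a zero-argument function that returns a fresh iterator.

```python
from itertools import count, islice

# Data: a finite stack, built by finitely many constructor applications.
def empty():
    return ()

def push(x, stack):
    return (x, stack)

stack = push(3, push(2, push(1, empty())))

# Codata: an infinite stream, known only through its destructors.
def first(stream):
    """Destructor: the first element of the stream."""
    return next(stream())

def rest(stream):
    """Destructor: the stream without its first element."""
    return lambda: islice(stream(), 1, None)

naturals = lambda: count(1)   # 1, 2, 3, ...

print(stack)                        # the whole value can be inspected
print(first(rest(rest(naturals))))  # only finitely many parts can be observed
```

The stack can be printed in full because it is finite, whereas the stream can only be observed piece by piece.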

## Recursion and Corecursion

> In computer science, corecursion is a type of operation that is dual to recursion. Whereas recursion works analytically, starting on data further from a base case and breaking it down into smaller data and repeating until one reaches a base case, corecursion works synthetically, starting from a base case and building it up, iteratively producing data further removed from a base case. [wikipedia](https://en.wikipedia.org/wiki/Corecursion)

**Recursion** can be used to define functions that consume data: a function over a value $x$ invokes itself on the parts of $x$ and stops recursing when it reaches a base case.

**Corecursion** can be used to define functions that produce _codata_: instead of breaking an input down to a base case, the function specifies what each destructor returns, and the remaining part of the result is given by a further corecursive call.

Consider a recursive function to add one to every element of a list of numbers:

In [3]:
def add(ns):
  if ns==[]:
    return []
  n, *ns = ns
  return [n+1] + add(ns)

assert add([1,2,3]) == [2,3,4]

> Corecursion is often used in conjunction with lazy evaluation, to produce only a finite subset of a potentially infinite structure (rather than trying to produce an entire infinite structure at once)

In Python we use generators to achieve lazy evaluation:

In [2]:
# codata, list of all naturals
def nats(i=1):
  yield i
  yield from nats(i+1)

Function `head` yields the first $n$ elements of a given iterable:

In [6]:
def head(xs, n):
  """ generates the first n items """
  for i,x in enumerate(xs):
    if i==n:
      break
    yield x

print(*head(nats(), 30))

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
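
The standard library's `itertools.islice` does a similar job. The sketch below repeats `head` so that it is self-contained and compares the two. One difference is visible when the source iterator is reused afterwards: `head` fetches one extra element before it notices that `i == n`, while `islice` stops without touching it.

```python
from itertools import count, islice

def head(xs, n):
    """ generates the first n items """
    for i, x in enumerate(xs):
        if i == n:
            break
        yield x

# Both produce the same prefix
same_prefix = list(head(count(1), 5)) == list(islice(count(1), 5))

# But head consumes one more element from a shared iterator
it1 = count(1)
list(head(it1, 3))
after_head = next(it1)     # 4 was fetched by enumerate and dropped

it2 = count(1)
list(islice(it2, 3))
after_islice = next(it2)   # 4 is still available

print(same_prefix, after_head, after_islice)
```

This does not matter for the examples in this notebook, since each stream is created fresh for every call.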


Let's now make a corecursive function to add one to all naturals:

In [8]:
def coadd(ns):
  yield next(ns)+1
  yield from coadd(ns)

print(*head(coadd(nats()), 30))

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31


Another example of codata, an infinite list of ones:

In [5]:
# In Haskell: ones = 1 : ones
def ones():
  yield 1
  yield from ones()

In [None]:
print(*head(ones(), 10))

Corecursion does not always need to create infinite codata. The next corecursive function produces a finite stream when $a\leq b$ and an infinite one when $a > b+1$:

In [22]:
def count_upto(a, b):
  if a != b+1:
    yield a
    yield from count_upto(a+1,b)

In [26]:
print(*head(count_upto(2, 9), 30))
print(*head(count_upto(9, 2), 30))

2 3 4 5 6 7 8 9
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38


Corecursion creates a _stream_ of values, starting from the base cases of the recursion and moving on to ever larger subproblems. For instance, a stream of factorials:

In [38]:
def facts(i=0, fac=1):
  yield fac
  yield from facts(i+1, fac*(i+1))

In [None]:
print(*head(facts(), 15))

1 1 2 6 24 120 720 5040 40320 362880 3628800 39916800 479001600 6227020800 87178291200


A corecursive generator does not have to call itself. A plain loop works as well, and it avoids Python's recursion limit:

In [9]:
# alternatives to previous implementations
def nats():
  i = 1
  while True:
    yield i
    i += 1

def coadd(ns):
  for n in ns:
    yield n+1    
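
The practical difference can be observed directly. In CPython, each element produced by the self-calling version adds one more nested generator, so asking for more elements than the recursion limit allows should raise `RecursionError`. The sketch below uses the hypothetical names `nats_rec` and `nats_loop` to keep both versions side by side.

```python
import sys
from itertools import islice

def nats_rec(i=1):
    # self-calling version: one nested generator per element
    yield i
    yield from nats_rec(i + 1)

def nats_loop():
    # loop version: a single generator, constant stack depth
    i = 1
    while True:
        yield i
        i += 1

n = sys.getrecursionlimit() + 100

try:
    last_rec = list(islice(nats_rec(), n))[-1]
except RecursionError:
    last_rec = None   # the nested generators exceeded the recursion limit

last_loop = list(islice(nats_loop(), n))[-1]

print(last_rec, last_loop)
```

For short prefixes, such as the 30 elements printed above, both versions behave the same.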

Let's check some more examples.

The next generator keeps the elements at odd positions (first, third, fifth, ...) of a stream:

In [41]:
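def odds(xs):
  xs = iter(xs)       # accept any iterable, not only iterators
  for x in xs:
    yield x           # keep the element at an odd position
    next(xs, None)    # skip the next one; the default avoids an error on finite input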

print(*head(odds(facts()), 10))

1 2 24 720 40320 3628800 479001600 87178291200 20922789888000 6402373705728000


Python's built-in functions on iterables, such as `filter`, work well in this context:

In [45]:
# keep only the factorials without the digit 3
not3 = lambda n: '3' not in str(n)

print(*head(filter(not3, facts()), 10))

1 1 2 6 24 120 720 5040 479001600 6227020800
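
The functions in `itertools` compose with streams in the same way. As a minimal sketch, the block below rebuilds the factorial stream with `accumulate` (an alternative formulation, not the definition used elsewhere in this notebook) and then applies `takewhile`, `map` and `islice` to it.

```python
from itertools import accumulate, count, islice, takewhile
from operator import mul

def facts():
    # 0! = 1, followed by the running products 1, 1*2, 1*2*3, ...
    yield 1
    yield from accumulate(count(1), mul)

# takewhile cuts an infinite stream at the first element failing the test
small = list(takewhile(lambda n: n < 1000, facts()))

# map stays lazy, so islice is needed to take a finite prefix
doubled = list(islice(map(lambda n: 2 * n, facts()), 5))

print(small)
print(doubled)
```

Note that `filter` on an infinite stream never terminates by itself, which is why the cell above still wraps it in `head`.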


An iterative version of the stream of all factorials:

In [None]:
def facts():
  i, fac = 0, 1
  while True:
    yield fac
    i, fac = i+1, fac*(i+1)

The Fibonacci sequence,

In [None]:
def fibs():
  a, b = 0, 1
  while True:
    yield a
    a, b = b, a + b

In [None]:
print(*head(fibs(), 25))

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368


A corecursive generator for the positive rationals $\mathbb{Q}^+$, each yielded as a (numerator, denominator) pair:

In [None]:
def rationals():
  yield (1,1)
  a = [1,1]   # the k-th pair is (a[k], a[k+1])
  k = 1
  while True:
    half = k//2
    if k % 2 == 0:
      a.append(a[half])
    else:
      a.append(a[half] + a[half+1])
    yield (a[k], a[k+1])
    k += 1

In [None]:
print(*head(rationals(), 15))

(1, 1) (1, 2) (2, 1) (1, 3) (3, 2) (2, 3) (3, 1) (1, 4) (4, 3) (3, 5) (5, 2) (2, 5) (5, 3) (3, 4) (4, 1)
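
The pairs above agree with the start of the Calkin-Wilf sequence. Assuming that is indeed the sequence being generated, the same stream can be unfolded from a single state, using the recurrence $x_{k+1} = 1/(2\lfloor x_k \rfloor - x_k + 1)$. The name `rationals_by_recurrence` is hypothetical, and this is a different technique from the list-based generator above.

```python
from fractions import Fraction
from itertools import islice
from math import floor

def rationals_by_recurrence():
    q = Fraction(1, 1)
    while True:
        yield (q.numerator, q.denominator)
        # next term of the Calkin-Wilf sequence
        q = 1 / (2 * floor(q) - q + 1)

print(*islice(rationals_by_recurrence(), 7))
```

This version keeps only the current fraction as state, instead of a list that grows with every step.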


The corecursive version of the sieve of Eratosthenes:

In [None]:
from itertools import count

def sieve(ns):
  n = next(ns)
  yield n
  yield from sieve( i for i in ns if i%n!=0 )

def primes():
  yield from sieve(count(start=2))

In [None]:
print(*head(primes(), 50))

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229
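
The version above wraps the stream in one more filtering generator for every prime found, so every candidate passes through a growing chain of filters. As a hedged alternative, the sketch below shows a commonly used incremental sieve that keeps a dictionary of upcoming composites instead. The name `primes_incremental` is hypothetical, and this is not the formulation used above.

```python
from itertools import count, islice

def primes_incremental():
    composites = {}   # maps an upcoming composite to the primes dividing it
    for n in count(2):
        if n not in composites:
            yield n                    # n is prime
            composites[n * n] = [n]    # its first composite not yet marked
        else:
            # n is composite: move each of its primes to their next multiple
            for p in composites.pop(n):
                composites.setdefault(n + p, []).append(p)

print(*islice(primes_incremental(), 10))
```

It is still a corecursive definition in spirit: each prime is produced from the state built by the previous ones.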


Corecursion can be applied to other codata types. The next generator traverses a possibly infinite binary tree breadth-first:

In [None]:
from collections import deque

class Tree:
  def __init__(self, val, left, right):
    self.val   = val
    self.left  = left
    self.right = right

def bf(tree):
  nodes = deque([tree])
  while nodes:
    t = nodes.popleft() 
    if t is not None:
      yield t.val
      nodes.append(t.left)
      nodes.append(t.right)

In [None]:
t1 = Tree(1, None, None)
t2 = Tree(2, None, None)
t3 = Tree(3, t1, t2)
t1.left  = t3 # make an infinite tree    
t1.right = t3

print(*head(bf(t3), 50))

3 1 2 3 3 1 2 1 2 3 3 3 3 1 2 1 2 1 2 1 2 3 3 3 3 3 3 3 3 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 3 3 3 3 3
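
A cyclic structure is one way to get an infinite tree. Another is to create the children only when they are accessed. The sketch below assumes a hypothetical `LazyTree` class whose `left` and `right` are computed properties, and repeats `bf` so that it is self-contained.

```python
from collections import deque
from itertools import islice

class LazyTree:
    """Node whose children are created on access (hypothetical helper)."""
    def __init__(self, val):
        self.val = val

    @property
    def left(self):
        return LazyTree(2 * self.val)

    @property
    def right(self):
        return LazyTree(2 * self.val + 1)

def bf(tree):
    nodes = deque([tree])
    while nodes:
        t = nodes.popleft()
        if t is not None:
            yield t.val
            nodes.append(t.left)
            nodes.append(t.right)

# Node k has children 2k and 2k+1, so breadth-first order counts upwards
first_values = list(islice(bf(LazyTree(1)), 7))
print(first_values)
```

The traversal itself is unchanged: it only relies on the `val`, `left` and `right` destructors of the tree.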


Streams can also be defined corecursively.

Consider the following corecursive Haskell definition for the Fibonacci sequence,

    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

Let's translate that to Python, using `tee` to share a single underlying generator among three consumers:

In [None]:
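from itertools import tee, islice, chain
from operator import add   # note: shadows the add function defined earlier

def fibs():
  def output():
    # 0 : 1 : zipWith (+) fibs (tail fibs)
    for i in chain((0,1), map(add, fib, tail)):
      yield i

  # tee gives three independent iterators over the same lazily computed stream
  stream, fib, tail = tee(output(), 3)
  tail = islice(tail, 1, None)   # tail fibs: skip the first element
  return stream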

In [None]:
print(*head(fibs(), 25))

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368


### Unfolds

A **fold** is a recursive function that analyses a data structure and recombines its elements (usually by summarizing them).

Python provides a folding function called `reduce`:

In [32]:
from functools import reduce

xs = [1,2,3,4]
reduce(lambda acc,x: 2*x+acc, xs, 0)

20
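
To make the recursion behind a fold visible, here is a minimal sketch of a left fold written as an explicitly recursive function. The name `foldl` is hypothetical, and the argument order follows the Haskell convention rather than that of `reduce`.

```python
from functools import reduce

def foldl(f, acc, xs):
    # base case: nothing left to combine
    if not xs:
        return acc
    # recursive case: combine the first element, then fold the rest
    x, *rest = xs
    return foldl(f, f(acc, x), rest)

xs = [1, 2, 3, 4]
by_hand = foldl(lambda acc, x: 2 * x + acc, 0, xs)
by_reduce = reduce(lambda acc, x: 2 * x + acc, xs, 0)
print(by_hand, by_reduce)
```

Like the recursive `add` at the start of this notebook, it consumes a finite list and stops at the empty list.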

An **unfold** is a corecursive function that generates a sequence by repeated application of a given function to its previous result.

A famous unfold is `iterate`:

In [35]:
def iterate(f, x):
  yield x
  yield from iterate(f, f(x))

# alternative without recursion
def iterate(f, x):
  while True:
    yield x
    x = f(x)

Making the powers of 2 using `iterate`:

In [36]:
print(*head(iterate(lambda x:2*x, 1), 20))

1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288


Let's make a more general unfold factory:

In [None]:
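def unfold(step, end, state):
  """ yields the values produced by step until end(state) holds """
  while not end(state):
    value, state = step(state)   # step returns (value to emit, next state)
    yield value

forever = lambda _: False   # never stop: the stream is infinite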

Some usage examples:

In [None]:
powers2 = unfold(lambda st: (st,2*st), forever, 1)
print(*head(powers2, 20))

1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288
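
Earlier streams fit the same pattern. As a sketch, the block below repeats `unfold` so that it is self-contained and uses it for the Fibonacci sequence (with a pair as state) and for a finite counter (with a real stopping test). The names `fibs_unfold` and `count_upto_unfold` are hypothetical. Unlike the earlier `count_upto`, this counter is empty rather than infinite when $a > b$.

```python
from itertools import islice

def unfold(step, end, state):
    while not end(state):
        value, state = step(state)
        yield value

forever = lambda _: False

# state is the pair (current, next) of Fibonacci numbers
fibs_unfold = unfold(lambda st: (st[0], (st[1], st[0] + st[1])), forever, (0, 1))

# state is the next number to emit; stop once it exceeds b
count_upto_unfold = lambda a, b: unfold(lambda n: (n, n + 1), lambda n: n > b, a)

first_fibs = list(islice(fibs_unfold, 10))
counted = list(count_upto_unfold(2, 9))
print(first_fibs)
print(counted)
```

In both cases the generator logic lives entirely in the `step` and `end` functions.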


Defining `iterate` via `unfold`,

In [None]:
iterate = lambda f, n: unfold(lambda st: (st,f(st)), forever, n)

powers3 = iterate(lambda x:3*x, 1)
print(*head(powers3, 20))

1 3 9 27 81 243 729 2187 6561 19683 59049 177147 531441 1594323 4782969 14348907 43046721 129140163 387420489 1162261467


The next function is a zip for two sliceable sequences (such as strings, lists or tuples):

In [None]:
end  = lambda ps: not ps[0] or not ps[1]
step = lambda ps: ((ps[0][0],ps[1][0]), (ps[0][1:],ps[1][1:]))
zip2 = lambda xs, ys: unfold(step, end, (xs,ys))

print(*zip2('abcde',[1,2,3,4,5,6]))

('a', 1) ('b', 2) ('c', 3) ('d', 4) ('e', 5)


## References

+ Turner - [Total Functional Programming](http://sblp2004.ic.uff.br/papers/turner.pdf) (2004)

+ Mike Gordon - [Corecursion and Coinduction](https://www.semanticscholar.org/paper/Corecursion-and-coinduction-%3A-what-they-are-and-how-Gordon/41fb876f6b35971173ef1808472350b51cf3afd1) (2007)

+ Wikipedia - [Corecursion](https://en.wikipedia.org/wiki/Corecursion)