# Recursion drawbacks and pitfalls

This tutorial covers pitfalls in designing recursive implementations of algorithms and how to avoid them. It then moves from pitfalls to the drawbacks of recursion, because recursion is not an omnipotent tool for solving problems.

## Table of contents

- [Pitfalls](#pitfalls)

- [Drawbacks](#drawbacks)

- [References](#references)

Importing auxiliary utilities.

In [20]:
from contextlib import nullcontext as does_not_raise
from typing import Any, Callable

import ipytest
import pytest

# https://github.com/chmp/ipytest
ipytest.autoconfig()


def validate_integer(number: Any) -> None:
    """Validates if an object is of int() type."""

    if not isinstance(number, int):
        raise TypeError(f"{number} is not int")


def validate_non_negative_integer(number: int) -> None:
    """Validates a number to be a non-negative integer."""

    validate_integer(number)
    if number < 0:
        raise ValueError(f"{number} < 0")


def validate_positive_integer(number: Any) -> None:
    """Validates a number to be a positive integer."""

    validate_integer(number)
    if number < 1:
        raise ValueError(f"{number} < 1")

### Pitfalls

Recall the example of a recursive function that never stops.

In [21]:
def foo():
    foo()


foo()

RecursionError: maximum recursion depth exceeded

There is no base case and hence no stop condition, so the (call) stack overflows and the function crashes.

The same can happen with indirect recursion. One form of it is a function that invokes itself through a series of calls to other functions.

In [None]:
def hoo():
    foo()

def goo():
    hoo()

def foo():
    goo()


foo()

All right, let's add a base case (or base cases) to stop the recursive calls. However, adding base cases alone is not enough: unless every recursive step inevitably (!) moves towards one of the base cases, the stack can still overflow.

In [22]:
def countdown(number: int):
    if not number:
        return
    countdown(number - 1)


# negative numbers lead to stack overflow
countdown(-1)

RecursionError: maximum recursion depth exceeded

Let's ensure that only non-negative numbers can be passed into the `countdown` function. The following implementation works fine, but it still has a flaw.

In [None]:
def countdown(number: int):
    if number < 0:
        raise ValueError(f"{number} < 0")
    if not number:
        print("over")
        return
    countdown(number - 1)


try:
    countdown(-1)
except ValueError as val_err:
    print("negative numbers are intercepted.")
    countdown(0)  # ok
    countdown(100)  # ok as well

The flaw is that the incoming number is checked for non-negativity on every call, which is inefficient. To optimise the solution, let's separate the validation step and run it once, before the recursive part.

In [None]:
def countdown(number: int):

    def _cntd(number: int):
        if not number:
            print("over")
            return
        _cntd(number - 1)

    if number < 0:
        raise ValueError(f"{number} < 0")
    _cntd(number)


try:
    countdown(-1)
except ValueError as val_err:
    print("negative numbers are intercepted.")
    countdown(0)  # ok
    countdown(100)  # ok as well

Yes, it works. However, there is another flaw that severely reduces the applicability of the solution: the call stack has a limited size; it is neither a bottomless pit nor a cornucopia.

In [23]:
import sys

recursion_limit = sys.getrecursionlimit()
print(f"Recursion limit -> {recursion_limit}")

countdown(recursion_limit + 1)

Recursion limit -> 3000


RecursionError: maximum recursion depth exceeded

Exactly: there is a limit on the maximum recursion depth, so we are bounded above by this number. The limit can be raised or lowered with `sys.setrecursionlimit`, but it is better not to touch it; the choice is yours anyway.
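One way around the depth limit is to replace the recursive calls with a loop. The following is a minimal sketch, assuming the same contract as `countdown` above; the name `countdown_iterative` is hypothetical and not part of the original notebook.

```python
import sys


def countdown_iterative(number: int) -> None:
    """Counts down to zero with a loop instead of recursive calls."""

    # validate once, exactly like the recursive version does
    if number < 0:
        raise ValueError(f"{number} < 0")
    # each iteration replaces one recursive call, so no stack frames pile up
    while number:
        number -= 1
    print("over")


# well past the recursion limit, yet no RecursionError is raised
countdown_iterative(sys.getrecursionlimit() + 1)
```

The loop keeps a single stack frame regardless of the input, so the only remaining cost is the running time.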

So, recursion has limitations and is not a panacea.

### Drawbacks

The main pitfalls are behind us, but recursion has its price. Let's compare the performance of a recursive and an iterative version of computing the factorial of a non-negative integer.

In [24]:
def _rec_fact(nbr: int) -> int:
    """Returns the (nbr)!"""

    if nbr in (0, 1):
        return 1
    return nbr * _rec_fact(nbr - 1)


def factorial_recursive(number: int) -> int:
    """Returns the factorial of a non-negative integer."""

    validate_non_negative_integer(number)
    return _rec_fact(number)


def factorial_iterative(number: int) -> int:
    """Returns the factorial of a non-negative integer."""

    validate_non_negative_integer(number)
    # making it similar to the recursive implementation above:
    # multiply from number down to 2 (an empty range yields 1 for 0 and 1)
    result = 1
    for nbr in range(number, 1, -1):
        result *= nbr
    return result

In [25]:
%%ipytest

@pytest.mark.parametrize(
    ("number", "answer", "expectation"),
    [
        (-1, None, pytest.raises(ValueError)),
        (0, 1, does_not_raise()),
        (1, 1, does_not_raise()),
        (4, 24, does_not_raise()),
        (5.0, 120, pytest.raises(TypeError)),
        # increase if you dare
        (5000, None, pytest.raises(RecursionError)),
    ],
)
def test_factorial(number, answer, expectation):
    with expectation:
        res_rec = factorial_recursive(number)
        res_iter = factorial_iterative(number)
        # without `assert` the comparison is evaluated and silently discarded
        assert res_rec == res_iter == answer

......                                                                                       [100%]
6 passed in 0.02s


The implementations pass the (unit) tests; now let's check their performance.

In [26]:
%timeit -r 10 factorial_iterative(100)
%timeit -r 10 factorial_recursive(100)

5.04 µs ± 68.5 ns per loop (mean ± std. dev. of 10 runs, 100,000 loops each)
12 µs ± 134 ns per loop (mean ± std. dev. of 10 runs, 100,000 loops each)


The recursive implementation is about 2.4 times slower than its iterative competitor, so the time performance cup goes to the iterative version of the algorithm.
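The `%timeit` magic is only available in IPython. Outside a notebook, a similar comparison can be sketched with the standard `timeit` module. The functions `fact_rec` and `fact_iter` below are hypothetical stand-ins without input validation, and the measured times depend on the machine, so no particular numbers should be expected.

```python
import math
import timeit


def fact_rec(nbr: int) -> int:
    """Recursive factorial (stand-in, no input validation)."""
    if nbr in (0, 1):
        return 1
    return nbr * fact_rec(nbr - 1)


def fact_iter(number: int) -> int:
    """Iterative factorial (stand-in, no input validation)."""
    result = 1
    for nbr in range(2, number + 1):
        result *= nbr
    return result


# both stand-ins must agree with the standard library before being timed
assert fact_rec(100) == fact_iter(100) == math.factorial(100)

LOOPS = 1_000
# repeat() returns one total time per run; the minimum is the least noisy
time_rec = min(timeit.repeat(lambda: fact_rec(100), number=LOOPS, repeat=5))
time_iter = min(timeit.repeat(lambda: fact_iter(100), number=LOOPS, repeat=5))

print(f"recursive: {time_rec / LOOPS * 1e6:.2f} µs per call")
print(f"iterative: {time_iter / LOOPS * 1e6:.2f} µs per call")
```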

Another drawback is redundant calls, which waste computing resources. A classic example is a naive implementation of a recursive algorithm for calculating the nth Fibonacci number.

In [27]:
def fibonacci_recursive(nth: int) -> int:
    """Returns the nth Fibonacci number."""

    def _fibrec(nbr: int) -> int:
        """Actual recursive implementation."""

        if not nbr:
            return 0
        if nbr < 3:
            return 1
        return _fibrec(nbr - 2) + _fibrec(nbr - 1)

    validate_non_negative_integer(nth)
    return _fibrec(nth)


def fibonacci_iterative(nth: int) -> int:
    """Returns the nth Fibonacci number."""

    validate_non_negative_integer(nth)
    fibcurr, fibnext = 0, 1
    for _ in range(nth):
        fibcurr, fibnext = fibnext, fibcurr + fibnext
    return fibcurr

In [28]:
%%ipytest


@pytest.mark.parametrize(
    ("number", "answer", "expectation"),
    [
        (-1, None, pytest.raises(ValueError)),
        (0, 0, does_not_raise()),
        (1, 1, does_not_raise()),
        (1.5, None, pytest.raises(TypeError)),
        (2, 1, does_not_raise()),
        (3, 2, does_not_raise()),
        (4, 3, does_not_raise()),
        (5, 5, does_not_raise()),
        (6, 8, does_not_raise()),
    ]
)
def test_fibonacci(number, answer, expectation):
    with expectation:
        res_iter = fibonacci_iterative(number)
        res_rec = fibonacci_recursive(number)
        assert res_iter == res_rec == answer

.........                                                                                    [100%]
9 passed in 0.03s


Time for a performance test.

In [29]:
%timeit -r 10 fibonacci_iterative(10)
%timeit -r 10 fibonacci_recursive(10)

1.16 µs ± 21.4 ns per loop (mean ± std. dev. of 10 runs, 1,000,000 loops each)
8.77 µs ± 39.2 ns per loop (mean ± std. dev. of 10 runs, 100,000 loops each)


Fantastic, the recursive implementation is as slow as a lazy snail: roughly 7.5 times slower than the iterative one for an input of 10. Welcome to ever longer computations as the input number grows. But why? The following picture may be more illustrative than words:

![Fibonacci calls tree](./fibonacci_calls_tree.png)

There are redundant calls that compute the same values, so the overhead can become unreasonably high. Can you count and group the identical calls?
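The redundancy can also be measured rather than read off the picture. The sketch below copies the naive recursion and tallies every call per argument with `collections.Counter`; the helper name `count_fibonacci_calls` is hypothetical and input validation is omitted for brevity.

```python
from collections import Counter


def count_fibonacci_calls(nth: int) -> Counter:
    """Tallies how often each argument reaches the naive recursion."""

    calls: Counter = Counter()

    def _fibrec(nbr: int) -> int:
        calls[nbr] += 1  # record the call before doing any work
        if not nbr:
            return 0
        if nbr < 3:
            return 1
        return _fibrec(nbr - 2) + _fibrec(nbr - 1)

    _fibrec(nth)
    return calls


calls = count_fibonacci_calls(10)
print(f"total calls: {sum(calls.values())}")  # 109 calls for nth = 10
for argument, count in sorted(calls.items(), reverse=True):
    print(f"_fibrec({argument}) was called {count} time(s)")
```

For an input of 10 only ten distinct arguments occur, yet the function is invoked 109 times; the difference is work that a cache could avoid.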

There is an optimisation technique called "memoisation" (not "memorisation"; this is not a typo). Memoisation caches (stores) an input value together with the output computed for it. For this purpose, you can declare a dictionary and store the input-output (key, value) pairs in it.

In [30]:
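# This is a decorating function.
# It receives a function as an input
# and returns another function.
# This is the Decorator pattern in use.
def cache(func: Callable) -> Callable:
    """Caching decorator."""

    # One dictionary per decorated function: a single module-level
    # dictionary would be shared by every function decorated with @cache,
    # so one function could return results stored by another.
    memo: dict[int, int] = {}

    def _wrapper(number: int) -> int:
        if number in memo:
            return memo[number]
        res = func(number)
        memo[number] = res  # caching
        return res

    return _wrapper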


@cache
def fibonacci_recursive_2(nth: int) -> int:
    """Returns the nth Fibonacci number."""

    def _fibrec(nbr: int) -> int:
        """Actual recursive implementation."""

        if not nbr:
            return 0
        if nbr < 3:
            return 1
        return _fibrec(nbr - 2) + _fibrec(nbr - 1)

    validate_non_negative_integer(nth)
    return _fibrec(nth)

The Red Letter Day...

In [31]:
%timeit -r 10 fibonacci_iterative(10)
%timeit -r 10 fibonacci_recursive_2(10)

1.16 µs ± 15 ns per loop (mean ± std. dev. of 10 runs, 1,000,000 loops each)
678 ns ± 18 ns per loop (mean ± std. dev. of 10 runs, 1,000,000 loops each)


Wow, the memoised recursive version is now faster than the iterative version. Can these results be trusted? Let's memoise the iterative version as well and check that the timings are of the same order, as expected.

In [32]:
@cache
def fibonacci_iterative_2(nth: int) -> int:
    """Returns the nth Fibonacci number."""

    validate_non_negative_integer(nth)
    fibcurr, fibnext = 0, 1
    for _ in range(nth):
        fibcurr, fibnext = fibnext, fibcurr + fibnext
    return fibcurr


%timeit -r 10 fibonacci_iterative_2(10)
%timeit -r 10 fibonacci_recursive_2(10)

663 ns ± 20.1 ns per loop (mean ± std. dev. of 10 runs, 1,000,000 loops each)
666 ns ± 9.78 ns per loop (mean ± std. dev. of 10 runs, 1,000,000 loops each)


Now the results are close to one another, which means that the outputs are retrieved from the cache rather than computed every time the function is invoked.
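Note that in `fibonacci_recursive_2` the decorator wraps only the outer function, so the inner `_fibrec` still repeats its redundant calls whenever the requested value is not cached yet. A minimal sketch of memoising every recursive step, here with the standard `functools.lru_cache`, could look as follows; the name `fib_memo` is hypothetical and input validation is left out.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache keyed by the argument
def fib_memo(nbr: int) -> int:
    """Returns the nth Fibonacci number, caching every intermediate value."""

    if nbr < 2:
        return nbr
    # both recursive calls go through the cache, so each value
    # is computed at most once
    return fib_memo(nbr - 2) + fib_memo(nbr - 1)


print(fib_memo(30))  # 832040
# misses = number of times the function body actually ran
print(fib_memo.cache_info())
```

With this arrangement the body runs once per distinct argument (31 times for an input of 30), whereas the naive recursion would need over a million calls. The recursion depth is still bounded by the interpreter's limit, so very large inputs on a cold cache can raise `RecursionError`.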

Can you answer whether it is reasonable to memoise the factorial function?

### References

- [Recursion Explained](https://tozturk.hashnode.dev/recursion-explained-breaking-down-the-core-concepts-benefits-and-drawbacks-of-using-recursive-functions)

- [JakeVDP: Timing and Profiling](https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html)

- [PyNash: Timing and Profiling](https://pynash.org/2013/03/06/timing-and-profiling/)