## 1-minute introduction to Jupyter ##

A Jupyter notebook consists of cells. Each cell contains either text or code.

A text cell will not have any text to the left of the cell. A code cell has `In [ ]:` to the left of the cell.

If the cell contains code, you can edit it. Press <kbd>Enter</kbd> to edit the selected cell. While editing the code, press <kbd>Enter</kbd> to create a new line, or <kbd>Shift</kbd>+<kbd>Enter</kbd> to run the code. If you are not editing the code, select a cell and press <kbd>Ctrl</kbd>+<kbd>Enter</kbd> to run the code.

# Lesson 9: Recursion

In Assignment 6 Part 2, you wrote a function that walked through a list of files and directories, and returned the contents of the subdirectories as well.

You probably used a `while` loop to do this. But have you ever wondered, “If the purpose of this function is to return a directory listing, why can't we use it to get the contents of the subdirectories too?”

As it turns out, we can. This idea of *using a function within itself* is called **recursion**.

Let’s revisit the original code:

In [None]:
import os

def walk_listing(filepath):
    list_of_files = os.listdir(filepath)
    i = 0
    while i < len(list_of_files):
        this = list_of_files[i]
        full_path = '/'.join([filepath, this])
        if os.path.isdir(full_path):
            sub_list = os.listdir(full_path)
            for each in sub_list:
                list_of_files.append(f'{this}/{each}')
        i += 1
    return list_of_files

walk_listing('.')  # walk the current directory

The code is doing a few things:

    this = list_of_files[i]
    full_path = '/'.join([filepath, this])
    
It has to construct a full path that it can pass to `os.listdir()` to get the subdirectory listing.
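
Joining with `'/'` is fine for this lesson, but the standard library's `os.path.join()` does the same job using the separator of the platform you are running on. A minimal sketch of that step, assuming a hypothetical helper name `build_full_path` and a made-up file name (neither is part of the assignment):

```python
import os

def build_full_path(filepath, entry):
    # os.path.join() puts the platform's path separator between the parts
    return os.path.join(filepath, entry)

# 'notes.txt' is a made-up entry name, used only for illustration
print(build_full_path('.', 'notes.txt'))
```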

    if os.path.isdir(full_path):
        sub_list = os.listdir(full_path)

If it finds a directory, it has to get the directory listing of that directory.

    for each in sub_list:
        list_of_files.append(f'{this}/{each}')

... and then it has to remember to put the subdirectory's name in front of each entry, so that the path stays relative to `filepath`.

## Recursive implementation of `walk_listing()`

Another way to implement `walk_listing()` is as follows:

1. Get the list of files and directories in the `filepath` directory.
2. Initialise a new list, `full_list`.
3. For each entry,
   - if it is a file, append it to `full_list`,
   - if it is a directory, append it to `full_list`.
     Then call `walk_listing()` on it to get its files and directories as a list, and add that list to `full_list`.
4. Return `full_list`.

Notice that in this implementation, we are calling `walk_listing()` *within itself*.

What does that look like in Python code?

In [None]:
import os

def walk_listing(filepath):
    full_list = []
    for each in os.listdir(filepath):
        full_path = '/'.join([filepath, each])
        full_list.append(full_path)                 # Adds entry to full_list
        if os.path.isdir(full_path):
            sub_listing = walk_listing(full_path)   # Calls walk_listing() to get listing from subdirectory
            full_list.extend(sub_listing)           # Adds subdirectory listing to full_list
    return full_list

walk_listing('.')  # walk the current directory

Much shorter, and only one loop instead of two. Let’s see what this is doing:

    for each in os.listdir(filepath):

We are simply checking the requested directory (without going into subdirectories).

    full_list.append(full_path)

We add each entry to our new list, `full_list`.

    if os.path.isdir(full_path):
        sub_listing = walk_listing(full_path)
        full_list.extend(sub_listing)

If the entry is a directory, we just call `walk_listing()` to get the list of files in that subdirectory.

But what about the sub-sub-subdirectories?

Let’s put in some `print()` statements to help us see what is going on. I added some prefixes to help us know what's happening:
- `INSIDE` means `walk_listing()` has started running on a directory
- `ENTER` means it is about to call `walk_listing()` on a subdirectory (and list its contents)
- `ADD` means it added a new entry to the list
- `EXIT` means `walk_listing()` has finished and is returning its list of files to the function that called it.

In [None]:
import os

def walk_listing(filepath):
    print(f'INSIDE: {filepath}')
    full_list = []
    for each in os.listdir(filepath):
        full_path = '/'.join([filepath, each])
        print(f'   ADD: {full_path}')
        full_list.append(full_path)
        if os.path.isdir(full_path):
            print(f' ENTER: {full_path}')
            sub_listing = walk_listing(full_path)
            full_list.extend(sub_listing)
    print(f'  EXIT: {filepath}')
    return full_list

walk_listing('.')   # walk the current directory

See what happens when it encounters a subdirectory? It calls `walk_listing()` to enter that subdirectory, and then gathers a list of files.

And if it encounters another subdirectory there? Call `walk_listing()` again.

Finally, when the innermost `walk_listing()` returns the list, the next-innermost `walk_listing()` adds it to its own `full_list`.

With a recursive implementation of `walk_listing()`, we are able to use a single `for` loop instead of a `while` loop with a manually updated index.
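
To check that the recursion really reaches every level, one option is to run it on a small throwaway directory tree. The sketch below repeats the recursive `walk_listing()` from above and uses the standard `tempfile` module. The folder and file names (`a`, `b`, `c`, `deep.txt`) are made up for the demonstration.

```python
import os
import tempfile

def walk_listing(filepath):
    full_list = []
    for each in os.listdir(filepath):
        full_path = '/'.join([filepath, each])
        full_list.append(full_path)
        if os.path.isdir(full_path):
            # The recursive call handles the subdirectory, however deep it goes
            full_list.extend(walk_listing(full_path))
    return full_list

# Build a temporary tree: <tmp>/a/b/c/deep.txt
with tempfile.TemporaryDirectory() as tmp:
    deepest = os.path.join(tmp, 'a', 'b', 'c')
    os.makedirs(deepest)
    with open(os.path.join(deepest, 'deep.txt'), 'w') as f:
        f.write('hello')

    listing = walk_listing(tmp)
    # Strip the temporary prefix so the output is easier to read
    relative = [path[len(tmp) + 1:] for path in listing]
    print(relative)  # ['a', 'a/b', 'a/b/c', 'a/b/c/deep.txt']
```

The file three levels down shows up in the result even though the function body never mentions more than one level.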

## Checking a recursive implementation

How do we catch recursive implementations that will never terminate? In general, recursive functions should meet the following 3 conditions:

1. A recursive function should have at least one **base case** that returns a value directly without calling itself.
   - This is necessary because it must eventually reach a starting value from which it can "work backward".


2. A recursive function should call itself (**self-invocation**) when the base case is unmet.
   - It should call itself with a different state, i.e. the input to the next function call should not be the same as the input for the current function call.
   - Each successive recursive call should **reduce the input** (problem) to bring it *closer to the base case*.


3. A recursive function should have a **return value** that enables the calling function (i.e. the "parent" function) to build up the final return value.
   - It is much easier to think through the logic of recursion if the *return value has a consistent type*.
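
As a small illustration of the three conditions, here is a sketch using a factorial function, which is not part of this lesson's assignments and is chosen only because it is short. The second function, `no_base_case()`, is a deliberately broken, hypothetical example: without a base case, Python eventually stops it with a `RecursionError`.

```python
def factorial(n):
    # Condition 1: base case, returns a value directly without calling itself
    if n <= 1:
        return 1
    # Condition 2: self-invocation with a smaller input (n - 1), closer to the base case
    # Condition 3: every call returns an int, so the caller can multiply it
    return n * factorial(n - 1)

print(factorial(5))  # 5 * 4 * 3 * 2 * 1 = 120


def no_base_case(n):
    # Deliberately broken: nothing ever stops the chain of calls
    return n * no_base_case(n - 1)

try:
    no_base_case(5)
    stopped_by_python = False
except RecursionError:
    # Python refuses to go deeper once its recursion limit is reached
    stopped_by_python = True

print(stopped_by_python)  # True
```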

### How does `walk_listing()` meet these conditions?

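1. **Base case**: When a directory contains no subdirectories, the `if` branch never runs, so `walk_listing()` returns its list without calling itself. Every chain of nested folders must eventually end in such a directory.
2. **Self-invocation**: `walk_listing()` calls itself in line 12 of the traced version above (`sub_listing = walk_listing(full_path)`), when a subdirectory is detected. Each call receives a path one level deeper, so the input is different every time.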
3. **Return value**: `walk_listing()` returns the list of files and directories that it has built up so far, allowing the calling function to add that to its own list.

## (optional) Tracking recursion level

We can make use of additional arguments to trace how deep we are in the recursion.

In [None]:
import os

def walk_listing(filepath, lvl):
    print(f'INSIDE: {filepath} ({lvl} level(s) deep)')
    full_list = []
    for each in os.listdir(filepath):
        full_path = '/'.join([filepath, each])
        print(f'   ADD: {full_path}')
        full_list.append(full_path)
        if os.path.isdir(full_path):
            print(f' ENTER: {full_path}')
            sub_listing = walk_listing(full_path, lvl=lvl + 1)
            full_list.extend(sub_listing)
    print(f'  EXIT: {filepath}')
    return full_list

walk_listing('.', lvl=1)
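
The level argument can do more than appear in a message. The sketch below uses it to indent each entry by its depth, which gives a simple tree view. The function name `tree_lines`, the default `lvl=0`, and the folder and file names are assumptions made for this illustration.

```python
import os
import tempfile

def tree_lines(filepath, lvl=0):
    """Return one line per entry; deeper entries are indented further."""
    lines = []
    for each in sorted(os.listdir(filepath)):  # sorted() keeps the order predictable
        full_path = os.path.join(filepath, each)
        lines.append('    ' * lvl + each)
        if os.path.isdir(full_path):
            # Pass lvl + 1 so the subdirectory's entries are one level deeper
            lines.extend(tree_lines(full_path, lvl=lvl + 1))
    return lines

# Build a small temporary tree to try it on (names are made up)
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'docs', 'drafts'))
    made_up_files = [
        'readme.txt',
        os.path.join('docs', 'notes.txt'),
        os.path.join('docs', 'drafts', 'old.txt'),
    ]
    for name in made_up_files:
        with open(os.path.join(tmp, name), 'w') as f:
            f.write('')

    lines = tree_lines(tmp)
    print('\n'.join(lines))
```

Giving `lvl` a default value means the first caller does not need to pass it. Only the recursive calls do.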

## Exercise 1: Write a recursive Fibonacci function

In the Fibonacci sequence, each term is the sum of the **previous two terms**. In this lesson, the first two terms are 0 and 1.

A non-recursive implementation of the Fibonacci sequence is shown below:

In [None]:
def fibonacci(n):
    if type(n) != int or n <= 0:
        raise ValueError(f'n must be a positive integer (got {n})')
    if n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        prev = 0
        this = 1
        for i in range(3, n + 1):
            new = this + prev
            prev = this
            this = new
        return this

for n in range(1, 10):
    print(f'{fibonacci(n)}, ', end='')
print(fibonacci(10))

Write a function that calculates the n-th term of the Fibonacci sequence **recursively**.

In [None]:
def fibonacci(n):
    if type(n) != int or n <= 0:
        raise ValueError(f'n must be a positive integer (got {n})')
    if n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

for n in range(1, 10):
    print(f'{fibonacci(n)}, ', end='')
print(fibonacci(10))

In [None]:
test_ans = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

for i in range(len(test_ans)):
    result = fibonacci(i + 1)
    assert result == test_ans[i], \
        f'Term {i + 1} of fibonacci sequence should be {test_ans[i]}, got {result} instead.'
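
The recursive `fibonacci()` above recomputes the same terms many times, so it slows down quickly as `n` grows. One possible remedy, shown here as an optional sketch and not as the expected answer to the exercise, is to let `functools.lru_cache` remember results that have already been computed. The name `fibonacci_cached` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # remember every result computed so far
def fibonacci_cached(n):
    if type(n) != int or n <= 0:
        raise ValueError(f'n must be a positive integer (got {n})')
    if n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        # Same recursion as before; repeated calls are answered from the cache
        return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)

print(fibonacci_cached(50))  # 7778742049
```

The logic of the function is unchanged: the base cases, the self-invocation, and the return type are the same as in the plain recursive version.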