# xSoc Python Course - Week 6

The end is in sight! This week's content is going to focus on efficiency, compactness and making multiple Python files work together.

### Recursion
**Recursion** means defining a function in terms of itself; you'll call the function you're writing inside itself. Every recursive function has to have 3 features:
- **Base case**: the non-recursive case, which returns an actual value (or `None`!)
- **General case**: the recursive case, which makes a call to the function again. Each call should move you closer to reaching the base case
- You ***must*** reach a base case after a finite number of function calls. If you don't, the calls pile up until Python gives up and raises a `RecursionError`
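
To make those three features concrete, here's a minimal sketch using the classic factorial function. The names `factorial` and `no_base_case` are just illustrative choices, not something the rest of the course relies on.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    # Base case: 0! is defined as 1, so no further calls are needed
    if n == 0:
        return 1
    # General case: n! = n * (n-1)!, and n-1 is one step closer to 0
    return n * factorial(n - 1)

print(factorial(5))  # 120

def no_base_case(n):
    # No base case at all, so the calls can never stop on their own
    return no_base_case(n + 1)

try:
    no_base_case(0)
except RecursionError as error:
    print("Never reached a base case:", error)
```

The second function shows the "bad things": Python stops the runaway calls itself once its recursion limit is hit.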

Have a look at this code snippet:

In [None]:
from random import randint

total = 0
for i in range(0, 100, 2):
    # randint includes both endpoints, so this picks numbers from 1 to 10
    total += i * randint(1, 10) * randint(1, 10)

    if total > 75:
        break

print(total)

We initialise `total` to 0, then loop over the even values of `i` from 0 up to (but not including) 100. On each iteration we increase `total` by `i` multiplied by two (possibly different) random numbers between 1 and 10 (inclusive). If `total` is then more than 75, we exit the loop early. While it may not look like it, we can rewrite this loop as a recursive function! There are two base cases: `i` reaching 100 *or* `total` exceeding 75. The general case adds to `total` and calls the function again with the next value of `i`. Here's the above for loop rewritten. Because the numbers are random, the two snippets won't print the same value on every run, but they follow exactly the same logic.

In [None]:
from random import randint

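def my_first_recursive_func(total, i):
    # Base case 1: we've run out of values of i (the loop stops before 100)
    if i >= 100:
        return total
    total += i * randint(1, 10) * randint(1, 10)
    # Base case 2: total has passed the threshold
    if total > 75:
        return total
    # General case: carry on with the next even value of i
    return my_first_recursive_func(total, i + 2)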

total = my_first_recursive_func(0, 0)
print(total)
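
If you'd like to convince yourself that the loop and the recursive function really do the same thing, one option is to seed the random number generator so both versions see the same "random" numbers. This is only a sketch: `loop_version` and `recursive_version` are hypothetical names for the two snippets above, and the seed value 6 is arbitrary.

```python
import random

def loop_version():
    total = 0
    for i in range(0, 100, 2):
        total += i * random.randint(1, 10) * random.randint(1, 10)
        if total > 75:
            break
    return total

def recursive_version(total, i):
    if i >= 100:
        return total
    total += i * random.randint(1, 10) * random.randint(1, 10)
    if total > 75:
        return total
    return recursive_version(total, i + 2)

# Re-seeding before each call replays the same sequence of random numbers
random.seed(6)
loop_result = loop_version()
random.seed(6)
recursive_result = recursive_version(0, 0)

print(loop_result, recursive_result)
print(loop_result == recursive_result)  # True
```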

### A Few Useful Data Structures


### Sorting Things Out
Now that we've introduced some data structures, it's worth introducing some algorithms you might find handy. The first class of algorithms we'll cover are the sorting algorithms. We'll start with the ***Bubble Sort***:

In [None]:
def bubble_sort(arr):
    for i in range(len(arr)):
        for j in range(len(arr)):
            if arr[i] < arr[j]:
                temp = arr[i]
                arr[i] = arr[j]
                arr[j] = temp
    return arr

unsorted_array = [2, 7, 1, 5, 3, 8, 0]
print(unsorted_array)
sorted_array = bubble_sort(unsorted_array)
print(sorted_array)

Seems fairly simple, right? For each position in the list, we compare its element against every element in the list and swap the pair when needed. Our algorithm works, but it's not as efficient as it could be.

Luckily, we can exploit one of the features of a bubble sort: if each pass compares neighbouring elements and swaps the ones that are out of order, then after each pass the largest remaining element (if you're sorting in ascending order) will have moved to the end of the unsorted part of the list - that means we don't need to check it anymore! Let's re-write our bubble sort to include this idea:

In [None]:
def better_bubble_sort(arr):
    for i in range(len(arr)):
        # The last i elements are already in place, so stop before them
        for j in range(len(arr) - i - 1):
            if arr[j] > arr[j+1]:
                temp = arr[j]
                arr[j] = arr[j+1]
                arr[j+1] = temp
    return arr

unsorted_array = [2, 7, 1, 5, 3, 8, 0]
print(unsorted_array)
sorted_array = better_bubble_sort(unsorted_array)
print(sorted_array)

But what happens if we pass an already-sorted array into the function? The for-loops would execute anyway, but they wouldn't do anything useful! In the end, the sort would make the same number of comparisons regardless of how sorted your array already is. We can do better still: what if we keep track of whether we made any swaps during each pass? If we make no swaps in a pass, we know the array is sorted and we can stop. Let's add it to the code:

In [None]:
def betterer_bubble_sort(arr):
    swaps = True
    n = len(arr) - 1
    while swaps:
        swaps = False
        for i in range(n):
            if arr[i] > arr[i+1]:
                temp = arr[i]
                arr[i] = arr[i+1]
                arr[i+1] = temp
                swaps = True
        n -= 1
    return arr

unsorted_array = [2, 7, 1, 5, 3, 8, 0]
print(unsorted_array)
sorted_array = betterer_bubble_sort(unsorted_array)
print(sorted_array)
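
To see the early exit paying off, here's a rough sketch that counts how many passes the sort makes. `count_passes` is a hypothetical helper written for this illustration; it follows the same logic as `betterer_bubble_sort` but works on a copy of the list and also returns the number of passes.

```python
def count_passes(arr):
    """Bubble sort a copy of arr with the early exit; return (sorted_list, passes)."""
    arr = list(arr)  # work on a copy so the caller's list is untouched
    passes = 0
    n = len(arr) - 1
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for i in range(n):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True
        n -= 1
    return arr, passes

already_sorted, passes = count_passes([0, 1, 2, 3, 5, 7, 8])
print(passes)  # 1: a single pass with no swaps is enough

shuffled_sorted, passes = count_passes([2, 7, 1, 5, 3, 8, 0])
print(shuffled_sorted, passes)
```

An already-sorted list is confirmed in a single pass, whereas the earlier versions would still run every iteration of both loops.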

> Task: Another common type of sort is the ***Insertion Sort***. We "split" the array into sorted and unsorted parts, then values in the unsorted part are picked and placed in the correct position in the sorted array. Try to implement your own Insertion Sort (there are plenty of solutions online, but try to solve it yourself as much as possible).
>
> Task (optional): What are the advantages of an Insertion Sort over the Bubble Sort? When might you use one over the other?

### A Quick Detour: Efficiency and Big-Oh
If the first bubble sort we wrote worked, then why did we improve it? It has to do with *code efficiency*. Often, there are lots of different implementations that solve the same problem. The best solutions will run the fastest and use the least amount of additional storage space. Many of the constructs you've learned over the past weeks will help with writing efficient code, such as:
- Using loops for repeated actions
- Using data structures instead of separate variables
- Using functions if you're going to be repeating the same blocks of actions throughout your code
- Use of in-built features / external code libraries
- Use of recursion

A fairly common notation you'll see when talking about the efficiency of algorithms is **Big-Oh notation O()**. Big-Oh gives us an upper bound on how quickly an algorithm's run time grows as the input size *n* increases.
Big-Oh has a couple of basic rules:
- If your run time is a polynomial of degree *d*, then the run time is *O(nᵈ)*. You drop any lower-order and constant terms (as n increases, the nᵈ term grows the fastest)
- Use the *smallest possible* class of function
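
Both rules can be seen on a small made-up example. Suppose (purely as an assumption for illustration) that an algorithm takes $T(n) = 3n^2 + 5n + 2$ steps on an input of size $n$. Each lower-order term is no bigger than the same coefficient times $n^2$ once $n \ge 1$, so:

$$
T(n) = 3n^2 + 5n + 2 \le 3n^2 + 5n^2 + 2n^2 = 10n^2 \quad \text{for all } n \ge 1
$$

That means $T(n)$ is bounded above by a constant multiple of $n^2$, so it is O(n²). It is technically O(n³) as well, but the second rule tells us to quote the smallest class that works.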

For example, consider going one by one through a list of *n* elements. As we add more elements to the list, it's going to take longer to visit all of the elements. The time varies linearly (hence the term *linear search*) with input size, so we say it runs in O(n) time.
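
As a minimal sketch of that idea, here's one way a linear search might look. The function name `linear_search` and the choice to return -1 for a missing value are assumptions made for this example.

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is not present."""
    # In the worst case we look at every one of the n elements once: O(n)
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1

numbers = [2, 7, 1, 5, 3, 8, 0]
print(linear_search(numbers, 3))  # 4
print(linear_search(numbers, 9))  # -1
```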

By contrast, the bubble and insertion sorts we wrote in the previous section run in O(n²) time. That means that the run time of our algorithms grows in proportion to the **square** of the size of the input. For smaller inputs, it's not a huge problem, but if we were sorting arrays with thousands of elements, things would quickly get out of hand.
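
One rough way to see the quadratic growth is to count comparisons rather than time anything. The sketch below uses a hypothetical helper, `count_comparisons`, which bubble sorts a reversed list of `n` numbers (without the early-exit trick) and reports how many comparisons it made.

```python
def count_comparisons(n):
    """Count the comparisons a basic bubble sort makes on a reversed list of n items."""
    arr = list(range(n, 0, -1))  # n, n-1, ..., 1
    comparisons = 0
    for i in range(len(arr)):
        for j in range(len(arr) - i - 1):
            comparisons += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return comparisons

for n in (10, 100, 1000):
    print(n, count_comparisons(n))
# 10 45
# 100 4950
# 1000 499500
```

Making the input 10 times bigger makes the work roughly 100 times bigger, which is exactly what O(n²) predicts.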

That's not to say that there aren't *worse* sorting algorithms out there. The ***Bogosort*** has no upper bound on its runtime (aka O(∞)) and an **average** runtime of O((n+1)!). We're not going to bother with a Python implementation (you can attempt it if you really want; check Python's `random` library to help), but the pseudocode for the randomised Bogosort is:
<pre><code>while not inOrder(list):
    shuffle(list)
</code></pre>
Unsurprisingly, nobody actually uses Bogosorts.

### Divide and Conquer

### More Algorithms: Searching


### Compactness