<section class="section1"><h1>Lab Week 4</h1>
<p>In weeks 2 and 3 we covered</p>
<ul>
<li>lists: definition, slicing;</li>
<li><code>for</code> loops;</li>
<li><code>numpy</code> arrays;</li>
<li><code>numpy</code> linear algebra operations.</li>
</ul>
<section class="section2"><h2>Script</h2>
<p>In previous weeks we introduced containers such as lists and <code>numpy</code> arrays. These can be used for one essential operation in Mathematics, and particularly Operational Research: ordering and sorting things.</p>
<p>For example, let us take an <em>un-ordered</em> list:</p>
</section></section>

In [None]:
list_1 = [3, 2, 5, 1, 4]

<p>There are two ways that we can sort this. There is the <code>sorted</code> function:</p>

In [None]:
print(sorted(list_1))

In [None]:
[1, 2, 3, 4, 5]


<p>This function returns a <em>new list</em>. We can check that the original list is unchanged:</p>

In [None]:
print(list_1)

In [None]:
[3, 2, 5, 1, 4]


<p>There is also the <code>sort</code> <em>method</em>:</p>

In [None]:
list_1.sort()
print(list_1)

In [None]:
[1, 2, 3, 4, 5]


<p>This sorts the original list in place, overwriting its original order. It is usually slightly faster than <code>sorted</code>, but modifying the original data is often not what we want.</p>
<p>If using a <code>numpy</code> array, there are similar functions and methods:</p>

In [None]:
import numpy

In [None]:
array_1 = numpy.array([3, 2, 5, 1, 4])
print(numpy.sort(array_1))
print(array_1)
array_1.sort()
print(array_1)

In [None]:
[1 2 3 4 5]
[3 2 5 1 4]
[1 2 3 4 5]


<p>The built-in functions are efficient and useful. Here we will implement our own sorting algorithm to see how sorting can be done.</p>
<section class="section3"><h3>Unpacking and multiple assignment</h3>
<p>We will start with a small, two-element list:</p>
</section>

In [None]:
list_1 = [2, 1]

<p>We want to sort it in ascending order. We see that to do this we need to switch the first (<code>0</code>) and second (<code>1</code>) entries.</p>
<p>How would you do this?</p>
<p>If you tried:</p>

In [None]:
list_1[0] = list_1[1]
list_1[1] = list_1[0]

<p>then it will fail:</p>

In [None]:
print(list_1)

In [None]:
[1, 1]


<p>The first assignment throws away the entry <code>2</code>, and it can't be recovered.</p>
<p>We could use a temporary variable:</p>

In [None]:
list_1 = [2, 1]

tmp = list_1[0]
list_1[0] = list_1[1]
list_1[1] = tmp

print(list_1)

In [None]:
[1, 2]


<p>This works, but involves more lines of code than we would like.</p>
<p>We can take advantage of a Python feature for assigning several values at once. If you have multiple variables, or a container (like a list) containing multiple objects, on one side of the assignment (<code>=</code>), then Python will match them up one by one with the items on the other side.</p>
<p>For example, we can assign to multiple variables using unpacking:</p>

In [None]:
list_1 = [2, 1]

a, b = list_1
print(a)
print(b)

In [None]:
2
1


<p>We can use the same variables on both sides. Python evaluates the whole right-hand side before assigning anything, so no value is lost:</p>

In [None]:
b, a = a, b
print(a)
print(b)

In [None]:
1
2


<p>So we can use this to flip the entries of <code>list_1</code> as needed:</p>

In [None]:
list_1[1], list_1[0] = list_1[0], list_1[1]
print(list_1)

In [None]:
[1, 2]
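
<p>As a minimal sketch of why the one-line swap loses nothing, the two steps Python performs can be written out separately. The name <code>right_hand_side</code> is introduced here purely for illustration.</p>

```python
list_1 = [2, 1]

# Step 1: Python evaluates the right-hand side first, packing the two
# current values into a temporary tuple.
right_hand_side = (list_1[0], list_1[1])
print(right_hand_side)

# Step 2: only then are the values unpacked into the targets on the left.
list_1[1], list_1[0] = right_hand_side
print(list_1)
```

<p>Because both old values are stored before either entry is overwritten, no explicit <code>tmp</code> variable is needed.</p>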


<section class="section5"><h5>Exercise</h5>
<p>Write a function <code>swap</code> that takes a list, <code>unswapped</code>, and two integers <code>i, j</code>. It should swap the <code>i</code>th and <code>j</code>th entries of the list and return the swapped list.</p>
</section><section class="section3"><h3>Conditional statements</h3>
<p>We still need to get the computer to check when it should swap the entries in a list. That is, how do we make it do the swap <em>only if</em> a certain condition holds?</p>
<p>This is done with the Python <code>if</code> statement. The syntax is similar to that for functions (which used the <code>def</code> keyword) and loops (which used the <code>for</code> keyword):</p>
</section>

In [None]:
list_1 = [2, 1]
list_2 = [3, 4]

if list_1[1] < list_1[0]:
    print("Swapping entries in list 1")
    list_1[1], list_1[0] = list_1[0], list_1[1]

if list_2[1] < list_2[0]:
    print("Swapping entries in list 2")
    list_2[1], list_2[0] = list_2[0], list_2[1]

print(list_1)
print(list_2)

In [None]:
Swapping entries in list 1
[1, 2]
[3, 4]


<p>We see that <em>only the indented code</em> for <code>list_1</code> was executed. The syntax is to use the <code>if</code> keyword followed by some logical (Boolean, true or false) condition. The condition is followed by a colon <code>:</code>, and the lines to be executed when the condition is true are then indented by four spaces. Again, this follows the syntax for functions and loops.</p>
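
<p>To see what the condition itself evaluates to, it can be stored and printed like any other value. This is a small sketch; the variable name <code>needs_swap</code> is an assumption made for illustration.</p>

```python
list_1 = [2, 1]

# The condition is an ordinary expression whose value is True or False.
needs_swap = list_1[1] < list_1[0]
print(needs_swap)

# The indented block runs only when the condition is True.
if needs_swap:
    list_1[1], list_1[0] = list_1[0], list_1[1]

# Checking again: the entries are now in order, so the condition is False.
print(list_1[1] < list_1[0])
print(list_1)
```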
<section class="section5"><h5>Exercise</h5>
<p>Write a function <code>max_swap</code> that takes the unswapped list, <code>unswapped</code>, and two integers <code>n</code> and <code>nmax</code>. It should compute two integers <code>child1 = 2*n+1</code> and <code>child2 = 2*(n+1)</code>, and find which of the entries <code>unswapped[i]</code> is the largest, where <code>i</code> is one of <code>n</code>, <code>child1</code> or <code>child2</code>. If the largest entry is not <code>unswapped[n]</code>, the <code>swap</code> function should be used to move the largest entry to position <code>n</code>. A <code>child*</code> entry should only be considered if <code>child*</code> is less than <code>nmax</code>.</p>
</section><section class="section5"><h5>Exercise</h5>
<p>The function <code>max_swap</code> is part of setting up a <em>heap</em>, as required for the <em>heap sort</em> algorithm seen in lectures. Two modifications are needed to create the heap. First, it should be called on all the entries. Second, if it swaps entries, it must call itself again, to check if any of the new children of the current maximum are larger. So write a function <code>heapify</code> that adds this one line to <code>max_swap</code>.</p>
<p>In a heap sort, we first construct the heap. We start from the <em>middle</em> of our list, work backwards, and ensure that the children of all points in the first half of the list are smaller than their parents. With our <code>heapify</code> function, we can do that with a loop:</p>
</section>
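
<p>The cells that follow assume that <code>swap</code> and <code>heapify</code> already exist. One possible solution sketch is given here so that those cells can be run; it is not the only way to write these functions, and it is worth attempting the exercises before reading it. It assumes that both functions modify the list in place and also return it, which matches how they are called in this script.</p>

```python
def swap(unswapped, i, j):
    """Swap entries i and j of the list in place, and return the list."""
    unswapped[i], unswapped[j] = unswapped[j], unswapped[i]
    return unswapped


def heapify(unswapped, n, nmax):
    """Make entry n at least as large as its children, recursing if needed."""
    child1 = 2*n + 1
    child2 = 2*(n + 1)
    largest = n
    # Only look at children that lie inside the first nmax entries.
    if child1 < nmax and unswapped[child1] > unswapped[largest]:
        largest = child1
    if child2 < nmax and unswapped[child2] > unswapped[largest]:
        largest = child2
    if largest != n:
        unswapped = swap(unswapped, n, largest)
        # The extra line compared with max_swap: re-check the moved entry.
        unswapped = heapify(unswapped, largest, nmax)
    return unswapped


# Build the heap for the example list used in this script.
example = [3, 2, 5, 1, 4]
for i in range(len(example)//2 - 1, -1, -1):
    example = heapify(example, i, len(example))
print(example)
```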

In [None]:
list_1 = [3, 2, 5, 1, 4]
for i in range(len(list_1)//2 - 1, -1, -1):
    list_1 = heapify(list_1, i, len(list_1))
print(list_1)

In [None]:
[5, 4, 3, 1, 2]


<p>Let us check that we have a heap. The first entry (<code>0</code>) is the largest, as required. Its children (entries <code>2 n + 1 = 1</code>, with value <code>4</code> and <code>2 (n + 1) = 2</code> with value <code>3</code>) are both smaller than it, as required. The first child has children (entries <code>2 n + 1 = 3</code>, with value <code>1</code>, and <code>2 (n + 1) = 4</code>, with value <code>2</code>) that are both smaller than it, as required. The second child has no children within the list (both <code>2 n + 1 = 5</code> and <code>2 (n + 1) = 6</code> are too large).</p>
<p>As soon as we have a heap we can work from the end of the list to the beginning, swapping each entry with the first entry (the current maximum) as we go. We then recreate the heap, but only for the entries that have not yet been moved into their final position: that is, for the start of the list.</p>

In [None]:
for i in range(len(list_1)-1, 0, -1):
    list_1 = swap(list_1, i, 0)
    list_1 = heapify(list_1, 0, i)
print(list_1)

In [None]:
[1, 2, 3, 4, 5]


<section class="section5"><h5>Exercise</h5>
<p>Convert this into a <code>heap_sort</code> function that sorts a list in place.</p>
</section><section class="section3"><h3>Functions and documentation</h3>
<p>When we defined the functions above we gave them minimal documentation. We can see this documentation on the screen using the <code>help</code> function:</p>
</section>

In [None]:
help(heap_sort)

In [None]:
Help on function heap_sort in module __main__:

heap_sort(unsorted)
    The heap sort algorithm



<p>Alternatively, in <code>spyder</code>, go to the "Help" pane in the top right and type <code>heap_sort</code> into the box. The help text should appear.</p>
<p>The documentation is needed to explain to the next person to use the function</p>
<ul>
<li>what it does</li>
<li>how it does it</li>
<li>how it should be used</li>
<li>what values it returns.</li>
</ul>
<p>A function without documentation is <strong>fundamentally broken</strong>, as the next person to use it will not (easily) be able to understand it. Since this person is most likely to be you, the person who wrote it, you are helping yourself by writing proper documentation.</p>
<p>There are various conventions for how to document functions in Python. We recommend the "numpydoc" convention, which is used within <code>numpy</code>. We will now improve the documentation for the <code>heap_sort</code> function to illustrate:</p>

In [None]:
def heap_sort(unsorted):
    """
    The heap sort algorithm

    Parameters
    ----------
    unsorted : list of float
        The unsorted list

    Returns
    -------
    sorted : list of float
        The sorted list (the input list itself, sorted in place)

    Notes
    -----
    This algorithm sorts the list in place, replacing the original entries.
    The worst case speed is O(n log(n)).
    """
    # Create the heap
    for i in range(len(unsorted)//2 - 1, -1, -1):
        unsorted = heapify(unsorted, i, len(unsorted))
    # Flatten the heap
    for i in range(len(unsorted)-1, 0, -1):
        unsorted = swap(unsorted, i, 0)
        unsorted = heapify(unsorted, 0, i)
    return unsorted

<p>The documentation starts with a brief description of the algorithm. There is then a section describing the input arguments, or parameters, to the function. The name of each is given, along with its expected type, and what it means. Next is a section on the variables returned from the function, using the same conventions. Finally, a notes section allows us to add more details of the algorithm, when there may be problems, references to the literature, or examples of how to use it.</p>
<p>Check how this documentation appears using the <code>help</code> function, and also when using <code>spyder</code>'s help facility.</p>
<section class="section5"><h5>Exercise</h5>
<p>Think of documentation like a contract. The documentation guarantees that <em>if</em> the input follows the specified form, <em>then</em> the function will return correct output in the specified form.</p>
<p>For the <code>heap_sort</code> function, what input causes the function to fail? Test various inputs (empty lists, single values, strings, lists of lists, etc) and see what happens. How would you tighten and/or improve the documentation to make this clear?</p>
</section>
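
<p>One way to organise such tests is a small helper that reports what happened for each input instead of stopping at the first error. The sketch below is only an illustration: the helper <code>try_sort</code> and the list <code>test_inputs</code> are hypothetical names, and the built-in <code>sorted</code> is used as a stand-in so that the block runs on its own. Pass your own <code>heap_sort</code> in its place to test it.</p>

```python
def try_sort(sort_function, candidate):
    """Apply sort_function to candidate and describe the outcome as text."""
    try:
        return "ok: {}".format(sort_function(candidate))
    except Exception as error:
        # Report the kind of error rather than halting the whole loop.
        return "{}: {}".format(type(error).__name__, error)


# A few inputs of the kinds suggested in the exercise.
test_inputs = [[], [1], [3.5, -1, 2], [[2, 1], [1, 5]], "cab", [1, "a"], 5]

for candidate in test_inputs:
    # Replace sorted with heap_sort once it is defined.
    print(repr(candidate), "->", try_sort(sorted, candidate))
```

<p>Any input that produces an error, or a result you did not expect, is a candidate for an extra sentence in the docstring.</p>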