# Selection Sort

## Lesson Overview

**Selection sort** is a simple sorting algorithm that works by repeatedly moving the minimum element from the input (unsorted) array into the output (sorted) array.

> The average case time complexity of selection sort is $O(n^2)$.

### Algorithm

An implementation of the selection sort algorithm is outlined here.

0. **Initialize** an output array that will eventually contain all of the elements of the input array, sorted.

1. **Select** the minimum element of the input array and move it to the end of the output array. (Be sure to remove the selected element from the input array.)

2. **Repeat** the selection step until the input array is empty.

Selection sort can be implemented either in-place or out-of-place. The version outlined above may look like an out-of-place sorting algorithm, since it creates a new output array. However, it does not *copy* the input array; it *moves* elements from the input array to the output array. The two arrays together never hold more than the original $n$ elements, so selection sort does not need any *new* storage and its extra space complexity is $O(1)$.

**Example**

The following table demonstrates sorting the array [2, 1, 4, 5] using selection sort.



**Iteration** | **Input array** | **Output array**
--- | --- | ---
0 | [2, 1, 4, 5] | []
1 | [2, 4, 5] | [1]
2 | [4, 5] | [1, 2]
3 | [5] | [1, 2, 4]
4 | [] | [1, 2, 4, 5]

## Question

An important step in the selection sort algorithm is finding the index of the minimum element of the array. Write this `minimum_index` function.

If there is more than one minimum element, return the lowest index. For example, `minimum_index([2, 3, 1, 4, 1])` should return `2`, since 1, the lowest element, appears at indices 2 and 4.

In [None]:
def minimum_index(arr):
  """Returns the index of the minimum value within a numerical array."""
  # TODO(you): Implement

### Hint

Use the following code scaffolding.

In [None]:
def minimum_index(arr):
  """Returns the index of the minimum value within a numerical array."""
  # Initialize the minimum index.
  min_index = -1
  # Initialize the minimum value. Infinity is the standard initialization for
  # such functions, since every integer is less than infinity.
  min_value = float("Inf")

  # Iterate through the input array.
  for i in range(len(arr)):
    # Reset min_index and min_value if you find a value lower than min_value.
    # TODO(you): Complete
  
  return min_index

### Unit Tests

Run the following cell to check your answer against some unit tests.

In [None]:
print(minimum_index([2, 3, 1, 4, 5]))
# Should print: 2
print(minimum_index([2, 3, 1, 4, 1]))
# Should print: 2

### Solution

In [None]:
def minimum_index(arr):
  """Returns the index of the minimum value within a numerical array."""
  # Initialize the minimum index.
  min_index = -1
  # Initialize the minimum value. Infinity is the standard initialization for
  # such functions, since every integer is less than infinity.
  min_value = float("Inf")

  # Iterate through the input array.
  for i in range(len(arr)):
    # Reset min_index and min_value if you find a value lower than min_value.
    if arr[i] < min_value:
      min_index = i
      min_value = arr[i]
  
  return min_index

## Question

What are the best, worst, and average case time complexities of `minimum_index`?

In [None]:
#freetext

### Solution

Every individual statement in `minimum_index` is $O(1)$, so the time complexity of `minimum_index` is determined by the number of iterations of the `for` loop. In all cases, the loop iterates over all $n$ elements of `arr`, so the time complexity is $O(n)$ in the best, worst, and average case.
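
One way to check this claim is to count loop iterations directly. The sketch below is illustrative only: `count_loop_iterations` is a hypothetical helper (not part of the lesson) that mirrors the loop in `minimum_index` but returns the number of iterations instead of the index.

```python
def count_loop_iterations(arr):
  """Counts how many times a minimum_index-style loop body runs."""
  iterations = 0
  min_value = float("Inf")
  for i in range(len(arr)):
    iterations += 1
    if arr[i] < min_value:
      min_value = arr[i]
  return iterations


# Sorted, reverse-sorted, and shuffled inputs of the same length.
for arr in ([1, 2, 3, 4], [4, 3, 2, 1], [3, 1, 4, 2]):
  print(count_loop_iterations(arr))
# Should print 4 three times.
```

The count depends only on the length of the input, not on the order of its elements.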

## Question

Implement `selection_sort` using the `minimum_index` function.

In [None]:
def minimum_index(arr):
  """Returns the index of the minimum value within a numerical array."""
  # Initialize the minimum index.
  min_index = -1
  # Initialize the minimum value. Infinity is the standard initialization for
  # such functions, since every integer is less than infinity.
  min_value = float("Inf")

  # Iterate through the input array.
  for i in range(len(arr)):
    # Reset min_index and min_value if you find a value lower than min_value.
    if arr[i] < min_value:
      min_index = i
      min_value = arr[i]
  
  return min_index

In [None]:
def selection_sort(arr):
  """Sorts an array of integers in ascending order."""
  # TODO(you): Implement
  print('This function has not been implemented.')

### Unit Tests

Run the following cell to check your answer against some unit tests.

In [None]:
print(selection_sort([2, 1, 4, 5, 2, 3, 7, 6]))
# Should print: [1, 2, 2, 3, 4, 5, 6, 7]

### Solution

This is by no means the only possible implementation. If your solution works and has the same time complexity, then it is completely valid!

In [None]:
def selection_sort(arr):
  """Sorts an array of integers in ascending order."""
  output = []

  while len(arr) > 0:
    # Find the index of the minimum value within the input array.
    min_index = minimum_index(arr)
    # Remove the minimum value from the input array.
    min_value = arr.pop(min_index)
    # Add the minimum value to the output array.
    output.append(min_value)
  
  return output
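
As one example of an alternative, here is a minimal sketch of an in-place variant. It assumes the caller is happy for `arr` itself to be reordered. The name `selection_sort_in_place` is illustrative and not part of the lesson. Instead of moving elements to a separate output array, it swaps each minimum into the front of the unsorted portion.

```python
def selection_sort_in_place(arr):
  """Sorts arr in ascending order in place and returns it."""
  n = len(arr)
  for start in range(n - 1):
    # Everything before index start is already sorted.
    # Find the index of the minimum value in arr[start:].
    min_index = start
    for i in range(start + 1, n):
      if arr[i] < arr[min_index]:
        min_index = i
    # Swap the minimum into the first unsorted position.
    arr[start], arr[min_index] = arr[min_index], arr[start]
  return arr


print(selection_sort_in_place([2, 1, 4, 5, 2, 3, 7, 6]))
# Should print: [1, 2, 2, 3, 4, 5, 6, 7]
```

This variant still scans the whole unsorted portion on every pass, so it has the same time complexity as the solution above.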

## Question

What are the best and worst case time complexities of selection sort?

In [None]:
#freetext

### Hint

Remember that `minimum_index` is $O(n)$. How many iterations does `selection_sort` have in the worst case?

### Solution

Unlike bubble sort, every step of selection sort must be performed regardless of what the input is. For example, even if the input array is already sorted, selection sort still repeatedly selects and moves the minimum of the input array to the output array. This indicates that the best and worst case time complexities should be the same.

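The two lines below drive the time complexity of `selection_sort`. (The call `arr.pop(min_index)` can also take $O(n)$ time on a Python list, since the elements after `min_index` must shift down, but it runs once per iteration alongside `minimum_index` and so does not change the overall result.) All other lines are $O(1)$.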

```python
while len(arr) > 0:
  min_index = minimum_index(arr)
```

The `while` loop has $n$ iterations, since the initial length of `arr` is $n$ and at each iteration 1 element is popped out. This is true in the best *and* the worst case. As per a previous question and the hint, `minimum_index` is $O(n)$. Therefore, since `selection_sort` contains $n$ calls of an $O(n)$ function, the best and worst case time complexity is $O(n^2)$.
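
Strictly speaking, `arr` shrinks by one element per iteration, so `minimum_index` scans $n$, then $n-1$, and so on down to $1$ element. The sketch below adds up those scan lengths and compares the total with $n(n+1)/2$, which is still $O(n^2)$. The helper name `count_elements_examined` is hypothetical and used only for illustration.

```python
def count_elements_examined(n):
  """Totals the elements scanned by minimum_index while sorting n items."""
  total = 0
  remaining = n
  while remaining > 0:
    # One call to minimum_index scans every remaining element.
    total += remaining
    # One element is then popped from the input array.
    remaining -= 1
  return total


for n in (4, 8, 16):
  print(n, count_elements_examined(n), n * (n + 1) // 2)
# Should print:
# 4 10 10
# 8 36 36
# 16 136 136
```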

## Question

What is the average case time complexity of selection sort?

In [None]:
#freetext

### Solution

Since the best and worst case time complexities are both $O(n^2)$, the time complexity must be $O(n^2)$ in *all* cases. Therefore, the average case time complexity is also $O(n^2)$.

## Question

Your friend Novell works for a publishing company called *A2ZBooks*, and his first task is to create a dictionary. Instead of building the dictionary from scratch, Novell has the clever idea to take all of the unique words in all of the books published by *A2ZBooks*, and just put them in order.

Novell has created an array of all the words used in all the books. This array is called `words`. He first converts all of the words to lower-case. He then uses some nifty code to create an array of unique words, in just one line of code. Finally, since he knows that sorting algorithms can be used just as effectively on strings as on integers, he uses your selection sort algorithm above to sort the unique words.

```python
# Lower-case every word in words.
lower_case_words = [word.lower() for word in words]
# Create a list of the unique words.
unique_words = list(set(lower_case_words))
# Sort the words using selection sort.
sorted_words = selection_sort(unique_words)
```

Novell is convinced that this approach makes sense, but the `selection_sort` call throws a `TypeError`. Why is this? Can you adapt your code above to make it work for strings, while also still working for integers and floats? Below is the `selection_sort` code for reference.

(If your solution above already works for strings, then you have already completed this question!)

In [None]:
def minimum_index(arr):
  """Returns the index of the minimum value within an array."""
  # Initialize the minimum index.
  min_index = -1
  # Initialize the minimum value. Infinity is the standard initialization for
  # such functions, since every integer is less than infinity.
  min_value = float("Inf")

  # Iterate through the input array.
  for i in range(len(arr)):
    # Reset min_index and min_value if you find a value lower than min_value.
    if arr[i] < min_value:
      min_index = i
      min_value = arr[i]
  
  return min_index


def selection_sort(arr):
  """Sorts an array in ascending order."""
  # TODO(you): Make this function also work for strings.
  output = []

  while len(arr) > 0:
    # Find the index of the minimum value within the input array.
    min_index = minimum_index(arr)
    # Remove the minimum value from the input array.
    min_value = arr.pop(min_index)
    # Add the minimum value to the output array.
    output.append(min_value)
  
  return output

### Unit Tests

Run the following cell to check your answer against some unit tests.

In [None]:
print(selection_sort(['cat', 'ant', 'bee', 'aardvark']))
# Should print: ['aardvark', 'ant', 'bee', 'cat']

### Solution

*Almost* every line of the code works just as well for strings as it does for integers and floats. The only exception is in `minimum_index`, where `min_value` is initialized as `float("Inf")`. Python cannot compare this to a string, so the first `arr[i] < min_value` comparison raises a `TypeError`.
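
The failing comparison can be reproduced in isolation. The following is a small sketch, assuming Python 3. The helper `comparison_fails` is hypothetical and exists only to report whether `a < b` raises a `TypeError`.

```python
def comparison_fails(a, b):
  """Returns True if evaluating a < b raises a TypeError, else False."""
  try:
    a < b
  except TypeError:
    return True
  return False


print(comparison_fails(3, float("Inf")))
# Should print: False
print(comparison_fails("cat", float("Inf")))
# Should print: True
print(comparison_fails("cat", "dog"))
# Should print: False
```

Strings compare happily with other strings, and numbers with infinity. Only the mixed comparison fails.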

To fix this, we would need to find an equivalent "positive infinity" word that is greater than every possible string, similarly to how `float("Inf")` is greater than any float or integer. Unfortunately, such a string does not exist. Therefore, the solution is a bit more nuanced.

Instead of initializing `min_index` and `min_value` at `-1` and `float("Inf")` respectively, we will initialize them at `0` and `arr[0]` respectively, with the caveat that if `arr` is empty, we `return -1`. Then, we iterate through the remaining elements of `arr`.

In [None]:
def minimum_index(arr):
  """Returns the index of the minimum value within an array."""
  # This is necessary to ensure that arr[0] exists.
  if len(arr) == 0:
    return -1

  # Initialize the minimum index.
  min_index = 0
  # Initialize the minimum value.
  min_value = arr[0]

  # Iterate through the input array.
  for i in range(1, len(arr)):
    # Reset min_index and min_value if you find a value lower than min_value.
    if arr[i] < min_value:
      min_index = i
      min_value = arr[i]
  
  return min_index

Now, `selection_sort` works for strings.

In [None]:
def selection_sort(arr):
  """Sorts an array in ascending order."""
  output = []

  while len(arr) > 0:
    # Find the index of the minimum value within the input array.
    min_index = minimum_index(arr)
    # Remove the minimum value from the input array.
    min_value = arr.pop(min_index)
    # Add the minimum value to the output array.
    output.append(min_value)
  
  return output

Since the new `selection_sort` now works for strings as well as integers and floats, it is a more robust algorithm. It is therefore probably a better implementation, even though it contains some extra logic.
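
One thing to keep in mind when using `selection_sort` as written above is that it pops every element out of the list it is given, so the caller's list is empty afterwards. If Novell wanted to keep `unique_words` intact, one option is to sort a copy instead. The sketch below is a hypothetical variant (the name `sorted_copy` is not part of the lesson) that inlines the minimum search so that it runs on its own.

```python
def sorted_copy(arr):
  """Returns a sorted copy of arr, leaving arr itself untouched."""
  # Work on a shallow copy so that the caller's list is not emptied.
  remaining = list(arr)
  output = []

  while len(remaining) > 0:
    # Find the index of the minimum value, starting from index 0.
    min_index = 0
    for i in range(1, len(remaining)):
      if remaining[i] < remaining[min_index]:
        min_index = i
    # Move the minimum value from the copy to the output array.
    output.append(remaining.pop(min_index))

  return output


words = ['cat', 'ant', 'bee', 'aardvark']
print(sorted_copy(words))
# Should print: ['aardvark', 'ant', 'bee', 'cat']
print(words)
# Should print: ['cat', 'ant', 'bee', 'aardvark']
```

The copy costs $O(n)$ extra space, which is the trade-off for leaving the input unchanged.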