# Sorting

In [2]:
# Sorting means arranging the items of a data structure in a defined order. There are dozens of sorting algorithms,
# which differ in method and implementation. We will go through some of them. But first, let's understand why
# sorting is important and look at some of the problems that different sorting techniques can solve.

### Problems sorted by sorting

In [3]:
# 1. Searching a list is faster when it is sorted (binary search instead of a linear scan)
# 2. Finding duplicates is easier in a sorted list, because equal items end up next to each other
# 3. Finding the Kth smallest (or largest) element is a simple index lookup once the list is sorted
# 4. Analysing the data distribution (median, percentiles, outliers) is easier on a sorted list
# etc.
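
The first three points can be sketched with the standard library. This is only a minimal illustration and not part of the original notebook: `contains`, `find_duplicates` and `kth_smallest` are hypothetical helper names, and `k` is assumed to be 1-based.

```python
from bisect import bisect_left


def contains(sorted_list, target):
    """Binary search: O(log n) lookup, valid only on a sorted list."""
    idx = bisect_left(sorted_list, target)
    return idx < len(sorted_list) and sorted_list[idx] == target


def find_duplicates(sorted_list):
    """In a sorted list, equal values sit next to each other."""
    dups = []
    for prev, curr in zip(sorted_list, sorted_list[1:]):
        # record each duplicated value only once
        if prev == curr and (not dups or dups[-1] != curr):
            dups.append(curr)
    return dups


def kth_smallest(sorted_list, k):
    """The k-th smallest element (1-based) is a plain index lookup."""
    return sorted_list[k - 1]


data = sorted([19, 2, 31, 2, 6, 19, 121])  # [2, 2, 6, 19, 19, 31, 121]
print(contains(data, 31))     # True
print(contains(data, 7))      # False
print(find_duplicates(data))  # [2, 19]
print(kth_smallest(data, 3))  # 6
```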

## Bubble sort

> As the name implies, a "bubble" travels from the start of the list to the end, comparing two adjacent elements at a time. If the pair is out of order, the bubble swaps the two elements and then moves on to the next pair. It makes up to n such passes, n being the number of elements in the list to be sorted. Each pass pushes the greatest remaining element to its final position at the end of the unsorted part, so the next pass doesn't need to compare it again. Every pass therefore makes one comparison fewer than the pass before it.

In [4]:
# Implementation
def bubble_sort(arr):
    l = len(arr)
    for i in range(l):
        for j in range(l-i-1):
            # check condition
            if arr[j] > arr[j+1]:
                # swapping if the elements are not in order
                arr[j], arr[j+1] = arr[j+1], arr[j]
    return arr

In [5]:
# Let's test it
list1 = [19,2,31,45,6,11,121,27]
bubble_sort(list1)

[2, 6, 11, 19, 27, 31, 45, 121]
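
One thing worth noticing is that this function sorts the list in place and returns the very same list object. The sketch below shows that, and also cross-checks the result against Python's built-in `sorted()`. The function is repeated only so the snippet runs on its own; the seed and the sample sizes are arbitrary choices.

```python
import random


def bubble_sort(arr):
    # Same algorithm as above, repeated so this snippet runs on its own
    n = len(arr)
    for i in range(n):
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr


original = [19, 2, 31, 45, 6, 11, 121, 27]
result = bubble_sort(original)
print(result is original)  # True: the input list itself was reordered

# Pass a copy when the original order still matters
original = [19, 2, 31, 45, 6, 11, 121, 27]
sorted_copy = bubble_sort(list(original))
print(original)     # [19, 2, 31, 45, 6, 11, 121, 27]
print(sorted_copy)  # [2, 6, 11, 19, 27, 31, 45, 121]

# Cross-check against sorted() on random inputs
random.seed(0)  # arbitrary seed, only for repeatability
for _ in range(100):
    sample = [random.randint(0, 50) for _ in range(random.randint(0, 20))]
    assert bubble_sort(list(sample)) == sorted(sample)
```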

In [8]:
# Cool. We can improve this algorithm by adding a flag that tells us when the array is already sorted, so that we
# can stop before going through all n loops. This is not part of the basic bubble sort, but it optimises the code a
# bit, so let's try it out.

# Implementation
def bubble_sort(arr):
    l = len(arr)
    for i in range(l):
        # At the start of every loop we set the already_sorted flag to True
        already_sorted = True
        for j in range(l-i-1):
            # check condition
            if arr[j] > arr[j+1]:
                # swapping if the elements are not in order
                arr[j], arr[j+1] = arr[j+1], arr[j]
                # a swap in this loop means the array was not yet sorted, so set the flag to False
                already_sorted = False
        # If a whole loop finishes without a single swap (this can happen as early as the first loop), the array
        # is sorted and we can break out of the for loop now
        if already_sorted:
            print("breaking at {}th loop".format(i))
            break
    return arr

In [9]:
# Testing the optimised code
list1 = [19,2,31,45,6,11,121,27]
bubble_sort(list1)

breaking at 3th loop


[2, 6, 11, 19, 27, 31, 45, 121]

In [10]:
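# The list was fully sorted by the end of the third loop (i = 2). The fourth loop (i = 3) made no swaps, so we
# stopped there and skipped the remaining 4 loops. We should be proud.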
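
To see where the work actually happens, here is a small tracing sketch. `bubble_sort_trace` is a hypothetical helper (not part of the notebook above) that runs the same optimised algorithm on a copy and records the number of swaps and the state of the list after every loop.

```python
def bubble_sort_trace(arr):
    """Optimised bubble sort on a copy; returns (loop index, swaps, snapshot) per loop."""
    arr = list(arr)
    history = []
    n = len(arr)
    for i in range(n):
        swaps = 0
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swaps += 1
        history.append((i, swaps, list(arr)))
        # a loop without swaps means the list is sorted
        if swaps == 0:
            break
    return history


for i, swaps, snapshot in bubble_sort_trace([19, 2, 31, 45, 6, 11, 121, 27]):
    print(i, swaps, snapshot)
# 0 4 [2, 19, 31, 6, 11, 45, 27, 121]
# 1 3 [2, 19, 6, 11, 31, 27, 45, 121]
# 2 3 [2, 6, 11, 19, 27, 31, 45, 121]
# 3 0 [2, 6, 11, 19, 27, 31, 45, 121]
```

The last line is the loop with zero swaps: it changes nothing, it only confirms that the list is sorted.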

In [11]:
# Big O notation.
# We could use the timeit module to measure how long different implementations take to run, but a better approach
# is to understand the relationship between the length of the input n and the time to execute, which is usually
# expressed in Big O notation.
# O(1) : Constant time, like looking up a value in a hash table (on average)
# O(n) : Linear time, like searching an unsorted list or doing an operation on each element of the list
# O(n**2) : Quadratic time, when every item may have to be compared with every other item, as in bubble sort
# O(2**n) : Exponential time. Very inefficient. Execution time roughly doubles with every element added to the
#           input. The known exact algorithms for the travelling salesman problem (an NP-hard problem) take
#           exponential time or worse
# O(log n) : Logarithmic time, like searching a balanced binary search tree. Very efficient. Time grows linearly
#            when the input length grows exponentially. It's really crazy
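
The quadratic growth of bubble sort can be made visible without a stopwatch by counting comparisons instead of measuring time. This is a rough sketch: `count_bubble_comparisons` is a hypothetical helper that runs the basic (non-optimised) bubble sort on a copy of the input, and the input sizes are arbitrary.

```python
def count_bubble_comparisons(arr):
    """Run the basic bubble sort on a copy and count element comparisons."""
    arr = list(arr)
    comparisons = 0
    n = len(arr)
    for i in range(n):
        for j in range(n - i - 1):
            comparisons += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return comparisons


for n in (10, 20, 40, 80):
    reversed_input = list(range(n, 0, -1))
    print(n, count_bubble_comparisons(reversed_input))
# 10 45
# 20 190
# 40 780
# 80 3160
```

Every time n doubles, the number of comparisons roughly quadruples, which is what O(n**2) predicts. The exact count is n * (n - 1) / 2, whatever the order of the input.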


## Insertion Sort

> Insertion sort picks one element at a time from the array and "inserts" it at its correct position among the elements before it. Where bubble sort grows its sorted region at the end of the list, insertion sort grows a sorted region from the beginning. Each new element is compared with the elements of this sorted region, from right to left, and is inserted at the right position.

In [12]:
# Implementation
def insertion_sort(arr):
    l = len(arr)
    # A single element (index 0) is trivially sorted, so we can begin from the next element
    for i in range(1, l):
        curr_item = arr[i]
        # start comparing with the element just before the current item
        j = i - 1
        # keep comparing curr_item with the elements of the sorted part, moving towards the left
        while j >= 0 and arr[j] > curr_item:
            # if the element at j is bigger than curr_item, move it one position to the right
            arr[j + 1] = arr[j]
            
            # now we move to the position one before the current one
            j -= 1
        # Once we reach an element that is not greater than curr_item (or run off the left end of the array),
        # we place curr_item in the position just to the right of it
        arr[j+1] = curr_item
    # after this for loop we get a sorted list
    return arr
        

In [14]:
# Testing
list1 = [19,2,31,45,6,11,121,27]
insertion_sort(list1)

[2, 6, 11, 19, 27, 31, 45, 121]

In [15]:
# Hooray. This looks good
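
How much work insertion sort does depends heavily on how sorted the input already is. The sketch below counts how many times an element gets moved one position to the right. `insertion_sort_shifts` is a hypothetical variant of the function above that works on a copy and returns the count along with the sorted list.

```python
def insertion_sort_shifts(arr):
    """Insertion sort on a copy; returns (sorted list, number of shifts)."""
    arr = list(arr)
    shifts = 0
    for i in range(1, len(arr)):
        curr_item = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > curr_item:
            arr[j + 1] = arr[j]  # move the bigger element one slot to the right
            shifts += 1
            j -= 1
        arr[j + 1] = curr_item
    return arr, shifts


print(insertion_sort_shifts([1, 2, 3, 4, 5, 6, 7, 8])[1])            # 0
print(insertion_sort_shifts([8, 7, 6, 5, 4, 3, 2, 1])[1])            # 28
print(insertion_sort_shifts([19, 2, 31, 45, 6, 11, 121, 27])[1])     # 10
```

An already sorted list needs no shifts at all, so the inner while loop never runs and the sort finishes in linear time. A reversed list is the worst case with n * (n - 1) / 2 shifts, which is quadratic.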