# Comparators

Comparators are useful when the data you want to sort is an array or list of objects, and these objects have several attributes. If you had an array of integers or an array of strings,
it’s pretty obvious what is being sorted! Your favorite programming language is already smart enough
to know how to sort simple data like numbers and strings. But suppose you had a list of planet objects.
Each planet might have four attributes: name, number of moons, distance from the sun, and mass. You
could conceivably desire to sort your planets on any one of these attributes.

<pre>
Writing a comparator entails writing a function that takes two objects of the same type. Suppose we
call these two objects obj1 and obj2. The comparator function will return an integer. There are three
possibilities.
• If obj1 should appear before obj2 in a sorted list, then we return a negative number.
• If obj1 should appear after obj2 in a sorted list, then we return a positive number.
• If we don’t care which object comes first (because they might be equal), then we return zero.
By “negative number” and “positive number”, it does not matter exactly which integer you return. By
convention, we would return –1 or +1, or you may decide to “subtract” the object values, if that makes
sense. (Note that the way that comparators are defined implies that by default we would expect lists to
be sorted in ascending order.)
</pre>
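
As a minimal sketch of the three-way rule above, the following uses a hypothetical list of planet records. The tuple layout `(name, number of moons)` and the names `planets` and `by_moons` are assumptions made for illustration. It "subtracts" the moon counts, which yields a negative, positive, or zero result as required.

```python
from functools import cmp_to_key

# Hypothetical records: (name, number of moons)
planets = [("Earth", 1), ("Mars", 2), ("Venus", 0), ("Mercury", 0)]

def by_moons(p1, p2):
    """Negative if p1 sorts first, positive if p2 sorts first, zero if tied."""
    return p1[1] - p2[1]  # "subtracting" the values gives the right sign

by_moons_sorted = sorted(planets, key=cmp_to_key(by_moons))
print(by_moons_sorted)
```

Venus and Mercury tie, so the comparator returns zero for them. Python's sort is stable, which means tied items keep their original relative order.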

<pre>
To illustrate, suppose we had the following data about countries. Let’s see how we would allow a
program to sort on any of the three columns (attributes).
Name          Dialing code   Population (millions)
India         91             1148
China         86             1321
New Zealand   64             5
France        33             65
Mexico        52             110
</pre>

In [3]:
from functools import cmp_to_key

class Country:

    def __init__(self,name,code,population):
        self.name=name
        self.code=code
        self.population=population

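    # Each comparator takes two rows of the form [name, code, population]
    # and returns -1, 1 or 0 following the rules described above.
    @staticmethod
    def byAlpha(countryA,countryB):
        # Compare by name (index 0)
        if countryA[0]<countryB[0]:
            return -1
        elif countryA[0]>countryB[0]:
            return 1
        else:
            return 0

    @staticmethod
    def byCode(countryA,countryB):
        # Compare by dialing code (index 1)
        if countryA[1]<countryB[1]:
            return -1
        elif countryA[1]>countryB[1]:
            return 1
        else:
            return 0

    @staticmethod
    def byPopulation(countryA,countryB):
        # Compare by population in millions (index 2)
        if countryA[2]<countryB[2]:
            return -1
        elif countryA[2]>countryB[2]:
            return 1
        else:
            return 0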


def sortArray(country):
    country=sorted(country,key=cmp_to_key(Country.byCode))
    print(country)

if __name__ == '__main__':
    country=[["India", 91, 1148],["China", 86, 1321], ["New Zealand", 64, 5], ["France", 33, 65], ["Mexico", 52, 110]]
    sortArray(country)


[['France', 33, 65], ['Mexico', 52, 110], ['New Zealand', 64, 5], ['China', 86, 1321], ['India', 91, 1148]]
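
The cell above sorts only on the dialing code. A sketch of sorting the same rows on each of the three columns might look like the following. It uses `operator.itemgetter` as a key function, which is an alternative to the comparator approach rather than the method used above. The names `rows`, `by_name`, `by_code` and `by_population` are assumptions.

```python
from operator import itemgetter

# Same data as above: [name, dialing code, population in millions]
rows = [["India", 91, 1148], ["China", 86, 1321], ["New Zealand", 64, 5],
        ["France", 33, 65], ["Mexico", 52, 110]]

by_name = sorted(rows, key=itemgetter(0))                      # ascending by name
by_code = sorted(rows, key=itemgetter(1))                      # ascending by dialing code
by_population = sorted(rows, key=itemgetter(2), reverse=True)  # largest population first

print([r[0] for r in by_name])
print([r[0] for r in by_code])
print([r[0] for r in by_population])
```

A key function is usually enough when each item maps to a single sortable value. A comparator is more convenient when the ordering rule depends on comparing two items directly.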


# Sort elements by frequency

Print the elements of an array in decreasing order of frequency. If two numbers have the same frequency, print the one that appeared first in the array.

Approach 1 - Hashing and Sorting

In [4]:
from functools import cmp_to_key

def sortByFrequency(arr):
    # info[value] = [index of first occurrence, frequency]
    # Kept local so repeated calls do not accumulate counts, and named
    # info so the built-in map() is not shadowed.
    info={}
    for i,value in enumerate(arr):
        if value in info:
            info[value][1]+=1
        else:
            info[value]=[i,1]

    def compare(a,b):
        # Higher frequency comes first
        if info[a][1]!=info[b][1]:
            return -1 if info[a][1]>info[b][1] else 1
        # Equal frequency: the earlier first occurrence comes first
        if info[a][0]!=info[b][0]:
            return -1 if info[a][0]<info[b][0] else 1
        return 0

    sortedEle=sorted(info,key=cmp_to_key(compare))
    for value in sortedEle:
        # Print each value as many times as it occurs
        for _ in range(info[value][1]):
            print(value,end=" ")

if __name__ == '__main__':
    # arr=[2,5,2,8,5,6,8,8]
    arr=[2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8]
    # arr="tree"
    sortByFrequency(arr)


8 8 8 2 2 5 5 6 -1 9999999 
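
As a shorter alternative sketch of the same hashing-and-sorting idea, `collections.Counter` can do the counting. This relies on two assumptions worth stating: `Counter` keeps keys in first-occurrence order (Python 3.7+), and `sorted` is stable, so ties on frequency stay in that order. The function name `sort_by_frequency` is hypothetical, and it returns a list instead of printing.

```python
from collections import Counter

def sort_by_frequency(arr):
    counts = Counter(arr)  # keys keep first-occurrence order (Python 3.7+)
    # Stable sort on negated count: most frequent first, ties keep key order
    ordered = sorted(counts, key=lambda v: -counts[v])
    result = []
    for value in ordered:
        result.extend([value] * counts[value])
    return result

print(sort_by_frequency([2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8]))
```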

Approach 2 - Using BST

In [5]:
class Node:
    def __init__(self,data,count):
        self.data=data
        self.count=count
        self.left=None
        self.right=None

def store(root,value,count):
    if root is None:
        return Node(value,count)
    if root.count<count:
        root.right=store(root.right,value,count)
    else:
        root.left=store(root.left,value,count)
    return root

def inorder(root):
    if root is None:
        return
    inorder(root.right)
    for i in range(root.count):
        print(root.data,end=" ")
    # print(root.data,root.count)
    inorder(root.left)

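def sortByFrequency(arr):
    # Count how often each value occurs; dict keys keep first-occurrence order
    freq={}
    for value in arr:
        freq[value]=freq.get(value,0)+1

    # Insert each (value, count) pair into a BST keyed on count
    root=None
    for value in freq:
        root=store(root,value,freq[value])

    # Reverse in-order traversal prints the highest counts first
    inorder(root)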

if __name__ == '__main__':
    arr=[2,5,2,8,5,6,8,8]
    # arr=[2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8]
    # arr="tree"
    # arr=[2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12]
    sortByFrequency(arr)


8 8 8 2 2 5 5 6 

# Count Inversions in an array

The inversion count of an array indicates how far (or close) the array is from being sorted. If the array is already sorted, the inversion count is 0. If the array is sorted in reverse order, the inversion count is at its maximum, n(n-1)/2 for n distinct elements.

Formally, two elements a[i] and a[j] form an inversion if a[i] > a[j] and i < j.

Approach 1 -> Simple

Fix one element at a time and check each of the elements after it to see whether the pair forms an inversion.

In [2]:
# Simple Approach
def countInversions(arr):
    result=0
    for i in range(len(arr)):
        for j in range(i+1,len(arr)):
            if arr[j]<arr[i]:
                result+=1
    return result

if __name__ == '__main__':
    arr=[8, 4, 2, 1]
    result=countInversions(arr)
    print(result)


6


Time Complexity - O(n^2), Space Complexity - O(1)

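Approach 2 -> Enhanced Merge Sort

In the merge step, let i index the left subarray and j index the right subarray. At any step of the merge, if left[i] is greater than right[j], there are (mid - i) inversions involving right[j], where mid is the length of the left subarray. This is because both subarrays are already sorted, so left[i] and every element after it in the left subarray (left[i], left[i+1], …, left[mid-1]) are greater than right[j]. The total is these cross inversions plus the inversions counted recursively within each half.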

In [3]:
def countInversions(arr):
    inversionCount=0
    if len(arr)<=1:
        return inversionCount
    mid=len(arr)//2
    left=arr[:mid]
    right=arr[mid:]
    inversionCount+=countInversions(left)
    inversionCount+=countInversions(right)

    i=j=k=0

    while i<len(left) and j<len(right):
        # Use <= so that equal elements are not counted as inversions
        if left[i]<=right[j]:
            arr[k]=left[i]
            i+=1
            k+=1
        else:
            arr[k]=right[j]
            j+=1
            k+=1
            inversionCount+=(mid-i)

    while i<len(left):
        arr[k]=left[i]
        i+=1
        k+=1

    while j<len(right):
        arr[k]=right[j]
        j+=1
        k+=1

    return inversionCount

if __name__ == '__main__':
    arr=[8, 4, 2, 1]
    # arr=[3,1,2]
    result=countInversions(arr)
    print(result)


6


Other Approaches - AVL Tree & Binary Indexed Tree
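
As a rough sketch of the Binary Indexed Tree idea, under the assumption that values are first compressed to ranks 1..m: scan the array from right to left, and for each element ask the tree how many strictly smaller elements have already been seen (these lie to its right). The names `count_inversions_bit`, `update` and `prefix_count` are hypothetical.

```python
def count_inversions_bit(arr):
    # Coordinate compression: map each distinct value to a rank in 1..m
    ranks = {v: r for r, v in enumerate(sorted(set(arr)), start=1)}
    m = len(ranks)
    tree = [0] * (m + 1)  # 1-based Binary Indexed Tree of occurrence counts

    def update(pos):
        # Record one occurrence of the value with rank pos
        while pos <= m:
            tree[pos] += 1
            pos += pos & -pos

    def prefix_count(pos):
        # Number of recorded values with rank <= pos
        total = 0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    inversions = 0
    # Scan right to left: everything already in the tree lies to the right
    for value in reversed(arr):
        r = ranks[value]
        inversions += prefix_count(r - 1)  # strictly smaller elements to the right
        update(r)
    return inversions

print(count_inversions_bit([8, 4, 2, 1]))
```

Each update and query takes O(log n), so the whole count runs in O(n log n) time with O(n) extra space. Unlike the merge-based version, the input array is left unmodified.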

# Sort an array of 0s, 1s and 2s

The task is to segregate the 0s, 1s and 2s.

A simple solution is to count the number of 0s, 1s and 2s and then use those counts to build the result array. The time complexity is O(n), but it requires two traversals of the array.
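
A minimal sketch of this counting solution, assuming the input contains only the values 0, 1 and 2 (the function name `sort_012_by_counting` is hypothetical):

```python
def sort_012_by_counting(arr):
    # First traversal: tally how many 0s, 1s and 2s there are
    counts = [0, 0, 0]
    for value in arr:
        counts[value] += 1
    # Second traversal: build the result array from the tallies
    return [0] * counts[0] + [1] * counts[1] + [2] * counts[2]

print(sort_012_by_counting([0, 1, 1, 0, 1, 2, 1, 2, 0, 0, 0, 1]))
```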

<strong>Another Approach-> Dutch National Flag Problem</strong>

In [5]:
def rearrangeElements(arr,n):
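    # Invariant: arr[:low] are 0s, arr[low:curr] are 1s, arr[high+1:] are 2s,
    # and arr[curr:high+1] is still unprocessed.
    low=0
    curr=0
    high=n-1

    while curr<=high:
        if arr[curr]==0:
            # Move the 0 into the 0s region; the element swapped back is a 1
            # (or arr[curr] itself), so curr can advance
            arr[curr],arr[low]=arr[low],arr[curr]
            low+=1
            curr+=1
        elif arr[curr]==1:
            curr+=1
        else:
            # Move the 2 to the end; do not advance curr, because the element
            # swapped in has not been examined yet
            arr[curr],arr[high]=arr[high],arr[curr]
            high-=1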

if __name__ == '__main__':
    # arr=[0, 1, 2, 0, 1, 2]
    arr=[0, 1, 1, 0, 1, 2, 1, 2, 0, 0, 0, 1]
    rearrangeElements(arr,len(arr))
    print(arr)


[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2]


Time Complexity - O(n) and space complexity - O(1)