## Divide and Conquer algorithms

**Input**: A file containing all 100,000 integers between 1 and 100,000 (inclusive) in some order, with no integer repeated.

Your task is to compute the number of inversions in the given file, where the $i^{th}$ row of the file holds the $i^{th}$ entry of an array $A$. An inversion is a pair of indices $(i, j)$ such that $i < j$ and $A[i] > A[j]$.
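
To make the definition concrete, here is a minimal brute-force sketch that checks every pair of positions. The function name `count_inversions_brute_force` is an illustrative assumption and is not part of the notebook. It runs in $O(n^2)$ time, so it is only practical for small arrays, but it can serve as a reference when checking a faster method.

```python
def count_inversions_brute_force(values):
    """Count pairs (i, j) with i < j and values[i] > values[j] in O(n^2) time."""
    n = len(values)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] > values[j]:
                count += 1
    return count


# The inverted value pairs here are (3, 2), (5, 2) and (5, 4)
example = [1, 3, 5, 2, 4, 6]
print(count_inversions_brute_force(example))  # 3
```

With 100,000 entries this approach would need roughly $5 \times 10^9$ comparisons, which motivates the divide and conquer approach.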

In [1]:
import numpy as np

def merge_count(A, B):
    '''
    Merge two sorted arrays A and B into a single sorted array C.
    Count the split inversions between them: pairs (a, b) with a in A,
    b in B, and a > b. Every element of A precedes every element of B
    in the original array, so each such pair is an inversion.

    Arguments:
    A -- sorted array of integers (left half)
    B -- sorted array of integers (right half)

    Returns:
    C -- merged sorted array of integers
    count -- number of split inversions between A and B
    '''
    #Initialize indices for A, B and C
    i = j = k = 0
    count = 0
    C = np.empty(len(A) + len(B), dtype=int)

    #Fill sorted array C with elements from A and B
    #using merge sort strategy
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            C[k] = A[i]
            i += 1
        else:
            C[k] = B[j]
            # Count inversions: all remaining elements in A are greater than B[j]
            count += len(A) - i
            j += 1
        k += 1
    while i < len(A):
        C[k] = A[i]
        i += 1
        k += 1
    while j < len(B):
        C[k] = B[j]
        j += 1
        k += 1
    return C, count

def sort_count(A):
    '''
    Sort the array A and count the number of inversions using a modified merge sort.
    
    Arguments:
        A -- array of integers
    
    Returns:
        A_sorted -- sorted array of integers
        count -- number of inversions in the array
    '''
    #Case when A is empty or has one element
    #In this case, there are no inversions
    if len(A) <= 1:
        return A, 0
    #Recursively sort and count inversions in left and right halves
    mid = len(A) // 2
    A_left, left_count = sort_count(A[:mid])
    A_right, right_count = sort_count(A[mid:])
    #Merge sorted halves and count inversions across the two halves
    A_sorted, split_count = merge_count(A_left, A_right)
    return A_sorted, left_count + right_count + split_count
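
As a sanity check of the merge-and-count idea, the sketch below re-implements it on plain Python lists so that it runs on its own, then compares the result with a brute-force count on small random permutations. The names `merge_count_lists`, `sort_count_lists` and `brute_force` are illustrative assumptions, not functions from the notebook.

```python
import random
from itertools import combinations


def merge_count_lists(left, right):
    """Merge two sorted lists; count pairs (a, b), a in left, b in right, a > b."""
    merged, count = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            # left[i:] are all greater than right[j]
            count += len(left) - i
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count


def sort_count_lists(values):
    """Return (sorted copy of values, number of inversions in values)."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, left_count = sort_count_lists(values[:mid])
    right, right_count = sort_count_lists(values[mid:])
    merged, split_count = merge_count_lists(left, right)
    return merged, left_count + right_count + split_count


def brute_force(values):
    """O(n^2) reference: combinations() yields pairs in positional order."""
    return sum(1 for a, b in combinations(values, 2) if a > b)


print(merge_count_lists([1, 3, 5], [2, 4, 6]))  # ([1, 2, 3, 4, 5, 6], 3)

# Compare both methods on small random permutations (fixed seed)
rng = random.Random(0)
for _ in range(200):
    perm = rng.sample(range(1, 51), 50)
    result, count = sort_count_lists(perm)
    assert result == sorted(perm)
    assert count == brute_force(perm)
```

Agreement on small inputs does not prove correctness, but it catches most off-by-one mistakes in the counting step.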

In [None]:
#The input file must be in the same directory as this script
#The file must be named IntegerArray.txt
#The file must contain one integer per line
#The file is the one provided by the course and is not included in this repository
#The file is small enough to be read into memory

filename = 'IntegerArray.txt'
with open(filename, 'r') as file:
    A = np.array([int(line.strip()) for line in file])

print(f'Data: {A}')
print(f'Length: {len(A)}')

Data: [54044 14108 79294 ... 74018 71187 91901]
Length: 100000


In [3]:
#Apply merge sort and count inversions
sorted_A, count = sort_count(A)

#First check if array is sorted
is_sorted = np.all(np.diff(sorted_A) >= 0)
print("Is sorted:", is_sorted)

#If sorted, print the number of inversions
if is_sorted:
    print("Number of inversions:", count)
else:
    print("Array is not sorted correctly.")

Is sorted: True
Number of inversions: 2407905288
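
A quick plausibility check on the printed value, as a sketch: an array of $n$ distinct entries has at most $n(n-1)/2$ inversions (the fully reversed order), so the count must lie between 0 and that bound. The variable names below are illustrative.

```python
n = 100_000
count = 2_407_905_288  # value printed above

# Upper bound: every pair of positions is inverted
max_inversions = n * (n - 1) // 2
print(max_inversions)  # 4999950000
assert 0 <= count <= max_inversions

# The count does not fit in a signed 32-bit integer
print(count > 2**31 - 1)  # True
```

Because the count exceeds $2^{31} - 1$, accumulating it in a Python `int` (arbitrary precision), as `merge_count` does, avoids the overflow that a 32-bit counter would hit.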
