## Counting Inversions: Brute Force ($O(n^2)$) and Divide and Conquer ($O(n\log n)$)

An *inversion* in an array is a pair of elements that are "out of order": the earlier element is larger than the later element. Every such pair counts as one inversion, whether or not the two elements are adjacent.

For example, in this array
```
1 3 5 2 4 6
```
There are 3 inversions: (3,2), (5,2), and (5,4).
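
As a quick check of the example above, here is a minimal sketch that lists the inversion pairs rather than just counting them. The helper name `list_inversions` is made up for illustration and is not part of the assignment.

```python
from itertools import combinations


def list_inversions(values):
    """Return every (earlier, later) pair in which the earlier element is larger."""
    # combinations() yields pairs in their original left-to-right order,
    # so `a` always appears before `b` in the input.
    return [(a, b) for a, b in combinations(values, 2) if a > b]


example = [1, 3, 5, 2, 4, 6]
print(list_inversions(example))  # [(3, 2), (5, 2), (5, 4)]
```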


Your job will be to write two functions to count the number of inversions in an array of numbers.

### But Why Would We Want to Do This?
One of the primary uses of counting inversions is to provide a quantitative measure of the similarity between two lists.  The fewer the inversions, the more similar the lists are.  This is a key idea in **Collaborative Filtering**, a technique for generating recommendations.  If, for example, Netflix knew your 100 favorite movies or TV shows, it could find other people with very similar lists and make recommendations based on what they have watched but you haven't. Very effective and useful, don't you agree?
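
To make the similarity idea concrete, here is a small sketch that compares two rankings of the same items. It assumes both lists contain exactly the same distinct items. The function name `ranking_inversions` and the movie titles are invented for illustration. It rewrites the second list in terms of positions in the first, then counts inversions in the result.

```python
def ranking_inversions(reference, other):
    """Count the pairs of items that `other` orders differently from `reference`.

    Assumption: both lists hold the same distinct items.
    """
    # Position of each item in the reference ranking.
    position = {item: rank for rank, item in enumerate(reference)}
    # Express `other` as a list of reference positions.
    ranks = [position[item] for item in other]
    # Plain double loop; fine for short lists.
    return sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )


mine = ["Alien", "Brazil", "Clue", "Dune"]          # hypothetical favorites
similar = ["Alien", "Clue", "Brazil", "Dune"]       # one pair swapped
opposite = ["Dune", "Clue", "Brazil", "Alien"]      # fully reversed

print(ranking_inversions(mine, similar))   # 1
print(ranking_inversions(mine, opposite))  # 6
```

A count of 0 means the two rankings agree completely. For a list of $n$ items, the largest possible count is $n(n-1)/2$, which is what a fully reversed ranking produces.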



## Counting Inversions
## Part 1: Brute Force

First watch this video explaining the idea of Counting Inversions in an array:

[Video 1: Understanding Inversions (12 minutes)](https://www.youtube.com/watch?v=7_AJfusC6UQ&list=PLEGCF-WLh2RLHqXx6-GZr_w7LgqKDXxN_&index=13)


One method for finding the inversions is a double for loop through the array, comparing each number to all those that follow and counting each inversion as it arises.  Your first function will use this "Brute Force" approach.  In Part 2, you'll try another approach that does better than the $O(n^2)$ performance of this one.


In [0]:
l = ['age fx', 'age fx full burst', 'hg destiny', 'rg destiny', 'rg strike freedom', 'rg qan t', 'rg unicorn', 'hg 00r', 'rg 00r', '30mm', 'justice knight', 'goku', 'age fx magnum', 'aegis k']

In [2]:
def count_inversions_brute(A):
    '''
    Count and return the number of inversions in array A.
    inputs: A, an array (list) of numbers
    outputs: the number of inversions
    '''
    c = 0
    for i in range(len(A)):
        for k in range(i+1, len(A)):
            if A[i] > A[k]:
                c += 1
    return c

count_inversions_brute([1,3,5,2,4,6])

3

In [3]:
# Test your brute-force function on our original sample list:
nums = [1, 3, 5, 2, 4, 6]

# do you get the correct number of inversions? 3?
count_inversions_brute(nums)

3

In [4]:
# Test on a list of 100,000 numbers :|
# Is your computer annoyed with this problem?  Is it working really hard?
nums = []
with open('inversions.txt') as f:
    nums = [int(line.strip()) for line in f]
    
    
count_inversions_brute(nums)
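
To get a feel for why the 100,000-number test is slow, here is a small sketch that counts how many comparisons a double loop makes when each element is compared with every later element. The helper name `brute_force_comparisons` is made up for illustration.

```python
def brute_force_comparisons(n):
    """Number of index pairs (i, k) with i < k in a list of length n."""
    return n * (n - 1) // 2


for n in (6, 20, 100_000):
    print(n, brute_force_comparisons(n))
# 6 15
# 20 190
# 100000 4999950000
```

Roughly five billion comparisons in pure Python is why this cell takes so long to finish.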

## Part 2: Doing Better with Divide and Conquer

Now that we are familiar with the powerful divide and conquer approach from MergeSort, Binary Search, and Quicksort, we can use this skill to count inversions in $O(n\log n)$ time.
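
A rough way to see where this bound comes from is the same recurrence that describes MergeSort. It assumes the work done outside the two recursive calls (splitting the list and merging the halves) is linear in the input size:

$$
T(n) = 2\,T(n/2) + O(n) \quad\Longrightarrow\quad T(n) = O(n\log n)
$$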

To do this, the "divide" step will be exactly as we implemented in MergeSort, with one recursive call for the left half and one for the right.
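
The extra work beyond MergeSort happens while merging. Here is a minimal sketch of a merge step that also counts *split* inversions, meaning pairs with one element in each half. It assumes both halves are already sorted. The name `merge_and_count` is made up for illustration and is only one possible way to write this step.

```python
def merge_and_count(left, right):
    """Merge two sorted lists and return (merged_list, split_inversion_count)."""
    merged = []
    split = 0
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            # right[j] is smaller than every element still waiting in `left`,
            # so it forms one inversion with each of them.
            split += len(left) - i
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    # One side is exhausted; the leftovers add no new inversions.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, split


print(merge_and_count([1, 3, 5], [2, 4, 6]))  # ([1, 2, 3, 4, 5, 6], 3)
```

Because ties take the left element first, equal values are not counted as inversions.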

For implementation details, watch the following Tim Roughgarden video:

[Video 2: Counting Split Inversions using Merge Sort Approach (17 minutes)](https://www.youtube.com/watch?v=I6ygiW8xN7Y&list=PLEGCF-WLh2RLHqXx6-GZr_w7LgqKDXxN_&index=14)

Write your version of the `count_split_inversions` function below.

In [42]:
def count_split_inversions(Arr):
    '''
    sorts Arr with a merge sort and counts its inversions along the way
    inputs: the array/list
    outputs: a tuple of (sorted array/list, inversion count)
    '''
    len_ = len(Arr)
    
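    # base case: a list of 0 or 1 elements is already sorted and has no inversions
    # (using <= 1 also prevents infinite recursion on an empty list)
    if len_ <= 1: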
        return Arr, 0
    
    else:
        # the recursive calls come first, as in the earlier mergesort algorithm
        arr_l = count_split_inversions(Arr[:len_//2])
        arr_r = count_split_inversions(Arr[len_//2:])
        # each call returns a (sorted list, count) tuple, so unpack the lists and take their lengths
        arr_l_li, len_l, arr_r_li, len_r = arr_l[0], len(arr_l[0]), arr_r[0], len(arr_r[0])
        # start from the inversion counts found by the two recursive calls
        c = arr_l[1] + arr_r[1]
        
        Arr = []
        
        # because I need an inversion count, I use indexes rather than the pop/append method to identify inversions
        l_ind = 0
        r_ind = 0
        for ele in range(len_):
            # the first two checks prevent an index out of bounds: once one list is finished, append what remains of the other
            # neither case adds inversions; even if only the left list remains, its inversions
            # were already counted when the smaller right-hand items were appended
            if l_ind == len_l:
                Arr.append(arr_r_li[r_ind])
                r_ind += 1
                continue
            elif r_ind == len_r:
                Arr.append(arr_l_li[l_ind])
                l_ind += 1
                continue
                
            # if the current right item is smaller than the current left item, the inversion counter goes up by the number
            # of items remaining in the left list, since every one of them is larger than the right item
            if arr_r_li[r_ind] < arr_l_li[l_ind]:
                c += len(arr_l_li) - l_ind
                Arr.append(arr_r_li[r_ind])
                r_ind += 1
            # otherwise just append the left item and advance the left index
            else:
                Arr.append(arr_l_li[l_ind])
                l_ind += 1
            
        return Arr, c

In [43]:
# Test your divide-and-conquer function on our original sample list:
nums = [1, 3, 5, 2, 4, 6]

# Note, since this function does double duty and
# returns a sorted list AND the number of inversions, 
# I'm printing just the inversions

# print(count_split_inversions_(nums)[1])
z = count_split_inversions(nums)
print(z[1])
print(z[0] == sorted(nums), z)

3
True ([1, 2, 3, 4, 5, 6], 3)


In [44]:
# Test your divide-and-conquer function on a longer list
# should find 86 inversions

nums = [54044, 14108, 79294, 29649, 25260, 60660, 2995, 53777, 49689, 9083, 16122, 90436, 4615, 40660, 25675, 58943, 92904, 9900, 95588, 46120]

z = count_split_inversions(nums)
print(z[1])
print(z[0] == sorted(nums))

86
True


In [45]:
# Test on a list of 100,000 numbers

nums = []
with open('inversions.txt') as f:
    nums = [int(line.strip()) for line in f]
    
z = count_split_inversions(nums)
print(z[1])
print(z[0] == sorted(nums))

2407905288
True
