# 1. Intro: Merge-sort algorithm

1. recursively sort the 1st half of the array
2. recursively sort the 2nd half of the array
3. merge the two sorted sublists into one  
& ignore base cases: if you're handed an array that has either zero or one elements, it's already sorted

The hard part is how do you implement the **merge step**? The recursive calls have done their work, so we have two sorted halves of the numbers: the left half and the right half. How do we combine them into one? 
* The idea is to populate the output array in sorted order by traversing the two sorted sub-arrays in parallel, with one pointer each.
    * arr = the output array
    * L and R = the results of the 2 recursive calls
    * L = the result of the first recursive call, which contains the left half of the input array in sorted order
    * R = the result of the second recursive call, which contains the right half of the input array in sorted order
    * counter i to traverse through L 
    * counter j to traverse through R
    * We do a single pass over the output array, filling it in increasing order, always taking the smallest remaining element from the union of the two sorted sub-arrays
    * The main idea: the minimum element that you haven't yet looked at in L and R has to be at the front of one of the two lists
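
The merge step described above can be sketched as a standalone function. This is a minimal illustration, not the notebook's implementation: the names `merge`, `left`, `right` and `out` are assumptions, and unlike the in-place version further down it builds a new output list.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    out = []
    i = j = 0  # i walks through left, j walks through right
    while i < len(left) and j < len(right):
        # the smallest unseen element is at the front of one of the two lists
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # one of the lists is exhausted; copy whatever is left of the other
    out.extend(left[i:])
    out.extend(right[j:])
    return out

print(merge([1, 4, 13], [5, 8, 9, 12]))  # [1, 4, 5, 8, 9, 12, 13]
```

Using `<=` rather than `<` keeps equal elements in their original order, which makes the sort stable.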

**Running time** 
* you should imagine that you're just running the algorithm in a debugger. 
* every time you press enter, you advance with one line of the program through the debugger. 
* the running time is just the number of operations executed ~ the number of lines of code executed

a. Let's think about how many operations get executed when we do a *single merge of two sorted sub-arrays*
* initialization of i and j = 2 operations
* within the "for" loop
    * comparison = 1 operation
    * 1 assignment = L(i) or R(j) = 1 operation 
    * 1 increment = i or j = 1 operation
    * 1 increment of the "for" loop's k = 1 operation
    * => 4 operations per iteration
* running time of merge on an array of m numbers is =< 4m + 2
* You might argue that
    * each loop iteration should count as two operations for k, not just one,
    * because you don't just increment k, you also compare it to its upper bound. 
    * That would give 5m + 2 instead of 4m + 2; the exact constant depends on what you count as one operation
    * either way we can use a simpler, sloppier bound: 4m + 2 =< 6m (since m >= 1)   

b. Analyzing Merge Sort seems a lot more intimidating, because it keeps spawning off recursive versions of itself. 
* So the number of recursive calls, the number of things we have to analyze, is blowing up exponentially as we go down the levels of the recursion
* every time we make a recursive call it's on quite a bit smaller input than what we started with: an array only half the size of the input array
* So there's a tension between 
    * on the one hand the explosion (proliferation) of subproblems 
    * and the fact that successive recursive calls have to solve smaller and smaller subproblems.
* Claim: Merge Sort never needs more than 6*N*log_2(N) + 6*N operations to correctly sort an input array of N numbers
    * by comparison, the running time of insertion sort, selection sort, and bubble sort is governed by a quadratic function of the input size
    * Proof by *recursion tree method*
        * The idea is to write out all of the work done by the recursive merge sort algorithm in a tree structure, with 
        * the children of a given node corresponding to the recursive calls made by that node. 
        * The point of this tree structure is that it gives an interesting way to count up the overall work done by the algorithm and greatly facilitates the analysis
        * So at level zero we have a root. And this corresponds to the outer call of merge sort 
        * This tree is binary because each invocation of MergeSort makes two recursive calls. 
        * So the two children correspond to the two recursive calls of MergeSort
        * the number of levels below the root is exactly the number of times you need to divide n by 2 until you get down to a number that's at most one, that is log_2(n). The leaves at level log_2(n) are single-element arrays = base cases
        * total number of levels is log_2(n) + 1 (levels 0 through log_2(n))
        * the *first question* is: at a given level j of this recursion, exactly how many distinct subproblems are there, as a function of the level j? Answer: 2^j
        * the *second question* is: for each of those distinct subproblems at level j, what is the input size? Answer: n/2^j
        * number of operations on level j =< 2^j * 6 * n/2^j = 6*n (independent of j = perfect equilibrium between two competing forces: the number of subproblems doubles with each level but the amount of work that we do per subproblem halves with each level)
        * total = 6*n*(log_2(n) + 1)
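
The level-by-level count can be checked numerically. The sketch below assumes n is a power of 2 and charges the 6m upper bound to every subproblem of size m, including the single-element leaves; `work_per_level` is a hypothetical helper name, not part of the notebook.

```python
def work_per_level(n):
    """Return the operation bound charged to each level j = 0..log_2(n).

    Assumes n is a power of 2. Level j has 2**j subproblems of size n / 2**j,
    each charged at most 6 * (n / 2**j) operations.
    """
    assert n >= 1 and (n & (n - 1)) == 0, "n is assumed to be a power of 2"
    depth = n.bit_length() - 1  # equals log_2(n) for a power of 2
    return [(2 ** j) * 6 * (n // 2 ** j) for j in range(depth + 1)]

levels = work_per_level(8)
print(levels)       # [48, 48, 48, 48]
print(sum(levels))  # 192, which equals 6 * 8 * (log_2(8) + 1)
```

Every level contributes the same 6n, so the total is 6n times the number of levels.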
    
**Three biases**  
that we made when we did this analysis of Merge Sort 
1. we used what's often called worst-case analysis. 
     * By worst-case analysis, I simply mean that our upper bound of 6N*log_2 N + 6N applies to the number of lines executed for every single input array of length N. 
     * We made absolutely no assumptions about the input, where it comes from, what it looks like beyond what the input length N was.
     * This is as opposed to 2 other types of analysis
         * average case analysis = the average running time of an algorithm under some assumption about the relative frequencies of different inputs.
         * the use of a set of prespecified benchmarks = one agrees up front about some set, say ten or twenty, benchmark inputs, which are thought to represent practical or typical inputs for the algorithm.
     * both average-case analysis and benchmarks are useful in certain settings, 
         * but for them to make sense, you really have to have domain knowledge about your problem. 
         * You need to have some understanding of what inputs are more common than others, what inputs better represent typical inputs than others
     
2. in this course, when we analyze algorithms, we won't worry unduly about small constant factors or lower order terms
    * why do we do this, and can we really get away with it? 
    * it's simply way easier mathematically if we don't have to precisely pin down what the constant factors and lower-order terms are
    * constants depend on the exact processor, the compiler, the compiler optimizations, the programming implementation
    * we're just going to be able to get away with it: we'll get extremely accurate predictive power even though we won't be keeping track of lower terms and constant factors
    
3. we're going to use what's called asymptotic analysis
    * we will focus on the case of large input sizes. 
    * The performance of an algorithm as the size N of the input grows large, that is, tends to infinity
    * 6*n*(log_2(n) + 1) is better than n^2/2: this is a mathematical statement that is true if and only if N is sufficiently large 
    
**fast algorithm** = whose worst case running time grows slowly as a function of the input size
* Mathematical definition = mathematical tractability + predictive power 
* the holy grail will be to have what's called a linear time algorithm, an algorithm whose number of instructions grows proportional to the input size

In [115]:
a = [13, 4, 1, 8, 12, 9, 5]

In [4]:
def mergesortarray(arr):
    print("array is: ", arr)
    if len(arr) > 1:
        # midpoint
        mid = len(arr)//2
        # subarrays
        L = arr[:mid]
        R = arr[mid:]
        # recursion on each of subarrays
        print("L is: ", L, ", R is: ", R)
        mergesortarray(L)
        mergesortarray(R)  
        
        i = j = 0 # two iterators for traversing the two halves
        k = 0 # iterator for the main list
        
        # merge: repeatedly copy the smaller front element of L or R into arr
        while i < len(L) and j < len(R):
            print("L_temp is: ", L, ", R_temp is: ", R)
            if L[i] < R[j]:
                arr[k] = L[i]
                i += 1
            else:
                arr[k] = R[j]
                j += 1
            k += 1
            print("array_temp_1 is: ", arr, "i is:", i, ", j is:", j, ", k is:", k)
        
        # checking if any element was left
        while i < len(L):     
            arr[k] = L[i]
            i += 1
            k += 1
            print("array_temp_2 is: ", arr, "i is:", i, ", j is:", j, ", k is:", k)
            
        while j < len(R):
            arr[k] = R[j]
            j += 1
            k += 1
            print("array_temp_3 is: ", arr, "i is:", i, ", j is:", j, ", k is:", k)
        
        print("array_temp_final is: ", arr)
        print("\n")
        
    # return at function level so the base case (0 or 1 elements) also returns the list
    return arr

In [84]:
mergesortarray(a)

array is:  [13, 4, 1, 8, 12, 9, 5]
L is:  [13, 4, 1] , R is:  [8, 12, 9, 5]
array is:  [13, 4, 1]
L is:  [13] , R is:  [4, 1]
array is:  [13]
array is:  [4, 1]
L is:  [4] , R is:  [1]
array is:  [4]
array is:  [1]
L_temp is:  [4] , R_temp is:  [1]
array_temp_1 is:  [1, 1] i is: 0 , j is: 1 , k is: 1
array_temp_2 is:  [1, 4] i is: 1 , j is: 1 , k is: 2
array_temp_final is:  [1, 4]
L_temp is:  [13] , R_temp is:  [1, 4]
array_temp_1 is:  [1, 4, 1] i is: 0 , j is: 1 , k is: 1
L_temp is:  [13] , R_temp is:  [1, 4]
array_temp_1 is:  [1, 4, 1] i is: 0 , j is: 2 , k is: 2
array_temp_2 is:  [1, 4, 13] i is: 1 , j is: 2 , k is: 3
array_temp_final is:  [1, 4, 13]
array is:  [8, 12, 9, 5]
L is:  [8, 12] , R is:  [9, 5]
array is:  [8, 12]
L is:  [8] , R is:  [12]
array is:  [8]
array is:  [12]
L_temp is:  [8] , R_temp is:  [12]
array_temp_1 is:  [8, 12] i is: 1 , j is: 0 , k is: 1
array_temp_3 is:  [8, 12] i is: 1 , j is: 1 , k is: 2
array_temp_final is:  [8, 12]
array is:  [9, 5]
L is:  [9] , R is

[1, 4, 5, 8, 9, 12, 13]

In [5]:
b = [5, 3, 8, 9, 1, 7, 0, 2, 6, 4]
mergesortarray(b)

array is:  [5, 3, 8, 9, 1, 7, 0, 2, 6, 4]
L is:  [5, 3, 8, 9, 1] , R is:  [7, 0, 2, 6, 4]
array is:  [5, 3, 8, 9, 1]
L is:  [5, 3] , R is:  [8, 9, 1]
array is:  [5, 3]
L is:  [5] , R is:  [3]
array is:  [5]
array is:  [3]
L_temp is:  [5] , R_temp is:  [3]
array_temp_1 is:  [3, 3] i is: 0 , j is: 1 , k is: 1
array_temp_2 is:  [3, 5] i is: 1 , j is: 1 , k is: 2
array_temp_final is:  [3, 5]


array is:  [8, 9, 1]
L is:  [8] , R is:  [9, 1]
array is:  [8]
array is:  [9, 1]
L is:  [9] , R is:  [1]
array is:  [9]
array is:  [1]
L_temp is:  [9] , R_temp is:  [1]
array_temp_1 is:  [1, 1] i is: 0 , j is: 1 , k is: 1
array_temp_2 is:  [1, 9] i is: 1 , j is: 1 , k is: 2
array_temp_final is:  [1, 9]


L_temp is:  [8] , R_temp is:  [1, 9]
array_temp_1 is:  [1, 9, 1] i is: 0 , j is: 1 , k is: 1
L_temp is:  [8] , R_temp is:  [1, 9]
array_temp_1 is:  [1, 8, 1] i is: 1 , j is: 1 , k is: 2
array_temp_3 is:  [1, 8, 9] i is: 1 , j is: 2 , k is: 3
array_temp_final is:  [1, 8, 9]


L_temp is:  [3, 5] , R_te

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [116]:
# another method with for loops
def mergesortarraynew(arr):
    if len(arr) > 1:
        # midpoint
        mid = len(arr)//2
        # subarrays
        L = arr[:mid]
        R = arr[mid:]
        # recursion on each of subarrays
        mergesortarraynew(L)
        mergesortarraynew(R)  
        
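        i = j = 0 # two iterators for traversing the two halves

        # k is advanced by the for loop itself, so no manual k += 1 is needed
        for k in range(len(arr)):
            if i < len(L) and j < len(R):
                if L[i] < R[j]:
                    arr[k] = L[i]
                    i += 1
                else:
                    arr[k] = R[j]
                    j += 1
            elif j < len(R):
                # L is exhausted: copy the next element of R
                arr[k] = R[j]
                j += 1
            else:
                # R is exhausted: copy the next element of L
                arr[k] = L[i]
                i += 1

    # return at function level so the base case also returns the list
    return arr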

In [117]:
mergesortarraynew(a)

[1, 4, 5, 8, 9, 12, 13]

# 2. Asymptotic analysis

* provides basic vocabulary for discussing the design and analysis of algorithms.
    * The vocabulary is coarse enough to suppress all of the details that you want to ignore: details that depend on the choice of architecture, the choice of programming language, the choice of compiler
    * yet it's sharp enough to be useful, in particular to make predictive comparisons between different high-level algorithmic approaches to solving a common problem

* the main point is to suppress both leading constant factors and lower order terms
    * lower order terms basically, by definition, become increasingly irrelevant as you focus on large inputs
    
* 6Nlog_2(N) + 6*N
    * 6*N grows more slowly than N log N, so we just drop it
    * the leading constant factor is the 6, so we suppress that as well.
    * => Nlog_2(N)

* Terminology: the **running time** of merge sort is big O of n log n: O(Nlog_2(N))
    * Big O notation = after you've dropped the lower order terms and suppressed the leading constant factor, you're left with a function f(n), and the running time is O(f(n))
    
**Examples**

1. linear scan through the array  
Problem: searching an array A of length n for a given integer t   
for i in range(n):   
    if A[i] == t: return True   
return False  
Running time: at most 2*n operations = O(n)  

2. two loops in sequence  
Problem: given 2 arrays A, B of the same length n, is t in either of them?  
for i in range(n):   
    if A[i] == t: return True   
for i in range(n):   
    if B[i] == t: return True   
return False  
Running time: 2*n + 2*n = 4*n = O(n)  

3. two nested loops I  
Problem: do two given input arrays A and B, each of length n, contain a common number?  
for i in range(n):  
    for j in range(n):  
        if A[i] == B[j]: return True  
return False  
Running time: 2*n*n = 2*n^2 = O(n^2)  

4. two nested loops II  
Problem: looking for duplicates in a single array A of length n  
for i in range(n-1):  
    for j in range(i+1, n):  
        if A[i] == A[j]: return True  
return False  
Running time: the inner test runs (n-1) + (n-2) + ... + 1 = n*(n-1)/2 times = O(n^2)  
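
To see the quadratic count of example 4 concretely, here is a small runnable sketch that also counts how many comparisons were made. The function name `has_duplicate` and the returned counter are illustrative assumptions.

```python
def has_duplicate(A):
    """Return (found, comparisons) for the nested-loop duplicate search."""
    n = len(A)
    comparisons = 0
    for i in range(n - 1):
        for j in range(i + 1, n):  # only pairs with i < j
            comparisons += 1
            if A[i] == A[j]:
                return True, comparisons
    return False, comparisons

print(has_duplicate([3, 1, 4, 1, 5]))  # (True, 6)
print(has_duplicate([3, 1, 4, 2, 5]))  # (False, 10): 5 * 4 / 2 pairs checked
```

When there is no duplicate, every pair with i < j is checked exactly once, which is the worst case.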

**Big Oh notation**
* Concerns functions defined on the positive integers, we'll call it T(n)
* We're gonna be concerned about the worst-case running time of an algorithm, as a function of the input size, n
* what does it mean when we say a function, T(n) = O(f(n))?
    * for all sufficiently large values of n, it's bounded above by a constant multiple of f(n)
    * T(n) = O(f(n)) <=> there exist 2 constants c, n_0 > 0, such that T(n) =< c*f(n) for all n >= n_0
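
As an illustration of the definition, the sketch below spot-checks a candidate pair of constants (c, n_0) for the merge sort bound 6*n*log_2(n) + 6*n against f(n) = n*log_2(n). The helper `bounded_by` and the cutoff `n_max` are assumptions made for the example; a finite check like this is only a sanity test, not a proof.

```python
import math

def bounded_by(T, f, c, n0, n_max=10_000):
    """Check T(n) <= c * f(n) for every integer n in [n0, n_max]."""
    return all(T(n) <= c * f(n) for n in range(n0, n_max + 1))

T = lambda n: 6 * n * math.log2(n) + 6 * n  # merge sort bound from section 1
f = lambda n: n * math.log2(n)

print(bounded_by(T, f, c=12, n0=2))  # True: 6n <= 6n*log_2(n) once n >= 2
print(bounded_by(T, f, c=6, n0=2))   # False: c = 6 leaves no room for the 6n term
```

The constants c = 12 and n_0 = 2 work because 6*n =< 6*n*log_2(n) whenever log_2(n) >= 1.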

**Examples**
1. if T(n) = a_k*n^k + ... + a_1*n + a_0 for any integer k >= 0 and any coefficients a_i (positive or negative), then T(n) = O(n^k)  
Proof: 
    * choose n_0 = 1 and c = |a_k| + ... + |a_0|  
    * we need to show that for all n >= 1, T(n) =< c*n^k
        * we have for every n >= 1
        * T(n) =< |a_k|*n^k + ... + |a_1|*n + |a_0| =< |a_k|*n^k + ... + |a_1|*n^k + |a_0|*n^k = c*n^k
    * If you want to prove yourself that something is big O of something else, what you usually do is reverse engineer constants that work.

2. for every k >= 1, n^k is not O(n^(k-1))  
Proof: by contradiction    
    * Suppose n^k = O(n^(k-1))  
    * Then, by definition, there are c, n_0 > 0 
    * such that n^k =< c*n^(k-1) for all n >= n_0
    * dividing both sides by n^(k-1) gives n =< c for all n >= n_0
    * but that inequality fails for any n > max(c, n_0), e.g. n = max(c, n_0) + 1: contradiction
   
**Big Omega and Theta**
*  If big O is analogous to =< 
    * omega is analogous to >=
    * theta is analogous to =
* T(n) = Omega(f(n)) <=> there exist 2 constants c, n_0 > 0, such that T(n) >= c*f(n) for all n >= n_0
* T(n) = Theta(f(n)) 
    * <=> T(n) = O(f(n)) and T(n) = Omega(f(n))
    * T(n) is sandwiched between two different constant multiples of f(n)
    * <=> there exist 3 constants c_1, c_2, n_0 > 0, such that c_1*f(n) =< T(n) =< c_2*f(n) for all n >= n_0
* **(!)** Algorithm designers are often sloppy and use O notation where theta notation would be more accurate. That's a common convention 
    * Example: we have a subroutine which does a linear scan through an array of length N. It looks at each entry in the array and does a constant amount of work with each entry (the merge subroutine is more or less an example of a subroutine of that type). Its running time is Theta(N), but it is usually just stated as O(N)

**Little O notation**
* one function is growing strictly less quickly than another
* T(n) = o(f(n)) <=> for all constants c > 0, exists a constant n_0 such that T(n) =< c*f(n) for all n >= n_0 
* We have to prove that, for every single constant c, no matter how small, there exists some large enough n_0 beyond which T(n) is bounded above by c*f(n).

**Additional examples**
1. Claim: 2^(n+10) = O(2^n)  
Proof: 
    * need to pick c, n_0 such that: 2^(n+10) =< c*2^n for any n >= n_0
    * 2^n * 2^10 =< c*2^n 
    * 2^10 =< c
    * c = 1024, n_0 = 1 

2. Non-example: 2^(10*n) is not O(2^n)  
Proof by contradiction:
    * If it were, there would be c, n_0 > 0 such that 2^(10*n) =< c*2^n for all n >= n_0
    * dividing by 2^n: 2^(9*n) =< c for all n >= n_0, which is false because 2^(9*n) grows without bound

3. Claim: for every pair of functions f and g defined on the positive integers with nonnegative values, it doesn't matter, up to a constant factor, whether 
    * we take the pointwise maximum of the two functions 
    * or whether we take the pointwise sum of the two functions
    * <=> max(f(n), g(n)) = Theta(f(n)+g(n))  
Proof:   
    * need to exhibit c_1, c_2, n_0 such that c_1*(f(n)+g(n)) =< max(f(n),g(n)) =< c_2*(f(n)+g(n)) for all n >= n_0
    * for any n >= 1: 
        * max(f(n),g(n)) =< f(n) + g(n), because the max is one of the two numbers and the other one is nonnegative
        * 2*max(f(n),g(n)) >= f(n) + g(n), because each of f(n), g(n) is at most the max
    * c_1 = 1/2, c_2 = 1, n_0 = 1 QED


### Programming assignment 1 [Karatsuba multiplication]

In [208]:
x = 3141592653589793238462643383279502884197169399375105820974944592
y = 2718281828459045235360287471352662497757247093699959574966967627

In [209]:
def multipl(x, y):
    if (x < 10) or (y < 10):
        return x*y
    else:
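        m = max(len(str(x)), len(str(y))) // 2
        
        # split x = a*10**m + b and y = c*10**m + d; divmod also handles
        # operands with m digits or fewer (then a or c is simply 0),
        # where slicing the decimal string would produce int('') and fail
        a, b = divmod(x, 10**m)
        c, d = divmod(y, 10**m)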
    
        # recursive calls
        z0 = multipl(a, c)
        z1 = multipl((a+b), (c+d))
        z2 = multipl(b, d)
    
        return z0*10**(2*m) + (z1-z0-z2)*10**m + z2

In [211]:
x*y

8539734222673567065463550869546574495034888535765114961879601127067743044893204848617875072216249073013374895871952806582723184

In [210]:
multipl(x, y)

8539734222673567065463550869546574495034888535765114961879601127067743044893204848617875072216249073013374895871952806582723184

# 3. Divide and Conquer algorithms examples  
  
Steps of paradigm:
1. Divide the problem into smaller sub-problems
2. Conquer the sub-problems just using recursion
3. Combine the solutions to the subproblems into a solution for the original problem

### 3.1 O(nlogn) algorithm for counting inversions 

**Def:** 
* Given as input an array A with a length N. 
* The array A contains distinct numbers; to keep it simple we assume that it contains the numbers 1 through n in some order.
* The goal is to compute the number of inversions of this array 
* Inversion = a pair (i, j) of array indices with i<j and A[i]>A[j]
    * if the array contains numbers in sorted order (1 up to N) => the number of inversions is zero  

**Example:**  
* (1, 3, 5, 2, 4, 6)
* How many inversions?
* (5, 2) (3, 2) (5, 4)  

**Motivation:** 
* a numerical similarity measure that quantifies how close two different ranked lists are to each other. 
* The more inversions this array has, the more different the two lists are from each other
* collaborative filtering
* Array in reverse order (n, ..., 1): maximum number of inversions = C(n,2) = (n-1)*n/2 

**Brute force algorithm: O(n^2)**
* double nested for loop:
* one which goes through i, one which goes through j bigger than i, 
* we check each pair (i, j) individually with i less than j 
* whether the entries A[i] and A[j] are inverted 
* and if they are then we add one to our running count
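
A direct transcription of the brute-force idea, as a minimal sketch. The function name `count_inversions_brute_force` is an assumption, and indices are 0-based here.

```python
def count_inversions_brute_force(A):
    """Count pairs (i, j) with i < j and A[i] > A[j] by checking every pair."""
    n = len(A)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):  # j runs over indices bigger than i
            if A[i] > A[j]:
                count += 1
    return count

print(count_inversions_brute_force([1, 3, 5, 2, 4, 6]))  # 3
```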

**D&C algorithm:**
* we recurse separately on the left and the right halves of the array
* To understand how much progress we can make purely using recursion, let's classify the inversions of an array into one of three types. Call an inversion (i, j) with i<j
    * Left inversion = if i,j =< n/2
    * Right inversion = if i,j > n/2
    * Split inversion = if i =< n/2 < j
* Count inversions of types 1 and 2 recursively (2 recursive calls)
* For the third type we need a separate subroutine (within step 3 of the D&C paradigm: combine)
  
**High level algorithm:**
* Count(array A, length n)
    * if n = 1: return 0
    * else:
        * x = count(1st half of A, n/2)
        * y = count(2nd half of A, n/2)
        * z = countSplitInv(A, n) # currently unimplemented
    * return x +  y + z
* Goal: 
    * We'd like to do only linear work in counting up the number of split inversions: O(n) time
    * Then Count(array A, length n) will run in O(nlog(n))
    * It's a rather ambitious goal, because the number of split inversions can be quadratic in n: e.g. n^2/4 for an array of this kind: (n/2+1, ..., n, 1, ..., n/2).  
    
**Split inversions count algorithm:**
* We're going to ask our recursive calls to not only count inversions in the array that they're passed, but also along the way to sort the array. 
* We know sorting is fast. Merge sort will do it in O(nlog(n)) time
* the merge subroutine almost seems designed to count the number of split inversions: it will naturally uncover all of the split inversions
* We update the previous algorithm:
* SortAndCount(array A, length n)
    * if n = 1: return (A, 0)
    * else:
        * (B, x) = SortAndCount(1st half of A, n/2) # B - sorted version of the 1st half, x - inversion count
        * (C, y) = SortAndCount(2nd half of A, n/2) # C - sorted version of the 2nd half, y - inversion count
        * (D, z) = MergeAndCountSplitInv(B, C, n) # currently unimplemented; takes the sorted subarrays B and C; D - sorted version of original array A, z - split inversion count
    * return (D, x +  y + z)
* Goal: 
    * We'd like to do only linear work in MergeAndCountSplitInv: O(n) time 
    * Then SortAndCount(array A, length n) will run in O(nlog(n))  

**Why does Merge Sort uncover the number of inversions?**
* D = output[len = n]
* B = 1st sorted array [len = n/2]
* C = 2nd sorted array [len = n/2]
* i = 1
* j = 1
* for k = 1 to n
    * if B(i) < C(j)
        * D(k) = B(i)
        * i++
    * else [B(i) > C(j)]
        * D(k) = C(j)
        * j++
    * end
* Let's think of the following case:
    * input array A with no split inversions. 
        * So every inversion in this input array A is going to be 
        * either a left inversion, so both indices are at most n/2, 
        * or a right inversion, so both indices are strictly greater than n/2. 
    * Given such an array A, 
        * once you're at the merge step, 
        * what do the sorted subarrays B and C look like for an input array that has no split inversions?
    * Every element in the 1st half is less than every element of the second half
    * In the merge algorithm that means that nothing is copied from C until B is exhausted
    * => Copying elements over from the second sub-array C has something to do with the number of split inversions in the original array
        * Claim: the split inversions involving an element y of C are exactly the elements left in B (not yet copied) at the moment y is copied to D
        * Proof: 
        * let x be an element of the first array B
        * (1) if x is copied to output D before y, then x < y => no inversion
        * (2) if y is copied to output D before x, then x > y => x, y are a (split) inversion
        * putting these two together, the elements x of the array B that form split inversions with y are precisely those that get copied to the output array after y
        * those are exactly the elements remaining in B when y gets copied over.
        * QED

**MergeAndCountSplitInv(A, n)**
* while merging the two sorted subarrays, keep a running total of the number of split inversions
* when an element of the 2nd subarray is copied to the output array, increase the split inversion counter by the number of untouched elements remaining in the 1st subarray
* Running time of this subroutine: the only additional work is incrementing the counter, a constant amount per loop iteration, which is also O(n) => O(n) for the merge + O(n) for the counting = O(n) total 
    * (!) Be careful: if you added O(n) to itself n times, it would not be O(n), but if you add O(n) to itself a constant number of times, it is still O(n).
* So the SortAndCount algorithm's running time is O(nlog(n))
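
Putting SortAndCount and MergeAndCountSplitInv together gives something like the following sketch. The snake_case function names are assumptions, and the subroutine takes the two sorted halves B and C directly.

```python
def merge_and_count_split_inv(B, C):
    """Merge sorted lists B and C; return (merged list, number of split inversions)."""
    D = []
    i = j = 0
    split = 0
    while i < len(B) and j < len(C):
        if B[i] <= C[j]:
            D.append(B[i])
            i += 1
        else:
            D.append(C[j])
            j += 1
            # every element still untouched in B is larger than the copied one
            split += len(B) - i
    D.extend(B[i:])
    D.extend(C[j:])
    return D, split

def sort_and_count(A):
    """Return (sorted copy of A, number of inversions in A)."""
    if len(A) <= 1:
        return list(A), 0
    mid = len(A) // 2
    B, x = sort_and_count(A[:mid])   # left inversions
    C, y = sort_and_count(A[mid:])   # right inversions
    D, z = merge_and_count_split_inv(B, C)  # split inversions
    return D, x + y + z

print(sort_and_count([1, 3, 5, 2, 4, 6]))  # ([1, 2, 3, 4, 5, 6], 3)
```

On the example array the three inversions are all split inversions: copying 2 finds two untouched elements in B (3 and 5), and copying 4 finds one (5).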

### 3.2. Strassen's Subcubic Matrix Multiplication Algorithm

Cool algorithm for two reasons. 
1. completely non-trivial, very clever, not at all clear how Strassen ever came up with it.
2. it's for such a fundamental problem. From the time computers were invented up until today, a lot of their cycles have been spent multiplying matrices. 


* 3 matrices nxn: x, y, z = x*y
* z(ij) = (i-th row of x).(j-th column of y) = sum_{k=1..n}(x(ik)*y(kj))
* input size n^2, output size n^2 => the best we could hope for is an O(n^2) algorithm
* The straightforward algorithm = 3 nested for loops = Theta(n^3) running time

D&C
* identify smaller matrices:
    * break X into 4 blocks A, B, C, D, each n/2 x n/2
    * break Y into 4 blocks E, F, G, H, each n/2 x n/2
    * the blocks behave just like atomic elements: the multiplication rules are the same

**Recursive algorithm #1**
X*Y =   
[AE + BG][AF + BH]  
[CE + DG][CF + DH]  
* Step 1: recursively compute the 8 distinct products 
* Step 2: do the additions (O(n^2) time)
* Fact: run time is O(n^3), same as for the straightforward algorithm

**Question**
* Can we do something clever, to reduce the number of recursive calls, from 8 down to something lower?
* that is where Strassen's Algorithm comes in

**Strassen's Algorithm**
* 2 steps
    * Step 1: recursively compute products of smaller n/2 by n/2 matrices, but only seven of them (much more cleverly chosen than in the first recursive algorithm)
    * Step 2: produce each of the four blocks of X*Y with suitable additions and subtractions of these seven products. These are much less straightforward than in the first recursive algorithm
* The additions and subtractions involved are a little more numerous than in the naive recursive algorithm, 
    * but that only changes the work in that part of the algorithm by a constant factor.
    * So we'll still spend only Theta(n^2) work adding and subtracting things, 
    * and we get a huge win in decreasing the number of recursive calls from 8 to 7.    
* Fact: it does beat cubic time

7 products:
* X = [A B / C D]
* Y = [E F / G H]
    * P1 = A(F-H)
    * P2 = (A+B)H
    * P3 = (C+D)E
    * P4 = D(G-E)
    * P5 = (A+D)(E+H)
    * P6 = (B-D)(G+H)
    * P7 = (A-C)(E+F)
* From just these seven products we can recover all four blocks of X*Y using only addition and subtraction.
    * block1 (top left) = P5+P4-P2+P6 = [AE + BG]
    * block2 (top right) = P1+P2 = [AF + BH]  
    * block3 (bottom left) = P3+P4 = [CE + DG]
    * block4 (bottom right) = P1+P5-P3-P7 = [CF + DH]
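
The seven products can be checked with a short pure-Python sketch. It assumes square matrices stored as lists of lists whose size n is a power of 2; the helper names `add`, `sub`, `split_blocks`, `strassen` and `naive` are illustrative, not from the notes.

```python
def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def split_blocks(M):
    """Split M into its four n/2 x n/2 blocks (top-left, top-right, bottom-left, bottom-right)."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def strassen(X, Y):
    """Multiply two n x n matrices, n assumed to be a power of 2."""
    if len(X) == 1:
        return [[X[0][0] * Y[0][0]]]
    A, B, C, D = split_blocks(X)
    E, F, G, H = split_blocks(Y)
    P1 = strassen(A, sub(F, H))
    P2 = strassen(add(A, B), H)
    P3 = strassen(add(C, D), E)
    P4 = strassen(D, sub(G, E))
    P5 = strassen(add(A, D), add(E, H))
    P6 = strassen(sub(B, D), add(G, H))
    P7 = strassen(sub(A, C), add(E, F))
    top_left = add(sub(add(P5, P4), P2), P6)      # AE + BG
    top_right = add(P1, P2)                       # AF + BH
    bottom_left = add(P3, P4)                     # CE + DG
    bottom_right = sub(sub(add(P1, P5), P3), P7)  # CF + DH
    top = [l + r for l, r in zip(top_left, top_right)]
    bottom = [l + r for l, r in zip(bottom_left, bottom_right)]
    return top + bottom

def naive(X, Y):
    """Straightforward cubic-time product, used only as a reference."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

A practical implementation would switch to the straightforward algorithm below some block size; recursing all the way down to 1x1 is done here only to keep the sketch short.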

### 3.3 O(nlog(n)) Algorithm for closest Pair

* you're given n points in the plane and you want to figure out which pair of points are closest to each other
* computational geometry

**Conditions:**
* Input: a set P = {p1, ..., pn} of n points in the plane (R^2)
* Notation: d(pi, pj) = Euclidean distance 
    * if pi = (xi, yi), pj = (xj, yj)  
    * d(pi, pj) = sqrt((xi-xj)^2 + (yi-yj)^2)  
* Output: a pair p*, q* of distinct points in P that minimizes d(p, q)
* assume all points have distinct x coordinates, and also distinct y coordinates
* Brute force (2 nested for loops): Theta(N^2) time
* Sorting might help here to reduce time

**1D case**
* sort the points according to their only coordinate = O(nlog(n))
* scan through the sorted points, which takes linear time: O(n)
* for each consecutive pair, we compute their distance 
* and we remember the smallest of those distances and return that pair 
* (!) even on the line there is a quadratic number of different pairs, so brute-force search is still a quadratic-time algorithm even in the 1D case
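
The 1D procedure fits in a few lines. This is a minimal sketch in which the function name `closest_pair_1d` and the `(distance, pair)` return shape are assumptions.

```python
def closest_pair_1d(points):
    """Return (distance, (p, q)) for the closest pair among at least two numbers."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    xs = sorted(points)  # O(n log n)
    # after sorting, the closest pair must be adjacent, so one linear scan suffices
    best = min(zip(xs, xs[1:]), key=lambda pair: pair[1] - pair[0])
    return best[1] - best[0], best

print(closest_pair_1d([7, 1, 12, 4, 5]))  # (1, (4, 5))
```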

**2D**
* points have two coordinates => there are two ways to sort them. 
* So let's just sort them both ways. That is the first step of our algorithm, which you should really think of as a preprocessing step. D&C1:
    * Px = the array P sorted by x
    * Py = the array P sorted by y
    * done with merge sort, it takes O(nlog(n)) time
    * "for-free primitives" = manipulations or operations you can do on data which are basically costless. Sorting is one of them
* D&C2 on sorted arrays
    * divide: we split the input point set into a left half and a right half 
    * conquer: we recurse on the left and right halves
    * combine: the trickiest part: 
        * you've computed the closest pair on the left half of the points, 
        * and the closest pair on the right half of the points, 
        * how do you then quickly recover the closest pair for the whole point set
        
**ClosestPair(Px, Py) Algorithm D&C2**
* Base case omitted: once you have a small number of points, say two or three, you can just solve the problem in constant time by brute-force search
* (1) Divide: let Q = left half of P, R = right half of P; form Qx, Qy, Rx, Ry 
    * think through carefully, or maybe even code up, how you would form Qx, Qy, Rx and Ry given that you already have Px and Py.
    * because Px and Py are already sorted, producing these sorted sublists takes linear time
    * in some sense the opposite of the merge subroutine used in merge sort: splitting rather than merging. But again, this can be done in linear time
* (2) Conquer: 
    * we recursively call ClosestPair on each of the two subproblems, 
    * so when we invoke ClosestPair on the left half of the points, Q, 
    * we get back the closest pair of points amongst those in Q
    * (a) (p1, q1) = ClosestPair(Qx, Qy)
    * (b) (p2, q2) = ClosestPair(Rx, Ry)
    * lucky case: both points of the closest pair in all of P lie in Q, or both lie in R
    * unlucky case (split): one of the points lies in the left half, Q, and the other lies in the right half, R. Neither recursive call would find it => special subroutine to find it:
        * (c) (p3, q3) = ClosestSplitPair(Px, Py)
    * the closest pair has to be either on the left, or on the right, or split
    * return best of (p1, q1) (p2, q2) (p3, q3)
* the running time of the closest pair algorithm is in part determined by the running time of ClosestSplitPair.     
    * what is all of the work that we do in this algorithm? In the preprocessing step we call merge sort twice; we know that's O(nlog(n))
    * then we have a recursive algorithm with the following flavor: it makes two recursive calls
        * Each recursive call is on a problem of exactly half the size, with half the points of the original one. 
        * And outside of the recursive calls, by assumption, we do a linear amount of work in computing the closest split pair
        * the exact same recursion tree which proves an nlog(n) bound for merge sort proves an nlog(n) bound for how much work we do after the preprocessing step, so that gives us an overall running time bound of O(nlog(n))  
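
One possible way to carry out the divide step (1) in linear time is sketched below. It assumes points are `(x, y)` tuples with distinct x coordinates, and that `Px` and `Py` hold the same points sorted by x and by y; `split_sorted` is a hypothetical helper name.

```python
def split_sorted(Px, Py):
    """Split P into left half Q and right half R; return (Qx, Qy, Rx, Ry).

    Assumes len(Px) >= 2 and distinct x coordinates, as in the problem statement.
    """
    mid = len(Px) // 2
    Qx, Rx = Px[:mid], Px[mid:]            # halves of the x-sorted list stay x-sorted
    x_bar = Qx[-1][0]                      # largest x coordinate in the left half
    Qy = [p for p in Py if p[0] <= x_bar]  # a linear scan preserves the y order
    Ry = [p for p in Py if p[0] > x_bar]
    return Qx, Qy, Rx, Ry

P = [(0, 5), (2, 1), (4, 3), (6, 0)]
Px = sorted(P)                      # sorted by x
Py = sorted(P, key=lambda p: p[1])  # sorted by y
print(split_sorted(Px, Py))
```

No sorting happens inside the helper: filtering an already sorted list keeps it sorted, which is why the split costs only O(n).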
        
**Key idea**
* we don't actually need a full-blown correct implementation of the closest split pair subroutine. 
    * We do not actually need a subroutine that, for every point set, always correctly computes the closest split pair of points
    * The reason is that this is a strictly harder problem than what we need for a correct recursive algorithm. 
* we only need to compute the closest split pair correctly in the unlucky case, where its distance is less than both d(p1,q1) = result of the 1st call and d(p2,q2) = result of the 2nd call

**ClosestPair(Px, Py) Algorithm improved**
let's rewrite the high level recursive algorithm slightly to make use of this observation that the closest split pair subroutine only has to operate correctly in the regime of the unlucky case
* (1) Divide: let Q = left half of P, R = right half of P; form Qx, Qy, Rx, Ry
* (2) (p1, q1) = ClosestPair(Qx, Qy)
* (3) (p2, q2) = ClosestPair(Rx, Ry)
* (4) d = min(d(p1,q1), d(p2,q2))
* (5) (p3, q3) = ClosestSplitPair(Px, Py, d): we pass d as a parameter to the subroutine
* (6) return best of (p1, q1) (p2, q2) (p3, q3)  

We need (5) to
* run in O(n) time
* correctly compute the closest pair of P whenever it is a split pair
  
**ClosestSplitPair(Px, Py, d) Subroutine Algorithm**
* let x_bar = biggest x coordinate in the left half of P
    * since we're passed a copy of the points sorted by x coordinate, we can find x_bar in constant time by accessing the relevant entry of the array Px (O(1) time)
* d is the parameter that controls whether or not we actually care about the closest split pair
    * we care if and only if there is a split pair at distance less than d.
    * width of the strip is 2d
* Let Sy = points of P with x coordinates within [x_bar - d, x_bar + d], sorted by y
    * To extract Sy from Py, all we need is a simple linear scan through Py, checking each point's x coordinate => O(n) time
* It's useful to have the points of this vertical strip sorted by y coordinate.
    * This is why we ran merge sort at the beginning of the algorithm, before any recursion.
    * Remember our running time goal for ClosestSplitPair:
    * we want it to run in linear time, which means we cannot sort inside the subroutine
    * Extracting sorted sublists from already sorted lists of points can be done in linear time
* Variables to keep track of the best candidate we've seen so far
    * best = d
    * best_pair = null
* for i = 1 to |Sy|-1
    * for j = 1 to min(7, |Sy|-i):
        * let p,q = i-th and (i+j)-th points of Sy # we are looking at pairs that are within seven positions of each other
        * if d(p,q) < best => best_pair = (p,q), best = d(p,q)
* at the end
    * return best_pair
    * one possible execution of ClosestSplitPair is that it never finds a pair of points p and q at distance less than d. In that case it returns null, and the outer call returns the better of (p1, q1) and (p2, q2).
* Running time: two nested loops usually take O(n^2) time, but here the inner loop runs at most 7 times, so it takes O(1) per outer iteration, and the two nested loops take O(n) in total
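
The steps above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: it assumes the points are distinct `(x, y)` tuples and that there are at least two of them, and the helper names (`dist`, `brute_force`, `closest_pair_rec`) are made up for illustration.

```python
import math

def dist(p, q):
    """Euclidean distance between two (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def brute_force(points):
    """Check every pair; used only as the base case (2 or 3 points)."""
    best, best_pair = math.inf, None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dd = dist(points[i], points[j])
            if dd < best:
                best, best_pair = dd, (points[i], points[j])
    return best_pair

def closest_split_pair(px, py, d):
    """Return a pair at distance < d found in the strip, or None."""
    x_bar = px[len(px) // 2 - 1][0]            # biggest x in the left half, O(1)
    sy = [p for p in py if x_bar - d <= p[0] <= x_bar + d]  # linear scan of Py
    best, best_pair = d, None
    for i in range(len(sy) - 1):
        # only the next 7 points in y order need to be examined
        for j in range(i + 1, min(i + 8, len(sy))):
            dd = dist(sy[i], sy[j])
            if dd < best:
                best, best_pair = dd, (sy[i], sy[j])
    return best_pair

def closest_pair_rec(px, py):
    n = len(px)
    if n <= 3:
        return brute_force(px)
    mid = n // 2
    qx, rx = px[:mid], px[mid:]
    left = set(qx)                              # assumes distinct points
    qy = [p for p in py if p in left]           # keeps the y order
    ry = [p for p in py if p not in left]
    pair1 = closest_pair_rec(qx, qy)
    pair2 = closest_pair_rec(rx, ry)
    d = min(dist(*pair1), dist(*pair2))
    pair3 = closest_split_pair(px, py, d)
    candidates = [pair1, pair2] + ([pair3] if pair3 else [])
    return min(candidates, key=lambda pair: dist(*pair))

def closest_pair(points):
    px = sorted(points)                              # sorted by x
    py = sorted(points, key=lambda p: (p[1], p[0]))  # sorted by y
    return closest_pair_rec(px, py)
```

The two sorts happen once, up front; every recursive call only filters already sorted lists, which is what keeps the work outside the recursive calls linear.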

**Correctness claim**
* let p belong to Q and q belong to R, such that d(p,q) < d
* then:
    * (A) p and q are members of Sy
    * (B) p and q are at most 7 positions apart from each other
* if this claim is true
    * Corollary 1: If the closest pair of P is a split pair, then ClosestSplitPair finds it
        * the claim guarantees that every potentially interesting split pair, i.e. every split pair with distance less than d, meets both criteria required to be examined by the ClosestSplitPair subroutine.
        * => in the unlucky case where the best pair of points is a split pair, this claim guarantees that ClosestSplitPair will compute the closest pair of points
    * Corollary 2: ClosestPair is correct, and runs in O(nlog(n)) time

**Claim explanation**  
Part A 
* point p: (X1, Y1) from the left half of the point set. 
* point q: (X2, Y2) from the right half of the point set. 
* We're assuming that these points are close to each other: the Euclidean distance between p and q is less than the parameter d.
    * if two points are close in Euclidean distance, then both of their coordinates have to be close to each other
    * sqrt((X1 - X2)^2 + (Y1 - Y2)^2) =< d =>
    * |X1 - X2| =< d and |Y1 - Y2| =< d
* p and q are both in Sy <=> X1 and X2 are within [X_bar - d, X_bar + d]
* proof by picture
    * p belongs to Q, and X_bar is the rightmost x coordinate of Q => X1 =< X_bar
    * q belongs to R, and X_bar is the rightmost x coordinate of Q => X2 >= X_bar
    * since |X1 - X2| =< d and X1 =< X_bar => X2 =< X1 + d =< X_bar + d
    * since |X1 - X2| =< d and X2 >= X_bar => X1 >= X2 - d >= X_bar - d
  
Part B  
* What we're saying is that once we filter down to the points in the vertical strip, we just go through them in order of y coordinate.
    * We don't just look at adjacent pairs. 
    * We look at pairs within seven positions, but still we basically do a linear sweep through the points in Sy
* Picture: 
    * 8 boxes, each with side d/2 (2 rows of 4 boxes),
    * centered around X_bar
    * the bottom edge is at the smaller of the y coordinates of p and q
    * by design, either p or q is on this bottom line
        * (i) both p and q have to be in these boxes. 
        * (ii) these boxes are sparsely populated: every one contains either zero or one point of the array Sy
    * what we're going to see is that there are at most eight points in this picture,
        * two of which are p and q, 
        * and therefore, if you look at these points sorted by Y coordinate, 
        * it has to be that they're within seven of each other, 
        * the difference of indices is no more than seven    
    * Lemma 1: all points of Sy whose y coordinates lie between those of p and q appear in this picture: they lie in one of these eight boxes
        * First, we argue that all such points have y coordinates within the range of this picture, between min{Y1, Y2} and min{Y1, Y2} + d
            * sqrt((X1 - X2)^2 + (Y1 - Y2)^2) =< d => |Y1 - Y2| =< d
            * whichever of p and q has the smaller y coordinate is precisely at the bottom of this diagram.
            * For example, if q is the one with the smaller y coordinate, it lies on the bottom line. Then p has y coordinate at most d bigger than q's, i.e. no higher than the top of this diagram.
            * So all points of Sy with a y coordinate between those of p and q have to be in the range of this picture, between min{Y1, Y2} and min{Y1, Y2} + d
        * Second, they have x coordinates in the range of this picture, namely between X_bar - d and X_bar + d
            * By definition, all points of Sy have x coordinates between X_bar - d and X_bar + d
            * so the x coordinates of all such points are within this picture
    * Lemma 2: at most one point of P in each box
        * Proof by contradiction
        * Suppose two points a, b lie in a single box, then
        * (i) a, b are either both in Q or both in R, because every box lies entirely on one side of X_bar (points with x coordinates at least X_bar lie in the right half R)
        * (ii) d(a,b) =< d*sqrt(2)/2 < d, since the diagonal of a d/2 x d/2 box has length d*sqrt(2)/2
        * (i) + (ii) contradict how we defined d: it is the smallest distance between two points that are both in Q or both in R.
* Lemmas 1 + 2 => at most 8 points in this picture including p and q
    * in the worst case every box has a point, and each of them has a y coordinate between those of p and q
    * starting from q and looking 7 positions ahead of the array, you are guaranteed to find this point p

**Recap**
1. we're given n points in the plane. 
2. We begin by sorting them twice: once by x-coordinate and once by y-coordinate. That takes **nlog(n) time**. 
3. Then we enter the main recursive divide and conquer part of the algorithm. 
    * We divide the point set into the left half and the right half, Q and R
    * then we conquer: 
        * we recursively compute the closest pair in the left half of the point set Q. 
        * We recursively compute the closest pair in the right half of the point set R. 
    * There is a lucky case where the closest pair on the entire point set lies either all on the left or all on the right. 
        * In that case, the closest pair is handed to us on a silver platter, by one of the two recursive calls. 
    * But there remains the unlucky case where the closest pair is actually split with one point on the left and one point on the right. 
        * We need a linear time implementation of the ClosestSplitPair subroutine
        * Actually we need something weaker: a linear time algorithm that computes the closest split pair whenever the closest pair of the whole point set is in fact split. It has two basic steps.
            * Filtering step. It looks at a vertical strip roughly down the middle of the point set, and only at the points that fall into that strip. This is the subset of the points that we called Sy.
            * Linear scan through Sy. We go through the points one at a time, and for each point we look only at the almost adjacent points: for each index i, we look only at the j's that are between one and seven positions further along in Sy than i. We compare the distances of all such pairs, remember the best pair, and return it from the ClosestSplitPair subroutine.
            
            
            
    

# 4. The Master Method


### 4.1 Intro
* A general mathematical tool for analyzing the running time of divide and conquer algorithms
* It'll give us good advice about which divide and conquer algorithms are likely to run quickly and which ones are likely to run less quickly
* The "school" algorithm for multiplying two n-digit numbers: T(n) = O(n^2)
* A recursive algorithm for multiplying two n-digit numbers
    * T(n) = maximum number of operations this algorithm needs to multiply two n-digit numbers = the worst case number of operations
    * Recurrence: a way to express T(n) in terms of the running time of recursive calls. It has 2 ingredients:
    * Base case T(1) =< a constant
    * General case: running time in terms of two pieces
        * First of all the work done by the recursive calls, 
        * and second of all the work that's done outside of the recursive calls
        * For all n > 1: T(n) =< 4T(n/2) (= 4 rec calls) + O(n) (= outside of calls: we pad the results of the recursive calls with a bunch of zeros and we add them up)
* "Clever" recursive algorithm with Gauss
    * We recursively compute ac (1), like before,
    * and bd (2), like before.
    * But then we compute the product of a + b with c + d (3).
    * ad + bc = (3) - (2) - (1)
    * New recurrence:
        * Base case: T(1) =< a constant
        * General case: For all n > 1: T(n) =< 3T(n/2) (= 3 rec calls) + O(n) (we ignore that the 3rd recursive call might get numbers with one extra digit as input)
* We compare those last 2 algorithms:
    * we don't yet know the running time of either of these two recursive algorithms, but we can be confident that Gauss's version can only be better
    * Another point of contrast is merge sort. Think about what the recurrence would look like for the merge sort algorithm
        * It would be almost identical to Gauss's recurrence, except instead of a three we'd have a two.
        * Merge sort makes two recursive calls, each on an array of half the size.
        * And outside of the recursive calls it does linear work, namely for the merge subroutine
        * We know the running time of merge sort: nlog(n).
        * Gauss's algorithm is going to be worse, but we don't know by how much
* we have no idea what the running time of Gauss's recursive algorithm for integer multiplication really is. 
    * We don't know what the solution to this recurrence is. 
    * But it will be one super-special case of the general master method
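
To make the three recursive calls concrete, here is a minimal Python sketch of the recursive multiplication with Gauss's trick. It assumes non-negative integers and splits on decimal digits; the function name `karatsuba` and the single-digit base case are choices made for this illustration, not something fixed by the notes.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using 3 recursive calls."""
    # Base case: a single-digit factor is multiplied directly.
    if x < 10 or y < 10:
        return x * y

    # Split both numbers around the same power of ten:
    # x = a * 10^half + b,  y = c * 10^half + d
    half = max(len(str(x)), len(str(y))) // 2
    a, b = divmod(x, 10 ** half)
    c, d = divmod(y, 10 ** half)

    ac = karatsuba(a, c)                # (1)
    bd = karatsuba(b, d)                # (2)
    cross = karatsuba(a + b, c + d)     # (3) = ac + ad + bc + bd
    ad_plus_bc = cross - bd - ac        # (3) - (2) - (1)

    # Padding with zeros and adding up: linear work outside the calls.
    return ac * 10 ** (2 * half) + ad_plus_bc * 10 ** half + bd


print(karatsuba(1234, 5678) == 1234 * 5678)
```

The three calls on roughly half-size inputs plus linear extra work are exactly what the recurrence T(n) =< 3T(n/2) + O(n) describes.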
    
### 4.2 Formal statement
* a black box for solving recurrences. 
    * takes an input a recurrence in a particular format 
    * spits out as output a solution to that recurrence, an upper bound on the running time of your recursive algorithm
* Requires a few assumptions
    * (1) only relevant for problems in which all of the subproblems have exactly the same size
        * There are generalizations of the master method shown here that can accommodate unbalanced subproblem sizes, but those are outside the scope of this course.
        * e.g. the deterministic algorithm for linear time selection makes two recursive calls on different subproblem sizes
    * (2) the format of the recurrences to which the master method applies (covers pretty much all the cases you're likely to ever encounter)
        * Base case: T(n) =< cst, for all sufficiently small n.
        * General case: for all larger n: T(n) =< a*T(n/b) + O(n^d)
            * a: number of subproblems = number of recursive calls, a >= 1
            * b: the factor by which the input size shrinks before a recursive call is applied, a constant > 1 (it has to be strictly bigger than 1 so that the recursion eventually stops)
            * d: the exponent in the running time of the "combine" step = the amount of work done outside of the recursive calls. d could be as small as 0, which would indicate a constant amount of work outside of the recursive calls
            * a, b, d are constants, independent of n
  
**Statement of Master Method:** 
* It has 3 cases: T(n) =
    * if a = b^d: O(n^d*log(n)) 
    * if a < b^d: O(n^d)
        * You do some recursion, plus you do n to the d work outside of the recursion. 
        * So in the second case, it actually says that the work is dominated by just what's done outside the recursion in the outermost call
    * if a > b^d: O(n^(log_base_b(a)))
* this version of the master method only gives upper bounds. 
    * we only say that the solution to the recurrence is big-O of some function.
    * And that's because if you go back to our recurrence, we used big-O rather than theta in the recurrence. 
    * And this is in the spirit of the course, where as algorithm designers, our natural focus is on upper bounds, on guarantees for the worst case running time of an algorithm 
    * But it also works with Theta in the recurrence, and then the solution becomes asymptotically exact
* Logarithms:
    * in case one, with the logarithm, I'm not specifying the base
        * it's because logarithms with respect to any two different bases differ by a constant factor (independent of the argument n)
        * So you can switch this logarithm to whatever constant base you like, it only changes the leading constant factor, 
        * which of course is being suppressed in the big-O notation anyways
    *  in case three, where we have a logarithm in the exponent, once it's in the exponent, we definitely care about that constant

### 4.3 Examples

**Ex1: Merge Sort**
* Constant determination
    * a = 2 rec calls
    * b = 2 factor by which the subproblem size is smaller than the original
    * d = 1 because the outside of the recursive calls, all merge sort does is merge. And that's a linear time subroutine
* a = 2 = b^d = 2 => case 1
    * O(n^d*log(n)) = O(nlog(n))

**Ex2: binary search algorithm in a sorted array**
* a search algorithm that finds the position of a target value within a sorted array
    * Binary search compares the target value to the middle element of the array. 
    * If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half
* Constant determination
    * a = 1 rec call: after comparing the target with the middle element, we recurse on only one of the two halves
    * b = 2 you recurse on a problem of half the size
    * d = 0 the outside of the recursive call the only thing you do is one comparison: You just determine whether the element you're looking for is bigger than or less than the middle element of the array that you recursed on
* a = 1 = b^d = 1 => case 1
    * O(n^d*log(n)) = O(log(n))
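
A minimal recursive sketch of binary search that matches these constants (one recursive call, half the input, constant work outside the call). The index-based signature with `lo` and `hi` is an assumption of this sketch; it avoids copying subarrays, which would add linear work per call.

```python
def binary_search(arr, target, lo=0, hi=None):
    """Return an index of target in the sorted list arr, or -1 if absent."""
    if hi is None:
        hi = len(arr) - 1
    if lo > hi:                      # empty range: target is not present
        return -1

    mid = (lo + hi) // 2
    if arr[mid] == target:           # the single comparison step (d = 0)
        return mid
    if arr[mid] < target:            # target can only be in the right half
        return binary_search(arr, target, mid + 1, hi)
    return binary_search(arr, target, lo, mid - 1)   # or in the left half


print(binary_search([1, 3, 5, 7, 9, 11], 7))
```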
    
**Ex 3: first recursive algorithm for integer multiplication**
* Constant determination
    * a = 4 recursions
    * b = 2
    * d = 1 (outside the recursive calls we only do additions and padding by zeros, which can be done in linear time)
* a = 4 > b^d = 2 => case 3
    * O(n^(log_base_b(a))) = O(n^2)
* Recall that the iterative algorithm for multiplying two integers also takes an n squared number of operations. 
    * So this was a clever idea to attack the problem recursively. 
    * But at least in the absence of Gauss's Trick where you just naively compute each of the four necessary products separately
    * You do not get any improvement over the iterative algorithm

**Ex 4: Gauss recursive algorithm for integer multiplication**
* Constant determination
    * a = 3 recursions
    * b = 2
    * d = 1 (outside the recursive calls we only do additions and padding by zeros, which can be done in linear time)
* a = 3 > b^d = 2 => case 3
    * O(n^(log_base_b(a))) = O(n^(log_base_2(3))) = O(n^1.59): better than n^2 for naive

**Ex 5: Strassen's matrix multiplication algorithm**
* Naive way: 8 subproblems
* Strassen: 7 subproblems
* Constant determination
    * a = 7 recursions
    * b = 2
    * d = 2 (the work done outside of the recursive calls is linear in the number of matrix entries, hence quadratic in the dimension n, because an n x n matrix has n^2 entries)
* a = 7 > b^d = 4 => case 3
    * O(n^(log_base_b(a))) = O(n^(log_base_2(7))) = O(n^2.81): better than n^3 for naive

**Ex 6: fictitious recurrence just to illustrate Case 2**
* recurrence: 
    * like merge-sort: There's two recursive calls each on half the problem size.
    * The only difference is that in this recurrence we're working harder on the combine step. Instead of linear work outside of the recursive calls we're doing quadratic work
    * T(n) =< 2T(n/2) + O(n^2)
* Constant determination
    * a = 2 recursions
    * b = 2
    * d = 2 
* a = 2 < b^d = 4 => case 2
    * O(n^d) = O(n^2)
* And you might find this a little counter-intuitive,
    * Compared to merge sort, all we do here is change the combine step from linear to quadratic. 
    * And merge sort has a running time of n log n. 
    * You might have expected the running time here to be n squared log n. 
    * But that would be an over estimate, so the master method gives us a tighter upper bound, shows that it's only quadratic work. 
    * So put differently, the running time of the entire algorithm is governed by the work outside of the recursive calls.
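
A quick numeric check of this example, as a sketch: it evaluates the recurrence exactly with base case T(1) = 1 and constant 1 in front of n^2 (both are assumptions made only for this experiment) and compares the result with n^2 and n^2*log(n).

```python
import math

def T(n):
    """T(n) = 2*T(n/2) + n^2 with T(1) = 1, for n a power of 2."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n * n


for k in (4, 8, 12, 16):
    n = 2 ** k
    ratio_quadratic = T(n) / n ** 2                     # stays bounded
    ratio_with_log = T(n) / (n ** 2 * math.log2(n))     # keeps shrinking
    print(n, round(ratio_quadratic, 4), round(ratio_with_log, 4))
```

With these constants the recurrence solves to T(n) = 2n^2 - n for powers of two, so the first ratio stays below 2 while the second tends to 0: n^2*log(n) would indeed be an overestimate.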

### 4.4 Proof 

* the proof will be fairly conceptual, not with every i dotted and every t crossed. 
    * There are a couple of spots where we have to do some computations. 
    * And the computations are worth seeing once in your life
    * the proof follows a recursion tree approach, just like the one we used in the running time analysis of the MergeSort algorithm
    * it is worth remembering the three types of recursion trees that the three cases of the master method correspond to.
    
**Assumptions**  
1. recurrence is (for some constant c)
    * Base case: T(1) =< c 
    * General case: for all n > 1: T(n) =< a*T(n/b) + c*n^d
2. n is a power of b
    * (the general case is similar but more tedious)
    
**Proof**    
* Idea: generalize MergeSort analysis (= use a recursion tree)
* We mimic the recursion tree
    * input size = n
    * a = number of recursive calls
    * b = shrink factor
    * j = level = 0, 1, ..., log_base_b(n)
    * Identify pattern at level j:
        * how many distinct subproblems are there at level J? a^j
        * what is the input size that each of those level J subproblems has to operate on? n/(b^j)
* Total amount of work
    * Amount of work at level j [ignoring work in recursive calls] =< a^j * c*[n/(b^j)]^d
        * a^j = number of subproblems
        * from the recurrence's second term we know the amount of work done outside the recursive calls on an input of size n: c*n^d => for a level-j subproblem it will be: c*[n/(b^j)]^d
        * = c*n^d * [a/(b^d)]^j
    * Total amount of work =< c*n^d * sum_on_j(0, log_b(n)) of ([a/(b^d)]^j)
    
**Interpretation of the total amount expression**  
* look at that expression, 
* attach some semantics to it, and 
* look at how that interpretation naturally leads to three cases, 
* and also give intuition for some of the running times that we see in a master method
  
For level j the upper bound on the amount of work is: c*n^d * [a/(b^d)]^j  

* a = rate of subproblem proliferation the deeper we go into the recursion tree (RSP) = forces of evil
* b^d = rate of work shrinkage per subproblem (RWS) = forces of good
    * we don't really care about the input size of a subproblem, except inasmuch as it determines the amount of work that we do solving that subproblem
    * Compare a linear amount of work outside the recursive calls with a quadratic amount, i.e. the cases d = 1 and d = 2
        * If b = 2 and d = 1 <=> you recurse on half the input and do linear work => not only does the input size drop by a factor of two, but so does the amount of work that you do per subproblem 
        * If b = 2 and d = 2 => the recursive call only does 25 percent as much work as what you did at the current level 
    * So in general the input size goes down by a factor b, but what we really care about, how much less work we do per subproblem, goes down by a factor b^d => b^d = the extent to which we work less hard per subproblem with each recursion level j.

3 cases:
* if a = b^d: they are equal 
    * => same amount of work on every level 
    * => total running time: [number of levels]*[amount of work on a level] = (log_b(n) + 1)*c*n^d = O(n^d*log(n))
* if a < b^d: forces of good win 
    * => less work on each successive recursion level 
    * => the most work is done at the root level 
    * => the simplest possible thing that might be true (and actually is): the root level dominates the overall running time of the algorithm, and the other levels don't matter up to a constant factor
    * if that's true => we expect a running time proportional to the work done at the root
    * => we already know that that's n^d, because that's just the outermost call to the algorithm
    * => O(n^d)
* if a > b^d: forces of evil win 
    * => more work on each successive recursion level
    * => the most work is done at the leaves level, and the leaves might dominate
    * if that's true => we'd expect a running time proportional to the number of leaves in the recursion tree
    * => O(# of leaves)
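
To see the three kinds of recursion tree, the per-level bound a^j * c*[n/(b^j)]^d can simply be tabulated. The sketch below assumes n is a power of b and takes c = 1; the function name `work_per_level` is made up for this illustration.

```python
def work_per_level(a, b, d, n, c=1):
    """Upper bound on the work at each level j = 0 .. log_b(n).

    Level j has a^j subproblems of size n / b^j, each doing
    c * (n / b^j)^d work outside its recursive calls.
    """
    levels = []
    size, j = n, 0
    while size >= 1:
        levels.append(a ** j * c * size ** d)
        size //= b
        j += 1
    return levels


print(work_per_level(2, 2, 1, 8))   # a = b^d: same work on every level
print(work_per_level(2, 2, 2, 8))   # a < b^d: the root dominates
print(work_per_level(4, 2, 1, 8))   # a > b^d: the leaves dominate
```

For n = 8 the three calls give a flat, a decreasing and an increasing sequence respectively, and the last entry of the third one equals the number of leaves, a^log_b(n) = 4^3 = 64.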
    
**Proof (pt 2)**  
Total amount of work =< c*n^d * sum_on_j(0, log_b(n)) of ([a/(b^d)]^j)
* if a = b^d 
    * every term of the sum equals 1 => Total amount of work =< c*n^d * (log_b(n) + 1) = O(n^d*log(n))
* if a != b^d <=> r = a / b^d, r != 1
    * remember geometric series: 
        * 1 + r + r^2 + ... + r^k = (r^(k+1) - 1) / (r - 1) (proof by induction)
    * if r < 1:  
        * 1 + r + r^2 + ... + r^k =< 1 / (1 - r) = cst (indep of k)
        * ie 1st term dominates
        * Total amount of work =< c*n^d * cst = O(n^d)
    * if r > 1:
        * 1 + r + r^2 + ... + r^k =< r^k * r/(r - 1) = cst*r^k 
        * ie last term dominates
        * Total amount of work =< c*n^d * cst * r^(log_b(n)) = O(n^d * r^(log_b(n))) = O(n^d * (a/(b^d))^(log_b(n))) = O(a^log_b(n)), because (b^d)^(log_b(n)) = n^d
        * NB: a^log_b(n) = the number of leaves of the recursion tree
        * If you go back to the statement of the master method, 
            * it doesn't say a^log_b(n). In case three, it says the running time is O(n^log_b(a)) 
            * but a^log_b(n) = n^log_b(a)
            * the left side is more intuitive, the right side is simpler to apply
* QED
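
For reference, the two algebraic steps used in the case a > b^d can be written out as follows (same symbols as above, nothing new is assumed). First, the simplification of the last term of the geometric sum:

$$
n^d \left(\frac{a}{b^d}\right)^{\log_b n}
= n^d \cdot \frac{a^{\log_b n}}{\left(b^{\log_b n}\right)^d}
= n^d \cdot \frac{a^{\log_b n}}{n^d}
= a^{\log_b n}
$$

Second, the identity that connects the number of leaves with the form used in the statement of the master method:

$$
a^{\log_b n}
= \left(b^{\log_b a}\right)^{\log_b n}
= \left(b^{\log_b n}\right)^{\log_b a}
= n^{\log_b a}
$$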

General approach:  
* we started by just writing down a recursion tree for the recursive algorithm and in a generic way. 
    * And going level by level, we counted up the work done by the algorithm.
    * And this part of the proof had nothing to do with how a and b^d relate to each other. 
* Then we recognized that there are three fundamentally different types of recursion trees. 
    * Those with the same amount of work per level, 
    * those where it increases with the level, 
    * and those where it decreases with the level. 
* The running times of the three cases are then:
    * In the case where you do the same amount of work at each level,
        * we know there's a logarithmic number of levels. 
        * We know we do n^d work at the root. 
        * So that gives us the running time in case one: n^d * log(n). 
    * When the amount of work is decreasing with the levels, 
        * we now know that the root dominates. 
        * Up to a constant, we can throw out the rest of the levels, 
        * and we know n^d work gets done at the root, 
        * so that's the overall running time. 
    * And in the third case, where it's increasing with the levels, 
        * the leaves dominate. 
        * The number of leaves is a raised to the log base b of n, 
        * and that's the same as n raised to the log base b of a. 
        * And that's proportional to the running time in case three of the master method

### Assignment 2.1 (Main)

1. **Suppose the running time of an algorithm is governed by the recurrence T(n) = 7*T(n/3) + n^2. What's the overall asymptotic running time (i.e., the value of T(n))?**

a = 7, b = 3, d = 2 => a = 7 < b^d = 9 => O(n^d) => Theta(n^2)

2. **Suppose the running time of an algorithm is governed by the recurrence T(n) = 9*T(n/3) + n^2. What's the overall asymptotic running time (i.e., the value of T(n))?**

a = 9, b = 3, d = 2 => a = 9 = b^d = 9 => O(n^d * log(n)) => Theta(n^2 log(n))

3. **Suppose the running time of an algorithm is governed by the recurrence T(n) = 5*T(n/3) + 4n. What's the overall asymptotic running time (i.e., the value of T(n))?**

a = 5, b = 3, d = 1 => a = 5 > b^d = 3 => O(n^log_b(a)) => Theta(n^log_3(5))

4. **Consider the following pseudocode for calculating a^b (where a and b are positive integers)**

FastPower(a,b) :  
  if b = 1  
    return a  
  else  
    c := a*a  
    ans := FastPower(c,[b/2])  
  if b is odd  
    return a*ans  
  else return ans  
end  

**Here [x] denotes the floor function, that is, the largest integer less than or equal to x. Now assuming that you use a calculator that supports multiplication and division (i.e., you can do multiplications and divisions in constant time), what would be the overall asymptotic running time of the above algorithm (as a function of b)?**

Solution:  
* the algorithm can be interpreted in the following way:
    * if b is even: a^b = (a^2)^(b/2)
    * if b is odd: a^b = a * (a^2)^((b-1)/2)
    * if b = 1: a^b = a
* in terms of time that means
    * if b is even: T(b) = T(b/2) + const1 (const1 for the simple operations: comparison, division, multiplication)
    * if b is odd: T(b) = T((b-1)/2) + const2 =< T(b/2) + const2 (treating T as non-decreasing; const2 >= const1 because of the extra multiplication a*ans) 
    * if b = 1: T(1) = const3
* The second expression gives an upper bound for all cases: T(b) =< T(b/2) + const
    * master method with x = 1 recursive call, y = 2 (shrink factor), z = 0 => x = 1 = y^z = 1 => case 1: O(b^z * log(b)) = O(b^0 * log(b)) => Theta(log(b))
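
A direct Python transcription of the pseudocode, extended to count multiplications so the logarithmic bound can be checked. The name `fast_power` and the returned counter are additions made for this sketch only.

```python
def fast_power(a, b):
    """Return (a**b, number of multiplications used) for integers a, b >= 1."""
    if b == 1:
        return a, 0

    c = a * a                              # one multiplication per level
    ans, mults = fast_power(c, b // 2)     # [b/2] is the floor of b/2
    mults += 1

    if b % 2 == 1:                         # b is odd: one extra multiplication
        return a * ans, mults + 1
    return ans, mults


value, mults = fast_power(3, 13)
print(value == 3 ** 13, mults)
```

Each level halves b and does at most two multiplications, so the count is at most 2*floor(log_2(b)), in line with the Theta(log(b)) answer.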


5. **Choose the smallest correct upper bound on the solution to the following recurrence: T(1) = 1 and T(n)=< T([sqrt(n)]) + 1  for n>1. Here [x] denotes the "floor" function, which rounds down to the nearest integer. (Note that the Master Method does not apply.)**

Solution: 
* We have a recurrence that is ostensibly over the integers. 
* But if 𝑛 is an integer, then sqrt(𝑛) is not necessarily an integer
* One way of getting an informal answer is to imagine that we start with an integer 𝑛 = 2^(2^k)
* then [sqrt(n)] = 2^(2^(k-1)) 
* T(n) = T(2^(2^k)) 
    * = T([sqrt(2^(2^k))]) + 1 
    * = T([2^(2^(k-1))]) + 1 
    * = T([sqrt(2^(2^(k-1)))]) + 1 + 1 
    * = T([2^(2^(k-2))]) + 2
    * = T([2^(2^0)]) + k
    * = T(2) + k
* And T(2) = T([sqrt(2)]) + 1 = T(1) + 1 = 2
    * => T(n) = 2 + k
* k = log_2(log_2(n)) 
    * T(n) = 2 + log_2(log_2(n)) = O(log(log(n))) for n>1
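
The informal derivation can be checked numerically. The sketch below evaluates the recurrence with equality (T(1) = 1, T(n) = T([sqrt(n)]) + 1), which is an assumption made for the experiment since the original only gives an upper bound; `math.isqrt` computes the floor of the square root exactly.

```python
import math

def T(n):
    """T(1) = 1, T(n) = T(floor(sqrt(n))) + 1 for n > 1."""
    if n == 1:
        return 1
    return T(math.isqrt(n)) + 1


for k in range(6):
    n = 2 ** (2 ** k)
    # for n = 2^(2^k) the derivation predicts T(n) = 2 + k
    print(k, T(n), 2 + k)
```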

### Assignment 2.2 (Additional)

**1. You are given as input an unsorted array of n distinct numbers, where n is a power of 2. Give an algorithm that identifies the second-largest number in the array, and that uses at most n + log_2(n) - 2 comparisons.**

In [20]:
# straightforward algorithm: merge-sort the array in place, then read off the second-to-last element

def SecondLarge(arr):
    if len(arr) > 1: # 1 operation
        mid = len(arr) // 2 # 1 operation
        L = arr[:mid] # 1 operation
        R = arr[mid:] # 1 operation
        SecondLarge(L) # sorts L in place
        SecondLarge(R) # sorts R in place
 
        i = j = 0 # two iterators for traversing the two halves

        for k in range(len(arr)):
            # take from L if R is exhausted, or if L still has elements and its head is smaller
            if j >= len(R) or (i < len(L) and L[i] < R[j]):
                arr[k] = L[i]
                i += 1
            else:
                arr[k] = R[j]
                j += 1
        
        return arr[-2] 
    
    else:
        return "no second largest number"

In [220]:
x = [110, 9, 5, 4, 11, 10, 120, 100]
SecondLarge(int_list)

99999

**Analysis:**  
* n = 2^k
* 2 recursive calls, plus 4 setup operations per call 
    * number of levels: log_2(n) + 1 = k + 1
    * subproblems on level j: 2^j
    * input length on level j: n/2^j = 2^(k-j)
* merge on an input of length m, as for merge-sort: 4m + 2 =< 6m
* Total: sum_j_0_k(2^j * 6 * 2^(k-j))
    * = sum_j_0_k(2^k * 6) 
    * = (k+1)*6*2^k 
    * = 6*n*(log_2(n) + 1)
    * = O(nlog(n))
* this is O(nlog(n)) comparisons, which exceeds the n + log_2(n) - 2 budget for large n

In [349]:
# another approach: a knockout tournament; the second largest must have lost directly to the winner
def get_greaters_lst(lst):
    # returns [winner, then every element that lost a direct comparison to the winner]
    if len(lst) == 1: # base case, no comparison between elements
        return lst

    mid = len(lst) // 2
    lst1 = get_greaters_lst(lst[:mid]) 
    lst2 = get_greaters_lst(lst[mid:])
        
    if lst1[0] > lst2[0]: # 1 comp
        lst1.append(lst2[0]) 
        return lst1
    else:
        lst2.append(lst1[0])
        return lst2


def SecondLargest(arr):
    greaters_lst = get_greaters_lst(arr)[1:] # the log_2(n) candidates that lost to the maximum
    max_num = greaters_lst[0]  
    
    for candidate in greaters_lst[1:]: # log_2(n) - 1 comparisons among the candidates
        if candidate > max_num: # 1 comp
            max_num = candidate 

    return max_num


In [350]:
x = [3,100,1,15,7,120,22,4,5,125,14,16,77,121,222,41]
SecondLargest(x)

[222, 41, 121, 125, 120]


125

**Analysis**
* Number of comparisons per block:
    * recursion (tournament) = 1 comp * (n/2 + n/4 + ... + n/n) = n-1
    * candidates = 1 comp * (log_2(n) - 1) 
* total = n + log_2(n) - 2
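
The count can be verified empirically with an instrumented version of the tournament. This is a sketch under the stated assumptions (n is a power of 2, n >= 2, distinct numbers); the function name `second_largest_with_count` and the explicit counter are introduced only for this check.

```python
def second_largest_with_count(arr):
    """Return (second largest element, number of element comparisons)."""
    comparisons = 0

    def tournament(lo, hi):
        # Returns [winner of arr[lo:hi], elements that lost directly to it].
        nonlocal comparisons
        if hi - lo == 1:
            return [arr[lo]]
        mid = (lo + hi) // 2
        left = tournament(lo, mid)
        right = tournament(mid, hi)
        comparisons += 1
        if left[0] > right[0]:
            left.append(right[0])
            return left
        right.append(left[0])
        return right

    beaten = tournament(0, len(arr))[1:]   # log_2(n) candidates
    best = beaten[0]
    for candidate in beaten[1:]:           # log_2(n) - 1 comparisons
        comparisons += 1
        if candidate > best:
            best = candidate
    return best, comparisons


x = [3, 100, 1, 15, 7, 120, 22, 4, 5, 125, 14, 16, 77, 121, 222, 41]
print(second_largest_with_count(x))   # n = 16, so the budget is 16 + 4 - 2 = 18
```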


**2. You are given a unimodal array of n distinct elements, meaning that its entries are in increasing order up until its maximum element, after which its elements are in decreasing order. Give an algorithm to compute the maximum element that runs in O(log n) time.**


In [512]:
def MaxUnimodal(arr, lo=0, hi=None):
    # search arr[lo..hi] by index: slicing (arr[mid:]) would copy O(n) elements
    # on every call and break the O(log n) bound
    if hi is None:
        hi = len(arr) - 1
    
    if lo == hi: # a single element left: it is the maximum
        return arr[lo] 
    
    mid = (lo + hi)//2  # mid < hi, so arr[mid+1] always exists
        
    if arr[mid] < arr[mid+1]: # still on the increasing part => maximum is to the right of mid
        return MaxUnimodal(arr, mid+1, hi)
    
    else: # arr[mid] > arr[mid+1]: on the decreasing part => maximum is at mid or to its left
        return MaxUnimodal(arr, lo, mid)
        

In [513]:
x = [1, 13, 11, 10]
MaxUnimodal(x)

13

**3. You are given a sorted (from smallest to largest) array A of n distinct integers which can be positive, negative, or zero. You want to decide whether or not there is an index i such that A[i] = i. Design the fastest algorithm that you can for solving this problem.**

In [None]:
1. Halve the length of the sequence to get the middle index i
2. Look at the number at that index (if A[i] = i, the answer is yes)
3. If A[i] < i => continue with the right half
4. If the opposite holds (A[i] > i) => continue with the left half
5. Repeat until the length of the sequence is 2, then check it by hand
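
The steps above can be sketched as a binary search. This works because the integers are distinct and sorted, so A[i] - i never decreases as i grows: if A[i] < i, no index to the left can match, and if A[i] > i, no index to the right can. The function name `has_fixed_point` is hypothetical, and the sketch assumes 0-based indices.

```python
def has_fixed_point(a):
    """Return True if some index i satisfies a[i] == i (0-based).

    Assumes a is sorted in increasing order and contains distinct integers.
    """
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == mid:
            return True
        elif a[mid] < mid:
            lo = mid + 1   # every index <= mid has a[i] < i
        else:
            hi = mid - 1   # every index >= mid has a[i] > i
    return False


print(has_fixed_point([-3, 0, 2, 5]))  # True, since a[2] == 2
print(has_fixed_point([1, 2, 3, 4]))   # False
```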


**4. You are given an n by n grid of distinct numbers. A number is a local minimum if it is smaller than all of its neighbors. (A neighbor of a number is one immediately above, below, to the left, or the right. Most numbers have four neighbors; numbers on the side have three; the four corners have two.) Use the divide-and-conquer algorithm design paradigm to compute a local minimum with only O(n) comparisons between pairs of numbers. (Note: since there are n^2 numbers in the input, you cannot afford to look at all of them. Hint: Think about what types of recurrences would give you the desired upper bound.)**

https://www.coursera.org/learn/algorithms-divide-conquer/discussions/forums/-HCDbbpyEeayJw6eqJ0T8g/threads/MjTuIv-yEeadOw6eHge7Ug

### Programming Assignment 2 [on counting inversions]

The file contains all of the 100,000 integers between 1 and 100,000 (inclusive) in some order, with no integer repeated. 

Your task is to compute the number of inversions in the file given, where the i-th row of the file indicates the i-th entry of an array.

Because of the large size of this array, you should implement the fast divide-and-conquer algorithm covered in the video lectures.

The numeric answer for the given input file should be typed in.


In [207]:
with open('../algorithms_course_code/data/pg_asmt_2_data.txt', 'r') as f:
    x = f.read().splitlines()  # the with-block closes the file automatically
x[0:5]    

['54044', '14108', '79294', '29649', '25260']

In [208]:
int_list = [int(i) for i in x]
int_list[:3]

[54044, 14108, 79294]

In [209]:
def InvCountMS(arr):
    length = len(arr)
    
    if length <= 1:  # an empty or single-element input is already sorted
        return arr, 0
    
    elif length == 2:
        if arr[0] > arr[1]:
            return [arr[1], arr[0]], 1
        else:
            return arr, 0
    
    elif length > 2:
        mid = length//2       
        L = arr[:mid]
        R = arr[mid:]
        arr_L_res = InvCountMS(L)
        arr_R_res = InvCountMS(R)
        arr_L_sorted = arr_L_res[0]
        arr_R_sorted = arr_R_res[0]
        
        inv_count = arr_L_res[1] + arr_R_res[1]
        
        l = r = 0  # read positions in the left and right sorted halves
        
        sorted_arr = []
           
        for k in range(0, length):    
            if l == len(arr_L_sorted):
                sorted_arr.append(arr_R_sorted[r])
                r+=1
            
            elif r == len(arr_R_sorted):
                sorted_arr.append(arr_L_sorted[l])
                l+=1 
            
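            elif arr_L_sorted[l] <= arr_R_sorted[r]:  # ties are not inversions
                sorted_arr.append(arr_L_sorted[l])
                l+=1
            else:
                # arr_R_sorted[r] is smaller than every remaining left element,
                # so it forms an inversion with each of them
                sorted_arr.append(arr_R_sorted[r])
                inv_count += len(arr_L_sorted) - l 
                r+=1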
           
        return sorted_arr, inv_count

In [211]:
count = InvCountMS(int_list)[1]
count

2407905288
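
One way to sanity-check a divide-and-conquer inversion counter on small inputs is to compare it with a quadratic brute-force count. The sketch below is only such a cross-check, not part of the assignment solution. The name `count_inversions_brute` is hypothetical, and the O(n^2) loop is too slow for the 100,000-element file.

```python
def count_inversions_brute(arr):
    """Count pairs (i, j) with i < j and arr[i] > arr[j] in O(n^2) time."""
    count = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] > arr[j]:
                count += 1
    return count


print(count_inversions_brute([1, 3, 5, 2, 4, 6]))  # 3: the pairs (3,2), (5,2), (5,4)
```

On any short list, the second element returned by `InvCountMS` should equal this count.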