In [3]:
import random
import numpy as np

# Section 8.1

## Problem 1

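Assume the array has $n$ elements. Then the smallest possible depth of a leaf in the decision tree is $n - 1$, since a sort must make at least $n - 1$ comparisons to confirm that every adjacent pair is in order. This depth is achieved, for example, by insertion sort when the array is already sorted.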
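As a quick check of the best case, the sketch below counts the key comparisons made by insertion sort. Insertion sort is chosen here only as an example of a comparison sort, and the helper name `insertion_sort_comparisons` is hypothetical.

```python
def insertion_sort_comparisons(A):
    """Sort a copy of A with insertion sort and return the number of key comparisons."""
    A = list(A)
    comparisons = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0:
            comparisons += 1          # one comparison: A[i] versus key
            if A[i] <= key:
                break
            A[i + 1] = A[i]           # shift the larger element to the right
            i -= 1
        A[i + 1] = key
    return comparisons

print(insertion_sort_comparisons([1, 2, 3, 4, 5]))   # sorted input: n - 1 comparisons
print(insertion_sort_comparisons([5, 4, 3, 2, 1]))   # reversed input: n(n - 1)/2 comparisons
```

On sorted input each element is compared once with its predecessor, which corresponds to the shortest root-to-leaf path in the decision tree.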

## Problem 2

We have:

$$
\lg ( n ! ) = \sum_{k=1}^{n} \lg k \sim \int_{1}^{n} \lg x dx = \Theta ( n \lg n ).
$$
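A numerical sanity check of this estimate might look as follows. It is only a sketch: it evaluates $\lg(n!)$ through the log-gamma function (using $n! = \Gamma(n+1)$), and the helper name `lg_factorial` is an assumption made for illustration.

```python
import math

def lg_factorial(n):
    """Return lg(n!) using the log-gamma function, since n! = Gamma(n + 1)."""
    return math.lgamma(n + 1) / math.log(2)

# The ratio lg(n!) / (n lg n) stays below 1 and slowly approaches it as n grows
for n in [10, 1000, 100000]:
    ratio = lg_factorial(n) / (n * math.log2(n))
    print(n, round(ratio, 3))
```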

## Problem 3

Consider $n!/2$ inputs when we run a comparison sort algorithm. If $l$ denotes the number of reachable leaves, and $h$ denotes the height of the binary tree, we must have that,

$$
n!/2 \leq l \leq 2^{h}.
$$

Taking logs, we must have $h \geq \lg (n! / 2) = \lg (n!) - 1 = \Theta ( n \lg n)$. Hence $h = \Omega ( n \lg n)$, so no comparison sort can run in linear time on half of the inputs.

Now let's consider $1/n$ of the inputs of length $n$. We have the inequality,

$$
(n-1)! = n!/n \leq l \leq 2^{h}.
$$

Taking logs, we have that we must have $h \geq \lg ((n-1)!) = \Theta ( (n - 1) \lg (n - 1) ) = \Theta ( n \lg n) $. So again the answer is no.

Now let's consider $1/2^{n}$ of the inputs of length $n$. We have the inequality,

$$
n!/2^{n} \leq l \leq 2^{h}.
$$

Taking logs, we have that we must have $h \geq \lg ( n!/2^{n} ) = \lg (n!) - n = \Theta (n \lg n ) - \Theta ( n ) = \Theta (n \lg n)$. So again the answer is no.
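To see how little the three restrictions change the bound, the following sketch evaluates each lower bound on $h$ numerically. The function names are hypothetical, and $\lg(n!)$ is computed with the log-gamma function.

```python
import math

def lg_factorial(n):
    """Return lg(n!) using the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)

def height_lower_bounds(n):
    """Lower bounds on the height h when only a fraction of the n! inputs is considered."""
    full = lg_factorial(n)
    return {
        "1/2": full - 1,                # lg(n!/2)
        "1/n": full - math.log2(n),     # lg(n!/n) = lg((n-1)!)
        "1/2^n": full - n,              # lg(n!/2^n)
    }

# Dividing each bound by n shows that the number of comparisons per element keeps growing
for n in [10**3, 10**6]:
    for name, bound in height_lower_bounds(n).items():
        print(n, name, round(bound / n, 2))
```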

## Problem 4

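WLOG, assume that $n = 4k$. Each of the $k$ elements initially in a position $4l$, for $1 \leq l \leq k$, is known to belong in position $4l - 1$, $4l$, or $4l + 1$, so placing it takes only $\Theta(1)$ comparisons. We can therefore discard every permutation in which one of these elements ends up elsewhere. Even if all $k$ of these elements are already in place, the remaining $3k$ elements can appear in any order, so at least $(3k)! = (3n/4)!$ and at most $3^k (3k)! = 3^{n/4} ( 3n/4 )!$ permutations remain at the reachable leaves. Since $\lg ((3n/4)!) = \Theta (n \lg n)$, the $\Omega(n \lg n)$ lower bound continues to hold.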

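Now consider the case where the input consists of $n/k$ subsequences of $k$ elements each, and every element of a subsequence is smaller than every element of the next one. It is not rigorous to simply add up the lower bounds for $n/k$ separate decision trees, so we consider a single decision tree for the whole input. Only the order within each subsequence is unknown, which leaves $(k!)^{n/k}$ possible permutations. If $l$ denotes the number of reachable leaves and $h$ denotes the height of the tree, we must have that,

$$
(k!)^{n/k} \leq l \leq 2^{h}.
$$

Taking logs, we get that we must have $h \geq \lg \big ( (k!)^{n/k} \big ) = \frac{n}{k} \lg ( k! ) = \frac{n}{k} \Theta ( k \lg k ) = \Theta ( n \lg k)$. Hence we must have that $h = \Omega (n \lg k)$.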
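As a sanity check on the $\Omega(n \lg k)$ bound, the sketch below counts by brute force the input orderings that are possible when every element is known to stay inside its own block of $k$ positions. The function name `count_block_orderings` is hypothetical, and the enumeration is only feasible for very small $n$.

```python
import math
from itertools import permutations

def count_block_orderings(n, k):
    """Count permutations of 0..n-1 in which every entry stays inside its own block of k positions."""
    count = 0
    for p in permutations(range(n)):
        if all(p[i] // k == i // k for i in range(n)):
            count += 1
    return count

n, k = 6, 3
leaves = count_block_orderings(n, k)

# The count matches (k!)^(n/k), one independent ordering per block
print(leaves, math.factorial(k) ** (n // k))

# lg of the leaf count is the lower bound on the height of the decision tree
print(round(math.log2(leaves), 2))
```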


# Section 8.2

In [4]:
def Counting_Sort(A,k):
    
    #We assume each input is an integer between 0 and k. Hence there are a total of k + 1 possibilities for
    #each entry.
    
    C = [0]*(k+1)
    B = []
    
    n = len(A)
    
    for j in range(n):
        B.append(0)   
        C[A[j]] = C[A[j]] + 1
        
    #C[i] now contains the number of elements equal to i  
    
    for i in range(1,k+1):
        
        C[i] = C[i] + C[i-1]
        
    #C[i] now contains the number of elements less than or equal to i.
    
    for j in range(n-1,-1,-1):
        
        B[C[A[j]]-1] = A[j]
        C[A[j]] = C[A[j]] - 1
        
    return B  

## Problem 1

In [5]:
A = [6,0,2,0,1,3,4,6,1,3,2]
B = Counting_Sort(A,6)
print(B)

[0, 0, 1, 1, 2, 2, 3, 3, 4, 6, 6]


## Problem 2

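Counting sort is stable because the final loop scans $A$ from right to left and decrements $C[i]$ after each placement. Hence, among elements with the same value $i$, the one that appears later in $A$ is placed later in $B$, so their relative order is preserved.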
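Stability only becomes visible when equal keys carry extra data. The sketch below is a variant of counting sort that works on `(key, label)` pairs; the function name `counting_sort_by_key` and the record layout are assumptions made for illustration.

```python
def counting_sort_by_key(records, k):
    """Stable counting sort of (key, label) pairs with integer keys in 0..k."""
    C = [0] * (k + 1)
    for key, _ in records:
        C[key] += 1
    for i in range(1, k + 1):
        C[i] += C[i - 1]
    B = [None] * len(records)
    # Scan from right to left so that equal keys keep their original relative order
    for rec in reversed(records):
        C[rec[0]] -= 1
        B[C[rec[0]]] = rec
    return B

records = [(2, "a"), (0, "b"), (2, "c"), (0, "d"), (1, "e")]
print(counting_sort_by_key(records, 2))
```

Records with the same key come out in the order in which they appeared in the input: `"b"` before `"d"`, and `"a"` before `"c"`.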

## Problem 4

We give an informal argument. Each time an element with value $i$ is placed in $B$, the value of $C[i]$ is decreased by one, and no other entry of $C$ changes. Hence, at the start of every iteration, $C[i]$ equals the number of elements less than $i$ plus the number of elements equal to $i$ that have not yet been placed. This is exactly the (1-indexed) position in $B$ where the next element with value $i$ belongs, so the loop invariant holds.
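The invariant can also be checked mechanically. The following sketch is an instrumented counting sort that asserts, before every placement, that the chosen slot is still free and is a correct slot for the element; with Python's 0-based lists this slot is `C[A[j]] - 1`. The function name `counting_sort_checked` is hypothetical, and `sorted(A)` serves only as a reference answer.

```python
def counting_sort_checked(A, k):
    """Counting sort that asserts each element lands where sorted(A) has the same value."""
    target = sorted(A)          # reference answer used only for the check
    C = [0] * (k + 1)
    for x in A:
        C[x] += 1
    for i in range(1, k + 1):
        C[i] += C[i - 1]
    B = [None] * len(A)
    for j in range(len(A) - 1, -1, -1):
        pos = C[A[j]] - 1
        # Invariant check: slot pos is still free and is a correct slot for A[j]
        assert B[pos] is None and target[pos] == A[j]
        B[pos] = A[j]
        C[A[j]] -= 1
    return B

print(counting_sort_checked([6, 0, 2, 0, 1, 3, 4, 6, 1, 3, 2], 6))
```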

## Problem 6

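Use the same preprocessing as in the COUNTING-SORT algorithm to construct the array $C$, so that $C[i]$ is the number of elements less than or equal to $i$. This takes $\Theta(n + k)$ time. When queried about how many integers fall into a range $[a,b]$, simply compute $C[b] - C[a - 1]$, taking $C[-1] = 0$. Each query takes $O(1)$ time.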
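A minimal sketch of this query structure is given below. The names `build_prefix_counts` and `count_in_range` are hypothetical, and the queries are assumed to satisfy $0 \leq a \leq b \leq k$.

```python
def build_prefix_counts(A, k):
    """Return C with C[i] = number of elements of A that are <= i, for i in 0..k."""
    C = [0] * (k + 1)
    for x in A:
        C[x] += 1
    for i in range(1, k + 1):
        C[i] += C[i - 1]
    return C

def count_in_range(C, a, b):
    """Number of elements in [a, b], assuming 0 <= a <= b <= k."""
    below_a = C[a - 1] if a > 0 else 0   # treat C[-1] as 0
    return C[b] - below_a

C = build_prefix_counts([6, 0, 2, 0, 1, 3, 4, 6, 1, 3, 2], 6)
print(count_in_range(C, 1, 3))   # counts the elements 1, 1, 2, 2, 3, 3
print(count_in_range(C, 0, 6))   # counts the whole array
```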

# Section 8.4

In [6]:
def bucket_sort(A):
    
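    n = len(A)
    
    # An empty list is already sorted
    if n == 0:
        return []
    
    # We assume the entries are non-negative. Use the maximum value and the length of the list
    # to determine the width of each bucket.
    max_value = max(A)
    
    # If every entry is 0 the list is already sorted; this also avoids a zero bucket width
    if max_value == 0:
        return list(A)
    
    size = max_value/n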

    # Create n empty buckets where n is equal to the length of the input list
    buckets = []
    
    for i in range(n):
        
        buckets.append([])

    # Put list elements into different buckets based on the size
    for i in range(n):
        
        # The maximum value maps to index n, so clamp it to the last bucket
        j = min(int(A[i]/size), n-1)
        buckets[j].append(A[i])
    
    #Sort elements within the buckets using some sorting algorithm 
    
    for z in range(n):
        
        buckets[z].sort()
            
    # Concatenate buckets with sorted elements into a single list
    output = []
    
    for x in range(n):
        
        output = output + buckets[x]
    
    return output

## Problem 1

In [7]:
A = []

for i in range(10):
    A.append(random.randint(1,5))

print(bucket_sort(A))

[1, 1, 1, 2, 2, 3, 3, 3, 4, 4]


## Problem 2

If all the keys fall in the same bucket and they happen to be in reverse order, insertion sort has to sort a single bucket with $n$ items in reversed order. This takes time $\Theta ( n^2 )$. Sorting each bucket with merge sort or heapsort instead improves the worst-case running time to $O ( n \lg n )$ while keeping the linear average-case running time.

## Problem 3

We have $\mathbf{E}(X) = 1$, $\mathbf{E}(X^2) = 1.5$, and $(\mathbf{E}(X))^2 = 1$.
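These values can be reproduced by enumerating the sample space. The sketch below assumes that $X$ is the number of heads in two flips of a fair coin, which is not stated in the text above but matches the values given.

```python
from fractions import Fraction
from itertools import product

# Assumption: X is the number of heads in two flips of a fair coin
outcomes = list(product([0, 1], repeat=2))     # 1 = heads, 0 = tails
p = Fraction(1, len(outcomes))                 # each outcome has probability 1/4

E_X = sum(p * sum(o) for o in outcomes)        # expectation of X
E_X2 = sum(p * sum(o) ** 2 for o in outcomes)  # expectation of X squared

print(E_X, E_X2, E_X ** 2)
```

The gap between the second and third values is the variance of $X$, which is why the two quantities differ.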

## Problem 5

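Simply bucket sort by radius. We can take $r_i = \sqrt{i/n}$ for $i = 0, 1, \ldots, n$ as the bucket boundaries. Each annulus between consecutive radii $r_{i-1}$ and $r_i$ has area $\pi ( r_i^2 - r_{i-1}^2 ) = \pi / n$, so a uniformly distributed point is equally likely to fall into each bucket.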
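One way this could look in code is sketched below, assuming the points are uniformly distributed in the unit disk. The helper names are hypothetical. Because the squared distance $d^2$ of such a point is uniform on $[0, 1]$, the bucket index is simply $\lfloor n d^2 \rfloor$.

```python
import random

def squared_distance(p):
    """Squared distance of the point p = (x, y) from the origin."""
    return p[0] * p[0] + p[1] * p[1]

def sort_by_distance(points):
    """Bucket sort points in the unit disk by their distance from the origin."""
    n = len(points)
    buckets = [[] for _ in range(n)]
    for p in points:
        # Bucket j holds the annulus with sqrt(j/n) <= d < sqrt((j+1)/n)
        j = min(int(n * squared_distance(p)), n - 1)
        buckets[j].append(p)
    output = []
    for bucket in buckets:
        bucket.sort(key=squared_distance)
        output.extend(bucket)
    return output

def random_point_in_disk(rng):
    """Rejection-sample a uniformly distributed point from the unit disk."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)

rng = random.Random(0)
pts = [random_point_in_disk(rng) for _ in range(10)]
print([round(squared_distance(p) ** 0.5, 2) for p in sort_by_distance(pts)])
```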

## Problem 6

Simply perform bucket sort with non-equal intervals. We can choose the length of the $i$th interval to be $p_{i+1} - p_i$ where $p_i$ is defined such that $\mathbf{P}(X \leq p_i) = i/n$.
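A possible sketch of this idea follows. Instead of computing the boundaries $p_i$ explicitly, it places $x$ in bucket $\lfloor n \, \mathbf{P}(X \leq x) \rfloor$, which selects the same intervals. The function names are hypothetical, and the exponential distribution is used purely as an example of a known distribution function.

```python
import math
import random

def quantile_bucket_sort(A, cdf):
    """Bucket sort where element x goes to bucket floor(n * cdf(x)).

    With p_i defined by cdf(p_i) = i/n, bucket i is exactly the interval [p_i, p_{i+1}).
    """
    n = len(A)
    buckets = [[] for _ in range(n)]
    for x in A:
        j = min(int(n * cdf(x)), n - 1)   # clamp in case cdf(x) evaluates to 1
        buckets[j].append(x)
    output = []
    for bucket in buckets:
        bucket.sort()
        output.extend(bucket)
    return output

# Example assumption: X is exponential with rate 1, so P(X <= x) = 1 - exp(-x)
def exp_cdf(x):
    return 1.0 - math.exp(-x)

rng = random.Random(1)
samples = [rng.expovariate(1.0) for _ in range(10)]
print(quantile_bucket_sort(samples, exp_cdf) == sorted(samples))
```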