## Q2

Q: Efficiently count words that appear in a piece of text (single or multiple).

In [55]:
def wordcount(text, target):
    w_s = 0
    w_e = 0
    c = 0
    
    for idx in range(0, len(text)):
        if text[idx] == " ":
            if text[w_s:idx] == target:
                c += 1
                
            w_e = w_s = idx + 1
        
        else:
            w_e += 1
            
    if text[w_s:] == target:
        c += 1
    
    return c

In [56]:
wordcount("foo bar foo foo", "foo")

3

In [61]:
def wordcount_multiple(text, targets):
    out = {target: 0 for target in targets}
    
    w_s = w_e = 0
    
    for idx in range(0, len(text)):
        if text[idx] == " ":
            for target in targets:
                if text[w_s:idx] == target:
                    out[target] += 1
                
            w_e = w_s = idx + 1
        
        else:
            w_e += 1
            
    for target in targets:
        if text[w_s:] == target:
            out[target] += 1
    
    return out

In [64]:
wordcount_multiple("foo bar foo foo", ["foo", "bar"])

{'foo': 3, 'bar': 1}

This algorithm is $O(n)$ for the single word count version, and $O(nk)$ for the multiple word count version (where $k$ is the number of words passed).
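The factor of $k$ in the multiple-word version comes from comparing every word against every target. A minimal sketch of one way to avoid that, assuming it is acceptable to hash every word once, is to tally all words in a single pass and then look up each target. The function name `wordcount_multiple_hashed` is hypothetical, and `text.split(" ")` is used so that words are delimited by single spaces, as in the functions above.

```python
from collections import Counter


def wordcount_multiple_hashed(text, targets):
    # Tally every space-delimited word once: O(n) expected time.
    counts = Counter(text.split(" "))
    # Each lookup is O(1) on average; a Counter returns 0 for missing keys.
    return {target: counts[target] for target in targets}


print(wordcount_multiple_hashed("foo bar foo foo", ["foo", "bar"]))
```

Under these assumptions the expected cost is $O(n + k)$ rather than $O(nk)$, at the price of extra space for the tally.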

## Q3

Q: Given the start and end points of two line segments, compute the intersection (if one exists).

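A closed-form solution is easy to derive: write each segment parametrically, solve the resulting $2 \times 2$ linear system for the two parameters, and check that both lie in $[0, 1]$. This is $O(1)$.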
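A minimal sketch of one such closed form, using 2D cross products. The function name `segment_intersection`, the tuple-based point format, and the `eps` tolerance are all assumptions made for illustration. Parallel and collinear segments are reported as having no single intersection point, which is a simplification.

```python
def segment_intersection(p1, p2, p3, p4, eps=1e-12):
    """Intersection point of segments p1-p2 and p3-p4, or None."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4

    # Direction vectors r = p2 - p1 and s = p4 - p3.
    rx, ry = x2 - x1, y2 - y1
    sx, sy = x4 - x3, y4 - y3

    # 2D cross product r x s; zero means the segments are parallel.
    denom = rx * sy - ry * sx
    if abs(denom) < eps:
        return None  # parallel or collinear: not handled in this sketch

    # Solve p1 + t*r == p3 + u*s for the parameters t and u.
    qx, qy = x3 - x1, y3 - y1
    t = (qx * sy - qy * sx) / denom
    u = (qx * ry - qy * rx) / denom

    # The lines cross inside both segments only if 0 <= t, u <= 1.
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * rx, y1 + t * ry)
    return None


print(segment_intersection((0, 0), (2, 2), (0, 2), (2, 0)))
```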

## Q4

Q: Determine if a player has won a game of tic tac toe.

In [66]:
def tictactoe(grid):
    return (any([grid[0][i] == grid[1][i] == grid[2][i] for i in range(3)]) or
            any([grid[i][0] == grid[i][1] == grid[i][2] for i in range(3)]) or
            grid[0][0] == grid[1][1] == grid[2][2] or
            grid[0][2] == grid[1][1] == grid[2][0]
           )

In [67]:
tictactoe([["X", "X", "X"], [" ", " ", " "], [" ", " ", " "]])

True

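As written, this also returns `True` for any row, column, or diagonal of three blanks. A simple modification fixes it: treat the empty-cell character as one to ignore when checking each line. $O(1)$.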
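A minimal sketch of that modification, assuming empty cells are marked with `" "` as in the example grid. The name `tictactoe_winner` and the `empty` parameter are hypothetical. It returns the winning mark rather than a boolean, which is a small design change from the function above.

```python
def tictactoe_winner(grid, empty=" "):
    # Collect all eight lines: 3 rows, 3 columns, 2 diagonals.
    lines = [list(row) for row in grid]
    lines.extend([[grid[r][c] for r in range(3)] for c in range(3)])
    lines.append([grid[i][i] for i in range(3)])
    lines.append([grid[i][2 - i] for i in range(3)])

    for a, b, c in lines:
        # Three equal cells only count if they are not empty.
        if a != empty and a == b == c:
            return a
    return None


print(tictactoe_winner([["X", "X", "X"], [" ", " ", " "], [" ", " ", " "]]))
print(tictactoe_winner([[" ", " ", " "], [" ", " ", " "], [" ", " ", " "]]))
```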

## Q5

Q: Solve for the number of trailing zeros in $n!$.

In [93]:
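def n_twos_fives(n):
    # Count how many times 2 and 5 divide n (n must be a positive integer).
    c_2s = 0
    c_5s = 0
    
    while n % 2 == 0:
        c_2s += 1
        n = n // 2  # integer division keeps n an int in Python 3
        
    while n % 5 == 0:
        c_5s += 1
        n = n // 5
    
    return c_2s, c_5s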


def factorial_zeros(n):
    i = 1
    c_2s = c_5s = 0
    while i <= n:
        new_c_2s, new_c_5s = n_twos_fives(i)
        c_2s += new_c_2s
        c_5s += new_c_5s
        i += 1
        
    return min(c_2s, c_5s)

In [96]:
factorial_zeros(10)

2

This solution uses the fact that each trailing zero corresponds to a factor of 10, that is, one factor of 2 paired with one factor of 5. The number of trailing zeros is therefore the minimum of the number of 2s and the number of 5s in the prime factorization of $n!$.

This algorithm runs in $O(n)$ time overall: a single call to `n_twos_fives` can take up to $O(\log{n})$ steps, but amortized over $i = 1, \dots, n$ the loops run only a constant number of times per call.

We can optimize further by noting that among the numbers $1, \dots, n$, factors of 5 are never more plentiful than factors of 2, so in reality we need only count 5s. And 5s have a very simple appearance rule: a number contributes a factor of 5 only if it ends in a 5 or a 0. Implementing this optimization lets us iterate five at a time instead of one at a time, so the algorithm would do roughly a fifth of the work in practice. That is a constant-factor improvement, so the big-O complexity wouldn't change.
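A minimal sketch of the fives-only optimization described above, stepping through the multiples of 5 and counting how many times 5 divides each one. The name `factorial_zeros_fives` is hypothetical.

```python
def factorial_zeros_fives(n):
    zeros = 0
    # Only multiples of 5 can contribute a factor of 5.
    for multiple in range(5, n + 1, 5):
        # Numbers such as 25 or 125 contribute more than one factor.
        while multiple % 5 == 0:
            zeros += 1
            multiple //= 5
    return zeros


print(factorial_zeros_fives(10))
```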

## Q6

Q: *Counterfactuals* &mdash; Write an algorithm to find the pairs of integers in two unsorted lists which are closest to one another.

I had to think about this one a bit.

The straightforward solution would be to sort both lists, and then crawl forward on the lists. This algorithm is $O(2n\log{n} + 2n) = O(n\log{n})$ (incidentally, I should have started by implementing this).
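A minimal sketch of the sort-and-crawl idea, assuming both lists are non-empty. The name `closest_pair_sorted_crawl` is hypothetical, and it returns the two values and their distance rather than indices.

```python
def closest_pair_sorted_crawl(l1, l2):
    a, b = sorted(l1), sorted(l2)
    i = j = 0
    best = (a[0], b[0], abs(a[0] - b[0]))

    while i < len(a) and j < len(b):
        dist = abs(a[i] - b[j])
        if dist < best[2]:
            best = (a[i], b[j], dist)

        # Advance the pointer at the smaller value: moving the larger
        # one could only widen the gap.
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            break  # distance 0 cannot be improved

    return best


print(closest_pair_sorted_crawl([1, 2], [5, 4, 3, 2]))
```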

A constant-factor optimization is to sort only the first list, and then iterate over the second list, using a modified binary search to find the element in the first list closest to each element of the second. This is about $n\log{n} + n\log{n}$ operations, which drops the linear crawl but is equivalent in O: $O(n\log{n})$.

An alternative approach is to go this optimized route, but sort the first list using a radix sort. That reduces the time to $n + n\log{n}$, a constant-factor optimization for something that is still $O(n\log{n})$, but it also costs $n$ space.

---

The following is an implementation of the second algorithm, but using a linear search instead of a binary search.

In [184]:
def sort(l):
    return sorted(l)

def findClosestIDXInSortedList(l, v):
    best_seen_idx, best_dist = 0, abs(l[0] - v)
    
    for li_idx, li in enumerate(l):
        curr_dist = abs(li - v)
        if best_dist >= curr_dist:
            best_seen_idx, best_dist = li_idx, curr_dist
        else:
            break
    
    return best_seen_idx
    
def counterfactual(l1, l2):
    l1 = sort(l1)
    
    best_result_l1_idx = best_result_l2_idx = 0
    best_result_dist = abs(l1[best_result_l1_idx] - l2[best_result_l2_idx]) 
    
    for li2_idx, li2_v in enumerate(l2):
        curr_best_idx = findClosestIDXInSortedList(l1, li2_v)
        curr_dist = abs(l1[curr_best_idx] - li2_v)
        
        if curr_dist < best_result_dist:
            best_result_l1_idx = curr_best_idx
            best_result_l2_idx = li2_idx
            best_result_dist = curr_dist
            
            if best_result_dist == 0:
                break
    
    return (best_result_l1_idx, best_result_l2_idx, best_result_dist)

In [129]:
counterfactual([1,2,3],[1,2,3])

(0, 0, 0)

In [130]:
counterfactual([1, 2], [5, 4, 3, 2])

(1, 3, 0)

Here's one using a proper binary search.

In [199]:
def sort(l):
    return sorted(l)

def findClosestIDXInSortedList(l, v):
    # Binary search for the leftmost index whose value is >= v.
    lo, hi = 0, len(l)
    while lo < hi:
        mid = (lo + hi) // 2
        if l[mid] < v:
            lo = mid + 1
        else:
            hi = mid
    
    # The closest element is l[lo] or its left neighbor l[lo - 1].
    if lo == 0:
        return 0
    if lo == len(l):
        return len(l) - 1
    return lo if l[lo] - v < v - l[lo - 1] else lo - 1
        
def counterfactual(l1, l2):
    l1 = sort(l1)
    
    best_result_l1_idx = best_result_l2_idx = 0
    best_result_dist = abs(l1[best_result_l1_idx] - l2[best_result_l2_idx]) 
    
    for li2_idx, li2_v in enumerate(l2):
        curr_best_idx = findClosestIDXInSortedList(l1, li2_v)
        curr_dist = abs(l1[curr_best_idx] - li2_v)
        
        if curr_dist < best_result_dist:
            best_result_l1_idx = curr_best_idx
            best_result_l2_idx = li2_idx
            best_result_dist = curr_dist
            
            if best_result_dist == 0:
                break
    
    return (best_result_l1_idx, best_result_l2_idx, best_result_dist)

In [200]:
counterfactual([1, 2], [5, 4, 3, 2])

(1, 3, 0)

Next up would be implementing radix sort...but we'll stop here.

## Q7

Q: Find the maximum of two numbers without using any if-else logic or comparators.

A: this is a dumb question. The answer relies on a bit-manipulation trick of some kind.
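One well-known trick of this kind uses the sign of the difference as a mask. A minimal sketch, assuming integer inputs whose difference fits in 64 bits (so that shifting right by 63 leaves only the sign). The name `branchless_max` is hypothetical.

```python
def branchless_max(a, b):
    d = a - b
    # Arithmetic right shift: -1 (all ones) if d is negative, else 0.
    # Assumes |d| < 2**63.
    mask = d >> 63
    # If a < b: d & mask == d, so a - d == b.
    # Otherwise: d & mask == 0, so the result is a.
    return a - (d & mask)


print(branchless_max(3, 7))
```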