### Compute the intersection of two sorted arrays

Write a program that takes as input two sorted arrays, and returns a new array containing elements that are present in both the input arrays.  The input arrays may have duplicate entries but the returned array should be free of duplicates.

Hint: Solve the problem if the input array lengths differ by orders of magnitude.  What if they are approximately equal?
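One way to read the first half of the hint: when one array is far shorter than the other, each distinct value of the short array can be binary-searched in the long one, for roughly O(m log n) work with m the short length and n the long length. The sketch below is only an illustration of that idea; the name `intersect_lopsided` and its parameters are hypothetical.

```python
import bisect

def intersect_lopsided(short, long):
    """Intersect two sorted lists, assuming `short` is much shorter than `long`."""
    result = []
    for value in short:
        # The inputs are sorted, so a repeated match is always adjacent:
        # skip it to keep the output free of duplicates.
        if result and result[-1] == value:
            continue
        # Binary search for the leftmost position where `value` could sit.
        i = bisect.bisect_left(long, value)
        if i < len(long) and long[i] == value:
            result.append(value)
    return result

print(intersect_lopsided([2, 2, 3, 3, 3, 4, 5, 123456],
                         [1, 2, 3, 3, 4, 6, 7, 9, 45, 243, 12234]))
```

When the two lengths are close, the log factor no longer pays off and a single linear pass over both arrays is the usual choice.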

In [93]:
import bisect
from collections import Counter

p = [1, 2, 3, 3, 4, 6, 7, 9, 45, 243, 12234]
p.extend(range(1000000, 1000100))  # the 100 large values shown in the output below
q = [2, 2, 3, 3, 3, 4, 5, 123456]

# a = [1]
# b = [2]

def show_lists(a, b):
    print("A: {}".format(a))
    print("B: {}".format(b))

def get_intersection(a, b):
    
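    def trim(one, two):
        # Keep only the slice of `two` that lies within the value range of `one`.
        if not one:
            # An empty reference list has no range, so nothing can overlap it.
            return []
        one_min, one_max = one[0], one[-1]
        edge_1 = bisect.bisect_left(two, one_min)
        edge_2 = bisect.bisect_right(two, one_max)
        return two[edge_1:edge_2]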
    
    # edge cases: an empty input or non-overlapping ranges have no intersection
    if not b or not a:
        return []
    if (a[-1] < b[0]) or (b[-1] < a[0]):
        return []
    # trim the arrays
    b = trim(a,b)
    a = trim(b,a)
    print("After first trim")
    show_lists(a,b)
    b = trim(a,b)
    a = trim(b,a)
    print("After second trim")
    show_lists(a,b)

    # Counter(iterable) tallies every item; only the unique keys are used below
    counter_a = Counter(a)
    counter_b = Counter(b)

    # iterate over whichever counter has fewer unique keys
    if len(counter_b) > len(counter_a):
        larger, smaller = counter_b, counter_a
    else:
        larger, smaller = counter_a, counter_b

    result = [x for x in smaller if x in larger]
    return result

# Tests
print("Original lists")
show_lists(p, q)
print("intersection: {}".format(get_intersection(p,q)))
print("intersection: {}".format(get_intersection(q,p)))


Original lists
A: [1, 2, 3, 3, 4, 6, 7, 9, 45, 243, 12234, 1000000, 1000001, 1000002, 1000003, 1000004, 1000005, 1000006, 1000007, 1000008, 1000009, 1000010, 1000011, 1000012, 1000013, 1000014, 1000015, 1000016, 1000017, 1000018, 1000019, 1000020, 1000021, 1000022, 1000023, 1000024, 1000025, 1000026, 1000027, 1000028, 1000029, 1000030, 1000031, 1000032, 1000033, 1000034, 1000035, 1000036, 1000037, 1000038, 1000039, 1000040, 1000041, 1000042, 1000043, 1000044, 1000045, 1000046, 1000047, 1000048, 1000049, 1000050, 1000051, 1000052, 1000053, 1000054, 1000055, 1000056, 1000057, 1000058, 1000059, 1000060, 1000061, 1000062, 1000063, 1000064, 1000065, 1000066, 1000067, 1000068, 1000069, 1000070, 1000071, 1000072, 1000073, 1000074, 1000075, 1000076, 1000077, 1000078, 1000079, 1000080, 1000081, 1000082, 1000083, 1000084, 1000085, 1000086, 1000087, 1000088, 1000089, 1000090, 1000091, 1000092, 1000093, 1000094, 1000095, 1000096, 1000097, 1000098, 1000099]
B: [2, 2, 3, 3, 3, 4, 5, 123456]
After fi

### Remarks

This took about 35 minutes to accomplish. The most important part of developing this was learning more about the bisect library, which I used to binary-search both sorted arrays and exclude as much as possible from the remaining calculation. I also had to test some edge cases, such as lists with no intersection at all and empty lists. After handling the edge cases and trimming both arrays down to the range of values they share, the remaining work is much smaller whenever the arrays overlap in only part of their ranges. From the trimmed sub-arrays, we run a counter over both to get the unique keys. The counting loops run in O(a + b) time, plus O(e) for the final membership checks, where e is the number of unique elements in the counter with fewer keys. This could be improved by ensuring that a is always the smaller of the two lists.
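For comparison, the same O(a + b) bound can be reached without counters by walking both sorted arrays with two indices. This is a minimal sketch of that standard merge-style technique, not the approach used in the cell above; the name `intersect_sorted` is hypothetical.

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists in one linear pass, without duplicates."""
    i, j = 0, 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1  # a[i] is too small to appear in the rest of b
        elif a[i] > b[j]:
            j += 1  # b[j] is too small to appear in the rest of a
        else:
            # Common value: record it once, then advance both indices.
            if not result or result[-1] != a[i]:
                result.append(a[i])
            i += 1
            j += 1
    return result

print(intersect_sorted([1, 2, 3, 3, 4, 6, 7, 9, 45, 243, 12234],
                       [2, 2, 3, 3, 3, 4, 5, 123456]))
```

Because the inputs are sorted, the output comes out sorted too, and the only extra memory used is the result list itself.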