# Find the Missing Element
## Problem
Consider an array of non-negative integers. A second array is formed by shuffling the elements of the first array and deleting a random element. Given these two arrays, find which element is missing in the second array.

Here is an example input: the first array is shuffled and the number 5 is removed to construct the second array.

### Input:

finder([1,2,3,4,5,6,7],[3,7,2,1,4,6])

### Output:

5 is the missing number

In [6]:
def construct_ht(arr, ht):
    for element in arr:
        if element in ht:
            ht[element] += 1
        else:
            ht[element] = 1

def finder(arr1, arr2):
    #initialize hts
    ht1 = {}
    ht2 = {}
    
    #construct hts
    construct_ht(arr1, ht1)
    construct_ht(arr2, ht2)
    
    for key, value in ht1.items():
        if key in ht2:
            if value != ht2[key]:
                return key
        else:
            return key
    
    return "Unexpected Error"
    

In [7]:
"""
RUN THIS CELL TO TEST YOUR SOLUTION
"""
from nose.tools import assert_equal

class TestFinder(object):
    
    def test(self,sol):
        assert_equal(sol([5,5,7,7],[5,7,7]),5)
        assert_equal(sol([1,2,3,4,5,6,7],[3,7,2,1,4,6]),5)
        assert_equal(sol([9,8,7,6,5,4,3,2,1],[9,8,7,5,4,3,2,1]),6)
        print('ALL TEST CASES PASSED')

# Run test
t = TestFinder()
t.test(finder)

ALL TEST CASES PASSED


# Instructor's solution
The naive solution is to go through every element in the first array and check whether it appears in the second array. Note that there may be duplicate elements in the arrays, so we should pay special attention to that. The complexity of this is O(n^2) since we'd use two nested for-loops.
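
As a rough sketch of that nested-loop idea (the function name `finder_naive` and the `used` bookkeeping list are assumptions for illustration, not part of the original notebook), each slot of the second array is matched at most once, which is one way to handle duplicates:

```python
def finder_naive(arr1, arr2):
    """Hypothetical O(n^2) nested-loop search that tolerates duplicates."""
    # used[j] records that arr2[j] has already been paired with an element
    # of arr1, so a duplicate value cannot claim the same slot twice.
    used = [False] * len(arr2)
    for num1 in arr1:
        for j, num2 in enumerate(arr2):
            if not used[j] and num1 == num2:
                used[j] = True
                break
        else:
            # The inner loop finished without a match: no unmatched copy
            # of num1 is left in arr2, so num1 is the missing element.
            return num1
    # Nothing is missing (not expected for valid input).
    return None
```

Calling `finder_naive([5,5,7,7],[5,7,7])` should return 5, because the second 5 finds no unmatched partner.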

A more efficient solution is to SORT the second array, so that checking whether an element of the first array appears in it becomes a binary search, which takes O(log n) time.
Sorting costs O(n*logn) and the n lookups cost O(n*logn) in total, so the complexity of this algorithm is ultimately O(n*logn) time.
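
A minimal sketch of the binary-search approach, using the standard `bisect` module. The names `finder_bsearch` and `count_in_sorted` are hypothetical. To stay correct when values repeat, this version compares occurrence counts rather than plain membership, and it sorts copies of both arrays to do so; that is an assumption added here, not something the notes above spell out.

```python
from bisect import bisect_left, bisect_right

def count_in_sorted(sorted_arr, value):
    # Two binary searches give the range of positions holding `value`,
    # so the difference is the number of occurrences. O(log n).
    return bisect_right(sorted_arr, value) - bisect_left(sorted_arr, value)

def finder_bsearch(arr1, arr2):
    """Hypothetical O(n log n) solution based on sorting and binary search."""
    s1 = sorted(arr1)  # sorted copies, the inputs are left untouched
    s2 = sorted(arr2)
    for num in s1:
        # The missing element is the one that occurs more often in arr1.
        if count_in_sorted(s1, num) != count_in_sorted(s2, num):
            return num
    # Nothing is missing (not expected for valid input).
    return None
```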

If we don't want to deal with the special case of duplicate numbers, we can sort BOTH arrays and iterate over them simultaneously.

ONCE the two iterators have different values we can STOP and return the value of the first iterator.

In [9]:
def finder2(arr1,arr2):
    
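    #O(n*logn) because of sorting
    #sorted() returns new lists, so the caller's arrays are not reordered
    arr1 = sorted(arr1)
    arr2 = sorted(arr2)
    
    #zip stops at the end of the shorter array (arr2)
    for num1, num2 in zip(arr1,arr2):
        if num1 != num2:
            return num1
    
    #otherwise, the missing element is the last element of the sorted arr1
    return arr1[-1]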

In [10]:
# Instructor's Hash Table Solution
import collections

In [14]:
def finder3(arr1,arr2):
    
    #defaultdict(int) behaves like a normal python dictionary,
    #except that looking up a missing key adds it with the value 0
    #instead of raising a KeyError
    d = collections.defaultdict(int)
    
    #count each time the element shows up in arr2
    for num in arr2:
        d[num] += 1
    
    #more elegant than my solution:
    #a single table, incremented for arr2 and decremented for arr1
    
    #go through the first array
    #for each number in the first arr,
    #check if the number's remaining count is zero,
    #if so, arr2 has no unmatched copy left and that is the missing element
    #else it does show up in arr2, so we decrement that number's count
    for num in arr1:
        if d[num] == 0:
            return num
        else:
            d[num] -= 1
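
The `defaultdict(int)` behaviour that `finder3` relies on can be seen in isolation. This is only a small illustration with made-up keys (`"missing"` and `"seen"` are placeholders, not values from the problem):

```python
import collections

d = collections.defaultdict(int)

# Reading a key that was never set does not raise KeyError;
# int() supplies the default value 0 and the key is stored.
print(d["missing"])

# Incrementing an unseen key works for the same reason.
d["seen"] += 1

print(dict(d))
```

Because a lookup stores the key, the check `d[num] == 0` in `finder3` also adds `num` to the table when it is absent, which is harmless here since the function returns immediately.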