# Find the Missing Element

## Problem

Consider an array of non-negative integers. A second array is formed by shuffling the elements of the first array and deleting a random element. Given these two arrays, find which element is missing in the second array. 

Here is an example input: the first array is shuffled and the number 5 is removed to construct the second array.

Input:
    
    finder([1,2,3,4,5,6,7],[3,7,2,1,4,6])

Output:

    5 is the missing number
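
To experiment with inputs of this shape, the construction described above (shuffle, then delete one random element) can be sketched as follows. The helper name `make_inputs` and its `seed` parameter are assumptions made for illustration; they are not part of the original exercise.

```python
import random

def make_inputs(arr1, seed=None):
    # Hypothetical helper: build the second array from the first one.
    rng = random.Random(seed)
    arr2 = list(arr1)          # copy, so arr1 itself is left untouched
    rng.shuffle(arr2)          # shuffle the copy
    removed = arr2.pop(rng.randrange(len(arr2)))  # delete one random element
    return arr2, removed

arr1 = [1, 2, 3, 4, 5, 6, 7]
arr2, removed = make_inputs(arr1, seed=0)
# arr2 is one element shorter, and putting the removed value back recovers arr1
print(len(arr2), sorted(arr2 + [removed]) == arr1)  # 6 True
```

Returning the removed value alongside `arr2` makes it easy to check any `finder` implementation against a known answer.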

## Solution

Fill out your solution below:

## Try 1.  Sorting takes O(Nlog(N)), and the single loop takes O(N).  Thus, the total is O(Nlog(N)).

In [11]:
def finder(arr1,arr2):
    
    # Make sure there are elements to check
    if len(arr2) < len(arr1): 
        
        # if arr2 is empty, then 0th of arr1 is the answer
        if len(arr2) == 0:
            return arr1[0]
        
        # otherwise let's sort and step through both arrays until we find a mismatch
        # Sorting costs O(Nlog(N)); sorted() leaves the caller's lists untouched
        sorted1 = sorted(arr1)
        sorted2 = sorted(arr2)
        for a, b in zip(sorted1, sorted2): # loop costs O(N)
            if a != b:
                return a
        
        # no mismatch in the overlap, so the missing element is the last (largest) one
        return sorted1[-1]
        

In [12]:
arr1 = [1,2,3,4,5,6,7]
arr2 = [3,7,2,1,4,6]
finder(arr1,arr2)

5

In [13]:
arr1 = [5,5,7,7]
arr2 = [5,7,7]

finder(arr1,arr2)

5

_____

# Test Your Solution

In [14]:
"""
RUN THIS CELL TO TEST YOUR SOLUTION
"""
from nose.tools import assert_equal

class TestFinder(object):
    
    def test(self,sol):
        assert_equal(sol([5,5,7,7],[5,7,7]),5)
        assert_equal(sol([1,2,3,4,5,6,7],[3,7,2,1,4,6]),5)
        assert_equal(sol([9,8,7,6,5,4,3,2,1],[9,8,7,5,4,3,2,1]),6)
        print('ALL TEST CASES PASSED')

# Run test
t = TestFinder()
t.test(finder)

ALL TEST CASES PASSED


## Good Job!

# Other ways to do it

First, you could simply take the sum of each array; the difference between the two sums is the missing element.  BUT, in a language with fixed-width integers, very long arrays or very large values could overflow the sum (Python integers are arbitrary precision, so this does not happen there).  If the values were floating-point numbers rather than integers, rounding could cause a precision issue.  Thus, it is not a very robust approach in general.
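
For reference, a minimal sketch of the sum-based idea is shown here. The name `finder_sum` is an assumption chosen for illustration, and the sketch assumes integer inputs as in the problem statement.

```python
def finder_sum(arr1, arr2):
    # Hypothetical helper, not one of the original solutions.
    # The missing element is the difference between the two totals.
    return sum(arr1) - sum(arr2)

print(finder_sum([1, 2, 3, 4, 5, 6, 7], [3, 7, 2, 1, 4, 6]))  # 5
print(finder_sum([5, 5, 7, 7], [5, 7, 7]))                    # 5
```

With Python integers this is exact; the robustness concern applies mainly to fixed-width integer types or floating-point values.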

## Try 2.  Two passes with a dictionary of counts, so O(N).

In [20]:
def finder2(arr1, arr2): 
    
    # Make an empty dict
    d={ }
    
    # Add a count for every instance in Array 2
    for num in arr2:
        if num in d:
            d[num]+=1 
        else:
            d[num] = 1
    
    # Walk Array 1; a number that is absent or has no count left is the missing one
    for num in arr1: 
        if num in d:
            if d[num]==0: 
                return num 
            # Otherwise, subtract a count
            else: 
                d[num]-=1 
        else:
            return num        

In [22]:
t.test(finder2)

ALL TEST CASES PASSED


## Try 2, but with collections.defaultdict.  Also, O(N).

In [21]:
import collections

def finder2_2(arr1, arr2): 
    """ Same as above, but using the collections defaultdict tool to clean things up a bit.
        Just removes the need for 'if num in d:' conditionals.
    """
    # Using default dict to avoid key errors
    d=collections.defaultdict(int) 
    
    # Add a count for every instance in Array 2
    for num in arr2:
        d[num]+=1 
    
    # Walk Array 1; a zero count means this number is the missing one
    for num in arr1: 
        if d[num]==0: 
            return num 
        
        # Otherwise, subtract a count
        else:
            d[num]-=1 

In [23]:
t.test(finder2_2)

ALL TEST CASES PASSED
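
The same counting idea can also be sketched with `collections.Counter`, whose subtraction keeps only elements with a positive remaining count. The name `finder_counter` is an assumption for illustration; this is a variation on the approach above, not one of the original solutions.

```python
from collections import Counter

def finder_counter(arr1, arr2):
    # Hypothetical helper: count both arrays, then subtract.
    # Counter subtraction drops zero and negative counts, so only the
    # missing element (with count 1) remains.
    remaining = Counter(arr1) - Counter(arr2)
    return next(iter(remaining.elements()))

print(finder_counter([5, 5, 7, 7], [5, 7, 7]))  # 5
```

This is still O(N) time and O(N) extra space, like the dictionary versions.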


## Try 3, using XOR.  'elegant' option.  Single loop, so O(N).

In [34]:
def finder3(arr1, arr2): 
    result=0 
    # XOR every number from both arrays into result; a value that appears in
    # both arrays cancels itself (x ^ x == 0), leaving only the missing element
    for num in arr1+arr2: 
        result^=num 
        
    return result 

In [35]:
t.test(finder3)

ALL TEST CASES PASSED
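
The XOR approach works because `x ^ x == 0`, `x ^ 0 == x`, and XOR is commutative and associative, so every value present in both arrays cancels no matter the order. The sketch below traces those properties and shows a variant that avoids building the concatenated list `arr1+arr2`; the name `finder_xor` is an assumption for illustration.

```python
from functools import reduce
from operator import xor

def finder_xor(arr1, arr2):
    # Hypothetical helper: XOR-fold each array separately, then combine.
    return reduce(xor, arr1, 0) ^ reduce(xor, arr2, 0)

# Pairs cancel, and order does not matter
print(3 ^ 3)              # 0
print(5 ^ 0)              # 5
print(1 ^ 5 ^ 7 ^ 1 ^ 7)  # 5

print(finder_xor([5, 5, 7, 7], [5, 7, 7]))  # 5
```

Unlike the dictionary versions, this needs only O(1) extra space, but it relies on the inputs being integers.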
