### Problem Statement

Given a list of integers in random order, write a program to find the longest possible subsequence of consecutive numbers in the list. Return this subsequence in sorted order. The solution must run in O(n) time.

For example, given the list `5, 4, 7, 10, 1, 3, 55, 2`, the output should be `1, 2, 3, 4, 5`.

*Note: If two subsequences are of equal length, return the one whose smallest element appears first in the input list.*



In [85]:
def longest_consecutive_subsequence(nums):
    
    """
    Space complexity: O(n)
    Time complexity : O(n)

    The inner while loop only starts at the lower bound of a streak, so across
    the whole function call it visits each element of num_set at most once.
    Hence the overall time complexity is not O(n * n) but O(n + n), which is O(n).
    """
    
    longest_streak = []
    num_set = {}
    
    # Map each element to its index. The dict is used like a set; the indices matter only for the tie-breaker.
    for ix, e in enumerate(nums):
        num_set.setdefault(e, ix)  # keep the first index if a value repeats

    for num in num_set:
        if num - 1 not in num_set:  # a streak starts ONLY when its lower bound is found. Why? It makes the subsequent search trivial: just count upward.
            curr_num = num
            curr_streak = [curr_num]

            while curr_num + 1 in num_set:  # across all streaks, this loop visits each element of num_set at most once
                curr_num += 1
                curr_streak.append(curr_num)  # appended in ascending order, so the streak is already sorted

            # keep track of the longest streak
            if len(longest_streak) == len(curr_streak):
                if num_set[longest_streak[0]] > num_set[curr_streak[0]]:  # Tie-breaker: the only place the stored indices are looked up.
                    longest_streak = curr_streak
            elif len(longest_streak) < len(curr_streak):
                longest_streak = curr_streak

    return longest_streak
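
To make the complexity argument in the docstring concrete, here is a minimal sketch that counts how often the inner while loop advances. The helper name `count_inner_steps` is hypothetical and not part of the solution; it uses a plain set because no tie-breaker is needed for counting.

```python
def count_inner_steps(nums):
    """Hypothetical helper: count how often the inner while loop advances."""
    num_set = set(nums)
    steps = 0
    for num in num_set:
        if num - 1 not in num_set:  # only lower bounds start a walk
            curr_num = num
            while curr_num + 1 in num_set:
                curr_num += 1
                steps += 1
    # steps can never exceed the number of distinct elements
    return steps, len(num_set)


steps, distinct = count_inner_steps([5, 4, 7, 10, 1, 3, 55, 2])
print(steps, distinct)  # prints: 4 8
```

Each streak of length k contributes k - 1 steps, so the total is the number of distinct elements minus the number of streaks, which stays linear in the input size.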

My weaker points:

1. I programmed my way into a mess by looking both ways: starting with 5, I had to check for both 4 and 6; on finding 4, I had to check for both 6 and 3, and so on. It did not occur to me to skip the numbers in the middle of a streak and first search for one end of it.

2. Granted, it is not simple to stumble upon this idea, but one way would be to "wish" for an easier version of the problem (for example, the longest sequence is ordered but separated in the array) and solve that first.

3. Or realize that when problems with the logic keep piling up, there is almost certainly a simpler way to do things within the given time frame. The goal of such problems is not to see whether you can write 200 lines of simple code, but whether you can write an elegant 20 lines.

4. People have used a DFS for this, and I need to learn how that solution works.
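
The two-way search from point 1 does not have to end in a mess. As a minimal sketch, assuming each value is removed from a set as soon as it has been visited, the search can expand in both directions and still stay linear. The names `longest_consecutive_two_way` and `first_index` are hypothetical. The visited-marking is similar in spirit to a graph traversal, but this is not necessarily how the DFS solutions from point 4 work.

```python
def longest_consecutive_two_way(nums):
    """Hypothetical sketch: expand both ways from each unvisited number."""
    first_index = {}
    for ix, e in enumerate(nums):
        first_index.setdefault(e, ix)  # first occurrence, for the tie-breaker

    unvisited = set(first_index)
    best = []
    for num in nums:
        if num not in unvisited:
            continue  # already absorbed into an earlier streak
        unvisited.remove(num)

        low = num
        while low - 1 in unvisited:  # expand downward
            low -= 1
            unvisited.remove(low)

        high = num
        while high + 1 in unvisited:  # expand upward
            high += 1
            unvisited.remove(high)

        streak = list(range(low, high + 1))
        # longer streak wins; on a tie, the earlier smallest element wins
        if len(streak) > len(best) or (
            len(streak) == len(best)
            and first_index[low] < first_index[best[0]]
        ):
            best = streak
    return best


print(longest_consecutive_two_way([6, 7, 8, 9, 1, 2, 3, 4]))  # prints: [6, 7, 8, 9]
```

Every value is removed from `unvisited` exactly once, so the two while loops together do at most n steps over the whole call.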

In [86]:
def test_function(test_case):
    output = longest_consecutive_subsequence(test_case[0])
    print(output)
    if output == test_case[1]:
        print("Pass")
    else:
        print("Fail")

In [87]:
test_case_1 = [[5, 4, 7, 10, 1, 3, 55, 2], [1, 2, 3, 4, 5]]
test_function(test_case_1)

[1, 2, 3, 4, 5]
Pass


In [88]:
test_case_2 = [[2, 12, 9, 16, 10, 5, 3, 20, 25, 11, 1, 8, 6], [8, 9, 10, 11, 12]]
test_function(test_case_2)

[8, 9, 10, 11, 12]
Pass


In [89]:
test_case_3 = [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
test_function(test_case_3)

[0, 1, 2, 3, 4]
Pass


In [90]:
test_case_4 = [[6, 7, 8, 9, 1, 2, 3, 4], [6, 7, 8, 9]]
test_function(test_case_4)

[6, 7, 8, 9]
Pass

