# Longest Collatz sequence
## Problem 14
------
The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)  
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1  

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.
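
As a quick check of the example above, the rule can be applied mechanically. This is a minimal sketch; the helper name `collatz_sequence` is illustrative and is not one of the functions used later in this notebook.

```python
def collatz_sequence(start):
    """Return the full chain from `start` down to 1 as a list."""
    sequence = [start]
    while sequence[-1] != 1:
        current = sequence[-1]
        # Halve even terms; map odd terms to 3n + 1.
        sequence.append(current // 2 if current % 2 == 0 else 3 * current + 1)
    return sequence


chain = collatz_sequence(13)
print(" → ".join(str(term) for term in chain))
print(len(chain), "terms")
```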

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

---
Correct Result: **837799**

### Discussion of Methods Used:

Four slightly different approaches are shown below. The first simply tracks the current value and counts the terms visited until the value reaches 1. The second is the same except that it checks a cache dictionary, with the aim of looking up previously computed values instead of recomputing them. The third approach builds the entire sequence until the last entry is 1, and then computes its length. Finally, the fourth approach combines building the whole sequence with a cache.

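For the values in question, the simplest approach (counting the terms for each starting number individually) is in fact the fastest of those considered: in the timings below, the other three take roughly 25%, 41%, and 71% longer, respectively. This should not be read as evidence that caching is inherently slower. As written, neither cached variant ever stores a chain length under the number it belongs to (the counting variant never writes to the cache, and the sequence variant keys its entries by list index rather than by value), so no duplicated work is avoided and the extra time is purely the cost of the dict lookups. Building the entire sequence as a list and appending to it likewise adds time rather than saving any.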
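
For comparison, a cached variant that records the chain length of every number it visits might look like the sketch below. The names `collatz_length_memo` and `longest_chain_memo` are illustrative and not part of the notebook's timed code, and this version has not been timed here, so no speed claim is made for it.

```python
def collatz_length_memo(start, cache):
    """Return the number of terms in the chain from `start`, filling `cache` as it goes.

    `cache` maps a number to the length of its chain and must contain {1: 1}.
    """
    path = []
    num = start
    # Walk forward until reaching a number whose chain length is already known.
    while num not in cache:
        path.append(num)
        num = num // 2 if num % 2 == 0 else 3 * num + 1
    length = cache[num]
    # Walk back, recording the length for every number visited on the way.
    for value in reversed(path):
        length += 1
        cache[value] = length
    return length


def longest_chain_memo(limit):
    """Return (max_len, top_num) over starting numbers below `limit`."""
    cache = {1: 1}
    max_len, top_num = 1, 1
    for n in range(1, limit):
        length = collatz_length_memo(n, cache)
        if length > max_len:
            max_len, top_num = length, n
    return max_len, top_num
```

The return order `(max_len, top_num)` mirrors `run_sequence`, so the two can be compared directly. The trade-off is memory: the cache holds an entry for every number visited, including terms above the limit.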

In [1]:
def collatz_counting_no_cache(num):
    chain_length = 1
    while num != 1:
        if num % 2 == 0:
            num = num // 2
        else:
            num = 3 * num + 1
        chain_length += 1
    return chain_length

def collatz_counting_with_cache(num, cache):
    chain_length = 1
    while num != 1:
        if num % 2 == 0:
            num = num // 2
        else:
            num = 3 * num + 1
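        # NOTE: nothing in this notebook ever writes to `cache`, so this lookup
        # never succeeds and the function behaves like the no-cache version.
        if num in cache:
            chain_length += cache[num]
            break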
        else:
            chain_length += 1
    return chain_length

def collatz_track_sequence(num):
    sequence = [num]
    while num != 1:
        if num % 2 == 0:
            num = num // 2
        else:
            num = 3 * num + 1
        sequence.append(num)
    return len(sequence)

def collatz_track_sequence_with_cache(num, cache):
    sequence = [num]
    while num != 1:
        if num % 2 == 0:
            num = num // 2
        else:
            num = 3 * num + 1
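        if num in cache:
            # The cached list is appended as ONE element, so it counts as a
            # single term. The length is still right because the only key
            # ever hit here is 1 (see the note below).
            sequence.append(cache[num])
            return len(sequence)
        else:
            sequence.append(num)
    # NOTE: entries are stored under the list index i, not the value
    # sequence[i]. This loop is reached only for n = 1 and n = 2, so the cache
    # only ever holds the keys 0 and 1 and no chain is actually shortened.
    for i in range(0, len(sequence)):
        if sequence[i] not in cache:
            cache[i] = sequence[i:]
        else:
            break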
    return len(sequence)

def run_sequence(func, limit, uses_cache=False):
    max_len = 1
    top_num = 1
    cache = {}
    for n in range(1, limit):
        length = func(n, cache) if uses_cache else func(n)
        if length > max_len:
            max_len = length
            top_num = n
    return max_len, top_num

In [2]:
# Running and timing the alternative approaches:
from utils import computation_timer

limit = 1_000_000
results = computation_timer({'name':'Collatz - Just Counting, No Cache',
                             'func': lambda: run_sequence(collatz_counting_no_cache, limit)},
                            {'name':'Collatz - Just Counting, With Cache',
                             'func': lambda: run_sequence(collatz_counting_with_cache, limit, True)},
                            {'name':'Collatz - Tracking Sequence',
                             'func': lambda: run_sequence(collatz_track_sequence, limit)},
                            {'name':'Collatz - Tracking Sequence, With Cache',
                             'func': lambda: run_sequence(collatz_track_sequence_with_cache, limit, True)})
print("Timed Results:")
for result in results:
    print("\t%s:" % result['name'])
    print("\t\tResult: Maximum # of Steps is %d, generated by the starting value %d." % (result['result'][0], result['result'][1]))
    print("\t\t- Obtained in %f seconds" % result['running_time'])

Timed Results:
	Collatz - Just Counting, No Cache:
		Result: Maximum # of Steps is 525, generated by the starting value 837799.
		- Obtained in 16.483948 seconds
	Collatz - Just Counting, With Cache:
		Result: Maximum # of Steps is 525, generated by the starting value 837799.
		- Obtained in 20.552460 seconds
	Collatz - Tracking Sequence:
		Result: Maximum # of Steps is 525, generated by the starting value 837799.
		- Obtained in 23.277078 seconds
	Collatz - Tracking Sequence, With Cache:
		Result: Maximum # of Steps is 525, generated by the starting value 837799.
		- Obtained in 28.116155 seconds
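
The relative overhead of each approach can be worked out directly from the running times reported above. The sketch below only redoes that arithmetic; the dictionary `timings` is a hand-copied transcription of the output, not something produced by `computation_timer`.

```python
# Running times in seconds, copied from the timed results above.
timings = {
    "Just Counting, No Cache": 16.483948,
    "Just Counting, With Cache": 20.552460,
    "Tracking Sequence": 23.277078,
    "Tracking Sequence, With Cache": 28.116155,
}

baseline = timings["Just Counting, No Cache"]
# Extra time relative to the fastest approach, as a percentage.
overheads = {name: 100 * (seconds / baseline - 1) for name, seconds in timings.items()}

for name, percent in overheads.items():
    print(f"{name}: {percent:+.1f}% relative to the no-cache counter")
```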
