# Longest Collatz sequence

[Problem 14](https://projecteuler.net/problem=14)
> The following iterative sequence is defined for the set of positive integers:

> \begin{align}
n &→ \frac n2 &\text{(n is even)} \\
n &→ 3n + 1 &\text{(n is odd)}
\end{align}

> Using the rule above and starting with 13, we generate the following sequence:

> $$
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
$$

> It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

> Which starting number, under one million, produces the longest chain?

> NOTE: Once the chain starts the terms are allowed to go above one million.

The truly naive approach takes too long, but this problem is a great candidate for memoization.
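
For reference, a minimal sketch of the unmemoized baseline might look like this. The names `chain_length` and `longest_chain_start` are illustrative and do not appear in the cells that follow; the sketch recomputes every chain from scratch, so it is only intended for small limits.

```python
def chain_length(n):
    """Count the terms in the Collatz chain from n down to 1."""
    terms = 1
    while n != 1:
        # Odd: 3n + 1, even: n / 2.
        n = 3 * n + 1 if n & 1 else n >> 1
        terms += 1
    return terms


def longest_chain_start(limit):
    """Return the start below `limit` whose chain has the most terms."""
    # No caching: every chain is walked in full.
    return max(range(1, limit), key=chain_length)


print(chain_length(13))         # the 10-term chain from the problem statement
print(longest_chain_start(10))
```

With a limit of 10 this picks 9, whose chain has 20 terms.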

If a Collatz step lands on a number whose chain length we have already calculated, we can stop there: the length for the starting number is the number of steps taken to reach that point plus the stored length for the number we have already seen.
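
As a worked example of that arithmetic, the sketch below walks a chain until it hits a cached value and reports how many steps that took. The helper `length_with_cache` is hypothetical and iterative, unlike the recursive cell that follows; it is only meant to make the "steps plus stored length" rule visible.

```python
def length_with_cache(n, cache):
    """Return (chain length of n, steps taken before hitting the cache)."""
    path = []
    k = n
    while k not in cache:
        path.append(k)
        k = 3 * k + 1 if k & 1 else k >> 1
    steps = len(path)
    known = cache[k]
    # Walk back from the cache hit, storing a length for every value on the path.
    for offset, value in enumerate(reversed(path), start=1):
        cache[value] = known + offset
    return cache[n], steps


cache = {1: 1}
print(length_with_cache(10, cache))  # walks 10, 5, 16, 8, 4, 2 before reaching 1
print(length_with_cache(13, cache))  # stops as soon as it reaches the cached 10
```

Once 10 is cached with a length of 7, starting from 13 needs only three operations (13 → 40 → 20 → 10), giving 3 + 7 = 10 terms.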

In [1]:
LIMIT = 10**6

In [2]:
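def euler14_naive_memo():
    from operator import itemgetter

    def collatz_length(n, lengths):
        # Recurse only on a cache miss; every value visited ends up in `lengths`.
        if n not in lengths:
            if n & 1:
                lengths[n] = 1 + collatz_length(3*n+1, lengths)
            else:
                lengths[n] = 1 + collatz_length(n>>1, lengths)
        return lengths[n]

    def collatz_lengths(n):
        # 0 is seeded only so that the loop over range(n) can start at 0.
        lengths = {0:1, 1:1}
        for i in range(n):
            collatz_length(i, lengths)
        return lengths

    # Compare (start, length) pairs by length. enumerate() over a dict would
    # pair positions with keys instead of keys with lengths.
    return max(collatz_lengths(LIMIT).items(), key=itemgetter(1))[0]

%timeit euler14_naive_memo()
print(euler14_naive_memo())

1 loops, best of 3: 2.43 s per loop
837799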


Collatz sequences rise and fall, sometimes leaving the range of values we are interested in.
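
To get a feel for how far a chain can stray, here is a small sketch that tracks the largest value a chain visits. `chain_peak` is an illustrative helper, not part of the solution cells.

```python
def chain_peak(n):
    """Return the largest value visited by the Collatz chain starting at n."""
    peak = n
    while n != 1:
        n = 3 * n + 1 if n & 1 else n >> 1
        peak = max(peak, n)
    return peak


print(chain_peak(13))
print(chain_peak(27))
```

Starting at 13 the chain peaks at 40, while the chain from 27 climbs to 9232 before it comes back down to 1.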

Memoizing every value visited, including those above the limit, lets a huge cache accumulate. To reduce memory use and the number of dictionary updates, we can memoize only the values within the domain of interest, without much performance difference.

In [3]:
def euler14_memo_lower():    
    from operator import itemgetter  
    
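    def collatz_lengths(n):
        # lengths[i] holds the chain length for start i; only starts below n are stored.
        lengths = [None] * n
        lengths[0] = lengths[1] = 1
        for i in range(2, n):
            k = i
            count = 0
            # Step until the chain drops below i, where the length is already known.
            while k >= i:
                count += 1
                if k & 1:
                    k = 3*k+1
                else:
                    k >>= 1
            lengths[i] = count + lengths[k]
        return lengths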

    return max(enumerate(collatz_lengths(LIMIT)), key=itemgetter(1))[0]

%timeit euler14_memo_lower()
print(euler14_memo_lower())

1 loops, best of 3: 1.73 s per loop
837799


## Double stepping

The $3n + 1$ rule only applies when $n$ is odd, and three times an odd number is odd, so $3n + 1$ will always be even.

That means we can combine an odd operation with the next even operation.
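
A quick way to convince yourself of this is to compare two single steps against the combined step for a range of odd inputs. The helper names `step` and `odd_shortcut` below are illustrative only.

```python
def step(n):
    """Apply one Collatz operation."""
    return 3 * n + 1 if n & 1 else n >> 1


def odd_shortcut(n):
    """Apply the odd rule and the halving that must follow it, in one go."""
    return (3 * n + 1) >> 1


# For every odd n, two single steps and one combined step land on the same value.
assert all(step(step(n)) == odd_shortcut(n) for n in range(1, 1000, 2))
print(step(step(13)), odd_shortcut(13))
```

The combined step still represents two operations, so the chain length has to go up by 2 each time it is used.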

In [4]:
def euler14_double_step_naive():
    from operator import itemgetter
    
    def collatz_length(n, lengths):
        if n not in lengths:
            if n & 1:
                lengths[n] = 2 + collatz_length((3*n+1)>>1, lengths)
            else:
                lengths[n] = 1 + collatz_length(n>>1, lengths)
        return lengths[n]
    
    def collatz_lengths(n):
        lengths = {0:1, 1:1}
        for i in range(n):
            collatz_length(i, lengths)
        return lengths

    return max(collatz_lengths(LIMIT).items(), key=itemgetter(1))[0]

%timeit euler14_double_step_naive()
print(euler14_double_step_naive())

1 loops, best of 3: 1.91 s per loop
837799


In [5]:
def euler14_double_step_memo_lower():  
    from operator import itemgetter   
    
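    def collatz_lengths(n):
        lengths = [None] * n
        lengths[0] = lengths[1] = 1
        for i in range(2, n):
            k = i
            count = 0
            # Step until the chain drops below i, where the length is already known.
            while k >= i:
                if k & 1:
                    # Odd: 3k+1 is even, so take the halving step as well (2 operations).
                    count += 2
                    k = (3*k+1)>>1
                else:
                    count += 1
                    k >>= 1
            lengths[i] = count + lengths[k]
        return lengths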
    
    return max(enumerate(collatz_lengths(LIMIT)), key=itemgetter(1))[0]
    
%timeit euler14_double_step_memo_lower()
print(euler14_double_step_memo_lower())

1 loops, best of 3: 1.45 s per loop
837799
