# Longest Collatz sequence

[Problem 14](https://projecteuler.net/problem=14)
> The following iterative sequence is defined for the set of positive integers:

> \begin{align}
n &→ \frac n2 &\text{(n is even)} \\
n &→ 3n + 1 &\text{(n is odd)}
\end{align}

> Using the rule above and starting with 13, we generate the following sequence:

> $$
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
$$

> It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

> Which starting number, under one million, produces the longest chain?

> NOTE: Once the chain starts the terms are allowed to go above one million.

The truly naive approach, which walks every chain from scratch, takes too long, but this problem is a great candidate for memoization.
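For reference, here is a minimal sketch of what the naive approach might look like: every chain is walked to 1 with no caching. The helper name `naive_chain_length` is hypothetical and is not part of the classes in this notebook.

```python
def naive_chain_length(n):
    """Count the terms in the Collatz chain starting at n (n and the final 1 included)."""
    terms = 1
    while n != 1:
        # Odd values go to 3n + 1, even values are halved.
        n = 3 * n + 1 if n % 2 else n // 2
        terms += 1
    return terms


print(naive_chain_length(13))  # 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1 has 10 terms
```

Calling this for every start below one million repeats the same tail of the chain many times, which is the work memoization avoids.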

If a Collatz step lands on a number whose chain length we have already calculated, we can stop there and return the number of steps it took to reach that number plus its cached length.
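The idea can be sketched on the example chain from the problem statement. This is an illustrative, iterative variant using a plain dictionary, not the decorator-based code in the cells of this notebook; `chain_length` and `cache` are hypothetical names.

```python
def chain_length(n, cache):
    """Return the number of terms in the chain from n, reusing and filling `cache`."""
    path = []
    k = n
    # Walk until we reach a value whose length is already known.
    while k not in cache:
        path.append(k)
        k = 3 * k + 1 if k & 1 else k >> 1
    length = cache[k]
    # Unwind: each value on the path is one term longer than its successor.
    for value in reversed(path):
        length += 1
        cache[value] = length
    return length


cache = {1: 1}                   # base case: the chain from 1 has one term
chain_length(10, cache)          # caches 10, 5, 16, 8, 4 and 2
print(chain_length(13, cache))   # walks only 13, 40, 20, then reuses the cached length of 10
```

After the first call the length of 10 is known to be 7, so the second call needs just 3 steps to arrive at 3 + 7 = 10 terms.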

In [1]:
class NaiveMemo:    
    def memoize(obj):
        import functools
        cache = obj.cache = {}

        @functools.wraps(obj)
        def memoizer(self, arg):
            if arg not in cache:
                cache[arg] = obj(self, arg)
            return cache[arg]
        return memoizer
    
    @memoize
    def collatz_length(self, n):
        if n == 1: return 1
        if n & 1:
            return 1 + self.collatz_length(3*n+1)
        return 1 + self.collatz_length(n>>1)

    def max_collatz_length(self, n):
        self.collatz_length(1)  # seed the cache with the base case: length(1) == 1
        mx = 1                  # best starting number found so far
        for i in range(2, n):
            if self.collatz_length.cache[mx] <= self.collatz_length(i):
                mx = i
        return mx

    def euler14(self):
        return self.max_collatz_length(10**6)

print(NaiveMemo().euler14())
%timeit NaiveMemo().euler14()

837799
1 loops, best of 3: 595 ms per loop


Collatz sequences rise and fall, and they often climb above the range of starting values we are interested in.

This can cause a huge cache to accumulate. To reduce memory use and the number of dictionary updates, we can memoize only the values within the domain of interest, with little difference in performance.
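A minimal sketch of the bounded idea, assuming starts are processed in increasing order so that every value below the current start is already cached. It uses a fixed-size list instead of a dictionary, and the name `longest_start_bounded` is hypothetical. Unlike the classes in this notebook, it keeps the first start on a tie.

```python
def longest_start_bounded(limit):
    """Return the start below `limit` with the longest chain, caching only starts < limit."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    lengths = [0] * limit  # lengths[k] = number of terms in the chain from k
    lengths[1] = 1
    best = 1
    for n in range(2, limit):
        k, steps = n, 0
        # Walk until the chain drops below n; values that climb above
        # `limit` are never stored, so the cache size stays fixed.
        while k >= n:
            k = 3 * k + 1 if k & 1 else k >> 1
            steps += 1
        lengths[n] = steps + lengths[k]
        if lengths[n] > lengths[best]:
            best = n
    return best


print(longest_start_bounded(10))  # 9, whose chain has 20 terms
```

The cache never holds more than `limit` entries, regardless of how high the intermediate terms go.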

In [2]:
class NaiveMemoLower:    
    def memoize(obj):
        import functools
        cache = obj.cache = {}

        @functools.wraps(obj)
        def memoizer(self, arg):
            if arg not in cache:
                cache[arg] = obj(self, arg)
            return cache[arg]
        return memoizer
    
    @memoize
    def collatz_length(self, n):
        if n == 1: return 1
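        k = n
        count = 0
        # Walk the chain until it drops below n. This relies on every value
        # below n already being cached, which holds because
        # max_collatz_length calls this method in increasing order.
        while k >= n: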
            count += 1
            if k & 1:
                k = 3*k+1
            else:
                k >>= 1
        return count + self.collatz_length.cache[k]

    def max_collatz_length(self, n):
        self.collatz_length(1)  # seed the cache with the base case: length(1) == 1
        mx = 1                  # best starting number found so far
        for i in range(2, n):
            if self.collatz_length.cache[mx] <= self.collatz_length(i):
                mx = i
        return mx

    def euler14(self):
        return self.max_collatz_length(10**6)

print(NaiveMemoLower().euler14())
%timeit NaiveMemoLower().euler14()

837799
1 loops, best of 3: 542 ms per loop


## Double skipping

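Because of the way the Collatz operations change the values, we know that $3n + 1$ will always be even when $n$ is odd.

That means we can combine an odd operation with the even operation that must follow it into a single step, $n → (3n + 1)/2$, which counts as two operations.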
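As a quick check of this shortcut, the sketch below compares a plain step-by-step count with one that takes the combined step. Both function names are hypothetical helpers written for this illustration.

```python
def chain_length_plain(n):
    """Count terms one Collatz operation at a time."""
    terms = 1
    while n != 1:
        n = 3 * n + 1 if n & 1 else n >> 1
        terms += 1
    return terms


def chain_length_shortcut(n):
    """Count terms, folding each odd step and the halving that follows it together."""
    terms = 1
    while n != 1:
        if n & 1:
            n = (3 * n + 1) >> 1  # 3n + 1 is even, so halve it immediately
            terms += 2            # two operations were applied
        else:
            n >>= 1
            terms += 1
    return terms


print(chain_length_plain(13), chain_length_shortcut(13))  # both give 10
```

The shortcut cannot jump over 1, because for odd $n > 1$ the combined step yields a value larger than $n$.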

In [3]:
class DoubleSkipNaive:    
    def memoize(obj):
        import functools
        cache = obj.cache = {}

        @functools.wraps(obj)
        def memoizer(self, arg):
            if arg not in cache:
                cache[arg] = obj(self, arg)
            return cache[arg]
        return memoizer
    
    @memoize
    def collatz_length(self, n):
        if n == 1: return 1
        if n & 1:
            return 2 + self.collatz_length((3*n+1)>>1)
        return 1 + self.collatz_length(n>>1)

    def max_collatz_length(self, n):
        self.collatz_length(1)  # seed the cache with the base case: length(1) == 1
        mx = 1                  # best starting number found so far
        for i in range(2, n):
            if self.collatz_length.cache[mx] <= self.collatz_length(i):
                mx = i
        return mx

    def euler14(self):
        return self.max_collatz_length(10**6)

print(DoubleSkipNaive().euler14())
%timeit DoubleSkipNaive().euler14()

837799
1 loops, best of 3: 575 ms per loop


In [4]:
class DoubleSkipMemoLower:    
    def memoize(obj):
        import functools
        cache = obj.cache = {}

        @functools.wraps(obj)
        def memoizer(self, arg):
            if arg not in cache:
                cache[arg] = obj(self, arg)
            return cache[arg]
        return memoizer
    
    @memoize
    def collatz_length(self, n):
        if n == 1: return 1
        k = n
        count = 0
        while k >= n:
            if k & 1:
                # Odd: apply 3k+1 and the guaranteed halving together (2 operations).
                count += 2
                k = (3*k+1)>>1
            else:
                # Even: a single halving (1 operation).
                count += 1
                k >>= 1
        return count + self.collatz_length.cache[k]

    def max_collatz_length(self, n):
        self.collatz_length(1)  # seed the cache with the base case: length(1) == 1
        mx = 1                  # best starting number found so far
        for i in range(2, n):
            if self.collatz_length.cache[mx] <= self.collatz_length(i):
                mx = i
        return mx

    def euler14(self):
        return self.max_collatz_length(10**6)

print(DoubleSkipMemoLower().euler14())
%timeit DoubleSkipMemoLower().euler14()

837799
1 loops, best of 3: 532 ms per loop
