# Problem 14
The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.
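
As a quick check of the rule, the example chain for 13 can be reproduced with a small helper. This is only a sketch; `collatz_chain` is a name introduced here for illustration and is not part of the original notebook.

```python
def collatz_chain(start):
    """Return the full Collatz chain from start down to 1 (inclusive)."""
    chain = [start]
    n = start
    while n != 1:
        # Halve even terms, map odd terms to 3n + 1
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        chain.append(n)
    return chain


chain = collatz_chain(13)
print(" → ".join(map(str, chain)))  # 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
print(len(chain))                   # 10
```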

## First brute-force implementation

In [1]:
%%timeit -n1 -r1
longest_sequence = (0, 0)
for i in range(1, int(1e6)):
    val = i
    sequence_length = 1
    while val != 1:
        sequence_length += 1
        if val%2 == 0:
            val = val//2
        else:
            val = val*3+1
            
    if sequence_length > longest_sequence[1]:
        longest_sequence = (i, sequence_length)
        
print(longest_sequence)

(837799, 525)
13 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


## Slightly vectorized approach

In [2]:
from time import time
import numpy as np
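def get_next_step(original, data):
    # original: starting numbers still in play; data: their current chain values
    mask_even = (data & 1) == 0
    mask_odd = ~mask_even
    
    # Advance every chain by one step (integer division keeps the values exact)
    data[mask_even] = data[mask_even]//2
    data[mask_odd] = 3*data[mask_odd] + 1
    
    # Keep only the chains that have not reached 1 yet
    mask_one = (data != 1)
    mask_sum = np.sum(mask_one)
    
    if mask_sum != 1:
        result = get_next_step(original[mask_one], data[mask_one])
        return result
    else:
        # One chain left: its starting number produces the longest chain
        return original[mask_one][0]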

In [12]:
%%timeit -r1 -n1

tries = np.arange(1, int(1e6), dtype=np.int64)
number = get_next_step(tries, 1*tries)    
print(number)

837799
4.55 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
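
The same filtering idea can also be written as a plain loop instead of a recursive call. The following is a minimal sketch, not the notebook's code: the name `longest_start_vectorized` is hypothetical, and the tie handling (return the smallest tied start) is an assumption added here.

```python
import numpy as np


def longest_start_vectorized(limit):
    """Starting number below limit whose Collatz chain is longest."""
    starts = np.arange(1, limit, dtype=np.int64)
    values = starts.copy()

    # The chain starting at 1 is already finished, so drop it up front
    alive = values != 1
    starts, values = starts[alive], values[alive]
    if starts.size == 0:
        return 1

    while starts.size > 1:
        # One Collatz step for every chain that is still running
        even = (values & 1) == 0
        values = np.where(even, values // 2, 3 * values + 1)

        alive = values != 1
        if not alive.any():
            break  # all remaining chains ended together: keep the tie
        starts, values = starts[alive], values[alive]

    return int(starts[0])


print(longest_start_vectorized(14))  # 9
```

The loop avoids building one Python stack frame per step, so it does not depend on the recursion limit for longer chains.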


## Multiprocessing
See problem_14_mulitprocessing.py. The multiprocessing version lives in a separate script because of the limitations of running it inside a Jupyter notebook.
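
The script itself is not reproduced here. As a rough idea of how such a script could split the brute-force search across processes, here is a hedged sketch using the standard library's `concurrent.futures`. All function names (`chain_length`, `longest_in_range`, `make_chunks`, `parallel_longest`) and the chunk count are assumptions made for illustration, not the contents of the referenced file.

```python
from concurrent.futures import ProcessPoolExecutor


def chain_length(start):
    """Number of terms in the Collatz chain from start down to 1."""
    length = 1
    n = start
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        length += 1
    return length


def longest_in_range(bounds):
    """Return (start, length) of the longest chain for lo <= start < hi."""
    lo, hi = bounds
    pairs = ((s, chain_length(s)) for s in range(lo, hi))
    return max(pairs, key=lambda pair: pair[1])


def make_chunks(limit, n_chunks):
    """Split 1..limit-1 into roughly equal half-open (lo, hi) ranges."""
    step = max(1, -(-(limit - 1) // n_chunks))  # ceiling division
    return [(lo, min(lo + step, limit)) for lo in range(1, limit, step)]


def parallel_longest(limit, n_chunks=32):
    """Brute-force search with one task per chunk, spread over processes."""
    with ProcessPoolExecutor() as pool:
        results = pool.map(longest_in_range, make_chunks(limit, n_chunks))
        return max(results, key=lambda pair: pair[1])


if __name__ == "__main__":
    # The guard lets worker processes import this module without re-running it.
    # A reduced limit keeps the demo quick; pass 1_000_000 for the full problem.
    print(parallel_longest(100_000))
```

Using more chunks than workers helps balance the load, since chains tend to be longer for larger starting numbers.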

## Smart implementation (reusing earlier chain lengths)

In [4]:
%%timeit -n1 -r1
sequence_history = []
for i in range(1, int(1e6)):
    val = i
    sequence_length = 1
    
    while val != 1:
        if val < i:
            # Chain length of val is already known; it counts val itself,
            # which sequence_length has already counted, hence the -1
            sequence_length += sequence_history[val-1] - 1
            break
        sequence_length += 1
        if val%2 == 0:
            val = val//2
        else:
            val = val*3+1
            
    sequence_history.append(sequence_length)
    
max_length = max(sequence_history)
index = sequence_history.index(max_length)
print(f'Number: {index + 1}')    

Number: 837799
885 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
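
A related variant caches every value met along a chain, not only the starting numbers. This is a sketch of that alternative, not the notebook's method: `longest_chain` and the dictionary `cache` are names introduced here, and the cache also stores values above the limit, trading memory for fewer repeated steps.

```python
def longest_chain(limit):
    """Return (start, length) of the longest Collatz chain for start < limit."""
    cache = {1: 1}  # value -> number of terms in its chain
    best_start, best_len = 1, 1

    for start in range(1, limit):
        # Walk forward until a value with a known chain length is reached
        path = []
        n = start
        while n not in cache:
            path.append(n)
            n = n // 2 if n % 2 == 0 else 3 * n + 1

        # Back-fill the lengths for every value visited on the way
        length = cache[n]
        for value in reversed(path):
            length += 1
            cache[value] = length

        if cache[start] > best_len:
            best_start, best_len = start, cache[start]

    return best_start, best_len


print(longest_chain(14))  # (9, 20)
```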


## Cython implementation
About 70x faster than the smart pure-Python implementation above (12.7 ms vs. 885 ms).

In [45]:
%load_ext cython

The cython extension is already loaded. To reload it, use:
  %reload_ext cython


In [117]:
%%cython 
#--annotate

import numpy as np

cimport cython
cimport numpy as np
 
def get_max_length(int number):
    cdef int n = number
    
    cdef unsigned long long val
    cdef unsigned int i, sequence_length
    
    # sequence_history[k] holds the chain length of k (index 0 is unused)
    cdef np.ndarray[np.int64_t, ndim=1] sequence_history = np.zeros((n,), dtype=np.int64)
    
    for i in range(1, n):
        val = i
        sequence_length = 1
        while val != 1:
            if val < i:
                # Chain length of val is already known; it counts val itself,
                # which sequence_length has already counted, hence the -1
                sequence_length += sequence_history[val] - 1
                break
            sequence_length += 1
            if val%2 == 0:
                val = val//2
            else:
                val = val*3+1

        sequence_history[i] = sequence_length
    
    # Starting number with the longest chain
    return sequence_history.argmax()

In [121]:
%timeit -n100 -r1 get_max_length(1000000)

12.7 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 100 loops each)


In [123]:
# Faster by:
885/12.7

69.68503937007874