# Longest Collatz Sequence

([Problem 14](https://projecteuler.net/problem=14))

The following iterative sequence is defined for the set of positive integers:

* if $n$ is even:
    * $n \to n/2$

* if $n$ is odd:
    * $n \to 3n + 1$


Using the rule above and starting with $13$, we generate the following sequence:

$$
13 \to 40 \to 20 \to 10 \to 5 \to 16 \to 8 \to 4 \to 2 \to 1
$$

It can be seen that this sequence (starting at $13$ and finishing at $1$) contains $10$ terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at $1$.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

## first, write the Collatz chain method

In [114]:
# get the Collatz chain of a particular number
def collatz_chain(n):
    chain = []
    while n != 1:
        chain.append(n)
        if n % 2 == 0:
            n = n // 2  # integer division keeps every term an int
        else:
            n = 3 * n + 1
    chain.append(1)  # every chain ends at 1
    return chain

In [115]:
chain_999999 = collatz_chain(999999)
print(chain_999999[::-1])
print(len(chain_999999))

[1, 2, 4, 8, 16, 5, 10, 20, 40, 80, 160, 53, 106, 35, 70, 23, 46, 92, 184, 61, 122, 244, 488, 976, 325, 650, 1300, 433, 866, 1732, 577, 1154, 2308, 4616, 9232, 3077, 6154, 2051, 4102, 1367, 2734, 911, 1822, 3644, 7288, 2429, 4858, 1619, 3238, 1079, 2158, 719, 1438, 479, 958, 319, 638, 1276, 425, 850, 283, 566, 1132, 377, 754, 251, 502, 167, 334, 668, 1336, 445, 890, 1780, 593, 1186, 395, 790, 263, 526, 175, 350, 700, 233, 466, 155, 310, 103, 206, 412, 137, 274, 91, 182, 364, 121, 242, 484, 161, 322, 107, 214, 71, 142, 47, 94, 188, 376, 125, 250, 83, 166, 55, 110, 220, 73, 146, 292, 97, 194, 388, 776, 1552, 3104, 6208, 2069, 4138, 1379, 2758, 919, 1838, 3676, 1225, 2450, 4900, 1633, 3266, 6532, 2177, 4354, 1451, 2902, 967, 1934, 3868, 1289, 2578, 859, 1718, 3436, 1145, 2290, 4580, 9160, 3053, 6106, 2035, 4070, 8140, 16280, 32560, 10853, 21706, 7235, 14470, 4823, 9646, 3215, 6430, 12860, 25720, 8573, 17146, 34292, 68584, 22861, 45722, 91444, 182888, 365776, 121925, 243850, 487700, 975400

## brute force method

The brute-force search took about __40 seconds__ for a limit of one million.  Not bad.

In [118]:
def find_long_chain(start_less_than):
    long_chain_count = (0, 0)
    for i in range(2, start_less_than):
        chain = collatz_chain(i)
        if long_chain_count[1] < len(chain):
            long_chain_count = (i, len(chain))
    return '{} has the longest Collatz sequence at {}'.format(long_chain_count[0], long_chain_count[1])

In [119]:
# .025 ms
test_time(find_long_chain, (10,))

9 has the longest Collatz sequence at 20


'0.026988983154296875ms'

In [120]:
# 1 ms
test_time(find_long_chain, (100,))

97 has the longest Collatz sequence at 119


'1.6853809356689453ms'

In [121]:
# 25 ms
test_time(find_long_chain, (1000,))

871 has the longest Collatz sequence at 179


'25.497770309448242ms'

In [122]:
# 280 ms
test_time(find_long_chain, (10000,))

6171 has the longest Collatz sequence at 262


'283.4444046020508ms'

In [123]:
# 3.3 seconds
test_time(find_long_chain, (100000,))

77031 has the longest Collatz sequence at 351


'3293.332576751709ms'

In [124]:
# 41 seconds
test_time(find_long_chain, (1000000,))

837799 has the longest Collatz sequence at 525


'40602.92820930481ms'
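
Only the chain length is needed to answer the question, so the list itself can be skipped. The following is a minimal sketch of that idea, not the solution timed in this notebook; `collatz_length` and `longest_start` are hypothetical names introduced here for illustration.

```python
def collatz_length(n):
    """Count the terms in the Collatz chain starting at n (including n and 1)."""
    length = 1
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        length += 1
    return length


def longest_start(limit):
    """Return (start, length) for the longest chain whose start is below limit."""
    best = (1, 1)
    for start in range(2, limit):
        length = collatz_length(start)
        if length > best[1]:  # strict comparison keeps the first of any ties
            best = (start, length)
    return best


print(collatz_length(13))   # 10, matching the worked example for 13
print(longest_start(1000))  # (871, 179), matching the brute-force output
```

The results agree with the brute-force outputs for the same limits; no timing is claimed for this variant.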

## test top 50% and only odd numbers

__The best solution__

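The longest chain always starts in the upper half of the range: for any $n$ below half the limit, $2n$ is also below the limit, and its chain is the chain of $n$ with one extra term in front.

Additionally, for the limits tested here, the numbers with the longest chains are odd (this makes sense because an odd start always begins by increasing the number). This is a heuristic rather than a rule: below $7$, for example, the longest chain starts at $6$ ($9$ terms), not at $3$ ($8$ terms).

Searching only the odd numbers in the upper half took about 1/3 of the brute-force time.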
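
Both observations can be probed with a short check. The sketch below (with `collatz_length` as a hypothetical helper, not part of the notebook) confirms that doubling a start adds exactly one term, and looks at a small limit where the longest chain happens to start at an even number.

```python
def collatz_length(n):
    """Count the terms in the Collatz chain starting at n (including n and 1)."""
    length = 1
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        length += 1
    return length


# Doubling a start adds exactly one term, because 2n -> n is the first step.
for n in range(1, 1000):
    assert collatz_length(2 * n) == collatz_length(n) + 1

# Chain lengths for every start below 7.
lengths = {n: collatz_length(n) for n in range(1, 7)}
best = max(lengths, key=lengths.get)
print(best, lengths[best])  # 6 9
print(lengths[3])           # 8
```

The first check supports skipping the lower half of the range. The second suggests the odd-only filter should be validated against the full search whenever the limit changes.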

In [214]:
# find the number less than a given number with the longest Collatz sequence
def find_long_chain(less_than):
    long_chain_count = (0, 0)
    start_number = less_than // 2 # only search the upper half
    start_number += 1 if start_number % 2 == 0 else 0 # always odd
    for i in range(start_number, less_than, 2): # odd numbers only (heuristic)
        chain = collatz_chain(i)
        if long_chain_count[1] < len(chain):
            long_chain_count = (i, len(chain))
    return '{} has the longest Collatz sequence at {}'.format(long_chain_count[0], long_chain_count[1])


In [215]:
# .018 ms
test_time(find_long_chain, (10,))

9 has the longest Collatz sequence at 20


'0.018215179443359375ms'

In [216]:
# .45 ms
test_time(find_long_chain, (100,))

97 has the longest Collatz sequence at 119


'0.45299530029296875ms'

In [217]:
# 7 ms
test_time(find_long_chain, (1000,))

871 has the longest Collatz sequence at 179


'7.358121871948242ms'

In [218]:
# 91 ms
test_time(find_long_chain, (10000,))

6171 has the longest Collatz sequence at 262


'91.38760566711426ms'

In [219]:
# 1 second
test_time(find_long_chain, (100000,))

77031 has the longest Collatz sequence at 351


'1091.6337966918945ms'

In [220]:
# 13 seconds
test_time(find_long_chain, (1000000,))

837799 has the longest Collatz sequence at 525


'12710.10217666626ms'

## observations

* For the limits tested above, the numbers with the longest chains are __odd numbers__ (plausible because an odd start always begins by increasing the number), though this is not guaranteed for every limit.
* The Collatz chains of the last 10 numbers before 1,000,000 are quite long, and they all __share a majority of their numbers__.
* We can use the longest chain among 999990-999999 as __a reference__ so that we __don't need to calculate__ the full chain of each number we test.
    * We will use the Collatz chain for __999999__ (chain length of __259 numbers__) as our reference chain.
    * It turns out this method is actually __slower__: in the timings below, building the chain for 975000 takes about 0.93 ms with the reference chain versus about 0.05 ms without it, because `n in reference_chain` scans the list at every step.
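
A common alternative way to reuse earlier work, swapped in here for illustration rather than taken from this notebook, is to cache chain lengths in a dictionary so that each chain stops as soon as it reaches a number that has already been measured. In this sketch, `longest_start_cached` and `cache` are hypothetical names.

```python
def longest_start_cached(limit):
    """Return (start, length) of the longest chain whose start is below limit."""
    cache = {1: 1}  # known chain length for every number seen so far
    best = (1, 1)
    for start in range(2, limit):
        n = start
        path = []
        # Walk forward until we reach a number whose chain length is known.
        while n not in cache:
            path.append(n)
            n = n // 2 if n % 2 == 0 else 3 * n + 1
        length = cache[n]
        # Record lengths for the newly visited numbers, working backwards.
        for m in reversed(path):
            length += 1
            cache[m] = length
        if cache[start] > best[1]:
            best = (start, cache[start])
    return best


print(longest_start_cached(10000))  # (6171, 262), matching the outputs above
```

Unlike a single reference chain, the dictionary gives constant-time lookups on average and covers every number visited so far, at the cost of extra memory.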

In [99]:
# get the Collatz chain of a particular number
# if a reference chain is given, reuse its tail as soon as
# the current term appears in it, to shorten processing time
def collatz_chain_plus(n, reference_chain=None):
    chain = []
    while n != 1:
        if reference_chain and n in reference_chain:
            i = reference_chain.index(n)
            return chain + reference_chain[i:]
        chain.append(n)
        if n % 2 == 0:
            n = n // 2  # integer division keeps every term an int
        else:
            n = 3 * n + 1
    chain.append(1)
    return chain

In [104]:
print(collatz_chain_plus(557, chain_999999))
print(collatz_chain_plus(557))

[557, 1672, 836, 418, 209, 628, 314, 157, 472, 236, 118, 59, 178, 89, 268, 134, 67, 202, 101, 304, 152, 76, 38, 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
[557, 1672, 836, 418, 209, 628, 314, 157, 472, 236, 118, 59, 178, 89, 268, 134, 67, 202, 101, 304, 152, 76, 38, 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]


In [112]:
test_time(collatz_chain_plus, (975000,), 50)

[975000, 487500, 243750, 121875, 365626, 182813, 548440, 274220, 137110, 68555, 205666, 102833, 308500, 154250, 77125, 231376, 115688, 57844, 28922, 14461, 43384, 21692, 10846, 5423, 16270, 8135, 24406, 12203, 36610, 18305, 54916, 27458, 13729, 41188, 20594, 10297, 30892, 15446, 7723, 23170, 11585, 34756, 17378, 8689, 26068, 13034, 6517, 19552, 9776, 4888, 2444, 1222, 611, 1834, 917, 2752, 1376, 688, 344, 172, 86, 43, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]


'0.05398273468017578ms'

In [113]:
test_time(collatz_chain_plus, (975000, chain_999999), 50)

[975000, 487500, 243750, 121875, 365626, 182813, 548440, 274220, 137110, 68555, 205666, 102833, 308500, 154250, 77125, 231376, 115688, 57844, 28922, 14461, 43384, 21692, 10846, 5423, 16270, 8135, 24406, 12203, 36610, 18305, 54916, 27458, 13729, 41188, 20594, 10297, 30892, 15446, 7723, 23170, 11585, 34756, 17378, 8689, 26068, 13034, 6517, 19552, 9776, 4888, 2444, 1222, 611, 1834, 917, 2752, 1376, 688, 344, 172, 86, 43, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]


'0.9338998794555664ms'
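
The list lookups in `collatz_chain_plus` (`n in reference_chain` and `reference_chain.index(n)`) can be replaced with a single dictionary lookup. The following is a sketch of that change under the assumption that the index is built once up front; `reference_index` and `collatz_chain_indexed` are hypothetical names, and no timing is claimed for it.

```python
def collatz_chain(n):
    """Return the Collatz chain starting at n and ending at 1."""
    chain = []
    while n != 1:
        chain.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    chain.append(1)
    return chain


def collatz_chain_indexed(n, reference_chain, reference_index):
    """Build the chain for n, reusing the tail of reference_chain when possible."""
    chain = []
    while n != 1:
        i = reference_index.get(n)  # dictionary lookup instead of a list scan
        if i is not None:
            return chain + reference_chain[i:]
        chain.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    chain.append(1)
    return chain


reference_chain = collatz_chain(999999)
# Map each value in the reference chain to its position in that chain.
reference_index = {value: i for i, value in enumerate(reference_chain)}

result = collatz_chain_indexed(975000, reference_chain, reference_index)
print(result == collatz_chain(975000))  # True
```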

## tester

In [1]:
import time

def test_time(func_to_test, any_params=(), num_times_to_run=5, print_results=True):
    # perf_counter is a high-resolution clock intended for measuring intervals
    start = time.perf_counter()

    for i in range(num_times_to_run):
        results = func_to_test(*any_params)

    end = time.perf_counter()
    if print_results:
        print(results)
    # average time per run, in milliseconds
    return str((end - start) * 10**3 / num_times_to_run) + "ms"
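
The standard library's `timeit` module offers a similar measurement. The sketch below is one possible equivalent of the helper above, not a replacement the notebook uses; `time_ms` is a hypothetical name.

```python
import timeit


def time_ms(func, args=(), number=5):
    """Average wall-clock time of func(*args) in milliseconds over `number` runs."""
    total_seconds = timeit.timeit(lambda: func(*args), number=number)
    return total_seconds * 1000 / number


# Example: time summing a small range 100 times.
elapsed = time_ms(sum, (range(1000),), number=100)
print(f"{elapsed:.4f}ms")
```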