# Computing Stopping Times for the Collatz Function on Binary Strings

We define the *Collatz function* (in its shortcut form, where the halving that always follows an odd step is folded into the map) on positive integers by
$$
T(n) =
\begin{cases}
\frac{3n + 1}{2} & \text{if } n \text{ is odd} \\
\frac{n}{2} & \text{if } n \text{ is even}
\end{cases}
$$
The *Collatz conjecture* states that every positive integer $n$ eventually reaches 1 under repeated application of $T$.
(Once a sequence reaches 1, it remains in the cycle $1 \rightarrow 2 \rightarrow 1$ forever, since $T(1) = 2$ and $T(2) = 1$.)
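
The definition of $T$ can be sketched directly in Python. This is only an illustration: the name `collatz_step` is hypothetical and is not taken from `gamma.py`, whose source is not shown here.

```python
def collatz_step(n: int) -> int:
    """One application of T: (3n + 1) / 2 for odd n, n / 2 for even n."""
    if n % 2 == 1:
        # 3n + 1 is even whenever n is odd, so integer division is exact
        return (3 * n + 1) // 2
    return n // 2


# T(5) = 8 and T(8) = 4; T(1) = 2 and T(2) = 1 form the cycle through 1
print(collatz_step(5), collatz_step(8), collatz_step(1), collatz_step(2))
# prints: 8 4 2 1
```

Integer division is used instead of `/` so that the result stays an exact Python `int` even for very large inputs.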

We define the *stopping time* of a positive integer by
$$
\sigma(n) = \inf\{ k \ge 0 : T^k(n) = 1 \}
$$
Thus, a restatement of the Collatz conjecture is that $\sigma(n) < \infty$ for all positive integers $n$.
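
A minimal sketch of how $\sigma$ could be computed follows. The name `stopping_time` is hypothetical, and this is not necessarily how `sigma` in `gamma.py` is implemented. Note that the loop terminates only if the trajectory of $n$ actually reaches 1.

```python
def stopping_time(n: int) -> int:
    """Count applications of T needed to reach 1 (assumes the trajectory does reach 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 0
    while n != 1:
        # odd n: (3n + 1) / 2, even n: n / 2
        n = (3 * n + 1) // 2 if n & 1 else n >> 1
        k += 1
    return k


print(stopping_time(1), stopping_time(5), stopping_time(7))
# prints: 0 4 11
```

For example, $7 \to 11 \to 17 \to 26 \to 13 \to 20 \to 10 \to 5 \to 8 \to 4 \to 2 \to 1$ takes 11 steps.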

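Heuristically, odd and even values occur about equally often along a trajectory, so each application of $T$ shrinks $n$ by a factor of roughly $\sqrt{3}/2$ on average, and the stopping time is expected to grow proportionally to $\log n$.
This prompts us to define one final statistic $\gamma$ for $n \ge 2$, using the natural logarithm, by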
$$
\gamma(n) = \frac{\sigma(n)}{\log n}
$$
In this notebook, we produce a large dataset of binary strings and their corresponding $\gamma$ values.
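
As a small worked example, the trajectory $5 \to 8 \to 4 \to 2 \to 1$ has four steps, so with the natural logarithm (which is what `math.log` computes):
$$
\gamma(5) = \frac{\sigma(5)}{\log 5} = \frac{4}{\log 5} \approx 2.485
$$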

In [1]:
import time
import random
import numpy as np
from math import log

from gamma import *

In [2]:
# examples of some functions in gamma.py
seq(5)
print(f'sigma(5): {sigma(5)}, gamma(5): {gamma(5)}')

5 8 4 2 1 
sigma(5): 4, gamma(5): 2.4853397382384474


## Creating the Data Set

In [24]:
rng = np.random.default_rng()

def random_binary(rng=rng, N=1000):
    '''Generate a random binary string of length N (leading zeros are possible)'''
    # draw N independent bits from {0, 1}; high is exclusive
    a = rng.integers(low=0, high=2, size=N)
    return ''.join(str(b) for b in a)

In [25]:
n = int(random_binary(), 2)
print(n)

6541643499186055663771779328526469627081095288196595523680383547052491563421123870449850966187717703280037334515593694237002611891077117952455920949798828173923591768316646776642950379418199574201749562252947993917600033282790517957126253203260841628458181229757520308969262614048681404644781441437778
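
Because the bits are drawn uniformly, a string may begin with zeros, in which case the integer has fewer significant bits than the string has characters. The sketch below illustrates this with a fixed, hand-picked string (not one produced by `random_binary`) and shows how the original string can be recovered by zero-padding.

```python
b = '0010110'          # hand-picked 7-character example with two leading zeros
n = int(b, 2)          # base-2 parse ignores the leading zeros

print(n, n.bit_length())
# prints: 22 5

# zero-pad back to the original width to recover the string exactly
recovered = format(n, f'0{len(b)}b')
print(recovered == b)
# prints: True
```

This is why the full binary string, rather than the integer, is the natural thing to store alongside each $\gamma$ value.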


In [33]:
# choose file to output to
outfile = 'gamma_values.csv'

# choose how many values to generate and how many binary digits each has
num_values = 10000
N = 1000

start = time.time()

# mode 'a' appends, so re-running this cell adds rows to an existing file
with open(outfile, 'a') as f:
    for i in range(num_values):

        # get random binary number as binary string and python int
        b = random_binary(N=N)
        n = int(b, 2)

        # compute gamma and write to file
        g = gamma(n)
        f.write(f'{b},{g}\n')

        # report the time taken by each block of 1000 values
        if i % 1000 == 0:
            now = time.time()
            print(f'{i}, {int(now - start)} s')
            start = now

0, 0 s
1000, 3 s
2000, 3 s
3000, 3 s
4000, 3 s
5000, 3 s
6000, 3 s
7000, 3 s
8000, 3 s
9000, 3 s
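
Each row of the output file has the form `binary_string,gamma`, with no header line. A minimal sketch of reading such rows back is shown below. The helper name `read_rows` is hypothetical, and the sample text is a single hand-written row (the string `101` is 5 in binary, paired with the `gamma(5)` value printed earlier) rather than real file contents.

```python
import csv
import io


def read_rows(text: str) -> list[tuple[int, float]]:
    """Parse 'binary_string,gamma' lines into (integer, gamma) pairs."""
    rows = []
    for bits, g in csv.reader(io.StringIO(text)):
        rows.append((int(bits, 2), float(g)))
    return rows


sample = "101,2.4853397382384474\n"  # hand-written example row, not real output
print(read_rows(sample))
# prints: [(5, 2.4853397382384474)]
```

For the real file, the same parsing could be applied to the result of `open('gamma_values.csv').read()`, or the file object could be passed to `csv.reader` directly.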
