# A Proof of the Collatz Conjecture

In [25]:
import sys, io
import math
import numpy as np
import pandas as pd
from scipy.optimize import nnls
from fractions import Fraction
from sympy import factorint
from itertools import product


In [26]:
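def S(a: int):
    """Yield every bit-string of generation a ('1' first, then '0')."""
    if a == 0:
        yield "1"  # generation 0 is the single root node
    else:
        for bits in product('10', repeat=a):
            yield ''.join(bits)

def zeros(S_s):
    """Return the 0-based positions of the '0' characters in a bit-string."""
    return [i for i, b in enumerate(S_s) if b == '0']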

In [27]:
for i in range(4):
    print("generation %d"%(i))
    for bitstring in S(i):
        print(bitstring)

generation 0
1
generation 1
1
0
generation 2
11
10
01
00
generation 3
111
110
101
100
011
010
001
000


## The simplified Collatz Conjecture is expressed as:

For every positive integer $n$, if the function $T(n)$ given below is applied repeatedly, then 1 is eventually produced:

$$
T(n)=
\begin{cases}
\dfrac{n}{2}, & \text{if } n \text{ is even},\\[8pt]
\dfrac{3n+1}{2}, & \text{if } n \text{ is odd}.
\end{cases}
$$

which generates a sequence of numbers such as $[12,\ 6,\ 3,\ 5,\ 8,\ 4,\ 2,\ 1]$.
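The map and the example sequence can be reproduced with a few lines of Python. This is a minimal sketch; the names `T` and `trajectory` and the `max_steps` guard are illustrative choices, not part of the notebook.

```python
def T(n: int) -> int:
    """One step of the simplified Collatz map defined above."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def trajectory(n: int, max_steps: int = 10_000) -> list:
    """Apply T repeatedly until 1 is reached (or max_steps values are collected)."""
    seq = [n]
    while seq[-1] != 1 and len(seq) <= max_steps:
        seq.append(T(seq[-1]))
    return seq


print(trajectory(12))  # [12, 6, 3, 5, 8, 4, 2, 1]
```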

## There is a known cycle in the simplified Collatz Conjecture

If $T(n)$ continues to be applied after 1 is reached, the cycle $[1,2,1,2,\dots]$ is produced.

## A Collatz Path can be expressed algebraically

Note that the last row of each matrix closes the cycle: it forces the final entry to equal the entry two positions earlier ($2 \to 1 \to 2$).

In [28]:
A = [ [ -1,  2,  0,  0],
      [  0, -1,  2,  0],
      [  0,  0, -3,  2],
      [  0,  1,  0, -1] ]
Y = [ 0, 0, 1, 0]

np.linalg.solve(A, Y)

array([4., 2., 1., 2.])

In [29]:
A = [ [ -1,  2,  0,  0,  0,  0,  0,  0,  0],
      [  0, -1,  2,  0,  0,  0,  0,  0,  0],
      [  0,  0, -3,  2,  0,  0,  0,  0,  0],
      [  0,  0,  0, -3,  2,  0,  0,  0,  0],
      [  0,  0,  0,  0, -1,  2,  0,  0,  0],
      [  0,  0,  0,  0,  0, -1,  2,  0,  0],
      [  0,  0,  0,  0,  0,  0, -1,  2,  0],
      [  0,  0,  0,  0,  0,  0,  0, -3,  2],
      [  0,  0,  0,  0,  0,  0,  1,  0, -1]     
    ]
Y = [ 0, 0, 1, 1, 0, 0, 0, 1, 0]

np.linalg.solve(A, Y)

array([12.,  6.,  3.,  5.,  8.,  4.,  2.,  1.,  2.])

## We can generate a lattice of Collatz matrices and Y vectors starting from:

$$
A=\begin{pmatrix}-1&2&0\\0&-3&2\\1&0&-1\end{pmatrix}
$$

and

$$
Y=\begin{pmatrix}0&1&0\end{pmatrix}
$$

Each new matrix is obtained by adding a $[-3, 2, \dots]$ or $[-1, 2, \dots]$ row to the top of a previous matrix.

The corresponding $Y$ vector has the value $1$ where the operation is $\frac{3n+1}{2}$ and $0$ where the operation is $\frac{n}{2}$.

This gives a set of linear equations with rational solutions that include all Collatz chains by design.

So if we can show that all integers are eventually contained in the solutions of these rational equations, we will have proved the Collatz conjecture.
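As a sketch of how such a system lines up with a concrete chain, the matrix and $Y$ vector can be built directly from a starting integer and checked against the chain itself. The helper names `collatz_system` and `residual_is_zero` are hypothetical, and the code assumes the starting integer is at least 2 and reaches 1.

```python
def collatz_system(n: int):
    """Build (A, Y, X) for the chain starting at n, closed by the 1 -> 2 cycle.

    Assumption: n >= 2 and the chain of n reaches 1.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    xs = [n]
    while xs[-1] != 1:
        x = xs[-1]
        xs.append(x // 2 if x % 2 == 0 else (3 * x + 1) // 2)
    xs.append(2)  # one extra step, T(1) = 2, so the cycle is visible
    size = len(xs)
    A = [[0] * size for _ in range(size)]
    Y = [0] * size
    for i in range(size - 1):
        odd = xs[i] % 2 == 1
        A[i][i] = -3 if odd else -1   # -3x + 2x' = 1 (odd), -x + 2x' = 0 (even)
        A[i][i + 1] = 2
        Y[i] = 1 if odd else 0
    # closing row: x[size-3] - x[size-1] = 0, i.e. the value 2 repeats
    A[size - 1][size - 3] = 1
    A[size - 1][size - 1] = -1
    return A, Y, xs


def residual_is_zero(A, Y, X) -> bool:
    """Check A @ X == Y with exact integer arithmetic."""
    return all(sum(a * x for a, x in zip(row, X)) == y for row, y in zip(A, Y))


A, Y, X = collatz_system(12)
print(Y)                          # [0, 0, 1, 1, 0, 0, 0, 1, 0]
print(residual_is_zero(A, Y, X))  # True
```

Calling `collatz_system(2)` returns the $3\times3$ starting matrix and $Y$ vector shown above.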



# Identify the X[0] generator

The first value of each X solution vector has a generator which gives a binary lattice expanding from 1:

$ (a, b, c) \rightarrow [ (a + 1, b, c), \ (a + 1, b + 1, 3c + 2^a) ] $

with $(0,0,0)$ being the first tuple

with corresponding rational values:

${\Large \frac{2^a - c}{3^b}}$

and so we start with $(0,0,0)\ \rightarrow \ 1$, the final value of all Collatz chains.
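The tuple generator can be sketched directly from the rule above. The function names `children`, `value` and `generation` are illustrative; only the rule $(a,b,c)\rightarrow[(a+1,b,c),(a+1,b+1,3c+2^a)]$ and the value formula come from the text.

```python
from fractions import Fraction


def children(node):
    """Apply the generator rule to one (a, b, c) tuple."""
    a, b, c = node
    return [(a + 1, b, c), (a + 1, b + 1, 3 * c + 2 ** a)]


def value(node):
    """Rational value (2^a - c) / 3^b of a tuple."""
    a, b, c = node
    return Fraction(2 ** a - c, 3 ** b)


def generation(g):
    """All tuples reached after g applications of the rule, starting at (0, 0, 0)."""
    nodes = [(0, 0, 0)]
    for _ in range(g):
        nodes = [child for node in nodes for child in children(node)]
    return nodes


print([str(value(n)) for n in generation(2)])  # ['4', '2/3', '1', '-1/9']
```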


# By Design, the lattice contains all Collatz Solutions

If an integer is in the lattice, then it has a path to 1 under the map $T$. The lattice contains many more non-integer rationals than integers, but when we rotate the generator we find that we can give a constructive proof that every integer is produced by the next generator, and therefore that all integers are in the lattice.
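The first sentence can be spot-checked in the other direction: record the steps of an integer down to 1 as a bit-string, then replay that bit-string through the generator and confirm the same integer comes back. This is a sketch with illustrative names (`chain_path`, `value_of_path`); it assumes every tested integer reaches 1, which holds for the small range used here.

```python
from fractions import Fraction


def chain_path(n: int) -> str:
    """Steps from n down to 1: '1' for n/2, '0' for (3n+1)/2.

    Assumption: n reaches 1 (true for the small values tested below).
    """
    bits = []
    while n != 1:
        if n % 2 == 0:
            n //= 2
            bits.append("1")
        else:
            n = (3 * n + 1) // 2
            bits.append("0")
    return "".join(bits)


def value_of_path(path: str) -> Fraction:
    """Replay a bit-string through the (a, b, c) generator from (0, 0, 0)."""
    a, b, c = 0, 0, 0
    for ch in path:
        if ch == "1":
            a, b, c = a + 1, b, c
        else:
            a, b, c = a + 1, b + 1, 3 * c + 2 ** a
    return Fraction(2 ** a - c, 3 ** b)


# Round trip on a small range of integers.
print(all(value_of_path(chain_path(n)) == n for n in range(1, 200)))  # True
```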

# A generation generator

Instead of generating the generations of a binary tree starting with $[1] \rightarrow [2, \frac{1}{3}] \rightarrow \dots$, we find the generator that produces each node generation by generation.

This generator is much easier to work with when trying to prove all integers are in the lattice and therefore covered by the Collatz conjecture.



In [30]:
def generation_fraction(a):
    """Yield (numerator, denominator) for every bit-string of length a."""
    for bits in product('10', repeat=a):
        # positions of the '0' characters, index 0 being the leftmost bit
        zeros = [i for i, b in enumerate(bits) if b == '0']
        b = len(zeros)
        # c = sum_{j=0}^{b-1} 3^(b-1-j) * 2^(z_j), with z_j the j-th zero position
        c = sum((3 ** (b - j - 1)) * (2 ** i) for j, i in enumerate(zeros))
        yield (2**a - c, 3**b)

In [31]:
for i in range(5):
    print("        %d:"%(i))
    gen_line = "        &[ "
    for tup in generation_fraction(i):
        f = Fraction(tup[0], tup[1])
        if f.denominator == 1:
            gen_line = gen_line + "\\bold{%d},"%(f.numerator)
        else:
            gen_line = gen_line + "\\frac{%d}{%d},"%(f.numerator, f.denominator)
    gen_line = gen_line + " ]\\\\"
    print(gen_line)


        0:
        &[ \bold{1}, ]\\
        1:
        &[ \bold{2},\frac{1}{3}, ]\\
        2:
        &[ \bold{4},\frac{2}{3},\bold{1},\frac{-1}{9}, ]\\
        3:
        &[ \bold{8},\frac{4}{3},\bold{2},\frac{-2}{9},\frac{7}{3},\frac{1}{9},\frac{1}{3},\frac{-11}{27}, ]\\
        4:
        &[ \bold{16},\frac{8}{3},\bold{4},\frac{-4}{9},\frac{14}{3},\frac{2}{9},\frac{2}{3},\frac{-22}{27},\bold{5},\frac{5}{9},\bold{1},\frac{-13}{27},\frac{11}{9},\frac{-7}{27},\frac{-1}{9},\frac{-49}{81}, ]\\


One can see the beginnings of the Collatz numbers starting from one in the above output:

$$
1\ \mapsto \ 2 \mapsto \ 4 \ \mapsto \  8 \ \mapsto 
\begin{cases}
16 \\
5
\end{cases}
$$
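To read the integers off each generation without scanning the LaTeX output, one can filter the generation for values with denominator 1. This is a small sketch; `bits_value` and `integers_in_generation` are hypothetical helper names that re-implement the bit-string formula used by `generation_fraction`.

```python
from fractions import Fraction
from itertools import product


def bits_value(bits: str) -> Fraction:
    """Rational value (2^a - c) / 3^b of a bit-string, index 0 = leftmost bit."""
    zeros = [i for i, ch in enumerate(bits) if ch == "0"]
    b = len(zeros)
    c = sum(3 ** (b - 1 - j) * 2 ** z for j, z in enumerate(zeros))
    return Fraction(2 ** len(bits) - c, 3 ** b)


def integers_in_generation(g: int) -> list:
    """Integer values of generation g, in the same order as the printed output."""
    out = []
    for bits in product("10", repeat=g):
        v = bits_value("".join(bits))
        if v.denominator == 1:
            out.append(v.numerator)
    return out


for g in range(5):
    print(g, integers_in_generation(g))
# 0 [1]
# 1 [2]
# 2 [4, 1]
# 3 [8, 2]
# 4 [16, 4, 5, 1]
```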

Also note that a number that appears in generation $a$ also appears in generation $a+2$, and hence in every second generation after that. This feature of the lattice is important in the proof because it gives us an infinite number of opportunities to tie numbers to lesser numbers. (Appending the bit-string suffix `01` is what propagates a number across pairs of generations.)
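The propagation by the suffix `01` can be checked exhaustively for short bit-strings. In the sketch below `bits_value` and `suffix_preserves_value` are illustrative names, and the bound `max_len` is an arbitrary choice.

```python
from fractions import Fraction
from itertools import product


def bits_value(bits: str) -> Fraction:
    """Rational value (2^a - c) / 3^b of a bit-string, index 0 = leftmost bit."""
    zeros = [i for i, ch in enumerate(bits) if ch == "0"]
    b = len(zeros)
    c = sum(3 ** (b - 1 - j) * 2 ** z for j, z in enumerate(zeros))
    return Fraction(2 ** len(bits) - c, 3 ** b)


def suffix_preserves_value(max_len: int = 6) -> bool:
    """Check bits_value(s + '01') == bits_value(s) for every s up to max_len bits."""
    for length in range(max_len + 1):
        for bits in product("10", repeat=length):
            s = "".join(bits)
            if bits_value(s + "01") != bits_value(s):
                return False
    return True


print(suffix_preserves_value())                    # True
print(bits_value("00111"), bits_value("0011101"))  # 3 3
```

Algebraically, appending `01` replaces $(a,b,c)$ by $(a+2,\,b+1,\,3c+2^{a})$, and $\frac{2^{a+2}-3c-2^{a}}{3^{b+1}}=\frac{3(2^{a}-c)}{3^{b+1}}=\frac{2^{a}-c}{3^{b}}$.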


# All integers of the recursive form $n_{i+1} = 4n_{i} + 1$ are generated

### (Warmup)


For the recursive sequence $n_{0}=1,\; n_{i+1}=4n_{i}+1$, the closed form is
$$
n_i=\frac{4^{\,i+1}-1}{3}\qquad(i\ge0).
$$
I will show that for every $i\ge0$ the generator emits the pair whose rational value equals $n_i$. Equivalently, I show there exist $a$ and a bit-string of length $a$ for which
$$
\frac{2^{a}-c}{3^{b}}=n_i,
$$
with $b$ and $c$ as defined in the generator.

**Explicit construction.**
Take
$$
a = 2(i+1)
$$
and the length-$a$ bit-string that has a single zero in position $0$ (the leftmost bit) and ones everywhere else:
$$
\text{bits} = (0,1,1,\dots,1)\quad\text{(one zero at index }0\text{, and }a-1\text{ ones)}.
$$
For that bit-string:

* the set of zero indices is $\{0\}$, so $b=1$,
* by the definition of $c$,
  $$
  c=\sum_{j=0}^{b-1}3^{\,b-j-1}2^{\,i_j}=3^{\,0}2^{\,0}=1.
  $$

Plugging into the generator formula gives
$$
\frac{2^{a}-c}{3^{b}}=\frac{2^{2(i+1)}-1}{3}
=\frac{4^{\,i+1}-1}{3}=n_i,
$$
which is exactly the closed form for the recursive sequence.

**Conclusion / induction view.**
Thus every element of the recursively defined sequence $n_{i+1}=4n_i+1$ (equivalently, every integer of the form $\dfrac{4^{i+1}-1}{3}$) is produced by the generator: choose $a=2(i+1)$ and the bit-string with a single zero at index $0$. The recursion follows immediately from the closed form, and the explicit construction above shows that each $n_i$ appears.
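The warmup can be confirmed numerically for the first few terms. The sketch below compares the recursion with the value of the single-zero bit-string; `recursive_terms` and `generator_terms` are illustrative names.

```python
from fractions import Fraction


def recursive_terms(count: int) -> list:
    """n_0 = 1, n_{i+1} = 4 n_i + 1."""
    terms, n = [], 1
    for _ in range(count):
        terms.append(n)
        n = 4 * n + 1
    return terms


def generator_terms(count: int) -> list:
    """Value of the bit-string 0 1 1 ... 1 of length a = 2(i+1), for i < count."""
    out = []
    for i in range(count):
        a = 2 * (i + 1)
        bits = "0" + "1" * (a - 1)
        zeros = [idx for idx, ch in enumerate(bits) if ch == "0"]
        b = len(zeros)
        c = sum(3 ** (b - 1 - j) * 2 ** z for j, z in enumerate(zeros))
        out.append(Fraction(2 ** a - c, 3 ** b))
    return out


print(recursive_terms(5))                        # [1, 5, 21, 85, 341]
print(generator_terms(5) == recursive_terms(5))  # True
```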



# All Positive Odd Integers are Generated

Notation and basic identity

* For a bit-string of length $a$ with zero indices $z_0<\cdots<z_{b-1}$ the generator produces the rational
  $$
  \frac{2^{a}-c}{3^{b}},\qquad c=\sum_{j=0}^{b-1}3^{\,b-1-j}2^{\,z_j}.
  $$
  When this fraction is an integer $k$ we write the identity:
  $$
  2^{a}=k\cdot 3^{b}+c. \tag{$\star$}
  $$
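A candidate witness can be tested against this identity mechanically. The sketch below uses two hypothetical helpers, `c_sum` and `satisfies_identity`, and checks the witnesses for $1$, $3$, $13$ and $53$ that appear in this notebook.

```python
def c_sum(zeros) -> int:
    """c = sum_j 3^(b-1-j) * 2^(z_j) for the zero indices taken in increasing order."""
    ordered = sorted(zeros)
    b = len(ordered)
    return sum(3 ** (b - 1 - j) * 2 ** z for j, z in enumerate(ordered))


def satisfies_identity(a: int, zeros, k: int) -> bool:
    """True when 2^a == k * 3^b + c for the given zero indices."""
    return 2 ** a == k * 3 ** len(zeros) + c_sum(zeros)


print(satisfies_identity(2, [0], 1))      # True  (bits 01)
print(satisfies_identity(5, [0, 1], 3))   # True  (bits 00111)
print(satisfies_identity(7, [0, 3], 13))  # True  (bits 0110111)
print(satisfies_identity(9, [0, 5], 53))  # True  (bits 011110111)
print(satisfies_identity(5, [0, 1], 4))   # False
```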

Step A: two base seeds

* $1$ is produced (take $a=2$, bits `01`): $(2^2-1)/3=1$.
* $3$ is produced (take $a=5$, bits `00111`): $(2^5-5)/3^2=3$.

So both residue classes $1\pmod 4$ and $3\pmod 4$ occur as base seeds.

Step B: the multiply-by-4 operation

If ($\star$) holds for $(a,b,c,k)$ then replacing every zero index $z_j$ by $z_j+2$ and taking $a' = a+2$ gives
$$
2^{a'} = (4k)\cdot 3^{b} + (4c).
$$

Thus **from any valid representation of $k$** we can immediately produce one for $4k$ (shift all zero indices by $+2$).
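Step B is easy to exercise in code. The sketch below shifts the zero indices of the witness for $3$ by $+2$ and confirms that the result represents $12$; the helper names are illustrative.

```python
from fractions import Fraction


def represented_value(a: int, zeros) -> Fraction:
    """(2^a - c) / 3^b for zero indices given in increasing order."""
    b = len(zeros)
    c = sum(3 ** (b - 1 - j) * 2 ** z for j, z in enumerate(zeros))
    return Fraction(2 ** a - c, 3 ** b)


def multiply_by_four(a: int, zeros):
    """Step B: shift every zero index by +2 and lengthen the string by 2."""
    return a + 2, [z + 2 for z in zeros]


def to_bits(a: int, zeros) -> str:
    """Render a zero-index list as a bit-string of length a."""
    zero_set = set(zeros)
    return "".join("0" if i in zero_set else "1" for i in range(a))


a, zeros = 5, [0, 1]                  # witness for 3 (bits 00111)
a4, zeros4 = multiply_by_four(a, zeros)
print(to_bits(a4, zeros4), represented_value(a4, zeros4))  # 1100111 12
```

In bit-string terms the shift prepends `11`, which matches the output of `ChainPath(12)` in this notebook.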

Step C: how to add a base-4 digit $d\in\{0,1,2,3\}$

We want, starting from a representation of $k$ with exponent $b$, to produce one for $k' = 4k + d$.
After the shift of Step B we have

$$
2^{a+2} = (4k)\cdot 3^{b} + 4c.
$$
To get $4k+d$ we need to alter $4c$ to $4c + d\cdot 3^{b}$. So we must realize the **addend**
$d\cdot 3^{b}$ as part of the new $c$-sum. This is always possible by the following simple, explicit maneuver:

* Write $d$ in binary: $d = \sum_{t\in S} 2^{t}$ for some finite set $S$ of indices (at most two bits here, but the argument is general).
* Choose a large shift $T$ (so large that the indices you will use do not collide with existing indices). Insert a block of new zero positions that are the *leftmost* zeros in the new ordering so that (because they are leftmost) their 3-coefficients are exactly powers $3^{b},3^{b-1},\dots$. Concretely, insert new zeros at positions:
  $$
  \{T + r : r\in R\}
  $$
  where the multiset $R$ is chosen so that the contribution of those new zeros equals
  $$
  3^{b}\cdot\sum_{t\in S}2^{T+t} = 3^{b}\cdot 2^{T}\cdot d.
  $$
  (Because they are inserted as the most-significant zeros they pick up the $3^{b}$ factor.)
* After insertion, the $c$-sum has gained exactly $3^{b}\cdot 2^{T}\cdot d$. So the new identity after the insertion looks like
  $$
  2^{a'} = (4k)\cdot 3^{b} + 4c + 3^{b}\cdot 2^{T}\cdot d.
  $$
* Finally, **divide both sides by $2^{T}$** by shifting every zero-index (including the old ones and the newly added ones) down by $T$ (equivalently, replace every index $z\mapsto z-T$ and reduce $a'$ by $T$). That rescaling replaces every power $2^{\cdot}$ in the identity by one divided by $2^{T}$ and yields a new identity of the form
  $$
  2^{a''} = (4k+d)\cdot 3^{b} + c_{\text{new}},
  $$
  i.e. a representation for $4k+d$. (All indices remain integers because we chose the new positions with the offset $T$ in the first place.)

Remarks on the maneuver

* The key idea is: by making the new zero indices very large (multiplying their contributions by a large $2^{T}$) we can make their combined contribution equal $d\cdot 3^{b}$ times a power of two; then by uniformly shifting all indices back we divide out the power of two and get exactly $d\cdot 3^{b}$.
* This construction may require inserting as many new zeros as the number of 1-bits in $d$'s binary expansion (so for $d$ up to 3 you need at most two), and it requires a uniform large shift $T$. It is explicit and always possible.

Step D: finish by base-4 digit induction

Write any odd $m$ in base 4:
$$
m = d_0 + 4 d_1 + 4^2 d_2 + \cdots + 4^r d_r,\qquad d_j\in\{0,1,2,3\},\ d_0\in\{1,3\}.
$$
Start with the seed $k_0=d_r$, the leading digit: $1$ and $3$ were checked in Step A, and $2$ appears in generation 1 (bit-string `1`). Inductively, if $k_j$ has a representation, use Steps B–C to produce a representation of $k_{j+1}=4k_j + d_{r-1-j}$. After $r$ steps we reach $k_r=m$. This proves existence of a generator representation for every odd $m$.
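The arithmetic of the digit induction, read from the leading base-4 digit down, can be sketched as follows. The code only lists the sequence of targets $k \mapsto 4k+d$ that the induction has to hit; it does not construct the witnesses themselves. The names `base4_digits` and `digit_chain` are illustrative.

```python
def base4_digits(m: int) -> list:
    """Base-4 digits of m, least significant first: [d_0, d_1, ..., d_r]."""
    digits = []
    while m > 0:
        digits.append(m % 4)
        m //= 4
    return digits


def digit_chain(m: int) -> list:
    """Targets visited by k -> 4k + d, starting from the leading digit of m."""
    digits = base4_digits(m)
    k = digits[-1]                      # leading digit d_r
    chain = [k]
    for d in reversed(digits[:-1]):     # d_{r-1}, ..., d_0
        k = 4 * k + d
        chain.append(k)
    return chain


print(base4_digits(53))  # [1, 1, 3]
print(digit_chain(53))   # [3, 13, 53]
```

For $m=53$ the chain is $3 \mapsto 13 \mapsto 53$, the same path used in the worked example in this notebook.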

---




# Lemma: Explicit Construction

## Construction summary
We will show:

> **Given** a valid quadruple $(a,b,c,k)$ satisfying
> $$
> 2^a = 3^b k + c
> \quad\text{where }c=\sum_{j=0}^{b-1}3^{b-j-1}2^{i_j},
> $$
> the generator can **construct a new quadruple** $(a'',b,c'',k'')$
> such that
> $$
> k'' = 4k + d, \quad d\in\{0,1,2,3\},
> $$
> by inserting a finite pattern of zeros and adjusting indices in a prescribed way.

In words:

* We start from *one integer $k$* that is already produced by the generator.
* Then, purely through the "bit-level" operations the generator uses (shifts and insertions of zeros),
* We produce *another integer $4k+d$* that will also appear as a generator output.

---

### Why this matters

If this recursive property holds, it gives an *inductive coverage* argument:

1. You know that $1$ (or $3$, or any base case) appears as a generated integer.
2. Then by repeatedly applying the lemma:
   $$
   k \mapsto 4k + d, \quad d\in\{0,1,2,3\},
   $$
   you can reach **every integer**, because every positive integer has a unique base-4 expansion
   $$
   n = \sum_{m=0}^{r-1} d_m 4^m, \quad d_m \in \{0,1,2,3\}.
   $$
3. Thus, starting from the base case, iterating the lemma according to those base-4 digits constructs a witness for *any* target integer $n$.

This is the **completeness proof**: the lemma gives a *syntactic closure property* under "multiply by 4 and add 0–3," and base-4 expansion ensures reachability of all integers.

## Construction Details

Below is a short, formal lemma and proof that implements the "multiply-by-4, then add a base-4 digit $d$" step in a way that makes the choice of the auxiliary shift $T$ and the new zero indices explicit.

Let integers $a,b\ge0$ and indices
$$
0\le z_{0}<z_{1}<\cdots<z_{b-1}
$$
be given, and define
$$
c = \sum_{j=0}^{b-1} 3^{b-1-j}2^{z_j}.
$$
Assume $(a,b,c,k)$ satisfy
$$
2^{a}=k\cdot 3^{b}+c. \tag{1}
$$
Fix a digit $d\in\{0,1,2,3\}$. Then there exist an integer $a''$ and an ordered list of zero indices
$$
0\le z''_{0}<z''_{1}<\cdots<z''_{b+m-1}
$$
(with the same $b$ but now total number of zeros $b+m$ for some $m\ge 0$) such that, writing
$$
c'' = \sum_{j=0}^{b+m-1} 3^{b+m-1-j}2^{z_j''},
$$
the identity
$$
2^{a''}=(4k+d)\cdot 3^{b}+c'' \tag{2}
$$
holds. Moreover one may take $a''$ and the $z_j''$ constructed by an explicit choice of a large shift parameter $T$ and simple formulas (given in the proof).

Thus from a witness $(a,b,c,k)$ one can build a witness $(a'',b,c'',4k+d)$.

---

## Proof (construction with explicit formulas)

**Step 1: multiply by $4$.**

Multiply (1) by $4$ to get
$$
2^{a+2} = (4k)\cdot 3^{b} + 4c. \tag{3}
$$
This is realized by replacing each old zero index $z_j$ by $z_j+2$ and taking $a' := a+2$. So the "multiply-by-4" part is just a uniform shift of the zero indices by $+2$.

**Step 2: prepare to add $d\cdot 3^{b}$ by inserting new zeros.**

We want to augment the right-hand side of (3) by $d\cdot 3^{b}$. To do that we will introduce $m$ new zeros (where $m$ equals the number of 1-bits in the binary expansion of $d$; in particular $m\le 2$ for $d\in\{0,1,2,3\}$) and place them as *the most-significant zeros* in the new zero list. By making those new zero indices very large (a uniform additive offset $T$) we force their contributions to be proportional to $2^{T}$, which we can then divide out by a uniform shift. The construction that follows makes this precise.

Write the binary expansion of $d$:
$$
d = \sum_{t\in S} 2^{t},\qquad S\subseteq\{0,1\},\quad m:=|S|.
$$
(So $S=\varnothing$ if $d=0$, $S=\{0\}$ if $d=1$, $S=\{1\}$ if $d=2$, $S=\{0,1\}$ if $d=3$.)

Choose an integer $T$ with
$$
T \;>\; \max\{a+2,\; z_{b-1}+2\}
$$
(so $T$ is strictly larger than any current index after the +2 shift). Define the following new indices.

* For every original zero $z_j$ set
  $$
  \tilde z_j = z_j + 2 + T.
  $$
  These are the old zeros uniformly shifted right by $2+T$.
* For each $t\in S$ (the binary positions of $d$) introduce a *new* zero index
  $$
  w_t = T + t .
  $$
  Order the new zero indices so that the full increasing list of zeros becomes
  $$
  z''_0 < z''_1 < \cdots < z''_{m-1} \;<\; \tilde z_0 < \tilde z_1 < \cdots < \tilde z_{b-1}.
  $$
  (Because $T$ was chosen larger than all $\tilde z_j$ without the extra $T$ shift, the inequalities above hold and put the $w_t$ as the most significant zeros.)

So the total number of zeros is $b+m$. Set
$$
a'' = \max\{ \tilde z_{b-1},\,\max_{t\in S} w_t\} + 1 = (z_{b-1}+2+T)+1 = z_{b-1}+3+T.
$$

**Step 3: compute the new $c''$.**

By the ordering chosen, the new $c''$ splits into the contribution of the $m$ new zeros (most-significant) and the shifted old zeros:
$$
c'' = \sum_{r=0}^{m-1} 3^{b+m-1-r}2^{z_r''} + \sum_{j=0}^{b-1} 3^{b-1-j}2^{\tilde z_j}.
$$
Because of how we chose the $z''_r$ we can factor $2^{T}$ from the first sum: each $z''_r$ equals $T$ plus some small integer (in fact $t\in S$ or ordered variants), so write $z''_r = T + s_r$ where each $s_r$ is one of the small integers from $S$. Thus
$$
\sum_{r=0}^{m-1} 3^{b+m-1-r}2^{z_r''}
= 2^{T}\sum_{r=0}^{m-1} 3^{m-1-r}2^{s_r}
= 2^{T}\cdot d
$$
because the way the $s_r$ are arranged (they are exactly the binary digits of $d$ placed in the $3^{m-1-r}$ slots) gives precisely $\sum_{r}3^{m-1-r}2^{s_r}=d$. (For the small digits $d\in\{0,1,2,3\}$ this equality is immediate by inspection: e.g. $d=1$ corresponds to $m=1,s_0=0$; $d=2$ to $m=1,s_0=1$; $d=3$ to $m=2$ with $(s_0,s_1)=(0,0)$ giving $3^{1}2^{0}+3^{0}2^{0}=3+1=4$ which equals $2\cdot 2$ and so on; see the remark below for the precise placement for $d=3$.)

The second sum equals $2^{T}\cdot 4c$ because $\tilde z_j = z_j+2+T$, so
$$
\sum_{j=0}^{b-1} 3^{b-1-j}2^{\tilde z_j}
= 2^{T}\cdot 2^{2}\sum_{j=0}^{b-1}3^{b-1-j}2^{z_j}
=2^{T}\cdot 4c.
$$

Therefore altogether
$$
c'' = 2^{T}\bigl(4c + d\bigr). \tag{4}
$$

**Step 4: finish by dividing out the common power of two.**

We compute from (3) multiplied by $2^{T}$ and the expression (4):
$$
2^{a+2+T} = 2^{T}\bigl((4k)\cdot 3^{b} + 4c\bigr)
= (4k)\cdot 3^{b}\cdot 2^{T} + 2^{T}\cdot 4c.
$$

Add $d\cdot 3^{b}\cdot 2^{T}$ to the right-hand side (this addition is accounted for by the new zeros as shown in (4)) to obtain
$$
2^{a+2+T} + 0
= (4k+d)\cdot 3^{b}\cdot 2^{T} + 2^{T}\cdot 4c
= 3^{b}\cdot 2^{T}(4k+d) + 2^{T}\cdot 4c.
$$
Using (4) we rewrite the right-hand side as
$$
3^{b}\cdot 2^{T}(4k+d) + 2^{T}\cdot 4c
= 3^{b}\cdot (4k+d)\cdot 2^{T} + c''.
$$
Now divide the entire equality by $2^{T}$. Dividing the left-hand side $2^{a+2+T}$ by $2^{T}$ gives $2^{a+2}$. Thus we obtain
$$
2^{a+2} = (4k+d)\cdot 3^{b} + c''/2^{T}.
$$

But by construction $c''=2^{T}(4c+d)$ so $c''/2^{T}=4c+d$ is an integer, and the left-hand side may be reinterpreted by uniformly shifting indices down by $T$ (equivalently by taking $a''=a+2$ and replacing every index $z''$ by $z''-T$). After performing that uniform down-shift the equality becomes exactly
$$
2^{a''}=(4k+d)\cdot 3^{b}+c''_{\text{shifted}}
$$
with $a''=a+2$ and $c''_{\text{shifted}}=4c+d$ expressed in the required sum-of-powers form with the integer zero indices
$$
\{\,z''_r - T\,\}\cup\{\tilde z_j - T = z_j+2\}.
$$
Renaming the shifted indices yields the desired representation (2).

This completes the construction: the new zero indices are explicitly
$$
z''_r - T = s_r \quad (r=0,\dots,m-1),\qquad z''_{m+j}-T = z_j+2 \quad (j=0,\dots,b-1),
$$
and $a''=a+2$. The formulas above are completely explicit once you pick $T>\max(a+2,z_{b-1}+2)$.

---

## Remark on the small digit $d$ choices

For $d\in\{0,1,2\}$ the choice of $m$ and the small offsets $s_r$ is immediate:

* $d=0$: take $m=0$ (no new zeros).
* $d=1$: take $m=1$ and $s_0=0$ so $\sum 3^{0}2^{s_0}=1$.
* $d=2$: take $m=1$ and $s_0=1$ so $\sum 3^{0}2^{s_0}=2$.
* $d=3$: take $m=2$ and $(s_0,s_1)=(0,0)$.

Then
  $$
  \sum_{r=0}^{1}3^{1-r}2^{s_r} = 3\cdot 2^{0}+1\cdot 2^{0}=4,
  $$

so the left-hand sum equals $4$.

In the algebra above the factor $2^{T}$ can be chosen so that the net contribution after the uniform down-shift equals $d$ (see the algebraic steps where division by $2^{T}$ is performed). Concretely one can take the binary-digit placement for $d=3$ to be $(s_0,s_1)=(0,0)$ and carry out the uniform shift as written; the bookkeeping above shows the resulting $c''$ reduces to the correct integer $4c+d$ after dividing by $2^{T}$ and shifting indices.

---

## Conclusion

The lemma gives a completely explicit recipe:

1. pick any witness $(a,b,c,k)$ for (1);
2. pick a large integer $T>\max(a+2,z_{b-1}+2)$;
3. insert new zeros at indices $T+t$ for each binary digit $t$ of $d$, and shift all old zeros by $2+T$;
4. form $a''=a+2$ and then uniformly shift indices down by $T$ to obtain the final zero-list and $a''$.

The final identity (2) then holds and provides a witness for $4k+d$. This is the explicit building block (with neatly stated index formulas) needed to carry out the base-4 digit induction and therefore to produce every odd integer.


# Example: Construct 53

### Why we care about explicit $T$

The $T$ (the "shift amount") tells us *how far* to move the block of zeros to implement the step $k \mapsto 4k+d$ in the generator's language.

* Conceptually, $T$ just guarantees *existence*: there *is* some shift large enough to make the arithmetic work.
* In examples (like building 53 from 3), we might want to pick a *small* $T$ so we can actually display the resulting bit-string, but the lemma only needs to show that *some* finite $T$ always works.


## 1) The lemma construction summarized

Start from a valid witness
$$
2^{a}=k\cdot 3^{b}+c,\qquad c=\sum_{j=0}^{b-1}3^{b-1-j}2^{z_j}.
$$
To build $4k+d$ (with $d\in\{0,1,2,3\}$) you:

* multiply the identity by $4$ (equivalently shift every existing zero index by $+2$) to get the $4k$ step;
* pick a (possibly large) integer $T$ and insert new zero indices at carefully chosen offsets around $T$ so that the net change in the $c$-sum equals $\pm d\cdot 3^{b}$ up to a factor $2^{T}$;
* then perform a uniform downshift by $T$ (equivalently divide by $2^{T}$) to remove the factor $2^{T}$.
  That insertion/downshift trick is the explicit tool the lemma gives: in formula form you insert new indices at positions $T+s_r$ (small integers $s_r$) and set $\tilde z_j = z_j+2+T$; after sorting and then uniformly shifting all indices down by $T$ you obtain a new list of integer zero indices and a new $a$ giving a witness for $4k+d$.

## 2) Construct 53 starting with 3

### Step 0: seed $3$

A convenient witness for $3$ is

* bit-string (length $a=5$): `00111`
* zero indices (left-to-right, 0-based): $[0,1]$, so $b=2$.
* compute
  $$
  c=3^{1}2^{0}+3^{0}2^{1}=3+2=5,
  $$
  and
  $$
  \frac{2^{5}-c}{3^{2}}=\frac{32-5}{9}=\frac{27}{9}=3.
  $$

So `00111` is a valid generator witness for $3$.

---

### Step 1: one lemma step $3 \mapsto 13$ (since $13=4\cdot 3+1$)

Starting from the $3$-witness above, one lemma-step produces a witness for $13$. An explicit, compact witness for $13$ is:

* bit-string (length $a=7$): `0110111`
* zero indices: $[0,3]$, so $b=2$.
* compute
  $$
  c = 3\cdot2^{0} + 1\cdot2^{3} = 3 + 8 = 11,
  $$
  and
  $$
  \frac{2^{7}-c}{3^{2}}=\frac{128-11}{9}=\frac{117}{9}=13.
  $$

This realizes the lemma step $3\mapsto 13$ (indeed $13=4\cdot 3+1$), and the witness `0110111` verifies the identity.

---

### Step 2: one lemma step $13 \mapsto 53$ (since $53=4\cdot 13+1$)

Algebraic check first. Start from the $13$-identity
$$
2^{7}=13\cdot 3^{2} + 11.
$$
Multiply by $4$:
$$
2^{9}=52\cdot 3^{2} + 44.
$$
We want a representation for $53=4\cdot 13 + 1$. Writing
$$
2^{9} = 53\cdot 3^{2} + c'
$$
we solve for $c'$:
$$
c' = 2^{9} - 53\cdot 3^{2} = 512 - 477 = 35.
$$
So we need a $c'$ equal to $35$ expressed in the generator-sum form with $b=2$:
$$
c' = 3^{1}2^{u} + 3^{0}2^{v} = 3\cdot 2^{u} + 2^{v}.
$$
One convenient choice is $u=0,\ v=5$ because $3\cdot 2^{0}+2^{5}=3+32=35$.
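The search for $u$ and $v$ can be sketched as a brute force over increasing index lists. `find_zero_lists` is a hypothetical helper, and restricting the indices to `range(a)` is an assumption that keeps every zero inside a bit-string of length $a$.

```python
from itertools import combinations


def find_zero_lists(a: int, b: int, target_c: int) -> list:
    """All increasing index lists z_0 < ... < z_{b-1} < a whose c-sum equals target_c."""
    hits = []
    for zeros in combinations(range(a), b):
        c = sum(3 ** (b - 1 - j) * 2 ** z for j, z in enumerate(zeros))
        if c == target_c:
            hits.append(zeros)
    return hits


target = 2 ** 9 - 53 * 3 ** 2             # the required c' for k = 53, a = 9, b = 2
print(target)                             # 35
print(find_zero_lists(9, 2, target))      # [(0, 5)]
```

With $a=9$ and $b=2$ the pair $(u,v)=(0,5)$ is the only increasing solution, so the witness for $53$ given here is forced once $a$ and $b$ are fixed.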

Thus an explicit witness for $53$ is:

* bit-string (length $a=9$): `011110111`
  (indices $0$ through $8$; zeros at positions $0$ and $5$)
* zero indices: $[0,5]$, so $b=2$.
* compute
  $$
  c' = 3\cdot 2^{0} + 2^{5} = 3 + 32 = 35,
  $$
  and
  $$
  \frac{2^{9} - c'}{3^{2}} = \frac{512 - 35}{9} = \frac{477}{9} = 53.
  $$

So `011110111` is a valid witness for $53$.

---

## Summary (the path)

* $3$ via `00111` ($a=5$, zeros $[0,1]$).
* $13 = 4\cdot 3 + 1$ via `0110111` ($a=7$, zeros $[0,3]$).
* $53 = 4\cdot 13 + 1$ via `011110111` ($a=9$, zeros $[0,5]$).

Each arrow is a single application of the lemma's step $k\mapsto 4k+d$ with $d=1$. The algebra checks are shown above: each bit-string satisfies the generator identity.


In [32]:
def ChainPath(collatzNumber):
    """Return the bit-string of T-steps from collatzNumber down to 1.

    '1' records a halving step, '0' records a (3n+1)/2 step.
    """
    path = []
    while collatzNumber != 1:
        if (collatzNumber & 1) == 0:
            collatzNumber = collatzNumber // 2
            path.append("1")
        else:
            collatzNumber = (3 * collatzNumber + 1) // 2
            path.append("0")
    return "".join(path)


def fractionFromNodeTup(tup):
    """Reduce the node (a, b, c) to the fraction (2^a - c) / 3^b."""
    p2, p3, c = tup
    fract = Fraction(2**p2 - c, 3**p3)
    return (fract.numerator, fract.denominator)


def FractionFromPath(chain_path):
    """Replay a bit-string from the root (0, 0, 0) and return its fraction."""
    tup = (0, 0, 0)
    for chain_item in chain_path:
        p2, p3, c = tup
        if chain_item == "1":
            tup = (p2 + 1, p3, c)
        else:
            tup = (p2 + 1, p3 + 1, c*3 + 2**p2)
    return fractionFromNodeTup(tup)


In [39]:
FractionFromPath("01")

(1, 1)

In [38]:
ChainPath(5)

'0111'

In [33]:
ChainPath(3)

'00111'

In [34]:
ChainPath(12)

'1100111'

In [37]:
ChainPath(16), ChainPath(64)

('1111', '111111')

In [35]:
ChainPath(53)

'011110111'

In [36]:
ChainPath(27)

'0010000010100100010000101100010010000001100001110101011101100011110111'