<a href="https://colab.research.google.com/github/joshtburdick/misc/blob/master/plog/Factoring3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Further attempt at factoring

This is to test a simpler variant of the factoring method that used loopy belief propagation, but without the loopy belief propagation.

The main idea is, given $n$, to solve $n \equiv (a+b)(a-b) \pmod m$, where $m = \prod p_i$ for $P$ (smallish) primes. Then check $\mathrm{GCD}(n, x)$ for $2^{2P}$ numbers $x$ which are derived from $a$ and $b$ using the Chinese Remainder Theorem.

N.B.: This seems fairly impractical, as it requires computing GCD $2^{2P}$ times, for multiple choices of $a+b$ and $a-b$.
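
To get a feel for how quickly that count grows, here is a small sketch that just evaluates $2^{2P}$ for a few values of $P$. The helper name `num_gcd_checks` is made up for this illustration; it is an upper bound on the number of GCDs for a single choice of $a+b$ and $a-b$.

```python
def num_gcd_checks(num_primes):
    """Upper bound on GCD computations for one choice of a+b and a-b.

    There are 2**P sign patterns for a and 2**P for b, hence 2**(2P) pairs.
    """
    return 2 ** (2 * num_primes)


for P in (3, 5, 7, 9):
    print(f"P = {P}: up to {num_gcd_checks(P)} GCDs")
```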

In [1]:
!pip install --quiet modulo

In [2]:
import itertools
import math

import numpy as np

from modulo import modulo

We'll need some Chinese Remainder Theorem utilities.

In [3]:
def solve_mod_primes(x_mod, primes):
    """Given what x is (mod some primes), solve for x.

    x_mod: an array of small integers, such that x % primes[i] == x_mod[i]
    primes: an array of primes

    Returns: x, in the range 0 <= x < product(primes),
        satisfying x % primes[i] == x_mod[i].
    """
    x = modulo(x_mod[0], primes[0])
    for i in range(1, len(primes)):
        x &= modulo(x_mod[i], primes[i])
    return int(x)
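
For cross-checking, here is a minimal sketch of the same Chinese Remainder Theorem step using only the standard library. The function name `crt` is hypothetical (it is not part of the `modulo` package), and it assumes the moduli are pairwise coprime and Python 3.8+ (for `pow(m, -1, p)`).

```python
def crt(residues, moduli):
    """Find x with x % moduli[i] == residues[i], for pairwise-coprime moduli.

    Returns x in the range 0 <= x < product(moduli).
    """
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Choose t so that x + m*t is congruent to r modulo p.
        t = ((r - x) * pow(m, -1, p)) % p
        x += m * t
        m *= p
    return x


# Classic example: x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7)
x = crt([2, 3, 2], [3, 5, 7])
print(x, x % 3, x % 5, x % 7)
```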

Let $n$ be the number to be factored (WLOG assume $n$ is odd). We want to write $n=(a+b)(a-b)$.

### Factoring $n \bmod m$

Fermat's method writes $n = a^2-b^2 = (a+b)(a-b)$. For convenience, let $x = a+b$ and $y = a-b$.
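
As a reminder of what plain Fermat factoring looks like over the integers (not the mod-$m$ variant explored in this notebook), here is a minimal sketch. The name `fermat_factor` is made up for this example, and it assumes $n$ is odd and positive.

```python
import math


def fermat_factor(n):
    """Return (a, b) with n == a*a - b*b, for odd positive n.

    Starts at a = ceil(sqrt(n)) and increases a until a*a - n is a square.
    """
    a = math.isqrt(n)
    if a * a < n:
        a += 1
    while True:
        b2 = a * a - n
        b = math.isqrt(b2)
        if b * b == b2:
            return a, b
        a += 1


a, b = fermat_factor(29 * 31)
print(f"a = {a}, b = {b}, x = a+b = {a + b}, y = a-b = {a - b}")
```

For $n = 29 \cdot 31 = 899$ this stops immediately at $a = 30$, $b = 1$, since the two factors are close together.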

I had initially assumed that we can always take $y=1$. This doesn't always work. Therefore, we search over increasing values of $y$, starting with 1. (We skip numbers divisible by any of the $p_i$, since those have no inverse mod $m$.)

This guarantees that we'll *eventually* find a factor (when $x$ or $y$ is a factor of $n$). This is essentially trial division -- hopefully the method will find a factor sooner, but it's unclear if that will happen.

Given a value of $y$, it's easy enough to solve for $x$ modulo $m$, using the Chinese Remainder Theorem. (Especially using the `modulo` library.)
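
Without the library, the same step can be sketched with Python's built-in modular inverse (`pow(y, -1, m)`, available from Python 3.8). The helper name `solve_for_x` is an assumption for this illustration, and it requires $y$ to be coprime to $m$.

```python
import math


def solve_for_x(n, y, m):
    """Solve x*y = n (mod m) for x, assuming gcd(y, m) == 1."""
    return (n * pow(y, -1, m)) % m


primes = [3, 5, 7, 11, 13]
m = math.prod(primes)
n = 17 * 1001
y = 17
x = solve_for_x(n, y, m)
print(f"m = {m}, x = {x}, x*y % m = {x * y % m}, n % m = {n % m}")
```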


In [4]:
primes = [3,5,7,11,13]
m = math.prod(primes)
y = modulo(17, m)
n = modulo(17*1001, m)

print(f"y = {y}, n = {n}")
x = n // y
print(f"n // y = {x}")
print(f"x*y = {x*y}")
print(f"x*y % m = {x*y % m}")


y = modulo(17, 15015), n = modulo(2002, 15015)
n // y = modulo(1001, 15015)
x*y = modulo(2002, 15015)
x*y % m = modulo(2002, 15015)


In [5]:
int(modulo(5-2, 17))

3

In [6]:
def compute_a_and_b(n, y, m):
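  """Solves for $x$ in $xy = n (mod m)$, then derives a and b.

  n: number to be factored
  y: one of the two factors (mod m); the other, x, is solved for
  m: the modulus (assumed odd, so that dividing by 2 is possible)
  Returns: (a, b), as integers mod m, with a+b == y and a-b == x (mod m)
  """
  x = modulo(n, m) // y
  # print(x, y)
  # I think that we want to make sure that y-x is even;
  # this may not matter? (Swapping x and y leaves the parity of y-x
  # unchanged, and 2 is invertible mod an odd m anyway.)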
  if (int(y) - int(x)) % 2 != 0:
    print("switched x and y")
    (x, y) = (y, x)
  b = (y - x) // 2
  a = x + b
  return (int(a), int(b))

In [7]:
# test of this
compute_a_and_b(29*31, 1, 7*11*13*17*19*23)

(450, 7435980)
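
A quick sanity check of that output, using only integer arithmetic: the returned pair should satisfy $a^2 - b^2 \equiv n \pmod m$. The helper name `check_difference_of_squares` is made up for this sketch.

```python
def check_difference_of_squares(n, a, b, m):
    """True if a*a - b*b is congruent to n modulo m."""
    return (a * a - b * b) % m == n % m


m = 7 * 11 * 13 * 17 * 19 * 23
n = 29 * 31
a, b = 450, 7435980  # the values printed above

print(check_difference_of_squares(n, a, b, m))
print((a + b) % m, (a - b) % m)  # the two "factors" mod m
```

Here $b \equiv -449 \pmod m$, so $a+b \equiv 1$ and $a-b \equiv 899 = n$, which is the trivial factorization mod $m$.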

In [8]:
def factor(n, primes):
    """Factor n using the Chinese Remainder Theorem.

    n: the number to factor
    primes: an array of primes
    Returns: a nontrivial factor of n, or None on failure
    """
    a_mod_m, b_mod_m = compute_a_and_b(n, 1, math.prod(primes))
    print(f"a_mod_m = {a_mod_m}, b_mod_m = {b_mod_m}")
    a = [[a_mod_m % p, -a_mod_m % p] for p in primes]
    b = [[b_mod_m % p, -b_mod_m % p] for p in primes]
    print(f"a = {a}")
    print(f"b = {b}")
    # get all possible a +/- b ("generalized", for however
    # many prime factors)
    a_mod_m = [solve_mod_primes(a1, primes)
        for a1 in itertools.product(*a)]
    b_mod_m = [solve_mod_primes(b1, primes)
        for b1 in itertools.product(*b)]
    # check GCD of each of these
    for (a1,b1) in itertools.product(a_mod_m, b_mod_m):
        f = math.gcd(n, a1+b1)
        if f != 1 and f != n:
            # print(f"f = {f}")
            return f
    return None
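
The reason `factor` flips signs independently for each prime is that every combination of $\pm a \bmod p_i$ recombines (via CRT) into a number whose square is still $a^2 \bmod m$. Here is a small standard-library sketch of that fact; `crt` and `sign_variants` are hypothetical helper names, not part of the notebook's code.

```python
import itertools
import math


def crt(residues, moduli):
    """Combine residues for pairwise-coprime moduli into one integer."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        t = ((r - x) * pow(m, -1, p)) % p
        x += m * t
        m *= p
    return x


def sign_variants(a, primes):
    """All CRT combinations of (a mod p) and (-a mod p), one choice per prime."""
    choices = [(a % p, -a % p) for p in primes]
    return [crt(r, primes) for r in itertools.product(*choices)]


primes = [11, 13, 17]
m = math.prod(primes)
variants = sign_variants(8, primes)
print(variants)
print(all(v * v % m == 8 * 8 % m for v in variants))
```

With $P$ primes (and $a$ nonzero mod each of them) there are $2^P$ such variants, which is where the $2^{2P}$ pairs of variants for $a$ and $b$ come from.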

Some tests:

In [9]:
primes = [11,13,17]

In [10]:
factor(3*5, primes)

a_mod_m = 8, b_mod_m = 2424
a = [[8, 3], [8, 5], [8, 9]]
b = [[4, 7], [6, 7], [10, 7]]


3

In [11]:
factor(5*7, primes)

a_mod_m = 18, b_mod_m = 2414
a = [[7, 4], [5, 8], [1, 16]]
b = [[5, 6], [9, 4], [0, 0]]


5

In [12]:
primes = [11,13,17,19,23]

In [13]:
factor(29*31, primes)

a_mod_m = 450, b_mod_m = 1061898
a = [[10, 1], [8, 5], [8, 9], [13, 6], [13, 10]]
b = [[2, 9], [6, 7], [10, 7], [7, 12], [11, 12]]


31

In [14]:
primes = [11,13,17,19,23,29,31]

In [15]:
factor(37*41, primes)

a_mod_m = 759, b_mod_m = 955049195
a = [[0, 0], [5, 8], [11, 6], [18, 1], [0, 0], [5, 24], [15, 16]]
b = [[1, 10], [9, 4], [7, 10], [2, 17], [1, 22], [25, 4], [17, 14]]


37

In [16]:
factor(41*43, primes)

a_mod_m = 882, b_mod_m = 955049072
a = [[2, 9], [11, 2], [15, 2], [8, 11], [8, 15], [12, 17], [14, 17]]
b = [[10, 1], [3, 10], [3, 14], [12, 7], [16, 7], [18, 11], [18, 13]]


43

In [17]:
factor(3*47, primes)

a_mod_m = 71, b_mod_m = 955049883
a = [[5, 6], [6, 7], [3, 14], [14, 5], [2, 21], [13, 16], [9, 22]]
b = [[7, 4], [8, 5], [15, 2], [6, 13], [22, 1], [17, 12], [23, 8]]


3

In [18]:
factor(47*59, primes)

a_mod_m = 1387, b_mod_m = 955048567
a = [[1, 10], [9, 4], [10, 7], [0, 0], [7, 16], [24, 5], [23, 8]]
b = [[0, 0], [5, 8], [8, 9], [1, 18], [17, 6], [6, 23], [9, 22]]


59

In [19]:
# just confirming that, even though 47*59 isn't divisible by 11,
# (n-1)/2 == 1386 (which is -b mod m, so b % 11 == 0 above) is
a = (47*59-1) // 2
a, a % 11

(1386, 0)

In [20]:
factor(61*67, primes)

a_mod_m = 2044, b_mod_m = 955047910
a = [[9, 2], [3, 10], [4, 13], [11, 8], [20, 3], [14, 15], [29, 2]]
b = [[3, 8], [11, 2], [14, 3], [9, 10], [4, 19], [16, 13], [3, 28]]


61

## Slightly more testing

It seems to work for small numbers. What about slightly larger numbers?

In [21]:
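def is_prime(n):
    """Trial-division primality test (fine for the small n used here)."""
    if n < 2:
        return False
    # math.isqrt avoids floating-point rounding in the square root
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True

def get_primes(n_primes):
    """Returns a list of the first n_primes primes."""
    primes = []
    num = 2
    while len(primes) < n_primes:
        if is_prime(num):
            primes.append(num)
        num += 1
    return primes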

primes = get_primes(1000)
display(primes[:10]) # display the first 10 primes

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

In [22]:
# use some small primes (starting with 7) for m
small_primes = primes[3:12]
print(small_primes)
larger_primes = primes[300:310]
m = math.prod(small_primes)
print(f"m = {m}")

for (a, b) in itertools.combinations(larger_primes, 2):
    n = a*b
    if n > m:
        continue
    print(f"n = {n} = {a} * {b}", flush=True)
    f = factor(n, small_primes)
    if f is not None:
        print(f"f = {f}\n", flush=True)
    else:
        print(f"failed to factor {n} = {a}*{b}", flush=True)
        break

[7, 11, 13, 17, 19, 23, 29, 31, 37]
m = 247357937827
n = 3980021 = 1993 * 1997
a_mod_m = 1990011, b_mod_m = 247355947817
a = [[2, 5], [1, 10], [10, 3], [8, 9], [8, 11], [5, 18], [2, 27], [28, 3], [3, 34]]
b = [[6, 1], [0, 0], [4, 9], [10, 7], [12, 7], [19, 4], [28, 1], [4, 27], [35, 2]]
f = 1993

n = 3984007 = 1993 * 1999
a_mod_m = 1992004, b_mod_m = 247355945824
a = [[0, 0], [3, 8], [1, 12], [12, 5], [6, 13], [20, 3], [23, 6], [6, 25], [35, 2]]
b = [[1, 6], [9, 2], [0, 0], [6, 11], [14, 5], [4, 19], [7, 22], [26, 5], [3, 34]]
f = 1993

n = 3991979 = 1993 * 2003
a_mod_m = 1995990, b_mod_m = 247355941838
a = [[3, 4], [7, 4], [9, 4], [3, 14], [2, 17], [4, 19], [7, 22], [24, 7], [25, 12]]
b = [[5, 2], [5, 6], [5, 8], [15, 2], [18, 1], [20, 3], [23, 6], [8, 23], [13, 24]]
f = 2003

n = 4007923 = 1993 * 2011
a_mod_m = 2003962, b_mod_m = 247355933866
a = [[2, 5], [4, 7], [12, 1], [2, 15], [13, 6], [18, 5], [4, 25], [29, 2], [5, 32]]
b = [[6, 1], [8, 3], [2, 11], [16, 1], [7, 12], [6, 17], [2

However, if we don't use "enough" prime factors, using $a-b=1$ sometimes doesn't work:

In [23]:
# use some small primes (starting with 7) for m
small_primes = primes[3:9]
print(small_primes)
larger_primes = primes[302:305]
m = math.prod(small_primes)
print(f"m = {m}")

for (a, b) in itertools.combinations(larger_primes, 2):
    n = a*b
    if n > m:
        continue
    print(f"n = {n} = {a} * {b}", flush=True)
    f = factor(n, small_primes)
    if f is not None:
        print(f"f = {f}\n", flush=True)
    else:
        print(f"failed to factor {n} = {a}*{b}", flush=True)
        break

[7, 11, 13, 17, 19, 23]
m = 7436429
n = 4003997 = 1999 * 2003
a_mod_m = 2001999, b_mod_m = 5434431
a = [[6, 1], [10, 1], [12, 1], [11, 6], [7, 12], [10, 13]]
b = [[2, 5], [2, 9], [2, 11], [7, 10], [13, 6], [14, 9]]
f = 2003

n = 4019989 = 1999 * 2011
a_mod_m = 2009995, b_mod_m = 5426435
a = [[1, 6], [9, 2], [0, 0], [0, 0], [4, 15], [2, 21]]
b = [[0, 0], [3, 8], [1, 12], [1, 16], [16, 3], [22, 1]]
failed to factor 4019989 = 1999*2011


We could guarantee that this would always work, by trying many random values of $a+b$ and $a-b$ -- eventually, we'd just find those which are factors of $n$! This obviously might take a while...
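
One way to experiment with this is to loop over successive values of $y = a-b$ rather than fixing $y=1$. The following is only a sketch using the standard library: `crt`, `try_y`, and `factor_search` are hypothetical names, the logic mirrors `factor` above (including taking the GCD with the unreduced sum of the two CRT values), and it assumes all the primes are odd. Whether a small $y$ actually succeeds on the failing example above is not something this sketch establishes.

```python
import itertools
import math


def crt(residues, moduli):
    """Combine residues for pairwise-coprime moduli into one integer."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        t = ((r - x) * pow(m, -1, p)) % p
        x += m * t
        m *= p
    return x


def try_y(n, primes, y):
    """Try one value of y; return a nontrivial factor of n, or None."""
    m = math.prod(primes)
    x = (n * pow(y, -1, m)) % m      # solve x*y = n (mod m)
    half = pow(2, -1, m)             # exists because m is odd
    a = ((x + y) * half) % m
    b = ((y - x) * half) % m
    a_variants = [crt(r, primes) for r in
                  itertools.product(*[(a % p, -a % p) for p in primes])]
    b_variants = [crt(r, primes) for r in
                  itertools.product(*[(b % p, -b % p) for p in primes])]
    for a1, b1 in itertools.product(a_variants, b_variants):
        f = math.gcd(n, a1 + b1)
        if 1 < f < n:
            return f
    return None


def factor_search(n, primes, max_y):
    """Try y = 1, 2, ..., max_y, skipping y divisible by any of the primes.

    Returns (factor, y) on success, or None if no y in range works.
    """
    for y in range(1, max_y + 1):
        if any(y % p == 0 for p in primes):
            continue  # y must be invertible mod m
        f = try_y(n, primes, y)
        if f is not None:
            return f, y
    return None


print(factor_search(3 * 5, [11, 13, 17], max_y=5))
```

Each additional value of $y$ costs up to another $2^{2P}$ GCD computations, so `max_y` needs to stay small for this to be usable at all.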

# Questions

- With $a$ and $b$ chosen with $a-b=1$, how often will at least one of the "generalized $a+b$" numbers have a nontrivial GCD with $n$? (It works for some small examples, but not larger examples.)

- What value of $m = \prod p_i$ works best? Presumably there are some trade-offs here.

- Given that the number of "generalized $a+b$" numbers grows like $2^{2P}$ (for $P$ primes), is this practical? (Presumably not.)

- What is this most similar to? (Gemini suggests the quadratic sieve, which seems plausible.)
