<a href="https://colab.research.google.com/github/joshtburdick/misc/blob/master/plog/Factoring3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Further attempt at factoring

This is to test a simpler variant of the factoring method using loopy belief propagation. But without the loopy belief propagation.

The main idea is, given $n$, to solve $n=(a+b)(a-b) \mod m$, where $m = \prod p_i$ for $P$ (smallish) primes. Then check $\mathrm{GCD}(n, x)$ for $2^{2P}$ numbers $x$ which are derived from $a$ and $b$ using the Chinese Remainder Theorem.

N.B.: This seems fairly impractical, as it requires computing GCD $2^{2P}$ times, for multiple choices of $a+b$ and $a-b$.

In [1]:
!pip install --quiet modulo

In [2]:
import itertools
import math

import numpy as np

from modulo import modulo

We'll need some Chinese Remainder Theorem utilities.

In [3]:
def solve_mod_primes(x_mod, primes):
    """Given what x is (mod some primes), solve for x.

    x_mod: an array of small integers, such that x % primes[i] == x_mod[i]
    primes: an array of primes

    Returns: x, in the range 1 <= x <= product(primes),
        satisfying x % primes[i] == xmod[i].
    """
    x = modulo(x_mod[0], primes[0])
    for i in range(1, len(primes)):
        x &= modulo(x_mod[i], primes[i])
    return int(x)

Let $n$ be the number to be factored (WLOG assume $n$ is odd). We want to write $n=(a+b)(a-b)$.

### Factoring $n$ $mod$ $m$

Fermat's method writes $n = a^2-b^2 = (a+b)(a-b)$. For convenience, let $x = a+b$ and $y = a-b$.

I had assumed we can assume $y=1$. This doesn't always work. Therefore, we search for increasing values of $y$, starting with 1. (We skip numbers divisible by any of the $p_i$.)

This guarantees that we'll *eventually* find a factor (when $x$ or $y$ is a factor of $n$). This is essentially trial division -- hopefully the method will find a factor sooner, but it's unclear if that will happen.

Given a value of $g$, it's easy enough to solve for $f$, using the Chinese Remainder Theorem. (Especially using the `modulo` library.)


In [4]:
primes = [3,5,7,11,13]
m = math.prod(primes)
y = modulo(17, m)
n = modulo(17*1001, m)

print(f"y = {y}, n = {n}")
x = n // y
print(f"n // y = {x}")
print(f"x*y = {x*y}")
print(f"x*y % m = {x*y % m}")


y = modulo(17, 15015), n = modulo(2002, 15015)
n // y = modulo(1001, 15015)
x*y = modulo(2002, 15015)
x*y % m = modulo(2002, 15015)


In [5]:
int(modulo(5-2, 17))

3

In [6]:
def compute_a_and_b(n, y, m):
  """Solves for $x$ in $xy = n (mod m)$.

  n: number to be factored
  y: the value of y == a-b
  m: the modulus
  """
  x = modulo(n, m) // y
  # print(x, y)
  # I think that we want to make sure that y-x is even;
  # this may not matter?
  if (int(y) - int(x)) % 2 != 0:
    print("switched x and y")
    (x, y) = (y, x)
  b = (y - x) // 2
  a = x + b
  return (int(a), int(b))

In [7]:
# test of this
compute_a_and_b(29*31, 1, 7*11*13*17*19*23)

(450, 7435980)

In [8]:
def factor(n, primes):
    """Factor n using the Chinese Remainder Theorem.

    n: the number to factor
    primes: an array of primes
    Returns: a nontrivial factor of n, or None on failure
    """
    def check_for_factors(y):
        a_mod_m, b_mod_m = compute_a_and_b(n, 1, math.prod(primes))
        a = [[a_mod_m % p, -a_mod_m % p] for p in primes]
        b = [[b_mod_m % p, -b_mod_m % p] for p in primes]
        # get all possible a +/- b ("generalized", for however
        # many prime factors)
        a_mod_m = [solve_mod_primes(a1, primes)
            for a1 in itertools.product(*a)]
        b_mod_m = [solve_mod_primes(b1, primes)
            for b1 in itertools.product(*b)]
        # check GCD of each of these
        for (a1,b1) in itertools.product(a_mod_m, b_mod_m):
            f = math.gcd(n, a1+b1)
            if f != 1 and f != n:
                # print(f"f = {f}")
                return f
        return None
    y = 1
    num_y_checked = 1
    while y < n:
        print(f"y = {y}", flush=True)
        f = check_for_factors(y)
        if f is not None:
            return {"f": f, "num_y_checked": num_y_checked}
        # go to next relatively prime y
        y += 1
        num_y_checked += 1
        while math.gcd(y, n) != 1:
            y += 1
    # hopefully _some_ value of y will work
    assert(False)

Some tests:

In [9]:
primes = [11,13,17]

In [10]:
factor(3*5, primes)

y = 1


{'f': 3, 'num_y_checked': 1}

In [11]:
factor(5*7, primes)

y = 1


{'f': 5, 'num_y_checked': 1}

In [12]:
primes = [11,13,17,19,23]

In [13]:
factor(29*31, primes)

y = 1


{'f': 31, 'num_y_checked': 1}

In [14]:
primes = [11,13,17,19,23,29,31]

In [15]:
factor(37*41, primes)

y = 1


{'f': 37, 'num_y_checked': 1}

In [16]:
factor(41*43, primes)

y = 1


{'f': 43, 'num_y_checked': 1}

In [17]:
factor(3*47, primes)

y = 1


{'f': 3, 'num_y_checked': 1}

In [18]:
factor(47*59, primes)

y = 1


{'f': 59, 'num_y_checked': 1}

In [19]:
# just confirming that, even though 47*59 isn't divisible by 11,
# a == 1386 is
a = (47*59-1) // 2
a, a % 11

(1386, 0)

In [20]:
factor(61*67, primes)

y = 1


{'f': 61, 'num_y_checked': 1}

## Slightly more testing

It seems to work for small numbers. What about slightly larger numbers?

In [21]:
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

def get_primes(n_primes):
    primes = []
    num = 2
    while len(primes) < n_primes:
        if is_prime(num):
            primes.append(num)
        num += 1
    return primes

primes = get_primes(1000)
display(primes[:10]) # display the first 10 primes

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

In [22]:
# use some small primes (starting with 7) for m
small_primes = primes[3:12]
print(small_primes)
larger_primes = primes[300:310]
m = math.prod(small_primes)
print(f"m = {m}")

for (a, b) in itertools.combinations(larger_primes, 2):
    n = a*b
    if n > m:
        continue
    print(f"n = {n} = {a} * {b}", flush=True)
    f = factor(n, small_primes)
    if f is not None:
        print(f"f = {f}\n", flush=True)
    else:
        print(f"failed to factor {n} = {a}*{b}", flush=True)
        break

[7, 11, 13, 17, 19, 23, 29, 31, 37]
m = 247357937827
n = 3980021 = 1993 * 1997
y = 1
f = {'f': 1993, 'num_y_checked': 1}

n = 3984007 = 1993 * 1999
y = 1
f = {'f': 1993, 'num_y_checked': 1}

n = 3991979 = 1993 * 2003
y = 1
f = {'f': 2003, 'num_y_checked': 1}

n = 4007923 = 1993 * 2011
y = 1
f = {'f': 1993, 'num_y_checked': 1}

n = 4019881 = 1993 * 2017
y = 1
f = {'f': 2017, 'num_y_checked': 1}

n = 4039811 = 1993 * 2027
y = 1
f = {'f': 1993, 'num_y_checked': 1}

n = 4043797 = 1993 * 2029
y = 1
f = {'f': 1993, 'num_y_checked': 1}

n = 4063727 = 1993 * 2039
y = 1
f = {'f': 1993, 'num_y_checked': 1}

n = 4091629 = 1993 * 2053
y = 1
f = {'f': 2053, 'num_y_checked': 1}

n = 3992003 = 1997 * 1999
y = 1
f = {'f': 1997, 'num_y_checked': 1}

n = 3999991 = 1997 * 2003
y = 1
f = {'f': 1997, 'num_y_checked': 1}

n = 4015967 = 1997 * 2011
y = 1
f = {'f': 2011, 'num_y_checked': 1}

n = 4027949 = 1997 * 2017
y = 1
f = {'f': 1997, 'num_y_checked': 1}

n = 4047919 = 1997 * 2027
y = 1
f = {'f': 2027, 'n

However, if we don't use "enough" prime factors, using $a-b=1$ sometimes doesn't work:

In [23]:
# use some small primes (starting with 7) for m
small_primes = primes[3:9]
print(small_primes)
larger_primes = primes[302:305]
m = math.prod(small_primes)
print(f"m = {m}")

for (a, b) in itertools.combinations(larger_primes, 2):
    n = a*b
    if n > m:
        continue
    print(f"n = {n} = {a} * {b}", flush=True)
    f = factor(n, small_primes)
    if f is not None:
        print(f"f = {f}\n", flush=True)
    else:
        print(f"failed to factor {n} = {a}*{b}", flush=True)
        break

[7, 11, 13, 17, 19, 23]
m = 7436429
n = 4003997 = 1999 * 2003
y = 1
f = {'f': 2003, 'num_y_checked': 1}

n = 4019989 = 1999 * 2011
y = 1
y = 2
y = 3
y = 4
y = 5
y = 6
y = 7
y = 8
y = 9
y = 10
y = 11
y = 12
y = 13
y = 14
y = 15
y = 16
y = 17
y = 18
y = 19
y = 20
y = 21
y = 22
y = 23
y = 24
y = 25
y = 26
y = 27
y = 28
y = 29
y = 30
y = 31
y = 32
y = 33
y = 34
y = 35
y = 36
y = 37
y = 38
y = 39
y = 40
y = 41
y = 42
y = 43
y = 44
y = 45
y = 46
y = 47
y = 48
y = 49
y = 50
y = 51
y = 52
y = 53
y = 54
y = 55
y = 56
y = 57
y = 58
y = 59
y = 60
y = 61
y = 62
y = 63
y = 64
y = 65
y = 66
y = 67
y = 68
y = 69
y = 70
y = 71
y = 72
y = 73
y = 74
y = 75
y = 76
y = 77
y = 78
y = 79
y = 80
y = 81
y = 82
y = 83
y = 84
y = 85
y = 86
y = 87
y = 88
y = 89
y = 90
y = 91
y = 92
y = 93
y = 94
y = 95
y = 96
y = 97
y = 98
y = 99
y = 100
y = 101
y = 102
y = 103
y = 104
y = 105
y = 106
y = 107
y = 108
y = 109
y = 110
y = 111
y = 112
y = 113
y = 114
y = 115
y = 116
y = 117
y = 118
y = 119
y = 120
y = 121
y = 122
y

KeyboardInterrupt: 

In this case, it's searching for many values of $y$, and so is very slow. (I'm not sure it finds a factor, for this example.)

# Questions

- With $a$ and $b$ chosen with $a-b=1$, how often will at least one of the "generalized $a+b$" numbers have a nontrivial GCD with $n$? (It works for some small examples, but not larger examples.)

- What value of $m = \prod p_i$ works best? Presumably there are some trade-offs here.

- Given that the number of "generalized $a+b$" numbers grows like $2^{|P|}$, is this practical? (Presumably not.)

- What is this most similar to? (Gemini suggests the quadratic sieve, which seems plausible.)
