Consider a large positive integer $N$ that we would like to factor, and take a finite set $B$ of small primes. We call $B$ the factor base. Sometimes it is convenient to allow $-1$ as an element of $B$, so that negative integers can also be $B$-smooth.

An integer is called $B$-smooth if all of its prime factors are in the set $B$.
Trial division is the simplest algorithm for factorisation. It tests a number for divisibility by each candidate divisor in turn and divides out every factor it finds; here the candidates are the primes in $B$.
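
As a small worked example of the definition (the factor base $B = \{2, 3, 5, 7\}$ and the two integers are chosen only for illustration), trial division by the primes in $B$ gives

$$1260 = 2^2 \cdot 3^2 \cdot 5 \cdot 7, \qquad 1001 = 7 \cdot 143 = 7 \cdot 11 \cdot 13.$$

So 1260 is $B$-smooth, while 1001 is not: after the factor 7 is removed, the cofactor 143 has no divisor in $B$.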

In [3]:
import random

def is_b_smooth(N, B):
    '''
    Tests if an integer N is B-smooth using trial division and returns its factorisation.
    Args:
        N: The integer to test.
        B: The factor base (a set of primes).
    Returns:
        A dictionary of prime factors if N is B-smooth, otherwise None.
    '''
    if N == 0:
        return None  # 0 is divisible by every prime, so it is not treated as smooth
    factors = {}
    num = N
    # Handle the sign: a negative N can only be B-smooth if -1 is in the factor base
    if num < 0:
        if -1 not in B:
            return None
        factors[-1] = 1
        num = -num
    # N = 1 and N = -1 fall through the loop below with no prime factors
    # Perform trial division with the primes in B
    for p in sorted(B):
        if p == -1:
            continue  # -1 is a sign marker, not a trial divisor
        while num % p == 0:
            factors[p] = factors.get(p, 0) + 1
            num //= p
    # If num has been reduced to 1, all of its prime factors were in B
    return factors if num == 1 else None

def estimate_probability(d_range, num_samples, B):
    '''
    Estimates the probability that a d-digit integer is B-smooth.
    Args:
        d_range (range): A range of digit lengths to test (e.g., range(2, 11)).
        num_samples (int): The number of random integers to test for each d.
        B (set): The factor base.
    '''
    print(f"Factor Base B = {sorted(list(B))}\n")
    print(f"{'Digits (d)':<12}{'Smooth Count':<15}{'Total Samples':<15}{'Probability':<15}")
    print("-" * 60)
    for d in d_range:
        smooth_count = 0
        # A d-digit number is in the range [10**(d-1), 10**d - 1]
        start = 10**(d - 1)
        end = 10**d - 1
        for _ in range(num_samples):
            random_n = random.randint(start, end)
            # Compare against None: an empty dict (the factorisation of 1) is falsy
            if is_b_smooth(random_n, B) is not None:
                smooth_count += 1
        probability = smooth_count / num_samples
        print(f"{d:<12}{smooth_count:<15}{num_samples:<15}{probability:<15.6f}")

In [4]:
B_primes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47}
B_with_neg_one = B_primes.union({-1})

print("\nEstimating B-Smooth Probability")
suitable_d_range = range(2, 21)
samples_per_d = 10000
estimate_probability(suitable_d_range, samples_per_d, B_primes)


Estimating B-Smooth Probability
Factor Base B = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Digits (d)  Smooth Count   Total Samples  Probability    
------------------------------------------------------------
2           8905           10000          0.890500       
3           4825           10000          0.482500       
4           2088           10000          0.208800       
5           774            10000          0.077400       
6           265            10000          0.026500       
7           64             10000          0.006400       
8           22             10000          0.002200       
9           3              10000          0.000300       
10          0              10000          0.000000       
11          1              10000          0.000100       
12          0              10000          0.000000       
13          0              10000          0.000000       
14          0              10000          0.000000       
15          0       

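The estimated probability that a $d$-digit integer is $B$-smooth drops quickly as $d$ grows, from about 0.89 for $d = 2$ to about 0.0003 for $d = 9$. This is expected: the factor base is fixed, so the larger an integer is, the more likely it is to have a prime factor greater than 47. With 10000 samples per digit length, probabilities well below $10^{-4}$ cannot be resolved, so the zero counts from $d = 10$ onwards and the single hit at $d = 11$ reflect sampling noise rather than exact values.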
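
For small digit lengths the sampled estimates can be checked against an exact count. The following is a minimal sketch, not part of the notebook above: the helper names `count_smooth_up_to` and `smooth_fraction` are hypothetical. It counts $B$-smooth integers up to a bound by splitting on whether the smallest remaining prime divides the number. It assumes the factor base contains only primes (a $-1$ entry is ignored).

```python
from functools import lru_cache

def count_smooth_up_to(x, primes):
    """Count integers n with 1 <= n <= x whose prime factors all lie in `primes`."""
    ps = tuple(sorted(p for p in primes if p > 1))  # ignore -1 if present

    @lru_cache(maxsize=None)
    def count(limit, i):
        # Number of n in [1, limit] that are products of ps[i:] only
        if i == len(ps) or ps[i] > limit:
            return 1  # only n = 1 remains
        # Split by whether ps[i] divides n: if not, drop ps[i];
        # if so, write n = ps[i] * m with m <= limit // ps[i]
        return count(limit, i + 1) + count(limit // ps[i], i)

    return count(x, 0) if x >= 1 else 0

def smooth_fraction(d, primes):
    """Exact fraction of d-digit integers that are smooth over `primes`."""
    lo, hi = 10 ** (d - 1), 10 ** d - 1
    smooth = count_smooth_up_to(hi, primes) - count_smooth_up_to(lo - 1, primes)
    return smooth / (hi - lo + 1)

B = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47}
for d in range(2, 9):
    print(d, smooth_fraction(d, B))
```

For $d = 2$ the exact count is 80 out of 90, since the only two-digit integers that are not $B$-smooth are the ten primes from 53 to 97. This gives about 0.889, close to the sampled value 0.8905.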