# Random Number Statistics

Consider a base-10 random number of $N$ digits, where every digit is selected independently and uniformly at random from the range $[0,9]$.

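The probability that a specific series of digits of length $l$ appears at a given position is $10^{-l}$. There are $N-l+1$ positions at which a match can start.

The probability of *not matching* at a given position is $1-10^{-l}$. Treating the positions as independent (an approximation, since overlapping windows share digits), the probability of not matching anywhere in the number is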

$(1-10^{-l})^{N-l+1}$
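One way to gauge how much the overlap between windows matters is to compare the expression above with an exact count. The sketch below assumes a pattern that has no self-overlap (such as `123`; a pattern like `111` behaves differently), for which the number of $n$-digit strings avoiding the pattern satisfies $a(n) = 10\,a(n-1) - a(n-l)$. The names `exact_non_match` and `approx_non_match` are illustrative and not part of the original notebook.

```julia
# Exact probability that a fixed pattern with no self-overlap (e.g. "123")
# is absent from N uniformly random decimal digits.
# Dividing the count recurrence a(n) = 10*a(n-1) - a(n-l) by 10^n gives
# q(n) = q(n-1) - 10^(-l) * q(n-l), with q(n) = 1 for n < l.
function exact_non_match(N; l=3)
    q = ones(Float64, max(N, l) + 1)   # q[n + 1] holds q(n)
    for n in l:N
        q[n + 1] = q[n] - 10.0^(-l) * q[n - l + 1]
    end
    return q[N + 1]
end

# Independence approximation from the text, restated for comparison
approx_non_match(N; l=3) = N < l ? 1.0 : (1.0 - 10.0^(-l))^(N - l + 1)

for n in (3, 4, 100, 1000)
    println("N=$n  exact=$(exact_non_match(n))  approx=$(approx_non_match(n))")
end
```

Under that assumption the two values should stay close, which suggests the independence approximation is adequate for the estimates that follow.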

Take the specific example of a 3-digit match. Then the probability of a number of $N$ digits not matching is:

In [3]:
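function non_match(N; length=3)
    # Fewer digits than the pattern: a match is impossible
    N < length && return 1.0
    # (1 - 10^-length) raised to the number of possible starting positions
    (1.0 - 10.0^(-length))^(N+1-length)
end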

non_match (generic function with 1 method)

In [11]:
for n in [1, 3, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000, 100001, 500000, 1000000]
    println("p($n) -> $(non_match(n))")
end

p(1) -> 1.0
p(3) -> 0.999
p(10) -> 0.992027944069944
p(50) -> 0.9531108968798943
p(100) -> 0.9066044494080757
p(500) -> 0.607593524316293
p(1000) -> 0.36843192017940235
p(5000) -> 0.006734574374039293
p(10000) -> 4.5263828369959795e-5
p(50000) -> 1.8848653148880386e-22
p(100000) -> 3.5456153734747036e-44
p(100001) -> 3.542069758101229e-44
p(500000) -> 5.558812362627309e-218
p(1000000) -> 0.0


Note that $p(1\,000\,000)$ is so small that it underflows to `0.0`: it lies below the smallest positive value a `Float64` can represent!
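If the magnitude of such a value is needed, one option is to work with its logarithm instead. The following is a minimal sketch: `log10_non_match` is a hypothetical helper, not part of the original notebook, and it relies on Julia's `log1p`, which evaluates $\log(1+x)$ accurately for small $x$.

```julia
# log10 of the non-match probability, computed without ever forming
# the tiny probability itself.
function log10_non_match(N; l=3)
    N < l && return 0.0                       # probability 1, so log10 is 0
    return (N + 1 - l) * log1p(-10.0^(-l)) / log(10.0)
end

println(log10_non_match(1_000_000))   # roughly -434.5
```

For comparison, the smallest positive `Float64` is about $4.9 \times 10^{-324}$, so a value near $10^{-434.5}$ cannot be stored directly.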

Now, what is the probability that, above some $N_{min}$, every random number (one for each length $n \ge N_{min}$) *does* contain a match for a particular string?

Although the series is infinite, $p(n)$ falls geometrically with $n$, so the sum of the non-match probabilities converges and can remain $<1$...

$\sum_{n=N_{min}}^\infty p(n)$
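Because $p(n)$ is a geometric sequence, this sum has a closed form. As a short worked derivation, write $q = 1-10^{-l}$ (a symbol introduced here for convenience), so that $p(n) = q^{\,n-l+1}$ for $n \ge l$. Then

$\sum_{n=N_{min}}^{\infty} p(n) = \frac{q^{\,N_{min}-l+1}}{1-q} = 10^{l}\,(1-10^{-l})^{N_{min}-l+1}$

For $l=3$ and $N_{min}=10\,000$ this gives $10^{3} \times 0.999^{9998} \approx 0.0453$. By the union bound, the probability that at least one number fails to match is at most this sum, so the probability that every number matches is at least about $0.955$.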

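Make a numerical approximation to the probability that every number matches, taking the product of the match probabilities $1-p(n)$ over the range $[10\,000, 1\,000\,000]$ and treating the numbers as independent...

(Numerically this is hard once the match probability exceeds $1-\epsilon$, where $\epsilon$ is the machine epsilon, because the `Float64` result then rounds to exactly $1$.)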

In [19]:
prob = 1.0
for n in 1_000_000:-1:10_000
    prob *= (1.0 - non_match(n))
end
println("Final probability estimate: $prob")

Final probability estimate: 0.9557448060534242


So there is a *high* probability, about $0.956$, that every number in this range *does* match!
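The rounding issue mentioned earlier can be sidestepped by accumulating logarithms rather than multiplying factors that are indistinguishable from $1$. The sketch below is one possible approach, not the notebook's own method: `p_miss` restates the non-match formula so the block runs on its own, and `all_match_probability` is a hypothetical name.

```julia
# Non-match probability from the text, restated so this block is self-contained
p_miss(n; l=3) = n < l ? 1.0 : (1.0 - 10.0^(-l))^(n + 1 - l)

# Sum log(1 - p(n)) using log1p, then exponentiate once at the end.
function all_match_probability(n_min, n_max; l=3)
    log_prob = 0.0
    for n in n_max:-1:n_min          # add the smallest contributions first
        log_prob += log1p(-p_miss(n; l=l))
    end
    return exp(log_prob)
end

println(all_match_probability(10_000, 1_000_000))
```

In this range the result should agree with the direct product to many digits, since the terms lost to rounding are far too small to affect it; the log-space form mainly helps when many such tiny terms would otherwise add up.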