[2026-01-23 Fiddler](https://thefiddler.substack.com/p/bingo)
====================

Fiddler
-------
There are $45!$ ways to order the numbers, but the numbers not on my grid are
irrelevant, so only the $8!$ relative orderings of the numbers on my grid matter.

There are 8 ways to get bingo after the first two numbers are marked (the two numbers
must be on opposite sides of the free center square), so the probability of bingo
after marking two numbers is $8\cdot6!/8! = 1/7$.
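
The count of 8 can be checked by listing every ordered pair of grid cells. This is a minimal sketch in plain Python rather than the Sage used in the cells further down; the names `CELLS`, `LINES`, and `has_bingo` are made up for the illustration, and it assumes the center square is free.

```python
from fractions import Fraction
from itertools import permutations

# Cells of the 3-by-3 grid as (row, column); the center (1, 1) is the free space.
CENTER = (1, 1)
CELLS = [(r, c) for r in range(3) for c in range(3) if (r, c) != CENTER]

# The eight winning lines: three rows, three columns, two diagonals.
LINES = (
    [{(r, c) for c in range(3)} for r in range(3)]
    + [{(r, c) for r in range(3)} for c in range(3)]
    + [{(k, k) for k in range(3)}, {(k, 2 - k) for k in range(3)}]
)

def has_bingo(marked):
    """Return True if the marked cells plus the free center cover a full line."""
    covered = set(marked) | {CENTER}
    return any(line <= covered for line in LINES)

ordered_pairs = list(permutations(CELLS, 2))
winning_pairs = [pair for pair in ordered_pairs if has_bingo(pair)]

print(len(winning_pairs), len(ordered_pairs))            # 8 56
print(Fraction(len(winning_pairs), len(ordered_pairs)))  # 1/7
```

The ratio 8/56 is the same number as $8\cdot6!/8!$, since $8!/6! = 56$.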

Among the other 48 ordered pairs, there are 16 ways to have adjacent numbers marked
(a corner and a non-corner next to it), 16 ways to have a corner and a non-adjacent
non-corner marked, 8 ways to have two adjacent non-corners marked, and 8 ways
to have two adjacent corners marked.
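
These counts can be reproduced by classifying all 56 ordered pairs. The sketch below is plain Python with a hypothetical `classify` helper; it treats a corner and a non-corner as adjacent when they share a side, and relies on the fact that two cells complete a line through the free center exactly when they are mirror images of each other through it.

```python
from collections import Counter
from itertools import permutations

CENTER = (1, 1)  # free space
CELLS = [(r, c) for r in range(3) for c in range(3) if (r, c) != CENTER]
CORNERS = {(0, 0), (0, 2), (2, 0), (2, 2)}

def classify(a, b):
    """Label an ordered pair of distinct non-center cells by its shape."""
    # Opposite cells through the center have coordinates summing to (2, 2).
    if (a[0] + b[0], a[1] + b[1]) == (2, 2):
        return "bingo"
    corners = sum(cell in CORNERS for cell in (a, b))
    if corners == 2:
        return "two adjacent corners"
    if corners == 0:
        return "two adjacent non-corners"
    # Exactly one corner: the pair is adjacent when the cells share a side.
    touching = abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
    return "corner and adjacent non-corner" if touching else "corner and non-adjacent non-corner"

counts = Counter(classify(a, b) for a, b in permutations(CELLS, 2))
for label, count in sorted(counts.items()):
    print(f"{count:2d}  {label}")
```

The five counts add up to $8\cdot7 = 56$, so no pair is left unclassified.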

In the $16\cdot6!/8!$ chance of having adjacent numbers marked, there is a 3/6 probability
of bingo after marking three numbers and a 1/6 probability of having a corner and its
two adjacent numbers marked, meaning bingo is guaranteed after marking four numbers.
In the remaining 2/6, there is a 4/5 probability of bingo after marking four numbers
and a 1/5 probability of bingo after marking five numbers.
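
Combining these branches gives a conditional distribution of 3/6 for three markers, $1/6 + (2/6)(4/5) = 13/30$ for four, and $(2/6)(1/5) = 1/15$ for five. As a sanity check, here is a small plain-Python sketch that fixes the first two marks at a corner and the non-corner next to it, then enumerates all $6!$ orders of the remaining numbers. The choice of `(0, 0)` and `(0, 1)` and the helper name `markers_until_bingo` are assumptions made for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

CENTER = (1, 1)  # free space
CELLS = [(r, c) for r in range(3) for c in range(3) if (r, c) != CENTER]
LINES = (
    [frozenset((r, c) for c in range(3)) for r in range(3)]
    + [frozenset((r, c) for r in range(3)) for c in range(3)]
    + [frozenset((k, k) for k in range(3)), frozenset((k, 2 - k) for k in range(3))]
)

def markers_until_bingo(order):
    """Count how many numbers are marked, in the given order, when a line first completes."""
    marked = {CENTER}
    for count, cell in enumerate(order, start=1):
        marked.add(cell)
        if any(line <= marked for line in LINES):
            return count
    raise AssertionError("a fully marked grid always has a bingo")

first_two = [(0, 0), (0, 1)]  # a corner and the non-corner next to it
rest = [cell for cell in CELLS if cell not in first_two]
orders = list(permutations(rest))  # 6! = 720 equally likely continuations

tally = Counter(markers_until_bingo(first_two + list(tail)) for tail in orders)
conditional = {k: Fraction(v, len(orders)) for k, v in sorted(tally.items())}
for k, prob in conditional.items():
    print(k, prob)
```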

In the $16\cdot6!/8!$ chance of having a corner and a non-adjacent non-corner marked,
there is a 2/6 probability of bingo after marking three numbers, a 1/6 chance that the
third number is adjacent to the marked corner, a 1/6 chance that the third number
is adjacent to the marked non-corner, a 1/6 chance that the third number is the
remaining corner, and a 1/6 chance that the third number is the remaining non-corner.
When the third number is adjacent to the marked corner, the case is the same as
when the first two numbers were adjacent and the third was a non-adjacent non-corner.

This is getting to be too much to think about, so I'll resort to code.

In [1]:
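import functools
import time

@functools.cache
def bingo_distribution(n, marked):
    """Map each number of markers placed to the probability that bingo occurs exactly then.

    marked is a frozenset of (row, column) cells already covered, including the free center.
    """
    dist = {}
    p = 1/(n*n - len(marked))  # each unmarked cell is equally likely next (exact rational in Sage)
    if n > 3 and (len(marked) == 2 or len(marked) == 3):
        print(time.asctime(),len(marked),marked) # show progress for long calculations
    for i in range(n):
        for j in range(n):
            if (i,j) in marked:
                continue
            new_marked = marked | frozenset([(i,j)])
            # Only a line through the newly marked cell (i,j) can have just been completed.
            if (all((i,k) in new_marked for k in range(n))
                or all((k,j) in new_marked for k in range(n))
                or (i == j and all((k,k) in new_marked for k in range(n)))
                or (i == n-1-j and all ((k,n-1-k) in new_marked for k in range(n)))):
                # marked includes the free center, so len(marked) is the number of
                # numbers marked once (i,j) is added.
                dist[len(marked)] = dist.get(len(marked), 0) + p
            else:
                for m, pm in bingo_distribution(n, canonicalize(n, new_marked)).items():
                    dist[m] = dist.get(m, 0) + pm*p
    return dist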

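@functools.cache
def canonicalize(n, marked):
    """Return one fixed representative of marked under the 8 symmetries of the square."""
    images = [
        marked,                                 # identity
        {(n-1-i, j) for i, j in marked},        # flip top to bottom
        {(i, n-1-j) for i, j in marked},        # flip left to right
        {(n-1-i, n-1-j) for i, j in marked},    # half turn
        {(j, i) for i, j in marked},            # transpose
        {(n-1-j, i) for i, j in marked},        # quarter turn
        {(j, n-1-i) for i, j in marked},        # quarter turn the other way
        {(n-1-j, n-1-i) for i, j in marked},    # anti-transpose
    ]
    # Comparing sets with < tests for a proper subset, so min() over the raw sets
    # always returns the first one and no states get merged. Comparing the sorted
    # coordinate lists gives a total order, so symmetric states share a cache entry.
    return frozenset(min(images, key=sorted))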
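
The symmetry reduction can be tried in isolation. The sketch below is a standalone plain-Python variant, not the notebook's function: `symmetric_images` and `canonical_form` are hypothetical names. It picks the representative by comparing sorted coordinate lists, because Python orders sets by inclusion, which is only a partial order.

```python
from itertools import combinations

def symmetric_images(n, marked):
    """Return the images of a set of cells under the 8 symmetries of an n-by-n square."""
    transforms = [
        lambda i, j: (i, j),                  # identity
        lambda i, j: (n - 1 - i, j),          # flip top to bottom
        lambda i, j: (i, n - 1 - j),          # flip left to right
        lambda i, j: (n - 1 - i, n - 1 - j),  # half turn
        lambda i, j: (j, i),                  # transpose
        lambda i, j: (n - 1 - j, i),          # quarter turn
        lambda i, j: (j, n - 1 - i),          # quarter turn the other way
        lambda i, j: (n - 1 - j, n - 1 - i),  # anti-transpose
    ]
    return [frozenset(t(i, j) for i, j in marked) for t in transforms]

def canonical_form(n, marked):
    """Pick the image whose sorted list of cells is smallest."""
    return min(symmetric_images(n, marked), key=sorted)

# All 3-by-3 states with the free center plus two marked numbers.
center = (1, 1)
others = [(r, c) for r in range(3) for c in range(3) if (r, c) != center]
states = [frozenset({center, a, b}) for a, b in combinations(others, 2)]
classes = {canonical_form(3, state) for state in states}

print(len(states), len(classes))  # 28 6
```

The 28 unordered pairs collapse to 6 classes: the four non-bingo cases listed earlier plus the two kinds of immediate bingo (opposite corners and opposite non-corners).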

In [2]:
f3 = bingo_distribution(3, frozenset([(1,1)]))
m3 = sum(k*p for k,p in f3.items())
print(f"3-by-3: {m3} ≈ {numerical_approx(m3)} markers on average before bingo")

3-by-3: 243/70 ≈ 3.47142857142857 markers on average before bingo
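
For the 3-by-3 grid the state space is small enough to check this by brute force. The following is a plain-Python sketch, independent of the recursive Sage function, that walks through all $8!$ orderings of the grid's numbers with exact fractions; `markers_until_bingo` is a name invented for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

CENTER = (1, 1)  # free space, covered from the start
CELLS = [(r, c) for r in range(3) for c in range(3) if (r, c) != CENTER]
LINES = (
    [frozenset((r, c) for c in range(3)) for r in range(3)]      # rows
    + [frozenset((r, c) for r in range(3)) for c in range(3)]    # columns
    + [frozenset((k, k) for k in range(3)),                      # diagonals
       frozenset((k, 2 - k) for k in range(3))]
)

def markers_until_bingo(order):
    """Count how many numbers are marked, in the given order, when a line first completes."""
    marked = {CENTER}
    for count, cell in enumerate(order, start=1):
        marked.add(cell)
        if any(line <= marked for line in LINES):
            return count
    raise AssertionError("a fully marked grid always has a bingo")

tally = Counter(markers_until_bingo(order) for order in permutations(CELLS))
total = sum(tally.values())  # 8! = 40320 orderings
distribution = {k: Fraction(v, total) for k, v in sorted(tally.items())}
mean_markers = sum(k * prob for k, prob in distribution.items())

print(mean_markers, float(mean_markers))
```

The enumeration also shows that bingo never takes more than five markers on this grid, since each of the four lines through the center can hold at most one marked number before it completes.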


Extra credit
------------

In [3]:
# Computing this directly is very slow: f5 = bingo_distribution(5, frozenset([(2,2)]))
# so the resulting distribution is hardcoded here.
f5 = {4: 2/5313,
 5: 3/1771,
 6: 25/5313,
 7: 5/483,
 8: 4835/245157,
 9: 16627/490314,
 10: 4388/81719,
 11: 679/8602,
 12: 5598/52003,
 13: 7020/52003,
 14: 174905/1144066,
 15: 24793/163438,
 16: 10266/81719,
 17: 6561/81719,
 18: 1171/33649,
 19: 39/4807,
 20: 1/1771}
m5 = sum(k*p for k,p in f5.items())
print(f"5-by-5: {m5} ≈ {numerical_approx(m5)} markers on average before bingo")

5-by-5: 4245967/312018 ≈ 13.6080835080027 markers on average before bingo
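
Since the 5-by-5 distribution is hardcoded, it is worth confirming that the values are consistent. This plain-Python sketch copies the same fractions into `fractions.Fraction` objects (Sage is not needed) and checks that they sum to 1 and reproduce the mean; the name `f5_exact` is introduced only for this check.

```python
from fractions import Fraction as F

# Same values as the f5 dictionary above, keyed by number of markers.
f5_exact = {
    4: F(2, 5313), 5: F(3, 1771), 6: F(25, 5313), 7: F(5, 483),
    8: F(4835, 245157), 9: F(16627, 490314), 10: F(4388, 81719),
    11: F(679, 8602), 12: F(5598, 52003), 13: F(7020, 52003),
    14: F(174905, 1144066), 15: F(24793, 163438), 16: F(10266, 81719),
    17: F(6561, 81719), 18: F(1171, 33649), 19: F(39, 4807), 20: F(1, 1771),
}

total_probability = sum(f5_exact.values())
mean_markers = sum(k * prob for k, prob in f5_exact.items())

print(total_probability)                  # 1
print(mean_markers, float(mean_markers))  # 4245967/312018, about 13.608
```

The smallest entry is easy to confirm by hand: bingo after four markers needs the four numbers of one line through the free center, which is 4 of the $\binom{24}{4} = 10626$ equally likely sets, or 2/5313.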


Numerical simulations
---------------------
[Numerical simulations](20260123.go) agree:

    $ go run 20260123.go
    3-by-3, 800000 trials: 3.470803 markers
    5-by-5, 800000 trials: 13.604291 markers
    3-by-3, 800000 trials: 17.746680 calls, 3.471909 markers
    5-by-5, 800000 trials: 41.383085 calls, 13.607770 markers
    3-by-3, 8000000 trials: 3.472149 markers
    5-by-5, 8000000 trials: 13.606783 markers
    3-by-3, 8000000 trials: 17.736121 calls, 3.471165 markers
    5-by-5, 8000000 trials: 41.368968 calls, 13.607681 markers
    3-by-3, 80000000 trials: 3.471211 markers
    5-by-5, 80000000 trials: 13.608461 markers
    3-by-3, 80000000 trials: 17.742934 calls, 3.471474 markers
    5-by-5, 80000000 trials: 41.368508 calls, 13.607828 markers
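
The Go program itself is not reproduced here. As a rough idea of what such a simulation could look like, here is a minimal Python sketch that estimates only the number of markers; the function name `simulate_markers`, the seed, and the trial count are arbitrary choices for the illustration, and it assumes an odd grid size with a free center.

```python
import random

def simulate_markers(n, trials, rng):
    """Estimate the mean number of markers placed before bingo on an n-by-n grid."""
    center = (n // 2, n // 2)  # free space
    cells = [(r, c) for r in range(n) for c in range(n) if (r, c) != center]
    lines = (
        [frozenset((r, c) for c in range(n)) for r in range(n)]
        + [frozenset((r, c) for r in range(n)) for c in range(n)]
        + [frozenset((k, k) for k in range(n)),
           frozenset((k, n - 1 - k) for k in range(n))]
    )
    total = 0
    for _ in range(trials):
        rng.shuffle(cells)  # the order in which the grid's numbers get marked
        marked = {center}
        for count, cell in enumerate(cells, start=1):
            marked.add(cell)
            if any(line <= marked for line in lines):
                total += count
                break
    return total / trials

rng = random.Random(20260123)  # fixed seed so that runs are repeatable
for n in (3, 5):
    print(f"{n}-by-{n}: {simulate_markers(n, 20000, rng):.3f} markers")
```

With only 20000 trials the estimates are much noisier than the Go results above, so agreement to about two significant digits is all that should be expected.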

Further thoughts
----------------
I originally misread the problem as asking for the average number of numbers called
before bingo, rather than the number of numbers marked. The former is slightly more
complicated.

Let $f_N(k)$, calculated by the code above, be the probability of getting bingo upon
placing the $k$th marker on an $N$-by-$N$ grid.  Let $g_{N,M}(k)$ be the average number
of numbers called by the time the $k$th number is marked, where $M$ is the number of
numbers that can be called.  The average number of markers placed is $\sum_k kf_N(k)$, and
the average number of numbers called is $\sum_k g_{N,M}(k)f_N(k)$.

For the 3-by-3 grid, $N = 3$, and I'm taking $M = 45$.  For the 5-by-5 grid, $N = 5$ and
$M = 75$.

We have $g_{N,M}(k) = \sum_{n=k}^{M-(N^2-1-k)} np(n,k)$, where $p(n,k)$ is the
probability that the $k$th marker is the $n$th number called.  There are $M!$ ways to
order all the numbers called.  There are $(n-1)!$ ways to order the first $n-1$ numbers,
and $(M-n)!$ ways to order the numbers after the $n$th number.  There are $N^2-1$
possible numbers that can be the $n$th number called.  There are $\binom{N^2-2}{k-1}$
ways to divide the other numbers on the grid between the ones called before the $n$th and
the ones called after the $n$th.  Similarly, there are $\binom{M-(N^2-1)}{n-k}$ ways to
divide the numbers not on the grid.  Multiplying these counts and dividing by $M!$ gives
$p(n,k) = (n-1)!\,(M-n)!\,(N^2-1)\binom{N^2-2}{k-1}\binom{M-(N^2-1)}{n-k}/M!$.
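
The counting argument above can be checked numerically before handing it to Sage's symbolic `sum`. This is a plain-Python sketch using exact fractions; the function names `p` and `g` mirror the symbols in the text, but the signatures are chosen for the example.

```python
from fractions import Fraction
from math import comb, factorial

def p(n, k, N, M):
    """Probability that the k-th marker is placed on the n-th call."""
    grid = N * N - 1  # numbers on the grid (the center is free)
    ways = (
        factorial(n - 1)            # orders of the first n-1 calls
        * factorial(M - n)          # orders of the calls after the n-th
        * grid                      # which grid number is called n-th
        * comb(grid - 1, k - 1)     # which other grid numbers come earlier
        * comb(M - grid, n - k)     # which off-grid numbers come earlier
    )
    return Fraction(ways, factorial(M))

def call_range(k, N, M):
    """Possible call indices n for the k-th marker: k up to M - (N^2 - 1 - k)."""
    grid = N * N - 1
    return range(k, M - (grid - k) + 1)

def g(N, M, k):
    """Average call index at which the k-th marker is placed."""
    return sum(n * p(n, k, N, M) for n in call_range(k, N, M))

# For each k the probabilities over n should add up to exactly 1.
for k in range(1, 9):
    assert sum(p(n, k, 3, 45) for n in call_range(k, 3, 45)) == 1

print([str(g(3, 45, k)) for k in range(2, 6)])
```

The loop is the useful part: if any factor in the count were missing or doubled, the probabilities for a fixed $k$ would not sum to 1.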

In [4]:
n = var("n")
g(N,M,k) = sum(n*
               factorial(n-1)*
               factorial(M-n)*
               (N^2-1)*
               binomial(N^2-2,k-1)*
               binomial(M-(N^2-1),n-k)/factorial(M),
               n, k, M-(N^2-1-k))               

In [5]:
m3 = sum(k*p for k,p in f3.items())
c3 = sum(g(3,45,k)*p for k,p in f3.items())
print(f"3-by-3: {c3.simplify()} ≈ {numerical_approx(c3)} called, {m3} ≈ {numerical_approx(m3)} markers")
m5 = sum(k*p for k,p in f5.items())
c5 = sum(g(5,75,k)*p for k,p in f5.items())
print(f"5-by-5: {c5.simplify()} ≈ {numerical_approx(c5)} called, {m5} ≈ {numerical_approx(m5)} markers")

3-by-3: 621/35 ≈ 17.7428571428571 called, 243/70 ≈ 3.47142857142857 markers
5-by-5: 8491934/205275 ≈ 41.3685738643283 called, 4245967/312018 ≈ 13.6080835080027 markers
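
There is also a shortcut that is not part of the calculation above: by a standard result for sampling without replacement, the $k$th of $m$ special items among $M$ sits at position $k(M+1)/(m+1)$ on average, so with $m = N^2-1$ one gets $g_{N,M}(k) = k(M+1)/N^2$, and the average number of calls is just $(M+1)/N^2$ times the average number of markers. The plain-Python sketch below checks the closed form by brute force on a tiny case and then applies it to the two means printed above; `expected_call_index` and `brute_force_call_index` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations

def expected_call_index(k, grid, M):
    """Closed form for the mean call index of the k-th grid number among M calls."""
    return Fraction(k * (M + 1), grid + 1)

def brute_force_call_index(k, grid, M):
    """Average 1-based position of the k-th grid number over all M! call orders."""
    total = 0
    count = 0
    for order in permutations(range(M)):
        # Numbers 0 .. grid-1 stand for the numbers on the grid.
        positions = [pos for pos, number in enumerate(order, start=1) if number < grid]
        total += positions[k - 1]
        count += 1
    return Fraction(total, count)

# Tiny sanity check: 3 grid numbers among 6 callable numbers.
assert all(
    brute_force_call_index(k, 3, 6) == expected_call_index(k, 3, 6) for k in (1, 2, 3)
)

# Scale the average marker counts by (M+1)/N^2.
c3 = expected_call_index(1, 8, 45) * Fraction(243, 70)
c5 = expected_call_index(1, 24, 75) * Fraction(4245967, 312018)
print(c3, c5)  # 621/35 8491934/205275
```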


These results agree with the numerical simulations.