# IBM Ponder This - February 2025

## Problem Statement


Given a number $A$, we wish to construct an $N\times N$ square of digits $(0,1,2,3,4,5,6,7,8,9)$ such that

1. The sum of each row, column, and diagonal is $A$.
2. The number appearing in each row (left-to-right), column (top-down), and diagonals (top left to bottom right and top right to bottom left) is an $N$-digit prime number (no leading zeros).

Note that the same digit can appear multiple times in the square, and the same prime number can appear multiple times in the square.

The cost of a square is computed in the following manner: We list all the prime numbers appearing in the rows, columns, and diagonals of the square. We remove any duplicates from the list. For each digit $d$, we count the total number of times it appears in the primes in the list. Each time incurs a cost: the first time the cost is 0, the second time it's 1, the third time it's 2, and so on. Note that the same digit in the square appears in more than one number and hence might be counted more than once; we count the occurrences of the digits in the prime numbers, not in the square itself.

For example, for $N=4$ we have the following square:

1 7 7 7 \
5 5 5 7 \
9 3 9 1 \
7 7 1 7 

in which $A=22$ and which contains the unique primes 7717, 9391, 5557, 7591, 7537, 1597, and 1777. The square has a total of 5 occurrences of the digit 1, 2 of the digit 3, 6 of the digit 5, 11 of the digit 7, and 4 of the digit 9, resulting in a cost of 87 in total. This is the $4\times 4$ square with minimum cost. Similarly, the $4\times 4$ square of maximum cost is

5 7 9 1 \
1 7 7 7 \
7 5 3 7 \
9 3 3 7

which has a cost of 143.
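
The cost rule can be checked against the two $4\times 4$ examples with a short sketch. It is only an illustration: the helper name `square_cost` and the representation of a square as a list of row strings are assumptions made here, not part of the problem statement.

```python
from collections import Counter


def square_cost(rows):
    """Cost of a square given as a list of equal-length digit strings (hypothetical helper)."""
    n = len(rows)
    # Collect every line in a set so that duplicate primes are counted once
    lines = set(rows)                                              # rows, left to right
    lines.update(''.join(r[j] for r in rows) for j in range(n))    # columns, top down
    lines.add(''.join(rows[i][i] for i in range(n)))               # main diagonal
    lines.add(''.join(rows[i][n - 1 - i] for i in range(n)))       # anti-diagonal
    # A digit seen x times costs 0 + 1 + ... + (x - 1) = x * (x - 1) / 2
    freq = Counter(''.join(lines))
    return sum(x * (x - 1) // 2 for x in freq.values())


min_square = ["1777", "5557", "9391", "7717"]
max_square = ["5791", "1777", "7537", "9337"]
print(square_cost(min_square))  # 87
print(square_cost(max_square))  # 143
```

The function does not check that the lines are prime or that they share a digit sum; it only applies the counting rule.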

Your goal: For $N=5$, find the squares of minimal cost and maximal cost (without limiting $A$ to $A=22$). Provide your squares as written above, with the score of each square written above it.

A bonus "*" will be given for finding squares of minimal and maximal costs for $N=6$.

## Solution

First, we generate all the $N$-digit primes and group them by digit sum. Since every row, column, and diagonal must sum to $A$, all the primes in a square come from the same group, so we process the groups one at a time.

For each group, we store its primes in a trie, so that every prefix of a prime corresponds to a trie node. We create an $N \times N$ grid initialized with -1 and fill it cell by cell, in row-major order, using backtracking. To determine which digits can go in a cell, we keep track of the prefixes already written along the cell's row and column and, where applicable, along the diagonal and anti-diagonal. To do so, we associate each cell with a reference to the trie node reached by each of these prefixes. The candidate digits are the intersection of the children of those nodes.

As soon as a grid is completely filled, we compute its score by removing duplicate primes and counting the occurrences of each digit. If a digit occurs $x$ times, it adds $0 + 1 + \dots + (x-1) = \frac{1}{2}x(x-1)$ to the score. If the total score is at most the minimum or at least the maximum observed so far, we update the corresponding result.
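
The candidate-selection step can be illustrated on a toy case. The sketch below uses a nested-dict trie and re-walks each prefix for clarity, whereas the notebook code that follows caches trie node references per cell. The names `build_trie`, `walk`, and `candidates` are hypothetical, and `group` is just a hand-picked handful of 3-digit primes with digit sum 11, not a complete group.

```python
def build_trie(words):
    """Nested-dict trie: each node maps a digit to its child node."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
    return root


def walk(root, prefix):
    """Return the node reached by following prefix, or None if no word starts with it."""
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return None
    return node


def candidates(root, prefixes):
    """Digits that extend every given prefix towards some word in the trie."""
    result = None
    for prefix in prefixes:
        node = walk(root, prefix)
        digits = set(node) if node is not None else set()
        result = digits if result is None else result & digits
    return result if result is not None else set()


# Toy group (assumption): a few 3-digit primes whose digits sum to 11
group = ["137", "173", "191", "317", "353", "911"]
root = build_trie(group)

# First cell of the second row: empty row prefix, column prefix "1"
print(sorted(candidates(root, ["", "1"])))        # ['3', '9']
# A cell whose row and column prefixes are "3" and whose diagonal prefix is "1"
print(sorted(candidates(root, ["3", "3", "1"])))  # [] -> dead end, backtrack
```

An empty intersection means no digit keeps every line a prefix of some prime in the group, so the search abandons that branch immediately.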

This method is suitable for a $5 \times 5$ grid but is not efficient enough for larger grids.

In [1]:
from collections import defaultdict, Counter
import sympy

In [2]:
n = 5

In [3]:
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True


def digit_sum(n):
    res = 0
    while n != 0:
        n, r = divmod(n, 10)
        res += r
    return res

In [4]:
# Generate the relevant primes
primes = set(sympy.primerange(10**(n - 1), 10**n))

# Group primes by sum of digits
groups = defaultdict(list)
for prime in primes:
    groups[digit_sum(prime)].append(str(prime))

In [5]:
# Initialize result variables
min_score = float('inf')
max_score = 0
min_grid = None
max_grid = None


def backtrack(n: int, i: int, j: int, horizontals, verticals, diagonal, antidiagonal, grid, root):

    global min_score, max_score, min_grid, max_grid

    # If grid is complete, score the grid and update result variables if necessary
    if i == n:
        # Add all the primes in the grid to a set to remove duplicates
        curr_primes = set()
        diag = []
        antidiag = []
        # Use k for the loop index to avoid shadowing the parameter i
        for k in range(n):
            curr_primes.add(''.join(grid[k]))
            curr_primes.add(''.join([grid[x][k] for x in range(n)]))
            diag.append(grid[k][k])
            antidiag.append(grid[k][n - k - 1])
        curr_primes.add(''.join(diag))
        curr_primes.add(''.join(antidiag))
        # Count the occurrences of each digit
        s = ''.join(curr_primes)
        freq = Counter(s)
        # Update the score for each digit
        curr_score = 0
        for x in freq.values():
            curr_score += ((x - 1) * x) // 2
        # Update result variables if necessary
        if curr_score <= min_score:
            min_score = curr_score
            min_grid = [row[:] for row in grid]
        if curr_score >= max_score:
            max_score = curr_score
            max_grid = [row[:] for row in grid]
        return

    # Get intersection of candidate horizontal and vertical values
    candidates = horizontals[i][j].children.keys() & verticals[i][j].children.keys()
    # Backtrack if no valid value
    if len(candidates) == 0:
        return

    # Get diagonal candidates
    if i == j:
        candidates &= diagonal[i].children.keys()
        if len(candidates) == 0:
            return

    # Get anti-diagonal candidates
    if i == n - j - 1:
        candidates &= antidiagonal[i].children.keys()

    # Try to fill the cell with candidates and go to next cell
    for digit in candidates:
        # Fill digit
        grid[i][j] = digit

        # Update next cell horizontal trie reference
        if j < n - 1:
            horizontals[i][j + 1] = horizontals[i][j].children[digit]
        elif j == n - 1 and i < n - 1:
            horizontals[i + 1][0] = root

        # Update next cell vertical trie reference
        if i < n - 1:
            verticals[i + 1][j] = verticals[i][j].children[digit]

        # Update next cell diagonal trie reference
        if i == j and i < n - 1:
            diagonal[i + 1] = diagonal[i].children[digit]

        # Update next cell anti-diagonal trie reference
        if i == n - j - 1 and i < n - 1:
            antidiagonal[i + 1] = antidiagonal[i].children[digit]

        # Proceed to next cell
        if j < n - 1:
            backtrack(n, i, j + 1, horizontals, verticals, diagonal, antidiagonal, grid, root)
        else:
            backtrack(n, i + 1, 0, horizontals, verticals, diagonal, antidiagonal, grid, root)



# Initialize grid
grid = [[-1] * n for _ in range(n)]

# Process groups one by one
for group in groups.values():
    # Create trie and insert all the relevant primes
    trie = Trie()
    for prime in group:
        trie.insert(prime)

    # Initialize arrays to keep track of trie node references corresponding to each cell
    horizontals = [[trie.root] * n for _ in range(n)]
    verticals = [[trie.root] * n for _ in range(n)]
    diagonal = [trie.root] * n
    antidiagonal = [trie.root] * n

    # Find the answer using backtracking
    backtrack(n, 0, 0, horizontals, verticals, diagonal, antidiagonal, grid, trie.root)

In [6]:
min_score, max_score

(61, 488)

In [7]:
max_grid

[['1', '7', '3', '3', '3'],
 ['4', '1', '8', '1', '3'],
 ['3', '3', '3', '1', '7'],
 ['2', '3', '2', '9', '1'],
 ['7', '3', '1', '3', '3']]

In [8]:
min_grid

[['2', '8', '4', '6', '3'],
 ['8', '9', '0', '5', '1'],
 ['4', '0', '7', '3', '9'],
 ['6', '5', '3', '2', '7'],
 ['3', '1', '9', '7', '3']]
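
As a sanity check, a grid returned by the search can be verified independently of the trie. The sketch below is one possible way to do this, assuming a grid is a list of rows of digit characters as in `min_grid` and `max_grid`. The helpers `is_prime`, `square_lines`, and `is_valid_square` are hypothetical names introduced here, and trial division is used instead of `sympy` only to keep the example self-contained.

```python
def is_prime(m):
    """Trial-division primality test, adequate for 5- or 6-digit numbers."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def square_lines(grid):
    """All rows, columns, the diagonal and the anti-diagonal, as digit strings."""
    n = len(grid)
    rows = [''.join(row) for row in grid]
    cols = [''.join(grid[i][j] for i in range(n)) for j in range(n)]
    diag = ''.join(grid[i][i] for i in range(n))
    anti = ''.join(grid[i][n - 1 - i] for i in range(n))
    return rows + cols + [diag, anti]


def is_valid_square(grid):
    """True if every line is a prime without a leading zero and all digit sums agree."""
    lines = square_lines(grid)
    digit_sums = {sum(int(c) for c in line) for line in lines}
    return (len(digit_sums) == 1
            and all(line[0] != '0' and is_prime(int(line)) for line in lines))


# The 4x4 minimum-cost example from the problem statement
example = [list("1777"), list("5557"), list("9391"), list("7717")]
print(is_valid_square(example))  # True
```

In the notebook, the same check could be applied with `is_valid_square(min_grid)` and `is_valid_square(max_grid)`.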