<font size="18">Automatically Solving Sudokus and Meta Sudokus</font>

In [None]:
# set this notebook to use a large part of the browser window width
from IPython.display import HTML, display
display(HTML("<style>.container { width:70% !important; }</style>"))

# The Well-Known Sudoku Puzzle

## Introduction

The Sudoku version played on a board of 9 x 9 squares is a well-known puzzle game.
The goal of the puzzle is for the player to put a digit from 1 to 9 in each free square,
so that:
    (a) in every row of 9 squares, no digit occurs more than once
    (b) in every column of 9 squares, no digit occurs more than once
    (c) in every marked 3x3 subsquare, no digit occurs more than once.
A given 9 x 9 Sudoku puzzle typically has some squares that already contain a digit from 1 to 9.
An example Sudoku challenge, taken from [1], is:

<!DOCTYPE html>
<html>
<head>
<style>
table { border-collapse: collapse; font-family: Calibri, sans-serif; }
colgroup, tbody { border: solid medium; }
td { border: solid thin; height: 1.4em; width: 1.4em; text-align: center; padding: 0; }
</style>
</head>
<body>
<table>
  <caption>A standard Sudoku challenge</caption>
  <colgroup><col><col><col>
  <colgroup><col><col><col>
  <colgroup><col><col><col>
  <tbody>
   <tr> <td>1 <td>  <td>3 <td>6 <td>  <td>4 <td>7 <td>  <td>9
   <tr> <td>  <td>2 <td>  <td>  <td>9 <td>  <td>  <td>1 <td>
   <tr> <td>7 <td>  <td>  <td>  <td>  <td>  <td>  <td>  <td>6</td></tr>
  </tbody>
  <tbody>
   <tr> <td>2 <td>  <td>4 <td>  <td>3 <td>  <td>9 <td>  <td>8
   <tr> <td>  <td>  <td>  <td>  <td>  <td>  <td>  <td>  <td>
   <tr> <td>5 <td>  <td>  <td>9 <td>  <td>7 <td>  <td>  <td>1
  <tbody>
   <tr> <td>6 <td>  <td>  <td>  <td>5 <td>  <td>  <td>  <td>2
   <tr> <td>  <td>  <td>  <td>  <td>7 <td>  <td>  <td>  <td>
   <tr> <td>9 <td>  <td>  <td>8 <td>  <td>2 <td>  <td>  <td>5
</table>
</body>
</html>

## Sudoku Model Formulation

A Sudoku is a very easy problem to formulate as an Integer Linear Programming (ILP) problem. Its mathematical declarative model can be stated as follows.

### Sets and Indices


$D \in \mathbb{N}$: dimension of the (Meta) Sudoku board. A standard Sudoku has $D = 3$.

$\overline{n} = D^{2}$: the maximum number to be filled out in the Sudoku. The minimum number is always 1.

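$r \in R = \{0, \ldots, \overline{n}-1\}$: indices for the row of the Sudoku board.

$c \in C = \{0, \ldots, \overline{n}-1\}$: indices for the column of the Sudoku board.

$n \in N = \{0, \ldots, \overline{n}-1\}$: indices for the allowable number set of the Sudoku board. Index $n$ stands for the number $n+1$.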


### Parameters 

$f_{r, c}: R \times C \mapsto \{1, \ldots, \overline{n}\}$: specifies the already decided (fixed) numbers for some board squares in the Sudoku problem definition.
$f_{r,c}$ is the number preset for row $r$ and column $c$ of the Sudoku board. Typically, not all combinations of $r$ and $c$ are specified, so $f$ is a partial function.


### Decision Variables
$b_{r, c, n} \in \{0, 1\}$: This variable is equal to 1, if we decide to put in row r and column c the number n. Otherwise, the decision variable is equal to zero.

$n_{r, c} \in \{1, \ldots, \overline{n}\}$: This variable is equal to the number we decide to put in row r and column c.
These variables represent the solution more directly than the b variables. Of course b and n cannot be decided upon separately, so there will be a linking constraint between them.
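
The relation between the two kinds of variables can be illustrated without any solver. The following is only a minimal sketch: the helper names `one_hot` and `decode` are made up for this illustration, and it assumes zero-based number indices, so that index n stands for the number n+1.

```python
def one_hot(number, max_nr):
    """Return the 0/1 list b with b[n] == 1 exactly when n + 1 == number."""
    return [1 if n + 1 == number else 0 for n in range(max_nr)]


def decode(b):
    """Recover the number from its one-hot list: the sum of b[n] * (n + 1)."""
    return sum(bit * (n + 1) for n, bit in enumerate(b))


b = one_hot(5, 9)
print(b)          # [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(decode(b))  # 5
```

Since exactly one entry of b is 1, the weighted sum picks out a single number, which is what the numeric variable should hold.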

### Objective Function

Since Sudoku is a feasibility problem only, there is no concept of optimality here, and so no objective function to be minimized or maximized needs to be specified.

### Constraints 

- **Square-wise constraints**. Each square of the Sudoku board holds exactly 1 number $n \in N$.

\begin{equation}
\sum_{n \in N} b_{r, c, n} = 1 \quad \forall (r,c) \in R \times C
\tag{1}
\end{equation}

- **Row-wise constraints**. For each row $r$, ensure that each number $n \in N$ occurs exactly once.

\begin{equation}
\sum_{c \in C} b_{r, c, n} = 1 \quad \forall (r,n) \in R \times N
\tag{2}
\end{equation}

- **Column-wise constraints**. For each column $c$, ensure that each number $n \in N$ occurs exactly once.

\begin{equation}
\sum_{r \in R} b_{r, c, n} = 1 \quad \forall (c,n) \in C \times N
\tag{3}
\end{equation}

- **Subboard-wise constraints**. With $S = \{0, \ldots, D-1\}$, for each $D \times D$-sized subboard $(r, c) \in S \times S$, ensure that each number $n \in N$ occurs exactly once.

\begin{equation}
\sum_{(r\_, c\_) \in S \times S} b_{r \cdot D+r\_, c \cdot D+c\_, n} = 1 \quad \forall (r, c, n) \in S \times S \times N
\tag{4}
\end{equation}


- **Preset squares constraints**. Each preset square holds its given number.

\begin{equation}
\sum_{n \in N} b_{r, c, n} \cdot (n+1)= f_{r,c} \quad \forall (r,c) \in \operatorname{dom}(f)
\tag{5}
\end{equation}

- **Linking constraints between binary and numeric decision variables**.

\begin{equation}
\sum_{n \in N} b_{r, c, n} \cdot (n+1)= n_{r,c} \quad \forall (r,c) \in R \times C
\tag{6}
\end{equation}
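
To make the index sets of equations (1) to (4) concrete, here is a small solver-free sketch that checks them on a 0/1 array `b[r][c][n]`. It is not part of the Gurobi model. The function names and the shifting pattern used as test data are assumptions for illustration, and all indices are taken to be zero-based, with subboard indices and offsets running from 0 to D-1.

```python
from itertools import product


def pattern_grid(D):
    """A known valid solved board (zero-based numbers), used only as test data."""
    size = D * D
    return [[(D * (r % D) + r // D + c) % size for c in range(size)]
            for r in range(size)]


def to_binary(grid):
    """b[r][c][n] == 1 exactly when square (r, c) holds number index n."""
    size = len(grid)
    return [[[1 if grid[r][c] == n else 0 for n in range(size)]
             for c in range(size)] for r in range(size)]


def satisfies_constraints(b, D):
    """Check equations (1) to (4) on a 0/1 array b[r][c][n]."""
    idx = range(D * D)   # rows, columns and number indices
    sub = range(D)       # subboard indices and offsets inside a subboard
    square = all(sum(b[r][c][n] for n in idx) == 1
                 for r, c in product(idx, idx))
    row = all(sum(b[r][c][n] for c in idx) == 1
              for r, n in product(idx, idx))
    col = all(sum(b[r][c][n] for r in idx) == 1
              for c, n in product(idx, idx))
    box = all(
        sum(b[r * D + r_][c * D + c_][n] for r_, c_ in product(sub, sub)) == 1
        for r, c, n in product(sub, sub, idx))
    return square and row and col and box


b = to_binary(pattern_grid(3))
print(satisfies_constraints(b, 3))   # True
```

Each of the four checks mirrors one equation: the inner `sum` is the left-hand side, and the surrounding `all` plays the role of the "for all" quantifier.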

# General Implementation in Python3 with gurobipy API to the Gurobi MILP Solver

We store the Sudoku problem formulation in a json file, like this one.

## Store a Sudoku Problem specification in a Json File

In [None]:
cat 'sudokuDim3.json'

The "dim" field indicates the dimension of the Sudoku, which in the most common case is equal to 3. The "fixed" field holds a nested dictionary: the first key is the Sudoku board row, the second key is the board column, and the value is the number that is already decided for that row and column of the board. Rows and columns are numbered from 1.
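
As an illustration of this layout, the sketch below builds such a structure in Python and prints it as JSON. It is a hypothetical example rather than the actual content of 'sudokuDim3.json': it only encodes the first three rows of the example board shown earlier, and it assumes the numbers are stored as integers and that the file carries an empty "solved" entry.

```python
import json

# Hypothetical example: givens of the first three rows of the example board.
# Row and column keys are 1-based strings, since JSON object keys are strings.
example = {
    "dim": 3,
    "fixed": {
        "1": {"1": 1, "3": 3, "4": 6, "6": 4, "7": 7, "9": 9},
        "2": {"2": 2, "5": 9, "8": 1},
        "3": {"1": 7, "9": 6},
    },
    "solved": {},  # assumed placeholder, to be filled in after solving
}

text = json.dumps(example, indent=2)
print(text)

# Reading the text back gives the same nested dictionaries.
assert json.loads(text) == example
```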

## Read a Sudoku Problem from a Json File

We then write a function that reads the Sudoku from such a JSON file, performs some checks at the same time and returns the dimension "dim" and the "fixed" data structure as a Python dictionary.

In [None]:
import json

def read_sudoku_from_json_file_and_check(file_name):
    verbose = 0  # set to 1 to see more output

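    # use a context manager so that the file is closed again after reading
    with open(file_name) as json_file:
        sudoku_json = json.load(json_file)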

    # do some basic checks to see we have all the information needed and none other
    errors = ''
     
    # read dim(ension) field
    dim = int(sudoku_json["dim"])
    if verbose > 0: 
        print("dim = {:d}".format(dim))
    max_nr = pow(dim, 2)
    if verbose > 0: 
        print("max_nr = {:d}".format(max_nr))
    nrs = list(range(1, max_nr+1))
    nrs_str = '[' + ','.join([str(nr) for nr in nrs]) + ']'
    if verbose > 0: 
        print(nrs_str)

    # read fixed part
    fixed = sudoku_json["fixed"]
    for row in fixed:
        r = int(row)
        if r not in nrs:
            errors += \
            'row index number should be in {:s} but is {:d}.\n'.\
            format(nrs_str, r)
        for col in fixed[row]:
            c = int(col)
            if c not in nrs:
                errors += 'column index number should be in {:s} but is {:d}.\n'.\
                format(nrs_str, c)
            num = fixed[row][col]
            n = int(num)
            if n not in nrs:
                errors += 'square[{:d}][{:d}] number should be in {:s} but is {:d}.\n'.\
                format(r, c, nrs_str, n)

    print('I have read a ' + ('faulty' if (errors!='') else 'valid') +\
          ' MetaSudoku problem description of dimension {:d}.'.format(dim) + '\n' + errors)
    return dim, fixed
    
dim, fixed = read_sudoku_from_json_file_and_check('sudokuDim3.json')

## Solve a Sudoku Problem

The following function solves the problem using the solver Gurobi.

In [None]:
# tested with Python 3.7.6 & Gurobi 9
from gurobipy import GRB, Model, quicksum

def solve_sudoku_with_gurobi(dim, fixed):

    verbose = 0
    
    n_rows = n_cols = n_nums = dim * dim
    n_subs = dim

    rows = cols = nums = list(range(n_rows))
    subs  = list(range(n_subs))

    if verbose > 0:
        print(rows); print(cols); print(nums)
        print(subs)
    
    m = Model()

    # define the binary core variables
    bin_vars = m.addVars(n_rows, n_cols, n_nums, vtype=GRB.BINARY, name='bin')

    # define the basic Constraints
    for r in rows:
        for c in cols:
            constr_name = \
            'uniqueNumberPerSquare_r{:d}_c{:d}'.format(r, c)
            m.addConstr(quicksum(bin_vars[r,c,n] for n in nums) == 1, 
                        constr_name)

    for r in rows:
        for n in nums:
            constr_name = \
            'noDoublesInRow_r{:d}_n{:d}'.format(r, n)
            m.addConstr(quicksum(bin_vars[r,c,n] for c in cols) == 1, 
                        constr_name)

    for c in cols:
        for n in nums:
            constr_name = \
            'noDoublesInCol_c{:d}_n{:d}'.format(c, n)
            m.addConstr(quicksum(bin_vars[r,c,n] for r in rows) == 1, 
                        constr_name)

    import itertools
    combos = list(itertools.product(*[subs, subs]))
    if verbose > 0:
        print(combos)

    for r in subs:
        for c in subs:
            for n in nums:
                constr_name = \
                    'noDoublesInSubboard_r{:d}_c{:d}_n{:d}'.format(r, c, n)
                m.addConstr(quicksum(bin_vars[r*dim+r_,c*dim+c_,n] \
                                     for r_,c_ in combos) == 1, constr_name)

    # define the numeric helper variables so that the board 
    # can easily be displayed:
    num_vars = m.addVars(n_rows, n_cols, vtype=GRB.INTEGER, 
                         lb=1, ub=n_nums, name='num') 
    # note that the lower bound is 1 and not 0.     
    
    # initial squares, fixed
    for r_str in fixed:
        r = int(r_str)-1
        for c_str in fixed[r_str]:
            c = int(c_str)-1
            f = int(fixed[r_str][c_str])
            constr_name = 'binFixRelation_r{:d}_c{:d}_f{:d}'.format(r, c, f)
            m.addConstr(quicksum(bin_vars[r,c,n] * (n+1) for n in nums)\
                        == f, constr_name)    
    
    # define the constraints linking binary and numeric variables
    for r in rows:
        for c in cols:
            constr_name = 'binNumRelation_r{:d}_c{:d}'.format(r, c)
            m.addConstr(quicksum(bin_vars[r,c,n] * (n+1) for n in nums) \
                        == num_vars[r,c], constr_name)
            # note the n+1 instead of n: number index n stands for the
            # number n+1, matching the lower bound of 1 of the num_vars.

    # optimize the model
    m.optimize()
    
    # retrieve solution
    num_vals = m.getAttr('x', num_vars)
        
    return rows, cols, subs, num_vals

rows, cols, subs, num_vals = solve_sudoku_with_gurobi(dim, fixed)
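
Before displaying or storing the result, it can be worth converting it to plain integers and checking it against the preset squares. The sketch below assumes that `num_vals` maps zero-based `(row, col)` pairs to floating point values, as in the code above. The helper names `to_grid` and `givens_respected` and the 4 x 4 sample data are made up for illustration.

```python
def to_grid(num_vals, size):
    """Turn a {(row, col): value} mapping into a nested list of ints.

    round() is used because a MILP solver may return a value such as
    8.999999 for an integer variable, and int() alone would truncate it to 8.
    """
    return [[int(round(num_vals[(r, c)])) for c in range(size)]
            for r in range(size)]


def givens_respected(fixed, grid):
    """True if every preset square kept its number (fixed has 1-based string keys)."""
    return all(grid[int(r) - 1][int(c) - 1] == int(value)
               for r, row_dict in fixed.items()
               for c, value in row_dict.items())


# Hypothetical 4 x 4 result with slightly inexact values.
example_vals = {(r, c): float((2 * (r % 2) + r // 2 + c) % 4 + 1) - 1e-9
                for r in range(4) for c in range(4)}
example_fixed = {"1": {"1": 1, "4": 4}, "3": {"2": 3}}

grid = to_grid(example_vals, 4)
print(grid[0])                                # [1, 2, 3, 4]
print(givens_respected(example_fixed, grid))  # True
```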

## Display a solved Sudoku problem

This function generates html code that can be easily displayed in this python notebook.

In [None]:
def display_solution(
    rows, cols, subs, 
    fixed, num_vals, caption, 
    sudoku_table_style="table { border-collapse: collapse; " + \
    "font-family: Calibri, sans-serif; } " + \
    "colgroup, tbody { border: solid thin; } td { border: solid thin; "\
    "height: 1.4em; width: 1.4em; text-align: center; padding: 0; }"):
    table = '\n<table>\n'
    table += '  <caption>{:s}</caption>\n'.format(caption)
    N = len(subs)
    for s1 in subs:
        table += '  <colgroup>'
        for s2 in subs:
            table += '<col>'
        table += '\n'
    for r in rows:
        if (r % N) == 0:
            table += '\n<tbody>'
        table += '\n  <tr>'
        for c in cols:
            pre = '<td style="color:black;">'
            if str(r+1) in fixed:
                if str(c+1) in fixed[str(r+1)]:
                    pre = '<td style="color:red;">'   
            # round before converting: the solver may return e.g. 8.999999
            table += ' ' + pre + '{:d} '.format(int(round(num_vals[(r,c)])))
    table += '\n</table>'
    # HTML(table)
    return HTML('<html><head><style>' + sudoku_table_style  + \
                '</style></head><body>' + table + '</body></html>') 
    
display_solution(rows, cols, subs, fixed, num_vals, '3x3x3x3 Sudoku')

## Write the solution back to a json File

You may have spotted that our input file 'sudokuDim3.json', specifying the Sudoku problem, contained an empty subdictionary with key "solved" and that it was not read at all by the function 'read_sudoku_from_json_file_and_check'. This is of course a placeholder for the solution to be written back. Let's write a function to do that. Note that we want to keep the separation between the fixed and the solved squares in the output file.

In [None]:
def write_sudoku_solution_to_json_file(dim, fixed, num_vals, output_file_name):
    d = {}
    d["dim"] = dim
    d["fixed"] = fixed  # fixed stores keys in row then col and both 
    # in string form already, since we read it from json input
    d["solved"] = {}
    for (row, col) in num_vals:  # num_vals stores row, col keys 
        # as an integer pair 
        #print(row, col)
        row_str = str(row+1)
        col_str = str(col+1)
        if row_str in fixed and col_str in fixed[row_str]:
            # it's part of the fixed squares and will be written 
            # out via d["fixed"]
            pass
        else:
            if not (row_str in d["solved"]):
                d["solved"][row_str]= {}
            # round before converting: the solver may return e.g. 8.999999
            d["solved"][row_str][col_str] = int(round(num_vals[(row,col)]))
    with open(output_file_name, 'w') as outfile:
        json.dump(d, outfile, indent=2)
        
write_sudoku_solution_to_json_file(dim, fixed, num_vals, 
                                   'sudokuDim3_solved.json')

In [None]:
cat 'sudokuDim3_solved.json'

## One function to read, solve, write and display a Sudoku

Taking it all together, we can bundle the reading, solving, writing and displaying into one function.

In [None]:
def read_solve_write_display_sudoku(input_file_name, display=True):
    dim, fixed = read_sudoku_from_json_file_and_check(input_file_name)
    rows, cols, subs, num_vals = solve_sudoku_with_gurobi(dim, fixed)
    output_file_name = input_file_name.replace('.json', '_solved.json')
    write_sudoku_solution_to_json_file(dim, fixed, 
                                       num_vals, output_file_name)
    if display:
        html_table = display_solution(
            rows, cols, subs, fixed, num_vals,
            '{:d} x {:d} x {:d} x {:d} Sudoku'.format(dim, dim, 
                                                      dim, dim))
        return html_table

## Fixed point check
The "fixed" part of the solved file 'sudokuDim3_solved.json' is of course exactly the same as that of the unsolved version in the file 'sudokuDim3.json'. This means we can test that the solved file, when read and solved again, leads to the same solution.

In [None]:
read_solve_write_display_sudoku('sudokuDim3_solved.json', 
                                display=False)

Indeed:

diff sudokuDim3_solved.json sudokuDim3_solved_solved.json

gives no output, meaning the files are identical.
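
The same comparison can be made from within Python, which is handy where no `diff` command is available. This is a minimal sketch with a hypothetical helper name `same_json_content`. Unlike `diff`, it compares the parsed data, so differences in whitespace or key order are ignored.

```python
import json
import os
import tempfile


def same_json_content(path_a, path_b):
    """True if both files hold equal JSON data, ignoring layout and key order."""
    with open(path_a) as file_a, open(path_b) as file_b:
        return json.load(file_a) == json.load(file_b)


# Self-contained demonstration with two temporary files.
data = {"dim": 1, "fixed": {}, "solved": {"1": {"1": 1}}}
with tempfile.TemporaryDirectory() as folder:
    first = os.path.join(folder, "a.json")
    second = os.path.join(folder, "b.json")
    with open(first, "w") as out:
        json.dump(data, out, indent=2)
    with open(second, "w") as out:
        json.dump(data, out)          # different layout, same content
    print(same_json_content(first, second))   # True
```

For the check above, the call would be `same_json_content('sudokuDim3_solved.json', 'sudokuDim3_solved_solved.json')`.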

# Meta Sudoku

By the definition of the variable $D$ above, or just by the title of this article, you will have realised that
a Sudoku can be extended to other values of dim. A Meta Sudoku of dimension 4, for example, is played on a 16 x 16 board with the numbers 1 to 16.

## Scaling Down

Oh, let's first try to solve smaller Sudokus, for D=2, D=1 and even D=0. That's a good test to see if our code is robust against corner cases.

In [None]:
read_solve_write_display_sudoku('sudokuDim2.json')

That is easy to check for correctness.

In [None]:
read_solve_write_display_sudoku('sudokuDim1.json')

That's fine and the only 1x1 Sudoku around.

In [None]:
read_solve_write_display_sudoku('sudokuDim0.json')

Even that works! :) That may not surprise you, but the solver XPRESS, for example, at least through its C++ API in 2016, gave an error if you passed it a problem with zero variables and zero constraints.

## Scaling Up

Time to scale up now. How about dimension 4?

In [None]:
read_solve_write_display_sudoku('sudokuDim4.json')

In [None]:
read_solve_write_display_sudoku('sudokuDim5.json')

In [None]:
read_solve_write_display_sudoku('sudokuDim6.json')

In [None]:
read_solve_write_display_sudoku('sudokuDim7.json')
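
To get a feeling for how quickly the model grows with the dimension, the following sketch counts the variables and constraints that the formulation above creates. The function name `model_size` is hypothetical, and the counts are simply derived from the loops in `solve_sudoku_with_gurobi`. A solver's presolve step may remove many of them again.

```python
def model_size(dim, n_fixed=0):
    """Count variables and constraints of the formulation for dimension dim."""
    size = dim * dim          # board side length, also the largest number
    binaries = size ** 3      # one b variable per (row, column, number)
    integers = size ** 2      # one n variable per (row, column)
    # Constraint groups (1), (2), (3), (4) and (6) have size**2 members each;
    # group (5) has one constraint per preset square.
    constraints = 5 * size ** 2 + n_fixed
    return binaries, integers, constraints


for dim in range(1, 8):
    print(dim, model_size(dim))
```

For the standard dimension 3 this gives 729 binary variables, 81 integer variables and 405 constraints, plus one constraint per preset square. For dimension 7 there are already 117649 binary variables.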

# Other Sudoku Related Ideas

How does Gurobi compare to previous versions of its solver and also to other solvers in terms of solve times for larger MetaSudoku instances? The code we wrote uses the Gurobi-specific Python API, and we do not want to recode it for each separate solver. But one can imagine it should be possible to write out solver-independent AMPL code and then run CPLEX or XPRESS or CBC via that AMPL code.

An App on your phone that would recognize a Sudoku problem by camera, also recognize the filled in digits using some OCR and then immediately solve the problem and overlay the solution on screen in an Augmented Reality sense would not save the world, but still be really cool, right?! :)

We can very well solve Sudokus by computer now, but in fact Sudokus are created because some humans seem to derive pleasure from solving them with their natural brains. So depriving them of that satisfaction is not a very useful undertaking. Here, we have been reading Sudokus from a json file, but I generated them manually in a pretty much trial and error way. As for computer generation of them, brute force generation of random numbers for some random squares can easily lead to infeasible Sudoku problems. So clearly something smarter is needed. It could be enjoyable to dabble a bit in that. However, there are plenty to be found on the internet already. 
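
One simple way to avoid infeasible puzzles is to start from a board that is already solved and keep only some of its squares as givens. The sketch below does just that. It is an assumption-laden illustration, not the way the puzzles in this notebook were made: the names `solved_board` and `random_puzzle` are hypothetical, the solved board comes from a simple shifting pattern, and the resulting puzzle is feasible by construction but need not have a unique solution.

```python
import random


def solved_board(dim):
    """A valid solved board built from a simple shifting pattern (1-based numbers)."""
    size = dim * dim
    return [[(dim * (r % dim) + r // dim + c) % size + 1 for c in range(size)]
            for r in range(size)]


def random_puzzle(dim, n_givens, seed=None):
    """Keep n_givens randomly chosen squares of a solved board as the 'fixed' part."""
    rng = random.Random(seed)
    size = dim * dim
    board = solved_board(dim)
    # Relabel the numbers so that puzzles do not all show the same pattern;
    # a permutation of the symbols keeps the board valid.
    labels = list(range(1, size + 1))
    rng.shuffle(labels)
    squares = [(r, c) for r in range(size) for c in range(size)]
    fixed = {}
    for r, c in sorted(rng.sample(squares, n_givens)):
        # 1-based string keys, matching the json layout used in this notebook
        fixed.setdefault(str(r + 1), {})[str(c + 1)] = labels[board[r][c] - 1]
    return {"dim": dim, "fixed": fixed, "solved": {}}


puzzle = random_puzzle(2, 6, seed=1)
print(puzzle)
```

The returned dictionary can be written out with `json.dump` and then passed to the reading and solving functions above. Making the puzzles interesting for humans, for example by enforcing a unique solution, would need more work.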

I also wonder how a technique like reinforcement learning could solve these discrete optimisation problems. We could have agents per constraint, each trying to satisfy their constraint, without coordination with any other agent. They would receive a reward if their constraint is satisfied or close to satisfied. Would that work? Or would it, due to the discrete nature of the problem, rather just keep oscillating and never converge to a valid solution? It sounds quite similar to decoupling a MILP approach into an ADMM approach, which also generally has no guarantee of converging to a solution when the problem contains integer variables. However, [2] claims to prove convergence for the special ADMM case it sets up.

# References

[1] Sudoku Solving Algorithms, Wikipedia. (https://en.wikipedia.org/wiki/Sudoku_solving_algorithms#Computation_time)

[2] Baoyuan Wu, Bernard Ghanem, lp-Box ADMM: A Versatile Framework for Integer Programming. (https://arxiv.org/pdf/1604.07666.pdf)

Peter Sels, March 22nd, 2020. Copyright © 2020 Logically Yours BV.