# Sudoku Solver
---

Sudoku is a number puzzle played on a 9x9 grid divided into nine 3x3 subgrids. Some of the squares are pre-filled with digits.
The objective of the puzzle is to fill all 81 squares with the digits 1-9 so that no digit is repeated in any row, column, or box.

The objective of this work is to design an agent that solves Sudoku puzzles.


In [1]:
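# Enable autoreload so edits to SudokuSolver.py are picked up before each cell runs
%load_ext autoreload
%autoreload 2

from SudokuSolver import SudokuSolver

# Placeholder type names; `squares` is rebound to the tuple of square labels in a later cell
squares     = str
gridValDict = dict
cols        = '123456789'
rows        = 'ABCDEFGHI'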

In [2]:
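def cross(A, B) -> tuple:
    "Return every concatenation a+b for a in A and b in B, e.g. cross('AB', '12') gives ('A1', 'A2', 'B1', 'B2')."
    return tuple(a+b for a in A for b in B)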

## Representation of puzzle
---

The puzzle is represented as follows (as described in Stuart Russell and Peter Norvig's 'Artificial Intelligence: A Modern Approach'):

- 81 squares ['A1', 'A2', ..., 'I9']
- the 3x3 subgrids are represented as boxes (9 boxes)
- units are rows, columns and boxes (subgrids)
- the peers of a square are the squares in the same row, column, or subgrid
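
The counts implied by this representation can be checked directly. The following standalone sketch rebuilds the same tables so that it runs on its own; it is a sanity check on the representation, not part of the solver.

```python
# Standalone sanity check of the square / unit / peer representation.
rows = 'ABCDEFGHI'
cols = '123456789'

def cross(A, B):
    """Return every concatenation a+b for a in A and b in B."""
    return tuple(a + b for a in A for b in B)

squares = cross(rows, cols)
all_boxes = [cross(rs, cs) for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')]
all_units = [cross(rows, c) for c in cols] + [cross(r, cols) for r in rows] + all_boxes
units = {s: tuple(u for u in all_units if s in u) for s in squares}
peers = {s: set().union(*units[s]) - {s} for s in squares}

print(len(squares), len(all_units))          # 81 27
print(len(units['C2']), len(peers['C2']))    # 3 20
```

Every square belongs to exactly 3 units (one column, one row, one box) and has 20 peers: 8 in its row, 8 in its column, and 4 more in its box that are not already counted.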

In [3]:
squares   = cross(rows, cols)
#list of tuples - [('A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3'),...] 
#all 3x3 boxes in puzzle - total 9
all_boxes = [cross(rs, cs)  for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')] 
#list of tuples - [('A1', 'B1', 'C1', 'D1', 'E1', 'F1', 'G1', 'H1', 'I1'),..]
#all  rows, columns and boxes - total 27
all_units = [cross(rows, c) for c in cols] + [cross(r, cols) for r in rows] + all_boxes

"""
dictionary holding units for each square - {'A1': (('A1',...,'I1'), ('A1',...,'A9'), ('A1',...,'C3')),
                                            'A2': (('A2',...,'I2'), ('A1',...,'A9'), ('A1',...,'C3')),
                                            .
                                            .
                                            'I9':(('A9',...,'I9'),  ('I1',...,'I9'), ('G7',...,'I9'))}

Total - 81 entries (one per square), each holding 3 units
"""

units     = {s: tuple(u for u in all_units if s in u) for s in squares}
# Dictionary holding the set of peers of each square - {'A1': {'A2', 'A3', ..., 'I1'}, ..., 'I9': {'A9', ..., 'I8'}}
# Total 81 entries (one per square), each holding 20 peers

peers     = {s: set().union(*units[s]) - {s} for s in squares}

In [4]:
# Function to display the Sudoku grid

def display(values):
    "Display these values as a 2-D grid."
    # Column width: the longest candidate string plus one space of padding
    width = 1 + max(len(values[s]) for s in squares)
    line = '+'.join(['-' * (width * 3)] * 3)
    for r in rows:
        print(''.join(values[r + c].center(width) + ('|' if c in '36' else '')
                      for c in cols))
        if r in 'CF':
            print(line)
    print()



## Constraint Satisfaction Problem
---
A constraint satisfaction problem (CSP) is a mathematical problem that is solved by finding values for all of its variables such that every constraint is satisfied. A CSP consists of:

- **Variables** - The unknowns that must be assigned values. In Sudoku, the 81 squares from A1 to I9 are the variables.

- **Domains** - The set of values that can be assigned to each variable. In Sudoku, the domain of every square is the digits 1-9.

- **Constraints** - The restrictions that an assignment must satisfy. In Sudoku, no digit may be repeated within any row, column, or subgrid (box).


Sudoku is treated as a constraint satisfaction problem with a partial assignment, since some of the squares are already filled and the remaining squares must be filled.
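
As a rough illustration of the variables and domains, the sketch below turns an 81-character grid string into a dictionary of domains. The helper `grid_to_domains` is a hypothetical name introduced here; it is not the `SudokuSolver.gridtoValues` implementation, whose internals are not shown in this notebook.

```python
# Hypothetical helper (an assumption, not SudokuSolver.gridtoValues):
# map an 81-character grid string to a {square: domain} dictionary.
ROWS, COLS, DIGITS = 'ABCDEFGHI', '123456789', '123456789'
SQUARES = [r + c for r in ROWS for c in COLS]

def grid_to_domains(grid):
    """Given squares get a one-digit domain; empty squares ('.' or '0') get all nine digits."""
    cells = [ch for ch in grid if ch in DIGITS or ch in '.0']
    if len(cells) != 81:
        raise ValueError('expected 81 cells, got %d' % len(cells))
    return {s: (ch if ch in DIGITS else DIGITS) for s, ch in zip(SQUARES, cells)}

grid1 = '..3.2.6..9..3.5..1..18.64....81.29..7.......8..67.82....26.95..8..2.3..9..5.1.3..'
domains = grid_to_domains(grid1)
print(domains['A3'])   # 3          (a given: the partial assignment)
print(domains['A1'])   # 123456789  (an unassigned variable with its full domain)
```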

There are different types of constraints:
* **Unary constraint** - A constraint on a single variable.
* **Binary constraint** - A constraint relating two variables.
* **Global constraint** - A constraint involving an arbitrary number of variables. Alldiff is a global constraint that requires all of its variables to take different values. The constraint in Sudoku is an alldiff constraint, since all the squares in each unit (row, column, box) must hold different digits.
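
A minimal sketch of the alldiff constraint follows, assuming an assignment is stored as a dictionary from square name to digit. The function names `alldiff` and `is_consistent` are illustrative and do not come from the solver module.

```python
ROWS, COLS = 'ABCDEFGHI', '123456789'

def cross(A, B):
    return [a + b for a in A for b in B]

# The 27 units: 9 columns, 9 rows, 9 boxes
ALL_UNITS = ([cross(ROWS, c) for c in COLS] +
             [cross(r, COLS) for r in ROWS] +
             [cross(rs, cs) for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')])

def alldiff(assignment, unit):
    """True if the assigned squares of `unit` hold pairwise different digits."""
    assigned = [assignment[s] for s in unit if s in assignment]
    return len(assigned) == len(set(assigned))

def is_consistent(assignment):
    """True if every one of the 27 units satisfies its alldiff constraint."""
    return all(alldiff(assignment, unit) for unit in ALL_UNITS)

print(is_consistent({'A1': '4', 'A2': '8', 'B1': '9'}))   # True
print(is_consistent({'A1': '4', 'C3': '4'}))              # False (A1 and C3 share a box)
```

A partial assignment is consistent as long as no unit already contains a repeated digit; a complete consistent assignment is a solution.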

  

In [5]:
# Example Sudoku puzzles - one easy (grid1) and one hard (grid2)

grid1 = '..3.2.6..9..3.5..1..18.64....81.29..7.......8..67.82....26.95..8..2.3..9..5.1.3..'
grid2 = '..53.....8......2..7..1.5..4....53...1..7...6..32...8..6.5....9..4....3......97..'

solver = SudokuSolver(squares, all_units, peers)
constraint = 3
gridValDict = solver.gridtoValues(grid1)
consProp = solver.reduceGridVal(gridValDict, constraint)
print('-'*60)
print("\tPuzzle after constraint propagation")
print('-'*60)
display(consProp)
print('-'*60)
print("\tPuzzle after backtracking")
print('-'*60)
solution = solver.backtrack(consProp, constraint)
display(solution)

# solution = solver.solvePuzzle(grid)
# display(solution)

------------------------------------------------------------
	Puzzle after constraint propagation
------------------------------------------------------------
4 8 3 |9 2 1 |6 5 7 
9 6 7 |3 4 5 |8 2 1 
2 5 1 |8 7 6 |4 9 3 
------+------+------
5 4 8 |1 3 2 |9 7 6 
7 2 9 |5 6 4 |1 3 8 
1 3 6 |7 9 8 |2 4 5 
------+------+------
3 7 2 |6 8 9 |5 1 4 
8 1 4 |2 5 3 |7 6 9 
6 9 5 |4 1 7 |3 8 2 

------------------------------------------------------------
	Puzzle after backtracking
------------------------------------------------------------
4 8 3 |9 2 1 |6 5 7 
9 6 7 |3 4 5 |8 2 1 
2 5 1 |8 7 6 |4 9 3 
------+------+------
5 4 8 |1 3 2 |9 7 6 
7 2 9 |5 6 4 |1 3 8 
1 3 6 |7 9 8 |2 4 5 
------+------+------
3 7 2 |6 8 9 |5 1 4 
8 1 4 |2 5 3 |7 6 9 
6 9 5 |4 1 7 |3 8 2 



In the example above, the easy puzzle (grid1) is solved by constraint propagation alone. The hard puzzle (grid2) is not solved by constraint propagation alone, so backtracking is used to search for the solution.

## Constraint Propagation
Constraint propagation is a form of inference that reduces the domain of a variable, which in turn can reduce the domain of another variable, and so on.

In the Sudoku solver module, I have implemented three constraint propagation methods, plus backtracking to search for the required combination.
Simple naive backtracking alone can find the solution to a Sudoku puzzle, but its drawback is time complexity: it searches the combinations from scratch, which in the worst case is on the order of 9^(81-n) assignments, where n is the number of squares already filled.
To speed up the process, constraint propagation reduces the domains of some variables by applying a few strategies. I applied the following strategies and compared their results.

- **Elimination strategy**
    - If a value is present in one of the squares (variables), eliminate that value from the domains of all its peers (i.e. the squares in the same row, column, or box).


- **Single possibility strategy**
    - If a square has only one possible number, assign that number to the square (remove all the other values from the variable's domain).
    

- **Naked twins rule**
    - This is a slightly more complicated strategy that helps with some hard puzzles. When two squares in the same unit (row/column/box) each have exactly two domain values, and those values are the same, the two values can only occur in those two squares. They can therefore be removed from the domains of all the other squares in that unit.
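
The three strategies can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `SudokuSolver`: domains are stored as strings of candidate digits, all function names are hypothetical, and the single possibility step is interpreted here as "when a digit fits in only one square of a unit, assign it there".

```python
ROWS, COLS, DIGITS = 'ABCDEFGHI', '123456789', '123456789'

def cross(A, B):
    return [a + b for a in A for b in B]

SQUARES = cross(ROWS, COLS)
ALL_UNITS = ([cross(ROWS, c) for c in COLS] + [cross(r, COLS) for r in ROWS] +
             [cross(rs, cs) for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')])
PEERS = {s: {p for u in ALL_UNITS if s in u for p in u} - {s} for s in SQUARES}

def eliminate(values):
    """Elimination: remove the digit of every solved square from its peers' domains."""
    for s in SQUARES:
        if len(values[s]) == 1:
            for p in PEERS[s]:
                values[p] = values[p].replace(values[s], '')
    return values

def single_possibility(values):
    """Assumed reading: if a digit has only one possible square in a unit, assign it there."""
    for unit in ALL_UNITS:
        for d in DIGITS:
            places = [s for s in unit if d in values[s]]
            if len(places) == 1:
                values[places[0]] = d
    return values

def naked_twins(values):
    """Two squares of a unit sharing the same two candidates claim both digits for themselves."""
    for unit in ALL_UNITS:
        pairs = [s for s in unit if len(values[s]) == 2]
        for i, a in enumerate(pairs):
            for b in pairs[i + 1:]:
                if len(values[a]) == 2 and values[a] == values[b]:
                    for s in unit:
                        if s not in (a, b):
                            for d in values[a]:
                                values[s] = values[s].replace(d, '')
    return values

def propagate(values):
    """Apply the three strategies repeatedly until no domain changes."""
    while True:
        before = dict(values)
        naked_twins(single_possibility(eliminate(values)))
        if values == before:
            return values

# Naked twins on a hand-made state: A1 and A2 both hold exactly the candidates 2 and 3.
demo = {s: DIGITS for s in SQUARES}
demo['A1'] = demo['A2'] = '23'
naked_twins(demo)
print(demo['A3'], demo['D1'])   # 1456789 123456789

# Propagation on the easy puzzle from this notebook.
grid1 = '..3.2.6..9..3.5..1..18.64....81.29..7.......8..67.82....26.95..8..2.3..9..5.1.3..'
values = propagate({s: (ch if ch in DIGITS else DIGITS) for s, ch in zip(SQUARES, grid1)})
print(sum(len(v) == 1 for v in values.values()), 'of 81 squares solved by propagation alone')
```

Each strategy only ever removes candidates, so the loop in `propagate` is guaranteed to stop once a pass leaves every domain unchanged.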


After constraint propagation, most of the easy puzzles were solved. The hard puzzles, however, still required backtracking.

## Backtracking Search
Backtracking search uses depth-first search: it keeps assigning values to variables until it either reaches a complete assignment or hits an inconsistency (i.e. no value can be assigned to a variable without violating a constraint), in which case it backtracks.
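
A minimal sketch of plain backtracking is shown below, assuming a static (row by row) variable order and digits tried in increasing order. It works on a dictionary of assigned squares and is only an illustration of the idea; it is not the `SudokuSolver.backtrack` method.

```python
ROWS, COLS, DIGITS = 'ABCDEFGHI', '123456789', '123456789'
SQUARES = [r + c for r in ROWS for c in COLS]

def peers_of(s):
    """Squares sharing a row, column, or 3x3 box with s."""
    r, c = ROWS.index(s[0]), COLS.index(s[1])
    return {ROWS[i] + COLS[j] for i in range(9) for j in range(9)
            if (i, j) != (r, c)
            and (i == r or j == c or (i // 3, j // 3) == (r // 3, c // 3))}

PEERS = {s: peers_of(s) for s in SQUARES}

def backtrack(assignment):
    """Depth-first search over the unassigned squares in fixed (static) order."""
    unassigned = [s for s in SQUARES if s not in assignment]
    if not unassigned:
        return assignment                   # every square is filled: solved
    s = unassigned[0]
    for d in DIGITS:                        # value ordering: digits in order
        # Only try d if no peer already holds it (consistency check)
        if all(assignment.get(p) != d for p in PEERS[s]):
            assignment[s] = d
            if backtrack(assignment):
                return assignment
            del assignment[s]               # undo and try the next digit
    return None                             # no digit fits here: backtrack

grid1 = '..3.2.6..9..3.5..1..18.64....81.29..7.......8..67.82....26.95..8..2.3..9..5.1.3..'
givens = {s: ch for s, ch in zip(SQUARES, grid1) if ch in DIGITS}
solution = backtrack(dict(givens))
print(''.join(solution[s] for s in SQUARES[:9]))   # 483921657 (row A of the solved grid)
```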

When selecting an unassigned variable during backtracking, both variable ordering and value ordering should be considered. In this problem, I used the minimum remaining values (MRV) heuristic and compared its performance with static variable ordering and random ordering. For value ordering, I simply tried the digits in order.
<br>
I then added forward checking, which removes from the domains of the unassigned variables every value that is inconsistent with the constraints each time a value is assigned during backtracking.
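
The combination of MRV and forward checking can be sketched as below. The functions `assign` and `solve` are hypothetical names for this illustration, and domains are again assumed to be strings of candidate digits; the solver module may organize this differently.

```python
ROWS, COLS, DIGITS = 'ABCDEFGHI', '123456789', '123456789'
SQUARES = [r + c for r in ROWS for c in COLS]
BOX = {s: (ROWS.index(s[0]) // 3, COLS.index(s[1]) // 3) for s in SQUARES}
PEERS = {s: {p for p in SQUARES
             if p != s and (p[0] == s[0] or p[1] == s[1] or BOX[p] == BOX[s])}
         for s in SQUARES}

def assign(values, s, d):
    """Forward checking: set s = d on a copy and remove d from the domains of s's peers.
    Returns None if some peer is left with an empty domain."""
    new = dict(values)
    new[s] = d
    for p in PEERS[s]:
        if d in new[p]:
            new[p] = new[p].replace(d, '')
            if not new[p]:
                return None
    return new

def solve(values, assigned):
    """Backtracking search that picks the unassigned square with the fewest candidates (MRV)."""
    if len(assigned) == len(SQUARES):
        return values
    s = min((sq for sq in SQUARES if sq not in assigned), key=lambda sq: len(values[sq]))
    for d in values[s]:                     # value ordering: digits in order
        new = assign(values, s, d)
        if new is not None:
            result = solve(new, assigned | {s})
            if result is not None:
                return result
    return None

def solve_grid(grid):
    """Assign the givens first (with forward checking), then search."""
    values, assigned = {s: DIGITS for s in SQUARES}, set()
    for s, ch in zip(SQUARES, grid):
        if ch in DIGITS:
            values = assign(values, s, ch)
            if values is None:
                return None                 # the givens already contradict each other
            assigned.add(s)
    return solve(values, assigned)

grid1 = '..3.2.6..9..3.5..1..18.64....81.29..7.......8..67.82....26.95..8..2.3..9..5.1.3..'
solved = solve_grid(grid1)
print(''.join(solved[s] for s in SQUARES[:9]))   # 483921657 (row A of the solved grid)
```

Tracking the `assigned` set separately from the domains matters: a square whose domain has shrunk to one digit is still picked by MRV and assigned, so that its digit is forward-checked against its peers.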

## Performance comparison for the solver with various combinations of methods
---


For the performance comparison, I used the easy, hard and hardest puzzle sets from Peter Norvig's website.
There are 50 easy, 95 hard and 11 hardest Sudoku puzzles.
<br>
First, I used the elimination and single possibility strategies and compared the results for the easy, hard and hardest puzzles.
Then I used the elimination, single possibility and naked twins strategies on all the Sudoku puzzles.


In [17]:
import time

In [13]:
# Function to calculate the average time taken to solve all the puzzles in the given file
def timetaken(filename, constraint, var_order):
    # Read every puzzle (one per line) before starting the clock
    with open(filename, 'r') as puzzle_file:
        lines = puzzle_file.readlines()
    start = time.time()
    for line in lines:
        solver.solvePuzzle(line, constraint, var_order)
    end = time.time()
    # Average time per puzzle, in seconds
    return (end - start) / len(lines)


In [14]:
# Prints a table from a dictionary mapping each filename to its average execution time (in seconds)

def generateTable(execTime):   
    # Print the names of the columns.
    print ("{:<15} {:<15}".format('Filename','Average execution time'))
    print ("{:<35} ".format('-'*35))
    # print each data item.
    for key, value in execTime.items():
        print ("{:<15} {:<15}".format(key,value))



In [None]:
def procTime(constraint, var_order):
    execTime = {}
    files = ['Easy50.txt', 'Hard.txt', 'Hardest.txt']
    for file in files:
        execTime[file] = timetaken(file,constraint, var_order)

    generateTable(execTime)  

In [23]:
""" 
Processing time for solving puzzles with elimination and single possibility constraint propagation
 and backtracking with the minimum remaining values heuristic
 """
procTime(2,'min')

Filename        Average execution time
----------------------------------- 
Easy50.txt      0.006899986267089844
Hard.txt        1.3040096809989528
Hardest.txt     0.17414966496554288


In [24]:
""" 
Processing time for solving puzzles with elimination, single possibility and naked twins constraint propagation
 and backtracking with the minimum remaining values heuristic
 """
procTime(3,'min')

Filename        Average execution time
----------------------------------- 
Easy50.txt      0.005900020599365234
Hard.txt        0.6962368287538228
Hardest.txt     0.19408321380615234


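From the above results, the easy puzzles are solved very quickly in both configurations. Comparing the strategies under the minimum remaining values heuristic, adding the naked twins rule made the easy puzzles slightly faster (about 0.0059 s versus 0.0069 s on average) and roughly halved the average time for the hard puzzles (about 0.70 s versus 1.30 s). The hardest puzzles, however, were slightly slower with naked twins (about 0.19 s versus 0.17 s).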

In [18]:
""" 
Processing time for solving puzzle with elimination and single possibility constraint propagation
 and backtracking with random variable order
 """
procTime(2,'rand')

Filename        Average execution time
----------------------------------- 
Easy50.txt      0.011603693962097168
Hard.txt        0.6178520177540027
Hardest.txt     0.10165192864157936


In [19]:
""" 
Processing time for solving puzzles with elimination and single possibility constraint propagation
 and backtracking with static variable order
 """
procTime(2, 'static')

Filename        Average execution time
----------------------------------- 
Easy50.txt      0.006216158866882324
Hard.txt        0.7976338888469495
Hardest.txt     0.06374352628534491


In [21]:
""" 
Processing time for solving puzzles with elimination, single possibility and naked twins constraint propagation
 and backtracking with random variable order
 """
procTime(3,'rand')

Filename        Average execution time
----------------------------------- 
Easy50.txt      0.01024759292602539
Hard.txt        0.5482491267354865
Hardest.txt     0.07780662449923428


In [22]:
""" 
Processing time for solving puzzles with elimination, single possibility and naked twins constraint propagation
 and backtracking with static variable order
 """
procTime(3, 'static')

Filename        Average execution time
----------------------------------- 
Easy50.txt      0.005596528053283692
Hard.txt        0.4191200933958355
Hardest.txt     0.08402037620544434


In [12]:

from SudokuSolver import SudokuSolver 
# import importlib
# importlib.reload(SudokuSolver.backtrack1) 

solver = SudokuSolver(squares, all_units, peers)
gridValDict = solver.gridtoValues(grid2)
constraint = 2
display(solver.reduceGridVal(gridValDict, constraint))
display(solver.backtrack(gridValDict, constraint, 'min'))
#print(solver.backtrack_counter)

 1269  249    5   |  3   24689 24678 |14689 14679  1478 
  8    349   169  | 4679   5    467  | 1469   2    1347 
 2369   7    269  | 4689   1    2468 |  5    469   348  
------------------+------------------+------------------
  4    289  26789 | 1689  689    5   |  3    179   127  
 259    1    289  | 489    7     3   | 249   459    6   
 5679   59    3   |  2    469   146  | 149    8    1457 
------------------+------------------+------------------
 1237   6    1278 |  5    2348 12478 | 1248   14    9   
12579  2589   4   | 1678  268  12678 | 1268   3    1258 
 1235  2358  128  | 1468 23468   9   |  7    1456 12458 

1 4 5 |3 2 7 |6 9 8 
8 3 9 |6 5 4 |1 2 7 
6 7 2 |9 1 8 |5 4 3 
------+------+------
4 9 6 |1 8 5 |3 7 2 
2 1 8 |4 7 3 |9 5 6 
7 5 3 |2 9 6 |4 8 1 
------+------+------
3 6 7 |5 4 2 |8 1 9 
9 8 4 |7 6 1 |2 3 5 
5 2 1 |8 3 9 |7 6 4 



## Results and discussion

## References

1. https://en.wikipedia.org/wiki/Sudoku

2. https://norvig.com/sudoku.html