# Assignment 2
## Sudoku


In this assignment you will implement a Sudoku solver using CSPs. If you've never played Sudoku before, you can learn about it [here](https://en.wikipedia.org/wiki/Sudoku).

In the `sudoku` subfolder, there are several puzzles of varying difficulty. The solutions are in [sudoku/sudoku_solutions.pdf](sudoku/sudoku_solutions.pdf).


(Assignment adapted from Chris Callison-Burch.)


In this section, we will view a Sudoku puzzle not from the perspective of its grid layout, but more abstractly as a collection of cells. Accordingly, we will represent it internally as a dictionary mapping from cells, i.e. (row, column) pairs, to sets of possible values. This dictionary should have a fixed (9 x 9=81) set of pairs of keys, but the number of elements in each set corresponding to a key will change as the board is being manipulated.

**Part 1**

In the Sudoku class below, write an initialization method `__init__(self, board)` that stores such a mapping for future use. Also write a method `get_values(self, cell)` that returns the set of values currently available at a particular cell.

In addition, write a function `read_board(path)` that reads the board specified by the file at the given path and returns it as a dictionary. Sudoku puzzles will be represented textually as 9 lines of 9 characters each, corresponding to the rows of the board, where a digit between "1" and "9" denotes a cell containing a fixed value, and an asterisk "*" denotes a blank cell that could contain any digit.

```python
>>> b = read_board("sudoku/medium1.txt")
>>> Sudoku(b).get_values((0, 0))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
    
>>> b = read_board("sudoku/medium1.txt")
>>> Sudoku(b).get_values((0, 1))
set([1])
```

In [2]:
import math
import random
import time
import os
#import Queue
import copy

fileDir = os.path.dirname(os.path.realpath('__file__'))



In [3]:

def sudoku_cells():
    # Returns the list of all cells in a Sudoku puzzle as (row, column) pairs
    c = list()
    for i in range(0,9):
        for j in range(0,9):
            c.append((i,j))
    return c
cell_temp = sudoku_cells()

def sudoku_arcs():
    arcs = list()
    for key1 in cell_temp:
        for key2 in cell_temp:
            # first ensure they are not the same
            if key1 != key2:
                # if they are in same row or col
                if key1[0] == key2[0] or key1[1] == key2[1]:
                    arcs.append((key1,key2))
                # if they are in the same box
                elif math.floor(key1[0]/3) == math.floor(key2[0]/3) and \
                     math.floor(key1[1]/3) == math.floor(key2[1]/3):
                    arcs.append((key1,key2))
    return arcs

def read_board(path):
    board=dict()
    index = 0
    filename = os.path.join(fileDir, path)
    filehandle = open(filename)
    cell_list = list()
    for line in filehandle:
        line = line.rstrip().lstrip()
        cell_list.append(line)
    filehandle.close()
    for i in cell_list:
        for element in i:
            if element == '*':
                board[cell_temp[index]] = set([1,2,3,4,5,6,7,8,9])
            else:
                board[cell_temp[index]] = set([int(element)])
            index +=1
    return board


class Sudoku(object):

    CELLS = sudoku_cells()
    ARCS = sudoku_arcs()

    def __init__(self, board):
        self.board = board
        pass

    def get_values(self, cell):
        return self.board[cell]

    
    def remove_inconsistent_values(self, cell1, cell2):
        set_cell1=self.get_values(cell1)
        set_cell2=self.get_values(cell2)
        if len(self.board[cell2]) == 1:
            for x in set_cell1:
                if x in set_cell2:
                    set_cell1.remove(x)
                    return True
            return False
        else:
            return False
    
         

    def infer_improved(self):
        reduced = 1
        while reduced == 1:
            reduced = 0
            self.infer_ac3()
            if self.is_solved():
                return self
            for i in range(0,9):
                for j in range(0,9):
                    element = (i,j)
                    if len(self.board[element]) > 1:
                        for value in self.board[element]:
                            # check if the value is not in anywhere in the block
                            if not self.is_in_block(value,element):
                                self.board[element] = set([value])
                                reduced = 1
                            # check if the value is not in anywhere in the row
                            if not self.is_in_row(value,element):
                                self.board[element] = set([value])
                                reduced = 1
                            # check if the value is not in anythere in the col
                            if not self.is_in_col(value,element):
                                self.board[element] = set([value])
                                reduced = 1
        return self

    def infer_ac3(self):
        queue = list()
        for element in self.ARCS:
            queue.append(element)
        while queue:
            frontier = queue.pop(0)
            if self.is_solved():
                break
                return self
            if self.remove_inconsistent_values(frontier[0], frontier[1]):
                self.ac3_s(queue, frontier)
                    
    def ac3_s(self, queue, arc):
        row = arc[0][0]
        col = arc[0][1]
        # Check the rows and lines
        for i in range(9):
            if i not in range(row, row+3):
                if (i,col) != arc[0][1]:
                    queue.append((i,col), (row, col))
        for j in range(9):
            if j not in range(col, col+3):
                if (row, j) != arc[0][1]:
                    queue.append((row,j), (row,col))
        # Check the blocks
        for i in range(row, row+3):
            for j in range(col, col+3):
                if i != row or j != col:
                    if (i,j) != arc[0][1]:
                        queue.append((i,j), (row,col))
                                   

    def is_solved(self):
        '''check if it is solved, based on length'''
        for cell in self.CELLS:
            if len(self.board[cell]) != 1:
                return False
        return True
    
    def print_board(self):
        # printing board for answer checking
        for i in range(0,9):
            result = []
            for j in range(0,9):
                result.extend(list(self.board[(i,j)]))
            print(result)

In [4]:
# function read_board(path) and method get_values(self, cell)
b = read_board("sudoku/medium1.txt")
Sudoku(b).get_values((0, 0))

{1, 2, 3, 4, 5, 6, 7, 8, 9}

In [5]:
b = read_board("sudoku/medium1.txt")
Sudoku(b).get_values((0, 1))

{1}

**Part 2**

Write a function `sudoku_cells()` that returns the list of all cells in a Sudoku puzzle as (row, column) pairs. The line `CELLS = sudoku_cells()` in the Sudoku class then creates a class-level constant `Sudoku.CELLS` that can be used wherever the full list of cells is needed. Although the function `sudoku_cells()` could still be called each time in its place, that approach results in a large amount of repeated computation and is therefore highly inefficient. The ordering of the cells within the list is not important, as long as they are all present. (For more information on the difference between class-level constants and fields of a class, see this [helpful guide](https://www.python-course.eu/python3_class_and_instance_attributes.php)).

```python
>>> sudoku_cells()
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), ..., (8, 5), (8, 6), (8, 7), (8, 8)]
```

In [6]:
# Check the function sudoku_cells()
sudoku_cells()

[(0, 0),
 (0, 1),
 (0, 2),
 (0, 3),
 (0, 4),
 (0, 5),
 (0, 6),
 (0, 7),
 (0, 8),
 (1, 0),
 (1, 1),
 (1, 2),
 (1, 3),
 (1, 4),
 (1, 5),
 (1, 6),
 (1, 7),
 (1, 8),
 (2, 0),
 (2, 1),
 (2, 2),
 (2, 3),
 (2, 4),
 (2, 5),
 (2, 6),
 (2, 7),
 (2, 8),
 (3, 0),
 (3, 1),
 (3, 2),
 (3, 3),
 (3, 4),
 (3, 5),
 (3, 6),
 (3, 7),
 (3, 8),
 (4, 0),
 (4, 1),
 (4, 2),
 (4, 3),
 (4, 4),
 (4, 5),
 (4, 6),
 (4, 7),
 (4, 8),
 (5, 0),
 (5, 1),
 (5, 2),
 (5, 3),
 (5, 4),
 (5, 5),
 (5, 6),
 (5, 7),
 (5, 8),
 (6, 0),
 (6, 1),
 (6, 2),
 (6, 3),
 (6, 4),
 (6, 5),
 (6, 6),
 (6, 7),
 (6, 8),
 (7, 0),
 (7, 1),
 (7, 2),
 (7, 3),
 (7, 4),
 (7, 5),
 (7, 6),
 (7, 7),
 (7, 8),
 (8, 0),
 (8, 1),
 (8, 2),
 (8, 3),
 (8, 4),
 (8, 5),
 (8, 6),
 (8, 7),
 (8, 8)]

**Part 3**

Write a function `sudoku_arcs()` that returns the list of all arcs between cells in a Sudoku puzzle corresponding to inequality constraints. In other words, each arc should be a pair of cells whose values cannot be equal in a solved puzzle. The arcs should be represented a two-tuples of cells, where cells themselves are (row, column) pairs. The line `ARCS = sudoku_arcs()` in the Sudoku class then creates a class-level constant `Sudoku.ARCS` that can be used wherever the full list of arcs is needed. The ordering of the arcs within the list is not important, as long as they are all present. Note that this is asking not for the arcs in a particular board, but all of the arcs that exist on an empty board.

```python
>>> ((0, 0), (0, 8)) in sudoku_arcs()
True
>>> ((0, 0), (8, 0)) in sudoku_arcs()
True
>>> ((0, 8), (0, 0)) in sudoku_arcs()
True
>>> ((0, 0), (2, 1)) in sudoku_arcs()
True
>>> ((2, 2), (0, 0)) in sudoku_arcs()
True
>>> ((2, 3), (0, 0)) in sudoku_arcs()
False
```

In [73]:
((0, 0), (0, 8)) in sudoku_arcs()

True

In [74]:
((0, 0), (8, 0)) in sudoku_arcs()

True

In [75]:
((0, 8), (0, 0)) in sudoku_arcs()

True

In [76]:
((0, 0), (2, 1)) in sudoku_arcs()

True

In [77]:
((2, 2), (0, 0)) in sudoku_arcs()

True

In [78]:
((2, 3), (0, 0)) in sudoku_arcs()

False

**Part 4**

In the Sudoku class, write a method `remove_inconsistent_values(self, cell1, cell2)` that removes any value in the set of possibilities for `cell1` for which there are no values in the set of possibilities for `cell2` satisfying the corresponding inequality constraint (which we have represented as an arc). Each cell argument will be a (row, column) pair. If any values were removed, return `True`; otherwise, return `False`. Note that this question is asking you both to change the class attributes (i.e., change the dictionary representing the board) and to return a boolean value - in Python one can do both in the same method!

*Hint: Think carefully about what this exercise is asking you to implement. How many values can be removed during a single invocation of the function?*

```python
>>> sudoku = Sudoku(read_board("sudoku/easy.txt")) # See below for a picture.
>>> sudoku.get_values((0, 3))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> for col in [0, 1, 4]:
...     removed = sudoku.remove_inconsistent_values((0, 3), (0, col))
...     print(removed, sudoku.get_values((0, 3)))
...
True set([1, 2, 3, 4, 5, 6, 7, 9])
True set([1, 3, 4, 5, 6, 7, 9])
False set([1, 3, 4, 5, 6, 7, 9])
```

In [79]:
sudoku = Sudoku(read_board("sudoku/easy.txt")) # See below for a picture.

In [80]:
sudoku.get_values((0, 3))

{1, 2, 3, 4, 5, 6, 7, 8, 9}

In [81]:
for col in [0, 1, 4]:
    removed = sudoku.remove_inconsistent_values((0, 3), (0, col))
    print(removed, sudoku.get_values((0, 3)))

True {1, 2, 3, 4, 5, 6, 7, 9}
True {1, 3, 4, 5, 6, 7, 9}
False {1, 3, 4, 5, 6, 7, 9}


**Part 5**

In the Sudoku class, write a method `infer_ac3(self)` that runs the AC-3 algorithm on the current board to narrow down each cell’s set of values as much as possible (see lectures 5 and 6 and the book for details on this arc consistency algorithm). Although this will not be powerful enough to solve all Sudoku problems, it will produce a solution for easy-difficulty puzzles such the one in `easy.txt`. By “solution”, we mean that there will be exactly one element in each cell’s set of possible values, and that no inequality constraints will be violated.



In [67]:
sudoku.infer_ac3()

In [68]:
sudoku.print_board()

[8, 2, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 7]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 6, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 6, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 5]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 6, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 2, 8, 4]
[2, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 6, 1, 2, 3, 4, 5, 6, 7, 8, 9, 3, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[6, 1, 2, 3, 4, 5, 6, 7, 8, 9, 5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 

**Part 6: 6620 students only**

Consider the outcome of running AC-3 on the medium-difficulty puzzle in `medium2.txt`. Although it is able to determine the values of some cells, it is unable to make significant headway on the rest.

<img src="sudoku/ac3breaks.png" width="20%"/>

However, if we consider the possible placements of the digit 7 in the upper-right block, we observe that the 7 in the third row and the 7 in the final column rule out all but one square, meaning we can safely place a 7 in the indicated cell despite AC-3 being unable to make such an inference.

In the Sudoku class, write a method `infer_improved(self)` that runs this improved version of AC-3, using `infer_ac3(self)` as a subroutine (perhaps multiple times). You should consider what deductions can be made about a specific cell by examining the possible values for other cells in the same row, column, or block. Using this technique, you should be able to solve all of the medium-difficulty puzzles. Note that this goes beyond the typical AC3 approach because it involves constraints that relate more than 2 variables.

In [7]:
sudoku = Sudoku(read_board("sudoku/medium2.txt")) # See below for a picture.
sudoku.infer_improved()

<__main__.Sudoku at 0x7ff022ad6a58>

In [10]:
sudoku.is_solved()

True

**Part 7 (all students)**

The algorithms from parts 5 and 6 are still not powerful enough to solve all Sudoku problems. Describe in words an improvement to the approaches above that will allow you to do this.


**Enter your answer here**
We can look at pairs or triples of cells within a row, column, or block. Then find that a pair of cells has only two options of entries, but don't know which goes where. What we can still gain from this observation is that those pair of numbers cannot occur anywhere else in the neighborhood. This will decrease the number of possibilities for the other cells in the neighborhood and help us get closer to a solution. Similarly, a triple of cells having only three possibilities of entries between them will eliminate these entries in all other cells in a neighborhood of this triple.